CONFIDENCE LEVELS AND/VS. STATISTICAL HYPOTHESIS TESTING IN STATISTICAL ANALYSIS. CASE STUDY
Directory of Open Access Journals (Sweden)
ILEANA BRUDIU
2009-05-01
Parameter estimation with confidence intervals and statistical hypothesis testing are both used in statistical analysis to draw conclusions about a population from a sample extracted from it. The case study presented in this paper aims to highlight the importance of the size of the sample taken in the study and how it is reflected in the results obtained when using confidence intervals and hypothesis testing. Whereas statistical hypothesis tests only give a "yes" or "no" answer to certain questions, estimation using confidence intervals provides more information than a test statistic: it shows the high degree of uncertainty arising from small samples and from findings described as "marginally significant" or "almost significant" (p very close to 0.05).
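The contrast drawn in this abstract can be illustrated with a minimal sketch. It is not taken from the paper: the function name `z_test_and_ci`, the numbers, and the assumption of a known standard deviation (so that a z-test applies) are all hypothetical. The same observed mean is tested at three sample sizes, and the interval shows how much uncertainty hides behind a bare "significant / not significant" answer.

```python
from math import sqrt
from statistics import NormalDist


def z_test_and_ci(sample_mean, null_mean, sd, n, level=0.95):
    """Two-sided z-test and confidence interval for a mean.

    Assumes the population standard deviation `sd` is known,
    which is a simplification made for this illustration.
    """
    se = sd / sqrt(n)                        # standard error of the mean
    z = (sample_mean - null_mean) / se       # test statistic
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    return p_value, (sample_mean - z_crit * se, sample_mean + z_crit * se)


# Same observed mean, three hypothetical sample sizes.
for n in (10, 40, 160):
    p, (lo, hi) = z_test_and_ci(sample_mean=102.0, null_mean=100.0, sd=10.0, n=n)
    print(f"n={n:4d}  p={p:.3f}  95% CI=({lo:.2f}, {hi:.2f})")
```

The test alone changes its verdict only once the sample becomes large, while the interval shows at every sample size which effect sizes remain compatible with the data.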
Normality Tests for Statistical Analysis: A Guide for Non-Statisticians
Ghasemi, Asghar; Zahediasl, Saleh
2012-01-01
Statistical errors are common in scientific literature and about 50% of the published articles have at least one error. The assumption of normality needs to be checked for many statistical procedures, namely parametric tests, because their validity depends on it. The aim of this commentary is to overview checking for normality in statistical analysis using SPSS. PMID:23843808
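The commentary works with SPSS; as a rough stand-in, the sketch below checks normality in plain Python using the Jarque-Bera statistic. That test is chosen here only because it can be computed from sample skewness and kurtosis without external libraries; it is an assumption of this example, not a procedure named in the abstract, and its chi-square approximation is known to be unreliable for small samples.

```python
import random
from math import exp


def jarque_bera(data):
    """Jarque-Bera normality statistic and its asymptotic p-value."""
    n = len(data)
    mean = sum(data) / n
    # Central moments (population form, dividing by n).
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    jb = n / 6 * (skewness ** 2 + (kurtosis - 3) ** 2 / 4)
    # Under normality JB is asymptotically chi-square with 2 degrees of
    # freedom, whose survival function is exactly exp(-x / 2).
    return jb, exp(-jb / 2)


rng = random.Random(0)
normal_sample = [rng.gauss(0.0, 1.0) for _ in range(500)]
skewed_sample = [rng.expovariate(1.0) for _ in range(500)]
print("normal-looking data:", jarque_bera(normal_sample))
print("skewed data:        ", jarque_bera(skewed_sample))
```

A small p-value suggests departure from normality, in which case a nonparametric test may be the safer choice.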
Shaikh, Masood Ali
2017-09-01
Assessment of research articles in terms of study designs used, statistical tests applied and the use of statistical analysis programmes helps determine the research activity profile and trends in the country. In this descriptive study, all original articles published by the Journal of Pakistan Medical Association (JPMA) and the Journal of the College of Physicians and Surgeons Pakistan (JCPSP) in the year 2015 were reviewed in terms of study designs used, application of statistical tests, and the use of statistical analysis programmes. JPMA and JCPSP published 192 and 128 original articles, respectively, in the year 2015. The results indicate that the cross-sectional design, bivariate inferential analysis entailing comparison between two variables/groups, and SPSS were the most common study design, inferential statistical analysis, and statistical analysis software programme, respectively. These results echo a previously published assessment of these two journals for the year 2014.
Operational statistical analysis of the results of computer-based testing of students
Directory of Open Access Journals (Sweden)
Виктор Иванович Нардюжев
2018-12-01
The article is devoted to the statistical analysis of the results of computer-based testing used to evaluate students' educational achievements. The topic is relevant because computer-based testing has become an important method in Russian universities for evaluating students' educational achievements and the quality of the modern educational process. Using modern methods and programs for statistical analysis of computer-based testing results, and for assessing the quality of the tests developed, is a pressing problem for every university teacher. The article shows how the authors solve this problem using their own program “StatInfo”. For several years the program has been successfully applied in a credit system of education at such technological stages as loading computer-based testing protocols into a database, formation of queries, and generation of reports, lists, and matrices of answers for statistical analysis of the quality of test items. The methodology, experience and some results of its usage by university teachers are described in the article. Related topics of test development, models, algorithms, technologies, and software for large-scale computer-based testing have been discussed by the authors in their previous publications, which are presented in the reference list.
A Third Moment Adjusted Test Statistic for Small Sample Factor Analysis.
Lin, Johnny; Bentler, Peter M
2012-01-01
Goodness of fit testing in factor analysis is based on the assumption that the test statistic is asymptotically chi-square; but this property may not hold in small samples even when the factors and errors are normally distributed in the population. Robust methods such as Browne's asymptotically distribution-free method and Satorra Bentler's mean scaling statistic were developed under the presumption of non-normality in the factors and errors. This paper finds new application to the case where factors and errors are normally distributed in the population but the skewness of the obtained test statistic is still high due to sampling error in the observed indicators. An extension of Satorra Bentler's statistic is proposed that not only scales the mean but also adjusts the degrees of freedom based on the skewness of the obtained test statistic in order to improve its robustness under small samples. A simple simulation study shows that this third moment adjusted statistic asymptotically performs on par with previously proposed methods, and at a very small sample size offers superior Type I error rates under a properly specified model. Data from Mardia, Kent and Bibby's study of students tested for their ability in five content areas that were either open or closed book were used to illustrate the real-world performance of this statistic.
[The research protocol VI: How to choose the appropriate statistical test. Inferential statistics].
Flores-Ruiz, Eric; Miranda-Novales, María Guadalupe; Villasís-Keever, Miguel Ángel
2017-01-01
Statistical analysis can be divided into two main components: descriptive analysis and inferential analysis. Inference means drawing conclusions from tests performed on data obtained from a sample of a population. Statistical tests are used to establish the probability that a conclusion obtained from a sample applies to the population from which the sample was drawn. However, choosing the appropriate statistical test often poses a challenge for novice researchers. Choosing a test requires taking three aspects into account: the research design, the number of measurements, and the scale of measurement of the variables. Statistical tests are divided into two sets, parametric and nonparametric. Parametric tests can only be used if the data show a normal distribution. Choosing the right statistical test will make it easier for readers to understand and apply the results.
Analysis of Preference Data Using Intermediate Test Statistic
African Journals Online (AJOL)
PROF. O. E. OSUAGWU
2013-06-01
Jun 1, 2013 ... West African Journal of Industrial and Academic Research Vol.7 No. 1 June ... Keywords:-Preference data, Friedman statistic, multinomial test statistic, intermediate test statistic. ... new method and consequently a new statistic ...
Kruschke, John K; Liddell, Torrin M
2018-02-01
In the practice of data analysis, there is a conceptual distinction between hypothesis testing, on the one hand, and estimation with quantified uncertainty on the other. Among frequentists in psychology, a shift of emphasis from hypothesis testing to estimation has been dubbed "the New Statistics" (Cumming 2014). A second conceptual distinction is between frequentist methods and Bayesian methods. Our main goal in this article is to explain how Bayesian methods achieve the goals of the New Statistics better than frequentist methods. The article reviews frequentist and Bayesian approaches to hypothesis testing and to estimation with confidence or credible intervals. The article also describes Bayesian approaches to meta-analysis, randomized controlled trials, and power analysis.
Improved Test Planning and Analysis Through the Use of Advanced Statistical Methods
Green, Lawrence L.; Maxwell, Katherine A.; Glass, David E.; Vaughn, Wallace L.; Barger, Weston; Cook, Mylan
2016-01-01
The goal of this work is, through computational simulations, to provide statistically-based evidence to convince the testing community that a distributed testing approach is superior to a clustered testing approach for most situations. For clustered testing, numerous, repeated test points are acquired at a limited number of test conditions. For distributed testing, only one or a few test points are requested at many different conditions. The statistical techniques of Analysis of Variance (ANOVA), Design of Experiments (DOE) and Response Surface Methods (RSM) are applied to enable distributed test planning, data analysis and test augmentation. The D-Optimal class of DOE is used to plan an optimally efficient single- and multi-factor test. The resulting simulated test data are analyzed via ANOVA and a parametric model is constructed using RSM. Finally, ANOVA can be used to plan a second round of testing to augment the existing data set with new data points. The use of these techniques is demonstrated through several illustrative examples. To date, many thousands of comparisons have been performed and the results strongly support the conclusion that the distributed testing approach outperforms the clustered testing approach.
A statistical test for outlier identification in data envelopment analysis
Directory of Open Access Journals (Sweden)
Morteza Khodabin
2010-09-01
In the use of peer group data to assess individual, typical or best practice performance, the effective detection of outliers is critical for achieving useful results. In these "deterministic" frontier models, statistical theory is now mostly available. This paper deals with the statistical pared sample method and its capability of detecting outliers in data envelopment analysis. In the presented method, each observation is deleted from the sample once and the resulting linear program is solved, leading to a distribution of efficiency estimates. Based on the achieved distribution, a pared test is designed to identify the potential outlier(s). We illustrate the method through a real data set. The method could be used in a first step, as an exploratory data analysis, before using any frontier estimation.
Modified Distribution-Free Goodness-of-Fit Test Statistic.
Chun, So Yeon; Browne, Michael W; Shapiro, Alexander
2018-03-01
Covariance structure analysis and its structural equation modeling extensions have become one of the most widely used methodologies in social sciences such as psychology, education, and economics. An important issue in such analysis is to assess the goodness of fit of a model under analysis. One of the most popular test statistics used in covariance structure analysis is the asymptotically distribution-free (ADF) test statistic introduced by Browne (Br J Math Stat Psychol 37:62-83, 1984). The ADF statistic can be used to test models without any specific distribution assumption (e.g., multivariate normal distribution) of the observed data. Despite its advantage, it has been shown in various empirical studies that unless sample sizes are extremely large, this ADF statistic could perform very poorly in practice. In this paper, we provide a theoretical explanation for this phenomenon and further propose a modified test statistic that improves the performance in samples of realistic size. The proposed statistic deals with the possible ill-conditioning of the involved large-scale covariance matrices.
Statistical testing and power analysis for brain-wide association study.
Gong, Weikang; Wan, Lin; Lu, Wenlian; Ma, Liang; Cheng, Fan; Cheng, Wei; Grünewald, Stefan; Feng, Jianfeng
2018-04-05
The identification of connexel-wise associations, which involves examining functional connectivities between pairwise voxels across the whole brain, is both statistically and computationally challenging. Although such a connexel-wise methodology has recently been adopted by brain-wide association studies (BWAS) to identify connectivity changes in several mental disorders, such as schizophrenia, autism and depression, multiple-comparison correction and power analysis methods designed specifically for connexel-wise analysis are still lacking. Therefore, we herein report the development of a rigorous statistical framework for connexel-wise significance testing based on Gaussian random field theory. It includes controlling the family-wise error rate (FWER) of multiple hypothesis tests using topological inference methods, and calculating power and sample size for a connexel-wise study. Our theoretical framework can control the false-positive rate accurately, as validated empirically using two resting-state fMRI datasets. Compared with Bonferroni correction and false discovery rate (FDR), it can reduce the false-positive rate and increase statistical power by appropriately utilizing the spatial information of fMRI data. Importantly, our method bypasses the need for non-parametric permutation to correct for multiple comparisons; thus, it can efficiently tackle large datasets with high-resolution fMRI images. The utility of our method is shown in a case-control study. Our approach can identify altered functional connectivities in a major depressive disorder dataset, whereas existing methods fail. A software package is available at https://github.com/weikanggong/BWAS.
The insignificance of statistical significance testing
Johnson, Douglas H.
1999-01-01
Despite their use in scientific journals such as The Journal of Wildlife Management, statistical hypothesis tests add very little value to the products of research. Indeed, they frequently confuse the interpretation of data. This paper describes how statistical hypothesis tests are often viewed, and then contrasts that interpretation with the correct one. I discuss the arbitrariness of P-values, conclusions that the null hypothesis is true, power analysis, and distinctions between statistical and biological significance. Statistical hypothesis testing, in which the null hypothesis about the properties of a population is almost always known a priori to be false, is contrasted with scientific hypothesis testing, which examines a credible null hypothesis about phenomena in nature. More meaningful alternatives are briefly outlined, including estimation and confidence intervals for determining the importance of factors, decision theory for guiding actions in the face of uncertainty, and Bayesian approaches to hypothesis testing and other statistical practices.
Common pitfalls in statistical analysis: Understanding the properties of diagnostic tests - Part 1.
Ranganathan, Priya; Aggarwal, Rakesh
2018-01-01
In this article in our series on common pitfalls in statistical analysis, we look at some of the attributes of diagnostic tests (i.e., tests which are used to determine whether an individual does or does not have disease). The next article in this series will focus on further issues related to diagnostic tests.
Statistical analysis and planning of multihundred-watt impact tests
International Nuclear Information System (INIS)
Martz, H.F. Jr.; Waterman, M.S.
1977-10-01
Modular multihundred-watt (MHW) radioisotope thermoelectric generators (RTG's) are used as a power source for spacecraft. Due to possible environmental contamination by radioactive materials, numerous tests are required to determine and verify the safety of the RTG. There are results available from 27 fueled MHW impact tests regarding hoop failure, fingerprint failure, and fuel failure. Data from the 27 tests are statistically analyzed for relationships that exist between the test design variables and the failure types. Next, these relationships are used to develop a statistical procedure for planning and conducting either future MHW impact tests or similar tests on other RTG fuel sources. Finally, some conclusions are given.
Kanji, Gopal K
2006-01-01
This expanded and updated Third Edition of Gopal K. Kanji's best-selling resource on statistical tests covers all the most commonly used tests with information on how to calculate and interpret results with simple datasets. Each entry begins with a short summary statement about the test's purpose, and contains details of the test objective, the limitations (or assumptions) involved, a brief outline of the method, a worked example, and the numerical calculation. 100 Statistical Tests, Third Edition is the one indispensable guide for users of statistical materials and consumers of statistical information at all levels and across all disciplines.
Statistical Analysis of the Polarimetric Cloud Analysis and Seeding Test (POLCAST) Field Projects
Ekness, Jamie Lynn
The North Dakota farming industry brings in more than $4.1 billion annually in cash receipts. Unfortunately, agriculture sales vary significantly from year to year, which is due in large part to weather events such as hail storms and droughts. One method to mitigate drought is to use hygroscopic seeding to increase the precipitation efficiency of clouds. The North Dakota Atmospheric Research Board (NDARB) sponsored the Polarimetric Cloud Analysis and Seeding Test (POLCAST) research project to determine the effectiveness of hygroscopic seeding in North Dakota. The POLCAST field projects obtained airborne and radar observations while conducting randomized cloud seeding. The Thunderstorm Identification Tracking and Nowcasting (TITAN) program is used to analyze radar data (33 usable cases) to determine differences in storm duration, rain rate and total rain amount between seeded and non-seeded clouds. The single ratio of seeded to non-seeded cases is 1.56 (0.28 mm/0.18 mm), or a 56% increase in the average hourly rainfall during the first 60 minutes after target selection. A seeding effect is indicated, with the lifetime of the storms increasing by 41% between seeded and non-seeded clouds for the first 60 minutes past the seeding decision. A double ratio statistic, which compares the radar-derived rain amount of the last 40 minutes of a case (seed/non-seed) with that of the first 20 minutes (seed/non-seed), is used to account for the natural variability of the cloud system and gives a double ratio of 1.85. The Mann-Whitney test on the double ratio of seeded to non-seeded cases (33 cases) gives a significance (p-value) of 0.063. Bootstrapping analysis of the POLCAST set indicates that 50 cases would provide statistically significant results based on the Mann-Whitney test of the double ratio. All the statistical analyses conducted on the POLCAST data set show that hygroscopic seeding in North Dakota does increase precipitation. While an additional POLCAST field
Statistical power analysis: a simple and general model for traditional and modern hypothesis tests
Murphy, Kevin R; Wolach, Allen
2014-01-01
Noted for its accessible approach, this text applies the latest approaches of power analysis to both null hypothesis and minimum-effect testing using the same basic unified model. Through the use of a few simple procedures and examples, the authors show readers with little expertise in statistical analysis how to obtain the values needed to carry out the power analysis for their research. Illustrations of how these analyses work and how they can be used to choose the appropriate criterion for defining statistically significant outcomes are sprinkled throughout. The book presents a simple and g
DEFF Research Database (Denmark)
Jones, Allan; Sommerlund, Bo
2007-01-01
The uses of null hypothesis significance testing (NHST) and statistical power analysis within psychological research are critically discussed. The article looks at the problems of relying solely on NHST when dealing with small and large sample sizes. The use of power analysis in estimating the potential error introduced by small and large samples is advocated. Power analysis is not recommended as a replacement for NHST but as an additional source of information about the phenomena under investigation. Moreover, the importance of conceptual analysis in relation to statistical analysis of hypothesis...
Statistical Power in Meta-Analysis
Liu, Jin
2015-01-01
Statistical power is important in a meta-analysis study, although few studies have examined the performance of simulated power in meta-analysis. The purpose of this study is to inform researchers about statistical power estimation on two sample mean difference test under different situations: (1) the discrepancy between the analytical power and…
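For orientation, analytical power for a single two-sample mean difference test can be sketched as follows. This is a textbook normal-approximation formula for one study with equal group sizes and a known common standard deviation, not the meta-analytic procedure examined in the article; the function name `two_sample_power` and its parameters are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_power(delta, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test.

    delta        true difference between the group means
    sd           common standard deviation (assumed known)
    n_per_group  observations in each group (equal sizes assumed)
    """
    nd = NormalDist()
    se = sd * sqrt(2 / n_per_group)       # standard error of the difference
    z_crit = nd.inv_cdf(1 - alpha / 2)    # two-sided critical value
    shift = delta / se                    # standardized true difference
    # Probability of landing in either rejection region.
    return nd.cdf(shift - z_crit) + nd.cdf(-shift - z_crit)


for n in (16, 32, 64, 128):
    print(f"n per group = {n:3d}  power = {two_sample_power(0.5, 1.0, n):.3f}")
```

Simulated power, the other quantity mentioned in the abstract, would be estimated by generating many data sets under the same settings and counting rejections.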
A Statistical Toolkit for Data Analysis
International Nuclear Information System (INIS)
Donadio, S.; Guatelli, S.; Mascialino, B.; Pfeiffer, A.; Pia, M.G.; Ribon, A.; Viarengo, P.
2006-01-01
The present project aims to develop an open-source and object-oriented software Toolkit for statistical data analysis. Its statistical testing component contains a variety of Goodness-of-Fit tests, from Chi-squared to Kolmogorov-Smirnov, to less known, but generally much more powerful tests such as Anderson-Darling, Goodman, Fisz-Cramer-von Mises, Kuiper, Tiku. Thanks to the component-based design and the usage of the standard abstract interfaces for data analysis, this tool can be used by other data analysis systems or integrated in experimental software frameworks. This Toolkit has been released and is downloadable from the web. In this paper we describe the statistical details of the algorithms, the computational features of the Toolkit and describe the code validation
Ganju, Jitendra; Yu, Xinxin; Ma, Guoguang Julie
2013-01-01
Formal inference in randomized clinical trials is based on controlling the type I error rate associated with a single pre-specified statistic. The deficiency of using just one method of analysis is that it depends on assumptions that may not be met. For robust inference, we propose pre-specifying multiple test statistics and relying on the minimum p-value for testing the null hypothesis of no treatment effect. The null hypothesis associated with the various test statistics is that the treatment groups are indistinguishable. The critical value for hypothesis testing comes from permutation distributions. Rejection of the null hypothesis when the smallest p-value is less than the critical value controls the type I error rate at its designated value. Even if one of the candidate test statistics has low power, the adverse effect on the power of the minimum p-value statistic is small. Its use is illustrated with examples. We conclude that it is better to rely on the minimum p-value rather than a single statistic, particularly when that single statistic is the logrank test, because of the cost and complexity of many survival trials.
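The minimum p-value idea can be sketched with a small Monte Carlo permutation test. This is an illustration under stated assumptions rather than the authors' implementation: the two candidate statistics (absolute difference in means and in medians), the function name `min_p_permutation_test`, and the data are hypothetical, and the sketch reports a permutation-adjusted p-value instead of a critical value, which leads to the same decision.

```python
import random
from bisect import bisect_left
from statistics import mean, median


def min_p_permutation_test(x, y, n_perm=999, seed=0):
    """Return (observed minimum p-value, permutation-adjusted p-value)."""
    rng = random.Random(seed)
    # Hypothetical candidate statistics; a real trial would pre-specify its own.
    stats = [
        lambda a, b: abs(mean(a) - mean(b)),
        lambda a, b: abs(median(a) - median(b)),
    ]
    pooled = list(x) + list(y)
    n_x = len(x)
    # Row 0 holds the observed statistics; later rows come from random relabellings.
    table = [[s(pooled[:n_x], pooled[n_x:]) for s in stats]]
    for _ in range(n_perm):
        rng.shuffle(pooled)
        table.append([s(pooled[:n_x], pooled[n_x:]) for s in stats])
    total = len(table)
    # Convert every statistic in every row to a p-value within its own column.
    per_stat_p = []
    for j in range(len(stats)):
        column = sorted(row[j] for row in table)
        per_stat_p.append(
            [(total - bisect_left(column, row[j])) / total for row in table]
        )
    # Minimum p-value per row, then rank the observed minimum among all rows.
    min_p = [min(ps) for ps in zip(*per_stat_p)]
    observed = min_p[0]
    adjusted = sum(m <= observed for m in min_p) / total
    return observed, adjusted


treated = [12.1, 14.3, 13.8, 15.2, 16.0, 13.1, 14.9, 15.5]
control = [11.0, 12.2, 10.8, 12.9, 11.7, 13.0, 11.4, 12.5]
print(min_p_permutation_test(treated, control))
```

Because the observed minimum is compared with the permutation distribution of minima, choosing the best of several statistics does not inflate the type I error rate.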
Observations in the statistical analysis of NBG-18 nuclear graphite strength tests
International Nuclear Information System (INIS)
Hindley, Michael P.; Mitchell, Mark N.; Blaine, Deborah C.; Groenwold, Albert A.
2012-01-01
Highlights: statistical analysis of NBG-18 nuclear graphite strength tests; a Weibull distribution and a normal distribution are tested for all data; a bimodal distribution in the CS data is confirmed; the CS data set has the lowest variance; a combined data set is formed and has a Weibull distribution. Abstract: The purpose of this paper is to report on the selection of a statistical distribution chosen to represent the experimental material strength of NBG-18 nuclear graphite. Three large sets of samples were tested during the material characterisation of the Pebble Bed Modular Reactor and Core Structure Ceramics materials. These sets of samples are tensile strength, flexural strength and compressive strength (CS) measurements. A relevant statistical fit is determined and the goodness of fit is also evaluated for each data set. The data sets are also normalised for ease of comparison, and combined into one representative data set. The validity of this approach is demonstrated. A second failure mode distribution is found on the CS test data. Identifying this failure mode supports the similar observations made in the past. The success of fitting the Weibull distribution through the normalised data sets allows us to improve the basis for the estimates of the variability. This could also imply that the variability on the graphite strength for the different strength measures is based on the same flaw distribution and thus a property of the material.
Rigby, A S
2001-11-10
The odds ratio is an appropriate method of analysis for data in 2 x 2 contingency tables. However, other methods of analysis exist. One such method is based on the chi2 test of goodness-of-fit. Key players in the development of statistical theory include Pearson, Fisher and Yates. Data are presented in the form of 2 x 2 contingency tables and a method of analysis based on the chi2 test is introduced. There are many variations of the basic test statistic, one of which is the chi2 test with Yates' continuity correction. The usefulness (or not) of Yates' continuity correction is discussed. Problems of interpretation when the method is applied to k x m tables are highlighted. Some properties of the chi2 test are illustrated by taking examples from the author's teaching experiences. Journal editors should be encouraged to give both observed and expected cell frequencies so that better information comes out of the chi2 test statistic.
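A minimal sketch of the chi2 calculation for a 2 x 2 table, with and without Yates' continuity correction, may help make the comparison concrete. The function name `chi2_2x2` and the example counts are hypothetical; the expected frequencies are returned alongside the statistic, in the spirit of the abstract's closing recommendation.

```python
from math import erfc, sqrt


def chi2_2x2(table, yates=False):
    """Pearson chi-square test for a 2 x 2 table of observed counts.

    Returns (statistic, p_value, expected_frequencies).
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    expected = [
        [row_totals[i] * col_totals[j] / n for j in range(2)] for i in range(2)
    ]
    statistic = 0.0
    for i in range(2):
        for j in range(2):
            diff = abs(table[i][j] - expected[i][j])
            if yates:
                # Continuity correction: shrink each deviation by 0.5, not below 0.
                diff = max(diff - 0.5, 0.0)
            statistic += diff ** 2 / expected[i][j]
    # Survival function of chi-square with 1 degree of freedom.
    p_value = erfc(sqrt(statistic / 2))
    return statistic, p_value, expected


observed = [[10, 20], [20, 10]]   # hypothetical counts
print("uncorrected:", chi2_2x2(observed))
print("with Yates: ", chi2_2x2(observed, yates=True))
```

The corrected statistic is never larger than the uncorrected one, so the correction makes the test more conservative; whether that is desirable is the question the abstract raises.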
Luh, Wei-Ming; Guo, Jiin-Huarng
2005-01-01
To deal with nonnormal and heterogeneous data for the one-way fixed effect analysis of variance model, the authors adopted a trimmed means method in conjunction with Hall's invertible transformation into a heteroscedastic test statistic (Alexander-Govern test or Welch test). The results of simulation experiments showed that the proposed technique…
Kuretzki, Carlos Henrique; Campos, Antônio Carlos Ligocki; Malafaia, Osvaldo; Soares, Sandramara Scandelari Kusano de Paula; Tenório, Sérgio Bernardo; Timi, Jorge Rufino Ribas
2016-03-01
Information technology is widely applied in healthcare. For scientific research, SINPE(c) - Integrated Electronic Protocols was created as a tool to support researchers by offering clinical data standardization. Until now, SINPE(c) lacked statistical tests performed by automatic analysis. The aim was to add to SINPE(c) features for automatically running the main statistical methods used in medicine. The study was divided into four stages: checking users' interest in the implementation of the tests; surveying how frequently the tests are used in health care; carrying out the implementation; and validating the results with researchers and their protocols. It was applied to a group of users of this software working on their theses in the stricto sensu master's and doctoral degrees of one postgraduate program in surgery. To assess the reliability of the statistics, the results obtained automatically by SINPE(c) were compared with those obtained manually by a professional statistician experienced in this type of study. Users showed interest in automatic statistical tests, with good acceptance. The chi-square, Mann-Whitney, Fisher and Student's t tests were considered by participants to be frequently used in medical studies. These methods were implemented and subsequently approved as expected. The automatic statistical analysis incorporated into SINPE(c) was shown to be reliable and equivalent to the analysis done manually, validating its use as a tool for medical research.
Testing statistical hypotheses
Lehmann, E L
2005-01-01
The third edition of Testing Statistical Hypotheses updates and expands upon the classic graduate text, emphasizing optimality theory for hypothesis testing and confidence sets. The principal additions include a rigorous treatment of large sample optimality, together with the requisite tools. In addition, an introduction to the theory of resampling methods such as the bootstrap is developed. The sections on multiple testing and goodness of fit testing are expanded. The text is suitable for Ph.D. students in statistics and includes over 300 new problems out of a total of more than 760. E.L. Lehmann is Professor of Statistics Emeritus at the University of California, Berkeley. He is a member of the National Academy of Sciences and the American Academy of Arts and Sciences, and the recipient of honorary degrees from the University of Leiden, The Netherlands and the University of Chicago. He is the author of Elements of Large-Sample Theory and (with George Casella) he is also the author of Theory of Point Estimat...
Jokhio, Gul A.; Syed Mohsin, Sharifah M.; Gul, Yasmeen
2018-04-01
It has been established that Adobe provides, in addition to being sustainable and economic, a better indoor air quality without spending extensive amounts of energy as opposed to the modern synthetic materials. The material, however, suffers from weak structural behaviour when subjected to adverse loading conditions. A wide range of mechanical properties has been reported in literature owing to lack of research and standardization. The present paper presents the statistical analysis of the results that were obtained through compressive and flexural tests on Adobe samples. Adobe specimens with and without wire mesh reinforcement were tested and the results were reported. The statistical analysis of these results presents an interesting read. It has been found that the compressive strength of adobe increases by about 43% after adding a single layer of wire mesh reinforcement. This increase is statistically significant. The flexural response of Adobe has also shown improvement with the addition of wire mesh reinforcement, however, the statistical significance of the same cannot be established.
Bayesian models based on test statistics for multiple hypothesis testing problems.
Ji, Yuan; Lu, Yiling; Mills, Gordon B
2008-04-01
We propose a Bayesian method for the problem of multiple hypothesis testing that is routinely encountered in bioinformatics research, such as the differential gene expression analysis. Our algorithm is based on modeling the distributions of test statistics under both null and alternative hypotheses. We substantially reduce the complexity of the process of defining posterior model probabilities by modeling the test statistics directly instead of modeling the full data. Computationally, we apply a Bayesian FDR approach to control the number of rejections of null hypotheses. To check if our model assumptions for the test statistics are valid for various bioinformatics experiments, we also propose a simple graphical model-assessment tool. Using extensive simulations, we demonstrate the performance of our models and the utility of the model-assessment tool. In the end, we apply the proposed methodology to an siRNA screening and a gene expression experiment.
Noel, Jean; Prieto, Juan C.; Styner, Martin
2017-03-01
Functional Analysis of Diffusion Tensor Tract Statistics (FADTTS) is a toolbox for analysis of white matter (WM) fiber tracts. It allows associating diffusion properties along major WM bundles with a set of covariates of interest, such as age, diagnostic status and gender, and the structure of the variability of these WM tract properties. However, to use this toolbox, a user must have an intermediate knowledge in scripting languages (MATLAB). FADTTSter was created to overcome this issue and make the statistical analysis accessible to any non-technical researcher. FADTTSter is actively being used by researchers at the University of North Carolina. FADTTSter guides non-technical users through a series of steps including quality control of subjects and fibers in order to setup the necessary parameters to run FADTTS. Additionally, FADTTSter implements interactive charts for FADTTS' outputs. This interactive chart enhances the researcher experience and facilitates the analysis of the results. FADTTSter's motivation is to improve usability and provide a new analysis tool to the community that complements FADTTS. Ultimately, by enabling FADTTS to a broader audience, FADTTSter seeks to accelerate hypothesis testing in neuroimaging studies involving heterogeneous clinical data and diffusion tensor imaging. This work is submitted to the Biomedical Applications in Molecular, Structural, and Functional Imaging conference. The source code of this application is available in NITRC.
A statistical design for testing apomictic diversification through linkage analysis.
Zeng, Yanru; Hou, Wei; Song, Shuang; Feng, Sisi; Shen, Lin; Xia, Guohua; Wu, Rongling
2014-03-01
The capacity of apomixis to generate maternal clones through seed reproduction has made it a useful characteristic for the fixation of heterosis in plant breeding. It has been observed that apomixis displays pronounced intra- and interspecific diversification, but the genetic mechanisms underlying this diversification remain elusive, obstructing the exploitation of this phenomenon in practical breeding programs. By capitalizing on molecular information in mapping populations, we describe and assess a statistical design that deploys linkage analysis to estimate and test the pattern and extent of apomictic differences at various levels from genotypes to species. The design is based on two reciprocal crosses between two individuals each chosen from a hermaphrodite or monoecious species. A multinomial distribution likelihood is constructed by combining marker information from two crosses. The EM algorithm is implemented to estimate the rate of apomixis and test its difference between two plant populations or species as the parents. The design is validated by computer simulation. A real data analysis of two reciprocal crosses between hickory (Carya cathayensis) and pecan (C. illinoensis) demonstrates the utilization and usefulness of the design in practice. The design provides a tool to address fundamental and applied questions related to the evolution and breeding of apomixis.
HOW TO SELECT APPROPRIATE STATISTICAL TEST IN SCIENTIFIC ARTICLES
Directory of Open Access Journals (Sweden)
Vladimir TRAJKOVSKI
2016-09-01
Statistics is a mathematical science dealing with the collection, analysis, interpretation, and presentation of masses of numerical data in order to draw relevant conclusions. It is a form of mathematical analysis that uses quantified models, representations and synopses for a given set of experimental data or real-life studies. Students and young researchers in biomedical sciences and in special education and rehabilitation often declare that they chose to enroll in their study program because they lack knowledge of or interest in mathematics. This is a sad statement, but there is much truth in it. The aim of this editorial is to help young researchers select the statistics or statistical techniques and statistical software appropriate for the purposes and conditions of a particular analysis. The most important statistical tests are reviewed in the article. Knowing how to choose the right statistical test is an important asset in research data processing and in the writing of scientific papers. Young researchers and authors should know how to choose and how to use statistical methods. The competent researcher will need knowledge of statistical procedures. That might include an introductory statistics course, and it most certainly includes using a good statistics textbook. For this purpose, Statistics needs to be restored as a mandatory subject in the curriculum of the Institute of Special Education and Rehabilitation at the Faculty of Philosophy in Skopje. Young researchers need additional courses in statistics, and they need to train themselves to use statistical software in an appropriate way.
Common pitfalls in statistical analysis: The perils of multiple testing
Ranganathan, Priya; Pramesh, C. S.; Buyse, Marc
2016-01-01
Multiple testing refers to situations where a dataset is subjected to statistical testing multiple times - either at multiple time-points or through multiple subgroups or for multiple end-points. This amplifies the probability of a false-positive finding. In this article, we look at the consequences of multiple testing and explore various methods to deal with this issue. PMID:27141478
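To make the inflation of the false-positive probability concrete, here is a minimal sketch using only the Python standard library. It assumes the tests are independent when computing the familywise error rate, and it shows two standard corrections (Bonferroni and Holm) as examples of the kind of method the article refers to; the function names are illustrative and not taken from the article.

```python
def familywise_error_rate(alpha, m):
    """Probability of at least one false positive in m independent tests at level alpha."""
    return 1.0 - (1.0 - alpha) ** m


def bonferroni_reject(p_values, alpha=0.05):
    """Bonferroni: reject hypothesis i when p_i <= alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def holm_reject(p_values, alpha=0.05):
    """Holm step-down procedure; decisions are returned in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, i in enumerate(order):
        # The smallest p-value is compared with alpha/m, the next with alpha/(m-1), ...
        if p_values[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break  # once one test fails, all larger p-values are retained
    return reject


p = [0.01, 0.02, 0.03, 0.005]
print(round(familywise_error_rate(0.05, 10), 4))
print(bonferroni_reject(p))
print(holm_reject(p))
```

With ten independent tests at the 0.05 level, the chance of at least one false positive is about 40%. On the example p-values, Holm rejects all four hypotheses while Bonferroni rejects only two, which illustrates that Holm is never less powerful than Bonferroni at the same familywise level.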
Basic statistical tools in research and data analysis
Directory of Open Access Journals (Sweden)
Zulfiqar Ali
2016-01-01
Statistical methods involved in carrying out a study include planning, designing, collecting data, analysing, drawing meaningful interpretations and reporting the research findings. Statistical analysis gives meaning to otherwise meaningless numbers, thereby breathing life into lifeless data. The results and inferences are precise only if proper statistical tests are used. This article tries to acquaint the reader with the basic research tools that are utilised while conducting various studies. It covers a brief outline of variables, an understanding of quantitative and qualitative variables, and the measures of central tendency. An overview of sample size estimation, power analysis and statistical errors is given. Finally, there is a summary of parametric and non-parametric tests used for data analysis.
Statistical analysis applied to safety culture self-assessment
International Nuclear Information System (INIS)
Macedo Soares, P.P.
2002-01-01
Interviews and opinion surveys are instruments used to assess the safety culture in an organization as part of the Safety Culture Enhancement Programme. Specific statistical tools are used to analyse the survey results. This paper presents an example of an opinion survey with the corresponding application of the statistical analysis and the conclusions obtained. Survey validation, Frequency statistics, Kolmogorov-Smirnov non-parametric test, Student (T-test) and ANOVA means comparison tests and LSD post-hoc multiple comparison test, are discussed. (author)
Statistical methods for astronomical data analysis
Chattopadhyay, Asis Kumar
2014-01-01
This book introduces “Astrostatistics” as a subject in its own right with rewarding examples, including work by the authors with galaxy and Gamma Ray Burst data to engage the reader. This includes a comprehensive blending of Astrophysics and Statistics. The first chapter’s coverage of preliminary concepts and terminologies for astronomical phenomenon will appeal to both Statistics and Astrophysics readers as helpful context. Statistics concepts covered in the book provide a methodological framework. A unique feature is the inclusion of different possible sources of astronomical data, as well as software packages for converting the raw data into appropriate forms for data analysis. Readers can then use the appropriate statistical packages for their particular data analysis needs. The ideas of statistical inference discussed in the book help readers determine how to apply statistical tests. The authors cover different applications of statistical techniques already developed or specifically introduced for ...
Statistical treatment of fatigue test data
International Nuclear Information System (INIS)
Raske, D.T.
1980-01-01
This report discussed several aspects of fatigue data analysis in order to provide a basis for the development of statistically sound design curves. Included is a discussion on the choice of the dependent variable, the assumptions associated with least squares regression models, the variability of fatigue data, the treatment of data from suspended tests and outlying observations, and various strain-life relations
Statistical Analysis of Zebrafish Locomotor Response.
Liu, Yiwen; Carmer, Robert; Zhang, Gaonan; Venkatraman, Prahatha; Brown, Skye Ashton; Pang, Chi-Pui; Zhang, Mingzhi; Ma, Ping; Leung, Yuk Fai
2015-01-01
Zebrafish larvae display rich locomotor behaviour upon external stimulation. The movement can be simultaneously tracked from many larvae arranged in multi-well plates. The resulting time-series locomotor data have been used to reveal new insights into neurobiology and pharmacology. However, the data are of large scale, and the corresponding locomotor behavior is affected by multiple factors. These issues pose a statistical challenge for comparing larval activities. To address this gap, this study has analyzed a visually-driven locomotor behaviour named the visual motor response (VMR) by the Hotelling's T-squared test. This test is congruent with comparing locomotor profiles from a time period. Different wild-type (WT) strains were compared using the test, which shows that they responded differently to light change at different developmental stages. The performance of this test was evaluated by a power analysis, which shows that the test was sensitive for detecting differences between experimental groups with sample numbers that were commonly used in various studies. In addition, this study investigated the effects of various factors that might affect the VMR by multivariate analysis of variance (MANOVA). The results indicate that the larval activity was generally affected by stage, light stimulus, their interaction, and location in the plate. Nonetheless, different factors affected larval activity differently over time, as indicated by a dynamical analysis of the activity at each second. Intriguingly, this analysis also shows that biological and technical repeats had negligible effect on larval activity. This finding is consistent with that from the Hotelling's T-squared test, and suggests that experimental repeats can be combined to enhance statistical power. Together, these investigations have established a statistical framework for analyzing VMR data, a framework that should be generally applicable to other locomotor data with similar structure.
Festing, Michael F W
2014-01-01
The safety of chemicals, drugs, novel foods and genetically modified crops is often tested using repeat-dose sub-acute toxicity tests in rats or mice. It is important to avoid misinterpretations of the results as these tests are used to help determine safe exposure levels in humans. Treated and control groups are compared for a range of haematological, biochemical and other biomarkers which may indicate tissue damage or other adverse effects. However, the statistical analysis and presentation of such data poses problems due to the large number of statistical tests which are involved. Often, it is not clear whether a "statistically significant" effect is real or a false positive (type I error) due to sampling variation. The authors' conclusions appear to be reached somewhat subjectively by the pattern of statistical significances, discounting those which they judge to be type I errors and ignoring any biomarker where the p-value is greater than 0.05. However, by using standardised effect sizes (SESs), a range of graphical methods and an overall assessment of the mean absolute response can be made. The approach is an extension, not a replacement, of existing methods. It is intended to assist toxicologists and regulators in the interpretation of the results. Here, the SES analysis has been applied to data from nine published sub-acute toxicity tests in order to compare the findings with those of the authors. Line plots, box plots and bar plots show the pattern of response. Dose-response relationships are easily seen. A "bootstrap" test compares the mean absolute differences across dose groups. In four out of seven papers where the no observed adverse effect level (NOAEL) was estimated by the authors, it was set too high according to the bootstrap test, suggesting that possible toxicity is under-estimated.
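As a rough illustration of a standardised effect size, the sketch below assumes the common definition (treated mean minus control mean, divided by the pooled standard deviation). The abstract does not give the exact formula, so this is an assumption, and the function names and biomarker data are hypothetical.

```python
import math
from statistics import mean, variance


def standardised_effect_size(treated, control):
    """(mean(treated) - mean(control)) / pooled SD. Assumed definition of an SES."""
    n_t, n_c = len(treated), len(control)
    pooled_var = ((n_t - 1) * variance(treated) + (n_c - 1) * variance(control)) / (n_t + n_c - 2)
    return (mean(treated) - mean(control)) / math.sqrt(pooled_var)


def mean_absolute_ses(treated_by_biomarker, control_by_biomarker):
    """Average |SES| over all biomarkers for one dose group (overall response)."""
    sizes = [
        abs(standardised_effect_size(treated_by_biomarker[name], control_by_biomarker[name]))
        for name in treated_by_biomarker
    ]
    return sum(sizes) / len(sizes)


# Hypothetical measurements for two biomarkers in one treated group and the controls.
treated = {"biomarker_a": [2.0, 4.0, 6.0], "biomarker_b": [1.0, 3.0, 5.0]}
control = {"biomarker_a": [1.0, 3.0, 5.0], "biomarker_b": [2.0, 4.0, 6.0]}
print(standardised_effect_size(treated["biomarker_a"], control["biomarker_a"]))
print(mean_absolute_ses(treated, control))
```

Because every biomarker is expressed in standard-deviation units, responses measured on very different scales can be plotted and averaged together, which is what makes an overall mean absolute response meaningful.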
A statistical approach to plasma profile analysis
International Nuclear Information System (INIS)
Kardaun, O.J.W.F.; McCarthy, P.J.; Lackner, K.; Riedel, K.S.
1990-05-01
A general statistical approach to the parameterisation and analysis of tokamak profiles is presented. The modelling of the profile dependence on both the radius and the plasma parameters is discussed, and pertinent, classical as well as robust, methods of estimation are reviewed. Special attention is given to statistical tests for discriminating between the various models, and to the construction of confidence intervals for the parameterised profiles and the associated global quantities. The statistical approach is shown to provide a rigorous approach to the empirical testing of plasma profile invariance. (orig.)
Statistical methods for the analysis of a screening test for chronic beryllium disease
Energy Technology Data Exchange (ETDEWEB)
Frome, E.L.; Neubert, R.L. [Oak Ridge National Lab., TN (United States). Mathematical Sciences Section; Smith, M.H.; Littlefield, L.G.; Colyer, S.P. [Oak Ridge Inst. for Science and Education, TN (United States). Medical Sciences Div.
1994-10-01
The lymphocyte proliferation test (LPT) is a noninvasive screening procedure used to identify persons who may have chronic beryllium disease. A practical problem in the analysis of LPT well counts is the occurrence of outlying data values (approximately 7% of the time). A log-linear regression model is used to describe the expected well counts for each set of test conditions. The variance of the well counts is proportional to the square of the expected counts, and two resistant regression methods are used to estimate the parameters of interest. The first approach uses least absolute values (LAV) on the log of the well counts to estimate beryllium stimulation indices (SIs) and the coefficient of variation. The second approach uses a resistant regression version of maximum quasi-likelihood estimation. A major advantage of the resistant regression methods is that it is not necessary to identify and delete outliers. These two new methods for the statistical analysis of the LPT data and the outlier rejection method that is currently being used are applied to 173 LPT assays. The authors strongly recommend the LAV method for routine analysis of the LPT.
Reliability Evaluation of Concentric Butterfly Valve Using Statistical Hypothesis Test
Energy Technology Data Exchange (ETDEWEB)
Chang, Mu Seong; Choi, Jong Sik; Choi, Byung Oh; Kim, Do Sik [Korea Institute of Machinery and Materials, Daejeon (Korea, Republic of)
2015-12-15
A butterfly valve is a type of flow-control device typically used to regulate a fluid flow. This paper presents an estimation of the shape parameter of the Weibull distribution, characteristic life, and B10 life for a concentric butterfly valve based on a statistical analysis of the reliability test data taken before and after the valve improvement. The difference in the shape and scale parameters between the existing and improved valves is reviewed using a statistical hypothesis test. The test results indicate that the shape parameter of the improved valve is similar to that of the existing valve, and that the scale parameter of the improved valve is found to have increased. These analysis results are particularly useful for a reliability qualification test and the determination of the service life cycles.
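The quantities named in this abstract follow directly from the two-parameter Weibull model, in which reliability is R(t) = exp(-(t/η)^β) with shape β and scale (characteristic life) η. The sketch below is a minimal illustration with made-up parameter values, not the paper's data: it computes the B10 life, the time by which 10% of units are expected to have failed.

```python
import math


def weibull_reliability(t, shape, scale):
    """Survival probability R(t) = exp(-(t / scale) ** shape)."""
    return math.exp(-((t / scale) ** shape))


def weibull_b_life(p, shape, scale):
    """Time by which a fraction p of units has failed: scale * (-ln(1 - p)) ** (1 / shape)."""
    return scale * (-math.log(1.0 - p)) ** (1.0 / shape)


# Hypothetical parameters: same shape before and after improvement, larger scale after.
shape = 2.0
scale_existing, scale_improved = 1000.0, 1500.0

b10_existing = weibull_b_life(0.10, shape, scale_existing)
b10_improved = weibull_b_life(0.10, shape, scale_improved)
print(b10_existing, b10_improved)
print(weibull_reliability(scale_existing, shape, scale_existing))  # exp(-1) at the characteristic life
```

When the shape parameter is unchanged, the B10 life scales in direct proportion to the scale parameter, so an increase in the scale parameter alone translates into a proportionally longer B10 life.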
Testing statistical hypotheses of equivalence
Wellek, Stefan
2010-01-01
Equivalence testing has grown significantly in importance over the last two decades, especially as its relevance to a variety of applications has become understood. Yet published work on the general methodology remains scattered in specialists' journals, and for the most part, it focuses on the relatively narrow topic of bioequivalence assessment. With a far broader perspective, Testing Statistical Hypotheses of Equivalence provides the first comprehensive treatment of statistical equivalence testing. The author addresses a spectrum of specific, two-sided equivalence testing problems, from the
Meijer, Rob R.; van Krimpen-Stoop, Edith M. L. A.
In this study a cumulative-sum (CUSUM) procedure from the theory of Statistical Process Control was modified and applied in the context of person-fit analysis in a computerized adaptive testing (CAT) environment. Six person-fit statistics were proposed using the CUSUM procedure, and three of them could be used to investigate the CAT in online test…
Statistical hypothesis testing with SAS and R
Taeger, Dirk
2014-01-01
A comprehensive guide to statistical hypothesis testing with examples in SAS and R. When analyzing datasets the following questions often arise: Is there a shorthand procedure for a statistical test available in SAS or R? If so, how do I use it? If not, how do I program the test myself? This book answers these questions and provides an overview of the most common statistical test problems in a comprehensive way, making it easy to find and perform an appropriate statistical test. A general summary of statistical test theory is presented, along with a basic description for each test, including the
Directory of Open Access Journals (Sweden)
Elżbieta Sandurska
2016-12-01
Introduction: Statistical software typically does not require extensive statistical knowledge, so even complex analyses are easy to perform. Consequently, test selection criteria and important assumptions may be overlooked or given insufficient consideration, and in such cases the results are likely to lead to wrong conclusions. Aim: To discuss issues related to assumption violations in the case of Student's t-test and one-way ANOVA, two parametric tests frequently used in the field of sports science, and to recommend solutions. Description of the state of knowledge: Student's t-test and ANOVA are parametric tests, and therefore the assumptions that need to be satisfied include normal distribution of the data and homogeneity of variances across groups. If the assumptions are violated, the original design of the test is impaired, and the test may give spurious results. A simple method to normalize the data and to stabilize the variance is to use transformations. If this approach fails, a good alternative is a nonparametric test, such as the Mann-Whitney, Kruskal-Wallis or Wilcoxon signed-rank test. Summary: Thorough verification of the assumptions of parametric tests allows correct selection of statistical tools, which is the basis of well-grounded statistical analysis. With a few simple rules, testing for patterns in data typical of sports science studies comes down to a straightforward procedure.
Statistical trend analysis methods for temporal phenomena
Energy Technology Data Exchange (ETDEWEB)
Lehtinen, E.; Pulkkinen, U. [VTT Automation, (Finland); Poern, K. [Poern Consulting, Nykoeping (Sweden)
1997-04-01
We consider point events occurring in a random way in time. In many applications the pattern of occurrence is of intrinsic interest as indicating a trend or some other systematic feature in the rate of occurrence. The purpose of this report is to survey briefly different statistical trend analysis methods and illustrate their applicability to temporal phenomena in particular. The trend testing of point events is usually seen as the testing of hypotheses concerning the intensity of the occurrence of events. When the intensity function is parametrized, the testing of trend is a typical parametric testing problem. In industrial applications the operational experience generally does not suggest any specified model and method in advance. Therefore, and particularly if the Poisson process assumption is very questionable, it is desirable to apply tests that are valid for a wide variety of possible processes. The alternative approach for trend testing is to use some non-parametric procedure. In this report we present four non-parametric tests: the Cox-Stuart test, the Wilcoxon signed ranks test, the Mann test, and the exponential ordered scores test. In addition to the classical parametric and non-parametric approaches we also consider Bayesian trend analysis. First we discuss a Bayesian model, which is based on a power law intensity model. The Bayesian statistical inferences are based on the analysis of the posterior distribution of the trend parameters, and the probability of trend is immediately seen from these distributions. We applied some of the methods discussed in an example case. It should be noted that this report is a feasibility study rather than a scientific evaluation of statistical methods, and the examples can only be seen as demonstrations of the methods. 14 refs, 10 figs.
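Of the non-parametric tests listed, the Cox-Stuart test is the simplest to sketch. The version below is a minimal standard-library illustration and not code from the report: it pairs each observation in the first half of a series with the corresponding one in the second half, discards ties, and applies a two-sided sign (binomial) test. The input series is hypothetical, for example successive times between events.

```python
from math import comb


def cox_stuart(series):
    """Return (n_increases, n_decreases, two-sided p-value) for a trend in `series`."""
    n = len(series)
    half = n // 2
    first = series[:half]
    second = series[n - half:]  # the middle observation is dropped when n is odd
    diffs = [b - a for a, b in zip(first, second) if b != a]  # ties are discarded
    m = len(diffs)
    increases = sum(1 for d in diffs if d > 0)
    decreases = m - increases
    # Under no trend, the number of increases is Binomial(m, 1/2).
    k = min(increases, decreases)
    tail = sum(comb(m, i) for i in range(k + 1)) / 2 ** m
    p_value = min(1.0, 2.0 * tail)
    return increases, decreases, p_value


# Hypothetical inter-event times that grow steadily (events becoming rarer).
times_between_events = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
print(cox_stuart(times_between_events))
```

For this strictly increasing series all six paired differences are positive, giving a two-sided p-value of 2/64 = 0.03125. The test makes no assumption about the form of the underlying process, which matches the report's motivation for non-parametric procedures when the Poisson assumption is doubtful.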
Statistical analysis of long term spatial and temporal trends of ...
Indian Academy of Sciences (India)
Statistical analysis of long term spatial and temporal trends of temperature ... CGCM3; HadCM3; modified Mann–Kendall test; statistical analysis; Sutlej basin. ... Water Resources Systems Division, National Institute of Hydrology, Roorkee 247 ...
Statistical alignment: computational properties, homology testing and goodness-of-fit
DEFF Research Database (Denmark)
Hein, J; Wiuf, Carsten; Møller, Martin
2000-01-01
The model of insertions and deletions in biological sequences, first formulated by Thorne, Kishino, and Felsenstein in 1991 (the TKF91 model), provides a basis for performing alignment within a statistical framework. Here we investigate this model. Firstly, we show how to accelerate the statistical...... alignment algorithms several orders of magnitude. The main innovations are to confine likelihood calculations to a band close to the similarity based alignment, to get good initial guesses of the evolutionary parameters and to apply an efficient numerical optimisation algorithm for finding the maximum...... analysis. Secondly, we propose a new homology test based on this model, where homology means that an ancestor to a sequence pair can be found finitely far back in time. This test has statistical advantages relative to the traditional shuffle test for proteins. Finally, we describe a goodness-of-fit test...
A novel statistic for genome-wide interaction analysis.
Directory of Open Access Journals (Sweden)
Xuesen Wu
2010-09-01
Although great progress in genome-wide association studies (GWAS) has been made, the significant SNP associations identified by GWAS account for only a few percent of the genetic variance, leading many to question where and how we can find the missing heritability. There is increasing interest in genome-wide interaction analysis as a possible source of finding heritability unexplained by current GWAS. However, the existing statistics for testing interaction have low power for genome-wide interaction analysis. To meet challenges raised by genome-wide interaction analysis, we have developed a novel statistic for testing interaction between two loci (either linked or unlinked). The null distribution and the type I error rates of the new statistic for testing interaction are validated using simulations. Extensive power studies show that the developed statistic has much higher power to detect interaction than classical logistic regression. The results identified 44 and 211 pairs of SNPs showing significant evidence of interactions with FDR<0.001 and 0.001
Coelho, Carlos A.; Marques, Filipe J.
2013-09-01
In this paper the authors combine the equicorrelation and equivariance test introduced by Wilks [13] with the likelihood ratio test (l.r.t.) for independence of groups of variables to obtain the l.r.t. of block equicorrelation and equivariance. This test or its single block version may find applications in many areas as in psychology, education, medicine, genetics and they are important "in many tests of multivariate analysis, e.g. in MANOVA, Profile Analysis, Growth Curve analysis, etc" [12, 9]. By decomposing the overall hypothesis into the hypotheses of independence of groups of variables and the hypothesis of equicorrelation and equivariance we are able to obtain the expressions for the overall l.r.t. statistic and its moments. From these we obtain a suitable factorization of the characteristic function (c.f.) of the logarithm of the l.r.t. statistic, which enables us to develop highly manageable and precise near-exact distributions for the test statistic.
Polarimetric Segmentation Using Wishart Test Statistic
DEFF Research Database (Denmark)
Skriver, Henning; Schou, Jesper; Nielsen, Allan Aasbjerg
2002-01-01
A newly developed test statistic for equality of two complex covariance matrices following the complex Wishart distribution and an associated asymptotic probability for the test statistic has been used in a segmentation algorithm. The segmentation algorithm is based on the MUM (merge using moments......) approach, which is a merging algorithm for single channel SAR images. The polarimetric version described in this paper uses the above-mentioned test statistic for merging. The segmentation algorithm has been applied to polarimetric SAR data from the Danish dual-frequency, airborne polarimetric SAR, EMISAR...
Statistical Analysis for Test Papers with Software SPSS
Institute of Scientific and Technical Information of China (English)
张燕君
2012-01-01
Test paper evaluation is an important part of test management, and its results are a significant basis for the scientific evaluation of teaching and learning. Taking an English test paper from a high school students' monthly examination as its object, this paper focuses on the interpretation of SPSS output for item-level and whole-paper quantitative analysis. Analyzing and evaluating the papers gives teachers feedback with which to check students' progress and adjust their teaching process.
EVALUATION OF A NEW MEAN SCALED AND MOMENT ADJUSTED TEST STATISTIC FOR SEM.
Tong, Xiaoxiao; Bentler, Peter M
2013-01-01
Recently a new mean scaled and skewness adjusted test statistic was developed for evaluating structural equation models in small samples and with potentially nonnormal data, but this statistic has received only limited evaluation. The performance of this statistic is compared to normal theory maximum likelihood and two well-known robust test statistics. A modification to the Satorra-Bentler scaled statistic is developed for the condition that sample size is smaller than degrees of freedom. The behavior of the four test statistics is evaluated with a Monte Carlo confirmatory factor analysis study that varies seven sample sizes and three distributional conditions obtained using Headrick's fifth-order transformation to nonnormality. The new statistic performs badly in most conditions except under the normal distribution. The goodness-of-fit χ² test based on maximum-likelihood estimation performed well under normal distributions as well as under a condition of asymptotic robustness. The Satorra-Bentler scaled test statistic performed best overall, while the mean scaled and variance adjusted test statistic outperformed the others at small and moderate sample sizes under certain distributional conditions.
DWPF Sample Vial Insert Study-Statistical Analysis of DWPF Mock-Up Test Data
International Nuclear Information System (INIS)
Harris, S.P.
1997-01-01
This report is prepared as part of Technical/QA Task Plan WSRC-RP-97-351 which was issued in response to Technical Task Request HLW/DWPF/TTR-970132 submitted by DWPF. Presented in this report is a statistical analysis of DWPF Mock-up test data for evaluation of two new analytical methods which use insert samples from the existing Hydragard™ sampler. The first is a new hydrofluoric acid based method called the Cold Chemical Method (Cold Chem) and the second is a modified fusion method. Both new methods use the existing Hydragard™ sampler to collect a smaller insert sample from the process sampling system. The insert testing methodology applies to the DWPF Slurry Mix Evaporator (SME) and the Melter Feed Tank (MFT) samples. Samples in small 3 ml containers (Inserts) are analyzed by either the cold chemical method or a modified fusion method. The current analytical method uses a Hydragard™ sample station to obtain nearly full 15 ml peanut vials. The samples are prepared by a multi-step process for Inductively Coupled Plasma (ICP) analysis by drying, vitrification, grinding and finally dissolution by either mixed acid or fusion. In contrast, the insert sample is placed directly in the dissolution vessel, thus eliminating the drying, vitrification and grinding operations for the Cold Chem method. Although the modified fusion still requires drying and calcine conversion, the process is rapid because of the decreased sample size and because no vitrification step is required. A slurry feed simulant material was acquired from the TNX pilot facility from the test run designated as PX-7. The Mock-up test data were gathered on the basis of a statistical design presented in SRT-SCS-97004 (Rev. 0). Simulant PX-7 samples were taken in the DWPF Analytical Cell Mock-up Facility using 3 ml inserts and 15 ml peanut vials. A number of the insert samples were analyzed by Cold Chem and compared with full peanut vial samples analyzed by the current methods. The remaining inserts were analyzed by
A simplification of the likelihood ratio test statistic for testing ...
African Journals Online (AJOL)
The traditional likelihood ratio test statistic for testing hypothesis about goodness of fit of multinomial probabilities in one, two and multi – dimensional contingency table was simplified. Advantageously, using the simplified version of the statistic to test the null hypothesis is easier and faster because calculating the expected ...
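For orientation, the traditional statistic that this abstract starts from can be sketched in a few lines. The sketch below computes the usual likelihood ratio statistic G² = 2 Σ O·ln(O/E) for a one-dimensional table; it is not the simplified version proposed in the paper, whose exact form is not given here. The function name and the counts are hypothetical.

```python
import math

def likelihood_ratio_statistic(observed, probs):
    """Traditional G^2 = 2 * sum(O_i * ln(O_i / E_i)), with E_i = n * p_i."""
    n = sum(observed)
    total = 0.0
    for o, p in zip(observed, probs):
        if o > 0:  # an empty cell contributes 0 (limit of x * ln x as x -> 0)
            total += o * math.log(o / (n * p))
    return 2.0 * total

# Hypothetical counts tested against equal cell probabilities.
counts = [18, 22, 20, 40]
null_probs = [0.25, 0.25, 0.25, 0.25]
g2 = likelihood_ratio_statistic(counts, null_probs)
```

Under the null hypothesis G² is compared with a chi-square distribution with (number of cells minus 1) degrees of freedom. The expected frequencies `n * p` are exactly the step the simplified statistic is said to avoid.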
Statistical hot spot analysis of reactor cores
International Nuclear Information System (INIS)
Schaefer, H.
1974-05-01
This report is an introduction into statistical hot spot analysis. After the definition of the term 'hot spot' a statistical analysis is outlined. The mathematical method is presented, especially the formula concerning the probability of no hot spots in a reactor core is evaluated. A discussion with the boundary conditions of a statistical hot spot analysis is given (technological limits, nominal situation, uncertainties). The application of the hot spot analysis to the linear power of pellets and the temperature rise in cooling channels is demonstrated with respect to the test zone of KNK II. Basic values, such as probability of no hot spots, hot spot potential, expected hot spot diagram and cumulative distribution function of hot spots, are discussed. It is shown, that the risk of hot channels can be dispersed equally over all subassemblies by an adequate choice of the nominal temperature distribution in the core
Statistical evaluation of diagnostic performance topics in ROC analysis
Zou, Kelly H; Bandos, Andriy I; Ohno-Machado, Lucila; Rockette, Howard E
2016-01-01
Statistical evaluation of diagnostic performance in general and Receiver Operating Characteristic (ROC) analysis in particular are important for assessing the performance of medical tests and statistical classifiers, as well as for evaluating predictive models or algorithms. This book presents innovative approaches in ROC analysis, which are relevant to a wide variety of applications, including medical imaging, cancer research, epidemiology, and bioinformatics. Statistical Evaluation of Diagnostic Performance: Topics in ROC Analysis covers areas including monotone-transformation techniques in parametric ROC analysis, ROC methods for combined and pooled biomarkers, Bayesian hierarchical transformation models, sequential designs and inferences in the ROC setting, predictive modeling, multireader ROC analysis, and free-response ROC (FROC) methodology. The book is suitable for graduate-level students and researchers in statistics, biostatistics, epidemiology, public health, biomedical engineering, radiology, medi...
DWPF Sample Vial Insert Study-Statistical Analysis of DWPF Mock-Up Test Data
Energy Technology Data Exchange (ETDEWEB)
Harris, S.P. [Westinghouse Savannah River Company, AIKEN, SC (United States)
1997-09-18
This report is prepared as part of Technical/QA Task Plan WSRC-RP-97-351 which was issued in response to Technical Task Request HLW/DWPF/TTR-970132 submitted by DWPF. Presented in this report is a statistical analysis of DWPF Mock-up test data for evaluation of two new analytical methods which use insert samples from the existing Hydragard™ sampler. The first is a new hydrofluoric acid based method called the Cold Chemical Method (Cold Chem) and the second is a modified fusion method. Either new DWPF analytical method could result in a two to three fold improvement in sample analysis time. Both new methods use the existing Hydragard™ sampler to collect a smaller insert sample from the process sampling system. The insert testing methodology applies to the DWPF Slurry Mix Evaporator (SME) and the Melter Feed Tank (MFT) samples. The insert sample is named after the initial trials which placed the container inside the sample (peanut) vials. Samples in small 3 ml containers (Inserts) are analyzed by either the cold chemical method or a modified fusion method. The current analytical method uses a Hydragard™ sample station to obtain nearly full 15 ml peanut vials. The samples are prepared by a multi-step process for Inductively Coupled Plasma (ICP) analysis by drying, vitrification, grinding and finally dissolution by either mixed acid or fusion. In contrast, the insert sample is placed directly in the dissolution vessel, thus eliminating the drying, vitrification and grinding operations for the Cold Chem method. Although the modified fusion still requires drying and calcine conversion, the process is rapid because of the decreased sample size and because no vitrification step is required. A slurry feed simulant material was acquired from the TNX pilot facility from the test run designated as PX-7. The Mock-up test data were gathered on the basis of a statistical design presented in SRT-SCS-97004 (Rev. 0). Simulant PX-7 samples were taken in the DWPF Analytical Cell Mock
Explorations in Statistics: Hypothesis Tests and P Values
Curran-Everett, Douglas
2009-01-01
Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This second installment of "Explorations in Statistics" delves into test statistics and P values, two concepts fundamental to the test of a scientific null hypothesis. The essence of a test statistic is that it compares what…
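The two concepts named in this abstract, a test statistic and its P value, can be explored with a small sketch. It assumes the simplest textbook case, a two-sided z test of a mean with known standard deviation, which is not necessarily the example used in the article. The function name and the data are hypothetical.

```python
import math
from statistics import NormalDist, mean

def z_test(sample, mu0, sigma):
    """Two-sided z test of H0: population mean == mu0, with known sigma."""
    standard_error = sigma / math.sqrt(len(sample))
    z = (mean(sample) - mu0) / standard_error        # test statistic
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))       # two-sided P value
    return z, p

# Hypothetical measurements; null mean and sigma are assumed for illustration.
sample = [5.1, 4.9, 5.6, 5.8, 5.3, 5.4, 5.0, 5.7]
z, p = z_test(sample, mu0=5.0, sigma=0.4)
```

The statistic expresses how far the observed mean lies from the null value in units of its standard error. The P value is the probability, if the null hypothesis were true, of a statistic at least that extreme.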
Statistical data analysis using SAS intermediate statistical methods
Marasinghe, Mervyn G
2018-01-01
The aim of this textbook (previously titled SAS for Data Analytics) is to teach the use of SAS for statistical analysis of data for advanced undergraduate and graduate students in statistics, data science, and disciplines involving analyzing data. The book begins with an introduction beyond the basics of SAS, illustrated with non-trivial, real-world, worked examples. It proceeds to SAS programming and applications, SAS graphics, statistical analysis of regression models, analysis of variance models, analysis of variance with random and mixed effects models, and then takes the discussion beyond regression and analysis of variance to conclude. Pedagogically, the authors introduce theory and methodological basis topic by topic, present a problem as an application, followed by a SAS analysis of the data provided and a discussion of results. The text focuses on applied statistical problems and methods. Key features include: end of chapter exercises, downloadable SAS code and data sets, and advanced material suitab...
The statistical analysis of anisotropies
International Nuclear Information System (INIS)
Webster, A.
1977-01-01
One of the many uses to which a radio survey may be put is an analysis of the distribution of the radio sources on the celestial sphere to find out whether they are bunched into clusters or lie in preferred regions of space. There are many methods of testing for clustering in point processes, and since they are not all equally good this contribution is presented as a brief guide to what seems to be the best of them. The radio sources certainly do not show very strong clustering and may well be entirely unclustered, so if a statistical method is to be useful it must be both powerful and flexible. A statistic is powerful in this context if it can efficiently distinguish a weakly clustered distribution of sources from an unclustered one, and it is flexible if it can be applied in a way which avoids mistaking defects in the survey for true peculiarities in the distribution of sources. The paper divides clustering statistics into two classes: number density statistics and log N/log S statistics. (Auth.)
Distinguish Dynamic Basic Blocks by Structural Statistical Testing
DEFF Research Database (Denmark)
Petit, Matthieu; Gotlieb, Arnaud
Statistical testing aims at generating random test data that respect selected probabilistic properties. A distribution probability is associated with the program input space in order to achieve statistical test purpose: to test the most frequent usage of software or to maximize the probability of...... control flow path) during the test data selection. We implemented this algorithm in a statistical test data generator for Java programs. A first experimental validation is presented...
Tuuli, Methodius G; Odibo, Anthony O
2011-08-01
The objective of this article is to discuss the rationale for common statistical tests used for the analysis and interpretation of prenatal diagnostic imaging studies. Examples from the literature are used to illustrate descriptive and inferential statistics. The uses and limitations of linear and logistic regression analyses are discussed in detail.
Analysis of statistical misconception in terms of statistical reasoning
Maryati, I.; Priatna, N.
2018-05-01
Everyone needs reasoning skills in the era of globalization, because every person has to be able to manage and use information that can easily be obtained from all over the world. Statistical reasoning skill is the ability to collect, group, process, interpret, and draw conclusions from information. This skill can be developed at various levels of education. However, it remains low because many people, students included, assume that statistics is just the ability to count and use formulas. Students also still have a negative attitude toward research-related courses. The purpose of this research is to analyze the effect of students' misconceptions in a descriptive statistics course on their statistical reasoning skill. The observation was done by analyzing the results of a misconception test and a statistical reasoning skill test, and by observing the effect of students' misconceptions on statistical reasoning skill. The sample was 32 students of a mathematics education department who had taken the descriptive statistics course. The mean score on the misconception test was 49.7 with a standard deviation of 10.6, whereas the mean score on the statistical reasoning skill test was 51.8 with a standard deviation of 8.5. If a minimum score of 65 is taken as the standard for course competence, the students' mean scores are below the standard. The results of the misconception study indicate which subtopics should be given attention. Based on the assessment, students' misconceptions occur in: 1) writing mathematical sentences and symbols correctly, 2) understanding basic definitions, and 3) determining the concept to be used in solving a problem. For statistical reasoning skill, the assessment measured reasoning about: 1) data, 2) representation, 3) statistic format, 4) probability, 5) sample, and 6) association.
Statistical analysis of brake squeal noise
Oberst, S.; Lai, J. C. S.
2011-06-01
Despite substantial research efforts applied to the prediction of brake squeal noise since the early 20th century, the mechanisms behind its generation are still not fully understood. Squealing brakes are of significant concern to the automobile industry, mainly because of the costs associated with warranty claims. In order to remedy the problems inherent in designing quieter brakes and, therefore, to understand the mechanisms, a design of experiments study, using a noise dynamometer, was performed by a brake system manufacturer to determine the influence of geometrical parameters (namely, the number and location of slots) of brake pads on brake squeal noise. The experimental results were evaluated with a noise index and ranked for warm and cold brake stops. These data are analysed here using statistical descriptors based on population distributions, and a correlation analysis, to gain greater insight into the functional dependency between the time-averaged friction coefficient as the input and the peak sound pressure level data as the output quantity. The correlation analysis between the time-averaged friction coefficient and peak sound pressure data is performed by applying a semblance analysis and a joint recurrence quantification analysis. Linear measures are compared with complexity measures (nonlinear) based on statistics from the underlying joint recurrence plots. Results show that linear measures cannot be used to rank the noise performance of the four test pad configurations. On the other hand, the ranking of the noise performance of the test pad configurations based on the noise index agrees with that based on nonlinear measures: the higher the nonlinearity between the time-averaged friction coefficient and peak sound pressure, the worse the squeal. These results highlight the nonlinear character of brake squeal and indicate the potential of using nonlinear statistical analysis tools to analyse disc brake squeal.
Testing statistical isotropy in cosmic microwave background polarization maps
Rath, Pranati K.; Samal, Pramoda Kumar; Panda, Srikanta; Mishra, Debesh D.; Aluri, Pavan K.
2018-04-01
We apply our symmetry based Power tensor technique to test conformity of PLANCK Polarization maps with statistical isotropy. On a wide range of angular scales (l = 40 - 150), our preliminary analysis detects many statistically anisotropic multipoles in foreground cleaned full sky PLANCK polarization maps viz., COMMANDER and NILC. We also study the effect of residual foregrounds that may still be present in the Galactic plane using both common UPB77 polarization mask, as well as the individual component separation method specific polarization masks. However, some of the statistically anisotropic modes still persist, albeit significantly in NILC map. We further probed the data for any coherent alignments across multipoles in several bins from the chosen multipole range.
HistFitter software framework for statistical data analysis
Energy Technology Data Exchange (ETDEWEB)
Baak, M. [CERN, Geneva (Switzerland); Besjes, G.J. [Radboud University Nijmegen, Nijmegen (Netherlands); Nikhef, Amsterdam (Netherlands); Cote, D. [University of Texas, Arlington (United States); Koutsman, A. [TRIUMF, Vancouver (Canada); Lorenz, J. [Ludwig-Maximilians-Universitaet Muenchen, Munich (Germany); Excellence Cluster Universe, Garching (Germany); Short, D. [University of Oxford, Oxford (United Kingdom)
2015-04-15
We present a software framework for statistical data analysis, called HistFitter, that has been used extensively by the ATLAS Collaboration to analyze big datasets originating from proton-proton collisions at the Large Hadron Collider at CERN. Since 2012 HistFitter has been the standard statistical tool in searches for supersymmetric particles performed by ATLAS. HistFitter is a programmable and flexible framework to build, book-keep, fit, interpret and present results of data models of nearly arbitrary complexity. Starting from an object-oriented configuration, defined by users, the framework builds probability density functions that are automatically fit to data and interpreted with statistical tests. Internally HistFitter uses the statistics packages RooStats and HistFactory. A key innovation of HistFitter is its design, which is rooted in analysis strategies of particle physics. The concepts of control, signal and validation regions are woven into its fabric. These are progressively treated with statistically rigorous built-in methods. Being capable of working with multiple models at once that describe the data, HistFitter introduces an additional level of abstraction that allows for easy bookkeeping, manipulation and testing of large collections of signal hypotheses. Finally, HistFitter provides a collection of tools to present results with publication quality style through a simple command-line interface. (orig.)
Assessment of the beryllium lymphocyte proliferation test using statistical process control.
Cher, Daniel J; Deubner, David C; Kelsh, Michael A; Chapman, Pamela S; Ray, Rose M
2006-10-01
Despite more than 20 years of surveillance and epidemiologic studies using the beryllium blood lymphocyte proliferation test (BeBLPT) as a measure of beryllium sensitization (BeS) and as an aid for diagnosing subclinical chronic beryllium disease (CBD), improvements in specific understanding of the inhalation toxicology of CBD have been limited. Although epidemiologic data suggest that BeS and CBD risks vary by process/work activity, it has proven difficult to reach specific conclusions regarding the dose-response relationship between workplace beryllium exposure and BeS or subclinical CBD. One possible reason for this uncertainty could be misclassification of BeS resulting from variation in BeBLPT testing performance. The reliability of the BeBLPT, a biological assay that measures beryllium sensitization, is unknown. To assess the performance of four laboratories that conducted this test, we used data from a medical surveillance program that offered testing for beryllium sensitization with the BeBLPT. The study population was workers exposed to beryllium at various facilities over a 10-year period (1992-2001). Workers with abnormal results were offered diagnostic workups for CBD. Our analyses used a standard statistical technique, statistical process control (SPC), to evaluate test reliability. The study design involved a repeated measures analysis of BeBLPT results generated from the company-wide, longitudinal testing. Analytical methods included use of (1) statistical process control charts that examined temporal patterns of variation for the stimulation index, a measure of cell reactivity to beryllium; (2) correlation analysis that compared prior perceptions of BeBLPT instability to the statistical measures of test variation; and (3) assessment of the variation in the proportion of missing test results and how time periods with more missing data influenced SPC findings. During the period of this study, all laboratories displayed variation in test results that
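A control chart of the kind mentioned in item (1) can be sketched as follows. The abstract does not say which chart type the authors used, so the sketch assumes a Shewhart individuals chart with limits derived from the average moving range. The stimulation index values and the function names are hypothetical.

```python
from statistics import mean

def individuals_chart_limits(baseline):
    """Lower limit, center line and upper limit of an individuals (X) chart."""
    moving_ranges = [abs(b - a) for a, b in zip(baseline, baseline[1:])]
    center = mean(baseline)
    # 2.66 = 3 / d2, where d2 = 1.128 is the control-chart constant for n = 2.
    half_width = 2.66 * mean(moving_ranges)
    return center - half_width, center, center + half_width

def out_of_control(values, baseline):
    """Indices of values that fall outside the limits set by the baseline."""
    lcl, _, ucl = individuals_chart_limits(baseline)
    return [i for i, v in enumerate(values) if v < lcl or v > ucl]

# Hypothetical stimulation index readings from one laboratory over time.
baseline = [1.0, 1.2, 0.9, 1.1, 1.0, 1.3, 0.8, 1.1]
new_readings = [1.0, 1.2, 2.9, 1.1]
flagged = out_of_control(new_readings, baseline)
```

Points outside the limits signal variation larger than the baseline process would be expected to produce. That kind of temporal instability is what the study looks for in each laboratory.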
Conjunction analysis and propositional logic in fMRI data analysis using Bayesian statistics.
Rudert, Thomas; Lohmann, Gabriele
2008-12-01
To evaluate logical expressions over different effects in data analyses using the general linear model (GLM) and to evaluate logical expressions over different posterior probability maps (PPMs). In functional magnetic resonance imaging (fMRI) data analysis, the GLM was applied to estimate unknown regression parameters. Based on the GLM, Bayesian statistics can be used to determine the probability of conjunction, disjunction, implication, or any other arbitrary logical expression over different effects or contrast. For second-level inferences, PPMs from individual sessions or subjects are utilized. These PPMs can be combined to a logical expression and its probability can be computed. The methods proposed in this article are applied to data from a STROOP experiment and the methods are compared to conjunction analysis approaches for test-statistics. The combination of Bayesian statistics with propositional logic provides a new approach for data analyses in fMRI. Two different methods are introduced for propositional logic: the first for analyses using the GLM and the second for common inferences about different probability maps. The methods introduced extend the idea of conjunction analysis to a full propositional logic and adapt it from test-statistics to Bayesian statistics. The new approaches allow inferences that are not possible with known standard methods in fMRI. (c) 2008 Wiley-Liss, Inc.
Statistical analysis of questionnaires a unified approach based on R and Stata
Bartolucci, Francesco; Gnaldi, Michela
2015-01-01
Statistical Analysis of Questionnaires: A Unified Approach Based on R and Stata presents special statistical methods for analyzing data collected by questionnaires. The book takes an applied approach to testing and measurement tasks, mirroring the growing use of statistical methods and software in education, psychology, sociology, and other fields. It is suitable for graduate students in applied statistics and psychometrics and practitioners in education, health, and marketing. The book covers the foundations of classical test theory (CTT), test reliability, va
Pestman, Wiebe R
2009-01-01
This textbook provides a broad and solid introduction to mathematical statistics, including the classical subjects hypothesis testing, normal regression analysis, and normal analysis of variance. In addition, non-parametric statistics and vectorial statistics are considered, as well as applications of stochastic analysis in modern statistics, e.g., Kolmogorov-Smirnov testing, smoothing techniques, robustness and density estimation. For students with some elementary mathematical background. With many exercises. Prerequisites from measure theory and linear algebra are presented.
Statistical approach for collaborative tests, reference material certification procedures
International Nuclear Information System (INIS)
Fangmeyer, H.; Haemers, L.; Larisse, J.
1977-01-01
The first part introduces the different aspects of organizing and executing intercomparison tests of chemical or physical quantities. This is followed by a description of a statistical procedure for handling the data collected in a circular analysis. Finally, an example demonstrates how the tool can be applied and which conclusions can be drawn from the results obtained.
Statistical tests for the Gaussian nature of primordial fluctuations through CBR experiments
International Nuclear Information System (INIS)
Luo, X.
1994-01-01
Information about the physical processes that generate the primordial fluctuations in the early Universe can be gained by testing the Gaussian nature of the fluctuations through cosmic microwave background radiation (CBR) temperature anisotropy experiments. One of the crucial aspects of density perturbations that are produced by the standard inflation scenario is that they are Gaussian, whereas seeds produced by topological defects left over from an early cosmic phase transition tend to be non-Gaussian. To carry out this test, sophisticated statistical tools are required. In this paper, we will discuss several such statistical tools, including multivariate skewness and kurtosis, Euler-Poincare characteristics, the three-point temperature correlation function, and Hotelling's T² statistic defined through bispectral estimates of a one-dimensional data set. The effect of noise present in the current data is discussed in detail and the COBE 53 GHz data set is analyzed. Our analysis shows that, on the large angular scale to which COBE is sensitive, the statistics are probably Gaussian. On the small angular scales, the importance of Hotelling's T² statistic is stressed, and the minimum sample size required to test Gaussianity is estimated. Although the current data set available from various experiments at half-degree scales is still too small, improvement of the data set by roughly a factor of 2 will be enough to test the Gaussianity statistically. On the arc min scale, we analyze the recent RING data through bispectral analysis, and the result indicates possible deviation from Gaussianity. Effects of point sources are also discussed. It is pointed out that the Gaussianity problem can be resolved in the near future by ground-based or balloon-borne experiments
Simplified Freeman-Tukey test statistics for testing probabilities in ...
African Journals Online (AJOL)
This paper presents the simplified version of the Freeman-Tukey test statistic for testing hypothesis about multinomial probabilities in one, two and multidimensional contingency tables that does not require calculating the expected cell frequencies before test of significance. The simplified method established new criteria of ...
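As a point of comparison, the conventional Freeman-Tukey statistic, which does require the expected cell frequencies, can be sketched as below. It uses the commonly quoted form T² = Σ (√O + √(O+1) − √(4E+1))². The paper's simplified version is not reproduced here because the abstract does not state it. The function name and the counts are hypothetical.

```python
import math

def freeman_tukey_statistic(observed, probs):
    """Conventional Freeman-Tukey T^2 for multinomial goodness of fit."""
    n = sum(observed)
    total = 0.0
    for o, p in zip(observed, probs):
        expected = n * p  # the quantity a simplified form would avoid computing
        deviate = math.sqrt(o) + math.sqrt(o + 1) - math.sqrt(4 * expected + 1)
        total += deviate ** 2
    return total

# Hypothetical counts tested against equal cell probabilities.
counts = [18, 22, 20, 40]
null_probs = [0.25, 0.25, 0.25, 0.25]
ft = freeman_tukey_statistic(counts, null_probs)
```

Like the Pearson and likelihood ratio statistics, T² is referred to a chi-square distribution with (number of cells minus 1) degrees of freedom for a fully specified null hypothesis.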
New Graphical Methods and Test Statistics for Testing Composite Normality
Directory of Open Access Journals (Sweden)
Marc S. Paolella
2015-07-01
Several graphical methods for testing univariate composite normality from an i.i.d. sample are presented. They are endowed with correct simultaneous error bounds and yield size-correct tests. As all are based on the empirical CDF, they are also consistent for all alternatives. For one test, called the modified stabilized probability test, or MSP, a highly simplified computational method is derived, which delivers the test statistic and also a highly accurate p-value approximation, essentially instantaneously. The MSP test is demonstrated to have higher power against asymmetric alternatives than the well-known and powerful Jarque-Bera test. A further size-correct test, based on combining two test statistics, is shown to have yet higher power. The methodology employed is fully general and can be applied to any i.i.d. univariate continuous distribution setting.
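The Jarque-Bera test used as the benchmark in this abstract is simple enough to sketch directly. The MSP test itself is not reproduced, since the abstract does not give its formula. The sketch uses the standard definition JB = n/6 · (S² + (K − 3)²/4) with the asymptotic chi-square (2 degrees of freedom) p-value; the function name and the data are hypothetical.

```python
import math

def jarque_bera(sample):
    """Jarque-Bera statistic and its asymptotic p-value."""
    n = len(sample)
    m = sum(sample) / n
    m2 = sum((x - m) ** 2 for x in sample) / n   # central moments
    m3 = sum((x - m) ** 3 for x in sample) / n
    m4 = sum((x - m) ** 4 for x in sample) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    # Survival function of a chi-square with 2 df is exp(-x / 2).
    p_value = math.exp(-jb / 2.0)
    return jb, p_value

jb_symmetric, p_symmetric = jarque_bera([-2, -1, 0, 1, 2])
jb_skewed, p_skewed = jarque_bera([1, 1, 1, 1, 1, 1, 1, 1, 1, 20])
```

The asymptotic p-value is known to be inaccurate in small samples. Size-correct procedures such as those described in the abstract address that limitation.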
Kepler Planet Detection Metrics: Statistical Bootstrap Test
Jenkins, Jon M.; Burke, Christopher J.
2016-01-01
This document describes the data produced by the Statistical Bootstrap Test over the final three Threshold Crossing Event (TCE) deliveries to NExScI: SOC 9.1 (Q1-Q16) (Tenenbaum et al. 2014), SOC 9.2 (Q1-Q17) aka DR24 (Seader et al. 2015), and SOC 9.3 (Q1-Q17) aka DR25 (Twicken et al. 2016). The last few years have seen significant improvements in the SOC science data processing pipeline, leading to higher quality light curves and more sensitive transit searches. The statistical bootstrap analysis results presented here and the numerical results archived at NASA's Exoplanet Science Institute (NExScI) bear witness to these software improvements. This document attempts to introduce and describe the main features and differences between these three data sets as a consequence of the software changes.
Van Bockstaele, Femke; Janssens, Ann; Piette, Anne; Callewaert, Filip; Pede, Valerie; Offner, Fritz; Verhasselt, Bruno; Philippé, Jan
2006-07-15
ZAP-70 has been proposed as a surrogate marker for immunoglobulin heavy-chain variable region (IgV(H)) mutation status, which is known as a prognostic marker in B-cell chronic lymphocytic leukemia (CLL). The flow cytometric analysis of ZAP-70 suffers from difficulties in standardization and interpretation. We applied the Kolmogorov-Smirnov (KS) statistical test to make analysis more straightforward. We examined ZAP-70 expression by flow cytometry in 53 patients with CLL. Analysis was performed as initially described by Crespo et al. (New England J Med 2003; 348:1764-1775) and alternatively by application of the KS statistical test comparing T cells with B cells. Receiver-operating-characteristics (ROC)-curve analyses were performed to determine the optimal cut-off values for ZAP-70 measured by the two approaches. ZAP-70 protein expression was compared with ZAP-70 mRNA expression measured by a quantitative PCR (qPCR) and with the IgV(H) mutation status. Both flow cytometric analyses correlated well with the molecular technique and proved to be of equal value in predicting the IgV(H) mutation status. Applying the KS test is reproducible, simple, straightforward, and overcomes a number of difficulties encountered in the Crespo-method. The KS statistical test is an essential part of the software delivered with modern routine analytical flow cytometers and is well suited for analysis of ZAP-70 expression in CLL. (c) 2006 International Society for Analytical Cytology.
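The quantity behind the KS comparison of T cells with B cells is the two-sample Kolmogorov-Smirnov statistic, the largest vertical distance between two empirical distribution functions. A minimal sketch follows. The fluorescence values and the function name are hypothetical, and flow cytometry software may compute the statistic on binned histograms instead.

```python
from bisect import bisect_right

def ks_two_sample_statistic(a, b):
    """D = max over x of |F_a(x) - F_b(x)| for two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    d = 0.0
    # The supremum is attained at one of the observed data points.
    for x in a + b:
        f_a = bisect_right(a, x) / len(a)
        f_b = bisect_right(b, x) / len(b)
        d = max(d, abs(f_a - f_b))
    return d

# Hypothetical ZAP-70 fluorescence intensities (arbitrary units).
t_cells = [1, 2, 3, 4]
b_cells = [3, 4, 5, 6]
d_stat = ks_two_sample_statistic(t_cells, b_cells)
```

A value of D near 0 means the two populations have nearly identical distributions. A value near 1 means they barely overlap.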
Statistical testing of association between menstruation and migraine.
Barra, Mathias; Dahl, Fredrik A; Vetvik, Kjersti G
2015-02-01
To repair and refine a previously proposed method for statistical analysis of association between migraine and menstruation. Menstrually related migraine (MRM) affects about 20% of female migraineurs in the general population. The exact pathophysiological link from menstruation to migraine is hypothesized to be through fluctuations in female reproductive hormones, but the exact mechanisms remain unknown. Therefore, the main diagnostic criterion today is concurrency of migraine attacks with menstruation. Methods aiming to exclude spurious associations are wanted, so that further research into these mechanisms can be performed on a population with a true association. The statistical method is based on a simple two-parameter null model of MRM (which allows for simulation modeling), and Fisher's exact test (with mid-p correction) applied to standard 2 × 2 contingency tables derived from the patients' headache diaries. Our method is a corrected version of a previously published flawed framework. To our best knowledge, no other published methods for establishing a menstruation-migraine association by statistical means exist today. The probabilistic methodology shows good performance when subjected to receiver operator characteristic curve analysis. Quick reference cutoff values for the clinical setting were tabulated for assessing association given a patient's headache history. In this paper, we correct a proposed method for establishing association between menstruation and migraine by statistical methods. We conclude that the proposed standard of 3-cycle observations prior to setting an MRM diagnosis should be extended with at least one perimenstrual window to obtain sufficient information for statistical processing. © 2014 American Headache Society.
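The core computation named in this abstract, Fisher's exact test with a mid-p correction on a 2 × 2 table, can be sketched as follows. The sketch assumes a one-sided alternative (more migraine days inside the perimenstrual window than expected by chance). The abstract does not say whether the authors used a one- or two-sided version, and the diary counts and function name are hypothetical.

```python
from math import comb

def fisher_mid_p(a, b, c, d):
    """One-sided (greater) mid-p value for the 2x2 table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d

    def pmf(k):
        # Hypergeometric probability of k in the top-left cell, margins fixed.
        return comb(col1, k) * comb(n - col1, row1 - k) / comb(n, row1)

    k_max = min(row1, col1)
    upper_tail = sum(pmf(k) for k in range(a + 1, k_max + 1))
    # Mid-p: count only half of the probability of the observed table.
    return upper_tail + 0.5 * pmf(a)

# Hypothetical diary: rows = window / non-window days,
# columns = migraine / no migraine.
p_mid = fisher_mid_p(6, 9, 4, 71)
```

The mid-p correction makes the test less conservative than the ordinary Fisher exact test. This matters for the short diaries typical of the clinical setting.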
Precision Statistical Analysis of Images Based on Brightness Distribution
Directory of Open Access Journals (Sweden)
Muzhir Shaban Al-Ani
2017-07-01
The study of image content is an important topic that aims to produce reasonable and accurate analyses of images. Image analysis has recently become a vital field because of the huge number of images transferred via transmission media in daily life, and this volume has drawn attention to image analysis as a research area. In this paper, the implemented system passes through several steps to compute the standard deviation and mean values of both color and grey images. The last step of the proposed method compares the results obtained in the different cases of the test phase. The statistical parameters are used to characterize the content of an image and its texture. Standard deviation, mean and correlation values are used to study the intensity distribution of the tested images. Reasonable results are obtained for both standard deviation and mean value via the implementation of the system. The work concentrates on brightness distribution, assessed via statistical measures under different types of lighting.
Statistical considerations on safety analysis
International Nuclear Information System (INIS)
Pal, L.; Makai, M.
2004-01-01
statement is true. In some cases statistical aspects of safety are misused, where the number of runs for several outputs is correct only for statistically independent outputs, or misinterpreted. We do not know the probability distribution of the output variables subjected to safety limitations. At the same time in some asymmetric distributions the 0.95/0.95 methodology simply fails: if we repeat the calculations in many cases we would get a value higher than the basic value, which means the limit violation in the calculation becomes more and more probable in the repeated analysis. Consequent application of order statistics or the application of the sign test may offer a way out of the present situation. The authors are also convinced that efforts should be made to study the statistics of the output variables, and to study the occurrence of chaos in the analyzed cases. All these observations should influence, in safety analysis, the application of best estimate methods, and underline the opinion that any realistic modeling and simulation of complex systems must include the probabilistic features of the system and the environment
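For orientation, the run count usually associated with a 0.95/0.95 statement can be reproduced with a few lines. The sketch below assumes the simplest setting only: a single output variable and a one-sided tolerance limit taken as the largest value of n independent runs, so that the confidence is 1 - coverage^n. It does not cover the multi-output or asymmetric cases the authors warn about, and the function name is hypothetical.

```python
def runs_needed(coverage=0.95, confidence=0.95):
    """Smallest n such that the maximum of n runs bounds the `coverage` quantile
    with probability at least `confidence` (one output, one-sided, first order)."""
    n = 1
    while 1.0 - coverage ** n < confidence:
        n += 1
    return n


n_runs = runs_needed()
print(n_runs)  # 59 for the 0.95/0.95 case
```

The result holds whatever the distribution of the output is, which is why order statistics are attractive when that distribution is unknown.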
Log-concave Probability Distributions: Theory and Statistical Testing
DEFF Research Database (Denmark)
An, Mark Yuing
1996-01-01
This paper studies the broad class of log-concave probability distributions that arise in economics of uncertainty and information. For univariate, continuous, and log-concave random variables we prove useful properties without imposing the differentiability of density functions. Discrete and multivariate distributions are also discussed. We propose simple non-parametric testing procedures for log-concavity. The test statistics are constructed to test one of the two implications of log-concavity: increasing hazard rates and new-is-better-than-used (NBU) property. The tests for increasing hazard rates are based on normalized spacing of the sample order statistics. The tests for NBU property fall into the category of Hoeffding's U-statistics...
The Statistical Analysis Techniques to Support the NGNP Fuel Performance Experiments
International Nuclear Information System (INIS)
Pham, Binh T.; Einerson, Jeffrey J.
2010-01-01
This paper describes the development and application of statistical analysis techniques to support the AGR experimental program on NGNP fuel performance. The experiments conducted in the Idaho National Laboratory's Advanced Test Reactor employ fuel compacts placed in a graphite cylinder shrouded by a steel capsule. The tests are instrumented with thermocouples embedded in graphite blocks and the target quantity (fuel/graphite temperature) is regulated by the He-Ne gas mixture that fills the gap volume. Three techniques for statistical analysis, namely control charting, correlation analysis, and regression analysis, are implemented in the SAS-based NGNP Data Management and Analysis System (NDMAS) for automated processing and qualification of the AGR measured data. The NDMAS also stores daily neutronic (power) and thermal (heat transfer) code simulation results along with the measurement data, allowing for their combined use and comparative scrutiny. The ultimate objective of this work includes (a) a multi-faceted system for data monitoring and data accuracy testing, (b) identification of possible modes of diagnostics deterioration and changes in experimental conditions, (c) qualification of data for use in code validation, and (d) identification and use of data trends to support effective control of test conditions with respect to the test target. Analysis results and examples given in the paper show the three statistical analysis techniques providing a complementary capability to warn of thermocouple failures. It also suggests that the regression analysis models relating calculated fuel temperatures and thermocouple readings can enable online regulation of experimental parameters (i.e. gas mixture content), to effectively maintain the target quantity (fuel temperature) within a given range.
The statistical analysis techniques to support the NGNP fuel performance experiments
Energy Technology Data Exchange (ETDEWEB)
Pham, Binh T., E-mail: Binh.Pham@inl.gov; Einerson, Jeffrey J.
2013-10-15
This paper describes the development and application of statistical analysis techniques to support the Advanced Gas Reactor (AGR) experimental program on Next Generation Nuclear Plant (NGNP) fuel performance. The experiments conducted in the Idaho National Laboratory’s Advanced Test Reactor employ fuel compacts placed in a graphite cylinder shrouded by a steel capsule. The tests are instrumented with thermocouples embedded in graphite blocks and the target quantity (fuel temperature) is regulated by the He–Ne gas mixture that fills the gap volume. Three techniques for statistical analysis, namely control charting, correlation analysis, and regression analysis, are implemented in the NGNP Data Management and Analysis System for automated processing and qualification of the AGR measured data. The neutronic and thermal code simulation results are used for comparative scrutiny. The ultimate objective of this work includes (a) a multi-faceted system for data monitoring and data accuracy testing, (b) identification of possible modes of diagnostics deterioration and changes in experimental conditions, (c) qualification of data for use in code validation, and (d) identification and use of data trends to support effective control of test conditions with respect to the test target. Analysis results and examples given in the paper show the three statistical analysis techniques providing a complementary capability to warn of thermocouple failures. It also suggests that the regression analysis models relating calculated fuel temperatures and thermocouple readings can enable online regulation of experimental parameters (i.e. gas mixture content), to effectively maintain the fuel temperature within a given range.
Statistical Analysis of Big Data on Pharmacogenomics
Fan, Jianqing; Liu, Han
2013-01-01
This paper discusses statistical methods for estimating complex correlation structure from large pharmacogenomic datasets. We selectively review several prominent statistical methods for estimating large covariance matrix for understanding correlation structure, inverse covariance matrix for network modeling, large-scale simultaneous tests for selecting significantly differently expressed genes and proteins and genetic markers for complex diseases, and high dimensional variable selection for identifying important molecules for understanding molecule mechanisms in pharmacogenomics. Their applications to gene network estimation and biomarker selection are used to illustrate the methodological power. Several new challenges of Big data analysis, including complex data distribution, missing data, measurement error, spurious correlation, endogeneity, and the need for robust statistical methods, are also discussed. PMID:23602905
SOCR Analyses - an Instructional Java Web-based Statistical Analysis Toolkit.
Chu, Annie; Cui, Jenny; Dinov, Ivo D
2009-03-01
The Statistical Online Computational Resource (SOCR) designs web-based tools for educational use in a variety of undergraduate courses (Dinov 2006). Several studies have demonstrated that these resources significantly improve students' motivation and learning experiences (Dinov et al. 2008). SOCR Analyses is a new component that concentrates on data modeling and analysis using parametric and non-parametric techniques supported with graphical model diagnostics. Currently implemented analyses include commonly used models in undergraduate statistics courses like linear models (Simple Linear Regression, Multiple Linear Regression, One-Way and Two-Way ANOVA). In addition, we implemented tests for sample comparisons, such as t-test in the parametric category; and Wilcoxon rank sum test, Kruskal-Wallis test, Friedman's test, in the non-parametric category. SOCR Analyses also include several hypothesis test models, such as Contingency tables, Friedman's test and Fisher's exact test. The code itself is open source (http://socr.googlecode.com/), hoping to contribute to the efforts of the statistical computing community. The code includes functionality for each specific analysis model and it has general utilities that can be applied in various statistical computing tasks. For example, concrete methods with API (Application Programming Interface) have been implemented in statistical summary, least square solutions of general linear models, rank calculations, etc. HTML interfaces, tutorials, source code, activities, and data are freely available via the web (www.SOCR.ucla.edu). Code examples for developers and demos for educators are provided on the SOCR Wiki website. In this article, the pedagogical utilization of the SOCR Analyses is discussed, as well as the underlying design framework. As the SOCR project is on-going and more functions and tools are being added to it, these resources are constantly improved. The reader is strongly encouraged to check the SOCR site for most
CORSSA: The Community Online Resource for Statistical Seismicity Analysis
Michael, Andrew J.; Wiemer, Stefan
2010-01-01
Statistical seismology is the application of rigorous statistical methods to earthquake science with the goal of improving our knowledge of how the earth works. Within statistical seismology there is a strong emphasis on the analysis of seismicity data in order to improve our scientific understanding of earthquakes and to improve the evaluation and testing of earthquake forecasts, earthquake early warning, and seismic hazards assessments. Given the societal importance of these applications, statistical seismology must be done well. Unfortunately, a lack of educational resources and available software tools makes it difficult for students and new practitioners to learn about this discipline. The goal of the Community Online Resource for Statistical Seismicity Analysis (CORSSA) is to promote excellence in statistical seismology by providing the knowledge and resources necessary to understand and implement the best practices, so that the reader can apply these methods to their own research. This introduction describes the motivation for and vision of CORSSA. It also describes its structure and contents.
Similar tests and the standardized log likelihood ratio statistic
DEFF Research Database (Denmark)
Jensen, Jens Ledet
1986-01-01
When testing an affine hypothesis in an exponential family the 'ideal' procedure is to calculate the exact similar test, or an approximation to this, based on the conditional distribution given the minimal sufficient statistic under the null hypothesis. By contrast to this there is a 'primitive' approach in which the marginal distribution of a test statistic is considered and any nuisance parameter appearing in the test statistic is replaced by an estimate. We show here that when using standardized likelihood ratio statistics the 'primitive' procedure is in fact an 'ideal' procedure to order O(n -3...
Statistical analysis of metallicity in spiral galaxies
Energy Technology Data Exchange (ETDEWEB)
Galeotti, P [Consiglio Nazionale delle Ricerche, Turin (Italy). Lab. di Cosmo-Geofisica; Turin Univ. (Italy). Ist. di Fisica Generale]
1981-04-01
A principal component analysis of metallicity and other integral properties of 33 spiral galaxies is presented; the involved parameters are: morphological type, diameter, luminosity and metallicity. From the statistical analysis it is concluded that the sample has only two significant dimensions, and additional tests, involving different parameters, show similar results. Thus it seems that only type and luminosity are independent variables, the other integral properties of spiral galaxies being correlated with them.
IEEE Std 101-1987: IEEE guide for the statistical analysis of thermal life test data
International Nuclear Information System (INIS)
Anon.
1992-01-01
This revision of IEEE Std 101-1972 describes statistical analyses for data from thermally accelerated aging tests. It explains the basis and use of statistical calculations for an engineer or scientist. Accelerated test procedures usually call for a number of specimens to be aged at each of several temperatures appreciably above normal operating temperatures. High temperatures are chosen to produce specimen failures (according to specified failure criteria) in typically one week to one year. The test objective is to determine the dependence of median life on temperature from the data, and to estimate, by extrapolation, the median life to be expected at service temperature. This guide presents methods for analyzing such data and for comparing test data on different materials
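The extrapolation described above can be sketched with an ordinary least-squares fit. The sketch assumes an Arrhenius-type relationship, in which the logarithm of median life is linear in the reciprocal of absolute temperature; the specimen lives, temperatures and function names are hypothetical, and the guide's own procedures (including confidence limits) are more detailed than this.

```python
from math import log10
from statistics import mean, median


def fit_arrhenius(hours_by_temp_c):
    """Least-squares fit of log10(median life) = intercept + slope / T, with T in kelvin."""
    xs = [1.0 / (temp_c + 273.15) for temp_c in hours_by_temp_c]
    ys = [log10(median(hours)) for hours in hours_by_temp_c.values()]
    x_bar, y_bar = mean(xs), mean(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    intercept = y_bar - slope * x_bar
    return intercept, slope


def median_life_hours(intercept, slope, temp_c):
    """Median life predicted by the fitted line at a given temperature in degrees Celsius."""
    return 10 ** (intercept + slope / (temp_c + 273.15))


# Hypothetical failure times (hours) at three aging temperatures (degrees Celsius).
aging_data = {
    220: [150, 200, 260],
    200: [600, 800, 1000],
    180: [2500, 3200, 4100],
}
intercept, slope = fit_arrhenius(aging_data)
service_life = median_life_hours(intercept, slope, 130)
print(f"estimated median life at 130 C: {service_life:.0f} h")
```

Because the service temperature lies well outside the tested range, the extrapolated value is only as good as the assumed linear relationship.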
Kleibergen, F.R.
2002-01-01
We extend the novel pivotal statistics for testing the parameters in the instrumental variables regression model. We show that these statistics result from a decomposition of the Anderson-Rubin statistic into two independent pivotal statistics. The first statistic is a score statistic that tests
Cichonska, Anna; Rousu, Juho; Marttinen, Pekka; Kangas, Antti J; Soininen, Pasi; Lehtimäki, Terho; Raitakari, Olli T; Järvelin, Marjo-Riitta; Salomaa, Veikko; Ala-Korpela, Mika; Ripatti, Samuli; Pirinen, Matti
2016-07-01
A dominant approach to genetic association studies is to perform univariate tests between genotype-phenotype pairs. However, analyzing related traits together increases statistical power, and certain complex associations become detectable only when several variants are tested jointly. Currently, modest sample sizes of individual cohorts, and restricted availability of individual-level genotype-phenotype data across the cohorts limit conducting multivariate tests. We introduce metaCCA, a computational framework for summary statistics-based analysis of a single or multiple studies that allows multivariate representation of both genotype and phenotype. It extends the statistical technique of canonical correlation analysis to the setting where original individual-level records are not available, and employs a covariance shrinkage algorithm to achieve robustness. Multivariate meta-analysis of two Finnish studies of nuclear magnetic resonance metabolomics by metaCCA, using standard univariate output from the program SNPTEST, shows an excellent agreement with the pooled individual-level analysis of original data. Motivated by strong multivariate signals in the lipid genes tested, we envision that multivariate association testing using metaCCA has a great potential to provide novel insights from already published summary statistics from high-throughput phenotyping technologies. Code is available at https://github.com/aalto-ics-kepaco. Contact: anna.cichonska@helsinki.fi or matti.pirinen@helsinki.fi. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.
HistFitter software framework for statistical data analysis
Baak, M.; Côte, D.; Koutsman, A.; Lorenz, J.; Short, D.
2015-01-01
We present a software framework for statistical data analysis, called HistFitter, that has been used extensively by the ATLAS Collaboration to analyze big datasets originating from proton-proton collisions at the Large Hadron Collider at CERN. Since 2012 HistFitter has been the standard statistical tool in searches for supersymmetric particles performed by ATLAS. HistFitter is a programmable and flexible framework to build, book-keep, fit, interpret and present results of data models of nearly arbitrary complexity. Starting from an object-oriented configuration, defined by users, the framework builds probability density functions that are automatically fitted to data and interpreted with statistical tests. A key innovation of HistFitter is its design, which is rooted in core analysis strategies of particle physics. The concepts of control, signal and validation regions are woven into its very fabric. These are progressively treated with statistically rigorous built-in methods. Being capable of working with mu...
International Nuclear Information System (INIS)
Coleman, S.Y.; Nicholls, J.R.
2006-01-01
Cyclic oxidation testing at elevated temperatures requires careful experimental design and the adoption of standard procedures to ensure reliable data. This is a major aim of the 'COTEST' research programme. Further, as such tests are both time consuming and costly, in terms of human effort, to take measurements over a large number of cycles, it is important to gain maximum information from a minimum number of tests (trials). This search for standardisation of cyclic oxidation conditions leads to a series of tests to determine the relative effects of cyclic parameters on the oxidation process. Following a review of the available literature, databases and the experience of partners to the COTEST project, the most influential parameters, upper dwell temperature (oxidation temperature) and time (hot time), lower dwell time (cold time) and environment, were investigated in partners' laboratories. It was decided to test upper dwell temperature at 3 levels, at and equidistant from a reference temperature; to test upper dwell time at a reference, a higher and a lower time; to test lower dwell time at a reference and a higher time and wet and dry environments. Thus an experiment, consisting of nine trials, was designed according to statistical criteria. The results of the trial were analysed statistically, to test the main linear and quadratic effects of upper dwell temperature and hot time and the main effects of lower dwell time (cold time) and environment. The nine trials are a quarter fraction of the 36 possible combinations of parameter levels that could have been studied. The results have been analysed by half Normal plots as there are only 2 degrees of freedom for the experimental error variance, which is rather low for a standard analysis of variance. Half Normal plots give a visual indication of which factors are statistically significant. In this experiment each trial has 3 replications, and the data are analysed in terms of mean mass change, oxidation kinetics
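The count of 36 combinations follows directly from the factor levels listed in the abstract (3 x 3 x 2 x 2), as the short enumeration below shows. The level labels are placeholders chosen for the example, and the sketch does not attempt to reproduce which nine trials the COTEST partners actually selected.

```python
from itertools import product

# Factor levels as described in the abstract; the labels themselves are placeholders.
levels = {
    "upper_dwell_temperature": ["low", "reference", "high"],
    "upper_dwell_time": ["low", "reference", "high"],
    "lower_dwell_time": ["reference", "high"],
    "environment": ["dry", "wet"],
}

# Every combination of one level per factor (the full factorial design).
full_factorial = [dict(zip(levels, combo)) for combo in product(*levels.values())]
n_full = len(full_factorial)
n_quarter = n_full // 4

print(n_full, n_quarter)  # 36 9
```

Choosing which nine runs to keep is the statistical design step; a well-chosen fraction keeps the main effects estimable while leaving few degrees of freedom for error, which is why the authors turn to half Normal plots.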
Beginning statistics with data analysis
Mosteller, Frederick; Rourke, Robert EK
2013-01-01
This introduction to the world of statistics covers exploratory data analysis, methods for collecting data, formal statistical inference, and techniques of regression and analysis of variance. 1983 edition.
Caveats for using statistical significance tests in research assessments
DEFF Research Database (Denmark)
Schneider, Jesper Wiborg
2013-01-01
This article raises concerns about the advantages of using statistical significance tests in research assessments as has recently been suggested in the debate about proper normalization procedures for citation indicators by Opthof and Leydesdorff (2010). Statistical significance tests are highly controversial and numerous criticisms have been leveled against their use. Based on examples from articles by proponents of the use of statistical significance tests in research assessments, we address some of the numerous problems with such tests. The issues specifically discussed are the ritual practice...... argue that applying statistical significance tests and mechanically adhering to their results are highly problematic and detrimental to critical thinking. We claim that the use of such tests does not provide any advantages in relation to deciding whether differences between citation indicators...
Purves, L.; Strang, R. F.; Dube, M. P.; Alea, P.; Ferragut, N.; Hershfeld, D.
1983-01-01
The software and procedures of a system of programs used to generate a report of the statistical correlation between NASTRAN modal analysis results and physical test results from modal surveys are described. Topics discussed include: a mathematical description of statistical correlation, a user's guide for generating a statistical correlation report, a programmer's guide describing the organization and functions of individual programs leading to a statistical correlation report, and a set of examples including complete listings of programs, and input and output data.
Teaching Statistics in Language Testing Courses
Brown, James Dean
2013-01-01
The purpose of this article is to examine the literature on teaching statistics for useful ideas that teachers of language testing courses can draw on and incorporate into their teaching toolkits as they see fit. To those ends, the article addresses eight questions: What is known generally about teaching statistics? Why are students so anxious…
Statistical analysis of thermal conductivity of nanofluid containing ...
Indian Academy of Sciences (India)
Thermal conductivity measurements of nanofluids were analysed via two-factor completely randomized design and comparison of data means is carried out with Duncan's multiple-range test. Statistical analysis of experimental data shows that temperature and weight fraction have a reasonable impact on the thermal ...
Advanced data analysis in neuroscience integrating statistical and computational models
Durstewitz, Daniel
2017-01-01
This book is intended for use in advanced graduate courses in statistics / machine learning, as well as for all experimental neuroscientists seeking to understand statistical methods at a deeper level, and theoretical neuroscientists with a limited background in statistics. It reviews almost all areas of applied statistics, from basic statistical estimation and test theory, linear and nonlinear approaches for regression and classification, to model selection and methods for dimensionality reduction, density estimation and unsupervised clustering. Its focus, however, is linear and nonlinear time series analysis from a dynamical systems perspective, based on which it aims to convey an understanding also of the dynamical mechanisms that could have generated observed time series. Further, it integrates computational modeling of behavioral and neural dynamics with statistical estimation and hypothesis testing. This way computational models in neuroscience are not only explanatory frameworks, but become powerfu...
Quantitative analysis and IBM SPSS statistics a guide for business and finance
Aljandali, Abdulkader
2016-01-01
This guide is for practicing statisticians and data scientists who use IBM SPSS for statistical analysis of big data in business and finance. This is the first of a two-part guide to SPSS for Windows, introducing data entry into SPSS, along with elementary statistical and graphical methods for summarizing and presenting data. Part I also covers the rudiments of hypothesis testing and business forecasting while Part II will present multivariate statistical methods, more advanced forecasting methods, and multivariate methods. IBM SPSS Statistics offers a powerful set of statistical and information analysis systems that run on a wide variety of personal computers. The software is built around routines that have been developed, tested, and widely used for more than 20 years. As such, IBM SPSS Statistics is extensively used in industry, commerce, banking, local and national governments, and education. Just a small subset of users of the package include the major clearing banks, the BBC, British Gas, British Airway...
State analysis of BOP using statistical and heuristic methods
International Nuclear Information System (INIS)
Heo, Gyun Young; Chang, Soon Heung
2003-01-01
Under deregulation, enhancing the performance of the BOP in nuclear power plants is receiving attention. To analyze the performance level of the BOP, we use performance test procedures provided by an authorized institution such as ASME. However, plant investigation showed that the requirements of these procedures regarding the reliability and number of sensors are difficult to satisfy. As a solution, a state analysis method, which is an expanded concept of signal validation, was proposed on the basis of statistical and heuristic approaches. The authors recommended a statistical linear regression model, built by analyzing correlations among BOP parameters, as the reference state analysis method. Its advantages are that its derivation is not heuristic, that model uncertainty can be calculated, and that it is easy to apply to an actual plant. The error of the statistical linear regression model is below 3% under normal as well as abnormal system states. Additionally, a neural network model was recommended, since the statistical model cannot be applied to the validation of all sensors and is sensitive to outliers, that is, signals located outside the statistical distribution. Because many sensors in the BOP need to be validated, wavelet analysis (WA) was applied as a pre-processor to reduce the input dimension and enhance training accuracy. The outlier localization capability of WA enhanced the robustness of the neural network. The trained neural network restored the degraded signals to values within ±3% of the true signals
Mathur, Sunil; Sadana, Ajit
2015-12-01
We present a rank-based test statistic for the identification of differentially expressed genes using a distance measure. The proposed test statistic is highly robust against extreme values and does not assume the distribution of parent population. Simulation studies show that the proposed test is more powerful than some of the commonly used methods, such as paired t-test, Wilcoxon signed rank test, and significance analysis of microarray (SAM) under certain non-normal distributions. The asymptotic distribution of the test statistic, and the p-value function are discussed. The application of proposed method is shown using a real-life data set. © The Author(s) 2011.
Research design and statistical analysis
Myers, Jerome L; Lorch Jr, Robert F
2013-01-01
Research Design and Statistical Analysis provides comprehensive coverage of the design principles and statistical concepts necessary to make sense of real data. The book's goal is to provide a strong conceptual foundation to enable readers to generalize concepts to new research situations. Emphasis is placed on the underlying logic and assumptions of the analysis and what it tells the researcher, the limitations of the analysis, and the consequences of violating assumptions. Sampling, design efficiency, and statistical models are emphasized throughout. As per APA recommendations
Significance levels for studies with correlated test statistics.
Shi, Jianxin; Levinson, Douglas F; Whittemore, Alice S
2008-07-01
When testing large numbers of null hypotheses, one needs to assess the evidence against the global null hypothesis that none of the hypotheses is false. Such evidence typically is based on the test statistic of the largest magnitude, whose statistical significance is evaluated by permuting the sample units to simulate its null distribution. Efron (2007) has noted that correlation among the test statistics can induce substantial interstudy variation in the shapes of their histograms, which may cause misleading tail counts. Here, we show that permutation-based estimates of the overall significance level also can be misleading when the test statistics are correlated. We propose that such estimates be conditioned on a simple measure of the spread of the observed histogram, and we provide a method for obtaining conditional significance levels. We justify this conditioning using the conditionality principle described by Cox and Hinkley (1974). Application of the method to gene expression data illustrates the circumstances when conditional significance levels are needed.
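The ordinary permutation estimate that this abstract takes as its starting point can be sketched as below: the largest test statistic across all features is compared with its distribution under random relabelling of the sample units. This illustrates only the unconditional estimate, not the conditional significance levels the authors propose; the data, the choice of mean difference as the statistic, and all names are assumptions made for the example.

```python
import random
from statistics import mean


def max_abs_mean_diff(rows, labels):
    """Largest |mean(group 1) - mean(group 0)| over all features (columns)."""
    best = 0.0
    for j in range(len(rows[0])):
        group1 = [row[j] for row, lab in zip(rows, labels) if lab == 1]
        group0 = [row[j] for row, lab in zip(rows, labels) if lab == 0]
        best = max(best, abs(mean(group1) - mean(group0)))
    return best


def global_p_value(rows, labels, n_perm=2000, seed=0):
    """Permutation p-value for the global null, based on the maximum statistic."""
    rng = random.Random(seed)
    observed = max_abs_mean_diff(rows, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # permute the sample units, keeping each row intact
        if max_abs_mean_diff(rows, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical data: 10 samples x 3 features; only feature 0 differs between groups.
rows = [[0.1 * i, 0.3, 0.5 - 0.05 * i] for i in range(5)] + [
    [10.0 + 0.1 * i, 0.4, 0.2 + 0.05 * i] for i in range(5)
]
labels = [0] * 5 + [1] * 5
p_global = global_p_value(rows, labels)
print(f"global permutation p-value = {p_global:.4f}")
```

Permuting whole rows preserves the correlation between features within each sample, which is the reason the maximum statistic is normally calibrated this way.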
SPSS for applied sciences basic statistical testing
Davis, Cole
2013-01-01
This book offers a quick and basic guide to using SPSS and provides a general approach to solving problems using statistical tests. It is both comprehensive in terms of the tests covered and the applied settings it refers to, and yet is short and easy to understand. Whether you are a beginner or an intermediate level test user, this book will help you to analyse different types of data in applied settings. It will also give you the confidence to use other statistical software and to extend your expertise to more specific scientific settings as required. The author does not use mathematical form
Study of relationship between MUF correlation and detection sensitivity of statistical analysis
International Nuclear Information System (INIS)
Tamura, Toshiaki; Ihara, Hitoshi; Yamamoto, Yoichi; Ikawa, Koji
1989-11-01
Various kinds of statistical analysis have been proposed for NRTA (Near Real Time Materials Accountancy), which was devised to satisfy the timeliness goal, one of the IAEA detection goals. Statistical analysis results are expected to differ between rigorous error propagation (with MUF correlation) and simplified error propagation (without MUF correlation). Therefore, measurement simulation and decision analysis were carried out using a flow simulation of an 800 MTHM/Y model reprocessing plant, and the relationship between MUF correlation and the detection sensitivity and false alarm rate of the statistical analysis was studied. The simulation clarified the specific characteristics of material accountancy for the 800 MTHM/Y model reprocessing plant. It also became clear that MUF correlation decreases not only the false alarm rate but also the detection probability for protracted loss when the CUMUF test and Page's test are applied to NRTA. (author)
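Page's test is a one-sided cumulative sum (CUSUM) procedure, which the sketch below illustrates on a sequence of standardized values. The reference value, the alarm threshold and the sample sequence are hypothetical, and the sketch ignores the MUF correlation that is the subject of the study.

```python
def page_test(values, reference=0.5, threshold=3.0):
    """One-sided Page (CUSUM) test: return the index of the first alarm, or None."""
    cusum = 0.0
    for index, value in enumerate(values):
        # Accumulate excess over the reference value, resetting at zero.
        cusum = max(0.0, cusum + value - reference)
        if cusum > threshold:
            return index
    return None


# Hypothetical standardized balance values; a sustained shift starts at index 3.
sequence = [0.1, -0.3, 0.2, 1.2, 1.4, 1.1, 1.6, 1.3]
alarm_index = page_test(sequence)
print(alarm_index)  # 6
```

Because the sum resets at zero, the test responds to a sustained small shift, which is why it is a natural candidate for detecting protracted loss.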
Statistics 101 for Radiologists.
Anvari, Arash; Halpern, Elkan F; Samir, Anthony E
2015-10-01
Diagnostic tests have wide clinical applications, including screening, diagnosis, measuring treatment effect, and determining prognosis. Interpreting diagnostic test results requires an understanding of key statistical concepts used to evaluate test efficacy. This review explains descriptive statistics and discusses probability, including mutually exclusive and independent events and conditional probability. In the inferential statistics section, a statistical perspective on study design is provided, together with an explanation of how to select appropriate statistical tests. Key concepts in recruiting study samples are discussed, including representativeness and random sampling. Variable types are defined, including predictor, outcome, and covariate variables, and the relationship of these variables to one another. In the hypothesis testing section, we explain how to determine if observed differences between groups are likely to be due to chance. We explain type I and II errors, statistical significance, and study power, followed by an explanation of effect sizes and how confidence intervals can be used to generalize observed effect sizes to the larger population. Statistical tests are explained in four categories: t tests and analysis of variance, proportion analysis tests, nonparametric tests, and regression techniques. We discuss sensitivity, specificity, accuracy, receiver operating characteristic analysis, and likelihood ratios. Measures of reliability and agreement, including κ statistics, intraclass correlation coefficients, and Bland-Altman graphs and analysis, are introduced. © RSNA, 2015.
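The diagnostic accuracy measures listed in this review follow directly from a 2 x 2 table of test results against a reference standard. The following sketch uses the standard definitions; the counts and the function name are hypothetical.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard accuracy measures from true/false positive and negative counts."""
    sensitivity = tp / (tp + fn)  # diseased cases correctly flagged
    specificity = tn / (tn + fp)  # healthy cases correctly cleared
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "lr_positive": sensitivity / (1.0 - specificity),
        "lr_negative": (1.0 - sensitivity) / specificity,
    }


# Hypothetical counts: 100 diseased and 100 healthy subjects.
metrics = diagnostic_metrics(tp=90, fp=20, fn=10, tn=80)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

A positive likelihood ratio well above 1 and a negative likelihood ratio well below 1 indicate that a result shifts the probability of disease substantially in the corresponding direction.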
Statistical data analysis handbook
National Research Council Canada - National Science Library
Wall, Francis J
1986-01-01
It must be emphasized that this is not a text book on statistics. Instead it is a working tool that presents data analysis in clear, concise terms which can be readily understood even by those without formal training in statistics...
A comparison of test statistics for the recovery of rapid growth-based enumeration tests
van den Heuvel, Edwin R.; IJzerman-Boon, Pieta C.
This paper considers five test statistics for comparing the recovery of a rapid growth-based enumeration test with respect to the compendial microbiological method using a specific nonserial dilution experiment. The finite sample distributions of these test statistics are unknown, because they are
International Nuclear Information System (INIS)
Lacombe, J.P.
1985-12-01
The statistical study of non-homogeneous and spatial Poisson processes forms the first part of this thesis. A Neyman-Pearson type test is defined concerning the intensity measure of these processes. Conditions are given under which consistency of the test is assured, and others giving the asymptotic normality of the test statistics. Some techniques for the statistical processing of Poisson fields and their applications to a particle multidetector study are then given. Quality tests of the device are proposed together with signal extraction methods [fr
Directory of Open Access Journals (Sweden)
Hilary I. Okagbue
2018-04-01
This data article contains the statistical analysis of the total, percentage and distribution of editorial board composition of 111 Hindawi journals indexed in Emerging Sources Citation Index (ESCI) across the continents. The reliability of the data was shown using correlation, goodness-of-fit test, analysis of variance and statistical variability tests. Keywords: Hindawi, Bibliometrics, Data analysis, ESCI, Random, Smart campus, Web of science, Ranking analytics, Statistics
Statistical analysis of the count and profitability of air conditioners.
Rady, El Houssainy A; Mohamed, Salah M; Abd Elmegaly, Alaa A
2018-08-01
This article presents the statistical analysis of the number and profitability of air conditioners in an Egyptian company. The Kruskal-Wallis test was used to check whether the distribution is the same across the levels of each categorical variable.
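For readers unfamiliar with the test named above, the Kruskal-Wallis H statistic can be sketched in a few lines. The function name and the data are hypothetical (they are not the air-conditioner data of the article), and the sketch returns only the tie-corrected statistic and its degrees of freedom; a p-value would come from the chi-square distribution with that many degrees of freedom.

```python
def kruskal_wallis_h(*groups):
    """Tie-corrected Kruskal-Wallis H statistic and its degrees of freedom.

    Assumes at least two non-empty groups and at least two distinct
    values overall (otherwise the tie correction is zero).
    """
    pooled = sorted(v for g in groups for v in g)
    n_total = len(pooled)

    # Assign each distinct value the average of the ranks it occupies.
    rank_of = {}
    tie_term = 0
    i = 0
    while i < n_total:
        j = i
        while j < n_total and pooled[j] == pooled[i]:
            j += 1
        t = j - i                             # size of this tie group
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        tie_term += t**3 - t
        i = j

    # Sum over groups of (rank sum squared) / (group size).
    rank_sum_term = sum(
        sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups
    )
    h = 12 / (n_total * (n_total + 1)) * rank_sum_term - 3 * (n_total + 1)
    correction = 1 - tie_term / (n_total**3 - n_total)
    return h / correction, len(groups) - 1


# Hypothetical measurements for three categories.
h, df = kruskal_wallis_h([1, 2, 3], [4, 5, 6], [7, 8, 9])
print(h, df)
```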
International Nuclear Information System (INIS)
Hirao, Keiichi; Yamane, Toshimi; Minamino, Yoritoshi
1991-01-01
This report is to show how the life due to stress corrosion cracking breakdown of fuel cladding tubes is evaluated by applying the statistical techniques to that examined by a few testing methods. The statistical distribution of the limiting values of constant load stress corrosion cracking life, the statistical analysis by making the probabilistic interpretation of constant load stress corrosion cracking life, and the statistical analysis of stress corrosion cracking life by the slow strain rate test (SSRT) method are described. (K.I.)
FADTTS: functional analysis of diffusion tensor tract statistics.
Zhu, Hongtu; Kong, Linglong; Li, Runze; Styner, Martin; Gerig, Guido; Lin, Weili; Gilmore, John H
2011-06-01
The aim of this paper is to present a functional analysis of a diffusion tensor tract statistics (FADTTS) pipeline for delineating the association between multiple diffusion properties along major white matter fiber bundles with a set of covariates of interest, such as age, diagnostic status and gender, and the structure of the variability of these white matter tract properties in various diffusion tensor imaging studies. The FADTTS integrates five statistical tools: (i) a multivariate varying coefficient model for allowing the varying coefficient functions in terms of arc length to characterize the varying associations between fiber bundle diffusion properties and a set of covariates, (ii) a weighted least squares estimation of the varying coefficient functions, (iii) a functional principal component analysis to delineate the structure of the variability in fiber bundle diffusion properties, (iv) a global test statistic to test hypotheses of interest, and (v) a simultaneous confidence band to quantify the uncertainty in the estimated coefficient functions. Simulated data are used to evaluate the finite sample performance of FADTTS. We apply FADTTS to investigate the development of white matter diffusivities along the splenium of the corpus callosum tract and the right internal capsule tract in a clinical study of neurodevelopment. FADTTS can be used to facilitate the understanding of normal brain development, the neural bases of neuropsychiatric disorders, and the joint effects of environmental and genetic factors on white matter fiber bundles. The advantages of FADTTS compared with the other existing approaches are that it is capable of modeling the structured inter-subject variability, testing the joint effects, and constructing their simultaneous confidence bands. However, FADTTS is not crucial for estimation and reduces to the functional analysis method for the single measure. Copyright © 2011 Elsevier Inc. All rights reserved.
Rivoirard, Romain; Duplay, Vianney; Oriol, Mathieu; Tinquaut, Fabien; Chauvin, Franck; Magne, Nicolas; Bourmaud, Aurelie
2016-01-01
Quality of reporting for Randomized Clinical Trials (RCTs) in oncology was analyzed in several systematic reviews, but, in this setting, there is paucity of data for the outcomes definitions and consistency of reporting for statistical tests in RCTs and Observational Studies (OBS). The objective of this review was to describe those two reporting aspects, for OBS and RCTs in oncology. From a list of 19 medical journals, three were retained for analysis, after a random selection: British Medical Journal (BMJ), Annals of Oncology (AoO) and British Journal of Cancer (BJC). All original articles published between March 2009 and March 2014 were screened. Only studies whose main outcome was accompanied by a corresponding statistical test were included in the analysis. Studies based on censored data were excluded. Primary outcome was to assess quality of reporting for description of primary outcome measure in RCTs and of variables of interest in OBS. A logistic regression was performed to identify covariates of studies potentially associated with concordance of tests between Methods and Results parts. 826 studies were included in the review, and 698 were OBS. Variables were described in Methods section for all OBS studies and primary endpoint was clearly detailed in Methods section for 109 RCTs (85.2%). 295 OBS (42.2%) and 43 RCTs (33.6%) had perfect agreement for reported statistical test between Methods and Results parts. In multivariable analysis, variable "number of included patients in study" was associated with test consistency: aOR (adjusted Odds Ratio) for third group compared to first group was equal to: aOR Grp3 = 0.52 [0.31-0.89] (P value = 0.009). Variables in OBS and primary endpoint in RCTs are reported and described with a high frequency. However, statistical tests consistency between methods and Results sections of OBS is not always noted. Therefore, we encourage authors and peer reviewers to verify consistency of statistical tests in oncology studies.
Hendikawati, P.; Arifudin, R.; Zahid, M. Z.
2018-03-01
This study aims to design an Android Statistics Data Analysis application that can be accessed through mobile devices, making it easier for users to access. The application covers various topics in basic statistics along with parametric statistical data analysis. Its output is a parametric statistical data analysis that can be used by students, lecturers, and other users who need the results of statistical calculations quickly and in an easily understood form. The Android application was developed in the Java programming language. The server side uses PHP with the CodeIgniter framework, and the database is MySQL. The system development methodology is the Waterfall methodology, with the stages of analysis, design, coding, testing, implementation, and system maintenance. This statistical data analysis application is expected to support statistics teaching activities and make it easier for students to understand statistical analysis on mobile devices.
International Nuclear Information System (INIS)
Sood, Avnet; Forster, R. Arthur; Parsons, D. Kent
2001-01-01
Monte Carlo simulations of nuclear criticality eigenvalue problems are often performed by general purpose radiation transport codes such as MCNP. MCNP performs detailed statistical analysis of the criticality calculation and provides feedback to the user with warning messages, tables, and graphs. The purpose of the analysis is to provide the user with sufficient information to assess spatial convergence of the eigenfunction and thus the validity of the criticality calculation. As a test of this statistical analysis package in MCNP, analytic criticality verification benchmark problems have been used for the first time to assess the performance of the criticality convergence tests in MCNP. The MCNP statistical analysis capability has been recently assessed using the 75 multigroup criticality verification analytic problem test set. MCNP was verified with these problems at the 10^-4 to 10^-5 statistical error level using 40 000 histories per cycle and 2000 active cycles. In all cases, the final boxed combined k_eff answer was given with the standard deviation and three confidence intervals that contained the analytic k_eff. To test the effectiveness of the statistical analysis checks in identifying poor eigenfunction convergence, ten problems from the test set were deliberately run incorrectly using 1000 histories per cycle, 200 active cycles, and 10 inactive cycles. Six problems with large dominance ratios were chosen from the test set because they do not achieve the normal spatial mode in the beginning of the calculation. To further stress the convergence tests, these problems were also started with an initial fission source point 1 cm from the boundary, thus increasing the likelihood of a poorly converged initial fission source distribution. The final combined k_eff confidence intervals for these deliberately ill-posed problems did not include the analytic k_eff value. In no case did a bad confidence interval go undetected. Warning messages were given signaling that
Using Pre-Statistical Analysis to Streamline Monitoring Assessments
International Nuclear Information System (INIS)
Reed, J.K.
1999-01-01
A variety of statistical methods exist to aid evaluation of groundwater quality and subsequent decision making in regulatory programs. These methods are applied because of large temporal and spatial extrapolations commonly applied to these data. In short, statistical conclusions often serve as a surrogate for knowledge. However, facilities with mature monitoring programs that have generated abundant data have inherently less uncertainty because of the sheer quantity of analytical results. In these cases, statistical tests can be less important, and "expert" data analysis should assume an important screening role. The WSRC Environmental Protection Department, working with the General Separations Area BSRI Environmental Restoration project team, has developed a method for an Integrated Hydrogeological Analysis (IHA) of historical water quality data from the F and H Seepage Basins groundwater remediation project. The IHA combines common sense analytical techniques and a GIS presentation that force direct interactive evaluation of the data. The IHA can perform multiple data analysis tasks required by the RCRA permit. These include: (1) Development of a groundwater quality baseline prior to remediation startup, (2) Targeting of constituents for removal from RCRA GWPS, (3) Targeting of constituents for removal from UIC permit, (4) Targeting of constituents for reduced, (5) Targeting of monitoring wells not producing representative samples, (6) Reduction in statistical evaluation, and (7) Identification of contamination from other facilities
A robust statistical method for association-based eQTL analysis.
Directory of Open Access Journals (Sweden)
Ning Jiang
It has been well established that the theoretical kernel of the recently surging genome-wide association study (GWAS) is statistical inference of linkage disequilibrium (LD) between a tested genetic marker and a putative locus affecting a disease trait. However, LD analysis is vulnerable to several confounding factors, of which population stratification is the most prominent. Whilst many methods have been proposed to correct for this influence, either through predicting the structure parameters or correcting inflation in the test statistic due to the stratification, these may not be feasible or may impose further statistical problems in practical implementation. We propose here a novel statistical method to control spurious LD in GWAS from population structure by incorporating a control marker into testing for significance of genetic association of a polymorphic marker with phenotypic variation of a complex trait. The method avoids the need for structure prediction, which may be infeasible or inadequate in practice, and accounts properly for a varying effect of population stratification on different regions of the genome under study. Utility and statistical properties of the new method were tested through an intensive computer simulation study and an association-based genome-wide mapping of expression quantitative trait loci in genetically divergent human populations. The analyses show that the new method confers an improved statistical power for detecting genuine genetic association in subpopulations and an effective control of spurious associations stemming from population structure when compared with two other widely implemented methods in the GWAS literature.
Effect of non-normality on test statistics for one-way independent groups designs.
Cribbie, Robert A; Fiksenbaum, Lisa; Keselman, H J; Wilcox, Rand R
2012-02-01
The data obtained from one-way independent groups designs is typically non-normal in form and rarely equally variable across treatment populations (i.e., population variances are heterogeneous). Consequently, the classical test statistic that is used to assess statistical significance (i.e., the analysis of variance F test) typically provides invalid results (e.g., too many Type I errors, reduced power). For this reason, there has been considerable interest in finding a test statistic that is appropriate under conditions of non-normality and variance heterogeneity. Previously recommended procedures for analysing such data include the James test, the Welch test applied either to the usual least squares estimators of central tendency and variability, or the Welch test with robust estimators (i.e., trimmed means and Winsorized variances). A new statistic proposed by Krishnamoorthy, Lu, and Mathew, intended to deal with heterogeneous variances, though not non-normality, uses a parametric bootstrap procedure. In their investigation of the parametric bootstrap test, the authors examined its operating characteristics under limited conditions and did not compare it to the Welch test based on robust estimators. Thus, we investigated how the parametric bootstrap procedure and a modified parametric bootstrap procedure based on trimmed means perform relative to previously recommended procedures when data are non-normal and heterogeneous. The results indicated that the tests based on trimmed means offer the best Type I error control and power when variances are unequal and at least some of the distribution shapes are non-normal. © 2011 The British Psychological Society.
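The robust estimators mentioned in the abstract above (trimmed means and Winsorized variances) can be illustrated with a short sketch. This shows only the two estimators, not the Welch, James, or parametric bootstrap tests themselves; the function names, the 20% trimming proportion, and the sample are assumptions made for illustration.

```python
import math
import statistics


def trimmed_mean(values, proportion=0.2):
    """Mean after dropping the g smallest and g largest observations,
    where g = floor(proportion * n). Assumes proportion < 0.5."""
    data = sorted(values)
    g = math.floor(proportion * len(data))
    return statistics.fmean(data[g:len(data) - g])


def winsorized_variance(values, proportion=0.2):
    """Sample variance after replacing the g smallest observations with
    the (g+1)-th smallest and the g largest with the (g+1)-th largest."""
    data = sorted(values)
    n = len(data)
    g = math.floor(proportion * n)
    low, high = data[g], data[n - g - 1]
    winsorized = [min(max(x, low), high) for x in data]
    return statistics.variance(winsorized)


# Hypothetical sample with one extreme value.
sample = [2, 3, 4, 5, 6, 7, 8, 9, 10, 50]
print(statistics.fmean(sample), trimmed_mean(sample))
print(statistics.variance(sample), winsorized_variance(sample))
```

The single outlier pulls the ordinary mean and variance upward while leaving the trimmed and Winsorized versions largely unaffected, which is the motivation for using them under non-normality.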
Uncertainty Analysis of In leakage Test for Pressurized Control Room Envelop
Energy Technology Data Exchange (ETDEWEB)
Lee, J. B. [KHNP Central Research Institute, Daejeon (Korea, Republic of)
2013-10-15
In leakage tests for control room envelopes (CRE) of newly constructed nuclear power plants are required to prove the control room habitability. Results of the in leakage tests should be analyzed using an uncertainty analysis. Test uncertainty can be an issue if the test results for pressurized CREs show low in leakage. To give a better understanding of the test uncertainty, a statistical model for the uncertainty analysis is described here and a representative uncertainty analysis of a sample in leakage test is presented. By using the statistical method we can evaluate the test result at a given level of significance. This method can be more helpful when the difference between the two mean values of the test result is small.
Uncertainty Analysis of In leakage Test for Pressurized Control Room Envelop
International Nuclear Information System (INIS)
Lee, J. B.
2013-01-01
In leakage tests for control room envelopes (CRE) of newly constructed nuclear power plants are required to prove the control room habitability. Results of the in leakage tests should be analyzed using an uncertainty analysis. Test uncertainty can be an issue if the test results for pressurized CREs show low in leakage. To give a better understanding of the test uncertainty, a statistical model for the uncertainty analysis is described here and a representative uncertainty analysis of a sample in leakage test is presented. By using the statistical method we can evaluate the test result at a given level of significance. This method can be more helpful when the difference between the two mean values of the test result is small.
Statistical trend analysis methodology for rare failures in changing technical systems
International Nuclear Information System (INIS)
Ott, K.O.; Hoffmann, H.J.
1983-07-01
A methodology for a statistical trend analysis (STA) in failure rates is presented. It applies primarily to relatively rare events in changing technologies or components. The formulation is more general and the assumptions are less restrictive than in a previously published version. Relations of the statistical analysis and probabilistic assessment (PRA) are discussed in terms of categorization of decisions for action following particular failure events. The significance of tentatively identified trends is explored. In addition to statistical tests for trend significance, a combination of STA and PRA results quantifying the trend complement is proposed. The STA approach is compared with other concepts for trend characterization. (orig.)
A Statistical Analysis of Cointegration for I(2) Variables
DEFF Research Database (Denmark)
Johansen, Søren
1995-01-01
be conducted using the χ² distribution. It is shown to what extent inference on the cointegration ranks can be conducted using the tables already prepared for the analysis of cointegration of I(1) variables. New tables are needed for the test statistics to control the size of the tests. This paper...... contains a multivariate test for the existence of I(2) variables. This test is illustrated using a data set consisting of U.K. and foreign prices and interest rates as well as the exchange rate....
Integrated Data Collection Analysis (IDCA) Program - Statistical Analysis of RDX Standard Data Sets
Energy Technology Data Exchange (ETDEWEB)
Sandstrom, Mary M. [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Brown, Geoffrey W. [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Preston, Daniel N. [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Pollard, Colin J. [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Warner, Kirstin F. [Naval Surface Warfare Center (NSWC), Indian Head, MD (United States). Indian Head Division; Sorensen, Daniel N. [Naval Surface Warfare Center (NSWC), Indian Head, MD (United States). Indian Head Division; Remmers, Daniel L. [Naval Surface Warfare Center (NSWC), Indian Head, MD (United States). Indian Head Division; Phillips, Jason J. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Shelley, Timothy J. [Air Force Research Lab. (AFRL), Tyndall AFB, FL (United States); Reyes, Jose A. [Applied Research Associates, Tyndall AFB, FL (United States); Hsu, Peter C. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Reynolds, John G. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
2015-10-30
The Integrated Data Collection Analysis (IDCA) program is conducting a Proficiency Test for Small- Scale Safety and Thermal (SSST) testing of homemade explosives (HMEs). Described here are statistical analyses of the results for impact, friction, electrostatic discharge, and differential scanning calorimetry analysis of the RDX Type II Class 5 standard. The material was tested as a well-characterized standard several times during the proficiency study to assess differences among participants and the range of results that may arise for well-behaved explosive materials. The analyses show that there are detectable differences among the results from IDCA participants. While these differences are statistically significant, most of them can be disregarded for comparison purposes to assess potential variability when laboratories attempt to measure identical samples using methods assumed to be nominally the same. The results presented in this report include the average sensitivity results for the IDCA participants and the ranges of values obtained. The ranges represent variation about the mean values of the tests of between 26% and 42%. The magnitude of this variation is attributed to differences in operator, method, and environment as well as the use of different instruments that are also of varying age. The results appear to be a good representation of the broader safety testing community based on the range of methods, instruments, and environments included in the IDCA Proficiency Test.
Critical analysis of adsorption data statistically
Kaushal, Achla; Singh, S. K.
2017-10-01
Experimental data can be presented, computed, and critically analysed in a different way using statistics. A variety of statistical tests are used to make decisions about the significance and validity of the experimental data. In the present study, adsorption was carried out to remove zinc ions from contaminated aqueous solution using mango leaf powder. The experimental data was analysed statistically by hypothesis testing applying t test, paired t test and Chi-square test to (a) test the optimum value of the process pH, (b) verify the success of experiment and (c) study the effect of adsorbent dose in zinc ion removal from aqueous solutions. Comparison of calculated and tabulated values of t and χ² showed the results in favour of the data collected from the experiment and this has been shown on probability charts. K value for Langmuir isotherm was 0.8582 and m value for Freundlich adsorption isotherm obtained was 0.725, both are mango leaf powder.
Ensuring Positiveness of the Scaled Difference Chi-square Test Statistic.
Satorra, Albert; Bentler, Peter M
2010-06-01
A scaled difference test statistic [Formula: see text] that can be computed from standard software of structural equation models (SEM) by hand calculations was proposed in Satorra and Bentler (2001). The statistic [Formula: see text] is asymptotically equivalent to the scaled difference test statistic T̄(d) introduced in Satorra (2000), which requires more involved computations beyond standard output of SEM software. The test statistic [Formula: see text] has been widely used in practice, but in some applications it is negative due to negativity of its associated scaling correction. Using the implicit function theorem, this note develops an improved scaling correction leading to a new scaled difference statistic T̄(d) that avoids negative chi-square values.
Rweb:Web-based Statistical Analysis
Directory of Open Access Journals (Sweden)
Jeff Banfield
1999-03-01
Rweb is a freely accessible statistical analysis environment that is delivered through the World Wide Web (WWW). It is based on R, a well known statistical analysis package. The only requirement to run the basic Rweb interface is a WWW browser that supports forms. If you want graphical output you must, of course, have a browser that supports graphics. The interface provides access to WWW accessible data sets, so you may run Rweb on your own data. Rweb can provide a four window statistical computing environment (code input, text output, graphical output, and error information) through browsers that support Javascript. There is also a set of point and click modules under development for use in introductory statistics courses.
Statistical analysis of angular correlation measurements
International Nuclear Information System (INIS)
Oliveira, R.A.A.M. de.
1986-01-01
Obtaining the multipole mixing ratio, δ, of γ transitions in angular correlation measurements is a statistical problem characterized by the small number of angles at which observations are made and by the limited counting statistics, α. The nonexistence of a sufficient statistic for the estimator of δ is shown. Three different estimators for δ were constructed and their properties of consistency, bias and efficiency were tested. Tests were also performed on experimental results obtained in γ-γ directional correlation measurements. (Author) [pt
Regularized Statistical Analysis of Anatomy
DEFF Research Database (Denmark)
Sjöstrand, Karl
2007-01-01
This thesis presents the application and development of regularized methods for the statistical analysis of anatomical structures. Focus is on structure-function relationships in the human brain, such as the connection between early onset of Alzheimer’s disease and shape changes of the corpus...... and mind. Statistics represents a quintessential part of such investigations as they are preluded by a clinical hypothesis that must be verified based on observed data. The massive amounts of image data produced in each examination pose an important and interesting statistical challenge...... efficient algorithms which make the analysis of large data sets feasible, and gives examples of applications....
Cornillon, Pierre-Andre; Husson, Francois; Jegou, Nicolas; Josse, Julie; Kloareg, Maela; Matzner-Lober, Eric; Rouviere, Laurent
2012-01-01
An Overview of R; Main Concepts; Installing R; Work Session; Help; R Objects; Functions; Packages; Exercises; Preparing Data; Reading Data from File; Exporting Results; Manipulating Variables; Manipulating Individuals; Concatenating Data Tables; Cross-Tabulation; Exercises; R Graphics; Conventional Graphical Functions; Graphical Functions with lattice; Exercises; Making Programs with R; Control Flows; Predefined Functions; Creating a Function; Exercises; Statistical Methods; Introduction to the Statistical Methods; A Quick Start with R; Installing R; Opening and Closing R; The Command Prompt; Attribution, Objects, and Function; Selection; Other; Rcmdr Package; Importing (or Inputting) Data; Graphs; Statistical Analysis; Hypothesis Test; Confidence Intervals for a Mean; Chi-Square Test of Independence; Comparison of Two Means; Testing Conformity of a Proportion; Comparing Several Proportions; The Power of a Test; Regression; Simple Linear Regression; Multiple Linear Regression; Partial Least Squares (PLS) Regression; Analysis of Variance and Covariance; One-Way Analysis of Variance; Multi-Way Analysis of Varian...
Statistical tests for power-law cross-correlated processes
Podobnik, Boris; Jiang, Zhi-Qiang; Zhou, Wei-Xing; Stanley, H. Eugene
2011-12-01
For stationary time series, the cross-covariance and the cross-correlation as functions of time lag n serve to quantify the similarity of two time series. The latter measure is also used to assess whether the cross-correlations are statistically significant. For nonstationary time series, the analogous measures are detrended cross-correlations analysis (DCCA) and the recently proposed detrended cross-correlation coefficient, ρDCCA(T,n), where T is the total length of the time series and n the window size. For ρDCCA(T,n), we numerically calculated the Cauchy inequality -1≤ρDCCA(T,n)≤1. Here we derive -1≤ρDCCA(T,n)≤1 for a standard variance-covariance approach and for a detrending approach. For overlapping windows, we find the range of ρDCCA within which the cross-correlations become statistically significant. For overlapping windows we numerically determine—and for nonoverlapping windows we derive—that the standard deviation of ρDCCA(T,n) tends with increasing T to 1/T. Using ρDCCA(T,n) we show that the Chinese financial market's tendency to follow the U.S. market is extremely weak. We also propose an additional statistical test that can be used to quantify the existence of cross-correlations between two power-law correlated time series.
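The stationary-case measure that opens the abstract above, the cross-correlation as a function of the time lag n, can be sketched as follows. This is the ordinary lagged cross-correlation, not the detrended coefficient ρDCCA(T,n); the function name and the example series are assumptions for illustration. With this normalization the Cauchy-Schwarz inequality keeps every value within [-1, 1].

```python
import math


def cross_correlation(x, y, lag):
    """Sample cross-correlation between x[t] and y[t + lag] for lag >= 0.

    Assumes x and y have equal length and non-zero variance. The lagged
    cross-covariance is divided by n (not n - lag) and normalized by the
    full-series standard deviations.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum(
        (x[t] - mean_x) * (y[t + lag] - mean_y) for t in range(n - lag)
    ) / n
    var_x = sum((v - mean_x) ** 2 for v in x) / n
    var_y = sum((v - mean_y) ** 2 for v in y) / n
    return cov / math.sqrt(var_x * var_y)


# Hypothetical series for illustration.
x = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0, 8.0, 7.0]
y = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0, 7.0, 9.0]
for lag in range(4):
    print(lag, round(cross_correlation(x, y, lag), 3))
```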
Statistical tests for person misfit in computerized adaptive testing
Glas, Cornelis A.W.; Meijer, R.R.; van Krimpen-Stoop, Edith
1998-01-01
Recently, several person-fit statistics have been proposed to detect nonfitting response patterns. This study is designed to generalize an approach followed by Klauer (1995) to an adaptive testing system using the two-parameter logistic model (2PL) as a null model. The approach developed by Klauer
CORSSA: Community Online Resource for Statistical Seismicity Analysis
Zechar, J. D.; Hardebeck, J. L.; Michael, A. J.; Naylor, M.; Steacy, S.; Wiemer, S.; Zhuang, J.
2011-12-01
Statistical seismology is critical to the understanding of seismicity, the evaluation of proposed earthquake prediction and forecasting methods, and the assessment of seismic hazard. Unfortunately, despite its importance to seismology, especially to those aspects with great impact on public policy, statistical seismology is mostly ignored in the education of seismologists, and there is no central repository for the existing open-source software tools. To remedy these deficiencies, and with the broader goal to enhance the quality of statistical seismology research, we have begun building the Community Online Resource for Statistical Seismicity Analysis (CORSSA, www.corssa.org). We anticipate that the users of CORSSA will range from beginning graduate students to experienced researchers. More than 20 scientists from around the world met for a week in Zurich in May 2010 to kick-start the creation of CORSSA: the format and initial table of contents were defined; a governing structure was organized; and workshop participants began drafting articles. CORSSA materials are organized with respect to six themes, each of which will contain between four and eight articles. CORSSA now includes seven articles with an additional six in draft form along with forums for discussion, a glossary, and news about upcoming meetings, special issues, and recent papers. Each article is peer-reviewed and presents a balanced discussion, including illustrative examples and code snippets. Topics in the initial set of articles include: introductions to both CORSSA and statistical seismology; basic statistical tests and their role in seismology; understanding seismicity catalogs and their problems; basic techniques for modeling seismicity; and methods for testing earthquake predictability hypotheses. We have also begun curating a collection of statistical seismology software packages.
Sandurska, Elżbieta; Szulc, Aleksandra
2016-01-01
Sandurska Elżbieta, Szulc Aleksandra. A method of statistical analysis in the field of sports science when assumptions of parametric tests are not violated. Journal of Education Health and Sport. 2016;6(13):275-287. eISSN 2391-8306. DOI http://dx.doi.org/10.5281/zenodo.293762 http://ojs.ukw.edu.pl/index.php/johs/article/view/4278 The journal has had 7 points in Ministry of Science and Higher Education parametric evaluation. Part B item 754 (09.12.2016). 754 Journal...
[Clinical research IV. Relevancy of the statistical test chosen].
Talavera, Juan O; Rivas-Ruiz, Rodolfo
2011-01-01
When we look at the difference between two therapies or the association of a risk factor or prognostic indicator with its outcome, we need to evaluate the accuracy of the result. This assessment is based on a judgment that uses information about the study design and statistical management of the information. This paper specifically mentions the relevance of the statistical test selected. Statistical tests are chosen mainly from two characteristics: the objective of the study and type of variables. The objective can be divided into three test groups: a) those in which you want to show differences between groups or inside a group before and after a maneuver, b) those that seek to show the relationship (correlation) between variables, and c) those that aim to predict an outcome. The types of variables are divided in two: quantitative (continuous and discontinuous) and qualitative (ordinal and dichotomous). For example, if we seek to demonstrate differences in age (quantitative variable) among patients with systemic lupus erythematosus (SLE) with and without neurological disease (two groups), the appropriate test is the "Student t test for independent samples." But if the comparison is about the frequency of females (binomial variable), then the appropriate statistical test is the χ² test.
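The two choices named in this abstract can be sketched in a few lines. The following is a minimal illustration using only the Python standard library, not code from the paper; the function names and the sample numbers are hypothetical. It computes the pooled-variance Student t statistic for two independent groups (for a quantitative variable such as age) and the Pearson χ² statistic for a 2x2 table (for a dichotomous variable such as sex), without continuity correction.

```python
import math
from statistics import mean, variance


def student_t_independent(a, b):
    """Pooled-variance t statistic and degrees of freedom for two independent samples."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2


def chi_square_2x2(table):
    """Pearson chi-square statistic and p-value for a 2x2 table (no continuity correction)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom the chi-square survival function reduces to erfc.
    p = math.erfc(math.sqrt(stat / 2))
    return stat, p


# Hypothetical ages in two patient groups (quantitative variable).
t, df = student_t_independent([31, 35, 40, 44], [29, 30, 33, 36])
# Hypothetical counts: rows = groups, columns = female / male (dichotomous variable).
stat, p = chi_square_2x2([[20, 10], [10, 20]])
```

The t statistic would still have to be compared against a t distribution with `df` degrees of freedom, which the standard library does not provide; a statistics package would normally be used for that step.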
PROSA: A computer program for statistical analysis of near-real-time-accountancy (NRTA) data
International Nuclear Information System (INIS)
Beedgen, R.; Bicking, U.
1987-04-01
The computer program PROSA (Program for Statistical Analysis of NRTA Data) is a tool to decide on the basis of statistical considerations if, in a given sequence of materials balance periods, a loss of material might have occurred or not. The evaluation of the material balance data is based on statistical test procedures. In PROSA three truncated sequential tests are applied to a sequence of material balances. The manual describes the statistical background of PROSA and how to use the computer program on an IBM-PC with DOS 3.1. (orig.)
Statistical analysis of tourism destination competitiveness
Directory of Open Access Journals (Sweden)
Attilio Gardini
2013-05-01
The growing relevance of the tourism industry for modern advanced economies has increased the interest among researchers and policy makers in the statistical analysis of destination competitiveness. In this paper we outline a new model of destination competitiveness based on sound theoretical grounds and we develop a statistical test of the model on sample data based on Italian tourist destination decisions and choices. Our model focuses on the tourism decision process, which starts from the demand schedule for holidays and ends with the choice of a specific holiday destination. The demand schedule is a function of individual preferences and of destination positioning, while the final decision is a function of the initial demand schedule and the information concerning services for accommodation and recreation in the selected destinations. Moreover, we extend previous studies that focused on image or attributes (such as climate and scenery) by paying more attention to the services for accommodation and recreation in the holiday destinations. We test the proposed model using empirical data collected from a sample of 1,200 Italian tourists interviewed in 2007 (October - December). Data analysis shows that the selection probability for the destination included in the consideration set is not proportional to the share of inclusion, because the share of inclusion is determined by the brand image, while the selection of the effective holiday destination is influenced by the real supply conditions. The analysis of Italian tourists' preferences underlines the existence of a latent demand for foreign holidays, which points to a risk of market share reduction for the Italian tourism system in the global market. We also find a snowball effect which helps the most popular destinations, mainly in the northern Italian regions.
Semenov, Alexander V; Elsas, Jan Dirk; Glandorf, Debora C M; Schilthuizen, Menno; Boer, Willem F
2013-08-01
To fulfill existing guidelines, applicants that aim to place their genetically modified (GM) insect-resistant crop plants on the market are required to provide data from field experiments that address the potential impacts of the GM plants on nontarget organisms (NTO's). Such data may be based on varied experimental designs. The recent EFSA guidance document for environmental risk assessment (2010) does not provide clear and structured suggestions that address the statistics of field trials on effects on NTO's. This review examines existing practices in GM plant field testing such as the way of randomization, replication, and pseudoreplication. Emphasis is placed on the importance of design features used for the field trials in which effects on NTO's are assessed. The importance of statistical power and the positive and negative aspects of various statistical models are discussed. Equivalence and difference testing are compared, and the importance of checking the distribution of experimental data is stressed to decide on the selection of the proper statistical model. While for continuous data (e.g., pH and temperature) classical statistical approaches - for example, analysis of variance (ANOVA) - are appropriate, for discontinuous data (counts) only generalized linear models (GLM) are shown to be efficient. There is no golden rule as to which statistical test is the most appropriate for any experimental situation. In particular, in experiments in which block designs are used and covariates play a role GLMs should be used. Generic advice is offered that will help in both the setting up of field testing and the interpretation and data analysis of the data obtained in this testing. The combination of decision trees and a checklist for field trials, which are provided, will help in the interpretation of the statistical analyses of field trials and to assess whether such analyses were correctly applied. We offer generic advice to risk assessors and applicants that will
Fisher statistics for analysis of diffusion tensor directional information.
Hutchinson, Elizabeth B; Rutecki, Paul A; Alexander, Andrew L; Sutula, Thomas P
2012-04-30
A statistical approach is presented for the quantitative analysis of diffusion tensor imaging (DTI) directional information using Fisher statistics, which were originally developed for the analysis of vectors in the field of paleomagnetism. In this framework, descriptive and inferential statistics have been formulated based on the Fisher probability density function, a spherical analogue of the normal distribution. The Fisher approach was evaluated for investigation of rat brain DTI maps to characterize tissue orientation in the corpus callosum, fornix, and hilus of the dorsal hippocampal dentate gyrus, and to compare directional properties in these regions following status epilepticus (SE) or traumatic brain injury (TBI) with values in healthy brains. Direction vectors were determined for each region of interest (ROI) for each brain sample and Fisher statistics were applied to calculate the mean direction vector and variance parameters in the corpus callosum, fornix, and dentate gyrus of normal rats and rats that experienced TBI or SE. Hypothesis testing was performed by calculation of Watson's F-statistic and associated p-value giving the likelihood that grouped observations were from the same directional distribution. In the fornix and midline corpus callosum, no directional differences were detected between groups; however, in the hilus, significant (p [...] statistical comparison of tissue structural orientation. Copyright © 2012 Elsevier B.V. All rights reserved.
Variability analysis of AGN: a review of results using new statistical criteria
Zibecchi, L.; Andruchow, I.; Cellone, S. A.; Romero, G. E.; Combi, J. A.
We present here a re-analysis of the variability results of a sample of active galactic nuclei (AGN), which have been observed on several sessions with the 2.15 m "Jorge Sahade" telescope (CASLEO), San Juan, Argentina, and whose results are published (Romero et al. 1999, 2000, 2002; Cellone et al. 2000). The motivation for this new analysis is the implementation, during the last years, of improvements in the statistical criteria applied, taking quantitatively into account the incidence of the photometric errors (Cellone et al. 2007). This work is framed as a first step in an integral study on the statistical estimators of AGN variability. This study is motivated by the great diversity of statistical tests that have been proposed to analyze the variability of these objects. Since we note that, in some cases, the results of the object variability depend on the test used, we attempt to make a comparative study of the various tests and analyze, under the given conditions, which of them is the most efficient and reliable.
Statistical tests to compare motif count exceptionalities
Directory of Open Access Journals (Sweden)
Vandewalle Vincent
2007-03-01
Background: Finding over- or under-represented motifs in biological sequences is now a common task in genomics. Thanks to p-value calculation for motif counts, exceptional motifs are identified and represent candidate functional motifs. The present work addresses the related question of comparing the exceptionality of one motif in two different sequences. Just comparing the motif count p-values in each sequence is not sufficient to decide whether this motif is significantly more exceptional in one sequence than in the other. A statistical test is required. Results: We develop and analyze two statistical tests, an exact binomial one and an asymptotic likelihood ratio test, to decide whether the exceptionality of a given motif is equivalent or significantly different in two sequences of interest. For that purpose, motif occurrences are modeled by Poisson processes, with special care for overlapping motifs. Both tests can take the sequence compositions into account. As an illustration, we compare the octamer exceptionalities in the Escherichia coli K-12 backbone versus variable strain-specific loops. Conclusion: The exact binomial test is particularly suited to small counts. For large counts, we advise using the likelihood ratio test, which is asymptotic but strongly correlated with the exact binomial test and very simple to use.
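One way to picture an exact binomial comparison of two Poisson counts is the standard conditional test sketched below. This is an illustration under stated assumptions, not the authors' implementation: it assumes independent Poisson counts n1 and n2 whose expected counts under each sequence's background model are e1 and e2 (both hypothetical inputs here), ignores the overlap correction the abstract mentions, and uses the usual fact that, when both sequences share the same observed-to-expected ratio, n1 given n1 + n2 is binomial with success probability e1 / (e1 + e2).

```python
import math


def binom_pmf(k, n, p):
    """Binomial probability of k successes in n trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def compare_poisson_counts(n1, n2, e1, e2):
    """Two-sided exact p-value for equal observed/expected ratios in two sequences.

    n1, n2: observed motif counts; e1, e2: expected counts (assumed given).
    """
    n = n1 + n2
    p0 = e1 / (e1 + e2)
    observed = binom_pmf(n1, n, p0)
    # Sum the probabilities of all outcomes no more likely than the observed one;
    # the small relative tolerance guards against floating-point ties.
    total = 0.0
    for k in range(n + 1):
        pr = binom_pmf(k, n, p0)
        if pr <= observed * (1 + 1e-9):
            total += pr
    return min(1.0, total)


# Hypothetical example: 12 occurrences where 5 are expected vs. 4 where 6 are expected.
p_value = compare_poisson_counts(12, 4, 5.0, 6.0)
```

Because every outcome is enumerated, the cost grows with n1 + n2, which matches the abstract's advice to reserve the exact test for small counts.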
Testing the statistical compatibility of independent data sets
International Nuclear Information System (INIS)
Maltoni, M.; Schwetz, T.
2003-01-01
We discuss a goodness-of-fit method which tests the compatibility between statistically independent data sets. The method gives sensible results even in cases where the χ² minima of the individual data sets are very low or when several parameters are fitted to a large number of data points. In particular, it avoids the problem that a possible disagreement between data sets becomes diluted by data points which are insensitive to the crucial parameters. A formal derivation of the probability distribution function for the proposed test statistics is given, based on standard theorems of statistics. The application of the method is illustrated on data from neutrino oscillation experiments, and its complementarity to the standard goodness-of-fit is discussed.
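A compatibility statistic of this kind can be illustrated in the simplest possible setting. The sketch below assumes (this is not spelled out in the abstract) that the statistic is the minimum of the combined χ² minus the sum of the individual χ² minima, compared against a χ² distribution whose degrees of freedom equal the sum of the parameters each data set constrains minus the number of parameters in the joint fit. The two "data sets" are hypothetical single Gaussian measurements of one parameter, so each fits itself perfectly and the test has one degree of freedom.

```python
import math


def compatibility_two_measurements(x1, s1, x2, s2):
    """Compatibility chi-square and p-value for two Gaussian measurements of one parameter."""
    w1, w2 = 1 / s1**2, 1 / s2**2
    best = (w1 * x1 + w2 * x2) / (w1 + w2)  # value minimising the combined chi-square
    chi2_total_min = w1 * (x1 - best) ** 2 + w2 * (x2 - best) ** 2
    chi2_individual_min = 0.0 + 0.0  # each measurement alone is fitted exactly
    chi2_compat = chi2_total_min - chi2_individual_min
    # Degrees of freedom: (1 + 1) constrained parameters - 1 joint parameter = 1,
    # so the chi-square survival function reduces to erfc.
    p = math.erfc(math.sqrt(chi2_compat / 2))
    return chi2_compat, p


chi2, p = compatibility_two_measurements(0.0, 1.0, 2.0, 1.0)
```

In this toy case the statistic equals (x1 - x2)² / (s1² + s2²), the familiar test for the difference of two measurements; adding many data points that are insensitive to the parameter would not change it, which is the dilution problem the abstract refers to.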
Monte Carlo based statistical power analysis for mediation models: methods and software.
Zhang, Zhiyong
2014-12-01
The existing literature on statistical power analysis for mediation models often assumes data normality and is based on a less powerful Sobel test instead of the more powerful bootstrap test. This study proposes to estimate statistical power to detect mediation effects on the basis of the bootstrap method through Monte Carlo simulation. Nonnormal data with excessive skewness and kurtosis are allowed in the proposed method. A free R package called bmem is developed to conduct the power analysis discussed in this study. Four examples, including a simple mediation model, a multiple-mediator model with a latent mediator, a multiple-group mediation model, and a longitudinal mediation model, are provided to illustrate the proposed method.
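The idea of estimating power by wrapping a bootstrap test inside a Monte Carlo loop can be sketched as follows. This is a minimal standard-library Python illustration for a simple mediation model only; it is not the bmem package or its interface, and the function names, normal error terms, and default settings are all assumptions made for the example. Each replication simulates data, forms a percentile bootstrap interval for the indirect effect a*b, and counts the replication as a detection when the interval excludes zero.

```python
import random


def indirect_effect(x, m, y):
    """OLS estimate of a*b: a from M ~ X, b from Y ~ M + X."""
    n = len(x)
    mx, mm, my = sum(x) / n, sum(m) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    smm = sum((mi - mm) ** 2 for mi in m)
    sxm = sum((xi - mx) * (mi - mm) for xi, mi in zip(x, m))
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    smy = sum((mi - mm) * (yi - my) for mi, yi in zip(m, y))
    a = sxm / sxx
    b = (sxx * smy - sxm * sxy) / (sxx * smm - sxm**2)
    return a * b


def bootstrap_rejects(x, m, y, rng, n_boot, alpha):
    """True if the percentile bootstrap interval for a*b excludes zero."""
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample cases with replacement
        estimates.append(
            indirect_effect([x[i] for i in idx], [m[i] for i in idx], [y[i] for i in idx])
        )
    estimates.sort()
    lower = estimates[round(n_boot * alpha / 2)]  # approximate percentile positions
    upper = estimates[round(n_boot * (1 - alpha / 2)) - 1]
    return lower > 0 or upper < 0


def mediation_power(a, b, c, n, n_rep=200, n_boot=200, alpha=0.05, seed=0):
    """Monte Carlo estimate of the power to detect the indirect effect a*b."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_rep):
        x = [rng.gauss(0, 1) for _ in range(n)]
        m = [a * xi + rng.gauss(0, 1) for xi in x]
        y = [b * mi + c * xi + rng.gauss(0, 1) for mi, xi in zip(m, x)]
        hits += bootstrap_rejects(x, m, y, rng, n_boot, alpha)
    return hits / n_rep


power = mediation_power(a=0.4, b=0.4, c=0.1, n=80, n_rep=50, n_boot=100)
```

Swapping `rng.gauss` for a skewed or heavy-tailed generator would be the natural way to explore the nonnormal case the abstract describes; larger `n_rep` and `n_boot` values give more stable estimates at a higher computational cost.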
Monte Carlo testing in spatial statistics, with applications to spatial residuals
DEFF Research Database (Denmark)
Mrkvička, Tomáš; Soubeyrand, Samuel; Myllymäki, Mari
2016-01-01
This paper reviews recent advances made in testing in spatial statistics and discussed at the Spatial Statistics conference in Avignon 2015. The rank and directional quantile envelope tests are discussed and practical rules for their use are provided. These tests are global envelope tests...... with an appropriate type I error probability. Two novel examples are given on their usage. First, in addition to the test based on a classical one-dimensional summary function, the goodness-of-fit of a point process model is evaluated by means of the test based on a higher dimensional functional statistic, namely...
Diagnosis checking of statistical analysis in RCTs indexed in PubMed.
Lee, Paul H; Tse, Andy C Y
2017-11-01
Statistical analysis is essential for reporting of the results of randomized controlled trials (RCTs), as well as evaluating their effectiveness. However, the validity of a statistical analysis also depends on whether the assumptions of that analysis are valid. To review all RCTs published in journals indexed in PubMed during December 2014 to provide a complete picture of how RCTs handle assumptions of statistical analysis. We reviewed all RCTs published in December 2014 that appeared in journals indexed in PubMed using the Cochrane highly sensitive search strategy. The 2014 impact factors of the journals were used as proxies for their quality. The type of statistical analysis used and whether the assumptions of the analysis were tested were reviewed. In total, 451 papers were included. Of the 278 papers that reported a crude analysis for the primary outcomes, 31 (27·2%) reported whether the outcome was normally distributed. Of the 172 papers that reported an adjusted analysis for the primary outcomes, diagnosis checking was rarely conducted, with only 20%, 8·6% and 7% checked for generalized linear model, Cox proportional hazard model and multilevel model, respectively. Study characteristics (study type, drug trial, funding sources, journal type and endorsement of CONSORT guidelines) were not associated with the reporting of diagnosis checking. The diagnosis of statistical analyses in RCTs published in PubMed-indexed journals was usually absent. Journals should provide guidelines about the reporting of a diagnosis of assumptions. © 2017 Stichting European Society for Clinical Investigation Journal Foundation.
Kolmogorov complexity, pseudorandom generators and statistical models testing
Czech Academy of Sciences Publication Activity Database
Šindelář, Jan; Boček, Pavel
2002-01-01
Vol. 38, No. 6 (2002), pp. 747-759 ISSN 0023-5954 R&D Projects: GA ČR GA102/99/1564 Institutional research plan: CEZ:AV0Z1075907 Keywords : Kolmogorov complexity * pseudorandom generators * statistical models testing Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 0.341, year: 2002
Statistical tests for frequency distribution of mean gravity anomalies
African Journals Online (AJOL)
ES Obe
1980-03-01
Mar 1, 1980 ... STATISTICAL TESTS FOR FREQUENCY DISTRIBUTION OF MEAN. GRAVITY ANOMALIES. By ... approach. Kaula [1,2] discussed the method of applying statistical techniques in the ..... mathematical foundation of physical ...
Testing Genetic Pleiotropy with GWAS Summary Statistics for Marginal and Conditional Analyses.
Deng, Yangqing; Pan, Wei
2017-12-01
There is growing interest in testing genetic pleiotropy, which is when a single genetic variant influences multiple traits. Several methods have been proposed; however, these methods have some limitations. First, all the proposed methods are based on the use of individual-level genotype and phenotype data; in contrast, for logistical, and other, reasons, summary statistics of univariate SNP-trait associations are typically only available based on meta- or mega-analyzed large genome-wide association study (GWAS) data. Second, existing tests are based on marginal pleiotropy, which cannot distinguish between direct and indirect associations of a single genetic variant with multiple traits due to correlations among the traits. Hence, it is useful to consider conditional analysis, in which a subset of traits is adjusted for another subset of traits. For example, in spite of substantial lowering of low-density lipoprotein cholesterol (LDL) with statin therapy, some patients still maintain high residual cardiovascular risk, and, for these patients, it might be helpful to reduce their triglyceride (TG) level. For this purpose, in order to identify new therapeutic targets, it would be useful to identify genetic variants with pleiotropic effects on LDL and TG after adjusting the latter for LDL; otherwise, a pleiotropic effect of a genetic variant detected by a marginal model could simply be due to its association with LDL only, given the well-known correlation between the two types of lipids. Here, we develop a new pleiotropy testing procedure based only on GWAS summary statistics that can be applied for both marginal analysis and conditional analysis. Although the main technical development is based on published union-intersection testing methods, care is needed in specifying conditional models to avoid invalid statistical estimation and inference. In addition to the previously used likelihood ratio test, we also propose using generalized estimating equations under the
Understanding the Sampling Distribution and Its Use in Testing Statistical Significance.
Breunig, Nancy A.
Despite the increasing criticism of statistical significance testing by researchers, particularly in the publication of the 1994 American Psychological Association's style manual, statistical significance test results are still popular in journal articles. For this reason, it remains important to understand the logic of inferential statistics. A…
A weighted generalized score statistic for comparison of predictive values of diagnostic tests.
Kosinski, Andrzej S
2013-03-15
Positive and negative predictive values are important measures of a medical diagnostic test performance. We consider testing equality of two positive or two negative predictive values within a paired design in which all patients receive two diagnostic tests. The existing statistical tests for testing equality of predictive values are either Wald tests based on the multinomial distribution or the empirical Wald and generalized score tests within the generalized estimating equations (GEE) framework. As presented in the literature, these test statistics have considerably complex formulas without clear intuitive insight. We propose their re-formulations that are mathematically equivalent but algebraically simple and intuitive. As is clearly seen with a new re-formulation we presented, the generalized score statistic does not always reduce to the commonly used score statistic in the independent samples case. To alleviate this, we introduce a weighted generalized score (WGS) test statistic that incorporates empirical covariance matrix with newly proposed weights. This statistic is simple to compute, always reduces to the score statistic in the independent samples situation, and preserves type I error better than the other statistics as demonstrated by simulations. Thus, we believe that the proposed WGS statistic is the preferred statistic for testing equality of two predictive values and for corresponding sample size computations. The new formulas of the Wald statistics may be useful for easy computation of confidence intervals for difference of predictive values. The introduced concepts have potential to lead to development of the WGS test statistic in a general GEE setting. Copyright © 2012 John Wiley & Sons, Ltd.
A new statistic for the analysis of circular data in gamma-ray astronomy
Protheroe, R. J.
1985-01-01
A new statistic is proposed for the analysis of circular data. The statistic is designed specifically for situations where a test of uniformity is required which is powerful against alternatives in which a small fraction of the observations is grouped in a small range of directions, or phases.
DEFF Research Database (Denmark)
Nielsen, Allan Aasbjerg; Conradsen, Knut; Skriver, Henning
2016-01-01
Based on an omnibus likelihood ratio test statistic for the equality of several variance-covariance matrices following the complex Wishart distribution with an associated p-value and a factorization of this test statistic, change analysis in a short sequence of multilook, polarimetric SAR data...... in the covariance matrix representation is carried out. The omnibus test statistic and its factorization detect if and when change(s) occur. The technique is demonstrated on airborne EMISAR L-band data but may be applied to Sentinel-1, Cosmo-SkyMed, TerraSAR-X, ALOS and RadarSat-2 or other dual- and quad...
Statistical inferences for bearings life using sudden death test
Directory of Open Access Journals (Sweden)
Morariu Cristin-Olimpiu
2017-01-01
In this paper we propose a calculation method for estimating reliability indicators and a complete statistical inference for the three-parameter Weibull distribution of bearing life. Using experimental values of the durability of bearings tested on stands by sudden death tests involves a series of particularities in maximum likelihood estimation and in carrying out the statistical inference. The paper details these features and also provides an example calculation.
Selecting the most appropriate inferential statistical test for your quantitative research study.
Bettany-Saltikov, Josette; Whittaker, Victoria Jane
2014-06-01
To discuss the issues and processes relating to the selection of the most appropriate statistical test. A review of the basic research concepts together with a number of clinical scenarios is used to illustrate this. Quantitative nursing research generally features the use of empirical data which necessitates the selection of both descriptive and statistical tests. Different types of research questions can be answered by different types of research designs, which in turn need to be matched to a specific statistical test(s). Discursive paper. This paper discusses the issues relating to the selection of the most appropriate statistical test and makes some recommendations as to how these might be dealt with. When conducting empirical quantitative studies, a number of key issues need to be considered. Considerations for selecting the most appropriate statistical tests are discussed and flow charts provided to facilitate this process. When nursing clinicians and researchers conduct quantitative research studies, it is crucial that the most appropriate statistical test is selected to enable valid conclusions to be made. © 2013 John Wiley & Sons Ltd.
Testing the Difference of Correlated Agreement Coefficients for Statistical Significance
Gwet, Kilem L.
2016-01-01
This article addresses the problem of testing the difference between two correlated agreement coefficients for statistical significance. A number of authors have proposed methods for testing the difference between two correlated kappa coefficients, which require either the use of resampling methods or the use of advanced statistical modeling…
Bayesian Sensitivity Analysis of Statistical Models with Missing Data.
Zhu, Hongtu; Ibrahim, Joseph G; Tang, Niansheng
2014-04-01
Methods for handling missing data depend strongly on the mechanism that generated the missing values, such as missing completely at random (MCAR) or missing at random (MAR), as well as other distributional and modeling assumptions at various stages. It is well known that the resulting estimates and tests may be sensitive to these assumptions as well as to outlying observations. In this paper, we introduce various perturbations to modeling assumptions and individual observations, and then develop a formal sensitivity analysis to assess these perturbations in the Bayesian analysis of statistical models with missing data. We develop a geometric framework, called the Bayesian perturbation manifold, to characterize the intrinsic structure of these perturbations. We propose several intrinsic influence measures to perform sensitivity analysis and quantify the effect of various perturbations to statistical models. We use the proposed sensitivity analysis procedure to systematically investigate the tenability of the non-ignorable missing at random (NMAR) assumption. Simulation studies are conducted to evaluate our methods, and a dataset is analyzed to illustrate the use of our diagnostic measures.
Statistical analysis and interpolation of compositional data in materials science.
Pesenson, Misha Z; Suram, Santosh K; Gregoire, John M
2015-02-09
Compositional data are ubiquitous in chemistry and materials science: analysis of elements in multicomponent systems, combinatorial problems, etc., lead to data that are non-negative and sum to a constant (for example, atomic concentrations). The constant sum constraint restricts the sampling space to a simplex instead of the usual Euclidean space. Since statistical measures such as mean and standard deviation are defined for the Euclidean space, traditional correlation studies, multivariate analysis, and hypothesis testing may lead to erroneous dependencies and incorrect inferences when applied to compositional data. Furthermore, composition measurements that are used for data analytics may not include all of the elements contained in the material; that is, the measurements may be subcompositions of a higher-dimensional parent composition. Physically meaningful statistical analysis must yield results that are invariant under the number of composition elements, requiring the application of specialized statistical tools. We present specifics and subtleties of compositional data processing through discussion of illustrative examples. We introduce basic concepts, terminology, and methods required for the analysis of compositional data and utilize them for the spatial interpolation of composition in a sputtered thin film. The results demonstrate the importance of this mathematical framework for compositional data analysis (CDA) in the fields of materials science and chemistry.
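The constant-sum constraint and the subcomposition issue described above can be made concrete with a small sketch. The code below is an illustration in standard-library Python, not taken from the paper; it uses the closure operation and the centered log-ratio (clr) transform, which are standard tools of compositional data analysis but are not named in the abstract, and the three-element composition is hypothetical. It shows that a raw proportion changes when one element is left out of the measurement, while the log-ratio between two measured elements does not.

```python
import math


def closure(parts):
    """Rescale non-negative parts so that they sum to 1."""
    total = sum(parts)
    return [p / total for p in parts]


def clr(parts):
    """Centered log-ratio transform: log of each part relative to the geometric mean."""
    logs = [math.log(p) for p in closure(parts)]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


# Hypothetical atomic concentrations of a three-element material.
full = closure([20.0, 30.0, 50.0])
# The same material measured without the third element (a subcomposition).
sub = closure([20.0, 30.0])

raw_changes = full[0] != sub[0]  # 0.2 versus 0.4
log_ratio_full = math.log(full[0] / full[1])
log_ratio_sub = math.log(sub[0] / sub[1])  # equal to log_ratio_full up to rounding
transformed = clr([20.0, 30.0, 50.0])  # components sum to zero
```

Statistics such as means and correlations are then computed on log-ratio coordinates rather than on the raw proportions, which is what makes them insensitive to the overall scale of the measurement.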
Statistical Estimation of Heterogeneities: A New Frontier in Well Testing
Neuman, S. P.; Guadagnini, A.; Illman, W. A.; Riva, M.; Vesselinov, V. V.
2001-12-01
Well-testing methods have traditionally relied on analytical solutions of groundwater flow equations in relatively simple domains, consisting of one or at most a few units having uniform hydraulic properties. Recently, attention has been shifting toward methods and solutions that would allow one to characterize subsurface heterogeneities in greater detail. On one hand, geostatistical inverse methods are being used to assess the spatial variability of parameters, such as permeability and porosity, on the basis of multiple cross-hole pressure interference tests. On the other hand, analytical solutions are being developed to describe the mean and variance (first and second statistical moments) of flow to a well in a randomly heterogeneous medium. Geostatistical inverse interpretation of cross-hole tests yields a smoothed but detailed "tomographic" image of how parameters actually vary in three-dimensional space, together with corresponding measures of estimation uncertainty. Moment solutions may soon allow one to interpret well tests in terms of statistical parameters such as the mean and variance of log permeability, its spatial autocorrelation and statistical anisotropy. The idea of geostatistical cross-hole tomography is illustrated through pneumatic injection tests conducted in unsaturated fractured tuff at the Apache Leap Research Site near Superior, Arizona. The idea of using moment equations to interpret well-tests statistically is illustrated through a recently developed three-dimensional solution for steady state flow to a well in a bounded, randomly heterogeneous, statistically anisotropic aquifer.
A shift from significance test to hypothesis test through power analysis in medical research.
Singh, G
2006-01-01
Until recently, the medical research literature was substantially dominated by Fisher's significance test approach to statistical inference, which concentrates on the probability of type I error, over the Neyman-Pearson hypothesis test, which considers the probabilities of both type I and type II errors. Fisher's approach dichotomises results into significant or not significant with a P value. The Neyman-Pearson approach speaks of acceptance or rejection of the null hypothesis. Based on the same theory, these two approaches deal with the same objective and conclude in their own way. Advances in computing techniques and the availability of statistical software have resulted in increasing application of power calculations in medical research, and thereby in reporting the results of significance tests in the light of the power of the test as well. The significance test approach, when it incorporates power analysis, contains the essence of the hypothesis test approach. It may be safely argued that the rising application of power analysis in medical research may have initiated a shift from Fisher's significance test to the Neyman-Pearson hypothesis test procedure.
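A power calculation of the kind referred to here ties together the type I error rate, the type II error rate, the effect size, and the sample size. The sketch below is a minimal illustration using Python's standard library and the normal approximation for a two-sided comparison of two group means with equal group sizes; it is not from the paper, and the function names and the standardized effect size d are assumptions of the example.

```python
from statistics import NormalDist


def two_sample_power(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided two-sample z-test for standardized effect d."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)  # rejection threshold set by the type I error rate
    shift = d * (n_per_group / 2) ** 0.5  # expected value of the test statistic
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


def n_for_power(d, power=0.80, alpha=0.05):
    """Smallest per-group sample size reaching the requested power."""
    if d == 0:
        raise ValueError("a zero effect can never reach power above alpha")
    n = 2
    while two_sample_power(d, n, alpha) < power:
        n += 1
    return n


achieved = two_sample_power(0.5, 64)  # power with 64 subjects per group
required = n_for_power(0.5)  # per-group n for 80% power
```

With d = 0 the function returns alpha, since under the null hypothesis the probability of rejection is the type I error rate; 1 minus the returned power is the type II error rate that the Neyman-Pearson framework treats explicitly. A t-based calculation would give slightly larger sample sizes than this normal approximation.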
International Nuclear Information System (INIS)
Brodsky, A.
1979-01-01
Some recent reports by Mancuso, Stewart and Kneale claim findings of radiation-produced cancer in the Hanford worker population. These claims are based on statistical computations that use small differences in accumulated exposures between groups dying of cancer and groups dying of other causes; actual mortality and longevity were not reported. This paper presents a statistical method for evaluation of actual mortality and longevity longitudinally over time, as applied in a primary analysis of the mortality experience of the Hanford worker population. Although available, this method was not utilized in the Mancuso-Stewart-Kneale paper. The author's preliminary longitudinal analysis shows that the gross mortality experience of persons employed at Hanford during the 1943-70 interval did not differ significantly from that of certain controls, when both employees and controls were selected from families with two or more offspring and comparisons were matched by age, sex, race and year of entry into employment. This result is consistent with findings reported by Sanders (Health Phys. vol.35, 521-538, 1978). The method utilizes an approximate chi-square (1 D.F.) statistic for testing population subgroup comparisons, as well as the cumulation of chi-squares (1 D.F.) for testing the overall result of a particular type of comparison. The method is available for computer testing of the Hanford mortality data, and could also be adapted to morbidity or other population studies. (author)
688,112 statistical results : Content mining psychology articles for statistical test results
Hartgerink, C.H.J.
2016-01-01
In this data deposit, I describe a dataset that is the result of content mining 167,318 published articles for statistical test results reported according to the standards prescribed by the American Psychological Association (APA). Articles published by the APA, Springer, Sage, and Taylor & Francis
International Nuclear Information System (INIS)
Lima, Waldir C. de; Lainetti, Paulo E.O.; Lima, Roberto M. de; Peres, Henrique G.
1996-01-01
The purpose of this work is to study the introduction of statistical control in the tests and analyses carried out in the Departamento de Tecnologia de Combustiveis. The following are introduced succinctly: theories of statistical process control, the construction of control charts, the definition of standard tests (or analyses), and how the standards are employed to determine the control limits in the charts. The most significant result is the form applied to practical quality control; the use of one verification and analysis standard in the control laboratory is also exemplified. (author)
Statistical shape analysis with applications in R
Dryden, Ian L
2016-01-01
A thoroughly revised and updated edition of this introduction to modern statistical methods for shape analysis. Shape analysis is an important tool in the many disciplines where objects are compared using geometrical features. Examples include comparing brain shape in schizophrenia; investigating protein molecules in bioinformatics; and describing growth of organisms in biology. This book is a significant update of the highly-regarded 'Statistical Shape Analysis' by the same authors. The new edition lays the foundations of landmark shape analysis, including geometrical concepts and statistical techniques, and extends to include analysis of curves, surfaces, images and other types of object data. Key definitions and concepts are discussed throughout, and the relative merits of different approaches are presented. The authors have included substantial new material on recent statistical developments and offer numerous examples throughout the text. Concepts are introduced in an accessible manner, while reta...
Spatial analysis statistics, visualization, and computational methods
Oyana, Tonny J
2015-01-01
An introductory text for the next generation of geospatial analysts and data scientists, Spatial Analysis: Statistics, Visualization, and Computational Methods focuses on the fundamentals of spatial analysis using traditional, contemporary, and computational methods. Outlining both non-spatial and spatial statistical concepts, the authors present practical applications of geospatial data tools, techniques, and strategies in geographic studies. They offer a problem-based learning (PBL) approach to spatial analysis, containing hands-on problem sets that can be worked out in MS Excel or ArcGIS, as well as detailed illustrations and numerous case studies. The book enables readers to: identify types and characterize non-spatial and spatial data; demonstrate their competence to explore, visualize, summarize, analyze, optimize, and clearly present statistical data and results; construct testable hypotheses that require inferential statistical analysis; process spatial data, extract explanatory variables, conduct statisti...
Building the Community Online Resource for Statistical Seismicity Analysis (CORSSA)
Michael, A. J.; Wiemer, S.; Zechar, J. D.; Hardebeck, J. L.; Naylor, M.; Zhuang, J.; Steacy, S.; Corssa Executive Committee
2010-12-01
Statistical seismology is critical to the understanding of seismicity, the testing of proposed earthquake prediction and forecasting methods, and the assessment of seismic hazard. Unfortunately, despite its importance to seismology - especially to those aspects with great impact on public policy - statistical seismology is mostly ignored in the education of seismologists, and there is no central repository for the existing open-source software tools. To remedy these deficiencies, and with the broader goal to enhance the quality of statistical seismology research, we have begun building the Community Online Resource for Statistical Seismicity Analysis (CORSSA). CORSSA is a web-based educational platform that is authoritative, up-to-date, prominent, and user-friendly. We anticipate that the users of CORSSA will range from beginning graduate students to experienced researchers. More than 20 scientists from around the world met for a week in Zurich in May 2010 to kick-start the creation of CORSSA: the format and initial table of contents were defined; a governing structure was organized; and workshop participants began drafting articles. CORSSA materials are organized with respect to six themes, each containing between four and eight articles. The CORSSA web page, www.corssa.org, officially unveiled on September 6, 2010, debuts with an initial set of approximately 10 to 15 articles available online for viewing and commenting with additional articles to be added over the coming months. Each article will be peer-reviewed and will present a balanced discussion, including illustrative examples and code snippets. Topics in the initial set of articles will include: introductions to both CORSSA and statistical seismology, basic statistical tests and their role in seismology; understanding seismicity catalogs and their problems; basic techniques for modeling seismicity; and methods for testing earthquake predictability hypotheses. A special article will compare and review
De Hertogh, Benoît; De Meulder, Bertrand; Berger, Fabrice; Pierre, Michael; Bareke, Eric; Gaigneaux, Anthoula; Depiereux, Eric
2010-01-11
Recent reanalysis of spike-in datasets underscored the need for new and more accurate benchmark datasets for statistical microarray analysis. We present here a fresh method using biologically-relevant data to evaluate the performance of statistical methods. Our novel method ranks the probesets from a dataset composed of publicly-available biological microarray data and extracts subset matrices with precise information/noise ratios. Our method can be used to determine the capability of different methods to better estimate variance for a given number of replicates. The mean-variance and mean-fold change relationships of the matrices revealed a closer approximation of biological reality. Performance analysis refined the results from benchmarks published previously. We show that the Shrinkage t test (close to Limma) was the best of the methods tested, except when two replicates were examined, where the Regularized t test and the Window t test performed slightly better. The R scripts used for the analysis are available at http://urbm-cluster.urbm.fundp.ac.be/~bdemeulder/.
CUSUM-based person-fit statistics for adaptive testing
van Krimpen-Stoop, Edith; Meijer, R.R.
2001-01-01
Item scores that do not fit an assumed item response theory model may cause the latent trait value to be inaccurately estimated. Several person-fit statistics for detecting nonfitting score patterns for paper-and-pencil tests have been proposed. In the context of computerized adaptive tests (CAT),
CUSUM-based person-fit statistics for adaptive testing
van Krimpen-Stoop, Edith; Meijer, R.R.
1999-01-01
Item scores that do not fit an assumed item response theory model may cause the latent trait value to be estimated inaccurately. Several person-fit statistics for detecting nonfitting score patterns for paper-and-pencil tests have been proposed. In the context of computerized adaptive tests (CAT),
International Nuclear Information System (INIS)
Gouvea, Andre de; Murayama, Hitoshi
2003-01-01
'Anarchy' is the hypothesis that there is no fundamental distinction among the three flavors of neutrinos. It describes the mixing angles as random variables, drawn from well-defined probability distributions dictated by the group Haar measure. We perform a Kolmogorov-Smirnov (KS) statistical test to verify whether anarchy is consistent with all neutrino data, including the new result presented by KamLAND. We find a KS probability for Nature's choice of mixing angles equal to 64%, quite consistent with the anarchical hypothesis. In turn, assuming that anarchy is indeed correct, we compute lower bounds on |U_e3|^2, the remaining unknown 'angle' of the leptonic mixing matrix.
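For readers unfamiliar with the KS test named in this abstract, a generic one-sample version can be sketched as follows. This is a minimal illustration only: it does not reproduce the Haar-measure distributions or the data used by the authors, the function names are hypothetical, and the p-value uses the large-sample Kolmogorov series, which is only approximate for small samples.

```python
from math import exp, sqrt


def ks_statistic(sample, cdf):
    """One-sample KS statistic: largest gap between the empirical CDF
    of `sample` and the hypothesised continuous CDF `cdf`."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def ks_asymptotic_pvalue(d, n, terms=100):
    """Asymptotic p-value Q(lam) = 2 * sum_{k>=1} (-1)^(k-1) exp(-2 k^2 lam^2),
    with lam = sqrt(n) * d."""
    lam = sqrt(n) * d
    if lam < 0.2:
        # The alternating series is numerically awkward here; Q is 1 to
        # within about 1e-12 in this range.
        return 1.0
    total = 0.0
    for k in range(1, terms + 1):
        total += (-1) ** (k - 1) * exp(-2.0 * k * k * lam * lam)
    return min(max(2.0 * total, 0.0), 1.0)
```

A large p-value, such as the 64% quoted in the abstract, means the observed values are unremarkable under the hypothesised distribution; it does not by itself establish that the hypothesis is true.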
Corrections of the NIST Statistical Test Suite for Randomness
Kim, Song-Ju; Umeno, Ken; Hasegawa, Akio
2004-01-01
It is well known that the NIST statistical test suite was used for the evaluation of AES candidate algorithms. We have found that the test settings of the Discrete Fourier Transform test and the Lempel-Ziv test of this test suite are wrong. We give four corrections of mistakes in the test settings. This suggests that re-evaluation of the test results is needed.
Application of descriptive statistics in analysis of experimental data
Mirilović Milorad; Pejin Ivana
2008-01-01
Statistics today represent a group of scientific methods for the quantitative and qualitative investigation of variations in mass appearances. In fact, statistics present a group of methods that are used for the accumulation, analysis, presentation and interpretation of data necessary for reaching certain conclusions. Statistical analysis is divided into descriptive statistical analysis and inferential statistics. The values which represent the results of an experiment, and which are the subj...
Velasco-Tapia, Fernando
2014-01-01
Magmatic processes have usually been identified and evaluated using qualitative or semiquantitative geochemical or isotopic tools based on a restricted number of variables. However, a more complete and quantitative view could be reached applying multivariate analysis, mass balance techniques, and statistical tests. As an example, in this work a statistical and quantitative scheme is applied to analyze the geochemical features for the Sierra de las Cruces (SC) volcanic range (Mexican Volcanic Belt). In this locality, the volcanic activity (3.7 to 0.5 Ma) was dominantly dacitic, but the presence of spheroidal andesitic enclaves and/or diverse disequilibrium features in the majority of lavas confirms the operation of magma mixing/mingling. New discriminant-function-based multidimensional diagrams were used to discriminate tectonic setting. Statistical tests of discordancy and significance were applied to evaluate the influence of the subducting Cocos plate, which seems to be rather negligible for the SC magmas in relation to several major and trace elements. A cluster analysis following Ward's linkage rule was carried out to classify the SC volcanic rocks into geochemical groups. Finally, two mass-balance schemes were applied for the quantitative evaluation of the proportion of the end-member components (dacitic and andesitic magmas) in the comingled lavas (binary mixtures). PMID:24737994
Directory of Open Access Journals (Sweden)
Fernando Velasco-Tapia
2014-01-01
Magmatic processes have usually been identified and evaluated using qualitative or semiquantitative geochemical or isotopic tools based on a restricted number of variables. However, a more complete and quantitative view could be reached applying multivariate analysis, mass balance techniques, and statistical tests. As an example, in this work a statistical and quantitative scheme is applied to analyze the geochemical features for the Sierra de las Cruces (SC) volcanic range (Mexican Volcanic Belt). In this locality, the volcanic activity (3.7 to 0.5 Ma) was dominantly dacitic, but the presence of spheroidal andesitic enclaves and/or diverse disequilibrium features in the majority of lavas confirms the operation of magma mixing/mingling. New discriminant-function-based multidimensional diagrams were used to discriminate tectonic setting. Statistical tests of discordancy and significance were applied to evaluate the influence of the subducting Cocos plate, which seems to be rather negligible for the SC magmas in relation to several major and trace elements. A cluster analysis following Ward's linkage rule was carried out to classify the SC volcanic rocks into geochemical groups. Finally, two mass-balance schemes were applied for the quantitative evaluation of the proportion of the end-member components (dacitic and andesitic magmas) in the comingled lavas (binary mixtures).
Statistical analysis of subjective preferences for video enhancement
Woods, Russell L.; Satgunam, PremNandhini; Bronstad, P. Matthew; Peli, Eli
2010-02-01
Measuring preferences for moving video quality is harder than for static images due to the fleeting and variable nature of moving video. Subjective preferences for image quality can be tested by observers indicating their preference for one image over another. Such pairwise comparisons can be analyzed using Thurstone scaling (Farrell, 1999). Thurstone (1927) scaling is widely used in applied psychology, marketing, food tasting and advertising research. Thurstone analysis constructs an arbitrary perceptual scale for the items that are compared (e.g. enhancement levels). However, Thurstone scaling does not determine the statistical significance of the differences between items on that perceptual scale. Recent papers have provided inferential statistical methods that produce an outcome similar to Thurstone scaling (Lipovetsky and Conklin, 2004). Here, we demonstrate that binary logistic regression can analyze preferences for enhanced video.
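The Thurstone scaling step described in this abstract can be sketched in a few lines. This is a minimal illustration of the classical Case V solution, not the authors' analysis: the function name, the win-count matrix layout and the clipping of extreme proportions are assumptions made here for the example.

```python
from statistics import NormalDist


def thurstone_case_v(wins, clip=0.01):
    """Thurstone Case V scale values from pairwise preference counts.

    wins[i][j] is the number of times item i was preferred over item j.
    Every pair is assumed to have been compared at least once.
    Proportions are clipped to [clip, 1 - clip] (an assumption of this
    sketch) so that unanimous pairs do not give infinite z-scores.
    """
    n = len(wins)
    inv_cdf = NormalDist().inv_cdf
    scale = []
    for i in range(n):
        z_sum = 0.0
        for j in range(n):
            if i == j:
                continue  # the diagonal contributes z = 0
            p = wins[i][j] / (wins[i][j] + wins[j][i])
            p = min(max(p, clip), 1.0 - clip)
            z_sum += inv_cdf(p)
        # Row mean of z-scores, counting the zero diagonal entry.
        scale.append(z_sum / n)
    return scale
```

The returned values sit on an arbitrary interval scale centred on zero. As the abstract notes, they carry no significance test on their own, which is the gap the logistic regression approach is meant to fill.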
Analysis of spectral data with rare events statistics
International Nuclear Information System (INIS)
Ilyushchenko, V.I.; Chernov, N.I.
1990-01-01
The paper considers the analysis of experimental data when the results of individual experimental runs cannot be summed because of large systematic errors. The hypothesis of persistent peaks in the spectra was analysed statistically by means of the Neyman-Pearson test. The computations demonstrate that the confidence level for the hypothesis of a persistent peak in the spectrum is proportional to the square root of the number of independent experimental runs, K. 5 refs
A Guideline to Univariate Statistical Analysis for LC/MS-Based Untargeted Metabolomics-Derived Data
Directory of Open Access Journals (Sweden)
Maria Vinaixa
2012-10-01
Several metabolomic software programs provide methods for peak picking, retention time alignment and quantification of metabolite features in LC/MS-based metabolomics. Statistical analysis, however, is needed in order to discover those features significantly altered between samples. By comparing the retention time and MS/MS data of a model compound to that from the altered feature of interest in the research sample, metabolites can then be unequivocally identified. This paper reports on a comprehensive overview of a workflow for statistical analysis to rank relevant metabolite features that will be selected for further MS/MS experiments. We focus on univariate data analysis applied in parallel on all detected features. Characteristics and challenges of this analysis are discussed and illustrated using four different real LC/MS untargeted metabolomic datasets. We demonstrate the influence of considering or violating the mathematical assumptions on which univariate statistical tests rely, using high-dimensional LC/MS datasets. Issues in data analysis such as determination of sample size, analytical variation, assumption of normality and homoscedasticity, or correction for multiple testing are discussed and illustrated in the context of our four untargeted LC/MS working examples.
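Correction for multiple testing matters here because one univariate test is run per detected feature. The abstract does not say which correction the authors use; as an assumption for illustration, the sketch below implements the widely used Benjamini-Hochberg false discovery rate adjustment, with a hypothetical function name.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order.

    The adjusted value for the p-value of rank r (1 = smallest) out of m
    is min over ranks s >= r of p_(s) * m / s, capped at 1.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the running minimum enforces
    # monotonicity of the adjusted values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

Features whose adjusted p-value falls below the chosen false discovery rate (for example 0.05) would then be carried forward, in the workflow the abstract describes, as candidates for MS/MS identification.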
Benchmark validation of statistical models: Application to mediation analysis of imagery and memory.
MacKinnon, David P; Valente, Matthew J; Wurpts, Ingrid C
2018-03-29
This article describes benchmark validation, an approach to validating a statistical model. According to benchmark validation, a valid model generates estimates and research conclusions consistent with a known substantive effect. Three types of benchmark validation, (a) benchmark value, (b) benchmark estimate, and (c) benchmark effect, are described and illustrated with examples. Benchmark validation methods are especially useful for statistical models with assumptions that are untestable or very difficult to test. Benchmark effect validation methods were applied to evaluate statistical mediation analysis in eight studies using the established effect that increasing mental imagery improves recall of words. Statistical mediation analysis led to conclusions about mediation that were consistent with established theory that increased imagery leads to increased word recall. Benchmark validation based on established substantive theory is discussed as a general way to investigate characteristics of statistical models and a complement to mathematical proof and statistical simulation. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
Statistical Analysis of Research Data | Center for Cancer Research
Recent advances in cancer biology have resulted in the need for increased statistical analysis of research data. The Statistical Analysis of Research Data (SARD) course will be held on April 5-6, 2018 from 9 a.m.-5 p.m. at the National Institutes of Health's Natcher Conference Center, Balcony C on the Bethesda Campus. SARD is designed to provide an overview on the general principles of statistical analysis of research data. The first day will feature univariate data analysis, including descriptive statistics, probability distributions, one- and two-sample inferential statistics.
Effect of the absolute statistic on gene-sampling gene-set analysis methods.
Nam, Dougu
2017-06-01
Gene-set enrichment analysis and its modified versions have commonly been used for identifying altered functions or pathways in disease from microarray data. In particular, the simple gene-sampling gene-set analysis methods have been heavily used for datasets with only a few sample replicates. The biggest problem with this approach is the highly inflated false-positive rate. In this paper, the effect of absolute gene statistic on gene-sampling gene-set analysis methods is systematically investigated. Thus far, the absolute gene statistic has merely been regarded as a supplementary method for capturing the bidirectional changes in each gene set. Here, it is shown that incorporating the absolute gene statistic in gene-sampling gene-set analysis substantially reduces the false-positive rate and improves the overall discriminatory ability. Its effect was investigated by power, false-positive rate, and receiver operating curve for a number of simulated and real datasets. The performances of gene-set analysis methods in one-tailed (genome-wide association study) and two-tailed (gene expression data) tests were also compared and discussed.
CFAssay: statistical analysis of the colony formation assay
International Nuclear Information System (INIS)
Braselmann, Herbert; Michna, Agata; Heß, Julia; Unger, Kristian
2015-01-01
Colony formation assay is the gold standard to determine cell reproductive death after treatment with ionizing radiation, applied for different cell lines or in combination with other treatment modalities. Associated linear-quadratic cell survival curves can be calculated with different methods. For easy code exchange and methodological standardisation among collaborating laboratories, a software package CFAssay for R (R Core Team, R: A Language and Environment for Statistical Computing, 2014) was established to perform thorough statistical analysis of linear-quadratic cell survival curves after treatment with ionizing radiation and of two-way designs of experiments with chemical treatments only. CFAssay offers maximum likelihood and related methods by default, and the least squares or weighted least squares method can be optionally chosen. A test for comparison of cell survival curves and an ANOVA test for experimental two-way designs are provided. For the two presented examples, estimated parameters do not differ much between maximum likelihood and least squares. However, the dispersion parameter of the quasi-likelihood method is much more sensitive to statistical variation in the data than the multiple R^2 coefficient of determination from the least squares method. The dispersion parameter for goodness of fit and different plot functions in CFAssay help to evaluate experimental data quality. As open source software, it facilitates interlaboratory code sharing between users.
Wang, D Z; Wang, C; Shen, C F; Zhang, Y; Zhang, H; Song, G D; Xue, X D; Xu, Z L; Zhang, S; Jiang, G H
2017-05-10
We described the time trend of the acute myocardial infarction (AMI) incidence rate in Tianjin from 1999 to 2013 with the Cochran-Armitage trend (CAT) test and linear regression analysis, and the results were compared. Based on the actual population, the CAT test had much stronger statistical power than linear regression analysis for both the overall incidence trend and the age-specific incidence trend (Cochran-Armitage trend P value < linear regression P value). The statistical power of the CAT test decreased, while the result of linear regression analysis remained the same, when the population size was reduced by 100 times and the AMI incidence rate remained unchanged. The two statistical methods have their advantages and disadvantages. It is necessary to choose the statistical method according to the fitting degree of the data, or to analyze the results of the two methods comprehensively.
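The dependence of the CAT test on population size can be seen directly from its test statistic. The following is a minimal sketch of the standard textbook form of the Cochran-Armitage trend test, not the authors' code; the function name and the default equally spaced scores are assumptions of this example.

```python
from math import sqrt
from statistics import NormalDist


def cochran_armitage_trend(cases, totals, scores=None):
    """Cochran-Armitage test for a linear trend in proportions.

    cases[i] / totals[i] is the observed proportion in ordered group i
    (for example, a calendar year). Returns (z, two-sided p-value).
    """
    k = len(cases)
    if scores is None:
        scores = list(range(k))       # equally spaced scores (assumed)
    n_total = sum(totals)
    p_bar = sum(cases) / n_total      # pooled proportion under H0
    # Score-weighted sum of deviations from the pooled expectation.
    t = sum(s * (r - n * p_bar) for s, r, n in zip(scores, cases, totals))
    s1 = sum(n * s for s, n in zip(scores, totals))
    s2 = sum(n * s * s for s, n in zip(scores, totals))
    var_t = p_bar * (1.0 - p_bar) * (s2 - s1 * s1 / n_total)
    z = t / sqrt(var_t)
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_value
```

Because both the numerator and the variance scale with the counts, dividing every count by 100 at fixed rates divides z by 10. This is consistent with the loss of power under a reduced population size reported in the abstract, whereas a regression fitted to the yearly rates alone does not see the denominators.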
Statistical Analysis of Geo-electric Imaging and Geotechnical Test ...
Indian Academy of Sciences (India)
12
On the other hand cost-effective geoelectric imaging methods provide 2-D / 3-D .... SPSS (Statistical package for social sciences) have been used to carry out linear ..... P W J 1997 Theory of ionic surface electrical conduction in porous media;.
Comparing statistical tests for detecting soil contamination greater than background
International Nuclear Information System (INIS)
Hardin, J.W.; Gilbert, R.O.
1993-12-01
The Washington State Department of Ecology (WSDE) recently issued a report that provides guidance on statistical issues regarding investigation and cleanup of soil and groundwater contamination under the Model Toxics Control Act Cleanup Regulation. Included in the report are procedures for determining a background-based cleanup standard and for conducting a 3-step statistical test procedure to decide if a site is contaminated greater than the background standard. The guidance specifies that the State test should only be used if the background and site data are lognormally distributed. The guidance in WSDE allows for using alternative tests on a site-specific basis if prior approval is obtained from WSDE. This report presents the results of a Monte Carlo computer simulation study conducted to evaluate the performance of the State test and several alternative tests for various contamination scenarios (background and site data distributions). The primary test performance criteria are (1) the probability the test will indicate that a contaminated site is indeed contaminated, and (2) the probability that the test will indicate an uncontaminated site is contaminated. The simulation study was conducted assuming the background concentrations were from lognormal or Weibull distributions. The site data were drawn from distributions selected to represent various contamination scenarios. The statistical tests studied are the State test, t test, Satterthwaite's t test, five distribution-free tests, and several tandem tests (wherein two or more tests are conducted using the same data set)
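Of the tests compared in this report, Satterthwaite's t test is the easiest to write down. The sketch below shows the unequal-variance t statistic with Satterthwaite's approximate degrees of freedom; it is an illustration under the usual textbook definition, not the procedure or code used in the report, and the function name is hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def satterthwaite_t(site, background):
    """Unequal-variance t statistic and Satterthwaite degrees of freedom
    for testing whether the site mean exceeds the background mean.

    Each sample needs at least two values and a non-zero variance overall.
    """
    n1, n2 = len(site), len(background)
    v1 = variance(site) / n1          # squared SE of the site mean
    v2 = variance(background) / n2    # squared SE of the background mean
    t = (mean(site) - mean(background)) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df
```

The resulting t would be compared with the upper tail of a t distribution with df degrees of freedom (generally non-integer); that lookup is left out because it needs a t-distribution function that the Python standard library does not provide.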
Statistical analysis with Excel for dummies
Schmuller, Joseph
2013-01-01
Take the mystery out of statistical terms and put Excel to work! If you need to create and interpret statistics in business or classroom settings, this easy-to-use guide is just what you need. It shows you how to use Excel's powerful tools for statistical analysis, even if you've never taken a course in statistics. Learn the meaning of terms like mean and median, margin of error, standard deviation, and permutations, and discover how to interpret the statistics of everyday life. You'll learn to use Excel formulas, charts, PivotTables, and other tools to make sense of everything fro
Testing and qualification of confidence in statistical procedures
Energy Technology Data Exchange (ETDEWEB)
Serghiuta, D.; Tholammakkil, J.; Hammouda, N. [Canadian Nuclear Safety Commission (Canada); O' Hagan, A. [Sheffield Univ. (United Kingdom)
2014-07-01
This paper discusses a framework for designing artificial test problems, evaluation criteria, and two of the benchmark tests developed under a research project initiated by the Canadian Nuclear Safety Commission to investigate the approaches for qualification of tolerance limit methods and algorithms proposed for application in optimization of CANDU regional/neutron overpower protection trip setpoints for aged conditions. A significant component of this investigation has been the development of a series of benchmark problems of gradually increased complexity, from simple 'theoretical' problems up to complex problems closer to the real application. The first benchmark problem discussed in this paper is a simplified scalar problem which does not involve extremal, maximum or minimum, operations, typically encountered in the real applications. The second benchmark is a high dimensional, but still simple, problem for statistical inference of maximum channel power during normal operation. Bayesian algorithms have been developed for each benchmark problem to provide an independent way of constructing tolerance limits from the same data and allow assessing how well different methods make use of those data and, depending on the type of application, evaluating what the level of 'conservatism' is. The Bayesian method is not, however, used as a reference method, or 'gold' standard, but simply as an independent review method. The approach and the tests developed can be used as a starting point for developing a generic suite (generic in the sense of potentially applying whatever the proposed statistical method) of empirical studies, with clear criteria for passing those tests. Some lessons learned, in particular concerning the need to assure the completeness of the description of the application and the role of completeness of input information, are also discussed. It is concluded that a formal process which includes extended and detailed benchmark
A shift from significance test to hypothesis test through power analysis in medical research
Directory of Open Access Journals (Sweden)
Singh Girish
2006-01-01
Until recently, medical research literature exhibited substantial dominance of Fisher's significance test approach to statistical inference, which concentrates on the probability of type I error, over Neyman-Pearson's hypothesis test, which considers the probabilities of both type I and type II error. Fisher's approach dichotomises results into significant or not significant with a P value. The Neyman-Pearson approach speaks of acceptance or rejection of the null hypothesis. Based on the same theory, these two approaches deal with the same objective and conclude in their own way. Advances in computing techniques and the availability of statistical software have led to increasing application of power calculations in medical research, and thereby to reporting the results of significance tests in the light of the power of the test as well. The significance test approach, when it incorporates power analysis, contains the essence of the hypothesis test approach. It may be safely argued that the rising application of power analysis in medical research may have initiated a shift from Fisher's significance test to Neyman-Pearson's hypothesis test procedure.
Test for the statistical significance of differences between ROC curves
International Nuclear Information System (INIS)
Metz, C.E.; Kronman, H.B.
1979-01-01
A test for the statistical significance of observed differences between two measured Receiver Operating Characteristic (ROC) curves has been designed and evaluated. The set of observer response data for each ROC curve is assumed to be independent and to arise from a ROC curve having a form which, in the absence of statistical fluctuations in the response data, graphs as a straight line on double normal-deviate axes. To test the significance of an apparent difference between two measured ROC curves, maximum likelihood estimates of the two parameters of each curve and the associated parameter variances and covariance are calculated from the corresponding set of observer response data. An approximate Chi-square statistic with two degrees of freedom is then constructed from the differences between the parameters estimated for each ROC curve and from the variances and covariances of these estimates. This statistic is known to be truly Chi-square distributed only in the limit of large numbers of trials in the observer performance experiments. Performance of the statistic for data arising from a limited number of experimental trials was evaluated. Independent sets of rating scale data arising from the same underlying ROC curve were paired, and the fraction of differences found (falsely) significant was compared to the significance level, α, used with the test. Although test performance was found to be somewhat dependent on both the number of trials in the data and the position of the underlying ROC curve in the ROC space, the results for various significance levels showed the test to be reliable under practical experimental conditions
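The construction of the two-degree-of-freedom statistic described in this abstract can be sketched as a quadratic form in the parameter differences. This is a minimal illustration, assuming the two fitted curves are independent so that the covariance of the difference is the sum of the two covariance matrices; the function name and argument layout are hypothetical, and the maximum likelihood fitting step itself is not shown.

```python
from math import exp


def roc_difference_chi2(params1, cov1, params2, cov2):
    """Approximate chi-square (2 d.f.) test for two independent ROC curves.

    params*: (a, b), the two fitted parameters of each binormal ROC curve.
    cov*:    2x2 covariance matrix of those estimates, [[var_a, cov_ab],
             [cov_ab, var_b]].
    Returns (chi2, p_value).
    """
    da = params1[0] - params2[0]
    db = params1[1] - params2[1]
    # Covariance of the difference under independence of the two data sets.
    s_aa = cov1[0][0] + cov2[0][0]
    s_ab = cov1[0][1] + cov2[0][1]
    s_bb = cov1[1][1] + cov2[1][1]
    det = s_aa * s_bb - s_ab * s_ab
    if det <= 0.0:
        raise ValueError("combined covariance matrix is not positive definite")
    # Quadratic form d^T S^{-1} d, written out for the 2x2 case.
    chi2 = (s_bb * da * da - 2.0 * s_ab * da * db + s_aa * db * db) / det
    # For exactly 2 d.f. the chi-square upper tail is exp(-x / 2).
    return chi2, exp(-chi2 / 2.0)
```

As the abstract stresses, the chi-square reference distribution holds only in the limit of many trials, so p-values from a calculation like this are approximate for small experiments.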
Which statistics should tropical biologists learn?
Loaiza Velásquez, Natalia; González Lutz, María Isabel; Monge-Nájera, Julián
2011-09-01
Tropical biologists study the richest and most endangered biodiversity on the planet, and in these times of climate change and mega-extinctions, the need for efficient, good quality research is more pressing than in the past. However, the statistical component in research published by tropical authors sometimes suffers from poor quality in data collection, mediocre or bad experimental design, and a rigid and outdated view of data analysis. To suggest improvements in their statistical education, we listed all the statistical tests and other quantitative analyses used in two leading tropical journals, the Revista de Biología Tropical and Biotropica, during a year. The 12 most frequent tests in the articles were: Analysis of Variance (ANOVA), Chi-Square Test, Student's T Test, Linear Regression, Pearson's Correlation Coefficient, Mann-Whitney U Test, Kruskal-Wallis Test, Shannon's Diversity Index, Tukey's Test, Cluster Analysis, Spearman's Rank Correlation Test and Principal Component Analysis. We conclude that statistical education for tropical biologists must abandon the old syllabus based on the mathematical side of statistics and concentrate on the correct selection of these and other procedures and tests, on their biological interpretation and on the use of reliable and friendly freeware. We think that their time will be better spent understanding and protecting tropical ecosystems than trying to learn the mathematical foundations of statistics: in most cases, a well designed one-semester course should be enough for their basic requirements.
Statistical analysis of dynamic parameters of the core
International Nuclear Information System (INIS)
Ionov, V.S.
2007-01-01
Transients of various types were investigated for the cores of zero-power critical facilities at RRC KI and NPP. Dynamic parameters of the neutron transients were explored by means of statistical analysis. The recorded transients have sufficient duration, a few channels for chamber currents and reactivity, and some channels for technological parameters. From these values the inverse period, reactivity, neutron lifetime, reactivity coefficients and some reactivity effects are determined, and the values of the measured dynamic parameters were reconstructed from them as a result of the analysis. The following mathematical tools of statistical analysis were used: approximation (A), filtration (F), rejection (R), estimation of descriptive statistics parameters (DSP), correlation characteristics (kk), regression analysis (KP), prognosis (P), and statistical criteria (SC). The calculation procedures were implemented in MATLAB. The sources of methodical and statistical errors are presented: inadequacy of the model, precision of the neutron-physical parameters, features of the registered processes, the mathematical model used in reactivity meters, the processing technique for the registered data, etc. Examples of results of the statistical analysis are given. Problems of the validity of the methods used for defining and certifying the values of statistical parameters and dynamic characteristics are considered. (Authors)
Wu, Hao
2018-05-01
In structural equation modelling (SEM), a robust adjustment to the test statistic or to its reference distribution is needed when its null distribution deviates from a χ² distribution, which usually arises when data do not follow a multivariate normal distribution. Unfortunately, existing studies on this issue typically focus on only a few methods and neglect the majority of alternative methods in statistics. Existing simulation studies typically consider only non-normal distributions of data that either satisfy asymptotic robustness or lead to an asymptotic scaled χ² distribution. In this work we conduct a comprehensive study that involves both typical methods in SEM and less well-known methods from the statistics literature. We also propose the use of several novel non-normal data distributions that are qualitatively different from the non-normal distributions widely used in existing studies. We found that several under-studied methods give the best performance under specific conditions, but the Satorra-Bentler method remains the most viable method for most situations. © 2017 The British Psychological Society.
Collecting operational event data for statistical analysis
International Nuclear Information System (INIS)
Atwood, C.L.
1994-09-01
This report gives guidance for collecting operational data to be used for statistical analysis, especially analysis of event counts. It discusses how to define the purpose of the study, the unit (system, component, etc.) to be studied, events to be counted, and demand or exposure time. Examples are given of classification systems for events in the data sources. A checklist summarizes the essential steps in data collection for statistical analysis.
Per Object statistical analysis
DEFF Research Database (Denmark)
2008-01-01
of a specific class in turn, and uses a pair of PPO stages to derive the statistics and then assign them to the objects' Object Variables. It may be that this could all be done in some other, simpler way, but several other ways that were tried did not succeed. The procedure output has been tested against...
Directory of Open Access Journals (Sweden)
Claudette Maria Medeiros Vendramini
2004-12-01
This study aimed to analyze the 18 multiple-choice questions of a test on basic concepts of Statistics using classical and modern test theories. The test was taken by 325 undergraduate students, randomly selected from the areas of Human, Exact and Health Sciences. The analysis indicated that the test is predominantly unidimensional and that the items are better fitted by the three-parameter model. The indexes of difficulty, discrimination and biserial correlation present acceptable values. It is suggested that new items be added to the test in order to achieve reliability and validity for the educational context and to reveal the statistical reasoning of undergraduate students when reading representations of statistical data.
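For readers unfamiliar with the three-parameter model mentioned in the abstract, here is a minimal sketch of the three-parameter logistic (3PL) item response function in its usual textbook form. The parameter values are hypothetical, and the abstract does not say whether the authors included the optional scaling constant (often written D = 1.7), so it is omitted here.

```python
import math

def p_correct_3pl(theta, a, b, c):
    """Probability of a correct answer under the 3PL model.

    theta: examinee ability; a: discrimination; b: difficulty;
    c: pseudo-guessing parameter (lower asymptote).
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# Hypothetical item: moderately discriminating, average difficulty,
# with a guessing floor of 0.2 (e.g. five answer options).
for theta in (-2.0, 0.0, 2.0):
    print(theta, round(p_correct_3pl(theta, a=1.2, b=0.0, c=0.2), 3))
```

When ability equals difficulty (theta = b), the probability sits halfway between the guessing floor c and 1, which is about 0.6 for the item above.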
Statistical Analysis for High-Dimensional Data : The Abel Symposium 2014
Bühlmann, Peter; Glad, Ingrid; Langaas, Mette; Richardson, Sylvia; Vannucci, Marina
2016-01-01
This book features research contributions from The Abel Symposium on Statistical Analysis for High Dimensional Data, held in Nyvågar, Lofoten, Norway, in May 2014. The focus of the symposium was on statistical and machine learning methodologies specifically developed for inference in “big data” situations, with particular reference to genomic applications. The contributors, who are among the most prominent researchers on the theory of statistics for high dimensional inference, present new theories and methods, as well as challenging applications and computational solutions. Specific themes include, among others, variable selection and screening, penalised regression, sparsity, thresholding, low dimensional structures, computational challenges, non-convex situations, learning graphical models, sparse covariance and precision matrices, semi- and non-parametric formulations, multiple testing, classification, factor models, clustering, and preselection. Highlighting cutting-edge research and casting light on...
Wang, Hao; Wang, Qunwei; He, Ming
2018-05-01
To investigate and improve the level of detection technology for water content in liquid chemical reagents in domestic laboratories, proficiency testing provider PT0031 (CNAS) organized a proficiency testing program for water content in toluene; 48 laboratories from 18 provinces, cities and municipalities took part. This paper describes the implementation of the proficiency test for the determination of water content in toluene, including sample preparation, homogeneity and stability testing, and statistical treatment of the results by an iterative robust statistical technique. It also summarizes and analyzes the different test standards widely used in the laboratories and puts forward technical suggestions for improving the quality of water content testing. Satisfactory results were obtained by 43 laboratories, amounting to 89.6% of the participating laboratories.
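The abstract does not name the iterative robust technique. A common choice in proficiency testing is "Algorithm A" of ISO 13528, and the sketch below assumes that procedure; the laboratory results are invented for illustration. It estimates a robust mean and standard deviation by repeatedly pulling extreme values in to x* ± 1.5 s*, then scores each laboratory with z = (x - x*) / s*.

```python
import statistics

def algorithm_a(values, tol=1e-6, max_iter=100):
    """Iterative robust mean and standard deviation (assumed: ISO 13528 Algorithm A)."""
    x = statistics.median(values)
    s = 1.483 * statistics.median([abs(v - x) for v in values])
    for _ in range(max_iter):
        delta = 1.5 * s
        # Pull values outside x +/- delta back to the boundary (winsorize).
        clipped = [min(max(v, x - delta), x + delta) for v in values]
        new_x = statistics.fmean(clipped)
        new_s = 1.134 * statistics.stdev(clipped)
        converged = abs(new_x - x) < tol and abs(new_s - s) < tol
        x, s = new_x, new_s
        if converged:
            break
    return x, s

def z_scores(values):
    x, s = algorithm_a(values)
    if s == 0:
        raise ValueError("robust standard deviation is zero; z-scores undefined")
    return [(v - x) / s for v in values]

# Hypothetical water-content results from seven laboratories (one outlier).
results = [10.1, 9.9, 10.0, 10.2, 9.8, 10.0, 15.0]
print(algorithm_a(results))
print([round(z, 2) for z in z_scores(results)])
```

Under the usual convention, |z| ≤ 2 is rated satisfactory and |z| ≥ 3 unsatisfactory; whether the PT0031 program used exactly these limits is not stated in the abstract.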
Luo, Li; Zhu, Yun; Xiong, Momiao
2012-06-01
The genome-wide association studies (GWAS) designed for next-generation sequencing data involve testing association of genomic variants, including common, low frequency, and rare variants. The current strategies for association studies are well developed for identifying association of common variants with the common diseases, but may be ill-suited when large amounts of allelic heterogeneity are present in sequence data. Recently, group tests that analyze their collective frequency differences between cases and controls shift the current variant-by-variant analysis paradigm for GWAS of common variants to the collective test of multiple variants in the association analysis of rare variants. However, group tests ignore differences in genetic effects among SNPs at different genomic locations. As an alternative to group tests, we developed novel genome-information content-based statistics for testing association of the entire allele frequency spectrum of genomic variation with the diseases. To evaluate the performance of the proposed statistics, we use large-scale simulations based on whole genome low coverage pilot data in the 1000 Genomes Project to calculate the type 1 error rates and power of seven alternative statistics: a genome-information content-based statistic, the generalized T², collapsing method, multivariate and collapsing (CMC) method, individual χ² test, weighted-sum statistic, and variable threshold statistic. Finally, we apply the seven statistics to a published resequencing dataset from ANGPTL3, ANGPTL4, ANGPTL5, and ANGPTL6 genes in the Dallas Heart Study. We report that the genome-information content-based statistic has significantly improved type 1 error rates and higher power than the other six statistics in both simulated and empirical datasets.
Reliability Verification of DBE Environment Simulation Test Facility by using Statistics Method
International Nuclear Information System (INIS)
Jang, Kyung Nam; Kim, Jong Soeg; Jeong, Sun Chul; Kyung Heum
2011-01-01
In a nuclear power plant, all safety-related equipment, including cables, that must operate in a harsh environment should undergo equipment qualification (EQ) according to IEEE Std 323. There are three types of qualification methods: type testing, operating experience and analysis. To environmentally qualify safety-related equipment by type testing, rather than by analysis or operating experience, a representative sample of the equipment, including interfaces, should be subjected to a series of tests. Among these tests, the Design Basis Event (DBE) environment simulation test is the most important. The DBE simulation test is performed in a DBE simulation test chamber according to the postulated DBE conditions, including a specified high-energy line break (HELB), loss of coolant accident (LOCA), main steam line break (MSLB), etc., after thermal and radiation aging. Because most DBE conditions include 100% humidity, high-temperature steam must be used to trace the temperature and pressure of the DBE condition. During the DBE simulation test, if high-temperature steam under high pressure is injected into the DBE test chamber, the temperature and pressure in the chamber rapidly rise above the target temperature. Therefore, the temperature and pressure in the test chamber keep fluctuating during the DBE simulation test as they are brought to the target values. The fairness and accuracy of the test results should be ensured by confirming the performance of the DBE environment simulation test facility. In this paper, a statistical method is used to verify the reliability of the DBE environment simulation test facility.
Statistics and analysis of scientific data
Bonamente, Massimiliano
2013-01-01
Statistics and Analysis of Scientific Data covers the foundations of probability theory and statistics, and a number of numerical and analytical methods that are essential for the present-day analyst of scientific data. Topics covered include probability theory, distribution functions of statistics, fits to two-dimensional datasets and parameter estimation, Monte Carlo methods and Markov chains. Equal attention is paid to the theory and its practical application, and results from classic experiments in various fields are used to illustrate the importance of statistics in the analysis of scientific data. The main pedagogical method is a theory-then-application approach, where emphasis is placed first on a sound understanding of the underlying theory of a topic, which becomes the basis for an efficient and proactive use of the material for practical applications. The level is appropriate for undergraduates and beginning graduate students, and as a reference for the experienced researcher. Basic calculus is us...
Statistical analysis of global horizontal solar irradiation GHI in Fez city, Morocco
Bounoua, Z.; Mechaqrane, A.
2018-05-01
An accurate knowledge of the solar energy reaching the ground is necessary for sizing and optimizing the performance of solar installations. This paper describes a statistical analysis of the global horizontal solar irradiation (GHI) at Fez city, Morocco. For better reliability, we first applied a set of check procedures to test the quality of the hourly GHI measurements. We then eliminated the erroneous values, which are generally due to measurement errors or the cosine effect. The statistical analysis shows that the annual mean daily value of GHI is approximately 5 kWh/m²/day. Monthly mean daily values and other parameters are also calculated.
Comparison of small n statistical tests of differential expression applied to microarrays
Directory of Open Access Journals (Sweden)
Lee Anna Y
2009-02-01
Background: DNA microarrays provide data for genome wide patterns of expression between observation classes. Microarray studies often have small sample sizes, however, due to cost constraints or specimen availability. This can lead to poor random error estimates and inaccurate statistical tests of differential expression. We compare the performance of the standard t-test, fold change, and four small n statistical test methods designed to circumvent these problems. We report results of various normalization methods for empirical microarray data and of various random error models for simulated data. Results: Three Empirical Bayes methods (CyberT, BRB, and limma t-statistics) were the most effective statistical tests across simulated and both 2-colour cDNA and Affymetrix experimental data. The CyberT regularized t-statistic in particular was able to maintain expected false positive rates with simulated data showing high variances at low gene intensities, although at the cost of low true positive rates. The Local Pooled Error (LPE) test introduced a bias that lowered false positive rates below theoretically expected values and had lower power relative to the top performers. The standard two-sample t-test and fold change were also found to be sub-optimal for detecting differentially expressed genes. The generalized log transformation was shown to be beneficial in improving results with certain data sets, in particular high variance cDNA data. Conclusion: Pre-processing of data influences performance and the proper combination of pre-processing and statistical testing is necessary for obtaining the best results. All three Empirical Bayes methods assessed in our study are good choices for statistical tests for small n microarray studies for both Affymetrix and cDNA data. Choice of method for a particular study will depend on software and normalization preferences.
Method for statistical data analysis of multivariate observations
Gnanadesikan, R
1997-01-01
A practical guide for multivariate statistical techniques, now updated and revised. In recent years, innovations in computer technology and statistical methodologies have dramatically altered the landscape of multivariate data analysis. This new edition of Methods for Statistical Data Analysis of Multivariate Observations explores current multivariate concepts and techniques while retaining the same practical focus of its predecessor. It integrates methods and data-based interpretations relevant to multivariate analysis in a way that addresses real-world problems arising in many areas of inte
Advances in statistical models for data analysis
Minerva, Tommaso; Vichi, Maurizio
2015-01-01
This edited volume focuses on recent research results in classification, multivariate statistics and machine learning and highlights advances in statistical models for data analysis. The volume provides both methodological developments and contributions to a wide range of application areas such as economics, marketing, education, social sciences and environment. The papers in this volume were first presented at the 9th biannual meeting of the Classification and Data Analysis Group (CLADAG) of the Italian Statistical Society, held in September 2013 at the University of Modena and Reggio Emilia, Italy.
A critique of statistical hypothesis testing in clinical research
Directory of Open Access Journals (Sweden)
Somik Raha
2011-01-01
Many have documented the difficulty of using the current paradigm of Randomized Controlled Trials (RCTs) to test and validate the effectiveness of alternative medical systems such as Ayurveda. This paper critiques the applicability of RCTs for all clinical knowledge-seeking endeavors, of which Ayurveda research is a part. This is done by examining statistical hypothesis testing, the underlying foundation of RCTs, from a practical and philosophical perspective. In the philosophical critique, the two main worldviews of probability are that of the Bayesian and the frequentist. The frequentist worldview is a special case of the Bayesian worldview requiring the unrealistic assumptions of knowing nothing about the universe and believing that all observations are unrelated to each other. Many have claimed that the first belief is necessary for science, and this claim is debunked by comparing variations in learning with different prior beliefs. Moving beyond the Bayesian and frequentist worldviews, the notion of hypothesis testing itself is challenged on the grounds that a hypothesis is an unclear distinction, and assigning a probability on an unclear distinction is an exercise that does not lead to clarity of action. This critique is of the theory itself and not any particular application of statistical hypothesis testing. A decision-making frame is proposed as a way of both addressing this critique and transcending ideological debates on probability. An example of a Bayesian decision-making approach is shown as an alternative to statistical hypothesis testing, utilizing data from a past clinical trial that studied the effect of Aspirin on heart attacks in a sample population of doctors. As a big reason for the prevalence of RCTs in academia is legislation requiring it, the ethics of legislating the use of statistical methods for clinical research is also examined.
Statistical test theory for the behavioral sciences
de Gruijter, Dato N M
2007-01-01
Since the development of the first intelligence test in the early 20th century, educational and psychological tests have become important measurement techniques to quantify human behavior. Focusing on this ubiquitous yet fruitful area of research, Statistical Test Theory for the Behavioral Sciences provides both a broad overview and a critical survey of assorted testing theories and models used in psychology, education, and other behavioral science fields. Following a logical progression from basic concepts to more advanced topics, the book first explains classical test theory, covering true score, measurement error, and reliability. It then presents generalizability theory, which provides a framework to deal with various aspects of test scores. In addition, the authors discuss the concept of validity in testing, offering a strategy for evidence-based validity. In the two chapters devoted to item response theory (IRT), the book explores item response models, such as the Rasch model, and applications, incl...
SWToolbox: A surface-water tool-box for statistical analysis of streamflow time series
Kiang, Julie E.; Flynn, Kate; Zhai, Tong; Hummel, Paul; Granato, Gregory
2018-03-07
This report is a user guide for the low-flow analysis methods provided with version 1.0 of the Surface Water Toolbox (SWToolbox) computer program. The software combines functionality from two software programs—U.S. Geological Survey (USGS) SWSTAT and U.S. Environmental Protection Agency (EPA) DFLOW. Both of these programs have been used primarily for computation of critical low-flow statistics. The main analysis methods are the computation of hydrologic frequency statistics such as the 7-day minimum flow that occurs on average only once every 10 years (7Q10), computation of design flows including biologically based flows, and computation of flow-duration curves and duration hydrographs. Other annual, monthly, and seasonal statistics can also be computed. The interface facilitates retrieval of streamflow discharge data from the USGS National Water Information System and outputs text reports for a record of the analysis. Tools for graphing data and screening tests are available to assist the analyst in conducting the analysis.
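The first step behind a statistic such as 7Q10 can be sketched as follows: for each year, take the lowest 7-day moving-average flow. This is only an illustration under stated assumptions, not SWToolbox code. The function name and data are hypothetical, years are treated as simple groups of daily values (the report may use climatic years), and the frequency fit that turns the annual minima into a 10-year recurrence value is not shown.

```python
def annual_min_7day_mean(daily_flows_by_year):
    """Return {year: lowest 7-day moving-average flow} from daily flow records."""
    result = {}
    for year, flows in daily_flows_by_year.items():
        if len(flows) < 7:
            raise ValueError(f"year {year} has fewer than 7 daily values")
        window_sum = sum(flows[:7])
        lowest = window_sum
        # Slide the 7-day window one day at a time.
        for i in range(7, len(flows)):
            window_sum += flows[i] - flows[i - 7]
            lowest = min(lowest, window_sum)
        result[year] = lowest / 7
    return result

# Hypothetical daily flows: a one-week dry spell inside a wetter period.
flows = {2001: [10] * 10 + [3] * 7 + [10] * 10}
print(annual_min_7day_mean(flows))
```

The resulting series of annual minima is what a low-flow frequency analysis would then work on.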
Directory of Open Access Journals (Sweden)
Jing Zhang
2016-09-01
Background: Statistical analysis and data visualization are two crucial aspects in molecular biology and biology. For analyses that compare one dependent variable between standard (e.g., control) and one or multiple independent variables, a comprehensive yet highly streamlined solution is valuable. The computer programming language R is a popular platform for researchers to develop tools that are tailored specifically for their research needs. Here we present an R package RBioplot that takes raw input data for automated statistical analysis and plotting, highly compatible with various molecular biology and biochemistry lab techniques, such as, but not limited to, western blotting, PCR, and enzyme activity assays. Method: The package is built based on workflows operating on a simple raw data layout, with minimum user input or data manipulation required. The package is distributed through GitHub, which can be easily installed through one single-line R command. A detailed installation guide is available at http://kenstoreylab.com/?page_id=2448. Users can also download demo datasets from the same website. Results and Discussion: By integrating selected functions from existing statistical and data visualization packages with extensive customization, RBioplot features both statistical analysis and data visualization functionalities. Key properties of RBioplot include: fully automated and comprehensive statistical analysis, including normality test, equal variance test, Student’s t-test and ANOVA (with post-hoc tests); fully automated histogram, heatmap and joint-point curve plotting modules; detailed output files for statistical analysis, data manipulation and high quality graphs; axis range finding and user customizable tick settings; high user-customizability.
Statistical Tutorial | Center for Cancer Research
Recent advances in cancer biology have resulted in the need for increased statistical analysis of research data. ST is designed as a follow up to Statistical Analysis of Research Data (SARD) held in April 2018. The tutorial will apply the general principles of statistical analysis of research data including descriptive statistics, z- and t-tests of means and mean
Ohyanagi, S.; Dileonardo, C.
2013-12-01
As a natural phenomenon earthquake occurrence is difficult to predict. Statistical analysis of earthquake data was performed using candlestick chart and Bollinger Band methods. These statistical methods, commonly used in the financial world to analyze market trends were tested against earthquake data. Earthquakes above Mw 4.0 located on shore of Sanriku (37.75°N ~ 41.00°N, 143.00°E ~ 144.50°E) from February 1973 to May 2013 were selected for analysis. Two specific patterns in earthquake occurrence were recognized through the analysis. One is a spread of candlestick prior to the occurrence of events greater than Mw 6.0. A second pattern shows convergence in the Bollinger Band, which implies a positive or negative change in the trend of earthquakes. Both patterns match general models for the buildup and release of strain through the earthquake cycle, and agree with both the characteristics of the candlestick chart and Bollinger Band analysis. These results show there is a high correlation between patterns in earthquake occurrence and trend analysis by these two statistical methods. The results of this study agree with the appropriateness of the application of these financial analysis methods to the analysis of earthquake occurrence.
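For reference, Bollinger Bands are conventionally a moving average with bands placed k standard deviations above and below it. The sketch below assumes the usual financial defaults (a 20-point window and k = 2). The abstract does not say which earthquake quantity was charted, so the input series here is a hypothetical list of monthly event counts.

```python
import statistics

def bollinger_bands(series, window=20, k=2.0):
    """Return a list of (lower, middle, upper) tuples, one per full window."""
    bands = []
    for i in range(window - 1, len(series)):
        chunk = series[i - window + 1 : i + 1]
        middle = statistics.fmean(chunk)
        spread = k * statistics.pstdev(chunk)  # population SD over the window
        bands.append((middle - spread, middle, middle + spread))
    return bands

# Hypothetical monthly counts of events above the magnitude threshold.
counts = [3, 4, 2, 5, 3, 4, 4, 3, 9, 12, 4, 3]
for lower, middle, upper in bollinger_bands(counts, window=6):
    print(round(lower, 2), round(middle, 2), round(upper, 2))
```

A narrowing gap between the upper and lower band is the "convergence" the abstract refers to; a widening gap indicates increased variability in the series.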
DEFF Research Database (Denmark)
Holbech, Henrik
-contribution of each individual to the measured response. Furthermore, the combination of a Gamma-Poisson stochastic part with a Weibull concentration-response model allowed accounting for the inter-replicate variability. Second, we checked for the possibility of optimizing the initial experimental design through...... was twofold. First, we refined the statistical analyses of reproduction data accounting for mortality all along the test period. The variable “number of clutches/eggs produced per individual-day” was used for EC x modelling, as classically done in epidemiology in order to account for the time...
Statistical methods for data analysis in particle physics
AUTHOR|(CDS)2070643
2015-01-01
This concise set of course-based notes provides the reader with the main concepts and tools to perform statistical analysis of experimental data, in particular in the field of high-energy physics (HEP). First, an introduction to probability theory and basic statistics is given, mainly as a reminder from advanced undergraduate studies, yet also with a view to clearly distinguishing the Frequentist and Bayesian approaches and interpretations in subsequent applications. More advanced concepts and applications are gradually introduced, culminating in the chapter on upper limits, as many applications in HEP concern hypothesis testing, where often the main goal is to provide better and better limits so as to be able to distinguish eventually between competing hypotheses or to rule out some of them altogether. Many worked examples will help newcomers to the field and graduate students to understand the pitfalls in applying theoretical concepts to actual data.
Efficient statistical tests to compare Youden index: accounting for contingency correlation.
Chen, Fangyao; Xue, Yuqiang; Tan, Ming T; Chen, Pingyan
2015-04-30
Youden index is widely utilized in studies evaluating accuracy of diagnostic tests and performance of predictive, prognostic, or risk models. However, both one and two independent sample tests on Youden index have been derived ignoring the dependence (association) between sensitivity and specificity, resulting in potentially misleading findings. Besides, paired sample test on Youden index is currently unavailable. This article develops efficient statistical inference procedures for one sample, independent, and paired sample tests on Youden index by accounting for contingency correlation, namely associations between sensitivity and specificity and paired samples typically represented in contingency tables. For one and two independent sample tests, the variances are estimated by Delta method, and the statistical inference is based on the central limit theory, which are then verified by bootstrap estimates. For paired samples test, we show that the estimated covariance of the two sensitivities and specificities can be represented as a function of kappa statistic so the test can be readily carried out. We then show the remarkable accuracy of the estimated variance using a constrained optimization approach. Simulation is performed to evaluate the statistical properties of the derived tests. The proposed approaches yield more stable type I errors at the nominal level and substantially higher power (efficiency) than does the original Youden's approach. Therefore, the simple explicit large sample solution performs very well. Because we can readily implement the asymptotic and exact bootstrap computation with common software like R, the method is broadly applicable to the evaluation of diagnostic tests and model performance. Copyright © 2015 John Wiley & Sons, Ltd.
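As background, the Youden index is J = sensitivity + specificity - 1. The sketch below computes J with the simple standard error that treats sensitivity and specificity as independent binomial proportions. This is the baseline the abstract describes as potentially misleading, not the authors' correlation-adjusted procedure; the counts are hypothetical.

```python
import math

def youden_index(tp, fn, tn, fp):
    """Return (J, naive standard error) from a 2x2 diagnostic table.

    The standard error assumes sensitivity and specificity are independent
    binomial proportions, i.e. it ignores any contingency correlation.
    """
    n_diseased = tp + fn
    n_healthy = tn + fp
    sens = tp / n_diseased
    spec = tn / n_healthy
    j = sens + spec - 1.0
    variance = sens * (1 - sens) / n_diseased + spec * (1 - spec) / n_healthy
    return j, math.sqrt(variance)

# Hypothetical test results: 100 diseased and 100 healthy subjects.
j, se = youden_index(tp=80, fn=20, tn=90, fp=10)
print(round(j, 3), round(se, 3))
print("approximate 95% interval:", round(j - 1.96 * se, 3), round(j + 1.96 * se, 3))
```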
A statistical analysis of electrical cerebral activity
International Nuclear Information System (INIS)
Bassant, Marie-Helene
1971-01-01
The aim of this work was to study the statistical properties of the amplitude of the electroencephalographic signal. The experimental method is described (implantation of electrodes, acquisition and treatment of data). The program of the mathematical analysis is given (calculation of probability density functions, study of stationarity) and the validity of the tests is discussed. The results concern ten rabbits. Strips of EEG were sampled for 40 s at very short intervals (500 μs). The probability density functions established for different brain structures (especially the dorsal hippocampus) and areas were compared during sleep, arousal and visual stimulus. Using a χ² test, it was found that the Gaussian distribution assumption was rejected in 96.7 per cent of the cases. For a given physiological state, there was no mathematical reason to reject the assumption of stationarity (in 96 per cent of the cases). (author)
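A χ² goodness-of-fit check of the Gaussian assumption can be sketched as below. This is a generic textbook version, not the procedure of the original study: the amplitudes are binned into equal-probability classes under a normal distribution fitted to the sample, and observed and expected counts are compared. The number of bins is an arbitrary choice, and the test assumes independent samples, which closely spaced EEG samples may not satisfy.

```python
import bisect
import statistics
from statistics import NormalDist

def chi2_normality_statistic(samples, n_bins=10):
    """Return (chi-square statistic, degrees of freedom) for a normal fit."""
    fitted = NormalDist(statistics.fmean(samples), statistics.stdev(samples))
    # Interior bin edges chosen so each bin has probability 1/n_bins under the fit.
    edges = [fitted.inv_cdf(i / n_bins) for i in range(1, n_bins)]
    observed = [0] * n_bins
    for x in samples:
        observed[bisect.bisect_right(edges, x)] += 1
    expected = len(samples) / n_bins
    stat = sum((o - expected) ** 2 / expected for o in observed)
    dof = n_bins - 1 - 2  # two parameters (mean, SD) were estimated from the data
    return stat, dof

# Hypothetical amplitudes: evenly spaced normal quantiles, so the fit should be good.
amplitudes = [NormalDist().inv_cdf((i + 0.5) / 200) for i in range(200)]
stat, dof = chi2_normality_statistic(amplitudes)
print(round(stat, 3), dof)
```

The statistic is then compared with a χ² critical value for the given degrees of freedom (14.07 at the 5% level for 7 degrees of freedom); larger values lead to rejecting the Gaussian assumption.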
Statistical models and methods for reliability and survival analysis
Couallier, Vincent; Huber-Carol, Catherine; Mesbah, Mounir; Limnios, Nikolaos; Gerville-Reache, Leo
2013-01-01
Statistical Models and Methods for Reliability and Survival Analysis brings together contributions by specialists in statistical theory as they discuss their applications providing up-to-date developments in methods used in survival analysis, statistical goodness of fit, stochastic processes for system reliability, amongst others. Many of these are related to the work of Professor M. Nikulin in statistics over the past 30 years. The authors gather together various contributions with a broad array of techniques and results, divided into three parts - Statistical Models and Methods, Statistical
Classification, (big) data analysis and statistical learning
Conversano, Claudio; Vichi, Maurizio
2018-01-01
This edited book focuses on the latest developments in classification, statistical learning, data analysis and related areas of data science, including statistical analysis of large datasets, big data analytics, time series clustering, integration of data from different sources, as well as social networks. It covers both methodological aspects as well as applications to a wide range of areas such as economics, marketing, education, social sciences, medicine, environmental sciences and the pharmaceutical industry. In addition, it describes the basic features of the software behind the data analysis results, and provides links to the corresponding codes and data sets where necessary. This book is intended for researchers and practitioners who are interested in the latest developments and applications in the field. The peer-reviewed contributions were presented at the 10th Scientific Meeting of the Classification and Data Analysis Group (CLADAG) of the Italian Statistical Society, held in Santa Margherita di Pul...
A functional U-statistic method for association analysis of sequencing data.
Jadhav, Sneha; Tong, Xiaoran; Lu, Qing
2017-11-01
Although sequencing studies hold great promise for uncovering novel variants predisposing to human diseases, the high dimensionality of the sequencing data brings tremendous challenges to data analysis. Moreover, for many complex diseases (e.g., psychiatric disorders) multiple related phenotypes are collected. These phenotypes can be different measurements of an underlying disease, or measurements characterizing multiple related diseases for studying common genetic mechanism. Although jointly analyzing these phenotypes could potentially increase the power of identifying disease-associated genes, the different types of phenotypes pose challenges for association analysis. To address these challenges, we propose a nonparametric method, functional U-statistic method (FU), for multivariate analysis of sequencing data. It first constructs smooth functions from individuals' sequencing data, and then tests the association of these functions with multiple phenotypes by using a U-statistic. The method provides a general framework for analyzing various types of phenotypes (e.g., binary and continuous phenotypes) with unknown distributions. Fitting the genetic variants within a gene using a smoothing function also allows us to capture complexities of gene structure (e.g., linkage disequilibrium, LD), which could potentially increase the power of association analysis. Through simulations, we compared our method to the multivariate outcome score test (MOST), and found that our test attained better performance than MOST. In a real data application, we applied our method to the sequencing data from the Minnesota Twin Study (MTS) and found potential associations of several nicotine receptor subunit (CHRN) genes, including CHRNB3, with nicotine dependence and/or alcohol dependence. © 2017 WILEY PERIODICALS, INC.
Lee, L.; Helsel, D.
2007-01-01
Analysis of low concentrations of trace contaminants in environmental media often results in left-censored data that are below some limit of analytical precision. Interpretation of values becomes complicated when there are multiple detection limits in the data, perhaps as a result of changing analytical precision over time. Parametric and semi-parametric methods, such as maximum likelihood estimation and robust regression on order statistics, can be employed to model distributions of multiply censored data and provide estimates of summary statistics. However, these methods are based on assumptions about the underlying distribution of data. Nonparametric methods provide an alternative that does not require such assumptions. A standard nonparametric method for estimating summary statistics of multiply-censored data is the Kaplan-Meier (K-M) method. This method has seen widespread usage in the medical sciences within a general framework termed "survival analysis" where it is employed with right-censored time-to-failure data. However, K-M methods are equally valid for the left-censored data common in the geosciences. Our S-language software provides an analytical framework based on K-M methods that is tailored to the needs of the earth and environmental sciences community. This includes routines for the generation of empirical cumulative distribution functions, prediction or exceedance probabilities, and related confidence limits computation. Additionally, our software contains K-M-based routines for nonparametric hypothesis testing among an unlimited number of grouping variables. A primary characteristic of K-M methods is that they do not perform extrapolation and interpolation. Thus, these routines cannot be used to model statistics beyond the observed data range or when linear interpolation is desired. For such applications, the aforementioned parametric and semi-parametric methods must be used.
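One common way to apply K-M to left-censored data is to "flip" the values about a constant larger than the maximum, which turns non-detects into right-censored observations. The sketch below assumes that approach; it is written in Python for illustration and is not the authors' S-language software. All names and data are hypothetical.

```python
def kaplan_meier(times, observed):
    """Product-limit survival estimate: list of (t, S(t)) at each event time."""
    data = sorted(zip(times, observed))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        ties = [ev for tt, ev in data if tt == t]
        events = sum(1 for ev in ties if ev)
        if events:
            survival *= 1.0 - events / n_at_risk
            curve.append((t, survival))
        n_at_risk -= len(ties)
        i += len(ties)
    return curve

def km_left_censored_cdf(values, detected):
    """Return [(x, estimated P(X < x))] at each detected value.

    values: measured concentration, or the detection limit for a non-detect.
    detected: True for a measured value, False for a "<limit" non-detect.
    """
    flip = max(values) + 1.0  # any constant above the maximum works
    flipped = [flip - v for v in values]
    # After flipping, S(y) = P(Y > y) = P(X < flip - y).
    return sorted((flip - y, s) for y, s in kaplan_meier(flipped, detected))

# Hypothetical concentrations: the first result is a non-detect reported as "<1".
print(km_left_censored_cdf([1.0, 2.0, 3.0], [False, True, True]))
```

As the abstract notes, the estimate is defined only at observed detected values; nothing is extrapolated below the lowest detection limit.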
Applied statistical designs for the researcher
Paulson, Daryl S
2003-01-01
Research and Statistics; Basic Review of Parametric Statistics; Exploratory Data Analysis; Two Sample Tests; Completely Randomized One-Factor Analysis of Variance; One and Two Restrictions on Randomization; Completely Randomized Two-Factor Factorial Designs; Two-Factor Factorial Completely Randomized Blocked Designs; Useful Small Scale Pilot Designs; Nested Statistical Designs; Linear Regression; Nonparametric Statistics; Introduction to Research Synthesis and "Meta-Analysis" and Conclusory Remarks; References; Index.
IEEE Std 101-1972: IEEE guide for the statistical analysis of thermal life test data
International Nuclear Information System (INIS)
Anon.
1992-01-01
Procedures for estimating the thermal life of electrical insulation systems and materials call for life tests at several temperatures, usually well above the expected normal operating temperature. By the selection of high temperatures for the tests, life of the insulation samples will be terminated, according to some selected failure criterion or criteria, within relatively short times -- typically one week to one year. The result of these thermally accelerated life tests will be a set of data of life values for a corresponding set of temperatures. Usually the data consist of a set of life values for each of two to four (occasionally more) test temperatures, 10 °C to 25 °C apart. The objective then is to establish from these data the mean life values at each temperature and the functional dependence of life on temperature, as well as the statistical consistency and the confidence to be attributed to the mean life values and the functional life-temperature dependence. The purpose of this guide is to assist in this objective and to give guidance for comparing the results of tests on different materials and of different tests on the same materials.
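As an illustration of establishing the "functional dependence of life on temperature", the sketch below fits an Arrhenius-type line, log10(life) = a + b / T with T in kelvin, by ordinary least squares. The Arrhenius form is an assumption made here for illustration; the abstract does not state which model the guide prescribes. The function names and the temperatures in the usage example are hypothetical.

```python
import math


def fit_arrhenius(temps_c, lives_h):
    """Least-squares fit of log10(life) = a + b / T, with T in kelvin.

    temps_c -- test temperatures in degrees Celsius
    lives_h -- mean life at each temperature, in hours
    Returns the intercept a and slope b.
    """
    x = [1.0 / (t + 273.15) for t in temps_c]
    y = [math.log10(life) for life in lives_h]
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


def predict_life(a, b, temp_c):
    """Predicted life in hours at temp_c, extrapolating the fitted line."""
    return 10.0 ** (a + b / (temp_c + 273.15))


if __name__ == "__main__":
    # Synthetic lives generated from an assumed line, purely for demonstration.
    temps = [180.0, 200.0, 220.0, 240.0]
    lives = [10.0 ** (-5.0 + 4000.0 / (t + 273.15)) for t in temps]
    a, b = fit_arrhenius(temps, lives)
    print(f"a = {a:.4f}, b = {b:.1f}")
    print(f"extrapolated life at 150 C: {predict_life(a, b, 150.0):.0f} h")
```

A fuller treatment would add confidence limits on the fitted line, which is the part of the objective the abstract calls "the confidence to be attributed to the mean life values".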
Reproducible statistical analysis with multiple languages
DEFF Research Database (Denmark)
Lenth, Russell; Højsgaard, Søren
2011-01-01
This paper describes the system for making reproducible statistical analyses. It differs from other systems for reproducible analysis in several ways. The two main differences are: (1) Several statistics programs can be used in the same document. (2) Documents can be prepared using OpenOffice or \LaTeX. The main part of this paper is an example showing how to use and together in an OpenOffice text document. The paper also contains some practical considerations on the use of literate programming in statistics.
A Modified Jonckheere Test Statistic for Ordered Alternatives in Repeated Measures Design
Directory of Open Access Journals (Sweden)
Hatice Tül Kübra AKDUR
2016-09-01
In this article, a new test based on the Jonckheere test [1] is presented for randomized blocks that have dependent observations within blocks. A weighted sum of the block statistics, rather than the unweighted sum proposed by Jonckheere, is used. For Jonckheere-type statistics, the main assumption is independence of observations within blocks. In a repeated measures design, this assumption is violated. The weighted Jonckheere-type statistic is examined under dependence for different variance-covariance structures and for the ordered alternative hypothesis structure of each block in the design. Also, the proposed statistic is compared to the existing Jonckheere-based test in terms of type I error rates by Monte Carlo simulation. For strong correlations, the circular bootstrap version of the proposed Jonckheere test provides lower type I error rates.
Use of run statistics to validate tensile tests
International Nuclear Information System (INIS)
Eatherly, W.P.
1981-01-01
In tensile testing of irradiated graphites, it is difficult to assure alignment of sample and train for tensile measurements. By recording the location of fractures, run (sequential) statistics can readily detect lack of randomness. The technique is based on partitioning binomial distributions.
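To make the idea of run statistics concrete, here is a sketch of the standard Wald-Wolfowitz runs test with a normal approximation. It is swapped in as a familiar stand-in and is not claimed to be the binomial-partitioning technique of the report. The coding of fracture locations as "U" (upper half of the specimen) and "L" (lower half) is a hypothetical example.

```python
import math


def runs_test(sequence):
    """Wald-Wolfowitz runs test for a sequence with exactly two labels.

    Returns (runs, expected_runs, z, two_sided_p), where z and p use the
    large-sample normal approximation.
    """
    labels = sorted(set(sequence))
    if len(labels) != 2:
        raise ValueError("sequence must contain exactly two distinct labels")

    n1 = sum(1 for s in sequence if s == labels[0])
    n2 = len(sequence) - n1
    n = n1 + n2

    # A new run starts at every change of label.
    runs = 1 + sum(1 for a, b in zip(sequence, sequence[1:]) if a != b)

    mean = 2.0 * n1 * n2 / n + 1.0
    var = 2.0 * n1 * n2 * (2.0 * n1 * n2 - n) / (n * n * (n - 1.0))
    if var <= 0.0:
        raise ValueError("sequence too short for the normal approximation")

    z = (runs - mean) / math.sqrt(var)
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided tail probability
    return runs, mean, z, p


if __name__ == "__main__":
    # Hypothetical fracture locations in test order: strongly clustered.
    runs, mean, z, p = runs_test("UUUULLLL")
    print(f"runs={runs}, expected={mean:.1f}, z={z:.3f}, p={p:.4f}")
```

Too few runs, as in the clustered example, would suggest a drift in alignment over the test series; too many would suggest systematic alternation.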
["R"--project for statistical computing]
DEFF Research Database (Denmark)
Dessau, R.B.; Pipper, Christian Bressen
2008-01-01
An introduction to the R project for statistical computing (www.R-project.org) is presented. The main topics are: 1. To make the professional community aware of "R" as a potent and free software for graphical and statistical analysis of medical data; 2. Simple well-known statistical tests are fairly easy to perform in R, but more complex modelling requires programming skills; 3. R is seen as a tool for teaching statistics and implementing complex modelling of medical data among medical professionals. Publication date: 2008/1/28
Directory of Open Access Journals (Sweden)
Priya Ranganathan
2015-01-01
In the second part of a series on pitfalls in statistical analysis, we look at various ways in which a statistically significant study result can be expressed. We debunk some of the myths regarding the 'P' value, explain the importance of 'confidence intervals' and clarify the importance of including both values in a paper.
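A small sketch of why reporting both values is informative: from the same estimate and standard error, a normal-approximation P value and a confidence interval can be computed side by side. The function name p_and_ci and the numbers in the usage example are hypothetical, and the normal approximation is an assumption that suits large samples only.

```python
from statistics import NormalDist


def p_and_ci(estimate, std_error, level=0.95):
    """Two-sided P value and confidence interval from estimate and SE.

    Uses the standard normal distribution for both quantities.
    """
    std_normal = NormalDist()
    z = estimate / std_error
    p_value = 2.0 * (1.0 - std_normal.cdf(abs(z)))
    crit = std_normal.inv_cdf(0.5 + level / 2.0)
    interval = (estimate - crit * std_error, estimate + crit * std_error)
    return p_value, interval


if __name__ == "__main__":
    # Hypothetical mean difference of 2.0 units with standard error 1.0.
    p_value, (low, high) = p_and_ci(2.0, 1.0)
    print(f"P = {p_value:.4f}, 95% CI = ({low:.2f}, {high:.2f})")
```

The P value alone says only that the difference is unlikely under the null hypothesis, while the interval also shows how small or large the effect could plausibly be.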
Your Chi-Square Test Is Statistically Significant: Now What?
Sharpe, Donald
2015-01-01
Applied researchers have employed chi-square tests for more than one hundred years. This paper addresses the question of how one should follow a statistically significant chi-square test result in order to determine the source of that result. Four approaches were evaluated: calculating residuals, comparing cells, ransacking, and partitioning. Data…
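Of the four approaches named above, calculating residuals is the simplest to sketch. The code below computes the Pearson chi-square statistic and the adjusted standardized residual of each cell, (O - E) / sqrt(E (1 - row/N)(1 - col/N)); cells with an absolute residual well above about 2 are the usual candidates for the source of a significant result. The function name and the example table are hypothetical, and this is not presented as the paper's own procedure.

```python
import math


def adjusted_residuals(table):
    """Pearson chi-square and adjusted standardized residuals.

    table -- list of rows of observed counts (all margins must be non-zero)
    Returns (chi_square, residuals), where residuals has the table's shape.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    chi_square = 0.0
    residuals = []
    for i, row in enumerate(table):
        out_row = []
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi_square += (observed - expected) ** 2 / expected
            # Adjustment accounts for the row and column margins.
            scale = math.sqrt(
                expected * (1.0 - row_totals[i] / n) * (1.0 - col_totals[j] / n)
            )
            out_row.append((observed - expected) / scale)
        residuals.append(out_row)
    return chi_square, residuals


if __name__ == "__main__":
    chi_square, residuals = adjusted_residuals([[10, 20], [20, 10]])
    print(f"chi-square = {chi_square:.3f}")
    for row in residuals:
        print(["{:+.2f}".format(r) for r in row])
```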
Swanson, David M; Blacker, Deborah; Alchawa, Taofik; Ludwig, Kerstin U; Mangold, Elisabeth; Lange, Christoph
2013-11-07
The advent of genome-wide association studies has led to many novel disease-SNP associations, opening the door to focused study on their biological underpinnings. Because of the importance of analyzing these associations, numerous statistical methods have been devoted to them. However, fewer methods have attempted to associate entire genes or genomic regions with outcomes, which is potentially more useful knowledge from a biological perspective and those methods currently implemented are often permutation-based. One property of some permutation-based tests is that their power varies as a function of whether significant markers are in regions of linkage disequilibrium (LD) or not, which we show from a theoretical perspective. We therefore develop two methods for quantifying the degree of association between a genomic region and outcome, both of whose power does not vary as a function of LD structure. One method uses dimension reduction to "filter" redundant information when significant LD exists in the region, while the other, called the summary-statistic test, controls for LD by scaling marker Z-statistics using knowledge of the correlation matrix of markers. An advantage of this latter test is that it does not require the original data, but only their Z-statistics from univariate regressions and an estimate of the correlation structure of markers, and we show how to modify the test to protect the type 1 error rate when the correlation structure of markers is misspecified. We apply these methods to sequence data of oral cleft and compare our results to previously proposed gene tests, in particular permutation-based ones. We evaluate the versatility of the modification of the summary-statistic test since the specification of correlation structure between markers can be inaccurate. We find a significant association in the sequence data between the 8q24 region and oral cleft using our dimension reduction approach and a borderline significant association using the
Mayo, Charles; Conners, Steve; Warren, Christopher; Miller, Robert; Court, Laurence; Popple, Richard
2013-11-01
With emergence of clinical outcomes databases as tools utilized routinely within institutions, comes need for software tools to support automated statistical analysis of these large data sets and intrainstitutional exchange from independent federated databases to support data pooling. In this paper, the authors present a design approach and analysis methodology that addresses both issues. A software application was constructed to automate analysis of patient outcomes data using a wide range of statistical metrics, by combining use of C#.Net and R code. The accuracy and speed of the code was evaluated using benchmark data sets. The approach provides data needed to evaluate combinations of statistical measurements for ability to identify patterns of interest in the data. Through application of the tools to a benchmark data set for dose-response threshold and to SBRT lung data sets, an algorithm was developed that uses receiver operator characteristic curves to identify a threshold value and combines use of contingency tables, Fisher exact tests, Welch t-tests, and Kolmogorov-Smirnov tests to filter the large data set to identify values demonstrating dose-response. Kullback-Leibler divergences were used to provide additional confirmation. The work demonstrates the viability of the design approach and the software tool for analysis of large data sets.
Oliveira Mendes, Thiago de; Pinto, Liliane Pereira; Santos, Laurita dos; Tippavajhala, Vamshi Krishna; Téllez Soto, Claudio Alberto; Martin, Airton Abrahão
2016-07-01
The analysis of biological systems by spectroscopic techniques involves the evaluation of hundreds to thousands of variables. Hence, different statistical approaches are used to elucidate regions that discriminate classes of samples and to propose new vibrational markers for explaining various phenomena like disease monitoring, mechanisms of action of drugs, food, and so on. However, the technical statistics are not always widely discussed in applied sciences. In this context, this work presents a detailed discussion including the various steps necessary for proper statistical analysis. It includes univariate parametric and nonparametric tests, as well as multivariate unsupervised and supervised approaches. The main objective of this study is to promote proper understanding of the application of various statistical tools in these spectroscopic methods used for the analysis of biological samples. The discussion of these methods is performed on a set of in vivo confocal Raman spectra of human skin analysis that aims to identify skin aging markers. In the Appendix, a complete routine of data analysis is executed in a free software that can be used by the scientific community involved in these studies.
Statistics and analysis of scientific data
Bonamente, Massimiliano
2017-01-01
The revised second edition of this textbook provides the reader with a solid foundation in probability theory and statistics as applied to the physical sciences, engineering and related fields. It covers a broad range of numerical and analytical methods that are essential for the correct analysis of scientific data, including probability theory, distribution functions of statistics, fits to two-dimensional data and parameter estimation, Monte Carlo methods and Markov chains. Features new to this edition include: • a discussion of statistical techniques employed in business science, such as multiple regression analysis of multivariate datasets. • a new chapter on the various measures of the mean including logarithmic averages. • new chapters on systematic errors and intrinsic scatter, and on the fitting of data with bivariate errors. • a new case study and additional worked examples. • mathematical derivations and theoretical background material have been appropriately marked, to improve the readabili...
Evaluating statistical tests on OLAP cubes to compare degree of disease.
Ordonez, Carlos; Chen, Zhibo
2009-09-01
Statistical tests represent an important technique used to formulate and validate hypotheses on a dataset. They are particularly useful in the medical domain, where hypotheses link disease with medical measurements, risk factors, and treatment. In this paper, we propose to compute parametric statistical tests treating patient records as elements in a multidimensional cube. We introduce a technique that combines dimension lattice traversal and statistical tests to discover significant differences in the degree of disease within pairs of patient groups. In order to understand a cause-effect relationship, we focus on patient group pairs differing in one dimension. We introduce several optimizations to prune the search space, to discover significant group pairs, and to summarize results. We present experiments showing important medical findings and evaluating scalability with medical datasets.
Statistical test for the distribution of galaxies on plates
International Nuclear Information System (INIS)
Garcia Lambas, D.
1985-01-01
A statistical test for the distribution of galaxies on plates is presented. We apply the test to synthetic astronomical plates obtained by means of numerical simulation (Garcia Lambas and Sersic 1983) with three different models for the 3-dimensional distribution; comparison with an observational plate suggests the presence of filamentary structure. (author)
Bayesian Inference in Statistical Analysis
Box, George E P
2011-01-01
The Wiley Classics Library consists of selected books that have become recognized classics in their respective fields. With these new unabridged and inexpensive editions, Wiley hopes to extend the life of these important works by making them available to future generations of mathematicians and scientists. Currently available in the Series: T. W. Anderson The Statistical Analysis of Time Series T. S. Arthanari & Yadolah Dodge Mathematical Programming in Statistics Emil Artin Geometric Algebra Norman T. J. Bailey The Elements of Stochastic Processes with Applications to the Natural Sciences Rob
Shadish, William R; Hedges, Larry V; Pustejovsky, James E
2014-04-01
This article presents a d-statistic for single-case designs that is in the same metric as the d-statistic used in between-subjects designs such as randomized experiments and offers some reasons why such a statistic would be useful in SCD research. The d has a formal statistical development, is accompanied by appropriate power analyses, and can be estimated using user-friendly SPSS macros. We discuss both advantages and disadvantages of d compared to other approaches such as previous d-statistics, overlap statistics, and multilevel modeling. It requires at least three cases for computation and assumes normally distributed outcomes and stationarity, assumptions that are discussed in some detail. We also show how to test these assumptions. The core of the article then demonstrates in depth how to compute d for one study, including estimation of the autocorrelation and the ratio of between case variance to total variance (between case plus within case variance), how to compute power using a macro, and how to use the d to conduct a meta-analysis of studies using single-case designs in the free program R, including syntax in an appendix. This syntax includes how to read data, compute fixed and random effect average effect sizes, prepare a forest plot and a cumulative meta-analysis, estimate various influence statistics to identify studies contributing to heterogeneity and effect size, and do various kinds of publication bias analyses. This d may prove useful for both the analysis and meta-analysis of data from SCDs. Copyright © 2013 Society for the Study of School Psychology. Published by Elsevier Ltd. All rights reserved.
Statistical Analysis of CFD Solutions from the Fourth AIAA Drag Prediction Workshop
Morrison, Joseph H.
2010-01-01
A graphical framework is used for statistical analysis of the results from an extensive N-version test of a collection of Reynolds-averaged Navier-Stokes computational fluid dynamics codes. The solutions were obtained by code developers and users from the U.S., Europe, Asia, and Russia using a variety of grid systems and turbulence models for the June 2009 4th Drag Prediction Workshop sponsored by the AIAA Applied Aerodynamics Technical Committee. The aerodynamic configuration for this workshop was a new subsonic transport model, the Common Research Model, designed using a modern approach for the wing and included a horizontal tail. The fourth workshop focused on the prediction of both absolute and incremental drag levels for wing-body and wing-body-horizontal tail configurations. This work continues the statistical analysis begun in the earlier workshops and compares the results from the grid convergence study of the most recent workshop with earlier workshops using the statistical framework.
Statistical Analysis of CFD Solutions From the Fifth AIAA Drag Prediction Workshop
Morrison, Joseph H.
2013-01-01
A graphical framework is used for statistical analysis of the results from an extensive N-version test of a collection of Reynolds-averaged Navier-Stokes computational fluid dynamics codes. The solutions were obtained by code developers and users from North America, Europe, Asia, and South America using a common grid sequence and multiple turbulence models for the June 2012 fifth Drag Prediction Workshop sponsored by the AIAA Applied Aerodynamics Technical Committee. The aerodynamic configuration for this workshop was the Common Research Model subsonic transport wing-body previously used for the 4th Drag Prediction Workshop. This work continues the statistical analysis begun in the earlier workshops and compares the results from the grid convergence study of the most recent workshop with previous workshops.
Analysis of Variance: What Is Your Statistical Software Actually Doing?
Li, Jian; Lomax, Richard G.
2011-01-01
Users assume statistical software packages produce accurate results. In this article, the authors systematically examined Statistical Package for the Social Sciences (SPSS) and Statistical Analysis System (SAS) for 3 analysis of variance (ANOVA) designs, mixed-effects ANOVA, fixed-effects analysis of covariance (ANCOVA), and nested ANOVA. For each…
Statistical testing of the full-range leadership theory in nursing.
Kanste, Outi; Kääriäinen, Maria; Kyngäs, Helvi
2009-12-01
The aim of this study is to test statistically the structure of the full-range leadership theory in nursing. The data were gathered by postal questionnaires from nurses and nurse leaders working in healthcare organizations in Finland. A follow-up study was performed 1 year later. The sample consisted of 601 nurses and nurse leaders, and the follow-up study had 78 respondents. Theory was tested through structural equation modelling, standard regression analysis and two-way ANOVA. Rewarding transformational leadership seems to promote and passive laissez-faire leadership to reduce willingness to exert extra effort, perceptions of leader effectiveness and satisfaction with the leader. Active management-by-exception seems to reduce willingness to exert extra effort and perception of leader effectiveness. Rewarding transformational leadership remained as a strong explanatory factor of all outcome variables measured 1 year later. The data supported the main structure of the full-range leadership theory, lending support to the universal nature of the theory.
Accelerated testing statistical models, test plans, and data analysis
Nelson, Wayne B
2009-01-01
The Wiley-Interscience Paperback Series consists of selected books that have been made more accessible to consumers in an effort to increase global appeal and general circulation. With these new unabridged softcover volumes, Wiley hopes to extend the lives of these works by making them available to future generations of statisticians, mathematicians, and scientists. ". . . a goldmine of knowledge on accelerated life testing principles and practices . . . one of the very few capable of advancing the science of reliability. It definitely belongs in every bookshelf on engineering." -Dev G.
Comparing Visual and Statistical Analysis of Multiple Baseline Design Graphs.
Wolfe, Katie; Dickenson, Tammiee S; Miller, Bridget; McGrath, Kathleen V
2018-04-01
A growing number of statistical analyses are being developed for single-case research. One important factor in evaluating these methods is the extent to which each corresponds to visual analysis. Few studies have compared statistical and visual analysis, and information about more recently developed statistics is scarce. Therefore, our purpose was to evaluate the agreement between visual analysis and four statistical analyses: improvement rate difference (IRD); Tau-U; Hedges, Pustejovsky, Shadish (HPS) effect size; and between-case standardized mean difference (BC-SMD). Results indicate that IRD and BC-SMD had the strongest overall agreement with visual analysis. Although Tau-U had strong agreement with visual analysis on raw values, it had poorer agreement when those values were dichotomized to represent the presence or absence of a functional relation. Overall, visual analysis appeared to be more conservative than statistical analysis, but further research is needed to evaluate the nature of these disagreements.
Hayen, Andrew; Macaskill, Petra; Irwig, Les; Bossuyt, Patrick
2010-01-01
To explain which measures of accuracy and which statistical methods should be used in studies to assess the value of a new binary test as a replacement test, an add-on test, or a triage test. Selection and explanation of statistical methods, illustrated with examples. Statistical methods for
Statistical analysis in MSW collection performance assessment.
Teixeira, Carlos Afonso; Avelino, Catarina; Ferreira, Fátima; Bentes, Isabel
2014-09-01
The increase of Municipal Solid Waste (MSW) generated over the last years forces waste managers to pursue more effective collection schemes that are technically viable, environmentally effective and economically sustainable. The assessment of MSW services using performance indicators plays a crucial role in improving service quality. In this work, we focus on the relevance of regular system monitoring as a service assessment tool. In particular, we select and test a core set of MSW collection performance indicators (effective collection distance, effective collection time and effective fuel consumption) that highlights collection system strengths and weaknesses and supports pro-active management decision-making and strategic planning. A statistical analysis was conducted with data collected in the mixed collection system of Oporto Municipality, Portugal, during one year, one week per month. This analysis provides an operational assessment of collection circuits and supports effective short-term municipal collection strategies at the level of, e.g., collection frequency and timetables, and type of containers. Copyright © 2014 Elsevier Ltd. All rights reserved.
Sensitivity analysis and related analysis : A survey of statistical techniques
Kleijnen, J.P.C.
1995-01-01
This paper reviews the state of the art in five related types of analysis, namely (i) sensitivity or what-if analysis, (ii) uncertainty or risk analysis, (iii) screening, (iv) validation, and (v) optimization. The main question is: when should which type of analysis be applied; which statistical
THE ATKINSON INDEX, THE MORAN STATISTIC, AND TESTING EXPONENTIALITY
Nao, Mimoto; Ricardas, Zitikis; Department of Statistics and Probability, Michigan State University; Department of Statistical and Actuarial Sciences, University of Western Ontario
2008-01-01
Constructing tests for exponentiality has been an active and fruitful research area, with numerous applications in engineering, biology and other sciences concerned with life-time data. In the present paper, we construct and investigate powerful tests for exponentiality based on two well known quantities: the Atkinson index and the Moran statistic. We provide an extensive study of the performance of the tests and compare them with those already available in the literature.
Validation of statistical models for creep rupture by parametric analysis
Energy Technology Data Exchange (ETDEWEB)
Bolton, J., E-mail: john.bolton@uwclub.net [65, Fisher Ave., Rugby, Warks CV22 5HW (United Kingdom)
2012-01-15
Statistical analysis is an efficient method for the optimisation of any candidate mathematical model of creep rupture data, and for the comparative ranking of competing models. However, when a series of candidate models has been examined and the best of the series has been identified, there is no statistical criterion to determine whether a yet more accurate model might be devised. Hence there remains some uncertainty that the best of any series examined is sufficiently accurate to be considered reliable as a basis for extrapolation. This paper proposes that models should be validated primarily by parametric graphical comparison to rupture data and rupture gradient data. It proposes that no mathematical model should be considered reliable for extrapolation unless the visible divergence between model and data is so small as to leave no apparent scope for further reduction. This study is based on the data for a 12% Cr alloy steel used in BS PD6605:1998 to exemplify its recommended statistical analysis procedure. The models considered in this paper include a) a relatively simple model, b) the PD6605 recommended model and c) a more accurate model of somewhat greater complexity. - Highlights: • The paper discusses the validation of creep rupture models derived from statistical analysis. • It demonstrates that models can be satisfactorily validated by a visual-graphic comparison of models to data. • The method proposed utilises test data both as conventional rupture stress and as rupture stress gradient. • The approach is shown to be more reliable than a well-established and widely used method (BS PD6605).
Morrissey, L. A.; Weinstock, K. J.; Mouat, D. A.; Card, D. H.
1984-01-01
An evaluation of Thematic Mapper Simulator (TMS) data for the geobotanical discrimination of rock types based on vegetative cover characteristics is addressed in this research. A methodology for accomplishing this evaluation utilizing univariate and multivariate techniques is presented. TMS data acquired with a Daedalus DEI-1260 multispectral scanner were integrated with vegetation and geologic information for subsequent statistical analyses, which included a chi-square test, an analysis of variance, stepwise discriminant analysis, and Duncan's multiple range test. Results indicate that ultramafic rock types are spectrally separable from nonultramafics based on vegetative cover through the use of statistical analyses.
688,112 statistical results: Content mining psychology articles for statistical test results
Hartgerink, C.H.J.
2016-01-01
In this data deposit, I describe a dataset that is the result of content mining 167,318 published articles for statistical test results reported according to the standards prescribed by the American Psychological Association (APA). Articles published by the APA, Springer, Sage, and Taylor & Francis were included (mining from Wiley and Elsevier was actively blocked). As a result of this content mining, 688,112 results from 50,845 articles were extracted. In order to provide a comprehensive set...
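To illustrate what content mining for APA-style results can look like, the sketch below uses a regular expression to pull t, F, r and z results of the form "t(28) = 2.20, p = .036" out of running text. It is a simplified, hypothetical pattern written for this note; it is not the extraction tool used to build the dataset, and it ignores chi-square results and many formatting variants found in real articles.

```python
import re

# Statistic letter, optional degrees of freedom, test value, then the p value.
APA_PATTERN = re.compile(
    r"\b(?P<stat>[tFrz])\s*"
    r"(?:\(\s*(?P<df1>\d+(?:\.\d+)?)\s*(?:,\s*(?P<df2>\d+(?:\.\d+)?)\s*)?\))?\s*"
    r"=\s*(?P<value>-?\d*\.?\d+)\s*,\s*"
    r"p\s*(?P<cmp>[<>=])\s*(?P<p>\d?\.\d+)"
)


def extract_apa_results(text):
    """Return one dict per APA-formatted test result found in text."""
    results = []
    for match in APA_PATTERN.finditer(text):
        results.append(
            {
                "stat": match.group("stat"),
                "df1": match.group("df1"),   # None when no df are reported
                "df2": match.group("df2"),   # None unless two df are reported
                "value": float(match.group("value")),
                "cmp": match.group("cmp"),   # one of "<", ">", "="
                "p": float(match.group("p")),
            }
        )
    return results


if __name__ == "__main__":
    sample = ("The effect was significant, t(28) = 2.20, p = .036, "
              "and F(1, 45) = 4.56, p < .05.")
    for result in extract_apa_results(sample):
        print(result)
```

Keeping the comparison sign separate from the p value matters for later consistency checks, since "p < .05" and "p = .05" constrain the recomputed value differently.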
Transfer of drug dissolution testing by statistical approaches: Case study
AL-Kamarany, Mohammed Amood; EL Karbane, Miloud; Ridouan, Khadija; Alanazi, Fars K.; Hubert, Philippe; Cherrah, Yahia; Bouklouze, Abdelaziz
2011-01-01
Analytical transfer is a complete process that consists of transferring an analytical procedure from a sending laboratory to a receiving laboratory, after having experimentally demonstrated that the receiving laboratory also masters the procedure, in order to avoid problems in the future. Method transfer is now commonplace during the life cycle of an analytical method in the pharmaceutical industry. No official guideline exists for a transfer methodology in pharmaceutical analysis and the regulatory word of transfer is more ambiguous than for validation. Therefore, in this study, gauge repeatability and reproducibility (R&R) studies, together with other appropriate multivariate statistics, were successfully applied to the transfer of the dissolution test of diclofenac sodium, as a case study, from a sending laboratory A (accredited laboratory) to a receiving laboratory B. The HPLC method for the determination of the percent release of diclofenac sodium in solid pharmaceutical forms (one the discovered product and the other a generic) was validated using the accuracy profile (total error) in the sending laboratory A. The results showed that the receiving laboratory B masters the dissolution test process, using the same HPLC analytical procedure developed in laboratory A. In conclusion, if the sender used the total error to validate its analytical method, the dissolution test can be successfully transferred without the receiving laboratory B having to master the analytical method validation, and the state of the pharmaceutical analysis method should be maintained to ensure the same reliable results in the receiving laboratory. PMID:24109204
Edjabou, Maklawe Essonanawe; Martín-Fernández, Josep Antoni; Scheutz, Charlotte; Astrup, Thomas Fruergaard
2017-11-01
Data for fractional solid waste composition provide relative magnitudes of individual waste fractions, the percentages of which always sum to 100, thereby connecting them intrinsically. Due to this sum constraint, waste composition data represent closed data, and their interpretation and analysis require statistical methods other than classical statistics, which are suitable only for non-constrained data such as absolute values. However, the closed characteristics of waste composition data are often ignored when analysed. The results of this study showed, for example, that unavoidable animal-derived food waste amounted to 2.21±3.12% with a confidence interval of (-4.03; 8.45), which highlights the problem of the biased negative proportions. A Pearson's correlation test, applied to waste fraction generation (kg mass), indicated a positive correlation between avoidable vegetable food waste and plastic packaging. However, correlation tests applied to waste fraction compositions (percentage values) showed a negative association in this regard, thus demonstrating that statistical analyses applied to compositional waste fraction data, without addressing the closed characteristics of these data, have the potential to generate spurious or misleading results. Therefore, compositional data should be transformed adequately prior to any statistical analysis, such as computing mean, standard deviation and correlation coefficients. Copyright © 2017 Elsevier Ltd. All rights reserved.
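The interval quoted above is numerically consistent with mean ± 2 standard deviations (2.21 - 2 × 3.12 = -4.03 and 2.21 + 2 × 3.12 = 8.45), which is how a percentage can end up with a negative lower bound. One common transformation for closed data is the centred log-ratio (clr); the sketch below is a minimal illustration and does not claim to be the specific transformation the authors used. The function names and the example fractions are hypothetical, and all parts must be strictly positive.

```python
import math


def closure(parts, total=100.0):
    """Rescale positive parts so that they sum to total (e.g. percentages)."""
    s = sum(parts)
    return [total * p / s for p in parts]


def clr(parts):
    """Centred log-ratio transform of a composition with positive parts.

    Each part is expressed as the log of its ratio to the geometric mean,
    so the result is unconstrained apart from summing to zero.
    """
    if any(p <= 0 for p in parts):
        raise ValueError("clr requires strictly positive parts")
    logs = [math.log(p) for p in parts]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


if __name__ == "__main__":
    # Hypothetical waste fractions in kg; the clr values do not depend on
    # whether masses or percentages are supplied.
    masses = [1.0, 1.0, 2.0]
    print(clr(masses))
    print(clr(closure(masses)))
```

Means, standard deviations and correlations would then be computed on the transformed values and, where needed, back-transformed to percentages.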
Zhang, Fanghong; Miyaoka, Etsuo; Huang, Fuping; Tanaka, Yutaka
2015-01-01
The problem of establishing noninferiority of a new treatment to a standard (control) treatment with ordinal categorical data is discussed. A measure of treatment effect is used and a method of specifying the noninferiority margin for the measure is provided. Two Z-type test statistics are proposed where the estimation of variance is constructed under the shifted null hypothesis using U-statistics. Furthermore, the confidence interval and the sample size formula are given based on the proposed test statistics. The proposed procedure is applied to a dataset from a clinical trial. A simulation study is conducted to compare the performance of the proposed test statistics with that of the existing ones, and the results show that the proposed test statistics are better in terms of the deviation from nominal level and the power.
Online Statistical Modeling (Regression Analysis) for Independent Responses
Made Tirta, I.; Anggraeni, Dian; Pandutama, Martinus
2017-06-01
Regression analysis (statistical modelling) is among the statistical methods most frequently needed in analyzing quantitative data, especially to model the relationship between response and explanatory variables. Nowadays, statistical models have been developed in various directions to model various types of data and complex relationships. A rich variety of advanced and recent statistical models is available in open source software (one example is R). However, these advanced statistical models are not very friendly to novice R users, since they are based on programming scripts or a command line interface. Our research aims to develop a web interface (based on R and shiny), so that the most recent and advanced statistical modelling is readily available, accessible and applicable on the web. We have previously made an interface in the form of an e-tutorial for several modern and advanced statistical models in R, especially for independent responses (including linear models/LM, generalized linear models/GLM, generalized additive models/GAM and generalized additive models for location, scale and shape/GAMLSS). In this research we unified them in the form of data analysis, including models using Computer Intensive Statistics (Bootstrap and Markov Chain Monte Carlo/MCMC). All are readily accessible on our online Virtual Statistics Laboratory. The web interface makes statistical modelling easier to apply and makes it easier to compare models in order to find the most appropriate one for the data.
The null hypothesis of GSEA, and a novel statistical model for competitive gene set analysis
DEFF Research Database (Denmark)
Debrabant, Birgit
2017-01-01
MOTIVATION: Competitive gene set analysis intends to assess whether a specific set of genes is more associated with a trait than the remaining genes. However, the statistical models assumed to date to underlie these methods do not enable a clear-cut formulation of the competitive null hypothesis....... This is a major handicap to the interpretation of results obtained from a gene set analysis. RESULTS: This work presents a hierarchical statistical model based on the notion of dependence measures, which overcomes this problem. The two levels of the model naturally reflect the modular structure of many gene set...... analysis methods. We apply the model to show that the popular GSEA method, which recently has been claimed to test the self-contained null hypothesis, actually tests the competitive null if the weight parameter is zero. However, for this result to hold strictly, the choice of the dependence measures...
Statistics for experimentalists
Cooper, B E
2014-01-01
Statistics for Experimentalists aims to provide experimental scientists with a working knowledge of statistical methods and search approaches to the analysis of data. The book first elaborates on probability and continuous probability distributions. Discussions focus on properties of continuous random variables and normal variables, independence of two random variables, central moments of a continuous distribution, prediction from a normal distribution, binomial probabilities, and multiplication of probabilities and independence. The text then examines estimation and tests of significance. Topics include estimators and estimates, expected values, minimum variance linear unbiased estimators, sufficient estimators, methods of maximum likelihood and least squares, and the test of significance method. The manuscript ponders on distribution-free tests, Poisson process and counting problems, correlation and function fitting, balanced incomplete randomized block designs and the analysis of covariance, and experiment...
Application of Ontology Technology in Health Statistic Data Analysis.
Guo, Minjiang; Hu, Hongpu; Lei, Xingyun
2017-01-01
Research Purpose: to establish a health management ontology for the analysis of health statistic data. Proposed Methods: this paper established a health management ontology based on an analysis of the concepts in the China Health Statistics Yearbook, and used Protégé to define the syntactic and semantic structure of health statistical data. Six classes of top-level ontology concepts and their subclasses were extracted, and the object properties and data properties were defined to establish the construction of these classes. By ontology instantiation, we can integrate multi-source heterogeneous data and enable administrators to have an overall understanding and analysis of the health statistic data. Ontology technology provides a comprehensive and unified information integration structure for the health management domain and lays a foundation for the efficient analysis of multi-source and heterogeneous health system management data and for the enhancement of management efficiency.
International Nuclear Information System (INIS)
Vardavas, I.M.
1992-01-01
A simple procedure is presented for the statistical analysis of measurement data where the primary concern is the determination of the value corresponding to a specified average exceedance probability. The analysis employs the normal and log-normal frequency distributions together with a χ²-test and an error analysis. The error analysis introduces the concept of a counting error criterion, or ζ-test, to test whether the data are sufficient to make the χ²-test reliable. The procedure is applied to the analysis of annual rainfall data recorded at stations in the tropical Top End of Australia where the Ranger uranium deposit is situated. 9 refs., 12 tabs., 9 figs.
Statistical power analyses using G*Power 3.1: tests for correlation and regression analyses.
Faul, Franz; Erdfelder, Edgar; Buchner, Axel; Lang, Albert-Georg
2009-11-01
G*Power is a free power analysis program for a variety of statistical tests. We present extensions and improvements of the version introduced by Faul, Erdfelder, Lang, and Buchner (2007) in the domain of correlation and regression analyses. In the new version, we have added procedures to analyze the power of tests based on (1) single-sample tetrachoric correlations, (2) comparisons of dependent correlations, (3) bivariate linear regression, (4) multiple linear regression based on the random predictor model, (5) logistic regression, and (6) Poisson regression. We describe these new features and provide a brief introduction to their scope and handling.
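As a rough illustration of what a power analysis for a correlation test computes, the sketch below uses the Fisher z normal approximation for a two-sided test of zero correlation. This is not G*Power's own algorithm and the function name `correlation_power` is hypothetical; results will differ slightly from the values the program reports.

```python
from math import atanh, sqrt
from statistics import NormalDist

def correlation_power(rho, n, alpha=0.05):
    """Approximate power of a two-sided test of H0: rho = 0.

    Uses the Fisher z transform, whose sampling distribution is roughly
    normal with standard error 1 / sqrt(n - 3). Requires n > 3.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = atanh(rho) * sqrt(n - 3)
    # Probability of landing in either rejection region.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)

if __name__ == "__main__":
    for n in (20, 50, 84, 150):
        print(n, round(correlation_power(0.3, n), 3))
```

With a true correlation of zero the function returns the significance level itself, which is a quick sanity check on the approximation.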
Explorations in Statistics: The Analysis of Change
Curran-Everett, Douglas; Williams, Calvin L.
2015-01-01
Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This tenth installment of "Explorations in Statistics" explores the analysis of a potential change in some physiological response. As researchers, we often express absolute change as percent change so we can…
Ranganathan, Priya; Pramesh, C. S.; Buyse, Marc
2015-01-01
In the second part of a series on pitfalls in statistical analysis, we look at various ways in which a statistically significant study result can be expressed. We debunk some of the myths regarding the ‘P’ value, explain the importance of ‘confidence intervals’ and clarify the importance of including both values in a paper. PMID:25878958
BrightStat.com: free statistics online.
Stricker, Daniel
2008-10-01
Powerful software for statistical analysis is expensive. Here I present BrightStat, a statistical software package running on the Internet which is free of charge. BrightStat's goals, its main capabilities and functionalities are outlined. Three different sample runs, a Friedman test, a chi-square test, and a step-wise multiple regression, are presented. The results obtained by BrightStat are compared with results computed by SPSS, one of the global leaders in providing statistical software, and VassarStats, a collection of scripts for data analysis running on the Internet. Elementary statistics is an inherent part of academic education and BrightStat is an alternative to commercial products.
Directory of Open Access Journals (Sweden)
Shirin Iranfar
2013-12-01
Full Text Available Introduction: Test anxiety is a common phenomenon among students and is one of the problems of educational system. The present study was conducted to investigate the test anxiety in vital statistics course and its association with academic performance of students at Kermanshah University of Medical Sciences. This study was descriptive-analytical and the study sample included the students studying in nursing and midwifery, paramedicine and health faculties that had taken vital statistics course and were selected through census method. Sarason questionnaire was used to analyze the test anxiety. Data were analyzed by descriptive and inferential statistics. The findings indicated no significant correlation between test anxiety and score of vital statistics course.
Testing statistical self-similarity in the topology of river networks
Troutman, Brent M.; Mantilla, Ricardo; Gupta, Vijay K.
2010-01-01
Recent work has demonstrated that the topological properties of real river networks deviate significantly from predictions of Shreve's random model. At the same time the property of mean self-similarity postulated by Tokunaga's model is well supported by data. Recently, a new class of network model called random self-similar networks (RSN) that combines self-similarity and randomness has been introduced to replicate important topological features observed in real river networks. We investigate if the hypothesis of statistical self-similarity in the RSN model is supported by data on a set of 30 basins located across the continental United States that encompass a wide range of hydroclimatic variability. We demonstrate that the generators of the RSN model obey a geometric distribution, and self-similarity holds in a statistical sense in 26 of these 30 basins. The parameters describing the distribution of interior and exterior generators are tested to be statistically different and the difference is shown to produce the well-known Hack's law. The inter-basin variability of RSN parameters is found to be statistically significant. We also test generator dependence on two climatic indices, mean annual precipitation and radiative index of dryness. Some indication of climatic influence on the generators is detected, but this influence is not statistically significant with the sample size available. Finally, two key applications of the RSN model to hydrology and geomorphology are briefly discussed.
Spectral signature verification using statistical analysis and text mining
DeCoster, Mallory E.; Firpi, Alexe H.; Jacobs, Samantha K.; Cone, Shelli R.; Tzeng, Nigel H.; Rodriguez, Benjamin M.
2016-05-01
In the spectral science community, numerous spectral signatures are stored in databases representative of many sample materials collected from a variety of spectrometers and spectroscopists. Due to the variety and variability of the spectra that comprise many spectral databases, it is necessary to establish a metric for validating the quality of spectral signatures. This has been an area of great discussion and debate in the spectral science community. This paper discusses a method that independently validates two different aspects of a spectral signature, the textual meta-data and the numerical spectral data, to arrive at a final qualitative assessment. Results associated with the spectral data stored in the Signature Database [1] (SigDB) are proposed. The numerical data comprising a sample material's spectrum is validated based on statistical properties derived from an ideal population set. The quality of the test spectrum is ranked based on a spectral angle mapper (SAM) comparison to the mean spectrum derived from the population set. Additionally, the contextual data of a test spectrum is qualitatively analyzed using lexical analysis text mining. This technique analyzes the syntax of the meta-data to provide local learning patterns and trends within the spectral data, indicative of the test spectrum's quality. Text mining applications have successfully been implemented for security [2] (text encryption/decryption), biomedical [3], and marketing [4] applications. The text mining lexical analysis algorithm is trained on the meta-data patterns of a subset of high and low quality spectra, in order to have a model to apply to the entire SigDB data set. The statistical and textual methods combine to assess the quality of a test spectrum existing in a database without the need of an expert user. This method has been compared to other validation methods accepted by the spectral science community, and has provided promising results when a baseline spectral signature is
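The spectral angle mapper comparison mentioned above can be sketched in a few lines. The abstract gives no implementation details, so the function names, the ranking rule (smaller angle ranks first) and the sample values are illustrative assumptions rather than the authors' code.

```python
import math

def spectral_angle(test, reference):
    """Angle in radians between two spectra treated as vectors."""
    dot = sum(t * r for t, r in zip(test, reference))
    norm_t = math.sqrt(sum(t * t for t in test))
    norm_r = math.sqrt(sum(r * r for r in reference))
    # Clamp to [-1, 1] to guard against floating-point overshoot.
    cos_angle = max(-1.0, min(1.0, dot / (norm_t * norm_r)))
    return math.acos(cos_angle)

def mean_spectrum(population):
    """Band-by-band mean of a list of equal-length spectra."""
    n = len(population)
    return [sum(band) / n for band in zip(*population)]

def rank_by_angle(candidates, population):
    """Return candidate names ordered from smallest to largest angle."""
    reference = mean_spectrum(population)
    return sorted(candidates,
                  key=lambda name: spectral_angle(candidates[name], reference))

if __name__ == "__main__":
    # Invented reflectance values over four bands.
    population = [[0.10, 0.30, 0.50, 0.40], [0.12, 0.28, 0.52, 0.38]]
    candidates = {"close": [0.22, 0.58, 1.02, 0.78],
                  "far": [0.50, 0.10, 0.10, 0.60]}
    print(rank_by_angle(candidates, population))
```

Because the angle ignores vector length, a spectrum that is a scaled copy of the reference scores an angle of zero, which makes the measure insensitive to overall brightness.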
TECHNIQUE OF THE STATISTICAL ANALYSIS OF INVESTMENT APPEAL OF THE REGION
Directory of Open Access Journals (Sweden)
А. А. Vershinina
2014-01-01
Full Text Available This article presents a technique for the statistical analysis of a region's investment appeal for foreign direct investment. The technique is defined, the stages of the analysis are described, and the mathematical-statistical tools are considered.
Statistical analysis of ultrasonic measurements in concrete
Chiang, Chih-Hung; Chen, Po-Chih
2002-05-01
Stress wave techniques such as measurements of ultrasonic pulse velocity are often used to evaluate concrete quality in structures. For proper interpretation of measurement results, the dependence of pulse transit time on the average acoustic impedance and the material homogeneity along the sound path needs to be examined. Semi-direct measurement of pulse velocity could be more convenient than through-transmission measurement, because it is not necessary to access both sides of concrete floors or walls. A novel measurement scheme is proposed and verified based on statistical analysis. It is shown that semi-direct measurements are very effective for gathering large amounts of pulse velocity data from concrete reference specimens. The variability of measurements is comparable with that reported by the American Concrete Institute using either break-off or pullout tests.
Statistical analysis of network data with R
Kolaczyk, Eric D
2014-01-01
Networks have permeated everyday life through everyday realities like the Internet, social networks, and viral marketing. As such, network analysis is an important growth area in the quantitative sciences, with roots in social network analysis going back to the 1930s and graph theory going back centuries. Measurement and analysis are integral components of network research. As a result, statistical methods play a critical role in network analysis. This book is the first of its kind in network research. It can be used as a stand-alone resource in which multiple R packages are used to illustrate how to conduct a wide range of network analyses, from basic manipulation and visualization, to summary and characterization, to modeling of network data. The central package is igraph, which provides extensive capabilities for studying network graphs in R. This text builds on Eric D. Kolaczyk’s book Statistical Analysis of Network Data (Springer, 2009).
Directory of Open Access Journals (Sweden)
Hamid Reza Marateb
2014-01-01
Full Text Available Background: selecting the correct statistical test and data mining method depends highly on the measurement scale of the data, the type of variables, and the purpose of the analysis. Different measurement scales are studied in detail, and statistical comparison, modeling, and data mining methods are studied using several medical examples. We have presented two clustering examples for ordinal variables, a more challenging variable type in analysis, using the Wisconsin Breast Cancer Data (WBCD). Ordinal-to-Interval scale conversion example: a breast cancer database of nine 10-level ordinal variables for 683 patients was analyzed by two ordinal-scale clustering methods. The performance of the clustering methods was assessed by comparison with the gold standard groups of malignant and benign cases that had been identified by clinical tests. Results: the sensitivity and accuracy of the two clustering methods were 98% and 96%, respectively. Their specificity was comparable. Conclusion: by using an appropriate clustering algorithm based on the measurement scale of the variables in the study, high performance is achieved. Moreover, descriptive and inferential statistics, in addition to the modeling approach, must be selected based on the scale of the variables.
Marateb, Hamid Reza; Mansourian, Marjan; Adibi, Peyman; Farina, Dario
2014-01-01
Background: selecting the correct statistical test and data mining method depends highly on the measurement scale of the data, the type of variables, and the purpose of the analysis. Different measurement scales are studied in detail, and statistical comparison, modeling, and data mining methods are studied using several medical examples. We have presented two clustering examples for ordinal variables, a more challenging variable type in analysis, using the Wisconsin Breast Cancer Data (WBCD). Ordinal-to-Interval scale conversion example: a breast cancer database of nine 10-level ordinal variables for 683 patients was analyzed by two ordinal-scale clustering methods. The performance of the clustering methods was assessed by comparison with the gold standard groups of malignant and benign cases that had been identified by clinical tests. Results: the sensitivity and accuracy of the two clustering methods were 98% and 96%, respectively. Their specificity was comparable. Conclusion: by using an appropriate clustering algorithm based on the measurement scale of the variables in the study, high performance is achieved. Moreover, descriptive and inferential statistics, in addition to the modeling approach, must be selected based on the scale of the variables. PMID:24672565
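For readers unfamiliar with the performance measures quoted in this abstract, the sketch below shows how sensitivity, specificity and accuracy are obtained when cluster labels are compared with a gold standard. The labels are invented and the function name `diagnostic_metrics` is hypothetical; nothing here reproduces the study's data.

```python
def diagnostic_metrics(predicted, actual, positive="malignant"):
    """Sensitivity, specificity and accuracy against gold-standard labels."""
    pairs = list(zip(predicted, actual))
    tp = sum(p == positive and a == positive for p, a in pairs)
    tn = sum(p != positive and a != positive for p, a in pairs)
    fp = sum(p == positive and a != positive for p, a in pairs)
    fn = sum(p != positive and a == positive for p, a in pairs)
    return {
        "sensitivity": tp / (tp + fn),   # share of true positives found
        "specificity": tn / (tn + fp),   # share of true negatives found
        "accuracy": (tp + tn) / len(pairs),
    }

if __name__ == "__main__":
    # Invented labels for eight cases.
    actual = ["malignant"] * 4 + ["benign"] * 4
    predicted = ["malignant", "malignant", "malignant", "benign",
                 "benign", "benign", "benign", "malignant"]
    print(diagnostic_metrics(predicted, actual))
```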
Statistical inference based on divergence measures
Pardo, Leandro
2005-01-01
The idea of using functionals of Information Theory, such as entropies or divergences, in statistical inference is not new. However, in spite of the fact that divergence statistics have become a very good alternative to the classical likelihood ratio test and the Pearson-type statistic in discrete models, many statisticians remain unaware of this powerful approach. Statistical Inference Based on Divergence Measures explores classical problems of statistical inference, such as estimation and hypothesis testing, on the basis of measures of entropy and divergence. The first two chapters form an overview, from a statistical perspective, of the most important measures of entropy and divergence and study their properties. The author then examines the statistical analysis of discrete multivariate data, with emphasis on problems in contingency tables and loglinear models, using phi-divergence test statistics as well as minimum phi-divergence estimators. The final chapter looks at testing in general populations, prese...
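One well-known subfamily of phi-divergence test statistics is the Cressie-Read power-divergence family, which contains both the Pearson statistic and the likelihood ratio statistic as special cases. The sketch below is an illustration of that family only, with invented counts; it is not taken from the book, and the function name `power_divergence` is chosen here for convenience.

```python
import math

def power_divergence(observed, expected, lam):
    """Cressie-Read statistic 2/(lam(lam+1)) * sum O * ((O/E)**lam - 1).

    lam = 1 gives the Pearson chi-square statistic; lam = 0 is handled as
    the limiting case, the likelihood ratio statistic 2 * sum O * ln(O/E).
    Observed and expected counts must have the same total, and lam = -1
    is not supported in this sketch.
    """
    if abs(lam) < 1e-12:
        return 2.0 * sum(o * math.log(o / e)
                         for o, e in zip(observed, expected) if o > 0)
    total = sum(o * ((o / e) ** lam - 1.0) for o, e in zip(observed, expected))
    return 2.0 / (lam * (lam + 1.0)) * total

if __name__ == "__main__":
    observed = [18, 22, 30, 30]   # invented counts
    expected = [25, 25, 25, 25]   # uniform null hypothesis
    for lam in (0.0, 2.0 / 3.0, 1.0):
        print(lam, power_divergence(observed, expected, lam))
```

Under the null hypothesis all members of the family share the same asymptotic chi-square distribution, so they are compared against the same critical value.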
International Nuclear Information System (INIS)
CAP, JEROME S.; TRACEY, BRIAN
1999-01-01
Aerospace payloads, such as satellites, are subjected to vibroacoustic excitation during launch. Sandia's MTI satellite has recently been certified to this environment using a combination of base input random vibration and reverberant acoustic noise. The initial choices for the acoustic and random vibration test specifications were obtained from the launch vehicle Interface Control Document (ICD). In order to tailor the random vibration levels for the laboratory certification testing, it was necessary to determine whether vibration energy was flowing across the launch vehicle interface from the satellite to the launch vehicle or the other direction. For frequencies below 120 Hz this issue was addressed using response limiting techniques based on results from the Coupled Loads Analysis (CLA). However, since the CLA Finite Element Analysis (FEA) model was only correlated for frequencies below 120 Hz, Statistical Energy Analysis (SEA) was considered to be a better choice for predicting the direction of the energy flow for frequencies above 120 Hz. The existing SEA model of the launch vehicle had been developed using the VibroAcoustic Payload Environment Prediction System (VAPEPS) computer code [1]. Therefore, the satellite would have to be modeled using VAPEPS as well. As is the case for any computational model, the confidence in its predictive capability increases if one can correlate a sample prediction against experimental data. Fortunately, Sandia had the ideal data set for correlating an SEA model of the MTI satellite: the measured response of a realistic assembly to a reverberant acoustic test that was performed during MTI's qualification test series. The first part of this paper will briefly describe the VAPEPS modeling effort and present the results of the correlation study for the VAPEPS model. The second part of this paper will present the results from a study that used a commercial SEA software package [2] to study the effects of in-plane modes and to evaluate
Statistical Analysis Of Tank 19F Floor Sample Results
International Nuclear Information System (INIS)
Harris, S.
2010-01-01
Representative sampling has been completed for characterization of the residual material on the floor of Tank 19F as per the statistical sampling plan developed by Harris and Shine. Samples from eight locations have been obtained from the tank floor and two of the samples were archived as a contingency. Six samples, referred to in this report as the current scrape samples, have been submitted to and analyzed by SRNL. This report contains the statistical analysis of the floor sample analytical results to determine if further data are needed to reduce uncertainty. Included are comparisons with the prior Mantis sample results to determine if they can be pooled with the current scrape samples to estimate the upper 95% confidence limits (UCL95%) for concentration. Statistical analysis revealed that the Mantis and current scrape sample results are not compatible. Therefore, the Mantis sample results were not used to support the quantification of analytes in the residual material. Significant spatial variability among the current scrape sample results was not found. Constituent concentrations were similar between the North and South hemispheres as well as between the inner and outer regions of the tank floor. The current scrape sample results from all six samples fall within their 3-sigma limits. In view of the results from numerous statistical tests, the data were pooled from all six current scrape samples. As such, an adequate sample size was provided for quantification of the residual material on the floor of Tank 19F. The uncertainty is quantified in this report by a UCL95% on each analyte concentration. The uncertainty in analyte concentration was calculated as a function of the number of samples, the average, and the standard deviation of the analytical results. The UCL95% was based entirely on the six current scrape sample results (each averaged across three analytical determinations).
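The abstract says the UCL95% depends on the number of samples, the average and the standard deviation, but does not give the formula. A common one-sided form, assumed here for illustration, is the mean plus the 95th-percentile Student t quantile times the standard error. The concentrations below are invented, and the short quantile table is included only to keep the sketch free of third-party dependencies.

```python
from math import sqrt
from statistics import mean, stdev

# One-sided 95% Student t quantiles, indexed by degrees of freedom.
T_95_ONE_SIDED = {1: 6.314, 2: 2.920, 3: 2.353, 4: 2.132, 5: 2.015,
                  6: 1.943, 7: 1.895, 8: 1.860, 9: 1.833, 10: 1.812}

def ucl95(values):
    """Upper 95% confidence limit on the mean: mean + t * s / sqrt(n)."""
    n = len(values)
    t = T_95_ONE_SIDED[n - 1]          # n - 1 degrees of freedom
    return mean(values) + t * stdev(values) / sqrt(n)

if __name__ == "__main__":
    # Six invented analyte concentrations, one per scrape sample.
    concentrations = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
    print(ucl95(concentrations))
```

With only six samples the t quantile (2.015 for five degrees of freedom) is noticeably larger than the normal value of 1.645, which is how the small sample size enters the limit.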
Beginning R The Statistical Programming Language
Gardener, Mark
2012-01-01
Conquer the complexities of this open source statistical language. R is fast becoming the de facto standard for statistical computing and analysis in science, business, engineering, and related fields. This book examines this complex language using simple statistical examples, showing how R operates in a user-friendly context. Both students and workers in fields that require extensive statistical analysis will find this book helpful as they learn to use R for simple summary statistics, hypothesis testing, creating graphs, regression, and much more. It covers formula notation, complex statistics
Semiclassical analysis, Witten Laplacians, and statistical mechanics
Helffer, Bernard
2002-01-01
This important book explains how the technique of Witten Laplacians may be useful in statistical mechanics. It considers the problem of analyzing the decay of correlations, after presenting its origin in statistical mechanics. In addition, it compares the Witten Laplacian approach with other techniques, such as the transfer matrix approach and its semiclassical analysis. The author concludes by providing a complete proof of the uniform Log-Sobolev inequality. Contents: Witten Laplacians Approach; Problems in Statistical Mechanics with Discrete Spins; Laplace Integrals and Transfer Operators; S
Error calculations statistics in radioactive measurements
International Nuclear Information System (INIS)
Verdera, Silvia
1994-01-01
Basic approach and procedures frequently used in the practice of radioactive measurements. The statistical principles applied are part of Good Radiopharmaceutical Practices and quality assurance. Concept of error, classification as systematic and random errors. Statistical fundamentals, probability theories, population distributions, Bernoulli, Poisson, Gauss, t-test distribution, χ² test, error propagation based on analysis of variance. Bibliography. z table, t-test table, Poisson index, χ² test
Methodologies for the Statistical Analysis of Memory Response to Radiation
Bosser, Alexandre L; Tsiligiannis, Georgios; Frost, Christopher D; Zadeh, Ali; Jaatinen, Jukka; Javanainen, Arto; Puchner, Helmut; Saigne, Frederic; Virtanen, Ari; Wrobel, Frederic; Dilillo, Luigi
2016-01-01
Methodologies are proposed for in-depth statistical analysis of Single Event Upset data. The motivation for using these methodologies is to obtain precise information on the intrinsic defects and weaknesses of the tested devices, and to gain insight on their failure mechanisms, at no additional cost. The case study is a 65 nm SRAM irradiated with neutrons, protons and heavy ions. This publication is an extended version of a previous study [1].
Mieth, Bettina; Kloft, Marius; Rodríguez, Juan Antonio; Sonnenburg, Sören; Vobruba, Robin; Morcillo-Suárez, Carlos; Farré, Xavier; Marigorta, Urko M.; Fehr, Ernst; Dickhaus, Thorsten; Blanchard, Gilles; Schunk, Daniel; Navarro, Arcadi; Müller, Klaus-Robert
2016-01-01
The standard approach to the analysis of genome-wide association studies (GWAS) is based on testing each position in the genome individually for statistical significance of its association with the phenotype under investigation. To improve the analysis of GWAS, we propose a combination of machine learning and statistical testing that takes correlation structures within the set of SNPs under investigation in a mathematically well-controlled manner into account. The novel two-step algorithm, COMBI, first trains a support vector machine to determine a subset of candidate SNPs and then performs hypothesis tests for these SNPs together with an adequate threshold correction. Applying COMBI to data from a WTCCC study (2007) and measuring performance as replication by independent GWAS published within the 2008–2015 period, we show that our method outperforms ordinary raw p-value thresholding as well as other state-of-the-art methods. COMBI presents higher power and precision than the examined alternatives while yielding fewer false (i.e. non-replicated) and more true (i.e. replicated) discoveries when its results are validated on later GWAS studies. More than 80% of the discoveries made by COMBI upon WTCCC data have been validated by independent studies. Implementations of the COMBI method are available as a part of the GWASpi toolbox 2.0. PMID:27892471
A statistical analysis of the impact of advertising signs on road safety.
Yannis, George; Papadimitriou, Eleonora; Papantoniou, Panagiotis; Voulgari, Chrisoula
2013-01-01
This research aims to investigate the impact of advertising signs on road safety. An exhaustive review of international literature was carried out on the effect of advertising signs on driver behaviour and safety. Moreover, a before-and-after statistical analysis with control groups was applied on several road sites with different characteristics in the Athens metropolitan area, in Greece, in order to investigate the correlation between the placement or removal of advertising signs and the related occurrence of road accidents. Road accident data for the 'before' and 'after' periods on the test sites and the control sites were extracted from the database of the Hellenic Statistical Authority, and the selected 'before' and 'after' periods vary from 2.5 to 6 years. The statistical analysis shows no statistical correlation between road accidents and advertising signs at any of the nine sites examined, as the estimated safety effects are non-significant at the 95% confidence level. This can be explained by the fact that, in the examined road sites, drivers are overloaded with information (traffic signs, direction signs, labels of shops, pedestrians and other vehicles, etc.) so that the additional information load from advertising signs may not further distract them.
Xu, Kuan-Man
2006-01-01
A new method is proposed to compare statistical differences between summary histograms, which are the histograms summed over a large ensemble of individual histograms. It consists of choosing a distance statistic for measuring the difference between summary histograms and using a bootstrap procedure to calculate the statistical significance level. Bootstrapping is an approach to statistical inference that makes few assumptions about the underlying probability distribution that describes the data. Three distance statistics are compared in this study. They are the Euclidean distance, the Jeffries-Matusita distance and the Kuiper distance. The data used in testing the bootstrap method are satellite measurements of cloud systems called cloud objects. Each cloud object is defined as a contiguous region/patch composed of individual footprints or fields of view. A histogram of measured values over footprints is generated for each parameter of each cloud object and then summary histograms are accumulated over all individual histograms in a given cloud-object size category. The results of statistical hypothesis tests using all three distances as test statistics are generally similar, indicating the validity of the proposed method. The Euclidean distance is determined to be most suitable after comparing the statistical tests of several parameters with distinct probability distributions among three cloud-object size categories. Impacts on the statistical significance levels resulting from differences in the total lengths of satellite footprint data between two size categories are also discussed.
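The bootstrap comparison of summary histograms can be sketched as follows, using the Euclidean distance that the abstract reports as most suitable. The resampling scheme (pooling the individual histograms of both categories and redrawing two groups with replacement) is an assumption about how such a test is typically built, not a description of the paper's exact procedure, and all names and data are illustrative.

```python
import math
import random

def summary_histogram(histograms):
    """Sum individual histograms bin by bin and normalise to unit total."""
    totals = [sum(bins) for bins in zip(*histograms)]
    grand = sum(totals)
    return [t / grand for t in totals]

def euclidean(p, q):
    """Euclidean distance between two normalised histograms."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def bootstrap_p_value(group_a, group_b, n_boot=2000, seed=0):
    """Share of bootstrap distances at least as large as the observed one."""
    rng = random.Random(seed)
    observed = euclidean(summary_histogram(group_a), summary_histogram(group_b))
    pooled = group_a + group_b   # null hypothesis: one common population
    exceed = 0
    for _ in range(n_boot):
        resample_a = [rng.choice(pooled) for _ in group_a]
        resample_b = [rng.choice(pooled) for _ in group_b]
        d = euclidean(summary_histogram(resample_a),
                      summary_histogram(resample_b))
        if d >= observed:
            exceed += 1
    return exceed / n_boot

if __name__ == "__main__":
    # Invented three-bin histograms for two size categories.
    small = [[8, 2, 0], [7, 3, 0], [9, 1, 0], [6, 3, 1]] * 5
    large = [[1, 3, 6], [0, 2, 8], [1, 2, 7], [0, 4, 6]] * 5
    print(bootstrap_p_value(small, large, n_boot=500))
```

Swapping `euclidean` for another distance function is enough to try a different test statistic, since the resampling loop does not depend on which distance is used.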
Karadag, Engin
2010-01-01
To assess research methods and analysis of statistical techniques employed by educational researchers, this study surveyed unpublished doctoral dissertations from 2003 to 2007. Frequently used research methods consisted of experimental research; a survey; a correlational study; and a case study. Descriptive statistics, t-test, ANOVA, factor…
Introduction to Statistics - eNotes
DEFF Research Database (Denmark)
Brockhoff, Per B.; Møller, Jan Kloppenborg; Andersen, Elisabeth Wreford
2015-01-01
Online textbook used in the introductory statistics courses at DTU. It provides a basic introduction to applied statistics for engineers. The necessary elements from probability theory are introduced (stochastic variable, density and distribution function, mean and variance, etc.) and thereafter...... the most basic statistical analysis methods are presented: Confidence band, hypothesis testing, simulation, simple and multiple regression, ANOVA and analysis of contingency tables. Examples with the software R are included for all presented theory and methods....
The Statistical Analysis of Time Series
Anderson, T W
2011-01-01
The Wiley Classics Library consists of selected books that have become recognized classics in their respective fields. With these new unabridged and inexpensive editions, Wiley hopes to extend the life of these important works by making them available to future generations of mathematicians and scientists. Currently available in the Series: T. W. Anderson Statistical Analysis of Time Series T. S. Arthanari & Yadolah Dodge Mathematical Programming in Statistics Emil Artin Geometric Algebra Norman T. J. Bailey The Elements of Stochastic Processes with Applications to the Natural Sciences George
Directory of Open Access Journals (Sweden)
J. Sunil Rao
2007-01-01
Full Text Available In gene selection for cancer classification using microarray data, we define an eigenvalue-ratio statistic to measure a gene’s contribution to the joint discriminability when this gene is included in a set of genes. Based on this eigenvalue-ratio statistic, we define a novel hypothesis test for gene statistical redundancy and propose two gene selection methods. Simulation studies illustrate the agreement between statistical redundancy testing and gene selection methods. Real data examples show that the proposed gene selection methods can select a compact gene subset which can not only be used to build high-quality cancer classifiers but also shows biological relevance.
Analysis of room transfer function and reverberant signal statistics
DEFF Research Database (Denmark)
Georganti, Eleftheria; Mourjopoulos, John; Jacobsen, Finn
2008-01-01
For some time now, statistical analysis has been a valuable tool in analyzing room transfer functions (RTFs). This work examines existing statistical time-frequency models and techniques for RTF analysis (e.g., Schroeder's stochastic model and the standard deviation over frequency bands for the RTF...... magnitude and phase). RTF fractional octave smoothing, as with 1/3-octave analysis, may lead to RTF simplifications that can be useful for several audio applications, like room compensation, room modeling, auralisation purposes. The aim of this work is to identify the relationship of optimal response...... and the corresponding ratio of the direct and reverberant signal. In addition, this work examines the statistical quantities for speech and audio signals prior to their reproduction within rooms and when recorded in rooms. Histograms and other statistical distributions are used to compare RTF minima of typical...
Partial discharge testing: a progress report. Statistical evaluation of PD data
International Nuclear Information System (INIS)
Warren, V.; Allan, J.
2005-01-01
It has long been known that comparing the partial discharge results obtained from a single machine is a valuable tool enabling companies to observe the gradual deterioration of a machine stator winding and thus plan appropriate maintenance for the machine. In 1998, at the annual Iris Rotating Machines Conference (IRMC), a paper was presented that compared thousands of PD test results to establish the criteria for comparing results from different machines and the expected PD levels. At subsequent annual Iris conferences, using similar analytical procedures, papers were presented that supported the previous criteria and: in 1999, established sensor location as an additional criterion; in 2000, evaluated the effect of insulation type and age on PD activity; in 2001, evaluated the effect of manufacturer on PD activity; in 2002, evaluated the effect of operating pressure for hydrogen-cooled machines; in 2003, evaluated the effect of insulation type and setting Trac alarms; in 2004, re-evaluated the effect of manufacturer on PD activity. Before going further in database analysis procedures, it would be prudent to statistically evaluate the anecdotal evidence observed to date. The goal was to determine which variables of machine conditions greatly influenced the PD results and which didn't. Therefore, this year's paper looks at the impact of operating voltage, machine type and winding type on the test results for air-cooled machines. Because of resource constraints, only data collected through 2003 was used; however, as before, it is still standardized for frequency bandwidth and pruned to include only full-load-hot (FLH) results collected for one sensor on operating machines. All questionable data, or data from off-line testing or unusual machine conditions was excluded, leaving 6824 results. Calibration of on-line PD test results is impractical; therefore, only results obtained using the same method of data collection and noise separation techniques are compared. For
Examining publication bias—a simulation-based evaluation of statistical tests on publication bias
Directory of Open Access Journals (Sweden)
Andreas Schneck
2017-11-01
Full Text Available Background Publication bias is a form of scientific misconduct. It threatens the validity of research results and the credibility of science. Although several tests on publication bias exist, no in-depth evaluations are available that examine which test performs best for different research settings. Methods Four tests on publication bias, Egger’s test (FAT), p-uniform, the test of excess significance (TES), as well as the caliper test (CT), were evaluated in a Monte Carlo simulation. Two different types of publication bias and its degree (0%, 50%, 100%) were simulated. The type of publication bias was defined either as file-drawer, meaning the repeated analysis of new datasets, or p-hacking, meaning the inclusion of covariates in order to obtain a significant result. In addition, the underlying effect (β = 0, 0.5, 1, 1.5), effect heterogeneity, the number of observations in the simulated primary studies (N = 100, 500), and the number of observations for the publication bias tests (K = 100, 1,000) were varied. Results All tests evaluated were able to identify publication bias both in the file-drawer and p-hacking condition. The false positive rates were, with the exception of the 15%- and 20%-caliper test, unbiased. The FAT had the largest statistical power in the file-drawer conditions, whereas under p-hacking the TES was, except under effect heterogeneity, slightly better. The CTs were, however, inferior to the other tests under effect homogeneity and had a decent statistical power only in conditions with 1,000 primary studies. Discussion The FAT is recommended as a test for publication bias in standard meta-analyses with no or only small effect heterogeneity. If two-sided publication bias is suspected as well as under p-hacking the TES is the first alternative to the FAT. The 5%-caliper test is recommended under conditions of effect heterogeneity and a large number of primary studies, which may be found if publication bias is examined in a
Sources of Error and the Statistical Formulation of M_S:m_b Seismic Event Screening Analysis
Anderson, D. N.; Patton, H. J.; Taylor, S. R.; Bonner, J. L.; Selby, N. D.
2014-03-01
The Comprehensive Nuclear-Test-Ban Treaty (CTBT), a global ban on nuclear explosions, is currently in a ratification phase. Under the CTBT, an International Monitoring System (IMS) of seismic, hydroacoustic, infrasonic and radionuclide sensors is operational, and the data from the IMS is analysed by the International Data Centre (IDC). The IDC provides CTBT signatories basic seismic event parameters and a screening analysis indicating whether an event exhibits explosion characteristics (for example, shallow depth). An important component of the screening analysis is a statistical test of the null hypothesis H_0: explosion characteristics using empirical measurements of seismic energy (magnitudes). The established magnitude used for event size is the body-wave magnitude (denoted m_b) computed from the initial segment of a seismic waveform. IDC screening analysis is applied to events with m_b greater than 3.5. The Rayleigh wave magnitude (denoted M_S) is a measure of later arriving surface wave energy. Magnitudes are measurements of seismic energy that include adjustments (physical correction model) for path and distance effects between event and station. Relative to m_b, earthquakes generally have a larger M_S magnitude than explosions. This article proposes a hypothesis test (screening analysis) using M_S and m_b that expressly accounts for physical correction model inadequacy in the standard error of the test statistic. With this formulation, the hypothesis test applied to the nuclear weapon test announced by the Democratic People's Republic of Korea in 2009 fails to reject the null hypothesis H_0: explosion characteristics.
D'Alessio, Michael
2012-01-01
AP Statistics Crash Course - Gets You a Higher Advanced Placement Score in Less Time Crash Course is perfect for the time-crunched student, the last-minute studier, or anyone who wants a refresher on the subject. AP Statistics Crash Course gives you: Targeted, Focused Review - Study Only What You Need to Know Crash Course is based on an in-depth analysis of the AP Statistics course description outline and actual Advanced Placement test questions. It covers only the information tested on the exam, so you can make the most of your valuable study time. Our easy-to-read format covers: exploring da
Application of Statistics in Engineering Technology Programs
Zhan, Wei; Fink, Rainer; Fang, Alex
2010-01-01
Statistics is a critical tool for robustness analysis, measurement system error analysis, test data analysis, probabilistic risk assessment, and many other fields in the engineering world. Traditionally, however, statistics is not extensively used in undergraduate engineering technology (ET) programs, resulting in a major disconnect from industry…
Transit safety & security statistics & analysis 2002 annual report (formerly SAMIS)
2004-12-01
The Transit Safety & Security Statistics & Analysis 2002 Annual Report (formerly SAMIS) is a compilation and analysis of mass transit accident, casualty, and crime statistics reported under the Federal Transit Administration's (FTA's) National Tr...
Transit safety & security statistics & analysis 2003 annual report (formerly SAMIS)
2005-12-01
The Transit Safety & Security Statistics & Analysis 2003 Annual Report (formerly SAMIS) is a compilation and analysis of mass transit accident, casualty, and crime statistics reported under the Federal Transit Administration's (FTA's) National Tr...
International Nuclear Information System (INIS)
Martin, Robert P.; Nutt, William T.
2011-01-01
Research highlights: → Historical recitation on application of order-statistics models to nuclear power plant thermal-hydraulics safety analysis. → Interpretation of regulatory language regarding 10 CFR 50.46 reference to a 'high level of probability'. → Derivation and explanation of order-statistics-based evaluation methodologies considering multi-variate acceptance criteria. → Summary of order-statistics models and recommendations to the nuclear power plant thermal-hydraulics safety analysis community. - Abstract: The application of order-statistics in best-estimate plus uncertainty nuclear safety analysis has received a considerable amount of attention from methodology practitioners, regulators, and academia. At the root of the debate are two questions: (1) what is an appropriate quantitative interpretation of 'high level of probability' in regulatory language appearing in the LOCA rule, 10 CFR 50.46 and (2) how best to mathematically characterize the multi-variate case. An original derivation is offered to provide a quantitative basis for 'high level of probability.' At root of the second question is whether one should recognize a probability statement based on the tolerance region method of Wald and Guba, et al., for multi-variate problems, one explicitly based on the regulatory limits, best articulated in the Wallis-Nutt 'Testing Method', or something else entirely. This paper reviews the origins of the different positions, key assumptions, limitations, and relationship to addressing acceptance criteria. It presents a mathematical interpretation of the regulatory language, including a complete derivation of uni-variate order-statistics (as credited in AREVA's Realistic Large Break LOCA methodology) and extension to multi-variate situations. Lastly, it provides recommendations for LOCA applications, endorsing the 'Testing Method' and addressing acceptance methods allowing for limited sample failures.
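As a small worked illustration of the univariate order-statistics idea mentioned above, the sketch below computes how many code runs are needed so that the m-th largest result bounds the γ-quantile with confidence β. It uses the textbook Wilks relation, not the derivation given in the paper itself, and the function names are hypothetical.

```python
from math import comb

def confidence(n, gamma, m=1):
    """Confidence that the m-th largest of n i.i.d. results exceeds the
    gamma-quantile, i.e. P(at least m results fall above that quantile)."""
    p = 1.0 - gamma  # chance that a single run exceeds the quantile
    tail = sum(comb(n, j) * p**j * gamma**(n - j) for j in range(m))
    return 1.0 - tail

def min_runs(gamma=0.95, beta=0.95, m=1):
    """Smallest n for which the m-th largest result is a
    (gamma, beta) one-sided upper tolerance limit."""
    n = m
    while confidence(n, gamma, m) < beta:
        n += 1
    return n

if __name__ == "__main__":
    # m = 1 uses the sample maximum; m = 2 discards the single worst run.
    print(min_runs(0.95, 0.95, m=1), min_runs(0.95, 0.95, m=2))
```

For m = 1 the condition reduces to 1 - γ^n ≥ β, which gives the familiar 59 runs for a 95%/95% statement; taking m = 2 raises the requirement to 93 runs.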
Li, Ke; Zhang, Qiuju; Wang, Kun; Chen, Peng; Wang, Huaqing
2016-01-08
A new fault diagnosis method for rotating machinery based on an adaptive statistic test filter (ASTF) and a Diagnostic Bayesian Network (DBN) is presented in this paper. ASTF is proposed to extract weak fault features under background noise; it uses statistical hypothesis testing in the frequency domain to evaluate the similarity between a reference signal (noise signal) and the original signal, and removes the components of high similarity. The optimal level of significance α is obtained using particle swarm optimization (PSO). To evaluate the performance of the ASTF, an evaluation factor Ipq is also defined. In addition, a simulation experiment is designed to verify the effectiveness and robustness of ASTF. A sensitivity evaluation method using principal component analysis (PCA) is proposed to evaluate the sensitivity of symptom parameters (SPs) for condition diagnosis. In this way, the good SPs that have high sensitivity for condition diagnosis can be selected. A three-layer DBN is developed to identify the condition of rotating machinery based on Bayesian Belief Network (BBN) theory. A condition diagnosis experiment for rolling element bearings demonstrates the effectiveness of the proposed method. PMID:26761006
Shardell, Michelle; Harris, Anthony D; El-Kamary, Samer S; Furuno, Jon P; Miller, Ram R; Perencevich, Eli N
2007-10-01
Quasi-experimental study designs are frequently used to assess interventions that aim to limit the emergence of antimicrobial-resistant pathogens. However, previous studies using these designs have often used suboptimal statistical methods, which may lead researchers to draw spurious conclusions. Methods used to analyze quasi-experimental data include 2-group tests, regression analysis, and time-series analysis, and they all have specific assumptions, data requirements, strengths, and limitations. An example of a hospital-based intervention to reduce methicillin-resistant Staphylococcus aureus infection rates and reduce overall length of stay is used to explore these methods.
Statistical Modelling of Wind Profiles - Data Analysis and Modelling
DEFF Research Database (Denmark)
Jónsson, Tryggvi; Pinson, Pierre
The aim of the analysis presented in this document is to investigate whether statistical models can be used to make very short-term predictions of wind profiles.
A general statistical test for correlations in a finite-length time series.
Hanson, Jeffery A; Yang, Haw
2008-06-07
The statistical properties of the autocorrelation function from a time series composed of independently and identically distributed stochastic variables have been studied. Analytical expressions for the autocorrelation function's variance have been derived. It has been found that two common ways of calculating the autocorrelation, moving-average and Fourier transform, exhibit different uncertainty characteristics. For periodic time series, the Fourier transform method is preferred because it gives smaller uncertainties that are uniform through all time lags. Based on these analytical results, a statistically robust method has been proposed to test the existence of correlations in a time series. The statistical test is verified by computer simulations and an application to single-molecule fluorescence spectroscopy is discussed.
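The two ways of estimating the autocorrelation, and a simple test for correlations, might be sketched as below. Several things here are assumptions rather than the paper's method: the direct lagged-product estimator stands in for the "moving-average" calculation, the circular estimator is written with modular indexing (it is what an unpadded Fourier-transform calculation returns), and the threshold uses the textbook large-sample bound 1.96/√N for i.i.d. data instead of the analytical variance expressions derived by the authors.

```python
import math
import random

def _centered(x):
    """Return the mean-removed series and its (biased) variance."""
    n = len(x)
    mean = sum(x) / n
    d = [v - mean for v in x]
    var = sum(v * v for v in d) / n
    return d, var

def autocorr_direct(x, max_lag):
    """Lagged-product estimate; lag k averages only the n - k available pairs."""
    d, var = _centered(x)
    n = len(d)
    return [sum(d[i] * d[i + k] for i in range(n - k)) / ((n - k) * var)
            for k in range(max_lag + 1)]

def autocorr_circular(x, max_lag):
    """Circular estimate, equivalent to the inverse transform of the
    power spectrum without zero padding; every lag uses all n pairs."""
    d, var = _centered(x)
    n = len(d)
    return [sum(d[i] * d[(i + k) % n] for i in range(n)) / (n * var)
            for k in range(max_lag + 1)]

def correlated_lags(x, max_lag, z=1.96):
    """Lags (>= 1) whose direct estimate exceeds the approximate
    i.i.d. bound z / sqrt(n); an empty list suggests no correlation."""
    bound = z / math.sqrt(len(x))
    r = autocorr_direct(x, max_lag)
    return [k for k in range(1, max_lag + 1) if abs(r[k]) > bound]
```

The circular form has the same number of terms at every lag, which is consistent with the abstract's remark that the Fourier route gives uniform uncertainty for periodic series; for non-periodic data the wrap-around terms would bias it.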
Multivariate statistical analysis: a high-dimensional approach
Serdobolskii, V
2000-01-01
In the last few decades the accumulation of large amounts of information in numerous applications has stimulated an increased interest in multivariate analysis. Computer technologies allow one to use multi-dimensional and multi-parametric models successfully. At the same time, an interest arose in statistical analysis with a deficiency of sample data. Nevertheless, it is difficult to describe the recent state of affairs in applied multivariate methods as satisfactory. Unimprovable (dominating) statistical procedures are still unknown except for a few specific cases. The simplest problem of estimating the mean vector with minimum quadratic risk is unsolved, even for normal distributions. Commonly used standard linear multivariate procedures based on the inversion of sample covariance matrices can lead to unstable results or provide no solution in dependence of data. Programs included in standard statistical packages cannot process 'multi-collinear data' and there are no theoretical recommen ...
Extending the Reach of Statistical Software Testing
National Research Council Canada - National Science Library
Weber, Robert
2004-01-01
.... In particular, as system complexity increases, the matrices required to generate test cases and perform model analysis can grow dramatically, even exponentially, overwhelming the test generation...
Nonparametric statistics for social and behavioral sciences
Kraska-Miller, M
2013-01-01
Introduction to Research in Social and Behavioral Sciences; Basic Principles of Research; Planning for Research; Types of Research Designs; Sampling Procedures; Validity and Reliability of Measurement Instruments; Steps of the Research Process; Introduction to Nonparametric Statistics; Data Analysis; Overview of Nonparametric Statistics and Parametric Statistics; Overview of Parametric Statistics; Overview of Nonparametric Statistics; Importance of Nonparametric Methods; Measurement Instruments; Analysis of Data to Determine Association and Agreement; Pearson Chi-Square Test of Association and Independence; Contingency
Applied multivariate statistical analysis
Härdle, Wolfgang Karl
2015-01-01
Focusing on high-dimensional applications, this 4th edition presents the tools and concepts used in multivariate data analysis in a style that is also accessible for non-mathematicians and practitioners. It surveys the basic principles and emphasizes both exploratory and inferential statistics; a new chapter on Variable Selection (Lasso, SCAD and Elastic Net) has also been added. All chapters include practical exercises that highlight applications in different multivariate data analysis fields: in quantitative financial studies, where the joint dynamics of assets are observed; in medicine, where recorded observations of subjects in different locations form the basis for reliable diagnoses and medication; and in quantitative marketing, where consumers’ preferences are collected in order to construct models of consumer behavior. All of these examples involve high to ultra-high dimensions and represent a number of major fields in big data analysis. The fourth edition of this book on Applied Multivariate ...
DEFF Research Database (Denmark)
Edjabou, Maklawe Essonanawe; Martín-Fernández, Josep Antoni; Scheutz, Charlotte
2017-01-01
-derived food waste amounted to 2.21 ± 3.12% with a confidence interval of (−4.03; 8.45), which highlights the problem of the biased negative proportions. A Pearson’s correlation test, applied to waste fraction generation (kg mass), indicated a positive correlation between avoidable vegetable food waste...... and plastic packaging. However, correlation tests applied to waste fraction compositions (percentage values) showed a negative association in this regard, thus demonstrating that statistical analyses applied to compositional waste fraction data, without addressing the closed characteristics of these data......, have the potential to generate spurious or misleading results. Therefore, compositional data should be transformed adequately prior to any statistical analysis, such as computing mean, standard deviation and correlation coefficients....
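The closure effect described in this abstract can be reproduced on synthetic numbers. The sketch below is illustrative only: the simulated households, the three-fraction setup and the choice of the centred log-ratio (clr) as the transformation are assumptions made here, not the data or the specific transformation used in the study.

```python
import math
import random

def pearson(x, y):
    """Plain Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def closure(parts):
    """Convert masses to proportions that sum to 1 (the 'closed' form)."""
    total = sum(parts)
    return [p / total for p in parts]

def clr(parts):
    """Centred log-ratio transform; all parts must be strictly positive."""
    logs = [math.log(p) for p in parts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

def simulate_households(n, seed=0):
    """Hypothetical data: three waste fractions (kg) that all scale with
    a shared household-size factor, plus independent noise."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        size = rng.uniform(1.0, 10.0)
        rows.append([size * rng.uniform(0.8, 1.2) for _ in range(3)])
    return rows

if __name__ == "__main__":
    rows = simulate_households(500)
    shares = [closure(r) for r in rows]
    print("kg correlation:", pearson([r[0] for r in rows], [r[1] for r in rows]))
    print("share correlation:", pearson([s[0] for s in shares], [s[1] for s in shares]))
```

In this toy setup the masses are positively correlated because they share the size factor, while the percentages are negatively correlated purely because they must sum to one. The clr values sum to zero per sample and are one common starting point for log-ratio analysis.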
Quantitative analysis of LISA pathfinder test-mass noise
International Nuclear Information System (INIS)
Ferraioli, Luigi; Congedo, Giuseppe; Hueller, Mauro; Vitale, Stefano; Hewitson, Martin; Nofrarias, Miquel; Armano, Michele
2011-01-01
LISA Pathfinder (LPF) is a mission aiming to test the critical technology for the forthcoming space-based gravitational-wave detectors. The main scientific objective of the LPF mission is to demonstrate test masses free falling with residual accelerations below 3×10^-14 m s^-2/√Hz at 1 mHz. Reaching such an ambitious target will require a significant amount of system optimization and characterization, which will in turn require accurate and quantitative noise analysis procedures. In this paper, we discuss two main problems associated with the analysis of the data from LPF: i) excess noise detection and ii) noise parameter identification. The mission is focused on the low-frequency region ([0.1, 10] mHz) of the available signal spectrum. In such a region, the signal is dominated by the force noise acting on test masses. At the same time, the mission duration is limited to 90 days and typical data segments will be 24 hours in length. Considering those constraints, noise analysis is expected to deal with a limited amount of non-Gaussian data, since the spectrum statistics will be far from Gaussian and the lowest available frequency is limited by the data length. In this paper, we analyze the details of the expected statistics for spectral data and develop two suitable excess noise estimators. One is based on the statistical properties of the integrated spectrum, the other is based on the Kolmogorov-Smirnov test. The sensitivity of the estimators is discussed theoretically for independent data, then the algorithms are tested on LPF synthetic data. The test on realistic LPF data allows the effect of spectral data correlations on the efficiency of the different noise excess estimators to be highlighted. It also reveals the versatility of the Kolmogorov-Smirnov approach, which can be adapted to provide reasonable results on correlated data from a modified version of the standard equations for the inversion of the test statistic. Closely related to excess noise
Comment on the asymptotics of a distribution-free goodness of fit test statistic.
Browne, Michael W; Shapiro, Alexander
2015-03-01
In a recent article Jennrich and Satorra (Psychometrika 78: 545-552, 2013) showed that a proof by Browne (British Journal of Mathematical and Statistical Psychology 37: 62-83, 1984) of the asymptotic distribution of a goodness of fit test statistic is incomplete because it fails to prove that the orthogonal component function employed is continuous. Jennrich and Satorra (Psychometrika 78: 545-552, 2013) showed how Browne's proof can be completed satisfactorily but this required the development of an extensive and mathematically sophisticated framework for continuous orthogonal component functions. This short note provides a simple proof of the asymptotic distribution of Browne's (British Journal of Mathematical and Statistical Psychology 37: 62-83, 1984) test statistic by using an equivalent form of the statistic that does not involve orthogonal component functions and consequently avoids all complicating issues associated with them.
Statistical evaluation of vibration analysis techniques
Milner, G. Martin; Miller, Patrice S.
1987-01-01
An evaluation methodology is presented for a selection of candidate vibration analysis techniques applicable to machinery representative of the environmental control and life support system of advanced spacecraft; illustrative results are given. Attention is given to the statistical analysis of small sample experiments, the quantification of detection performance for diverse techniques through the computation of probability of detection versus probability of false alarm, and the quantification of diagnostic performance.
Goedhart, Paul W; van der Voet, Hilko; Baldacchino, Ferdinando; Arpaia, Salvatore
2014-04-01
Genetic modification of plants may result in unintended effects causing potentially adverse effects on the environment. A comparative safety assessment is therefore required by authorities, such as the European Food Safety Authority, in which the genetically modified plant is compared with its conventional counterpart. Part of the environmental risk assessment is a comparative field experiment in which the effect on non-target organisms is compared. Statistical analysis of such trials comes in two flavors: difference testing and equivalence testing. It is important to know the statistical properties of these, for example, the power to detect environmental change of a given magnitude, before the start of an experiment. Such prospective power analysis can best be studied by means of a statistical simulation model. This paper describes a general framework for simulating data typically encountered in environmental risk assessment of genetically modified plants. The simulation model, available as Supplementary Material, can be used to generate count data having different statistical distributions, possibly with excess zeros. In addition, the model employs completely randomized or randomized block experiments, can be used to simulate single or multiple trials across environments, enables genotype by environment interaction by adding random variety effects, and finally includes repeated measures in time following a constant, linear or quadratic pattern in time, possibly with some form of autocorrelation. The model also allows a set of reference varieties to be added to the GM plant and its comparator to assess the natural variation, which can then be used to set limits of concern for equivalence testing. The different count distributions are described in some detail and some examples of how to use the simulation model to study various aspects, including a prospective power analysis, are provided.
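A prospective power analysis by simulation, in its most reduced form, could look like the sketch below. It is far simpler than the framework the abstract describes and rests on assumptions made here: plain Poisson counts (no excess zeros, blocks or repeated measures), a completely randomized two-group layout, and a normal-approximation difference test on the group totals. All names are hypothetical.

```python
import math
import random

def poisson(rng, lam):
    """Draw one Poisson variate with Knuth's multiplication method."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        k += 1
        p *= rng.random()
        if p <= limit:
            return k - 1

def difference_test(total_gm, total_ref, z_crit=1.96):
    """Two-sided test of equal Poisson means from equal-sized groups.
    Uses z = (X1 - X2) / sqrt(X1 + X2), a normal approximation."""
    if total_gm + total_ref == 0:
        return False
    z = (total_gm - total_ref) / math.sqrt(total_gm + total_ref)
    return abs(z) > z_crit

def simulated_power(mean_ref, ratio, plots_per_group, n_sim=2000, seed=0):
    """Fraction of simulated trials in which the difference test rejects,
    when the GM mean count equals ratio * mean_ref."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sim):
        gm = sum(poisson(rng, mean_ref * ratio) for _ in range(plots_per_group))
        ref = sum(poisson(rng, mean_ref) for _ in range(plots_per_group))
        if difference_test(gm, ref):
            rejections += 1
    return rejections / n_sim

if __name__ == "__main__":
    for ratio in (1.0, 0.75, 0.5):
        print(ratio, simulated_power(mean_ref=10.0, ratio=ratio, plots_per_group=10))
```

Running the loop over several effect ratios and plot numbers gives a rough power curve before any field trial is planned; an equivalence test would need limits of concern, which are not sketched here.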
Wear behavior of AA 5083/SiC nano-particle metal matrix composite: Statistical analysis
Hussain Idrisi, Amir; Ismail Mourad, Abdel-Hamid; Thekkuden, Dinu Thomas; Christy, John Victor
2018-03-01
This paper reports a study on the statistical analysis of the wear characteristics of AA5083/SiC nanocomposite. The aluminum matrix composites with different wt % (0%, 1% and 2%) of SiC nanoparticles were fabricated by using the stir casting route. The developed composites were used in the manufacturing of spur gears on which the study was conducted. A specially designed test rig was used in testing the wear performance of the gears. The wear was investigated under different conditions of applied load (10 N, 20 N, and 30 N) and operation time (30 min, 60 min, 90 min, and 120 min). The analysis was carried out at room temperature under a constant speed of 1450 rpm. The wear parameters were optimized by using Taguchi’s method. In this statistical approach, an L27 orthogonal array was selected for the analysis of output. Furthermore, analysis of variance (ANOVA) was used to investigate the influence of applied load, operation time and SiC wt % on wear behaviour. The wear resistance was analyzed by selecting the “smaller is better” characteristic as the objective of the model. From this research, it is observed that experiment time and SiC wt % have the most significant effect on the wear performance, followed by the applied load.
Statistical Analysis of Video Frame Size Distribution Originating from Scalable Video Codec (SVC)
Directory of Open Access Journals (Sweden)
Sima Ahmadpour
2017-01-01
Full Text Available Designing an effective and high-performance network requires an accurate characterization and modeling of network traffic. The modeling of video frame sizes is normally applied in simulation studies and mathematical analysis, and in generating streams for testing and compliance purposes. Besides, video traffic is assumed to be a major source of multimedia traffic in future heterogeneous networks. Therefore, the statistical distribution of video data can be used as the input for performance modeling of networks. The findings of this paper comprise the theoretical definition of the distribution which seems to be relevant to the video trace in terms of its statistical properties, and the best-fitting distribution found using both the graphical method and the hypothesis test. The data set used in this article consists of layered video traces generated by the Scalable Video Codec (SVC) video compression technique for three different movies.
Statistical analysis on extreme wave height
Digital Repository Service at National Institute of Oceanography (India)
Teena, N.V.; SanilKumar, V.; Sudheesh, K.; Sajeev, R.
-294. WAFO (2000). A MATLAB toolbox for analysis of random waves and loads, Lund University, Sweden, http://www.maths.lth.se/matstat/wafo/. Table 1: Statistical results of data and fitted distribution for cumulative distribution...
International Nuclear Information System (INIS)
Zhang, Jinzhao; Segurado, Jacobo; Schneidesch, Christophe
2013-01-01
Since the 1980s, Tractebel Engineering (TE) has been developing and applying a multi-physical modelling and safety analysis capability, based on a code package consisting of the best estimate 3D neutronic (PANTHER), system thermal hydraulic (RELAP5), core sub-channel thermal hydraulic (COBRA-3C), and fuel thermal-mechanical (FRAPCON/FRAPTRAN) codes. A series of methodologies have been developed to perform and to license the reactor safety analysis and core reload design, based on the deterministic bounding approach. Following the recent trends in research and development as well as in industrial applications, TE has been working since 2010 towards the application of statistical sensitivity and uncertainty analysis methods to the multi-physical modelling and licensing safety analyses. In this paper, the TE multi-physical modelling and safety analysis capability is first described, followed by the proposed TE best estimate plus statistical uncertainty analysis method (BESUAM). The chosen statistical sensitivity and uncertainty analysis methods (non-parametric order statistic method or bootstrap) and tool (DAKOTA) are then presented, followed by some preliminary results of their application to the FRAPCON/FRAPTRAN simulation of the OECD RIA fuel rod codes benchmark and the RELAP5/MOD3.3 simulation of THTF tests. (authors)
Multivariate meta-analysis: a robust approach based on the theory of U-statistic.
Ma, Yan; Mazumdar, Madhu
2011-10-30
Meta-analysis is the methodology for combining findings from similar research studies asking the same question. When the question of interest involves multiple outcomes, multivariate meta-analysis is used to synthesize the outcomes simultaneously, taking into account the correlation between the outcomes. Likelihood-based approaches, in particular the restricted maximum likelihood (REML) method, are commonly utilized in this context. REML assumes a multivariate normal distribution for the random-effects model. This assumption is difficult to verify, especially for meta-analyses with a small number of component studies. The use of REML also requires iterative estimation between parameters, needing moderately high computation time, especially when the dimension of outcomes is large. A multivariate method of moments (MMM) is available and has been shown to perform as well as REML. However, there is a lack of information on the performance of these two methods when the true data distribution is far from normality. In this paper, we propose a new nonparametric and non-iterative method for multivariate meta-analysis on the basis of the theory of U-statistics and compare the properties of these three procedures under both normal and skewed data through simulation studies. It is shown that the effect of a non-normal data distribution on estimates from REML is marginal and that the estimates from the MMM and U-statistic-based approaches are very similar. Therefore, we conclude that for performing multivariate meta-analysis, the U-statistic estimation procedure is a viable alternative to REML and MMM. Easy implementation of all three methods is illustrated by their application to data from two published meta-analyses from the fields of hip fracture and periodontal disease. We discuss ideas for future research based on U-statistics for testing the significance of between-study heterogeneity and for extending the work to the meta-regression setting. Copyright © 2011 John Wiley & Sons, Ltd.
A NEW TEST OF THE STATISTICAL NATURE OF THE BRIGHTEST CLUSTER GALAXIES
International Nuclear Information System (INIS)
Lin, Yen-Ting; Ostriker, Jeremiah P.; Miller, Christopher J.
2010-01-01
A novel statistic is proposed to examine the hypothesis that all cluster galaxies are drawn from the same luminosity distribution (LD). In such a 'statistical model' of galaxy LD, the brightest cluster galaxies (BCGs) are simply the statistical extreme of the galaxy population. Using a large sample of nearby clusters, we show that BCGs in high luminosity clusters (e.g., $L_{\mathrm{tot}} \gtrsim 4 \times 10^{11}\, h_{70}^{-2}\, L_{\odot}$) are unlikely (probability $\le 3 \times 10^{-4}$) to be drawn from the LD defined by all red cluster galaxies more luminous than $M_r = -20$. On the other hand, BCGs in less luminous clusters are consistent with being the statistical extreme. Applying our method to the second brightest galaxies, we show that they are consistent with being the statistical extreme, which implies that the BCGs are also distinct from non-BCG luminous, red, cluster galaxies. We point out some issues with the interpretation of the classical tests proposed by Tremaine and Richstone (TR) that are designed to examine the statistical nature of BCGs, investigate the robustness of both our statistical test and those of TR against difficulties in photometry of galaxies of large angular size, and discuss the implication of our findings on surveys that use the luminous red galaxies to measure the baryon acoustic oscillation features in the galaxy power spectrum.
Bugała, Artur; Bednarek, Karol; Kasprzyk, Leszek; Tomczewski, Andrzej
2017-10-01
The paper presents the most representative characteristics, taken from a three-year measurement period, of daily and monthly electricity production from photovoltaic conversion using modules installed on a fixed and on a 2-axis tracking construction. Results are presented for selected summer, autumn, spring and winter days. The analyzed measuring stand is located on the roof of the Faculty of Electrical Engineering building at Poznan University of Technology. Basic statistical parameters such as mean value, standard deviation, skewness, kurtosis, median, range, and coefficient of variation were used. It was found that the asymmetry factor can be useful in the analysis of daily electricity production from photovoltaic conversion. To determine the repeatability of monthly electricity production between summer months, and between summer and winter months, the non-parametric Mann-Whitney U test was used. To analyze the repeatability of daily peak hours, defined by the largest value of hourly electricity production, the non-parametric Kruskal-Wallis test was applied as an extension of the Mann-Whitney U test. Based on the analysis of the electric energy distribution from the prepared monitoring system, it was found that traditional forecasting methods for electricity production from photovoltaic conversion, such as multiple regression models, should not be the preferred methods of analysis.
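The Mann-Whitney U test mentioned above can be sketched in plain Python as follows. This is a minimal illustration, not the authors' procedure: it assumes a two-sided normal approximation without tie or continuity correction, and the production figures in the usage lines are made-up placeholders, not data from the paper.

```python
from math import sqrt
from statistics import NormalDist


def average_ranks(values):
    """1-based ranks of `values`, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U, two-sided p) using the normal approximation (assumption)."""
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)
    sd_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)   # no tie correction
    z = (u - n1 * n2 / 2) / sd_u                # z <= 0 because u is the smaller U
    return u, min(1.0, 2 * NormalDist().cdf(z))


# Hypothetical daily production values (kWh) for two months
month_a = [21.4, 19.8, 23.1, 22.0, 20.5, 24.2]
month_b = [8.1, 6.9, 9.4, 7.7, 10.2, 8.8]
print(mann_whitney_u(month_a, month_b))
```

For small samples an exact p-value is preferable to the normal approximation used here.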
International Nuclear Information System (INIS)
Behringer, K.; Spiekerman, G.
1984-01-01
Piety (1977) proposed an automated signature analysis of power spectral density data. Eight statistical decision discriminants are introduced. For nearly all the discriminants, improved confidence statements can be made. The statistical characteristics of the last three discriminants, which are applications of non-parametric tests, are considered. (author)
Mager, P P; Rothe, H
1990-10-01
Multicollinearity of physicochemical descriptors leads to serious consequences in quantitative structure-activity relationship (QSAR) analysis, such as incorrect estimators and test statistics of the regression coefficients of the ordinary least-squares (OLS) model usually applied to QSARs. Besides diagnosing the known simple collinearity, principal component regression analysis (PCRA) also allows the diagnosis of various types of multicollinearity. The effects of multicollinearity can be circumvented only if the absolute values of the PCRA estimators are order statistics that decrease monotonically. Otherwise, obscure phenomena may be observed, such as good data recognition but low predictive power of a QSAR model.
Feiveson, Alan H.; Foy, Millennia; Ploutz-Snyder, Robert; Fiedler, James
2014-01-01
Do you have elevated p-values? Is the data analysis process getting you down? Do you experience anxiety when you need to respond to criticism of statistical methods in your manuscript? You may be suffering from Insufficient Statistical Support Syndrome (ISSS). For symptomatic relief of ISSS, come for a free consultation with JSC biostatisticians at our help desk during the poster sessions at the HRP Investigators Workshop. Get answers to common questions about sample size, missing data, multiple testing, when to trust the results of your analyses and more. Side effects may include sudden loss of statistics anxiety, improved interpretation of your data, and increased confidence in your results.
Statistical analysis of nematode counts from interlaboratory proficiency tests
Berg, van den W.; Hartsema, O.; Nijs, Den J.M.F.
2014-01-01
A series of proficiency tests on potato cyst nematode (PCN; n=29) and free-living stages of Meloidogyne and Pratylenchus (n=23) were investigated to determine the accuracy and precision of the nematode counts and to gain insights into possible trends and potential improvements. In each test, each
Statistical Requirements For Pass-Fail Testing Of Contraband Detection Systems
International Nuclear Information System (INIS)
Gilliam, David M.
2011-01-01
Contraband detection systems for homeland security applications are typically tested for probability of detection (PD) and probability of false alarm (PFA) using pass-fail testing protocols. Test protocols usually require specified values for PD and PFA to be demonstrated at a specified level of statistical confidence CL. Based on a recent more theoretical treatment of this subject [1], this summary reviews the definition of CL and provides formulas and spreadsheet functions for constructing tables of general test requirements and for determining the minimum number of tests required. The formulas and tables in this article may be generally applied to many other applications of pass-fail testing, in addition to testing of contraband detection systems.
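The article's own formulas and spreadsheet functions are not reproduced in this abstract. As a hedged sketch of the kind of calculation it describes, the function below assumes independent trials and a binomial model, and finds the smallest number of tests for which observing at most a given number of misses demonstrates the required PD at confidence level CL. The name `min_tests` and its parameters are hypothetical.

```python
from math import comb


def min_tests(pd_required, confidence, allowed_failures=0):
    """Smallest n such that <= allowed_failures misses in n independent trials
    demonstrates PD >= pd_required at the given confidence (binomial model)."""
    miss = 1 - pd_required
    n = allowed_failures + 1
    while True:
        # Chance of passing the protocol if the true PD were only pd_required
        pass_prob = sum(
            comb(n, i) * miss**i * pd_required**(n - i)
            for i in range(allowed_failures + 1)
        )
        if pass_prob <= 1 - confidence:
            return n
        n += 1


print(min_tests(0.90, 0.95))                      # no misses allowed
print(min_tests(0.90, 0.95, allowed_failures=1))  # one miss allowed
```

With zero allowed misses the condition reduces to PD^n ≤ 1 − CL, so demonstrating PD = 0.90 at CL = 0.95 takes 29 consecutive detections; allowing one miss raises the requirement to 46 trials.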
Time Series Analysis Based on Running Mann Whitney Z Statistics
A sensitive and objective time series analysis method based on the calculation of Mann Whitney U statistics is described. This method samples data rankings over moving time windows, converts those samples to Mann-Whitney U statistics, and then normalizes the U statistics to Z statistics using Monte-...
Directory of Open Access Journals (Sweden)
Rafdzah Zaki
2013-06-01
Objective(s): Reliability measures precision, or the extent to which test results can be replicated. This is the first systematic review to identify statistical methods used to measure the reliability of equipment measuring continuous variables. This study also aims to highlight inappropriate statistical methods used in reliability analysis and their implications for medical practice. Materials and Methods: In 2010, five electronic databases were searched for reliability studies published between 2007 and 2009. A total of 5,795 titles were initially identified. Only 282 titles were potentially related, and finally 42 fitted the inclusion criteria. Results: The Intra-class Correlation Coefficient (ICC) is the most popular method, used in 25 (60%) studies, followed by comparing means (8 studies, or 19%). Out of the 25 studies using the ICC, only 7 (28%) reported the confidence intervals and the type of ICC used. Most studies (71%) also tested the agreement of instruments. Conclusion: This study finds that the Intra-class Correlation Coefficient is the most popular method used to assess the reliability of medical instruments measuring continuous outcomes. There are also inappropriate applications and interpretations of statistical methods in some studies. It is important for medical researchers to be aware of this issue and to be able to perform reliability analyses correctly.
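Because the review notes that most studies did not report which type of ICC they used, a small example may help show that the type is a concrete modelling choice. The sketch below computes one specific form, the one-way random-effects single-measurement ICC, often written ICC(1,1); other forms use different mean squares and can give different values. The function name `icc_oneway` and the readings are hypothetical.

```python
def icc_oneway(ratings):
    """ICC(1,1): one-way random effects, single measurement.

    `ratings` is a list of subjects, each a list of k repeated measurements.
    """
    n = len(ratings)
    k = len(ratings[0])
    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    subject_means = [sum(row) / k for row in ratings]
    # Between-subject and within-subject mean squares from one-way ANOVA
    ms_between = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(ratings, subject_means) for x in row
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Made-up repeated readings from one instrument on four subjects
readings = [[120, 122], [135, 133], [110, 111], [150, 148]]
print(round(icc_oneway(readings), 3))
```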
Sensitivity analysis of ranked data: from order statistics to quantiles
Heidergott, B.F.; Volk-Makarewicz, W.
2015-01-01
In this paper we provide the mathematical theory for sensitivity analysis of order statistics of continuous random variables, where the sensitivity is with respect to a distributional parameter. Sensitivity analysis of order statistics over a finite number of observations is discussed before
Angeler, David G; Viedma, Olga; Moreno, José M
2009-11-01
Time lag analysis (TLA) is a distance-based approach used to study temporal dynamics of ecological communities by measuring community dissimilarity over increasing time lags. Despite its increased use in recent years, its performance in comparison with other more direct methods (i.e., canonical ordination) has not been evaluated. This study fills this gap using extensive simulations and real data sets from experimental temporary ponds (true zooplankton communities) and landscape studies (landscape categories as pseudo-communities) that differ in community structure and anthropogenic stress history. Modeling time with a principal coordinate of neighborhood matrices (PCNM) approach, the canonical ordination technique (redundancy analysis; RDA) consistently outperformed the other statistical tests (i.e., TLAs, Mantel test, and RDA based on linear time trends) using all real data. In addition, the RDA-PCNM revealed different patterns of temporal change, and the strength of each individual time pattern, in terms of adjusted variance explained, could be evaluated. It also identified species contributions to these patterns of temporal change. This additional information is not provided by distance-based methods. The simulation study revealed better Type I error properties of the canonical ordination techniques compared with the distance-based approaches when no deterministic component of change was imposed on the communities. The simulation also revealed that a strong emphasis on uniform deterministic change and low variability at other temporal scales is needed to decrease the statistical power of the RDA-PCNM approach relative to the other methods. Based on the statistical performance of, and the information content provided by, RDA-PCNM models, this technique serves ecologists as a powerful tool for modeling temporal change of ecological (pseudo-) communities.
STATISTICAL ANALYSIS OF TANK 18F FLOOR SAMPLE RESULTS
Energy Technology Data Exchange (ETDEWEB)
Harris, S.
2010-09-02
Representative sampling has been completed for characterization of the residual material on the floor of Tank 18F as per the statistical sampling plan developed by Shine [1]. Samples from eight locations have been obtained from the tank floor and two of the samples were archived as a contingency. Six samples, referred to in this report as the current scrape samples, have been submitted to and analyzed by SRNL [2]. This report contains the statistical analysis of the floor sample analytical results to determine if further data are needed to reduce uncertainty. Included are comparisons with the prior Mantis samples results [3] to determine if they can be pooled with the current scrape samples to estimate the upper 95% confidence limits (UCL$_{95\%}$) for concentration. Statistical analysis revealed that the Mantis and current scrape sample results are not compatible. Therefore, the Mantis sample results were not used to support the quantification of analytes in the residual material. Significant spatial variability among the current sample results was not found. Constituent concentrations were similar between the North and South hemispheres as well as between the inner and outer regions of the tank floor. The current scrape sample results from all six samples fall within their 3-sigma limits. In view of the results from numerous statistical tests, the data were pooled from all six current scrape samples. As such, an adequate sample size was provided for quantification of the residual material on the floor of Tank 18F. The uncertainty is quantified in this report by an upper 95% confidence limit (UCL$_{95\%}$) on each analyte concentration. The uncertainty in analyte concentration was calculated as a function of the number of samples, the average, and the standard deviation of the analytical results. The UCL$_{95\%}$ was based entirely on the six current scrape sample results (each averaged across three analytical determinations).
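The report's exact formula is not given in this abstract; it states only that the limit depends on the number of samples, the average, and the standard deviation. A minimal sketch, assuming the standard one-sided Student-t upper confidence limit on a mean (mean + t·s/√n), is shown below. The concentrations are made-up placeholders, and `ucl95` is a hypothetical name.

```python
from math import sqrt
from statistics import mean, stdev

# One-sided 95% Student-t quantiles by degrees of freedom (standard table values)
T_95 = {1: 6.314, 2: 2.920, 3: 2.353, 4: 2.132, 5: 2.015, 6: 1.943, 7: 1.895}


def ucl95(results):
    """Upper 95% confidence limit on the mean, assuming the t-based formula."""
    n = len(results)
    return mean(results) + T_95[n - 1] * stdev(results) / sqrt(n)


# Hypothetical concentrations from six samples (arbitrary units)
samples = [10.0, 12.0, 11.0, 13.0, 9.0, 11.0]
print(round(ucl95(samples), 3))
```

With six samples there are five degrees of freedom, so the multiplier on the standard error is 2.015 rather than the large-sample value of 1.645.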
P-Value, a true test of statistical significance? a cautionary note ...
African Journals Online (AJOL)
While it's not the intention of the founders of significance testing and hypothesis testing to have the two ideas intertwined as if they are complementary, the inconvenient marriage of the two practices into one coherent, convenient, incontrovertible and misinterpreted practice has dotted our standard statistics textbooks and ...
Feature-Based Statistical Analysis of Combustion Simulation Data
Energy Technology Data Exchange (ETDEWEB)
Bennett, J; Krishnamoorthy, V; Liu, S; Grout, R; Hawkes, E; Chen, J; Pascucci, V; Bremer, P T
2011-11-18
We present a new framework for feature-based statistical analysis of large-scale scientific data and demonstrate its effectiveness by analyzing features from Direct Numerical Simulations (DNS) of turbulent combustion. Turbulent flows are ubiquitous and account for transport and mixing processes in combustion, astrophysics, fusion, and climate modeling among other disciplines. They are also characterized by coherent structure or organized motion, i.e. nonlocal entities whose geometrical features can directly impact molecular mixing and reactive processes. While traditional multi-point statistics provide correlative information, they lack nonlocal structural information, and hence, fail to provide mechanistic causality information between organized fluid motion and mixing and reactive processes. Hence, it is of great interest to capture and track flow features and their statistics together with their correlation with relevant scalar quantities, e.g. temperature or species concentrations. In our approach we encode the set of all possible flow features by pre-computing merge trees augmented with attributes, such as statistical moments of various scalar fields, e.g. temperature, as well as length-scales computed via spectral analysis. The computation is performed in an efficient streaming manner in a pre-processing step and results in a collection of meta-data that is orders of magnitude smaller than the original simulation data. This meta-data is sufficient to support a fully flexible and interactive analysis of the features, allowing for arbitrary thresholds, providing per-feature statistics, and creating various global diagnostics such as Cumulative Density Functions (CDFs), histograms, or time-series. We combine the analysis with a rendering of the features in a linked-view browser that enables scientists to interactively explore, visualize, and analyze the equivalent of one terabyte of simulation data. We highlight the utility of this new framework for combustion
Statistical learning methods in high-energy and astrophysics analysis
Energy Technology Data Exchange (ETDEWEB)
Zimmermann, J. [Forschungszentrum Juelich GmbH, Zentrallabor fuer Elektronik, 52425 Juelich (Germany) and Max-Planck-Institut fuer Physik, Foehringer Ring 6, 80805 Munich (Germany)]. E-mail: zimmerm@mppmu.mpg.de; Kiesling, C. [Max-Planck-Institut fuer Physik, Foehringer Ring 6, 80805 Munich (Germany)
2004-11-21
We discuss several popular statistical learning methods used in high-energy- and astro-physics analysis. After a short motivation for statistical learning we present the most popular algorithms and discuss several examples from current research in particle- and astro-physics. The statistical learning methods are compared with each other and with standard methods for the respective application.
Statistical learning methods in high-energy and astrophysics analysis
International Nuclear Information System (INIS)
Zimmermann, J.; Kiesling, C.
2004-01-01
We discuss several popular statistical learning methods used in high-energy- and astro-physics analysis. After a short motivation for statistical learning we present the most popular algorithms and discuss several examples from current research in particle- and astro-physics. The statistical learning methods are compared with each other and with standard methods for the respective application
The fuzzy approach to statistical analysis
Coppi, Renato; Gil, Maria A.; Kiers, Henk A. L.
2006-01-01
For the last decades, research studies have been developed in which a coalition of Fuzzy Sets Theory and Statistics has been established with different purposes. These namely are: (i) to introduce new data analysis problems in which the objective involves either fuzzy relationships or fuzzy terms;
Categorical and nonparametric data analysis choosing the best statistical technique
Nussbaum, E Michael
2014-01-01
Featuring in-depth coverage of categorical and nonparametric statistics, this book provides a conceptual framework for choosing the most appropriate type of test in various research scenarios. Class tested at the University of Nevada, the book's clear explanations of the underlying assumptions, computer simulations, and Exploring the Concept boxes help reduce reader anxiety. Problems inspired by actual studies provide meaningful illustrations of the techniques. The underlying assumptions of each test and the factors that impact validity and statistical power are reviewed so readers can explain
Statistical analysis of fatigue crack growth behavior for grade B cast steel
International Nuclear Information System (INIS)
Li, W.; Sakai, T.; Li, Q.; Wang, P.
2011-01-01
Tests for fatigue crack growth rate (FCGR) and crack-tip opening displacement (CTOD) were performed to clarify the fatigue crack growth behavior of a railway grade B cast steel. The threshold values of this steel with specific survival probabilities are evaluated, in which the mean value is 8.3516 MPa·m$^{1/2}$, very similar to the experimental value, about 8.7279 MPa·m$^{1/2}$. Under the conditions of plane strain and small-scale yielding, the values of fracture toughness for this steel with specific survival probabilities are converted from the corresponding critical CTOD values, in which the mean value is about 138.4256 MPa·m$^{1/2}$. In consideration of the inherent variability of crack growth rates, six statistical models are proposed to represent the probabilistic FCGR curves of this steel in the entire crack propagation region from the viewpoints of statistical evaluation of the number of cycles at a given crack size and the crack growth rate at a given stress intensity factor range, the stochastic characteristics of crack growth, and statistical analysis of the coefficient and exponent in the FCGR power law equation. Model adequacy checking shows that all models are basically in good agreement with the test data. Although probabilistic damage-tolerant design based on some models may involve a certain amount of risk in the stable crack propagation region, this accords with the fact that the dispersion of test data in this region is relatively small.
Foundation of statistical energy analysis in vibroacoustics
Le Bot, A
2015-01-01
This title deals with the statistical theory of sound and vibration. The foundation of statistical energy analysis is presented in great detail. In the modal approach, an introduction to random vibration with application to complex systems having a large number of modes is provided. For the wave approach, the phenomena of propagation, group speed, and energy transport are extensively discussed. Particular emphasis is given to the emergence of diffuse field, the central concept of the theory.
An Analysis of Rocket Propulsion Testing Costs
Ramirez-Pagan, Carmen P.; Rahman, Shamim A.
2009-01-01
The primary mission at NASA Stennis Space Center (SSC) is rocket propulsion testing. Such testing is generally performed within two arenas: (1) production testing for certification and acceptance, and (2) developmental testing for prototype or experimental purposes. The customer base consists of NASA programs, DOD programs, and commercial programs. Resources in place to perform on-site testing include both civil servants and contractor personnel, hardware and software including data acquisition and control, and 6 test stands with a total of 14 test positions/cells. For several business reasons there is a need to improve understanding of the test costs for all the various types of test campaigns. Historical propulsion test data were evaluated and analyzed in many different ways with the intent to find any correlation or statistics that could help produce more reliable and accurate cost estimates and projections. The analytical efforts included timeline trends, statistical curve fitting, average cost per test, cost per test second, test cost timeline, and test cost envelopes. Further, the analytical effort included examining the test cost from the perspective of thrust level and test article characteristics. Some of the analytical approaches did not produce evidence strong enough for further analysis. Other approaches yielded promising results and are candidates for further development and focused study. Information was organized into three elements: a Project Profile, a Test Cost Timeline, and a Test Cost Envelope. The Project Profile is a snapshot of the project life cycle in timeline fashion, which includes various statistical analyses. The Test Cost Timeline shows the cumulative average test cost, for each project, at each month where there was test activity. The Test Cost Envelope shows a range of cost for a given number of test(s). The supporting information upon which this study was performed came from diverse sources and thus it was necessary to
DEFF Research Database (Denmark)
Conradsen, Knut; Nielsen, Allan Aasbjerg; Schou, Jesper
2003-01-01
. Based on this distribution, a test statistic for equality of two such matrices and an associated asymptotic probability for obtaining a smaller value of the test statistic are derived and applied successfully to change detection in polarimetric SAR data. In a case study, EMISAR L-band data from April 17...... to HH, VV, or HV data alone, the derived test statistic reduces to the well-known gamma likelihood-ratio test statistic. The derived test statistic and the associated significance value can be applied as a line or edge detector in fully polarimetric SAR data also....
Statistical analysis and digital processing of the Mössbauer spectra
International Nuclear Information System (INIS)
Prochazka, Roman; Tucek, Jiri; Mashlan, Miroslav; Pechousek, Jiri; Tucek, Pavel; Marek, Jaroslav
2010-01-01
This work is focused on using the statistical methods and development of the filtration procedures for signal processing in Mössbauer spectroscopy. Statistical tools for noise filtering in the measured spectra are used in many scientific areas. The use of a pure statistical approach in accumulated Mössbauer spectra filtration is described. In Mössbauer spectroscopy, the noise can be considered as a Poisson statistical process with a Gaussian distribution for high numbers of observations. This noise is a superposition of the non-resonant photons counting with electronic noise (from γ-ray detection and discrimination units), and the velocity system quality that can be characterized by the velocity nonlinearities. The possibility of a noise-reducing process using a new design of statistical filter procedure is described. This mathematical procedure improves the signal-to-noise ratio and thus makes it easier to determine the hyperfine parameters of the given Mössbauer spectra. The filter procedure is based on a periodogram method that makes it possible to assign the statistically important components in the spectral domain. The significance level for these components is then feedback-controlled using the correlation coefficient test results. The estimation of the theoretical correlation coefficient level which corresponds to the spectrum resolution is performed. Correlation coefficient test is based on comparison of the theoretical and the experimental correlation coefficients given by the Spearman method. The correctness of this solution was analyzed by a series of statistical tests and confirmed by many spectra measured with increasing statistical quality for a given sample (absorber). The effect of this filter procedure depends on the signal-to-noise ratio and the applicability of this method has binding conditions
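The correlation coefficient test described above relies on the Spearman method. The authors' filter procedure itself is not specified here, so the sketch below shows only the underlying building block, under the usual definition of Spearman's coefficient as the Pearson correlation of the ranks. The function names and the count values are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks of `values`, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


# Made-up counts: a measured spectrum segment versus its filtered version
measured = [1002, 998, 950, 901, 947, 1001]
filtered = [1000, 999, 952, 905, 949, 1003]
print(round(spearman_rho(measured, filtered), 3))
```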
Statistical analysis and digital processing of the Mössbauer spectra
Prochazka, Roman; Tucek, Pavel; Tucek, Jiri; Marek, Jaroslav; Mashlan, Miroslav; Pechousek, Jiri
2010-02-01
This work is focused on using the statistical methods and development of the filtration procedures for signal processing in Mössbauer spectroscopy. Statistical tools for noise filtering in the measured spectra are used in many scientific areas. The use of a pure statistical approach in accumulated Mössbauer spectra filtration is described. In Mössbauer spectroscopy, the noise can be considered as a Poisson statistical process with a Gaussian distribution for high numbers of observations. This noise is a superposition of the non-resonant photons counting with electronic noise (from γ-ray detection and discrimination units), and the velocity system quality that can be characterized by the velocity nonlinearities. The possibility of a noise-reducing process using a new design of statistical filter procedure is described. This mathematical procedure improves the signal-to-noise ratio and thus makes it easier to determine the hyperfine parameters of the given Mössbauer spectra. The filter procedure is based on a periodogram method that makes it possible to assign the statistically important components in the spectral domain. The significance level for these components is then feedback-controlled using the correlation coefficient test results. The estimation of the theoretical correlation coefficient level which corresponds to the spectrum resolution is performed. Correlation coefficient test is based on comparison of the theoretical and the experimental correlation coefficients given by the Spearman method. The correctness of this solution was analyzed by a series of statistical tests and confirmed by many spectra measured with increasing statistical quality for a given sample (absorber). The effect of this filter procedure depends on the signal-to-noise ratio and the applicability of this method has binding conditions.
Analysis of Statistical Methods and Errors in the Articles Published in the Korean Journal of Pain
Yim, Kyoung Hoon; Han, Kyoung Ah; Park, Soo Young
2010-01-01
Background: Statistical analysis is essential for obtaining objective reliability in medical research. However, medical researchers often do not have enough statistical knowledge to properly analyze their study data. To help understand and potentially alleviate this problem, we have analyzed the statistical methods and errors of articles published in the Korean Journal of Pain (KJP), with the intention of improving the statistical quality of the journal. Methods: All the articles, except case reports and editorials, published from 2004 to 2008 in the KJP were reviewed. The types of applied statistical methods and errors in the articles were evaluated. Results: One hundred and thirty-nine original articles were reviewed. Inferential statistics and descriptive statistics were used in 119 papers and 20 papers, respectively. Only 20.9% of the papers were free from statistical errors. The most commonly adopted statistical method was the t-test (21.0%), followed by the chi-square test (15.9%). Errors of omission were encountered 101 times in 70 papers. Among the errors of omission, "no statistics used even though statistical methods were required" was the most common (40.6%). Errors of commission were encountered 165 times in 86 papers, among which "parametric inference for nonparametric data" was the most common (33.9%). Conclusions: We found various types of statistical errors in the articles published in the KJP. This suggests that meticulous attention should be given not only to applying statistical procedures but also to the reviewing process, to improve the value of the articles. PMID:20552071
International Nuclear Information System (INIS)
Ihara, Hitoshi; Nishimura, Hideo; Ikawa, Koji; Miura, Nobuyuki; Iwanaga, Masayuki; Kusano, Toshitsugu.
1988-03-01
A Near-Real-Time Materials Accountancy (NRTA) system has been developed as an advanced safeguards measure for the PNC Tokai Reprocessing Plant, and a minicomputer system for NRTA data processing was designed and constructed. A full-scale field test was carried out as a JASPAS (Japan Support Program for Agency Safeguards) project with the Agency's participation, using the NRTA data processing system. With the field test data, the detection power under real circumstances was investigated for five statistical tests: a significance test of MUF, the CUMUF test, the average loss test, the MUF residual test, and Page's test on MUF residuals. The results show that the CUMUF test, the average loss test, the MUF residual test, and Page's test on MUF residuals are useful for detecting a significant loss or diversion. An unmeasured inventory estimation model for the PNC reprocessing plant was also developed in this study. Using this model, the field test data from the C-1 to 85-2 campaigns were re-analyzed. (author)
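Two of the tests named above can be illustrated compactly. The sketch below assumes the textbook one-sided form of Page's test (a CUSUM with reference value k and decision threshold h) applied to standardized residuals, and computes CUMUF as a running sum of MUF values. The parameter values and the residual series are illustrative assumptions, not figures from the field test.

```python
from itertools import accumulate


def cumuf(muf_values):
    """Cumulative MUF: running sum over successive balance periods."""
    return list(accumulate(muf_values))


def page_test(residuals, k=0.5, h=3.0):
    """One-sided Page (CUSUM) test on standardized residuals.

    Returns the index of the first period at which the statistic exceeds h,
    or None if no alarm is raised. k and h are assumed tuning values.
    """
    s = 0.0
    for i, r in enumerate(residuals):
        s = max(0.0, s + r - k)   # accumulate only evidence of a positive shift
        if s > h:
            return i
    return None


# Hypothetical standardized MUF residuals with a sustained loss from period 4 on
residuals = [0.2, -0.4, 0.1, -0.3, 1.8, 2.1, 1.6, 2.0]
print(cumuf(residuals))
print(page_test(residuals))
```

Raising h lowers the false alarm rate at the cost of a longer delay before a sustained loss is flagged.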
Robust statistics and geochemical data analysis
International Nuclear Information System (INIS)
Di, Z.
1987-01-01
Advantages of robust procedures over ordinary least-squares procedures in geochemical data analysis are demonstrated using NURE data from the Hot Springs Quadrangle, South Dakota, USA. Robust principal components analysis with 5% multivariate trimming successfully guarded the analysis against perturbations by outliers and increased the number of interpretable factors. Regression with SINE estimates significantly increased the goodness-of-fit of the regression and improved the correspondence of delineated anomalies with known uranium prospects. Because of the ubiquitous existence of outliers in geochemical data, robust statistical procedures are suggested as routine procedures to replace ordinary least-squares procedures.
"What If" Analyses: Ways to Interpret Statistical Significance Test Results Using EXCEL or "R"
Ozturk, Elif
2012-01-01
The present paper aims to review two motivations to conduct "what if" analyses using Excel and "R" to understand the statistical significance tests through the sample size context. "What if" analyses can be used to teach students what statistical significance tests really do and in applied research either prospectively to estimate what sample size…
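A minimal sketch of a "what if" analysis of this kind, written in Python rather than the Excel or R used in the paper. The standardized effect size is held fixed while the per-group sample size varies, and the two-sided p-value is recomputed each time. The function name, the effect size of 0.3 and the use of a known-variance two-sample z-test (instead of a t-test) are assumptions made for illustration only.

```python
from math import sqrt
from statistics import NormalDist

def what_if_p_value(effect_size: float, n_per_group: int) -> float:
    """Two-sided p-value of a two-sample z-test with equal group sizes.

    effect_size is the standardized mean difference (difference / SD),
    an assumed quantity chosen by the analyst.
    """
    z = effect_size * sqrt(n_per_group / 2)
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Same assumed effect, different sample sizes: only n changes.
for n in (10, 50, 100, 200, 400):
    print(n, round(what_if_p_value(0.3, n), 4))
```

Reading down the printed column shows how the same effect moves from "not significant" to "significant" purely through the sample size.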
Data Analysis & Statistical Methods for Command File Errors
Meshkat, Leila; Waggoner, Bruce; Bryant, Larry
2014-01-01
This paper explains current work on modeling for managing the risk of command file errors. It is focused on analyzing actual data from a JPL spaceflight mission to build models for evaluating and predicting error rates as a function of several key variables. We constructed a rich dataset by considering the number of errors, the number of files radiated, including the number of commands and blocks in each file, as well as subjective estimates of workload and operational novelty. We have assessed these data using different curve fitting and distribution fitting techniques, such as multiple regression analysis and maximum likelihood estimation, to see how much of the variability in the error rates can be explained with these. We have also used goodness-of-fit testing strategies and principal component analysis to further assess our data. Finally, we constructed a model of expected error rates based on what these statistics bore out as critical drivers of the error rate. This model allows project management to evaluate the error rate against a theoretically expected rate as well as anticipate future error rates.
Testing statistical significance scores of sequence comparison methods with structure similarity
Directory of Open Access Journals (Sweden)
Leunissen Jack AM
2006-10-01
Full Text Available Abstract Background In the past years the Smith-Waterman sequence comparison algorithm has gained popularity due to improved implementations and rapidly increasing computing power. However, the quality and sensitivity of a database search is not only determined by the algorithm but also by the statistical significance testing for an alignment. The e-value is the most commonly used statistical validation method for sequence database searching. The CluSTr database and the Protein World database have been created using an alternative statistical significance test: a Z-score based on Monte-Carlo statistics. Several papers have described the superiority of the Z-score as compared to the e-value, using simulated data. We were interested in whether this could be validated when applied to existing, evolutionarily related protein sequences. Results All experiments are performed on the ASTRAL SCOP database. The Smith-Waterman sequence comparison algorithm with both e-value and Z-score statistics is evaluated, using ROC, CVE and AP measures. The BLAST and FASTA algorithms are used as reference. We find that two out of three Smith-Waterman implementations with e-value are better at predicting structural similarities between proteins than the Smith-Waterman implementation with Z-score. SSEARCH especially has very high scores. Conclusion The compute-intensive Z-score does not have a clear advantage over the e-value. The Smith-Waterman implementations give generally better results than their heuristic counterparts. We recommend using the SSEARCH algorithm combined with e-values for pairwise sequence comparisons.
SOCR: Statistics Online Computational Resource
Directory of Open Access Journals (Sweden)
Ivo D. Dinov
2006-10-01
Full Text Available The need for hands-on computer laboratory experience in undergraduate and graduate statistics education has been firmly established in the past decade. As a result, a number of attempts have been undertaken to develop novel approaches for problem-driven statistical thinking, data analysis and result interpretation. In this paper we describe an integrated educational web-based framework for: interactive distribution modeling, virtual online probability experimentation, statistical data analysis, visualization and integration. Following years of experience in statistical teaching at all college levels using established licensed statistical software packages, like STATA, S-PLUS, R, SPSS, SAS, Systat, etc., we have attempted to engineer a new statistics education environment, the Statistics Online Computational Resource (SOCR). This resource performs many of the standard types of statistical analysis, much like other classical tools. In addition, it is designed in a plug-in object-oriented architecture and is completely platform independent, web-based, interactive, extensible and secure. Over the past 4 years we have tested, fine-tuned and reanalyzed the SOCR framework in many of our undergraduate and graduate probability and statistics courses and have evidence that SOCR resources build students' intuition and enhance their learning.
Statistical analysis of cone penetration resistance of railway ballast
Directory of Open Access Journals (Sweden)
Saussine Gilles
2017-01-01
Full Text Available Dynamic penetrometer tests are widely used in geotechnical studies for soil characterization but their implementation tends to be difficult. The light penetrometer test is able to give information about a cone resistance useful in the field of geotechnics and recently validated as a parameter for the case of coarse granular materials. In order to characterize directly the railway ballast on track and sublayers of ballast, a huge test campaign has been carried out for more than 5 years in order to build up a database composed of 19,000 penetration tests including endoscopic video records on the French railway network. The main objective of this work is to give a first statistical analysis of cone resistance in the coarse granular layer which represents a major component of railway track: the ballast. The results show that the cone resistance (qd) increases with depth and presents strong variations corresponding to layers of different natures identified using the endoscopic records. In the first zone corresponding to the top 30 cm, (qd) increases linearly with a slope of around 1 MPa/cm for fresh ballast and fouled ballast. In the second zone below 30 cm deep, (qd) increases more slowly with a slope of around 0.3 MPa/cm and decreases below 50 cm. These results show that there is no clear difference between fresh and fouled ballast. Hence, the (qd) sensitivity is important and increases with depth. The (qd) distribution for a set of tests does not follow a normal distribution. In the upper 30 cm layer of ballast of track, data statistical treatment shows that train load and speed do not have any significant impact on the (qd) distribution for clean ballast; they increase by 50% the average value of (qd) for fouled ballast and increase the thickness as well. Below the 30 cm upper layer, train load and speed have a clear impact on the (qd) distribution.
Quantitative Evaluation of gamma-Spectrum Analysis Methods using IAEA Test Spectra
DEFF Research Database (Denmark)
Nielsen, Sven Poul
1982-01-01
A description is given of a γ-spectrum analysis method based on nonlinear least-squares fitting. The quality of the method is investigated by using statistical tests on the results from analyses of IAEA test spectra. By applying an empirical correction factor of 0.75 to the calculated peak-area u...
A knowledge-based T2-statistic to perform pathway analysis for quantitative proteomic data.
Lai, En-Yu; Chen, Yi-Hau; Wu, Kun-Pin
2017-06-01
Approaches to identify significant pathways from high-throughput quantitative data have been developed in recent years. Still, the analysis of proteomic data stays difficult because of limited sample size. This limitation also leads to the practice of using a competitive null as common approach; which fundamentally implies genes or proteins as independent units. The independent assumption ignores the associations among biomolecules with similar functions or cellular localization, as well as the interactions among them manifested as changes in expression ratios. Consequently, these methods often underestimate the associations among biomolecules and cause false positives in practice. Some studies incorporate the sample covariance matrix into the calculation to address this issue. However, sample covariance may not be a precise estimation if the sample size is very limited, which is usually the case for the data produced by mass spectrometry. In this study, we introduce a multivariate test under a self-contained null to perform pathway analysis for quantitative proteomic data. The covariance matrix used in the test statistic is constructed by the confidence scores retrieved from the STRING database or the HitPredict database. We also design an integrating procedure to retain pathways of sufficient evidence as a pathway group. The performance of the proposed T2-statistic is demonstrated using five published experimental datasets: the T-cell activation, the cAMP/PKA signaling, the myoblast differentiation, and the effect of dasatinib on the BCR-ABL pathway are proteomic datasets produced by mass spectrometry; and the protective effect of myocilin via the MAPK signaling pathway is a gene expression dataset of limited sample size. Compared with other popular statistics, the proposed T2-statistic yields more accurate descriptions in agreement with the discussion of the original publication. We implemented the T2-statistic into an R package T2GA, which is available at https
Statistical analysis of laser-interferometric detector Dylkin-1 data and data on seismic activity
International Nuclear Information System (INIS)
Kirillov, R S; Bochkarev, V V; Skochilov, A F
2014-01-01
This work presents a statistical analysis of data collected from the laser interferometric detector "Dylkin-1" and nearby seismic stations. The final goal of the Dylkin project is to create a detector of theoretically predicted gravitational waves produced by binary relativistic astrophysical objects. Currently, work is underway to improve the sensitivity of the detector by 2-3 orders of magnitude. The goals of this research were to test the isolation of the detector from noise caused by seismic waves and to find out whether it is sensitive to variations in the gradient of the gravitational potential (acceleration of free fall) caused by free Earth oscillations. Noise isolation has been tested by comparing the energy of signals during significant seismic events. Sensitivity to variations in the acceleration of free fall has been tested by means of cross-spectral analysis.
An adaptive Mantel-Haenszel test for sensitivity analysis in observational studies.
Rosenbaum, Paul R; Small, Dylan S
2017-06-01
In a sensitivity analysis in an observational study with a binary outcome, is it better to use all of the data or to focus on subgroups that are expected to experience the largest treatment effects? The answer depends on features of the data that may be difficult to anticipate, a trade-off between unknown effect-sizes and known sample sizes. We propose a sensitivity analysis for an adaptive test similar to the Mantel-Haenszel test. The adaptive test performs two highly correlated analyses, one focused analysis using a subgroup, one combined analysis using all of the data, correcting for multiple testing using the joint distribution of the two test statistics. Because the two component tests are highly correlated, this correction for multiple testing is small compared with, for instance, the Bonferroni inequality. The test has the maximum design sensitivity of two component tests. A simulation evaluates the power of a sensitivity analysis using the adaptive test. Two examples are presented. An R package, sensitivity2x2xk, implements the procedure. © 2016, The International Biometric Society.
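The claim that a multiplicity correction based on the joint distribution is smaller than Bonferroni for highly correlated statistics can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the procedure of the sensitivity2x2xk package: the two test statistics are assumed to be standard bivariate normal with correlation 0.9, the test is one-sided at level 0.05, and the function name is hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist

def joint_critical_value(rho: float, alpha: float = 0.05,
                         n_sim: int = 100_000, seed: int = 0) -> float:
    """Monte Carlo critical value c with P(max(Z1, Z2) > c) ~= alpha,
    where (Z1, Z2) is standard bivariate normal with correlation rho."""
    rng = random.Random(seed)
    maxima = []
    for _ in range(n_sim):
        z1 = rng.gauss(0.0, 1.0)
        # Build Z2 so that corr(Z1, Z2) = rho and Var(Z2) = 1.
        z2 = rho * z1 + sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        maxima.append(max(z1, z2))
    maxima.sort()
    return maxima[int((1.0 - alpha) * n_sim)]

alpha = 0.05
single = NormalDist().inv_cdf(1 - alpha)          # one test, no correction
bonferroni = NormalDist().inv_cdf(1 - alpha / 2)  # two tests, Bonferroni
joint = joint_critical_value(rho=0.9, alpha=alpha)
print(single, joint, bonferroni)
```

Under these assumptions the simulated joint critical value falls between the uncorrected and the Bonferroni values, which is the sense in which the correction is "small".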
Energy Technology Data Exchange (ETDEWEB)
Timashev, Svyatoslav A.; Bushinskaya, Anna V. [Russian Academy of Sciences, Ekaterinburg (Russian Federation). Ural Branch. Sciences and Engineering Center ' Reliability and Safety of Large Systems and Machines'
2009-07-01
The paper discusses current possibilities and drawbacks of in-line inspection (ILI) in sizing defects in oil and gas pipelines. A methodology based on analysis of variances (ANOVA) is presented that extracts maximum possible information from the ILI measurements of defects and subsequent verification results. This full statistical analysis (FSA) methodology was extensively tested by using the Monte Carlo simulation method. It was then applied to analyze the content of sections 7, 9 and appendix E of the API 1163 RP Standard. (author)
Palazón, L; Navas, A
2017-06-01
Information on sediment contribution and transport dynamics from the contributing catchments is needed to develop management plans to tackle environmental problems related with effects of fine sediment such as reservoir siltation. In this respect, the fingerprinting technique is an indirect technique known to be valuable and effective for sediment source identification in river catchments. Large variability in sediment delivery was found in previous studies in the Barasona catchment (1509 km², Central Spanish Pyrenees). Simulation results with SWAT and fingerprinting approaches identified badlands and agricultural uses as the main contributors to sediment supply in the reservoir. In this study the Kruskal-Wallis H-test and (3) principal components analysis. Source contribution results were different between assessed options with the greatest differences observed for option using #3, including the two step process: principal components analysis and discriminant function analysis. The characteristics of the solutions by the applied mixing model and the conceptual understanding of the catchment showed that the most reliable solution was achieved using #2, the two step process of Kruskal-Wallis H-test and discriminant function analysis. The assessment showed the importance of the statistical procedure used to define the optimum composite fingerprint for sediment fingerprinting applications. Copyright © 2016 Elsevier Ltd. All rights reserved.
Directory of Open Access Journals (Sweden)
Keykhosrow Keymanesh
2009-06-01
Full Text Available Modern biotechnology, based on recombinant DNA techniques, has made it possible to introduce new traits with great potential for crop improvement. However, concerns about unintended effects of gene transformation that may threaten the environment or consumer health have persuaded scientists to set up pre-release tests on genetically modified organisms. Assessment of the 'substantial equivalence' concept, which is established by comparing a genetically modified organism with a comparator that has a history of safe use, could be the first step of a comprehensive risk assessment. The metabolite level most richly reflects changes that stem from genetic or environmental factors. Since assessment of all metabolites in detail is very costly and practically impossible, statistical evaluation of processed data of grain spectroscopic values could be a time- and cost-effective substitute for complex chemical analysis. To investigate the ability of multivariate statistical techniques to compare metabolomes, as well as to test a method for such comparisons with available tools, a transgenic rice and its traditionally bred parent were used as test material. Discriminant analysis was applied as the supervised method and principal component analysis as the unsupervised classification method on the processed data, which were extracted from Fourier transform infrared spectroscopy and nuclear magnetic resonance spectral data of powdered rice, rice extract and barley grain samples, of which the latter was considered as control. The results confirmed the capability of statistics, even with initial data processing applications, in metabolome studies. Meanwhile, this study confirms that the supervised method gives more distinctive results.
International Nuclear Information System (INIS)
Kurisaka, Kenichi
1999-11-01
The objective of this study is to develop fundamental data for examining the efficiency of preventive maintenance and surveillance tests from the standpoint of failure probability. In this study, a pneumatic valve in sodium cooling systems was selected as a major standby component. A statistical analysis was made of the trend of the valve failure-to-open/close (FTOC) probability depending on the number of demands ('n'), time since installation ('t') and standby time since last open/close action ('T'). The analysis is based on the field data of operating and failure experience stored in the Component Reliability Database and Statistical Analysis System for LMFBR's (CORDS). In the analysis, the FTOC probability ('P') was expressed as follows: $P = 1 - \exp\{-C - En - F/n - \lambda T - aT(t - T/2) - AT^{2}/2\}$. The functional parameters, 'C', 'E', 'F', 'λ', 'a' and 'A', were estimated with the maximum likelihood estimation method. As a result, the FTOC probability is almost expressed with the failure probability being derived from the failure rate under assumption of the Poisson distribution only when the valve cycle (i.e. open-close-open cycle) exceeds about 100 days. When the valve cycle is shorter than about 100 days, the FTOC probability can be adequately estimated with the parameter model proposed in this study. The results obtained from this study may make it possible to derive an adequate frequency of surveillance tests for a given target of the FTOC probability. (author)
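The parametric FTOC model quoted in this abstract can be evaluated directly once the six parameters are known. The sketch below assumes the exponent reads C + En + F/n + λT + aT(t - T/2) + AT²/2; the function name and all numeric values are hypothetical placeholders, not estimates from the CORDS data.

```python
from math import exp

def ftoc_probability(n: float, t: float, T: float,
                     C: float, E: float, F: float,
                     lam: float, a: float, A: float) -> float:
    """P = 1 - exp{-C - E*n - F/n - lam*T - a*T*(t - T/2) - A*T**2/2}.

    n: number of demands (must be non-zero), t: time since installation,
    T: standby time since the last open/close action.
    """
    exponent = (C + E * n + F / n + lam * T
                + a * T * (t - T / 2) + A * T ** 2 / 2)
    return 1.0 - exp(-exponent)

# Hypothetical parameter values, for illustration only.
p = ftoc_probability(n=50, t=3650.0, T=30.0,
                     C=1e-4, E=1e-6, F=1e-3, lam=1e-5, a=1e-9, A=1e-8)
print(p)
```

With every parameter except λ set to zero, the expression reduces to 1 - exp(-λT), the constant-failure-rate case the abstract mentions for long valve cycles.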
Induction of micronuclei in hemocytes of Mytilus edulis and statistical analysis
DEFF Research Database (Denmark)
Wrisberg, M. N.; Bilbo, Carl M.; Spliid, Henrik
1992-01-01
biological variation, emphasizing the importance of application of a correct statistical method. A systematic approach to the statistical evaluation of the mussel MN test is outlined. The statistical model includes three different situations: (a) estimation of parameters of a single sample, (b) estimation...
Determination of Geometrical REVs Based on Volumetric Fracture Intensity and Statistical Tests
Directory of Open Access Journals (Sweden)
Ying Liu
2018-05-01
Full Text Available This paper presents a method to estimate a representative element volume (REV) of a fractured rock mass based on the volumetric fracture intensity P32 and statistical tests. A 150 m × 80 m × 50 m 3D fracture network model was generated based on field data collected at the Maji dam site by using the rectangular window sampling method. The volumetric fracture intensity P32 of each cube was calculated by varying the cube location in the generated 3D fracture network model and varying the cube side length from 1 to 20 m, and the distribution of the P32 values was described. The size effect and spatial effect of the fractured rock mass were studied; the P32 values from the same cube sizes and different locations were significantly different, and the fluctuation in P32 values clearly decreases as the cube side length increases. In this paper, a new method that comprehensively considers the anisotropy of rock masses, simplicity of calculation and differences between different methods was proposed to estimate the geometrical REV size. The geometrical REV size of the fractured rock mass was determined based on the volumetric fracture intensity P32 and two statistical test methods, namely, the likelihood ratio test and the Wald–Wolfowitz runs test. The results of the two statistical tests were substantially different; critical cube sizes of 13 m and 12 m were estimated by the Wald–Wolfowitz runs test and the likelihood ratio test, respectively. Because the different test methods emphasize different considerations and impact factors, and a result that both tests accept was required, the larger cube size, 13 m, was selected as the geometrical REV size of the fractured rock mass at the Maji dam site in China.
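For readers unfamiliar with the Wald–Wolfowitz runs test named above, a minimal sketch follows. It implements the one-sample version (randomness of a sequence about its median) with the usual normal approximation; the paper may use a different variant, and the function name and any input data are assumptions for illustration.

```python
from math import sqrt, erfc
from statistics import median

def runs_test(values):
    """Wald-Wolfowitz runs test for randomness about the median.

    Returns (number_of_runs, z, two_sided_p) using the normal
    approximation. Values equal to the median are dropped.
    """
    m = median(values)
    signs = [v > m for v in values if v != m]
    n1 = sum(signs)              # observations above the median
    n2 = len(signs) - n1         # observations below the median
    n = n1 + n2
    if n1 == 0 or n2 == 0 or n < 3:
        raise ValueError("too few observations on both sides of the median")
    # A new run starts wherever neighbouring signs differ.
    runs = 1 + sum(a != b for a, b in zip(signs, signs[1:]))
    mean = 2 * n1 * n2 / n + 1
    var = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n * n * (n - 1))
    z = (runs - mean) / sqrt(var)
    p = erfc(abs(z) / sqrt(2))   # two-sided p-value
    return runs, z, p
```

Too few runs (a trend) or too many runs (regular alternation) both give a large |z| and a small p-value.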
Kleijnen, J.P.C.
1995-01-01
This tutorial discusses what-if analysis and optimization of System Dynamics models. These problems are solved, using the statistical techniques of regression analysis and design of experiments (DOE). These issues are illustrated by applying the statistical techniques to a System Dynamics model for
Multivariate Statistical Methods as a Tool of Financial Analysis of Farm Business
Czech Academy of Sciences Publication Activity Database
Novák, J.; Sůvová, H.; Vondráček, Jiří
2002-01-01
Vol. 48, No. 1 (2002), pp. 9-12 ISSN 0139-570X Institutional research plan: AV0Z1030915 Keywords : financial analysis * financial ratios * multivariate statistical methods * correlation analysis * discriminant analysis * cluster analysis Subject RIV: BB - Applied Statistics, Operational Research
Directory of Open Access Journals (Sweden)
Gu Jianying
2004-05-01
Full Text Available Abstract In spite of only a 1-2 per cent genomic DNA sequence difference, humans and chimpanzees differ considerably in behaviour and cognition. Affymetrix microarray technology provides a novel approach to addressing a long-term debate on whether the difference between humans and chimpanzees results from the alteration of gene expressions. Here, we used several statistical methods (distance method, two-sample t-tests, regularised t-tests, ANOVA and bootstrapping) to detect the differential expression pattern between humans and great apes. Our analysis shows that the pattern we observed before is robust against various statistical methods; that is, the pronounced expression changes occurred on the human lineage after the split from chimpanzees, and that the dramatic brain expression alterations in humans may be mainly driven by a set of genes with increased expression (up-regulated) rather than decreased expression (down-regulated).
Energy Technology Data Exchange (ETDEWEB)
Kančev, Duško, E-mail: dusko.kancev@ec.europa.eu [European Commission, DG-JRC, Institute for Energy and Transport, P.O. Box 2, NL-1755 ZG Petten (Netherlands); Duchac, Alexander; Zerger, Benoit [European Commission, DG-JRC, Institute for Energy and Transport, P.O. Box 2, NL-1755 ZG Petten (Netherlands); Maqua, Michael [Gesellschaft für Anlagen-und-Reaktorsicherheit (GRS) mbH, Schwetnergasse 1, 50667 Köln (Germany); Wattrelos, Didier [Institut de Radioprotection et de Sûreté Nucléaire (IRSN), BP 17 - 92262 Fontenay-aux-Roses Cedex (France)
2014-07-01
Highlights: • Analysis of operating experience related to emergency diesel generators events at NPPs. • Four abundant operating experience databases screened. • Delineating important insights and conclusions based on the operating experience. - Abstract: This paper is aimed at studying the operating experience related to emergency diesel generators (EDGs) events at nuclear power plants collected from the past 20 years. Events related to EDG failures and/or unavailability, as well as all the supporting equipment, are the focus of the analysis. The selected operating experience was analyzed in detail in order to identify the type of failures, attributes that contributed to the failure, failure modes potential or real, discuss risk relevance, summarize important lessons learned, and provide recommendations. The study in this particular paper is closely tied to performing a statistical analysis of the operating experience. For the purpose of this study, EDG failure is defined as EDG failure to function on demand (i.e. fail to start, fail to run) or during testing, or an unavailability of an EDG, except for unavailability due to regular maintenance. The Gesellschaft für Anlagen und Reaktorsicherheit mbH (GRS) and Institut de Radioprotection et de Sûreté Nucléaire (IRSN) databases as well as the operating experience contained in the IAEA/NEA International Reporting System for Operating Experience and the U.S. Licensee Event Reports were screened. The screening methodology applied for each of the four different databases is presented. Further on, analyses aimed at delineating the causes, root causes, contributing factors and consequences are performed. A statistical analysis was performed related to the chronology of events, types of failures, the operational circumstances of detection of the failure and the affected components/subsystems. The conclusions and results of the statistical analysis are discussed. The main findings concerning the testing
Chang, Cheng; Xu, Kaikun; Guo, Chaoping; Wang, Jinxia; Yan, Qi; Zhang, Jian; He, Fuchu; Zhu, Yunping
2018-05-22
Compared with the numerous software tools developed for identification and quantification of -omics data, there remains a lack of suitable tools for both downstream analysis and data visualization. To help researchers better understand the biological meanings in their -omics data, we present an easy-to-use tool, named PANDA-view, for both statistical analysis and visualization of quantitative proteomics data and other -omics data. PANDA-view contains various kinds of analysis methods such as normalization, missing value imputation, statistical tests, clustering and principal component analysis, as well as the most commonly-used data visualization methods including an interactive volcano plot. Additionally, it provides user-friendly interfaces for protein-peptide-spectrum representation of the quantitative proteomics data. PANDA-view is freely available at https://sourceforge.net/projects/panda-view/. 1987ccpacer@163.com and zhuyunping@gmail.com. Supplementary data are available at Bioinformatics online.
Statistics Analysis Measures Painting of Cooling Tower
Directory of Open Access Journals (Sweden)
A. Zacharopoulou
2013-01-01
Full Text Available This study refers to the cooling tower of Megalopolis (construction 1975 and protection from corrosive environment. The maintenance of the cooling tower took place in 2008. The cooling tower was badly damaged from corrosion of reinforcement. The parabolic cooling towers (factory of electrical power are a typical example of construction, which has a special aggressive environment. The protection of cooling towers is usually achieved through organic coatings. Because of the different environmental impacts on the internal and external side of the cooling tower, a different system of paint application is required. The present study refers to the damages caused by corrosion process. The corrosive environments, the application of this painting, the quality control process, the measures and statistics analysis, and the results were discussed in this study. In the process of quality control the following measurements were taken into consideration: (1 examination of the adhesion with the cross-cut test, (2 examination of the film thickness, and (3 controlling of the pull-off resistance for concrete substrates and paintings. Finally, this study refers to the correlations of measurements, analysis of failures in relation to the quality of repair, and rehabilitation of the cooling tower. Also this study made a first attempt to apply the specific corrosion inhibitors in such a large structure.
Using Relative Statistics and Approximate Disease Prevalence to Compare Screening Tests.
Samuelson, Frank; Abbey, Craig
2016-11-01
Schatzkin et al. and other authors demonstrated that the ratios of some conditional statistics such as the true positive fraction are equal to the ratios of unconditional statistics, such as disease detection rates, and therefore we can calculate these ratios between two screening tests on the same population even if negative test patients are not followed with a reference procedure and the true and false negative rates are unknown. We demonstrate that this same property applies to an expected utility metric. We also demonstrate that while simple estimates of relative specificities and relative areas under ROC curves (AUC) do depend on the unknown negative rates, we can write these ratios in terms of disease prevalence, and the dependence of these ratios on a posited prevalence is often weak particularly if that prevalence is small or the performance of the two screening tests is similar. Therefore we can estimate relative specificity or AUC with little loss of accuracy, if we use an approximate value of disease prevalence.
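The first result in this abstract rests on a simple cancellation, which the following sketch makes concrete. For two tests applied to the same population, the true positive fraction is detected cases divided by the (unknown) number of diseased subjects, so that number drops out of the ratio. The function name and the counts are hypothetical.

```python
def relative_tpf(detected_a: int, detected_b: int) -> float:
    """Ratio TPF_A / TPF_B for two tests applied to the same population.

    TPF = detected / n_diseased, so the unknown n_diseased cancels and the
    ratio equals the ratio of detection counts (or detection rates).
    """
    return detected_a / detected_b

# Hypothetical counts of cases detected by tests A and B in one cohort.
detected_a, detected_b = 40, 32
for n_diseased in (50, 80, 200):   # the true number is unknown in practice
    tpf_a = detected_a / n_diseased
    tpf_b = detected_b / n_diseased
    print(n_diseased, tpf_a / tpf_b, relative_tpf(detected_a, detected_b))
```

Whatever value is posited for the number of diseased subjects, the printed ratio stays the same; relative specificity does not share this property, which is why the abstract brings in an approximate prevalence.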
Usage of Latent Class Analysis in Diagnostic Microbiology in the Absence of Gold Standard Test
Directory of Open Access Journals (Sweden)
Gul Bayram Abiha
2016-12-01
Full Text Available Evaluating the performance of diagnostic tests in the absence of a gold standard is an important problem. Latent class analysis (LCA) is a statistical method that has been known for many years and has found wide application, especially for evaluating diagnostic tests when no gold standard is available. During the last decade, LCA has been widely used to determine the sensitivity and specificity of different microbiological tests. It has been investigated in the diagnosis of Mycobacterium tuberculosis, Mycobacterium bovis, human papillomavirus, Bordetella pertussis, influenza viruses, hepatitis E virus (HEV), hepatitis C virus (HCV) and various other viral infections. Researchers have used LCA to compare several diagnostic tests for different pathogens. We aimed to evaluate the performance of LCA as used in the microbiological diagnosis of various diseases across several studies. Taking all of these results into account, we conclude that LCA is a good statistical method for assessing the performance of different tests in the absence of a gold standard. [Archives Medical Review Journal 2016; 25(4): 467-488]
Statistical analysis on experimental calibration data for flowmeters in pressure pipes
Lazzarin, Alessandro; Orsi, Enrico; Sanfilippo, Umberto
2017-08-01
This paper presents a statistical analysis of experimental calibration data for flowmeters (electromagnetic, ultrasonic and turbine flowmeters) in pressure pipes. The experimental calibration data set consists of the whole archive of the calibration tests carried out on 246 flowmeters from January 2001 to October 2015 at Settore Portate of Laboratorio di Idraulica “G. Fantoli” of Politecnico di Milano, which is accredited as LAT 104 for a flow range between 3 l/s and 80 l/s, with a certified Calibration and Measurement Capability (CMC), formerly known as Best Measurement Capability (BMC), equal to 0.2%. The data set is split into three subsets, consisting respectively of 94 electromagnetic, 83 ultrasonic and 69 turbine flowmeters; each subset is analysed separately from the others, and a final comparison is then carried out. In particular, the main focus of the statistical analysis is the correction C, that is, the difference between the flow rate Q measured by the calibration facility (through the accredited procedures and the certified reference specimen) and the flow rate QM simultaneously recorded by the flowmeter under calibration, expressed as a percentage of QM.
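The correction C defined in this abstract is a simple percentage, C = 100 (Q - QM) / QM. A minimal sketch follows; the function name and the sample flow rates are hypothetical and are not taken from the calibration archive.

```python
def correction_percent(q_reference, q_meter):
    """Correction C in percent: 100 * (Q - QM) / QM.

    q_reference: flow rate Q measured by the calibration facility (l/s)
    q_meter:     flow rate QM recorded by the flowmeter under test (l/s)
    """
    if q_meter == 0:
        raise ValueError("QM must be non-zero")
    return 100.0 * (q_reference - q_meter) / q_meter


# Hypothetical calibration points (Q, QM) in l/s for one flowmeter.
points = [(50.0, 49.8), (20.0, 20.1), (80.0, 80.0)]
corrections = [correction_percent(q, qm) for q, qm in points]

for (q, qm), c in zip(points, corrections):
    print(f"Q = {q:5.1f} l/s, QM = {qm:5.1f} l/s, C = {c:+.3f} %")
```

A positive C means the flowmeter under-reads relative to the facility, and a negative C means it over-reads.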
Statistical analysis of environmental data
International Nuclear Information System (INIS)
Beauchamp, J.J.; Bowman, K.O.; Miller, F.L. Jr.
1975-10-01
This report summarizes the analyses of data obtained by the Radiological Hygiene Branch of the Tennessee Valley Authority from samples taken around the Browns Ferry Nuclear Plant located in Northern Alabama. Data collection began in 1968, and a wide variety of sample types have been gathered on a regular basis. The statistical analysis of environmental data involving very low levels of radioactivity is discussed. Applications of computer calculations for data processing are described.
Improved score statistics for meta-analysis in single-variant and gene-level association studies.
Yang, Jingjing; Chen, Sai; Abecasis, Gonçalo
2018-06-01
Meta-analysis is now an essential tool for genetic association studies, allowing them to combine large studies and greatly accelerating the pace of genetic discovery. Although the standard meta-analysis methods perform equivalently to the more cumbersome joint analysis under ideal settings, they result in substantial power loss under unbalanced settings with various case-control ratios. Here, we investigate the power loss caused by the standard meta-analysis methods for unbalanced studies, and further propose novel meta-analysis methods that perform equivalently to the joint analysis under both balanced and unbalanced settings. We derive improved meta-score-statistics that can accurately approximate the joint-score-statistics with combined individual-level data, for both linear and logistic regression models, with and without covariates. In addition, we propose a novel approach to adjust for population stratification by correcting for known population structures through minor allele frequencies. In the simulated gene-level association studies under unbalanced settings, our method recovered up to 85% of the power loss caused by the standard methods. We further showed the power gain of our methods in gene-level tests with 26 unbalanced studies of age-related macular degeneration. In addition, we took the meta-analysis of three unbalanced studies of type 2 diabetes as an example to discuss the challenges of meta-analyzing multi-ethnic samples. In summary, our improved meta-score-statistics with corrections for population stratification can be used to construct both single-variant and gene-level association studies, providing a useful framework for ensuring well-powered, convenient, cross-study analyses. © 2018 WILEY PERIODICALS, INC.
Highly Robust Statistical Methods in Medical Image Analysis
Czech Academy of Sciences Publication Activity Database
Kalina, Jan
2012-01-01
Vol. 32, No. 2 (2012), pp. 3-16. ISSN 0208-5216. R&D Projects: GA MŠk(CZ) 1M06014. Institutional research plan: CEZ:AV0Z10300504. Keywords: robust statistics; classification; faces; robust image analysis; forensic science. Subject RIV: BB - Applied Statistics, Operational Research. Impact factor: 0.208, year: 2012. http://www.ibib.waw.pl/bbe/bbefulltext/BBE_32_2_003_FT.pdf
Modular reweighting software for statistical mechanical analysis of biased equilibrium data
Sindhikara, Daniel J.
2012-07-01
Here a simple, useful, modular approach and software suite designed for statistical reweighting and analysis of equilibrium ensembles is presented. Statistical reweighting is useful and sometimes necessary for analysis of equilibrium enhanced sampling methods, such as umbrella sampling or replica exchange, and also in experimental cases where biasing factors are explicitly known. Essentially, statistical reweighting allows extrapolation of data from one or more equilibrium ensembles to another. Here, the fundamental separable steps of statistical reweighting are broken up into modules, allowing for application to the general case and avoiding the black-box nature of some “all-inclusive” reweighting programs. Additionally, the programs included are, by design, written with few dependencies. The compilers required are either pre-installed on most systems or freely available for download with minimal trouble. Examples of the use of this suite applied to umbrella sampling and replica exchange molecular dynamics simulations will be shown along with advice on how to apply it in the general case. New version program summary. Program title: Modular reweighting version 2. Catalogue identifier: AEJH_v2_0. Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEJH_v2_0.html. Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland. Licensing provisions: GNU General Public License, version 3. No. of lines in distributed program, including test data, etc.: 179 118. No. of bytes in distributed program, including test data, etc.: 8 518 178. Distribution format: tar.gz. Programming language: C++, Python 2.6+, Perl 5+. Computer: Any. Operating system: Any. RAM: 50-500 MB. Supplementary material: An updated version of the original manuscript (Comput. Phys. Commun. 182 (2011) 2227) is available. Classification: 4.13. Catalogue identifier of previous version: AEJH_v1_0. Journal reference of previous version: Comput. Phys. Commun. 182 (2011) 2227. Does the new
Statistical Power Analysis with Missing Data A Structural Equation Modeling Approach
Davey, Adam
2009-01-01
Statistical power analysis has revolutionized the ways in which we conduct and evaluate research. Similar developments in the statistical analysis of incomplete (missing) data are gaining more widespread applications. This volume brings statistical power and incomplete data together under a common framework, in a way that is readily accessible to those with only an introductory familiarity with structural equation modeling. It answers many practical questions such as: How missing data affects the statistical power in a study How much power is likely with different amounts and types
Statistical Analysis of Data for Timber Strengths
DEFF Research Database (Denmark)
Sørensen, John Dalsgaard
2003-01-01
Statistical analyses are performed for material strength parameters from a large number of specimens of structural timber. Non-parametric statistical analysis and fits have been investigated for the following distribution types: Normal, Lognormal, 2 parameter Weibull and 3-parameter Weibull...... fits to the data available, especially if tail fits are used whereas the Log Normal distribution generally gives a poor fit and larger coefficients of variation, especially if tail fits are used. The implications on the reliability level of typical structural elements and on partial safety factors...... for timber are investigated....
Numeric computation and statistical data analysis on the Java platform
Chekanov, Sergei V
2016-01-01
Numerical computation, knowledge discovery and statistical data analysis integrated with powerful 2D and 3D graphics for visualization are the key topics of this book. The Python code examples powered by the Java platform can easily be transformed to other programming languages, such as Java, Groovy, Ruby and BeanShell. This book equips the reader with a computational platform which, unlike other statistical programs, is not limited by a single programming language. The author focuses on practical programming aspects and covers a broad range of topics, from basic introduction to the Python language on the Java platform (Jython), to descriptive statistics, symbolic calculations, neural networks, non-linear regression analysis and many other data-mining topics. He discusses how to find regularities in real-world data, how to classify data, and how to process data for knowledge discoveries. The code snippets are so short that they easily fit into single pages. Numeric Computation and Statistical Data Analysis ...
A Divergence Statistics Extension to VTK for Performance Analysis
Energy Technology Data Exchange (ETDEWEB)
Pebay, Philippe Pierre [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Bennett, Janine Camille [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
2015-02-01
This report follows the series of previous documents ([PT08, BPRT09b, PT09, BPT09, PT10, PB13]), where we presented the parallel descriptive, correlative, multi-correlative, principal component analysis, contingency, k-means, order and auto-correlative statistics engines which we developed within the Visualization Tool Kit (VTK) as a scalable, parallel and versatile statistics package. We now report on a new engine which we developed for the calculation of divergence statistics, a concept which we hereafter explain and whose main goal is to quantify the discrepancy, in a statistical manner akin to measuring a distance, between an observed empirical distribution and a theoretical, "ideal" one. The ease of use of the new divergence statistics engine is illustrated by means of C++ code snippets. Although this new engine does not yet have a parallel implementation, it has already been applied to HPC performance analysis, of which we provide an example.
Developments in statistical analysis in quantitative genetics
DEFF Research Database (Denmark)
Sorensen, Daniel
2009-01-01
of genetic means and variances, models for the analysis of categorical and count data, the statistical genetics of a model postulating that environmental variance is partly under genetic control, and a short discussion of models that incorporate massive genetic marker information. We provide an overview......A remarkable research impetus has taken place in statistical genetics since the last World Conference. This has been stimulated by breakthroughs in molecular genetics, automated data-recording devices and computer-intensive statistical methods. The latter were revolutionized by the bootstrap...... and by Markov chain Monte Carlo (McMC). In this overview a number of specific areas are chosen to illustrate the enormous flexibility that McMC has provided for fitting models and exploring features of data that were previously inaccessible. The selected areas are inferences of the trajectories over time...
On the Statistical Validation of Technical Analysis
Directory of Open Access Journals (Sweden)
Rosane Riera Freire
2007-06-01
Full Text Available Technical analysis, or charting, aims at visually identifying geometrical patterns in price charts in order to anticipate price "trends". In this paper we revisit the issue of technical analysis validation, which has been tackled in the literature without taking care of (i) the presence of heterogeneity and (ii) statistical dependence in the analyzed data, which consist of various agglutinated return time series from distinct financial securities. The main purpose here is to address the first cited problem by suggesting a validation methodology that also "homogenizes" the securities according to the finite dimensional probability distribution of their return series. The general steps go through the identification of the stochastic processes for the securities returns, the clustering of similar securities and, finally, the identification of the presence, or absence, of informational content obtained from those price patterns. We illustrate the proposed methodology with a real data exercise including several securities of the global market. Our investigation shows that there is a statistically significant informational content in two out of three common patterns usually found through technical analysis, namely: triangle, rectangle and head and shoulders.
Statistical analysis of the D. C. Cook preoperational environmental monitoring program
International Nuclear Information System (INIS)
Murarka, I.P.
1977-02-01
This report summarizes the major findings of an evaluation of the statistical adequacy of the preoperational environmental monitoring program for nonradioactive waste disposal at the Donald C. Cook Nuclear Power Plant. As a result of this study we found that variance components analysis methods are adequate to determine large magnitude changes in the environment. When an interaction effect between years and inner-outer factors (reference-stress) exists for the preoperation period, estimating and testing for the plant operation effect becomes difficult. This was illustrated by the benthic data analysis. It was further found that several-factor mixed-effects models are not needed for testing the impact hypothesis. Simplifications of the collapsed model, as shown by us, can provide the answer quite easily. Advanced methods, such as time-series analysis and biomathematical modeling, should be studied for use in the impact analysis. The limited analyses with these techniques showed promising results.
Data management and statistical analysis for environmental assessment
International Nuclear Information System (INIS)
Wendelberger, J.R.; McVittie, T.I.
1995-01-01
Data management and statistical analysis for environmental assessment are important issues on the interface of computer science and statistics. Data collection for environmental decision making can generate large quantities of various types of data. A database/GIS system developed is described which provides efficient data storage as well as visualization tools which may be integrated into the data analysis process. FIMAD is a living database and GIS system. The system has changed and developed over time to meet the needs of the Los Alamos National Laboratory Restoration Program. The system provides a repository for data which may be accessed by different individuals for different purposes. The database structure is driven by the large amount and varied types of data required for environmental assessment. The integration of the database with the GIS system provides the foundation for powerful visualization and analysis capabilities
Statistical test data selection for reliability evaluation of process computer software
International Nuclear Information System (INIS)
Volkmann, K.P.; Hoermann, H.; Ehrenberger, W.
1976-01-01
The paper presents a concept for converting knowledge about the characteristics of process states into practicable procedures for the statistical selection of test cases in testing process computer software. Process states are defined as vectors whose components consist of values of input variables lying in discrete positions or within given limits. Two approaches for test data selection, based on knowledge about cases of demand, are outlined referring to a purely probabilistic method and to the mathematics of stratified sampling. (orig.)
Compliance strategy for statistically based neutron overpower protection safety analysis methodology
International Nuclear Information System (INIS)
Holliday, E.; Phan, B.; Nainer, O.
2009-01-01
The methodology employed in the safety analysis of the slow Loss of Regulation (LOR) event in the OPG and Bruce Power CANDU reactors, referred to as Neutron Overpower Protection (NOP) analysis, is a statistically based methodology. Further enhancement to this methodology includes the use of Extreme Value Statistics (EVS) for the explicit treatment of aleatory and epistemic uncertainties, and probabilistic weighting of the initial core states. A key aspect of this enhanced NOP methodology is to demonstrate adherence, or compliance, with the analysis basis. This paper outlines a compliance strategy capable of accounting for the statistical nature of the enhanced NOP methodology. (author)
STATISTICS, Program System for Statistical Analysis of Experimental Data
International Nuclear Information System (INIS)
Helmreich, F.
1991-01-01
1 - Description of problem or function: The package is composed of 83 routines, the most important of which are the following: BINDTR: Binomial distribution; HYPDTR: Hypergeometric distribution; POIDTR: Poisson distribution; GAMDTR: Gamma distribution; BETADTR: Beta-1 and Beta-2 distributions; NORDTR: Normal distribution; CHIDTR: Chi-square distribution; STUDTR : Distribution of 'Student's T'; FISDTR: Distribution of F; EXPDTR: Exponential distribution; WEIDTR: Weibull distribution; FRAKTIL: Calculation of the fractiles of the normal, chi-square, Student's, and F distributions; VARVGL: Test for equality of variance for several sample observations; ANPAST: Kolmogorov-Smirnov test and chi-square test of goodness of fit; MULIRE: Multiple linear regression analysis for a dependent variable and a set of independent variables; STPRG: Performs a stepwise multiple linear regression analysis for a dependent variable and a set of independent variables. At each step, the variable entered into the regression equation is the one which has the greatest amount of variance between it and the dependent variable. Any independent variable can be forced into or deleted from the regression equation, irrespective of its contribution to the equation. LTEST: Tests the hypotheses of linearity of the data. SPRANK: Calculates the Spearman rank correlation coefficient. 2 - Method of solution: VARVGL: The Bartlett's Test, the Cochran's Test and the Hartley's Test are performed in the program. MULIRE: The Gauss-Jordan method is used in the solution of the normal equations. STPRG: The abbreviated Doolittle method is used to (1) determine variables to enter into the regression, and (2) complete regression coefficient calculation. 3 - Restrictions on the complexity of the problem: VARVGL: The Hartley's Test is only performed if the sample observations are all of the same size
Vector field statistical analysis of kinematic and force trajectories.
Pataky, Todd C; Robinson, Mark A; Vanrenterghem, Jos
2013-09-27
When investigating the dynamics of three-dimensional multi-body biomechanical systems it is often difficult to derive spatiotemporally directed predictions regarding experimentally induced effects. A paradigm of 'non-directed' hypothesis testing has emerged in the literature as a result. Non-directed analyses typically consist of ad hoc scalar extraction, an approach which substantially simplifies the original, highly multivariate datasets (many time points, many vector components). This paper describes a commensurately multivariate method as an alternative to scalar extraction. The method, called 'statistical parametric mapping' (SPM), uses random field theory to objectively identify field regions which co-vary significantly with the experimental design. We compared SPM to scalar extraction by re-analyzing three publicly available datasets: 3D knee kinematics, a ten-muscle force system, and 3D ground reaction forces. Scalar extraction was found to bias the analyses of all three datasets by failing to consider sufficient portions of the dataset, and/or by failing to consider covariance amongst vector components. SPM overcame both problems by conducting hypothesis testing at the (massively multivariate) vector trajectory level, with random field corrections simultaneously accounting for temporal correlation and vector covariance. While SPM has been widely demonstrated to be effective for analyzing 3D scalar fields, the current results are the first to demonstrate its effectiveness for 1D vector field analysis. It was concluded that SPM offers a generalized, statistically comprehensive solution to scalar extraction's over-simplification of vector trajectories, thereby making it useful for objectively guiding analyses of complex biomechanical systems. © 2013 Published by Elsevier Ltd. All rights reserved.
A κ-generalized statistical mechanics approach to income analysis
Clementi, F.; Gallegati, M.; Kaniadakis, G.
2009-02-01
This paper proposes a statistical mechanics approach to the analysis of income distribution and inequality. A new distribution function, having its roots in the framework of κ-generalized statistics, is derived that is particularly suitable for describing the whole spectrum of incomes, from the low-middle income region up to the high income Pareto power-law regime. Analytical expressions for the shape, moments and some other basic statistical properties are given. Furthermore, several well-known econometric tools for measuring inequality, which all exist in a closed form, are considered. A method for parameter estimation is also discussed. The model is shown to fit remarkably well the data on personal income for the United States, and the analysis of inequality performed in terms of its parameters is revealed as very powerful.
A κ-generalized statistical mechanics approach to income analysis
International Nuclear Information System (INIS)
Clementi, F; Gallegati, M; Kaniadakis, G
2009-01-01
This paper proposes a statistical mechanics approach to the analysis of income distribution and inequality. A new distribution function, having its roots in the framework of κ-generalized statistics, is derived that is particularly suitable for describing the whole spectrum of incomes, from the low–middle income region up to the high income Pareto power-law regime. Analytical expressions for the shape, moments and some other basic statistical properties are given. Furthermore, several well-known econometric tools for measuring inequality, which all exist in a closed form, are considered. A method for parameter estimation is also discussed. The model is shown to fit remarkably well the data on personal income for the United States, and the analysis of inequality performed in terms of its parameters is revealed as very powerful
Statistical analysis of fatigue strain-life data for carbon and low-alloy steels
International Nuclear Information System (INIS)
Keisler, J.; Chopra, O.K.
1995-03-01
The existing fatigue strain vs life (S-N) data, foreign and domestic, for carbon and low-alloy steels used in the construction of nuclear power plant components have been compiled and categorized according to material, loading, and environmental conditions. A statistical model has been developed for estimating the effects of the various test conditions on fatigue life. The results of a rigorous statistical analysis have been used to estimate the probability of initiating a fatigue crack. Data in the literature were reviewed to evaluate the effects of size, geometry, and surface finish of a component on its fatigue life. The fatigue S-N curves for components have been determined by applying design margins for size, geometry, and surface finish to crack initiation curves estimated from the model
Analysis of stress corrosion data by means of the statistic of extreme values
International Nuclear Information System (INIS)
Imarisio, G.; Lanza, F.
1978-01-01
The possibility of examining stress corrosion by means of extreme value statistics was proposed. A series of tests in boiling MgCl2 on samples made of AISI 304 has been performed. The evolution of crack dimensions and the lifetime of the samples were followed. It has been shown that the dimensions of the maximum cracks on samples corroded for different times can be described by extreme value statistics. The lifetime of the samples can be treated in the same way. A confirmation has been obtained using data taken from the literature. Possible uses of predictions obtained with this type of analysis have been underlined. An extension toward less corrosive media and samples of several volumes is suggested to check the validity of the method.
Microvariability in AGNs: study of different statistical methods - I. Observational analysis
Zibecchi, L.; Andruchow, I.; Cellone, S. A.; Carpintero, D. D.; Romero, G. E.; Combi, J. A.
2017-05-01
We present the results of a study of different statistical methods currently used in the literature to analyse the (micro)variability of active galactic nuclei (AGNs) from ground-based optical observations. In particular, we focus on the comparison between the results obtained by applying the so-called C and F statistics, which are based on the ratio of standard deviations and variances, respectively. The motivation for this is that the implementation of these methods leads to different and contradictory results, making the variability classification of the light curves of a certain source dependent on the statistics implemented. For this purpose, we re-analyse the results on an AGN sample observed along several sessions with the 2.15 m 'Jorge Sahade' telescope (CASLEO), San Juan, Argentina. For each AGN, we constructed the nightly differential light curves. We thus obtained a total of 78 light curves for 39 AGNs, and we then applied the statistical tests mentioned above, in order to re-classify the variability state of these light curves and in an attempt to find a suitable statistical methodology to study photometric (micro)variations. We conclude that, although the C criterion is not a proper statistical test, it could still be a suitable parameter to detect variability and that its application allows us to get more reliable variability results, in contrast with the F test.
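The two statistics compared in this abstract can be sketched as ratios computed from a pair of differential light curves. This is an illustrative sketch only: the function names and magnitudes are hypothetical, the choice of "target minus comparison star" and "comparison minus control star" curves is an assumption about the usual setup, and the step of comparing F against a critical value of the F distribution is left out.

```python
import statistics


def c_statistic(target_minus_comparison, comparison_minus_control):
    """C: ratio of the standard deviations of the two differential curves."""
    return (statistics.stdev(target_minus_comparison)
            / statistics.stdev(comparison_minus_control))


def f_statistic(target_minus_comparison, comparison_minus_control):
    """F: ratio of the sample variances of the two differential curves."""
    return (statistics.variance(target_minus_comparison)
            / statistics.variance(comparison_minus_control))


# Hypothetical differential magnitudes for one night (made-up numbers).
agn_curve = [0.00, 0.04, -0.04, 0.02, -0.02]        # target - comparison
control_curve = [0.00, 0.01, -0.01, 0.005, -0.005]  # comparison - control

c_value = c_statistic(agn_curve, control_curve)
f_value = f_statistic(agn_curve, control_curve)

print(f"C = {c_value:.2f}, F = {f_value:.2f}")
```

By construction F equals C squared, so the two criteria differ only in the threshold applied to them, which is where contradictory classifications can arise.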
Practical application and statistical analysis of titrimetric monitoring ...
African Journals Online (AJOL)
2008-09-18
Sep 18, 2008 ... The statistical tests showed that, depending on the titrant concentration ... The ASD process offers the possibility of transferring waste streams into ..... (1993) Weak acid/bases and pH control in anaerobic system – A review.
STATLIB, Interactive Statistics Program Library of Tutorial System
International Nuclear Information System (INIS)
Anderson, H.E.
1986-01-01
1 - Description of program or function: STATLIB is a conversational statistical program library developed in conjunction with a Sandia National Laboratories applied statistics course intended for practicing engineers and scientists. STATLIB is a group of 15 interactive, argument-free, statistical routines. Included are analysis of sensitivity tests; sample statistics for the normal, exponential, hypergeometric, Weibull, and extreme value distributions; three models of multiple regression analysis; x-y data plots; exact probabilities for RxC tables; n sets of m permuted integers in the range 1 to m; simple linear regression and correlation; K different random integers in the range m to n; and Fisher's exact test of independence for a 2 by 2 contingency table. Forty-five other subroutines in the library support the basic 15
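One of the routines listed above, Fisher's exact test of independence for a 2 by 2 contingency table, can be sketched with the standard library alone. This is not STATLIB's code: the function name is hypothetical, and the two-sided p-value is assumed to be the sum of the probabilities of all tables (with the same margins) that are no more likely than the observed one.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability that the top-left cell equals x.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_observed * (1 + 1e-9))


p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(f"two-sided p = {p_value:.4f}")
```

For the table [[3, 1], [1, 3]] the possible top-left counts 0 to 4 have probabilities 1/70, 16/70, 36/70, 16/70 and 1/70, so the tables at least as extreme as the observed one sum to 34/70.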
An introduction to statistics with Python with applications in the life sciences
Haslwanter, Thomas
2016-01-01
This textbook provides an introduction to the free software Python and its use for statistical data analysis. It covers common statistical tests for continuous, discrete and categorical data, as well as linear regression analysis and topics from survival analysis and Bayesian statistics. Working code and data for Python solutions for each test, together with easy-to-follow Python examples, can be reproduced by the reader and reinforce their immediate understanding of the topic. With recent advances in the Python ecosystem, Python has become a popular language for scientific computing, offering a powerful environment for statistical data analysis and an interesting alternative to R. The book is intended for master's and PhD students, mainly from the life and medical sciences, with a basic knowledge of statistics. As it also provides some statistics background, the book can be used by anyone who wants to perform a statistical data analysis.
BROËT, PHILIPPE; TSODIKOV, ALEXANDER; DE RYCKE, YANN; MOREAU, THIERRY
2010-01-01
This paper presents two-sample statistics suited for testing equality of survival functions against improper semi-parametric accelerated failure time alternatives. These tests are designed for comparing either the short- or the long-term effect of a prognostic factor, or both. These statistics are obtained as partial likelihood score statistics from a time-dependent Cox model. As a consequence, the proposed tests can be very easily implemented using widely available software. A breast cancer clinical trial is presented as an example to demonstrate the utility of the proposed tests. PMID:15293627
Broët, Philippe; Tsodikov, Alexander; De Rycke, Yann; Moreau, Thierry
2004-06-01
This paper presents two-sample statistics suited for testing equality of survival functions against improper semi-parametric accelerated failure time alternatives. These tests are designed for comparing either the short- or the long-term effect of a prognostic factor, or both. These statistics are obtained as partial likelihood score statistics from a time-dependent Cox model. As a consequence, the proposed tests can be very easily implemented using widely available software. A breast cancer clinical trial is presented as an example to demonstrate the utility of the proposed tests.
A Note on Three Statistical Tests in the Logistic Regression DIF Procedure
Paek, Insu
2012-01-01
Although logistic regression became one of the well-known methods in detecting differential item functioning (DIF), its three statistical tests, the Wald, likelihood ratio (LR), and score tests, which are readily available under the maximum likelihood, do not seem to be consistently distinguished in DIF literature. This paper provides a clarifying…
Riley, Richard D.
2017-01-01
An important question for clinicians appraising a meta‐analysis is: are the findings likely to be valid in their own practice—does the reported effect accurately represent the effect that would occur in their own clinical population? To this end we advance the concept of statistical validity—where the parameter being estimated equals the corresponding parameter for a new independent study. Using a simple (‘leave‐one‐out’) cross‐validation technique, we demonstrate how we may test meta‐analysis estimates for statistical validity using a new validation statistic, Vn, and derive its distribution. We compare this with the usual approach of investigating heterogeneity in meta‐analyses and demonstrate the link between statistical validity and homogeneity. Using a simulation study, the properties of Vn and the Q statistic are compared for univariate random effects meta‐analysis and a tailored meta‐regression model, where information from the setting (included as model covariates) is used to calibrate the summary estimate to the setting of application. Their properties are found to be similar when there are 50 studies or more, but for fewer studies Vn has greater power but a higher type 1 error rate than Q. The power and type 1 error rate of Vn are also shown to depend on the within‐study variance, between‐study variance, study sample size, and the number of studies in the meta‐analysis. Finally, we apply Vn to two published meta‐analyses and conclude that it usefully augments standard methods when deciding upon the likely validity of summary meta‐analysis estimates in clinical practice. © 2017 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd. PMID:28620945
Zhi, Ruicong; Zhao, Lei; Xie, Nan; Wang, Houyin; Shi, Bolin; Shi, Jingye
2016-01-13
A framework for establishing a standard reference scale (texture) is proposed, based on multivariate statistical analysis of instrumental measurements and sensory evaluation. Multivariate statistical analysis is conducted to rapidly select typical reference samples with characteristics of universality, representativeness, stability, substitutability, and traceability. The reasonableness of the framework is verified by establishing a standard reference scale for a texture attribute (hardness) using well-known Chinese foods. More than 100 food products in 16 categories were tested using instrumental measurement (TPA test), and the results were analyzed with cluster analysis, principal component analysis, relative standard deviation, and analysis of variance. As a result, nine kinds of foods were selected to construct the hardness standard reference scale. The results indicate that the regression coefficient between the estimated sensory value and the instrumentally measured value is significant (R² = 0.9765), which fits well with Stevens's theory. The research provides a reliable theoretical basis and practical guide for establishing quantitative standard reference scales for food texture characteristics.
Meta-DiSc: a software for meta-analysis of test accuracy data.
Zamora, Javier; Abraira, Victor; Muriel, Alfonso; Khan, Khalid; Coomarasamy, Arri
2006-07-12
Systematic reviews and meta-analyses of test accuracy studies are increasingly being recognised as central in guiding clinical practice. However, there is currently no dedicated and comprehensive software for meta-analysis of diagnostic data. In this article, we present Meta-DiSc, a Windows-based, user-friendly, freely available (for academic use) software that we have developed, piloted, and validated to perform diagnostic meta-analysis. Meta-DiSc a) allows exploration of heterogeneity, with a variety of statistics including chi-square, I-squared and Spearman correlation tests, b) implements meta-regression techniques to explore the relationships between study characteristics and accuracy estimates, c) performs statistical pooling of sensitivities, specificities, likelihood ratios and diagnostic odds ratios using fixed and random effects models, both overall and in subgroups and d) produces high quality figures, including forest plots and summary receiver operating characteristic curves that can be exported for use in manuscripts for publication. All computational algorithms have been validated through comparison with different statistical tools and published meta-analyses. Meta-DiSc has a Graphical User Interface with roll-down menus, dialog boxes, and online help facilities. Meta-DiSc is a comprehensive and dedicated test accuracy meta-analysis software. It has already been used and cited in several meta-analyses published in high-ranking journals. The software is publicly available at http://www.hrc.es/investigacion/metadisc_en.htm.
Nonparametric statistical inference
Gibbons, Jean Dickinson
2014-01-01
Thoroughly revised and reorganized, the fourth edition presents in-depth coverage of the theory and methods of the most widely used nonparametric procedures in statistical analysis and offers example applications appropriate for all areas of the social, behavioral, and life sciences. The book presents new material on the quantiles, the calculation of exact and simulated power, multiple comparisons, additional goodness-of-fit tests, methods of analysis of count data, and modern computer applications using MINITAB, SAS, and STATXACT. It includes tabular guides for simplified applications of tests and finding P values and confidence interval estimates.
Descriptive and inferential statistical methods used in burns research.
Al-Benna, Sammy; Al-Ajam, Yazan; Way, Benjamin; Steinstraesser, Lars
2010-05-01
Burns research articles utilise a variety of descriptive and inferential methods to present and analyse data. The aim of this study was to determine the descriptive methods (e.g. mean, median, SD, range, etc.) and survey the use of inferential methods (statistical tests) used in articles in the journal Burns. This study defined its population as all original articles published in the journal Burns in 2007. Letters to the editor, brief reports, reviews, and case reports were excluded. Study characteristics, use of descriptive statistics and the number and types of statistical methods employed were evaluated. Of the 51 articles analysed, 11 (22%) were randomised controlled trials, 18 (35%) were cohort studies, 11 (22%) were case control studies and 11 (22%) were case series. The study design and objectives were defined in all articles. All articles made use of continuous and descriptive data. Inferential statistics were used in 49 (96%) articles. Data dispersion was calculated by standard deviation in 30 (59%). Standard error of the mean was quoted in 19 (37%). The statistical software product was named in 33 (65%). Of the 49 articles that used inferential statistics, the tests were named in 47 (96%). The 6 most common tests used (Student's t-test (53%), analysis of variance/co-variance (33%), chi-squared test (27%), Wilcoxon & Mann-Whitney tests (22%), Fisher's exact test (12%)) accounted for the majority (72%) of statistical methods employed. A specified significance level was named in 43 (88%) and the exact significance levels were reported in 28 (57%). Descriptive analysis and basic statistical techniques account for most of the statistical tests reported. This information should prove useful in deciding which tests should be emphasised in educating burn care professionals. These results highlight the need for burn care professionals to have a sound understanding of basic statistics, which is crucial in interpreting and reporting data. Advice should be sought from professionals
Statistical Analysis of Protein Ensembles
Máté, Gabriell; Heermann, Dieter
2014-04-01
As 3D protein-configuration data is piling up, there is an ever-increasing need for well-defined, mathematically rigorous analysis approaches, especially since the vast majority of the currently available methods rely heavily on heuristics. We propose an analysis framework which stems from topology, the field of mathematics which studies properties preserved under continuous deformations. First, we calculate a barcode representation of the molecules employing computational topology algorithms. Bars in this barcode represent different topological features. Molecules are compared through their barcodes by statistically determining the difference in the set of their topological features. As a proof-of-principle application, we analyze a dataset compiled of ensembles of different proteins, obtained from the Ensemble Protein Database. We demonstrate that our approach correctly detects the different protein groupings.
Lectures on algebraic statistics
Drton, Mathias; Sullivant, Seth
2009-01-01
How does an algebraic geometer studying secant varieties further the understanding of hypothesis tests in statistics? Why would a statistician working on factor analysis raise open problems about determinantal varieties? Connections of this type are at the heart of the new field of "algebraic statistics". In this field, mathematicians and statisticians come together to solve statistical inference problems using concepts from algebraic geometry as well as related computational and combinatorial techniques. The goal of these lectures is to introduce newcomers from the different camps to algebraic statistics. The introduction will be centered around the following three observations: many important statistical models correspond to algebraic or semi-algebraic sets of parameters; the geometry of these parameter spaces determines the behaviour of widely used statistical inference procedures; computational algebraic geometry can be used to study parameter spaces and other features of statistical models.
Comparison of Statistical Methods for Detector Testing Programs
Energy Technology Data Exchange (ETDEWEB)
Rennie, John Alan [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Abhold, Mark [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
2016-10-14
A typical goal for any detector testing program is to ascertain not only the performance of the detector systems under test, but also the confidence that systems accepted using that testing program’s acceptance criteria will exceed a minimum acceptable performance (which is usually expressed as the minimum acceptable success probability, p). A similar problem often arises in statistics, where we would like to ascertain the fraction, p, of a population of items that possess a property that may take one of two possible values. Typically, the problem is approached by drawing a fixed sample of size n, with the number of items out of n that possess the desired property, x, being termed successes. The sample mean gives an estimate of the population mean p ≈ x/n, although usually it is desirable to accompany such an estimate with a statement concerning the range within which p may fall and the confidence associated with that range. Procedures for establishing such ranges and confidence limits are described in detail by Clopper, Brown, and Agresti for two-sided symmetric confidence intervals.
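As an illustration of the kind of confidence range described above, the following sketch computes the two-sided exact (Clopper-Pearson) interval for a success probability p from x successes in n trials. It uses only the standard library and finds the limits by bisection on the binomial tail probabilities, which is a choice made here for self-containedness and not necessarily the procedure used in the report. The function names and the example counts are hypothetical.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1.0 - p) ** (n - i) for i in range(k + 1))


def _root_of_increasing(g, iterations=100):
    """Bisection for the root of a function that increases on [0, 1]."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def clopper_pearson(x, n, confidence=0.95):
    """Two-sided exact interval for p given x successes out of n trials."""
    alpha = 1.0 - confidence
    if x == 0:
        lower = 0.0
    else:
        # Lower limit: P(X >= x | p) = alpha / 2, increasing in p.
        lower = _root_of_increasing(
            lambda p: (1.0 - binom_cdf(x - 1, n, p)) - alpha / 2.0)
    if x == n:
        upper = 1.0
    else:
        # Upper limit: P(X <= x | p) = alpha / 2; the CDF decreases in p.
        upper = _root_of_increasing(
            lambda p: alpha / 2.0 - binom_cdf(x, n, p))
    return lower, upper


# Hypothetical test campaign: 19 detections in 20 trials.
print(clopper_pearson(19, 20))
```

When no failures are observed (x = n), the lower limit has the closed form (alpha/2)^(1/n), which is a convenient check on the numerical solution.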
Intersection tests for single marker QTL analysis can be more powerful than two marker QTL analysis
Directory of Open Access Journals (Sweden)
Doerge RW
2003-06-01
Background: It has been reported in the quantitative trait locus (QTL) literature that when testing for QTL location and effect, the statistical power supporting methodologies based on two markers and their estimated genetic map is higher than for the genetic map independent methodologies known as single marker analyses. Close examination of these reports reveals that the two marker approaches are more powerful than single marker analyses only in certain cases. Simulation studies are a commonly used tool to determine the behavior of test statistics under known conditions. We conducted a simulation study to assess the general behavior of an intersection test and a two marker test under a variety of conditions. The study was designed to reveal whether two marker tests are always more powerful than intersection tests, or whether there are cases when an intersection test may outperform the two marker approach. We present a reanalysis of a data set from a QTL study of ovariole number in Drosophila melanogaster. Results: Our simulation study results show that there are situations where the single marker intersection test equals or outperforms the two marker test. The intersection test and the two marker test identify overlapping regions in the reanalysis of the Drosophila melanogaster data. The region identified is consistent with a regression based interval mapping analysis. Conclusion: We find that the intersection test is appropriate for analysis of QTL data. This approach has the advantage of simplicity and for certain situations supplies equivalent or more powerful results than a comparable two marker test.
Jsub(Ic)-testing of A-533 B - statistical evaluation of some different testing techniques
International Nuclear Information System (INIS)
Nilsson, F.
1978-01-01
The purpose of the present study was to compare statistically some different methods for the evaluation of fracture toughness of the nuclear reactor material A-533 B. Since linear elastic fracture mechanics is not applicable to this material at the interesting temperature (275 °C), the so-called Jsub(Ic) testing method was employed. Two main difficulties are inherent in this type of testing. The first one is to determine the quantity J as a function of the deflection of the three-point bend specimens used. Three different techniques were used, the first two based on the experimentally observed input of energy to the specimen and the third employing finite element calculations. The second main problem is to determine the point when crack growth begins. For this, two methods were used, a direct electrical method and the indirect R-curve method. A total of forty specimens were tested at two laboratories. No statistically significant different results were obtained from the respective laboratories. The three methods of calculating J yielded somewhat different results, although the discrepancy was small. Also the two methods of determination of the growth initiation point yielded consistent results. The R-curve method, however, exhibited a larger uncertainty as measured by the standard deviation. The resulting Jsub(Ic) value also agreed well with earlier presented results. The relative standard deviation was of the order of 25%, which is quite small for this type of experiment. (author)
Evaluating Two Models of Collaborative Tests in an Online Introductory Statistics Course
Björnsdóttir, Auðbjörg; Garfield, Joan; Everson, Michelle
2015-01-01
This study explored the use of two different types of collaborative tests in an online introductory statistics course. A study was designed and carried out to investigate three research questions: (1) What is the difference in students' learning between using consensus and non-consensus collaborative tests in the online environment?, (2) What is…
Multivariate Statistical Analysis of Water Quality data in Indian River Lagoon, Florida
Sayemuzzaman, M.; Ye, M.
2015-12-01
The Indian River Lagoon, part of the longest barrier island complex in the United States, is a region of particular concern to environmental scientists because of the rapid rate of human development throughout the region and its geographical position between the colder temperate zone and the warmer sub-tropical zone. Surface water quality analysis in this region therefore continually yields new information. In the present study, multivariate statistical procedures were applied to analyze the spatial and temporal water quality in the Indian River Lagoon over the period 1998-2013. Twelve parameters were analyzed at twelve key water monitoring stations in and beside the lagoon using monthly datasets (a total of 27,648 observations). The dataset was treated using cluster analysis (CA), principal component analysis (PCA) and non-parametric trend analysis. The CA was used to cluster the twelve monitoring stations into four groups, with stations having similar surrounding characteristics falling in the same group. The PCA was then applied to these groups to find the important water quality parameters. The principal components (PCs) PC1 to PC5 were considered, based on explained cumulative variances of 75% to 85% in each cluster group. Nutrient species (phosphorus and nitrogen), salinity, specific conductivity and erosion factors (TSS, turbidity) were the major variables involved in the construction of the PCs. Statistically significant positive or negative trends and abrupt trend shifts were detected by applying the Mann-Kendall trend test and the Sequential Mann-Kendall (SQMK) test to each individual station for the important water quality parameters. Changes in land use and land cover, local anthropogenic activities and extreme climate events such as drought might be associated with these trends. This study presents the multivariate statistical assessment in order to get better information about the quality of surface water. Thus, effective pollution control/management of the surface
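The Mann-Kendall trend test named in this abstract can be sketched in a few lines. The version below is the textbook form with the usual tie correction of the variance and a normal approximation for the two-sided p-value. It does not reproduce the study's seasonal handling or the sequential (SQMK) variant, and the function name and sample series are hypothetical.

```python
from collections import Counter
from statistics import NormalDist


def mann_kendall(series):
    """Return (S, Z, two-sided p-value) for the Mann-Kendall trend test."""
    n = len(series)
    # S counts later-minus-earlier sign agreements over all ordered pairs.
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)
    # Variance of S under the null hypothesis, corrected for tied groups.
    tie_term = sum(t * (t - 1) * (2 * t + 5) for t in Counter(series).values())
    var_s = (n * (n - 1) * (2 * n + 5) - tie_term) / 18.0
    # Continuity-corrected standard normal statistic.
    if s > 0:
        z = (s - 1) / var_s ** 0.5
    elif s < 0:
        z = (s + 1) / var_s ** 0.5
    else:
        z = 0.0
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return s, z, p_value


# Hypothetical yearly mean concentrations at one monitoring station.
print(mann_kendall([2.1, 2.4, 2.2, 2.9, 3.1, 3.0, 3.6, 3.8]))
```

A positive S with a small p-value indicates an increasing monotonic trend, a negative S a decreasing one. For short series the normal approximation is rough, so exact tables are often preferred in practice.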
International Nuclear Information System (INIS)
Rodionov, Andrei; Atwood, Corwin L.; Kirchsteiger, Christian; Patrik, Milan
2008-01-01
The paper presents some results of a case study on 'Demonstration of statistical approaches to identify the component's ageing by operational data analysis', which was done in the frame of the EC JRC Ageing PSA Network. Several techniques (visual evaluation, nonparametric and parametric hypothesis tests) were proposed and applied in order to demonstrate the capacity, advantages and limitations of statistical approaches to identify the component's ageing by operational data analysis. Engineering considerations are out of the scope of the present study.
Neti, Prasad V.S.V.; Howell, Roger W.
2010-01-01
Recently, the distribution of radioactivity among a population of cells labeled with 210Po was shown to be well described by a log-normal (LN) distribution function (J Nucl Med. 2006;47:1049–1058) with the aid of autoradiography. To ascertain the influence of Poisson statistics on the interpretation of the autoradiographic data, the present work reports on a detailed statistical analysis of these earlier data. Methods: The measured distributions of α-particle tracks per cell were subjected to statistical tests with Poisson, LN, and Poisson-lognormal (P-LN) models. Results: The LN distribution function best describes the distribution of radioactivity among cell populations exposed to 0.52 and 3.8 kBq/mL of 210Po-citrate. When cells were exposed to 67 kBq/mL, the P-LN distribution function gave a better fit; however, the underlying activity distribution remained log-normal. Conclusion: The present analysis generally provides further support for the use of LN distributions to describe the cellular uptake of radioactivity. Care should be exercised when analyzing autoradiographic data on activity distributions to ensure that Poisson processes do not distort the underlying LN distribution. PMID:18483086
A generalization of Friedman's rank statistic
Kroon, de J.; Laan, van der P.
1983-01-01
In this paper a very natural generalization of the two-way analysis of variance rank statistic of FRIEDMAN is given. The general distribution-free test procedure based on this statistic for the effect of J treatments in a random block design can be applied in general two-way layouts without
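For orientation, the classical Friedman rank statistic that this paper generalizes can be computed as follows. The sketch covers only the standard complete-block case (every block contains one observation per treatment), ranks within blocks using average ranks for ties, and applies no tie correction. It is not the generalized statistic of the paper, and the function names and data are hypothetical.

```python
def average_ranks(values):
    """Ranks 1..k within one block, with tied values sharing the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2.0 + 1.0
        for idx in order[pos:end + 1]:
            ranks[idx] = shared
        pos = end + 1
    return ranks


def friedman_statistic(blocks):
    """Friedman chi-square statistic for n blocks (rows) by k treatments."""
    n = len(blocks)
    k = len(blocks[0])
    rank_sums = [0.0] * k
    for block in blocks:
        for j, r in enumerate(average_ranks(block)):
            rank_sums[j] += r
    sum_sq = sum(r * r for r in rank_sums)
    return 12.0 / (n * k * (k + 1)) * sum_sq - 3.0 * n * (k + 1)


# Hypothetical responses: 4 blocks, 3 treatments per block.
data = [
    [8.1, 9.4, 7.2],
    [6.5, 7.9, 6.1],
    [9.0, 9.8, 8.3],
    [7.7, 7.5, 6.9],
]
print(friedman_statistic(data))
```

Under the null hypothesis of no treatment effect the statistic is approximately chi-square distributed with k - 1 degrees of freedom when the number of blocks is large.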
Statistical analysis of RHIC beam position monitors performance
Calaga, R.; Tomás, R.
2004-04-01
A detailed statistical analysis of beam position monitors (BPM) performance at RHIC is a critical factor in improving regular operations and future runs. Robust identification of malfunctioning BPMs plays an important role in any orbit or turn-by-turn analysis. Singular value decomposition and Fourier transform methods, which have evolved as powerful numerical techniques in signal processing, will aid in such identification from BPM data. This is the first attempt at RHIC to use a large set of data to statistically enhance the capability of these two techniques and determine BPM performance. A comparison from run 2003 data shows striking agreement between the two methods and hence can be used to improve BPM functioning at RHIC and possibly other accelerators.
Statistical analysis of RHIC beam position monitors performance
Directory of Open Access Journals (Sweden)
R. Calaga
2004-04-01
A detailed statistical analysis of beam position monitors (BPM) performance at RHIC is a critical factor in improving regular operations and future runs. Robust identification of malfunctioning BPMs plays an important role in any orbit or turn-by-turn analysis. Singular value decomposition and Fourier transform methods, which have evolved as powerful numerical techniques in signal processing, will aid in such identification from BPM data. This is the first attempt at RHIC to use a large set of data to statistically enhance the capability of these two techniques and determine BPM performance. A comparison from run 2003 data shows striking agreement between the two methods and hence can be used to improve BPM functioning at RHIC and possibly other accelerators.
Prediction of transmission loss through an aircraft sidewall using statistical energy analysis
Ming, Ruisen; Sun, Jincai
1989-06-01
The transmission loss of randomly incident sound through an aircraft sidewall is investigated using statistical energy analysis. Formulas are also obtained for the simple calculation of sound transmission loss through single- and double-leaf panels. Both resonant and nonresonant sound transmissions can be easily calculated using the formulas. The formulas are used to predict sound transmission losses through a Y-7 propeller airplane panel. The panel measures 2.56 m x 1.38 m and has two windows. The agreement between predicted and measured values through most of the frequency ranges tested is quite good.
Statistical analysis of the potassium concentration obtained through
International Nuclear Information System (INIS)
Pereira, Joao Eduardo da Silva; Silva, Jose Luiz Silverio da; Pires, Carlos Alberto da Fonseca; Strieder, Adelir Jose
2007-01-01
The present work was carried out on outcrops in the Santa Maria region, Rio Grande do Sul State, southern Brazil. Statistical evaluations were applied to different rock types. The possibility of distinguishing different geologic units, sedimentary and volcanic (acid and basic types), by means of statistical analyses of airborne gamma-ray spectrometry data, integrating potassium radiation emission data with geological and geochemical data, is discussed. This project was carried out in 1973 by the Geological Survey of Brazil/Companhia de Pesquisas de Recursos Minerais. The Camaqua Project evaluated the behavior of potassium concentrations, generating files in XYZ Geosoft 1997 format, one grid, a thematic map and digital thematic map files for the total area. Using this database, the integration of statistical analyses was tested for sedimentary formations belonging to the Depressao Central do Rio Grande do Sul and/or volcanic rocks from the Planalto da Serra Geral at the border of the Parana Basin. A univariate statistical model was used: the mean, the standard error of the mean, and the confidence limits were estimated. Tukey's test was used to compare mean values. The results made it possible to create criteria for distinguishing geological formations based on their potassium content. The back-calibration technique was employed to transform K radiation to percentage. In this context it was possible to define characteristic values of radioactive potassium emissions and their confidence intervals for the geologic formations. The potassium variable, when evaluated in relation to Universal Transverse Mercator geographic coordinates, showed a spatial relationship following a second-order polynomial model, with an associated coefficient of determination. The Statistica 7.1 software (General Linear Models) provided by the Statistics Department of the Federal University of Santa Maria, Brazil, was used. (author)
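The univariate summary described in this abstract (mean, standard error of the mean, and confidence limits) can be sketched as below. The example uses a normal-quantile approximation for the limits, which is an assumption made here to stay within the standard library. For small samples a Student t quantile would normally be used instead. The function name and the potassium values are hypothetical.

```python
from statistics import NormalDist, mean, stdev


def mean_with_confidence_limits(values, confidence=0.95):
    """Return (mean, standard error, (lower limit, upper limit))."""
    n = len(values)
    m = mean(values)
    # Standard error of the mean from the sample standard deviation.
    sem = stdev(values) / n ** 0.5
    # Normal approximation; a t quantile is preferable for small n.
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return m, sem, (m - z * sem, m + z * sem)


# Hypothetical potassium contents (%) for two geologic units.
sedimentary = [1.9, 2.3, 2.1, 2.6, 2.0, 2.4]
volcanic = [3.4, 3.9, 3.1, 3.7, 3.6, 3.3]
print(mean_with_confidence_limits(sedimentary))
print(mean_with_confidence_limits(volcanic))
```

Non-overlapping confidence intervals for two units give a first, informal indication that their mean potassium contents differ. A multiple-comparison procedure such as Tukey's test is still needed for a formal statement.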
Statistics Education Research in Malaysia and the Philippines: A Comparative Analysis
Reston, Enriqueta; Krishnan, Saras; Idris, Noraini
2014-01-01
This paper presents a comparative analysis of statistics education research in Malaysia and the Philippines by modes of dissemination, research areas, and trends. An electronic search for published research papers in the area of statistics education from 2000-2012 yielded 20 for Malaysia and 19 for the Philippines. Analysis of these papers showed…
A statistical method for draft tube pressure pulsation analysis
International Nuclear Information System (INIS)
Doerfler, P K; Ruchonnet, N
2012-01-01
Draft tube pressure pulsation (DTPP) in Francis turbines is composed of various components originating from different physical phenomena. These components may be separated because they differ by their spatial relationships and by their propagation mechanism. The first step for such an analysis was to distinguish between so-called synchronous and asynchronous pulsations; only approximately periodic phenomena could be described in this manner. However, less regular pulsations are always present, and these become important when turbines have to operate in the far off-design range, in particular at very low load. The statistical method described here makes it possible to separate the stochastic (random) component from the two traditional 'regular' components. It works in connection with the standard technique of model testing with several pressure signals measured in the draft tube cone. The difference between the individual signals and the averaged pressure signal, together with the coherence between the individual pressure signals, is used for the analysis. An example reveals that a generalized, non-periodic version of the asynchronous pulsation is important at low load.
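The first ingredient mentioned above, the difference between each individual signal and the averaged pressure signal, can be sketched as follows. The averaged signal is taken as the part common to all sensors and the residuals carry what differs between sensors. The coherence analysis that the method uses to isolate the stochastic component is not shown, and the function name and sample values are hypothetical.

```python
def split_common_and_residual(signals):
    """Split simultaneous pressure signals into a common part and residuals.

    `signals` is a list of equally long sample lists, one per sensor.
    Returns (averaged_signal, residuals), where residuals[i][t] is the
    deviation of sensor i from the sensor average at time sample t.
    """
    n_sensors = len(signals)
    averaged = [sum(samples) / n_sensors for samples in zip(*signals)]
    residuals = [
        [p - a for p, a in zip(signal, averaged)]
        for signal in signals
    ]
    return averaged, residuals


# Hypothetical pressure samples from four sensors around the cone.
sensors = [
    [1.0, 1.4, 0.9, 0.6],
    [0.8, 1.6, 1.1, 0.4],
    [1.2, 1.5, 0.7, 0.5],
    [1.0, 1.5, 0.9, 0.5],
]
common, parts = split_common_and_residual(sensors)
print(common)
print(parts[0])
```

By construction the residuals sum to zero across sensors at every time sample, so any pulsation that is in phase at all sensors ends up entirely in the averaged signal.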
Statistical analysis of next generation sequencing data
Nettleton, Dan
2014-01-01
Next Generation Sequencing (NGS) is the latest high throughput technology to revolutionize genomic research. NGS generates massive genomic datasets that play a key role in the big data phenomenon that surrounds us today. To extract signals from high-dimensional NGS data and make valid statistical inferences and predictions, novel data analytic and statistical techniques are needed. This book contains 20 chapters written by prominent statisticians working with NGS data. The topics range from basic preprocessing and analysis with NGS data to more complex genomic applications such as copy number variation and isoform expression detection. Research statisticians who want to learn about this growing and exciting area will find this book useful. In addition, many chapters from this book could be included in graduate-level classes in statistical bioinformatics for training future biostatisticians who will be expected to deal with genomic data in basic biomedical research, genomic clinical trials and personalized med...
The Statistic Test on Influence of Surface Treatment to Fatigue Lifetime with Limited Data
Suhartono, Agus
2009-01-01
Justifying the influence of two or more parameters on fatigue strength is sometimes problematic due to the scattered nature of fatigue data. Statistical tests can help evaluate whether changes in material characteristics resulting from specific parameters of interest are significant. The statistical tests were applied to fatigue data from AISI 1045 steel specimens. The specimens consisted of as-received specimens, shot peened specimens with 15 and 16 Almen intensity as ...
Conducting tests for statistically significant differences using forest inventory data
James A. Westfall; Scott A. Pugh; John W. Coulston
2013-01-01
Many forest inventory and monitoring programs are based on a sample of ground plots from which estimates of forest resources are derived. In addition to evaluating metrics such as number of trees or amount of cubic wood volume, it is often desirable to make comparisons between resource attributes. To properly conduct statistical tests for differences, it is imperative...
Selected papers on analysis, probability, and statistics
Nomizu, Katsumi
1994-01-01
This book presents papers that originally appeared in the Japanese journal Sugaku. The papers fall into the general area of mathematical analysis as it pertains to probability and statistics, dynamical systems, differential equations and analytic function theory. Among the topics discussed are: stochastic differential equations, spectra of the Laplacian and Schrödinger operators, nonlinear partial differential equations which generate dissipative dynamical systems, fractal analysis on self-similar sets and the global structure of analytic functions.
Testing independence of bivariate interval-censored data using modified Kendall's tau statistic.
Kim, Yuneung; Lim, Johan; Park, DoHwan
2015-11-01
In this paper, we study a nonparametric procedure to test independence of bivariate interval-censored data, for both current status data (case 1 interval-censored data) and case 2 interval-censored data. To this end, we propose a score-based modification of the Kendall's tau statistic for bivariate interval-censored data. Our modification defines the Kendall's tau statistic with expected numbers of concordant and discordant pairs of data. The performance of the modified approach is illustrated by simulation studies and application to the AIDS study. We compare our method to alternative approaches such as the two-stage estimation method by Sun et al. (Scandinavian Journal of Statistics, 2006) and the multiple imputation method by Betensky and Finkelstein (Statistics in Medicine, 1999b). © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
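As background for the modification described above, the ordinary Kendall's tau for fully observed pairs is built from counts of concordant and discordant pairs. The sketch below computes the tau-a version for uncensored data only. The paper replaces these counts with their expected numbers under interval censoring, which is not implemented here. The function name and data are hypothetical.

```python
def kendall_tau_a(x, y):
    """Kendall's tau-a for paired, fully observed data.

    A pair (i, j) is concordant when x and y move in the same direction
    and discordant when they move in opposite directions; pairs tied in
    either coordinate count as neither.
    """
    n = len(x)
    concordant = 0
    discordant = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            product = (x[j] - x[i]) * (y[j] - y[i])
            if product > 0:
                concordant += 1
            elif product < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2.0)


# Hypothetical paired event times.
times_a = [2.0, 5.5, 3.1, 8.0, 6.2]
times_b = [1.5, 4.0, 4.4, 9.1, 5.0]
print(kendall_tau_a(times_a, times_b))
```

Values near zero are consistent with independence, while values near +1 or -1 indicate strong positive or negative association.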
Statistical analysis of content of Cs-137 in soils in Bansko-Razlog region
International Nuclear Information System (INIS)
Kobilarov, R. G.
2014-01-01
Statistical analysis of a data set consisting of the activity concentrations of 137Cs in soils in the Bansko–Razlog region is carried out in order to establish how the deposition and migration of 137Cs depend on soil type. The descriptive statistics and the test of normality show that the data set is not normally distributed. A positively skewed distribution and possible outlying values of the 137Cs activity in soils were observed. After reducing the effects of outliers, the data set was divided into two parts according to soil type. Tests of normality show that the two new data sets are normally distributed. The ordinary kriging technique is used to characterize the spatial distribution of the 137Cs activity over an area of 40 km² (the whole Razlog valley). The result (a map of the spatial distribution of the activity concentration of 137Cs) can be used as a reference point for future studies on the assessment of radiological risk to the population and the erosion of soils in the study area.
Statistical Methods for the detection of answer copying on achievement tests
Sotaridona, Leonardo
2003-01-01
This thesis contains a collection of studies where statistical methods for the detection of answer copying on achievement tests in multiple-choice format are proposed and investigated. Although all methods are suited to detect answer copying, each method is designed to address specific
Comparative analysis of positive and negative attitudes toward statistics
Ghulami, Hassan Rahnaward; Ab Hamid, Mohd Rashid; Zakaria, Roslinazairimah
2015-02-01
Many statistics lecturers and statistics education researchers are interested in their students' attitudes toward statistics during a statistics course. A positive attitude toward statistics is vital in a statistics course because it encourages students to take an interest in the course and to master the core content of the subject matter under study. Students with negative attitudes toward statistics, by contrast, may feel depressed, especially in assigned group work, are at risk of failure, are often highly emotional, and may be unable to move forward. Therefore, this study investigates students' attitudes towards learning statistics. Six latent constructs were used to measure students' attitudes toward learning statistics: affect, cognitive competence, value, difficulty, interest, and effort. The questionnaire was adopted and adapted from the reliable and validated Survey of Attitudes towards Statistics (SATS) instrument. The study was conducted among undergraduate engineering students at Universiti Malaysia Pahang (UMP). The respondents were students from different faculties who were taking the applied statistics course. The analysis found that the questionnaire is acceptable, and the relationships among the constructs were proposed and investigated. The students show full effort to master the statistics course, find the statistics course enjoyable, are confident in their intellectual capacity, and have more positive than negative attitudes towards learning statistics. In conclusion, positive attitudes towards statistics were mostly exhibited in the affect, cognitive competence, value, interest and effort constructs, while negative attitudes were mostly exhibited in the difficulty construct.
Directory of Open Access Journals (Sweden)
Kleinjans Jos
2008-09-01
Background: In gene expression analysis, statistical tests for differential gene expression provide lists of candidate genes having, individually, a sufficiently low p-value. However, the interpretation of each single p-value within complex systems involving several interacting genes is problematic. In parallel, in the last sixty years, game theory has been applied to political and social problems to assess the power of interacting agents in forcing a decision and, more recently, to represent the relevance of genes in response to certain conditions. Results: In this paper we introduce a Bootstrap procedure to test the null hypothesis that each gene has the same relevance between two conditions, where the relevance is represented by the Shapley value of a particular coalitional game defined on a microarray data-set. This method, which is called Comparative Analysis of Shapley value (shortly, CASh), is applied to data concerning the gene expression in children differentially exposed to air pollution. The results provided by CASh are compared with the results from a parametric statistical test for testing differential gene expression. Both lists of genes provided by CASh and t-test are informative enough to discriminate exposed subjects on the basis of their gene expression profiles. While many genes are selected in common by CASh and the parametric test, it turns out that the biological interpretation of the differences between these two selections is more interesting, suggesting a different interpretation of the main biological pathways in gene expression regulation for exposed individuals. A simulation study suggests that CASh offers more power than t-test for the detection of differential gene expression variability. Conclusion: CASh is successfully applied to gene expression analysis of a data-set where the joint expression behavior of genes may be critical to characterize the expression response to air pollution. We demonstrate a
Moretti, Stefano; van Leeuwen, Danitsja; Gmuender, Hans; Bonassi, Stefano; van Delft, Joost; Kleinjans, Jos; Patrone, Fioravante; Merlo, Domenico Franco
2008-09-02
In gene expression analysis, statistical tests for differential gene expression provide lists of candidate genes having, individually, a sufficiently low p-value. However, the interpretation of each single p-value within complex systems involving several interacting genes is problematic. In parallel, in the last sixty years, game theory has been applied to political and social problems to assess the power of interacting agents in forcing a decision and, more recently, to represent the relevance of genes in response to certain conditions. In this paper we introduce a Bootstrap procedure to test the null hypothesis that each gene has the same relevance between two conditions, where the relevance is represented by the Shapley value of a particular coalitional game defined on a microarray data-set. This method, which is called Comparative Analysis of Shapley value (shortly, CASh), is applied to data concerning the gene expression in children differentially exposed to air pollution. The results provided by CASh are compared with the results from a parametric statistical test for testing differential gene expression. Both lists of genes provided by CASh and t-test are informative enough to discriminate exposed subjects on the basis of their gene expression profiles. While many genes are selected in common by CASh and the parametric test, it turns out that the biological interpretation of the differences between these two selections is more interesting, suggesting a different interpretation of the main biological pathways in gene expression regulation for exposed individuals. A simulation study suggests that CASh offers more power than t-test for the detection of differential gene expression variability. CASh is successfully applied to gene expression analysis of a data-set where the joint expression behavior of genes may be critical to characterize the expression response to air pollution. We demonstrate a synergistic effect between coalitional games and statistics that
DEFF Research Database (Denmark)
Schneider, Jesper Wiborg
2012-01-01
In this paper we discuss and question the use of statistical significance tests in relation to university rankings as recently suggested. We outline the assumptions behind and interpretations of statistical significance tests and relate this to examples from the recent SCImago Institutions Rankin...
Vapor Pressure Data Analysis and Statistics
2016-12-01
near 8, 2000, and 200, respectively. The A (or a) value is directly related to vapor pressure and will be greater for high vapor pressure materials...1, (10) where n is the number of data points, Yi is the natural logarithm of the i th experimental vapor pressure value, and Xi is the...VAPOR PRESSURE DATA ANALYSIS AND STATISTICS ECBC-TR-1422 Ann Brozena RESEARCH AND TECHNOLOGY DIRECTORATE
Statistical analysis of planktic foraminifera of the surface Continental ...
African Journals Online (AJOL)
Planktic foraminiferal assemblage recorded from selected samples obtained from shallow continental shelf sediments off southwestern Nigeria were subjected to statistical analysis. The Principal Component Analysis (PCA) was used to determine variants of planktic parameters. Values obtained for these parameters were ...
Imaging mass spectrometry statistical analysis.
Jones, Emrys A; Deininger, Sören-Oliver; Hogendoorn, Pancras C W; Deelder, André M; McDonnell, Liam A
2012-08-30
Imaging mass spectrometry is increasingly used to identify new candidate biomarkers. This clinical application of imaging mass spectrometry is highly multidisciplinary: expertise in mass spectrometry is necessary to acquire high quality data, histology is required to accurately label the origin of each pixel's mass spectrum, disease biology is necessary to understand the potential meaning of the imaging mass spectrometry results, and statistics to assess the confidence of any findings. Imaging mass spectrometry data analysis is further complicated because of the unique nature of the data (within the mass spectrometry field); several of the assumptions implicit in the analysis of LC-MS/profiling datasets are not applicable to imaging. The very large size of imaging datasets and the reporting of many data analysis routines, combined with inadequate training and accessible reviews, have exacerbated this problem. In this paper we provide an accessible review of the nature of imaging data and the different strategies by which the data may be analyzed. Particular attention is paid to the assumptions of the data analysis routines to ensure that the reader is apprised of their correct usage in imaging mass spectrometry research. Copyright © 2012 Elsevier B.V. All rights reserved.
Langmuir waveforms at interplanetary shocks: STEREO statistical analysis
Briand, C.
2016-12-01
Wave-particle interactions and particle acceleration are the two main processes allowing energy dissipation at collisionless shocks. Ion acceleration has been studied in depth for many years, partly because of its central role in shock front reformation. Electron dynamics also matters for shock dynamics, because the instabilities that electrons generate may affect the ion dynamics. Particle measurements can be usefully complemented by wave measurements to determine the characteristics of the electron beams and study the turbulence of the medium. Electric waveforms obtained from the S/WAVES instrument of the STEREO mission from 2007 to 2014 are analyzed. Clear signatures of Langmuir waves are observed at 41 interplanetary shocks. These data enable a statistical analysis and allow some characteristics of the electron dynamics to be deduced for different shock sources (SIR or ICME) and types (quasi-perpendicular or quasi-parallel). The conversion process from electrostatic to electromagnetic waves has also been tested in several cases.
Statistical Methods for Environmental Pollution Monitoring
Energy Technology Data Exchange (ETDEWEB)
Gilbert, Richard O. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States)
1987-01-01
The application of statistics to environmental pollution monitoring studies requires a knowledge of statistical analysis methods particularly well suited to pollution data. This book fills that need by providing sampling plans, statistical tests, parameter estimation procedures, and references to pertinent publications. Most of the statistical techniques are relatively simple, and examples, exercises, and case studies are provided to illustrate procedures. The book is logically divided into three parts. Chapters 1, 2, and 3 are introductory chapters. Chapters 4 through 10 discuss field sampling designs and Chapters 11 through 18 deal with a broad range of statistical analysis procedures. Some statistical techniques given here are not commonly seen in statistics books. For example, see methods for handling correlated data (Sections 4.5 and 11.12), for detecting hot spots (Chapter 10), and for estimating a confidence interval for the mean of a lognormal distribution (Section 13.2). Also, Appendix B lists a computer code that estimates and tests for trends over time at one or more monitoring stations using nonparametric methods (Chapters 16 and 17). Unfortunately, some important topics could not be included because of their complexity and the need to limit the length of the book. For example, only brief mention could be made of time series analysis using Box-Jenkins methods and of kriging techniques for estimating spatial and spatial-time patterns of pollution, although multiple references on these topics are provided. Also, no discussion of methods for assessing risks from environmental pollution could be included.
Directory of Open Access Journals (Sweden)
Turk Rolf
2006-04-01
Full Text Available Abstract Background The identification of biologically interesting genes in a temporal expression profiling dataset is challenging and complicated by high levels of experimental noise. Most statistical methods used in the literature do not fully exploit the temporal ordering in the dataset and are not suited to the case where temporal profiles are measured for a number of different biological conditions. We present a statistical test that makes explicit use of the temporal order in the data by fitting polynomial functions to the temporal profile of each gene and for each biological condition. A Hotelling T2-statistic is derived to detect the genes for which the parameters of these polynomials are significantly different from each other. Results We validate the temporal Hotelling T2-test on muscular gene expression data from four mouse strains which were profiled at different ages: dystrophin-, beta-sarcoglycan and gamma-sarcoglycan deficient mice, and wild-type mice. The first three are animal models for different muscular dystrophies. Extensive biological validation shows that the method is capable of finding genes with temporal profiles significantly different across the four strains, as well as identifying potential biomarkers for each form of the disease. The added value of the temporal test compared to an identical test which does not make use of temporal ordering is demonstrated via a simulation study, and through confirmation of the expression profiles from selected genes by quantitative PCR experiments. The proposed method maximises the detection of the biologically interesting genes, whilst minimising false detections. Conclusion The temporal Hotelling T2-test is capable of finding relatively small and robust sets of genes that display different temporal profiles between the conditions of interest. The test is simple, it can be used on gene expression data generated from any experimental design and for any number of conditions, and it
Lehmann, Rüdiger; Lösler, Michael
2017-12-01
Geodetic deformation analysis can be interpreted as a model selection problem. The null model indicates that no deformation has occurred. It is opposed to a number of alternative models, which stipulate different deformation patterns. A common way to select the right model is the usage of a statistical hypothesis test. However, since we have to test a series of deformation patterns, this must be a multiple test. As an alternative solution for the test problem, we propose the p-value approach. Another approach arises from information theory. Here, the Akaike information criterion (AIC) or some alternative is used to select an appropriate model for a given set of observations. Both approaches are discussed and applied to two test scenarios: a synthetic levelling network and the Delft test data set. It is demonstrated that they work but behave differently, sometimes even producing different results. Hypothesis tests are well-established in geodesy, but may suffer from an unfavourable choice of the decision error rates. The multiple test also suffers from statistical dependencies between the test statistics, which are neglected. Both problems are overcome by applying information criteria like AIC.
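As a rough illustration of the information-criterion approach described in this abstract, the sketch below compares a no-deformation (constant height) model with a linear-trend model, using the least-squares form AIC = n ln(RSS/n) + 2k. The observations, the two candidate models, and all function names are assumptions made for illustration; they are not the levelling network or the Delft data set used in the paper.

```python
import math

def aic_least_squares(residuals, k):
    """AIC for a Gaussian least-squares fit: n*ln(RSS/n) + 2k (additive constants dropped)."""
    n = len(residuals)
    rss = sum(r * r for r in residuals)
    return n * math.log(rss / n) + 2 * k

def residuals_constant(y):
    """Residuals of the null model: a single constant height (no deformation)."""
    mean = sum(y) / len(y)
    return [yi - mean for yi in y]

def residuals_line(t, y):
    """Residuals of the alternative model: height changes linearly with epoch."""
    n = len(t)
    tm, ym = sum(t) / n, sum(y) / n
    slope = sum((ti - tm) * (yi - ym) for ti, yi in zip(t, y)) / sum((ti - tm) ** 2 for ti in t)
    intercept = ym - slope * tm
    return [yi - (intercept + slope * ti) for ti, yi in zip(t, y)]

# Hypothetical epochs and observed heights (mm) with a visible trend.
epochs = [0, 1, 2, 3, 4, 5, 6, 7]
heights = [0.1, 0.9, 2.2, 2.8, 4.1, 5.0, 5.9, 7.2]

# k counts the fitted parameters, including the error variance.
aic_null = aic_least_squares(residuals_constant(heights), k=2)
aic_trend = aic_least_squares(residuals_line(epochs, heights), k=3)
best = "trend" if aic_trend < aic_null else "null"
```

The model with the smaller AIC is preferred. No significance level has to be chosen, which is the property the abstract contrasts with hypothesis testing.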
Cleophas, Ton J
2012-01-01
The first part of this title contained all statistical tests relevant to starting clinical investigations, and included tests for continuous and binary data, power, sample size, multiple testing, variability, confounding, interaction, and reliability. The current part 2 of this title reviews methods for handling missing data, manipulated data, multiple confounders, predictions beyond observation, uncertainty of diagnostic tests, and the problems of outliers. Also robust tests, non-linear modeling, goodness of fit testing, Bhattacharya models, item response modeling, superiority testing, variab
Predicting Smoking Status Using Machine Learning Algorithms and Statistical Analysis
Directory of Open Access Journals (Sweden)
Charles Frank
2018-03-01
Full Text Available Smoking has been proven to negatively affect health in a multitude of ways. As of 2009, smoking has been considered the leading cause of preventable morbidity and mortality in the United States, continuing to plague the country’s overall health. This study aims to investigate the viability and effectiveness of some machine learning algorithms for predicting the smoking status of patients based on their blood tests and vital readings results. The analysis of this study is divided into two parts: In part 1, we use One-way ANOVA analysis with SAS tool to show the statistically significant difference in blood test readings between smokers and non-smokers. The results show that the difference in INR, which measures the effectiveness of anticoagulants, was significant in favor of non-smokers which further confirms the health risks associated with smoking. In part 2, we use five machine learning algorithms: Naïve Bayes, MLP, Logistic regression classifier, J48 and Decision Table to predict the smoking status of patients. To compare the effectiveness of these algorithms we use: Precision, Recall, F-measure and Accuracy measures. The results show that the Logistic algorithm outperformed the four other algorithms with Precision, Recall, F-Measure, and Accuracy of 83%, 83.4%, 83.2%, 83.44%, respectively.
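The four evaluation measures named in this abstract can be computed from a binary confusion matrix, as sketched below. The counts are hypothetical, and the abstract does not say how the per-class values were averaged, so this is only the textbook single-class form. The last lines check that the reported F-measure is consistent with the reported precision and recall.

```python
def classification_metrics(tp, fp, fn, tn):
    """Precision, recall, F-measure and accuracy from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)  # harmonic mean
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return precision, recall, f_measure, accuracy

# Hypothetical counts for a smoker / non-smoker classifier.
precision, recall, f_measure, accuracy = classification_metrics(tp=80, fp=20, fn=10, tn=90)

# Consistency check on the figures quoted in the abstract:
# the harmonic mean of 83% and 83.4% should be close to the reported 83.2%.
reported_f = 2 * 0.83 * 0.834 / (0.83 + 0.834)
```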
Graphics and Statistics for Cardiology: Data visualisation for meta-analysis.
Kiran, Amit; Crespillo, Abel Pérez; Rahimi, Kazem
2017-01-01
Graphical displays play a pivotal role in understanding data sets and disseminating results. For meta-analysis, they are instrumental in presenting findings from multiple studies. This report presents guidance to authors wishing to submit graphical displays as part of their meta-analysis to a clinical cardiology journal, such as Heart. When using graphical displays for meta-analysis, we recommend the following: Use a flow diagram to describe the number of studies returned from the initial search, the inclusion/exclusion criteria applied and the final number of studies used in the meta-analysis. Present results from the meta-analysis using a figure that incorporates a forest plot and underlying (tabulated) statistics, including test for heterogeneity. Use displays such as funnel plot (minimum 10 studies) and Galbraith plot to visually present distribution of effect sizes or associations in order to evaluate small-study effects and publication bias. For meta-regression, the bubble plot is a useful display for assessing associations by study-level factors. Final checks on graphs, such as appropriate use of axis scale, line pattern, text size and graph resolution, should always be performed. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://www.bmj.com/company/products-services/rights-and-licensing/.
A Comparison of Several Statistical Tests of Reciprocity of Self-Disclosure.
Dindia, Kathryn
1988-01-01
Reports the results of a study that used several statistical tests of reciprocity of self-disclosure. Finds little evidence for reciprocity of self-disclosure, and concludes that either reciprocity is an illusion, or that different or more sophisticated methods are needed to detect it. (MS)
Applied Behavior Analysis and Statistical Process Control?
Hopkins, B. L.
1995-01-01
Incorporating statistical process control (SPC) methods into applied behavior analysis is discussed. It is claimed that SPC methods would likely reduce applied behavior analysts' intimate contacts with problems and would likely yield poor treatment and research decisions. Cases and data presented by Pfadt and Wheeler (1995) are cited as examples.…
Application of range-test in multiple linear regression analysis in ...
African Journals Online (AJOL)
Application of range-test in multiple linear regression analysis in the presence of outliers is studied in this paper. First, the plot of the explanatory variables (i.e. Administration, Social/Commercial, Economic services and Transfer) on the dependent variable (i.e. GDP) was done to identify the statistical trend over the years.
Testing the statistical isotropy of large scale structure with multipole vectors
International Nuclear Information System (INIS)
Zunckel, Caroline; Huterer, Dragan; Starkman, Glenn D.
2011-01-01
A fundamental assumption in cosmology is that of statistical isotropy - that the Universe, on average, looks the same in every direction in the sky. Statistical isotropy has recently been tested stringently using cosmic microwave background data, leading to intriguing results on large angular scales. Here we apply some of the same techniques used in the cosmic microwave background to the distribution of galaxies on the sky. Using the multipole vector approach, where each multipole in the harmonic decomposition of galaxy density field is described by unit vectors and an amplitude, we lay out the basic formalism of how to reconstruct the multipole vectors and their statistics out of galaxy survey catalogs. We apply the algorithm to synthetic galaxy maps, and study the sensitivity of the multipole vector reconstruction accuracy to the density, depth, sky coverage, and pixelization of galaxy catalog maps.
Vahedi, Shahrum; Farrokhi, Farahman; Gahramani, Farahnaz; Issazadegan, Ali
2012-01-01
Approximately 66-80% of graduate students experience statistics anxiety and some researchers propose that many students identify statistics courses as the most anxiety-inducing courses in their academic curriculums. As such, it is likely that statistics anxiety is, in part, responsible for many students delaying enrollment in these courses for as long as possible. This paper proposes a canonical model by treating academic procrastination (AP) and learning strategies (LS) as predictor variables and statistics anxiety (SA) as the explained variable. A questionnaire survey was used for data collection and 246 female college students participated in this study. To examine the mutually independent relations between procrastination, learning strategies and statistics anxiety variables, a canonical correlation analysis was computed. Findings show that two canonical functions were statistically significant. The set of variables (metacognitive self-regulation, source management, preparing homework, preparing for test and preparing term papers) helped predict changes of statistics anxiety with respect to fearful behavior, Attitude towards math and class, Performance, but not Anxiety. These findings could be used in educational and psychological interventions in the context of statistics anxiety reduction.
Cohn, T.A.; England, J.F.; Berenbrock, C.E.; Mason, R.R.; Stedinger, J.R.; Lamontagne, J.R.
2013-01-01
The Grubbs-Beck test is recommended by the federal guidelines for detection of low outliers in flood flow frequency computation in the United States. This paper presents a generalization of the Grubbs-Beck test for normal data (similar to the Rosner (1983) test; see also Spencer and McCuen (1996)) that can provide a consistent standard for identifying multiple potentially influential low flows. In cases where low outliers have been identified, they can be represented as “less-than” values, and a frequency distribution can be developed using censored-data statistical techniques, such as the Expected Moments Algorithm. This approach can improve the fit of the right-hand tail of a frequency distribution and provide protection from lack-of-fit due to unimportant but potentially influential low flows (PILFs) in a flood series, thus making the flood frequency analysis procedure more robust.
Hayslett, H T
1991-01-01
Statistics covers the basic principles of Statistics. The book starts by tackling the importance and the two kinds of statistics; the presentation of sample data; the definition, illustration and explanation of several measures of location; and the measures of variation. The text then discusses elementary probability, the normal distribution and the normal approximation to the binomial. Testing of statistical hypotheses and tests of hypotheses about the theoretical proportion of successes in a binomial population and about the theoretical mean of a normal population are explained. The text the
Statistical analysis of JET disruptions
International Nuclear Information System (INIS)
Tanga, A.; Johnson, M.F.
1991-07-01
In the operation of JET and of any tokamak many discharges are terminated by a major disruption. The disruptive termination of a discharge is usually an unwanted event which may cause damage to the structure of the vessel. In a reactor disruptions are potentially a very serious problem, hence the importance of studying them and devising methods to avoid disruptions. Statistical information has been collected about the disruptions which have occurred at JET over a long span of operations. The analysis is focused on the operational aspects of the disruptions rather than on the underlying physics. (Author)
Statistical Symbolic Execution with Informed Sampling
Filieri, Antonio; Pasareanu, Corina S.; Visser, Willem; Geldenhuys, Jaco
2014-01-01
Symbolic execution techniques have been proposed recently for the probabilistic analysis of programs. These techniques seek to quantify the likelihood of reaching program events of interest, e.g., assert violations. They have many promising applications but have scalability issues due to high computational demand. To address this challenge, we propose a statistical symbolic execution technique that performs Monte Carlo sampling of the symbolic program paths and uses the obtained information for Bayesian estimation and hypothesis testing with respect to the probability of reaching the target events. To speed up the convergence of the statistical analysis, we propose Informed Sampling, an iterative symbolic execution that first explores the paths that have high statistical significance, prunes them from the state space and guides the execution towards less likely paths. The technique combines Bayesian estimation with a partial exact analysis for the pruned paths, leading to provably improved convergence of the statistical analysis. We have implemented statistical symbolic execution with informed sampling in the Symbolic PathFinder tool. We show experimentally that informed sampling obtains more precise results and converges faster than a purely statistical analysis and may also be more efficient than an exact symbolic analysis. When the latter does not terminate, symbolic execution with informed sampling can give meaningful results under the same time and memory limits.
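The statistical core of this idea, Monte Carlo sampling followed by Bayesian estimation of the probability of reaching a target event, can be sketched in a few lines. This is a simplified stand-in under stated assumptions: it samples concrete inputs of a toy program rather than symbolic paths, it uses a Beta prior (a common conjugate choice that the abstract does not specify), and it omits the pruning step of informed sampling. It does not use Symbolic PathFinder, and `toy_program` and `estimate_event_probability` are hypothetical names.

```python
import random

def estimate_event_probability(program, n_samples, seed=0, prior_a=1.0, prior_b=1.0):
    """Monte Carlo sampling with a Beta(prior_a, prior_b) prior on the hit probability.

    Returns the hit count plus the posterior mean and variance.
    """
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n_samples) if program(rng))
    a = prior_a + hits                 # posterior Beta parameters
    b = prior_b + (n_samples - hits)
    mean = a / (a + b)
    variance = a * b / ((a + b) ** 2 * (a + b + 1))
    return hits, mean, variance

def toy_program(rng):
    """Hypothetical program: the 'assert violation' fires when x + y > 1.5."""
    x, y = rng.random(), rng.random()
    return x + y > 1.5

hits, posterior_mean, posterior_var = estimate_event_probability(toy_program, 20000)
```

For this toy program the exact probability is 0.125 (the area of the triangle where x + y > 1.5 in the unit square), so the posterior mean should settle near that value as the number of samples grows.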
Simulation Experiments in Practice : Statistical Design and Regression Analysis
Kleijnen, J.P.C.
2007-01-01
In practice, simulation analysts often change only one factor at a time, and use graphical analysis of the resulting Input/Output (I/O) data. Statistical theory proves that more information is obtained when applying Design Of Experiments (DOE) and linear regression analysis. Unfortunately, classic
Statistical Reporting Errors and Collaboration on Statistical Analyses in Psychological Science.
Veldkamp, Coosje L S; Nuijten, Michèle B; Dominguez-Alvarez, Linda; van Assen, Marcel A L M; Wicherts, Jelte M
2014-01-01
Statistical analysis is error prone. A best practice for researchers using statistics would therefore be to share data among co-authors, allowing double-checking of executed tasks just as co-pilots do in aviation. To document the extent to which this 'co-piloting' currently occurs in psychology, we surveyed the authors of 697 articles published in six top psychology journals and asked them whether they had collaborated on four aspects of analyzing data and reporting results, and whether the described data had been shared between the authors. We acquired responses for 49.6% of the articles and found that co-piloting on statistical analysis and reporting results is quite uncommon among psychologists, while data sharing among co-authors seems reasonably but not completely standard. We then used an automated procedure to study the prevalence of statistical reporting errors in the articles in our sample and examined the relationship between reporting errors and co-piloting. Overall, 63% of the articles contained at least one p-value that was inconsistent with the reported test statistic and the accompanying degrees of freedom, and 20% of the articles contained at least one p-value that was inconsistent to such a degree that it may have affected decisions about statistical significance. Overall, the probability that a given p-value was inconsistent was over 10%. Co-piloting was not found to be associated with reporting errors.
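The kind of automated consistency check described in this abstract can be sketched as follows: recompute the p-value from the reported test statistic and compare it with the reported p-value. To stay within the standard library, the sketch handles only a two-sided z-test. The study also covered statistics that need degrees of freedom, so this is a simplified stand-in, not the authors' procedure. The rounding rule and function names are assumptions.

```python
import math

def two_sided_p_from_z(z):
    """Two-sided p-value of a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))

def check_reported_p(z, reported_p, decimals=2, alpha=0.05):
    """Compare a reported p-value with the one recomputed from z.

    inconsistent:   the values differ after rounding to `decimals` places
    decision_error: inconsistent, and the two values fall on opposite sides of alpha
    """
    recomputed = two_sided_p_from_z(z)
    inconsistent = round(recomputed, decimals) != round(reported_p, decimals)
    decision_error = inconsistent and ((recomputed < alpha) != (reported_p < alpha))
    return recomputed, inconsistent, decision_error

# Hypothetical reported results.
consistent = check_reported_p(1.96, 0.05)  # recomputed p is about 0.050
gross = check_reported_p(1.70, 0.04)       # recomputed p is about 0.089
minor = check_reported_p(2.50, 0.02)       # recomputed p is about 0.012
```

The second case corresponds to an inconsistency that may affect the decision about statistical significance. The third is inconsistent but leaves the decision unchanged.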
Statistical characteristics of mechanical heart valve cavitation in accelerated testing.
Wu, Changfu; Hwang, Ned H C; Lin, Yu-Kweng M
2004-07-01
Cavitation damage has been observed on mechanical heart valves (MHVs) undergoing accelerated testing. Cavitation itself can be modeled as a stochastic process, as it varies from beat to beat of the testing machine. This in-vitro study was undertaken to investigate the statistical characteristics of MHV cavitation. A 25-mm St. Jude Medical bileaflet MHV (SJM 25) was tested in an accelerated tester at various pulse rates, ranging from 300 to 1,000 bpm, with stepwise increments of 100 bpm. A miniature pressure transducer was placed near a leaflet tip on the inflow side of the valve, to monitor regional transient pressure fluctuations at instants of valve closure. The pressure trace associated with each beat was passed through a 70 kHz high-pass digital filter to extract the high-frequency oscillation (HFO) components resulting from the collapse of cavitation bubbles. Three intensity-related measures were calculated for each HFO burst: its time span; its local root-mean-square (LRMS) value; and the area enveloped by the absolute value of the HFO pressure trace and the time axis, referred to as cavitation impulse. These were treated as stochastic processes, of which the first-order probability density functions (PDFs) were estimated for each test rate. Both the LRMS value and cavitation impulse were log-normal distributed, and the time span was normal distributed. These distribution laws were consistent at different test rates. The present investigation was directed at understanding MHV cavitation as a stochastic process. The results provide a basis for establishing further the statistical relationship between cavitation intensity and time-evolving cavitation damage on MHV surfaces. These data are required to assess and compare the performance of MHVs of different designs.
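The three intensity measures defined in this abstract (time span, local root-mean-square value, and cavitation impulse) can be computed from a sampled, already high-pass-filtered pressure trace, as sketched below. The abstract does not say how the start and end of a burst were determined, so the amplitude threshold used here is an assumption, as are the sample values and the function name.

```python
import math

def burst_measures(pressure, dt, threshold):
    """Time span, local RMS, and cavitation impulse of one HFO burst.

    The burst is taken as the samples between the first and last crossing of
    `threshold` in absolute value (an assumed delimiting rule).
    """
    active = [i for i, p in enumerate(pressure) if abs(p) > threshold]
    if not active:
        return None
    start, end = active[0], active[-1]
    burst = pressure[start:end + 1]
    time_span = (end - start) * dt
    lrms = math.sqrt(sum(p * p for p in burst) / len(burst))
    # Impulse: trapezoidal area between |p(t)| and the time axis.
    impulse = sum((abs(burst[i]) + abs(burst[i + 1])) / 2 * dt
                  for i in range(len(burst) - 1))
    return time_span, lrms, impulse

# Hypothetical filtered trace sampled every microsecond (arbitrary pressure units).
trace = [0.0, 0.0, 3.0, -4.0, 3.0, -4.0, 0.0, 0.0]
time_span, lrms, impulse = burst_measures(trace, dt=1e-6, threshold=1.0)
```

Repeating this per beat yields one sample of each measure per beat, from which the empirical distributions discussed in the abstract could be estimated.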
Large-eddy simulation in a mixing tee junction: High-order turbulent statistics analysis
International Nuclear Information System (INIS)
Howard, Richard J.A.; Serre, Eric
2015-01-01
Highlights: • Mixing and thermal fluctuations in a junction are studied using large eddy simulation. • Adiabatic and conducting steel wall boundaries are tested. • Wall thermal fluctuations are not the same between the flow and the solid. • Solid thermal fluctuations cannot be predicted from the fluid thermal fluctuations. • High-order turbulent statistics show that the turbulent transport term is important. - Abstract: This study analyses the mixing and thermal fluctuations induced in a mixing tee junction with circular cross-sections when cold water flowing in a pipe is joined by hot water from a branch pipe. This configuration is representative of industrial piping systems in which temperature fluctuations in the fluid may cause thermal fatigue damage on the walls. Implicit large-eddy simulations (LES) are performed for equal inflow rates corresponding to a bulk Reynolds number Re = 39,080. Two different thermal boundary conditions are studied for the pipe walls; an insulating adiabatic boundary and a conducting steel wall boundary. The predicted flow structures show a satisfactory agreement with the literature. The velocity and thermal fields (including high-order statistics) are not affected by the heat transfer with the steel walls. However, predicted thermal fluctuations at the boundary are not the same between the flow and the solid, showing that solid thermal fluctuations cannot be predicted by the knowledge of the fluid thermal fluctuations alone. The analysis of high-order turbulent statistics provides a better understanding of the turbulence features. In particular, the budgets of the turbulent kinetic energy and temperature variance allow a comparative analysis of dissipation, production and transport terms. It is found that the turbulent transport term is an important term that acts to balance the production. We therefore use a priori tests to evaluate three different models for the triple correlation
International Nuclear Information System (INIS)
Steffen, Jason H.; Ford, Eric B.; Rowe, Jason F.; Borucki, William J.; Bryson, Steve; Caldwell, Douglas A.; Jenkins, Jon M.; Koch, David G.; Sanderfer, Dwight T.; Seader, Shawn; Twicken, Joseph D.; Fabrycky, Daniel C.; Holman, Matthew J.; Welsh, William F.; Batalha, Natalie M.; Ciardi, David R.; Kjeldsen, Hans; Prša, Andrej
2012-01-01
We analyze the deviations of transit times from a linear ephemeris for the Kepler Objects of Interest (KOI) through quarter six of science data. We conduct two statistical tests for all KOIs and a related statistical test for all pairs of KOIs in multi-transiting systems. These tests identify several systems which show potentially interesting transit timing variations (TTVs). Strong TTV systems have been valuable for the confirmation of planets and their mass measurements. Many of the systems identified in this study should prove fruitful for detailed TTV studies.
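The quantity analyzed in this abstract, the deviation of transit times from a linear ephemeris, can be sketched as the residuals of a straight-line least-squares fit of transit time against transit number. The synthetic transit times, the sinusoidal perturbation, and the function name below are assumptions for illustration. The statistical tests applied to the residuals in the paper are not reproduced.

```python
import math

def linear_ephemeris_residuals(epochs, transit_times):
    """Fit t_n = t0 + period * n by least squares and return (t0, period, residuals)."""
    n = len(epochs)
    e_mean = sum(epochs) / n
    t_mean = sum(transit_times) / n
    period = (sum((e - e_mean) * (t - t_mean) for e, t in zip(epochs, transit_times))
              / sum((e - e_mean) ** 2 for e in epochs))
    t0 = t_mean - period * e_mean
    residuals = [t - (t0 + period * e) for e, t in zip(epochs, transit_times)]
    return t0, period, residuals

# Hypothetical planet: 10-day period with a 0.01-day sinusoidal timing variation.
epochs = list(range(12))
times = [100.0 + 10.0 * e + 0.01 * math.sin(2 * math.pi * e / 6) for e in epochs]

t0, period, ttv = linear_ephemeris_residuals(epochs, times)
max_ttv = max(abs(r) for r in ttv)
```

A system whose residuals are large compared with the timing uncertainties would be a candidate for the detailed TTV studies mentioned in the abstract.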
DEFF Research Database (Denmark)
Steffen, J.H.; Ford, E.B.; Rowe, J.F.
2012-01-01
We analyze the deviations of transit times from a linear ephemeris for the Kepler Objects of Interest (KOI) through quarter six of science data. We conduct two statistical tests for all KOIs and a related statistical test for all pairs of KOIs in multi-transiting systems. These tests identify...... several systems which show potentially interesting transit timing variations (TTVs). Strong TTV systems have been valuable for the confirmation of planets and their mass measurements. Many of the systems identified in this study should prove fruitful for detailed TTV studies....
Statistical analysis of the Ft. Calhoun reactor coolant pump system
International Nuclear Information System (INIS)
Patel, Bimal; Heising, C.D.
1997-01-01
In engineering science, statistical quality control techniques have traditionally been applied to control manufacturing processes. An application to commercial nuclear power plant maintenance and control is presented that can greatly improve plant safety. As a demonstration of such an approach, a specific system is analyzed: the reactor coolant pumps (RCPs) of the Ft. Calhoun nuclear power plant. This research uses capability analysis, Shewhart X-bar, R charts, canonical correlation methods, and design of experiments to analyze the process for the state of statistical control. The results obtained show that six out of ten parameters are under control specification limits and four parameters are not in the state of statistical control. The analysis shows that statistical process control methods can be applied as an early warning system capable of identifying significant equipment problems well in advance of traditional control room alarm indicators. Such a system would provide operators with ample time to respond to possible emergency situations and thus improve plant safety and reliability. (Author)
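As a minimal sketch of the Shewhart X-bar and R charts mentioned in this abstract, the code below derives control limits from baseline subgroups and flags later subgroups whose mean falls outside the X-bar limits. The constants A2, D3 and D4 are the standard tabulated values for subgroups of size 5. The measurements and function names are hypothetical and are not data from the Ft. Calhoun pumps.

```python
# Standard control-chart constants for subgroup size n = 5.
A2, D3, D4 = 0.577, 0.0, 2.114

def xbar_r_limits(subgroups):
    """(lower, centre, upper) limits for the X-bar and R charts from baseline subgroups."""
    means = [sum(g) / len(g) for g in subgroups]
    ranges = [max(g) - min(g) for g in subgroups]
    grand_mean = sum(means) / len(means)
    mean_range = sum(ranges) / len(ranges)
    return {
        "xbar": (grand_mean - A2 * mean_range, grand_mean, grand_mean + A2 * mean_range),
        "r": (D3 * mean_range, mean_range, D4 * mean_range),
    }

def out_of_control(subgroups, limits):
    """Indices of subgroups whose mean lies outside the X-bar control limits."""
    lower, _, upper = limits["xbar"]
    return [i for i, g in enumerate(subgroups) if not lower <= sum(g) / len(g) <= upper]

# Hypothetical baseline readings of one pump parameter (5 readings per subgroup).
baseline = [
    [10, 11, 9, 10, 10],
    [10, 10, 11, 9, 10],
    [9, 10, 10, 11, 10],
    [11, 10, 9, 10, 10],
]
limits = xbar_r_limits(baseline)

# Later subgroups to monitor; the second one has drifted upward.
monitored = [[10, 10, 11, 10, 9], [12, 13, 12, 11, 12]]
alarms = out_of_control(monitored, limits)
```

A flagged subgroup plays the role of the early warning described in the abstract: the process mean has shifted even though no single reading need have crossed an alarm threshold.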
International Nuclear Information System (INIS)
Park, Ji Eun; Sung, Yu Sub; Han, Kyung Hwa
2017-01-01
To evaluate the frequency and adequacy of statistical analyses in a general radiology journal when reporting a reliability analysis for a diagnostic test. Sixty-three studies of diagnostic test accuracy (DTA) and 36 studies reporting reliability analyses published in the Korean Journal of Radiology between 2012 and 2016 were analyzed. Studies were judged using the methodological guidelines of the Radiological Society of North America-Quantitative Imaging Biomarkers Alliance (RSNA-QIBA), and COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) initiative. DTA studies were evaluated by nine editorial board members of the journal. Reliability studies were evaluated by study reviewers experienced with reliability analysis. Thirty-one (49.2%) of the 63 DTA studies did not include a reliability analysis when deemed necessary. Among the 36 reliability studies, proper statistical methods were used in all (5/5) studies dealing with dichotomous/nominal data, 46.7% (7/15) of studies dealing with ordinal data, and 95.2% (20/21) of studies dealing with continuous data. Statistical methods were described in sufficient detail regarding weighted kappa in 28.6% (2/7) of studies and regarding the model and assumptions of intraclass correlation coefficient in 35.3% (6/17) and 29.4% (5/17) of studies, respectively. Reliability parameters were used as if they were agreement parameters in 23.1% (3/13) of studies. Reproducibility and repeatability were used incorrectly in 20% (3/15) of studies. Greater attention to the importance of reporting reliability, thorough description of the related statistical methods, efforts not to neglect agreement parameters, and better use of relevant terminology is necessary
Energy Technology Data Exchange (ETDEWEB)
Park, Ji Eun; Sung, Yu Sub [Dept. of Radiology and Research Institute of Radiology, University of Ulsan College of Medicine, Asan Medical Center, Seoul (Korea, Republic of); Han, Kyung Hwa [Dept. of Radiology, Research Institute of Radiological Science, Yonsei University College of Medicine, Seoul (Korea, Republic of); and others
2017-11-15
To evaluate the frequency and adequacy of statistical analyses in a general radiology journal when reporting a reliability analysis for a diagnostic test. Sixty-three studies of diagnostic test accuracy (DTA) and 36 studies reporting reliability analyses published in the Korean Journal of Radiology between 2012 and 2016 were analyzed. Studies were judged using the methodological guidelines of the Radiological Society of North America-Quantitative Imaging Biomarkers Alliance (RSNA-QIBA), and COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) initiative. DTA studies were evaluated by nine editorial board members of the journal. Reliability studies were evaluated by study reviewers experienced with reliability analysis. Thirty-one (49.2%) of the 63 DTA studies did not include a reliability analysis when deemed necessary. Among the 36 reliability studies, proper statistical methods were used in all (5/5) studies dealing with dichotomous/nominal data, 46.7% (7/15) of studies dealing with ordinal data, and 95.2% (20/21) of studies dealing with continuous data. Statistical methods were described in sufficient detail regarding weighted kappa in 28.6% (2/7) of studies and regarding the model and assumptions of intraclass correlation coefficient in 35.3% (6/17) and 29.4% (5/17) of studies, respectively. Reliability parameters were used as if they were agreement parameters in 23.1% (3/13) of studies. Reproducibility and repeatability were used incorrectly in 20% (3/15) of studies. Greater attention to the importance of reporting reliability, thorough description of the related statistical methods, efforts not to neglect agreement parameters, and better use of relevant terminology are necessary.
Price limits and stock market efficiency: Evidence from rolling bicorrelation test statistic
International Nuclear Information System (INIS)
Lim, Kian-Ping; Brooks, Robert D.
2009-01-01
Using the rolling bicorrelation test statistic, the present paper compares the efficiency of stock markets from China, Korea and Taiwan in selected sub-periods with different price limits regimes. The statistical results do not support the claims that restrictive price limits and price limits per se are jeopardizing market efficiency. However, the evidence does not imply that price limits have no effect on the price discovery process; rather, it suggests that market efficiency is not determined by price limits alone.
Harrigan, George G; Harrison, Jay M
2012-01-01
New transgenic (GM) crops are subjected to extensive safety assessments that include compositional comparisons with conventional counterparts as a cornerstone of the process. The influence of germplasm, location, environment, and agronomic treatments on compositional variability is, however, often obscured in these pair-wise comparisons. Furthermore, classical statistical significance testing can often provide an incomplete and over-simplified summary of highly responsive variables such as crop composition. In order to more clearly describe the influence of the numerous sources of compositional variation we present an introduction to two alternative but complementary approaches to data analysis and interpretation. These include i) exploratory data analysis (EDA) with its emphasis on visualization and graphics-based approaches and ii) Bayesian statistical methodology that provides easily interpretable and meaningful evaluations of data in terms of probability distributions. The EDA case-studies include analyses of herbicide-tolerant GM soybean and insect-protected GM maize and soybean. Bayesian approaches are presented in an analysis of herbicide-tolerant GM soybean. Advantages of these approaches over classical frequentist significance testing include the more direct interpretation of results in terms of probabilities pertaining to quantities of interest and no confusion over the application of corrections for multiple comparisons. It is concluded that a standardized framework for these methodologies could provide specific advantages through enhanced clarity of presentation and interpretation in comparative assessments of crop composition.
Research and Development of Statistical Analysis Software System of Maize Seedling Experiment
Hui Cao
2014-01-01
In this study, software engineering methods were used to develop a software system for the statistics and analysis of maize seedling experiments. During development, a B/S structure software design method was used and a set of statistical indicators for maize seedling evaluation was established. The experimental results indicated that this software system could perform quality statistics and analysis for maize seedlings very well. The development of this software system explored a...
A reanalysis of Lord's statistical treatment of football numbers
Zand Scholten, A.; Borsboom, D.
2009-01-01
Stevens’ theory of admissible statistics [Stevens, S. S. (1946). On the theory of scales of measurement. Science, 103, 677-680] states that measurement levels should guide the choice of statistical test, such that the truth value of statements based on a statistical analysis remains invariant under
A Statistical Perspective on Highly Accelerated Testing
Energy Technology Data Exchange (ETDEWEB)
Thomas, Edward V. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
2015-02-01
Highly accelerated life testing has been heavily promoted at Sandia (and elsewhere) as a means to rapidly identify product weaknesses caused by flaws in the product's design or manufacturing process. During product development, a small number of units are forced to fail at high stress. The failed units are then examined to determine the root causes of failure. The identification of the root causes of product failures exposed by highly accelerated life testing can instigate changes to the product's design and/or manufacturing process that result in a product with increased reliability. It is widely viewed that this qualitative use of highly accelerated life testing (often associated with the acronym HALT) can be useful. However, highly accelerated life testing has also been proposed as a quantitative means for "demonstrating" the reliability of a product where unreliability is associated with loss of margin via an identified and dominating failure mechanism. It is assumed that the dominant failure mechanism can be accelerated by changing the level of a stress factor that is assumed to be related to the dominant failure mode. In extreme cases, a minimal number of units (often from a pre-production lot) are subjected to a single highly accelerated stress relative to normal use. If no (or, sufficiently few) units fail at this high stress level, some might claim that a certain level of reliability has been demonstrated (relative to normal use conditions). Underlying this claim are assumptions regarding the level of knowledge associated with the relationship between the stress level and the probability of failure. The primary purpose of this document is to discuss (from a statistical perspective) the efficacy of using accelerated life testing protocols (and, in particular, "highly accelerated" protocols) to make quantitative inferences concerning the performance of a product (e.g., reliability) when in fact there is lack-of-knowledge and uncertainty concerning
Statistical analysis on failure-to-open/close probability of motor-operated valve in sodium system
International Nuclear Information System (INIS)
Kurisaka, Kenichi
1998-08-01
The objective of this work is to develop basic data for examining the efficiency of preventive maintenance and actuation tests from the standpoint of failure probability. The work consists of a statistical trend analysis of valve failure probability in the failure-to-open/close mode as a function of time since installation and time since last open/close action, based on field data of operating and failure experience. Both time-dependent and time-independent terms were considered in the failure probability. The linear aging model was modified and applied to the time-dependent term. In this model there are two terms, with failure rates proportional to time since installation and to time since last open/close demand, respectively. Because of their sufficient statistical population, motor-operated valves (MOVs) in sodium systems were selected for analysis from the CORDS database, which contains operating and failure data of components in fast reactors and sodium test facilities. From these data, the functional parameters were statistically estimated to quantify the valve failure probability in the failure-to-open/close mode, with consideration of uncertainty. (J.P.N.)
Nonparametric statistics with applications to science and engineering
Kvam, Paul H
2007-01-01
A thorough and definitive book that fully addresses traditional and modern-day topics of nonparametric statistics. This book presents a practical approach to nonparametric statistical analysis and provides comprehensive coverage of both established and newly developed methods. With the use of MATLAB, the authors present information on theorems and rank tests in an applied fashion, with an emphasis on modern methods in regression and curve fitting, bootstrap confidence intervals, splines, wavelets, empirical likelihood, and goodness-of-fit testing. Nonparametric Statistics with Applications to Science and Engineering begins with succinct coverage of basic results for order statistics, methods of categorical data analysis, nonparametric regression, and curve fitting methods. The authors then focus on nonparametric procedures that are becoming more relevant to engineering researchers and practitioners. The important fundamental materials needed to effectively learn and apply the discussed methods are also provide...
Common misconceptions about data analysis and statistics.
Motulsky, Harvey J
2014-11-01
Ideally, any experienced investigator with the right tools should be able to reproduce a finding published in a peer-reviewed biomedical science journal. In fact, the reproducibility of a large percentage of published findings has been questioned. Undoubtedly, there are many reasons for this, but one reason may be that investigators fool themselves due to a poor understanding of statistical concepts. In particular, investigators often make these mistakes: 1. P-Hacking. This is when you reanalyze a data set in many different ways, or perhaps reanalyze with additional replicates, until you get the result you want. 2. Overemphasis on P values rather than on the actual size of the observed effect. 3. Overuse of statistical hypothesis testing, and being seduced by the word "significant". 4. Overreliance on standard errors, which are often misunderstood.
Common misconceptions about data analysis and statistics.
Motulsky, Harvey J
2015-02-01
Ideally, any experienced investigator with the right tools should be able to reproduce a finding published in a peer-reviewed biomedical science journal. In fact, the reproducibility of a large percentage of published findings has been questioned. Undoubtedly, there are many reasons for this, but one reason may be that investigators fool themselves due to a poor understanding of statistical concepts. In particular, investigators often make these mistakes: (1) P-Hacking. This is when you reanalyze a data set in many different ways, or perhaps reanalyze with additional replicates, until you get the result you want. (2) Overemphasis on P values rather than on the actual size of the observed effect. (3) Overuse of statistical hypothesis testing, and being seduced by the word "significant". (4) Overreliance on standard errors, which are often misunderstood.
A testing procedure for wind turbine generators based on the power grid statistical model
DEFF Research Database (Denmark)
Farajzadehbibalan, Saber; Ramezani, Mohammad Hossein; Nielsen, Peter
2017-01-01
In this study, a comprehensive test procedure is developed to test wind turbine generators with a hardware-in-loop setup. The procedure employs the statistical model of the power grid considering the restrictions of the test facility and system dynamics. Given the model in the latent space...
A comparative analysis of the statistical properties of large mobile phone calling networks.
Li, Ming-Xia; Jiang, Zhi-Qiang; Xie, Wen-Jie; Miccichè, Salvatore; Tumminello, Michele; Zhou, Wei-Xing; Mantegna, Rosario N
2014-05-30
Mobile phone calling is one of the most widely used communication methods in modern society. The records of calls among mobile phone users provide a valuable proxy for understanding the human communication patterns embedded in social networks. Mobile phone users call each other, forming a directed calling network. If only reciprocal calls are considered, we obtain an undirected mutual calling network. The preferential communication behavior between two connected users can be statistically tested, which results in two Bonferroni networks with statistically validated edges. We perform a comparative analysis of the statistical properties of these four networks, which are constructed from the calling records of more than nine million individuals in Shanghai over a period of 110 days. We find that these networks share many common structural properties and also exhibit idiosyncratic features when compared with previously studied large mobile calling networks. The empirical findings provide an intriguing picture of a representative large social network that might shed new light on the modelling of large social networks.
van Krimpen-Stoop, Edith M. L. A.; Meijer, Rob R.
Person-fit research in the context of paper-and-pencil tests is reviewed, and some specific problems regarding person fit in the context of computerized adaptive testing (CAT) are discussed. Some new methods are proposed to investigate person fit in a CAT environment. These statistics are based on Statistical Process Control (SPC) theory. A…
Analysis of Statistical Methods Currently used in Toxicology Journals.
Na, Jihye; Yang, Hyeri; Bae, SeungJin; Lim, Kyung-Min
2014-09-01
Statistical methods are frequently used in toxicology, yet it is not clear whether the methods employed by the studies are used consistently and conducted based on sound statistical grounds. The purpose of this paper is to describe statistical methods used in top toxicology journals. More specifically, we sampled 30 papers published in 2014 from Toxicology and Applied Pharmacology, Archives of Toxicology, and Toxicological Science and described methodologies used to provide descriptive and inferential statistics. One hundred thirteen endpoints were observed in those 30 papers, and most studies had sample sizes of less than 10, with the median and the mode being 6 and 3 & 6, respectively. Mean (105/113, 93%) was dominantly used to measure central tendency, and standard error of the mean (64/113, 57%) and standard deviation (39/113, 34%) were used to measure dispersion, while few studies justified why the methods were selected. Inferential statistics were frequently conducted (93/113, 82%), with one-way ANOVA being most popular (52/93, 56%), yet few studies conducted either normality or equal variance tests. These results suggest that more consistent and appropriate use of statistical methods is necessary, which may enhance the role of toxicology in public health.
StOCNET : Software for the statistical analysis of social networks
Huisman, M.; van Duijn, M.A.J.
2003-01-01
StOCNET is an open software system in a Windows environment for the advanced statistical analysis of social networks. It provides a platform to make a number of recently developed and therefore not (yet) standard statistical methods available to a wider audience. A flexible user interface utilizing
AutoBayes: A System for Generating Data Analysis Programs from Statistical Models
Fischer, Bernd; Schumann, Johann
2003-01-01
Data analysis is an important scientific task which is required whenever information needs to be extracted from raw data. Statistical approaches to data analysis, which use methods from probability theory and numerical analysis, are well-founded but difficult to implement: the development of a statistical data analysis program for any given application is time-consuming and requires substantial knowledge and experience in several areas. In this paper, we describe AutoBayes, a program synthesis...
Statistical analysis of random duration times
International Nuclear Information System (INIS)
Engelhardt, M.E.
1996-04-01
This report presents basic statistical methods for analyzing data obtained by observing random time durations. It gives nonparametric estimates of the cumulative distribution function, reliability function and cumulative hazard function. These results can be applied with either complete or censored data. Several models which are commonly used with time data are discussed, along with methods for model checking and goodness-of-fit tests. Maximum likelihood estimates and confidence limits are given for the various models considered. Some results for situations involving repeated durations, such as repairable systems, are also discussed.
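As an illustration of a nonparametric reliability estimate that works with censored durations, the sketch below implements the Kaplan-Meier product-limit estimator. The report does not say which estimator it uses, so this choice, the function name, and the sample data are assumptions made for illustration.

```python
def kaplan_meier(durations, observed):
    """Kaplan-Meier estimate of the reliability (survival) function.

    durations: observed times (failure or censoring times).
    observed:  True if the failure was observed, False if censored.
    Returns a list of (time, S(time)) pairs at each distinct failure time.
    """
    data = sorted(zip(durations, observed))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        failures = 0
        leaving = 0
        # Group all units that share the same time t.
        while i < len(data) and data[i][0] == t:
            failures += 1 if data[i][1] else 0
            leaving += 1
            i += 1
        if failures:
            # Units censored at t are still counted as at risk at t.
            survival *= 1 - failures / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical durations; the unit censored at t=3 and the one at t=8 never failed.
curve = kaplan_meier([2, 3, 3, 5, 8], [True, True, False, True, False])
```

With complete (uncensored) data the same function reduces to one minus the empirical cumulative distribution function evaluated at each failure time.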
Reliability Analysis and Test Planning using CAPO-Test for Existing Structures
DEFF Research Database (Denmark)
Sørensen, John Dalsgaard; Engelund, S.; Faber, Michael Havbro
2000-01-01
Evaluation of the reliability of existing concrete structures often requires that the compressive strength of the concrete is estimated on the basis of tests performed with concrete samples from the structure considered. In this paper the CAPO-test method is considered. The different sources of uncertainty related to this method are described. It is shown how the uncertainty in the transformation from the CAPO-test results to estimates of the concrete strength can be modeled. Further, the statistical uncertainty is modeled using Bayesian statistics. Finally, it is shown how reliability-based optimal planning of CAPO-tests can be performed taking into account the expected costs due to the CAPO-tests and possible repair or failure of the structure considered. An illustrative example is presented where the CAPO-test is compared with conventional concrete cylinder compression tests performed on cores...
Network similarity and statistical analysis of earthquake seismic data
Deyasi, Krishanu; Chakraborty, Abhijit; Banerjee, Anirban
2016-01-01
We study the structural similarity of earthquake networks constructed from seismic catalogs of different geographical regions. A hierarchical clustering of underlying undirected earthquake networks is shown using Jensen-Shannon divergence in graph spectra. The directed nature of links indicates that each earthquake network is strongly connected, which motivates us to study the directed version statistically. Our statistical analysis of each earthquake region identifies the hub regions. We cal...
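The Jensen-Shannon divergence mentioned in this abstract can be sketched for two discrete distributions as below. How the authors turn graph spectra into distributions is not stated in the truncated abstract; the sketch simply assumes each spectrum has already been binned into a normalized histogram over the same bins.

```python
from math import log2


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions.

    p and q are assumed to be normalized histograms over the same bins
    (for example, binned spectral densities of two networks).
    The result is symmetric and lies in [0, 1] when measured in bits.
    """
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
```

A matrix of pairwise `js_divergence` values between regions could then serve as the dissimilarity input to a hierarchical clustering routine.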
Ueno, Tamio; Matuda, Junichi; Yamane, Nobuhisa
2013-03-01
To evaluate the occurrence of out-of-acceptable-range results and the accuracy of antimicrobial susceptibility tests, we applied a new statistical tool to the Inter-Laboratory Quality Control Program established by the Kyushu Quality Control Research Group. First, we defined acceptable ranges of minimum inhibitory concentration (MIC) for broth microdilution tests and inhibitory zone diameter for disk diffusion tests on the basis of Clinical and Laboratory Standards Institute (CLSI) M100-S21. In the analysis, more than two out-of-acceptable-range results in the 20 tests were considered as not allowable according to the CLSI document. Of the 90 participating laboratories, 46 (51%) experienced one or more occurrences of out-of-acceptable-range results. Then, a binomial test was applied to each participating laboratory. The results indicated that the occurrences of out-of-acceptable-range results in 11 laboratories were significantly higher when compared to the CLSI recommendation (allowable rate laboratory was statistically compared with zero using a Student's t-test. The results revealed that 5 of the 11 above laboratories reported erroneous test results that systematically drifted to the side of resistance. In conclusion, our statistical approach has enabled us to detect significantly higher occurrences and sources of interpretive errors in antimicrobial susceptibility tests; therefore, this approach can provide us with additional information that can improve the accuracy of the test results in clinical microbiology laboratories.
Sensometrics: Thurstonian and Statistical Models
DEFF Research Database (Denmark)
Christensen, Rune Haubo Bojesen
. sensR is a package for sensory discrimination testing with Thurstonian models and ordinal supports analysis of ordinal data with cumulative link (mixed) models. While sensR is closely connected to the sensometrics field, the ordinal package has developed into a generic statistical package applicable......This thesis is concerned with the development and bridging of Thurstonian and statistical models for sensory discrimination testing as applied in the scientific discipline of sensometrics. In sensory discrimination testing sensory differences between products are detected and quantified by the use...... and sensory discrimination testing in particular in a series of papers by advancing Thurstonian models for a range of sensory discrimination protocols in addition to facilitating their application by providing software for fitting these models. The main focus is on identifying Thurstonian models...
An Application of Multivariate Statistical Analysis for Query-Driven Visualization
Energy Technology Data Exchange (ETDEWEB)
Gosink, Luke J. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Garth, Christoph [Univ. of California, Davis, CA (United States); Anderson, John C. [Univ. of California, Davis, CA (United States); Bethel, E. Wes [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Joy, Kenneth I. [Univ. of California, Davis, CA (United States)
2011-03-01
Driven by the ability to generate ever-larger, increasingly complex data, there is an urgent need in the scientific community for scalable analysis methods that can rapidly identify salient trends in scientific data. Query-Driven Visualization (QDV) strategies are among the small subset of techniques that can address both large and highly complex datasets. This paper extends the utility of QDV strategies with a statistics-based framework that integrates non-parametric distribution estimation techniques with a new segmentation strategy to visually identify statistically significant trends and features within the solution space of a query. In this framework, query distribution estimates help users to interactively explore their query's solution and visually identify the regions where the combined behavior of constrained variables is most important, statistically, to their inquiry. Our new segmentation strategy extends the distribution estimation analysis by visually conveying the individual importance of each variable to these regions of high statistical significance. We demonstrate the analysis benefits these two strategies provide and show how they may be used to facilitate the refinement of constraints over variables expressed in a user's query. We apply our method to datasets from two different scientific domains to demonstrate its broad applicability.
Hendricks, Lorin; Spencer Guthrie, W.; Mazzeo, Brian
2018-04-01
An automated acoustic impact-echo testing device with seven channels has been developed for faster surveying of bridge decks. Due to potential variations in bridge deck overlay thickness, varying conditions between testing passes, and occasional imprecise equipment calibrations, a method that can account for variations in deck properties and testing conditions was necessary to correctly interpret the acoustic data. A new methodology involving statistical analyses was therefore developed. After acoustic impact-echo data are collected and analyzed, the results are normalized by the median for each channel, a Gaussian distribution is fit to the histogram of the data, and the Kullback-Leibler divergence test or Otsu's method is then used to determine the optimum threshold for differentiating between intact and delaminated concrete. The new methodology was successfully applied to individual channels of previously unusable acoustic impact-echo data obtained from a three-lane interstate bridge deck surfaced with a polymer overlay, and the resulting delamination map compared very favorably with the results of a manual deck sounding survey.
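Two of the steps described above, median normalization per channel and threshold selection with Otsu's method, can be sketched as follows. This is an illustrative sketch only: the Gaussian fit and the Kullback-Leibler alternative are omitted, and the function names, bin count, and sample values are assumptions rather than details from the paper.

```python
from statistics import median


def normalize_by_median(values):
    """Divide one channel's results by that channel's median."""
    m = median(values)
    return [v / m for v in values]


def otsu_threshold(values, bins=64):
    """Pick the threshold that maximizes between-class variance (Otsu's method)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are identical; no threshold exists")
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        counts[min(int((v - lo) / width), bins - 1)] += 1
    centers = [lo + (i + 0.5) * width for i in range(bins)]
    total = len(values)
    sum_all = sum(c * x for c, x in zip(counts, centers))

    best_score, best_threshold = -1.0, lo
    w0, sum0 = 0, 0.0
    for i in range(bins - 1):
        w0 += counts[i]
        sum0 += counts[i] * centers[i]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        mu0, mu1 = sum0 / w0, (sum_all - sum0) / w1
        score = w0 * w1 * (mu0 - mu1) ** 2  # proportional to between-class variance
        if score > best_score:
            best_score, best_threshold = score, lo + (i + 1) * width
    return best_threshold


# Hypothetical channel data: most points near 1.0 (intact), a few near 3.0.
channel = [0.9, 1.0, 1.1] * 10 + [2.9, 3.0, 3.1] * 3
threshold = otsu_threshold(normalize_by_median(channel))
```

Values above `threshold` would be mapped as candidate delaminations; whether high or low values indicate delamination depends on the measured quantity, which the abstract does not specify.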
Association testing for next-generation sequencing data using score statistics
DEFF Research Database (Denmark)
Skotte, Line; Korneliussen, Thorfinn Sand; Albrechtsen, Anders
2012-01-01
computationally feasible due to the use of score statistics. As part of the joint likelihood, we model the distribution of the phenotypes using a generalized linear model framework, which works for both quantitative and discrete phenotypes. Thus, the method presented here is applicable to case-control studies...... of genotype calls into account have been proposed; most require numerical optimization which for large-scale data is not always computationally feasible. We show that using a score statistic for the joint likelihood of observed phenotypes and observed sequencing data provides an attractive approach...... to association testing for next-generation sequencing data. The joint model accounts for the genotype classification uncertainty via the posterior probabilities of the genotypes given the observed sequencing data, which gives the approach higher power than methods based on called genotypes. This strategy remains...
Interpreting Statistical Significance Test Results: A Proposed New "What If" Method.
Kieffer, Kevin M.; Thompson, Bruce
As the 1994 publication manual of the American Psychological Association emphasized, "p" values are affected by sample size. As a result, it can be helpful to interpret the results of statistical significance tests in a sample size context by conducting so-called "what if" analyses. However, these methods can be inaccurate…
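The dependence of "p" on sample size can be illustrated with a small "what if" calculation. This is a generic sketch, not the method the authors propose: it assumes a 2x2 table with a fixed effect size phi, for which the chi-square statistic with one degree of freedom equals n times phi squared, and the function names are hypothetical.

```python
from math import erfc, sqrt


def p_value_chi2_1df(chi2):
    """Upper-tail p value of a chi-square statistic with 1 degree of freedom.

    Uses the identity P(X > x) = erfc(sqrt(x / 2)) for X ~ chi-square(1).
    """
    return erfc(sqrt(chi2 / 2))


def what_if(phi, sample_sizes):
    """Hold the effect size phi fixed and recompute p for each sample size."""
    return {n: p_value_chi2_1df(n * phi ** 2) for n in sample_sizes}


# Same effect size, three hypothetical sample sizes.
results = what_if(0.2, [50, 100, 200])
for n, p in results.items():
    print(f"n = {n:4d}  p = {p:.4f}")
```

The effect size is identical in all three rows, yet the result moves from non-significant to significant at the 0.05 level purely because n grows.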
Statistical analysis of radioactivity in the environment
International Nuclear Information System (INIS)
Barnes, M.G.; Giacomini, J.J.
1980-05-01
The pattern of radioactivity in surface soils of Area 5 of the Nevada Test Site is analyzed statistically by means of kriging. The 1962 event code-named Smallboy affected the greatest proportion of the area sampled, but some of the area was also affected by a number of other events. The data for this study were collected on a regular grid to take advantage of the efficiency of grid sampling.
International Nuclear Information System (INIS)
Pham, Binh T.; Hawkes, Grant L.; Einerson, Jeffrey J.
2014-01-01
As part of the High Temperature Reactors (HTR) R and D program, a series of irradiation tests, designated as Advanced Gas-cooled Reactor (AGR), have been defined to support development and qualification of fuel design, fabrication process, and fuel performance under normal operation and accident conditions. The AGR tests employ fuel compacts placed in a graphite cylinder shrouded by a steel capsule and instrumented with thermocouples (TC) embedded in graphite blocks enabling temperature control. While not possible to obtain by direct measurements in the tests, crucial fuel conditions (e.g., temperature, neutron fast fluence, and burnup) are calculated using core physics and thermal modeling codes. This paper is focused on AGR test fuel temperature predicted by the ABAQUS code's finite element-based thermal models. The work follows up on a previous study, in which several statistical analysis methods were adapted, implemented in the NGNP Data Management and Analysis System (NDMAS), and applied for qualification of AGR-1 thermocouple data. Abnormal trends in measured data revealed by the statistical analysis are traced to either measuring instrument deterioration or physical mechanisms in capsules that may have shifted the system thermal response. The main thrust of this work is to exploit the variety of data obtained in irradiation and post-irradiation examination (PIE) for assessment of modeling assumptions. As an example, the uneven reduction of the control gas gap in Capsule 5 found in the capsule metrology measurements in PIE helps identify mechanisms other than TC drift causing the decrease in TC readings. This suggests a more physics-based modification of the thermal model that leads to a better fit with experimental data, thus reducing model uncertainty and increasing confidence in the calculated fuel temperatures of the AGR-1 test.
Energy Technology Data Exchange (ETDEWEB)
Pham, Binh T., E-mail: Binh.Pham@inl.gov [Human Factor, Controls and Statistics Department, Nuclear Science and Technology, Idaho National Laboratory, Idaho Falls, ID 83415 (United States); Hawkes, Grant L. [Thermal Science and Safety Analysis Department, Nuclear Science and Technology, Idaho National Laboratory, Idaho Falls, ID 83415 (United States); Einerson, Jeffrey J. [Human Factor, Controls and Statistics Department, Nuclear Science and Technology, Idaho National Laboratory, Idaho Falls, ID 83415 (United States)
2014-05-01
As part of the High Temperature Reactors (HTR) R and D program, a series of irradiation tests, designated as Advanced Gas-cooled Reactor (AGR), have been defined to support development and qualification of fuel design, fabrication process, and fuel performance under normal operation and accident conditions. The AGR tests employ fuel compacts placed in a graphite cylinder shrouded by a steel capsule and instrumented with thermocouples (TC) embedded in graphite blocks enabling temperature control. While not possible to obtain by direct measurements in the tests, crucial fuel conditions (e.g., temperature, neutron fast fluence, and burnup) are calculated using core physics and thermal modeling codes. This paper is focused on AGR test fuel temperature predicted by the ABAQUS code's finite element-based thermal models. The work follows up on a previous study, in which several statistical analysis methods were adapted, implemented in the NGNP Data Management and Analysis System (NDMAS), and applied for qualification of AGR-1 thermocouple data. Abnormal trends in measured data revealed by the statistical analysis are traced to either measuring instrument deterioration or physical mechanisms in capsules that may have shifted the system thermal response. The main thrust of this work is to exploit the variety of data obtained in irradiation and post-irradiation examination (PIE) for assessment of modeling assumptions. As an example, the uneven reduction of the control gas gap in Capsule 5 found in the capsule metrology measurements in PIE helps identify mechanisms other than TC drift causing the decrease in TC readings. This suggests a more physics-based modification of the thermal model that leads to a better fit with experimental data, thus reducing model uncertainty and increasing confidence in the calculated fuel temperatures of the AGR-1 test.
Explorations in Statistics: The Analysis of Ratios and Normalized Data
Curran-Everett, Douglas
2013-01-01
Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This ninth installment of "Explorations in Statistics" explores the analysis of ratios and normalized--or standardized--data. As researchers, we compute a ratio--a numerator divided by a denominator--to compute a…
Grosveld, Ferdinand W.; Schiller, Noah H.; Cabell, Randolph H.
2011-01-01
Comet Enflow is a commercially available, high frequency vibroacoustic analysis software founded on Energy Finite Element Analysis (EFEA) and Energy Boundary Element Analysis (EBEA). Energy Finite Element Analysis (EFEA) was validated on a floor-equipped composite cylinder by comparing EFEA vibroacoustic response predictions with Statistical Energy Analysis (SEA) and experimental results. Statistical Energy Analysis (SEA) predictions were made using the commercial software program VA One 2009 from ESI Group. The frequency region of interest for this study covers the one-third octave bands with center frequencies from 100 Hz to 4000 Hz.
Weibull statistic analysis of bending strength in the cemented carbide coatings
International Nuclear Information System (INIS)
Yi Yong; Shen Baoluo; Qiu Shaoyu; Li Cong
2003-01-01
The theoretical basis for using Weibull statistics to analyze coating strength has been established: the Weibull distribution is the asymptotic distribution of coating strength as the coating volume increases, provided that the local strengths of the coating are statistically independent. This was confirmed in subsequent bending-strength tests on two cemented carbide coatings. The results show that Weibull statistics are well suited to analyzing the strength of the two coatings. (authors)
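As an illustration of how a Weibull strength analysis of this kind is commonly carried out, the sketch below estimates the Weibull modulus and characteristic strength by linear regression on the linearized distribution, ln(-ln(1 - F)) = m ln(s) - m ln(s0). It is not the authors' procedure: the function name `fit_weibull`, the plotting position F_i = (i - 0.5)/n, and the sample strengths are all assumptions made for the example.

```python
import math

def fit_weibull(strengths):
    """Estimate Weibull modulus m and characteristic strength s0 by least squares."""
    ordered = sorted(strengths)
    n = len(ordered)
    x = [math.log(s) for s in ordered]
    # Assumed plotting position F_i = (i - 0.5) / n for the i-th ranked specimen.
    y = [math.log(-math.log(1.0 - (i + 0.5) / n)) for i in range(n)]
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    m = sxy / sxx                        # slope = Weibull modulus
    s0 = math.exp(x_mean - y_mean / m)   # intercept = -m * ln(s0)
    return m, s0

# Hypothetical bending strengths in MPa, not data from the paper.
sample = [412.0, 455.0, 478.0, 490.0, 503.0, 515.0, 528.0, 541.0, 560.0, 587.0]
modulus, characteristic = fit_weibull(sample)
print(f"Weibull modulus: {modulus:.2f}, characteristic strength: {characteristic:.1f} MPa")
```

A higher modulus indicates less scatter in strength; other plotting positions or a maximum-likelihood fit would give somewhat different estimates.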
Analysis of thrips distribution: application of spatial statistics and Kriging
John Aleong; Bruce L. Parker; Margaret Skinner; Diantha Howard
1991-01-01
Kriging is a statistical technique that provides predictions for spatially and temporally correlated data. Observations of thrips distribution and density in Vermont soils are made in both space and time. Traditional statistical analysis of such data assumes that the counts taken over space and time are independent, which is not necessarily true. Therefore, to analyze...
Quantum Statistical Testing of a Quantum Random Number Generator
Energy Technology Data Exchange (ETDEWEB)
Humble, Travis S [ORNL
2014-01-01
The unobservable elements in a quantum technology, e.g., the quantum state, complicate system verification against promised behavior. Using model-based system engineering, we present methods for verifying the operation of a prototypical quantum random number generator. We begin with the algorithmic design of the QRNG followed by the synthesis of its physical design requirements. We next discuss how quantum statistical testing can be used to verify device behavior as well as detect device bias. We conclude by highlighting how system design and verification methods must influence effort to certify future quantum technologies.
Statistical wind analysis for near-space applications
Roney, Jason A.
2007-09-01
Statistical wind models were developed based on the existing observational wind data for near-space altitudes between 60 000 and 100 000 ft (18–30 km) above ground level (AGL) at two locations, Akron, OH, USA, and White Sands, NM, USA. These two sites are envisioned as playing a crucial role in the first flights of high-altitude airships. The analysis shown in this paper has not been previously applied to this region of the stratosphere for such an application. Standard statistics were compiled for these data such as mean, median, maximum wind speed, and standard deviation, and the data were modeled with Weibull distributions. These statistics indicated that, on a yearly average, there is a lull or a “knee” in the wind between 65 000 and 72 000 ft AGL (20–22 km). From the standard statistics, trends at both locations indicated substantial seasonal variation in the mean wind speed at these heights. The yearly and monthly statistical modeling indicated that Weibull distributions were a reasonable model for the data. Forecasts and hindcasts were done by using a Weibull model based on 2004 data and comparing the model with the 2003 and 2005 data. The 2004 distribution was also a reasonable model for these years. Lastly, the Weibull distribution and cumulative function were used to predict the 50%, 95%, and 99% winds, which are directly related to the expected power requirements of a near-space station-keeping airship. These values indicated that using only the standard deviation of the mean may underestimate the operational conditions.
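The 50%, 95%, and 99% winds mentioned above follow from inverting the Weibull cumulative function, F(v) = 1 - exp(-(v/c)^k). The sketch below shows that inversion; the shape and scale values are hypothetical placeholders, not the parameters fitted in the paper, and the function names are chosen for the example.

```python
import math

def weibull_cdf(v, shape, scale):
    """Probability that the wind speed is at most v."""
    return 1.0 - math.exp(-((v / scale) ** shape))

def weibull_quantile(p, shape, scale):
    """Wind speed that is not exceeded with probability p (inverse of the CDF)."""
    return scale * (-math.log(1.0 - p)) ** (1.0 / shape)

# Hypothetical Weibull parameters (shape k, scale c in m/s), for illustration only.
shape, scale = 2.0, 20.0
for p in (0.50, 0.95, 0.99):
    print(f"{p:.0%} wind: {weibull_quantile(p, shape, scale):.1f} m/s")
```

Because the Weibull tail is skewed, the upper percentiles can sit well above the mean plus one standard deviation, which is consistent with the abstract's closing caution.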
Analysis of photon statistics with Silicon Photomultiplier
International Nuclear Information System (INIS)
D'Ascenzo, N.; Saveliev, V.; Wang, L.; Xie, Q.
2015-01-01
The Silicon Photomultiplier (SiPM) is a novel silicon-based photodetector, which represents the modern perspective of low photon flux detection. The aim of this paper is to provide an introduction to the statistical analysis methods needed to understand and quantitatively estimate the features of the response of the SiPM to a coherent source of light.
Development of statistical analysis code for meteorological data (W-View)
International Nuclear Information System (INIS)
Tachibana, Haruo; Sekita, Tsutomu; Yamaguchi, Takenori
2003-03-01
A computer code (W-View: Weather View) was developed to analyze meteorological data statistically based on 'the guideline of meteorological statistics for the safety analysis of nuclear power reactor' (Nuclear Safety Commission on January 28, 1982; revised on March 29, 2001). The code provides the statistical meteorological data needed to assess the public dose under normal operation and severe accident conditions, as required for licensing nuclear reactor operation. The code was adapted from an original version that ran on a large office computer, so that a personal computer user can analyze the meteorological data simply and conveniently and produce statistical data tables and figures of meteorology. (author)
Statistical analysis of the Ft. Calhoun reactor coolant pump system
International Nuclear Information System (INIS)
Heising, Carolyn D.
1998-01-01
In engineering science, statistical quality control techniques have traditionally been applied to control manufacturing processes. An application to commercial nuclear power plant maintenance and control is presented that can greatly improve plant safety. As a demonstration of such an approach to plant maintenance and control, a specific system is analyzed: the reactor coolant pumps (RCPs) of the Ft. Calhoun nuclear power plant. This research uses capability analysis, Shewhart X-bar and R charts, canonical correlation methods, and design of experiments to analyze the process for the state of statistical control. The results obtained show that six out of ten parameters are within control specification limits and four parameters are not in a state of statistical control. The analysis shows that statistical process control methods can be applied as an early warning system capable of identifying significant equipment problems well in advance of traditional control room alarm indicators. Such a system would provide operators with ample time to respond to possible emergency situations and thus improve plant safety and reliability. (author)
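For readers unfamiliar with Shewhart X-bar and R charts, the following minimal sketch computes the control limits from subgrouped measurements and flags subgroups whose mean falls outside them. It assumes subgroups of exactly five readings, for which the standard tabulated constants are A2 = 0.577, D3 = 0, and D4 = 2.114; the readings themselves are invented and do not come from the Ft. Calhoun study.

```python
# Standard control chart constants for subgroups of size n = 5.
A2, D3, D4 = 0.577, 0.0, 2.114

def xbar_r_limits(subgroups):
    """Return (LCL, center, UCL) for the X-bar chart and the R chart."""
    if any(len(g) != 5 for g in subgroups):
        raise ValueError("constants in this sketch are valid only for subgroups of 5")
    means = [sum(g) / len(g) for g in subgroups]
    ranges = [max(g) - min(g) for g in subgroups]
    grand_mean = sum(means) / len(means)
    mean_range = sum(ranges) / len(ranges)
    return {
        "xbar": (grand_mean - A2 * mean_range, grand_mean, grand_mean + A2 * mean_range),
        "range": (D3 * mean_range, mean_range, D4 * mean_range),
    }

def out_of_control(subgroups, limits):
    """Indices of subgroups whose mean lies outside the X-bar limits."""
    lcl, _, ucl = limits["xbar"]
    return [i for i, g in enumerate(subgroups) if not lcl <= sum(g) / len(g) <= ucl]

# Hypothetical pump readings (arbitrary units), five per shift.
baseline = [
    [50.1, 49.8, 50.3, 50.0, 49.9],
    [50.2, 50.1, 49.7, 50.0, 50.4],
    [49.9, 50.0, 50.2, 49.8, 50.1],
]
limits = xbar_r_limits(baseline)
new_shift = [[51.2, 51.0, 51.4, 51.1, 51.3]]
print(limits)
print(out_of_control(new_shift, limits))
```

In practice the limits are set from a period known to be stable and then applied to later data, which is what makes the charts usable as an early warning indicator.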
Fang, Yongxiang; Wit, Ernst
2008-01-01
Fisher’s combined probability test is the most commonly used method to test the overall significance of a set of independent p-values. However, it is obvious that Fisher’s statistic is more sensitive to smaller p-values than to larger p-values, and a small p-value may overrule the other p-values
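To make the sensitivity described above concrete, here is a minimal sketch of Fisher's combined probability test. The statistic is X = -2 Σ ln(p_i), which under the null hypothesis follows a chi-square distribution with 2k degrees of freedom for k independent p-values; for even degrees of freedom the upper-tail probability has a closed form, so no statistics library is needed. The function name and the example p-values are assumptions for illustration.

```python
import math

def fisher_combined_pvalue(pvalues):
    """Return (statistic, combined p-value) for independent p-values in (0, 1]."""
    k = len(pvalues)
    statistic = -2.0 * sum(math.log(p) for p in pvalues)
    half = statistic / 2.0
    # Chi-square survival function with 2k degrees of freedom:
    # exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return statistic, math.exp(-half) * total

# Hypothetical inputs: one very small p-value among large ones versus uniformly moderate ones.
print(fisher_combined_pvalue([0.001, 0.9, 0.9, 0.9]))
print(fisher_combined_pvalue([0.2, 0.2, 0.2, 0.2]))
```

Comparing the two printed results shows how much a single small p-value contributes to the combined statistic relative to several moderate ones.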
International Nuclear Information System (INIS)
Kim, S. H.; Moon, J. H.; Jeong, Y. S.
2002-01-01
Two air filters (V-50, P-50) artificially loaded with urban dust were provided by the IAEA for an inter-laboratory comparison and proficiency test, and their trace elements were determined non-destructively using instrumental neutron activation analysis. A standard reference material (Urban Particulate Matter, NIST SRM 1648) from the National Institute of Standards and Technology was used for internal analytical quality control. About 20 elements were determined in each loaded filter sample. Our analytical data were compared with the statistical results obtained using neutron activation analysis, particle induced X-ray emission spectrometry, inductively coupled plasma mass spectroscopy, etc., which were collected from 49 laboratories in 40 countries. In the statistical re-evaluation of the reported values, the Z-scores of our analytical values are within ±2. In addition, the proficiency test was passed, and the accuracy and precision of the analytical values are reliable. Consequently, the analytical quality control for the analysis of air dust samples was shown to be reasonable.
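The Z-score criterion quoted above can be sketched as follows. The score is the laboratory result minus the assigned value, divided by the target standard deviation; the thresholds used here (|z| ≤ 2 satisfactory, 2 < |z| < 3 questionable, |z| ≥ 3 unsatisfactory) are the convention widely used in proficiency testing and are an assumption about this study, as are the function names and the numbers.

```python
def z_score(lab_value, assigned_value, target_sd):
    """Standardized deviation of a laboratory result from the assigned value."""
    return (lab_value - assigned_value) / target_sd

def classify(z):
    """Conventional proficiency-test rating for a z-score."""
    magnitude = abs(z)
    if magnitude <= 2.0:
        return "satisfactory"
    if magnitude < 3.0:
        return "questionable"
    return "unsatisfactory"

# Hypothetical element concentrations (mg/kg), not values from the study.
z = z_score(lab_value=105.0, assigned_value=100.0, target_sd=5.0)
print(z, classify(z))
```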
Vahedi, Shahrum; Farrokhi, Farahman; Gahramani, Farahnaz; Issazadegan, Ali
2012-01-01
Objective: Approximately 66-80% of graduate students experience statistics anxiety, and some researchers propose that many students identify statistics courses as the most anxiety-inducing courses in their academic curricula. As such, it is likely that statistics anxiety is, in part, responsible for many students delaying enrollment in these courses for as long as possible. This paper proposes a canonical model treating academic procrastination (AP) and learning strategies (LS) as predictor variables and statistics anxiety (SA) as the explained variable. Methods: A questionnaire survey was used for data collection, and 246 female college students participated in this study. To examine the mutually independent relations between procrastination, learning strategies and statistics anxiety variables, a canonical correlation analysis was computed. Results: Findings show that two canonical functions were statistically significant. The set of variables (metacognitive self-regulation, source management, preparing homework, preparing for tests and preparing term papers) helped predict changes in statistics anxiety with respect to fearful behavior, attitude towards math and class, and performance, but not anxiety. Conclusion: These findings could be used in educational and psychological interventions in the context of statistics anxiety reduction. PMID:24644468
Propensity Score Analysis: An Alternative Statistical Approach for HRD Researchers
Keiffer, Greggory L.; Lane, Forrest C.
2016-01-01
Purpose: This paper aims to introduce matching in propensity score analysis (PSA) as an alternative statistical approach for researchers looking to make causal inferences using intact groups. Design/methodology/approach: An illustrative example demonstrated the varying results of analysis of variance, analysis of covariance and PSA on a heuristic…
Simulation Experiments in Practice: Statistical Design and Regression Analysis
Kleijnen, J.P.C.
2007-01-01
In practice, simulation analysts often change only one factor at a time, and use graphical analysis of the resulting Input/Output (I/O) data. The goal of this article is to change these traditional, naïve methods of design and analysis, because statistical theory proves that more information is obtained when applying Design Of Experiments (DOE) and linear regression analysis. Unfortunately, classic DOE and regression analysis assume a single simulation response that is normally and independen...
Cross wavelet analysis: significance testing and pitfalls
Directory of Open Access Journals (Sweden)
D. Maraun
2004-01-01
In this paper, we present a detailed evaluation of cross wavelet analysis of bivariate time series. We develop a statistical test for zero wavelet coherency based on Monte Carlo simulations. If at least one of the two processes considered is Gaussian white noise, an approximate formula for the critical value can be used. In the second part, typical pitfalls of wavelet cross spectra and wavelet coherency are discussed. The wavelet cross spectrum appears to be unsuitable for testing the significance of the interrelation between two processes; wavelet coherency should be applied instead. Furthermore, we investigate problems due to multiple testing. Based on these results, we show that coherency between ENSO and NAO is an artefact for most of the time from 1900 to 1995. However, during a distinct period from around 1920 to 1940, significant coherency between the two phenomena occurs.
Systematic analysis of coding and noncoding DNA sequences using methods of statistical linguistics
Mantegna, R. N.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Peng, C. K.; Simons, M.; Stanley, H. E.
1995-01-01
We compare the statistical properties of coding and noncoding regions in eukaryotic and viral DNA sequences by adapting two tests developed for the analysis of natural languages and symbolic sequences. The data set comprises all 30 sequences of length above 50 000 base pairs in GenBank Release No. 81.0, as well as the recently published sequences of C. elegans chromosome III (2.2 Mbp) and yeast chromosome XI (661 Kbp). We find that for the three chromosomes we studied the statistical properties of noncoding regions appear to be closer to those observed in natural languages than those of coding regions. In particular, (i) an n-tuple Zipf analysis of noncoding regions reveals a regime close to power-law behavior, whereas the coding regions show logarithmic behavior over a wide interval, and (ii) an n-gram entropy measurement shows that the noncoding regions have a lower n-gram entropy (and hence a larger "n-gram redundancy") than the coding regions. In contrast to the three chromosomes, we find that for vertebrates such as primates and rodents and for viral DNA, the difference between the statistical properties of coding and noncoding regions is not pronounced and therefore the results of the analyses of the investigated sequences are less conclusive. After noting the intrinsic limitations of the n-gram redundancy analysis, we also briefly discuss the failure of the zeroth- and first-order Markovian models or simple nucleotide repeats to account fully for these "linguistic" features of DNA. Finally, we emphasize that our results by no means prove the existence of a "language" in noncoding DNA.
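The n-gram entropy measurement referred to above can be illustrated with a short sketch that computes the Shannon entropy of the n-gram distribution of a sequence. Two choices here are assumptions and may differ from the paper: n-grams are taken from overlapping windows, and redundancy is defined as one minus the entropy divided by its maximum, n·log2(4), for a four-letter alphabet. The sequences are toy strings, not GenBank data.

```python
import math
from collections import Counter

def ngram_entropy(sequence, n):
    """Shannon entropy (bits) of the overlapping n-gram distribution of a sequence."""
    grams = [sequence[i:i + n] for i in range(len(sequence) - n + 1)]
    counts = Counter(grams)
    total = len(grams)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def ngram_redundancy(sequence, n, alphabet_size=4):
    """Assumed definition: 1 - H_n / (n * log2(alphabet_size))."""
    return 1.0 - ngram_entropy(sequence, n) / (n * math.log2(alphabet_size))

# Toy sequences for illustration only.
repetitive = "ATATATATATATATATATAT"
mixed = "ACGTTGCAAGCTTACGGATC"
for name, seq in (("repetitive", repetitive), ("mixed", mixed)):
    print(name, ngram_entropy(seq, 2), ngram_redundancy(seq, 2))
```

Short sequences undersample the possible n-grams, which biases the entropy estimate downward; this is one of the intrinsic limitations of redundancy analysis on finite data.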
ASURV: Astronomical SURVival Statistics
Feigelson, E. D.; Nelson, P. I.; Isobe, T.; LaValley, M.
2014-06-01
ASURV (Astronomical SURVival Statistics) provides astronomy survival analysis for right- and left-censored data including the maximum-likelihood Kaplan-Meier estimator and several univariate two-sample tests, bivariate correlation measures, and linear regressions. ASURV is written in FORTRAN 77, and is stand-alone and does not call any specialized libraries.
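For orientation, the sketch below shows the Kaplan-Meier product-limit estimator for right-censored data in plain Python. It is an independent illustration, not ASURV's FORTRAN 77 implementation, and it does not cover the left-censored (upper-limit) case that ASURV also handles; the function name and data are hypothetical.

```python
def kaplan_meier(times, observed):
    """Return [(t, S(t))] at each distinct event time.

    times    -- measured values (event or censoring times)
    observed -- True if the event was observed, False if right-censored
    """
    data = sorted(zip(times, observed))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        events = 0
        removed = 0
        # Group all records that share the same time value.
        while i < len(data) and data[i][0] == t:
            events += 1 if data[i][1] else 0
            removed += 1
            i += 1
        if events > 0:
            survival *= 1.0 - events / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve

# Hypothetical data: five measurements, two of them censored.
print(kaplan_meier([2.0, 3.0, 3.0, 5.0, 8.0], [True, True, False, True, False]))
```

Censored points do not lower the curve themselves; they only reduce the number at risk for later event times.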