Statistical distribution sampling
Johnson, E. S.
1975-01-01
The determination of the distributions of statistics by sampling is investigated. Characteristic functions, the quadratic regression problem, and the differential equations for the characteristic functions are analyzed.
Contributions to sampling statistics
Conti, Pier; Ranalli, Maria
2014-01-01
This book contains a selection of the papers presented at the ITACOSM 2013 Conference, held in Milan in June 2013. ITACOSM is the biennial meeting of the Survey Sampling Group S2G of the Italian Statistical Society, intended as an international forum for scientific discussion of developments in the theory and application of survey sampling methodologies in the human and natural sciences. The book gathers research papers carefully selected from both invited and contributed sessions of the conference. As a whole, the book is a relevant contribution to various key aspects of sampling methodology and techniques; it deals with some hot topics in sampling theory, such as calibration, quantile regression and multiple-frame surveys, and with innovative methodologies in important topics of both sampling theory and applications. Contributions cut across current sampling methodologies such as interval estimation for complex samples, randomized responses, bootstrap, weighting, modeling, imputati...
Sampling, Probability Models and Statistical Reasoning: Statistical Inference
Mohan Delampady; V. R. Padmawar. General Article, Resonance – Journal of Science Education, Volume 1, Issue 5, May 1996, pp. 49-58
Statistical sampling strategies
Andres, T.H.
1987-01-01
Systems assessment codes use mathematical models to simulate natural and engineered systems. Probabilistic systems assessment codes carry out multiple simulations to reveal the uncertainty in values of output variables due to uncertainty in the values of the model parameters. In this paper, methods are described for sampling sets of parameter values to be used in a probabilistic systems assessment code. Three Monte Carlo parameter selection methods are discussed: simple random sampling, Latin hypercube sampling, and sampling using two-level orthogonal arrays. Three post-selection transformations are also described: truncation, importance transformation, and discretization. Advantages and disadvantages of each method are summarized
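Of the three Monte Carlo selection methods the paper discusses, Latin hypercube sampling is the easiest to sketch: each parameter's range is split into one stratum per sample, every stratum is hit exactly once, and strata are paired at random across parameters. The following stdlib-only Python sketch is illustrative (the function name and interface are ours, not from the paper):

```python
import random

def latin_hypercube(n_samples, n_params, rng=None):
    """Draw an n_samples x n_params Latin hypercube sample on [0, 1)^n_params.

    Each parameter's [0, 1) range is divided into n_samples equal strata;
    one uniform point is drawn inside each stratum, and the strata are
    shuffled independently per parameter to randomize the pairing.
    """
    rng = rng or random.Random()
    columns = []
    for _ in range(n_params):
        # one uniform draw inside each stratum, then shuffle the strata
        col = [(k + rng.random()) / n_samples for k in range(n_samples)]
        rng.shuffle(col)
        columns.append(col)
    # transpose columns into a list of parameter-value rows
    return [[columns[j][i] for j in range(n_params)] for i in range(n_samples)]
```

Unlike simple random sampling, every marginal stratum is guaranteed to be covered, which is the property that makes the method attractive for probabilistic systems assessment with few runs.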
Ecotoxicology statistical sampling
Saona, G.
2012-01-01
This presentation introduces general concepts in ecotoxicological sampling design, such as the distribution of organic or inorganic contaminants, microbiological contamination, and the determination of sampling positions in an ecotoxicological bioassay ecosystem.
Jaech, J.L.
1984-01-01
In auditing and in inspection, one selects a number of items by some set of procedures and performs measurements which are compared with the operator's values. This session considers the problem of how to select the samples to be measured, and what kinds of measurements to make. In the inspection situation, the ultimate aim is to independently verify the operator's material balance. The effectiveness of the sample plan in achieving this objective is briefly considered. The discussion focuses on the model plant
Statistical sampling for holdup measurement
Picard, R.R.; Pillay, K.K.S.
1986-01-01
Nuclear materials holdup is a serious problem in many operating facilities. Estimating amounts of holdup is important for materials accounting and, sometimes, for process safety. Clearly, measuring holdup in all pieces of equipment is not a viable option in terms of time, money, and radiation exposure to personnel. Furthermore, 100% measurement is not only impractical but unnecessary for developing estimated values. Principles of statistical sampling are valuable in the design of cost-effective holdup monitoring plans and in quantifying uncertainties in holdup estimates. The purpose of this paper is to describe those principles and to illustrate their use
Evaluation of observables in statistical multifragmentation theories
Cole, A.J.
1989-01-01
The canonical formulation of equilibrium statistical multifragmentation is examined. It is shown that the explicit construction of observables (average values) by sampling the partition probabilities is unnecessary insofar as closed expressions in the form of recursion relations can be obtained quite easily. Such expressions may conversely be used to verify the sampling algorithms
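The closed recursion relations themselves are not reproduced in this record, but a standard member of that family, the recursion Z_A = (1/A) Σ_k k ω_k Z_{A−k} for the canonical partition function of non-interacting fragments, can be sketched and verified against brute-force enumeration of partitions, exactly in the spirit of the abstract. The weight ω_k ≡ 1 below is purely illustrative:

```python
from fractions import Fraction

def canonical_Z(A, omega):
    """Canonical partition functions Z_0..Z_A for non-interacting fragments
    via the recursion Z_A = (1/A) * sum_{k=1..A} k * omega[k] * Z_{A-k}.

    omega[k] is the single-fragment weight for a fragment of size k
    (omega[0] is unused); Z_0 = 1 by convention.  Each partition {n_k}
    contributes prod_k omega[k]**n_k / n_k! to Z_A.  Exact arithmetic
    via Fraction makes the closed-form check unambiguous.
    """
    Z = [Fraction(1)]
    for a in range(1, A + 1):
        Z.append(sum(k * omega[k] * Z[a - k] for k in range(1, a + 1)) / a)
    return Z
```

With ω_k ≡ 1, enumerating partitions by hand gives Z_2 = 1 + 1/2! = 3/2 and Z_3 = 1 + 1 + 1/3! = 13/6, which the recursion reproduces exactly; such closed expressions can replace, or conversely verify, Monte Carlo sampling of the partition probabilities.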
Sample Reuse in Statistical Remodeling.
1987-08-01
as the jackknife and bootstrap, is an expansion of the functional, T(Fn), or of its distribution function or both. Frangos and Schucany (1987a) used...accelerated bootstrap. In the same report Frangos and Schucany demonstrated the small sample superiority of that approach over the proposals that take...higher order terms of an Edgeworth expansion into account. In a second report Frangos and Schucany (1987b) examined the small sample performance of
Tian, Guo-Liang; Li, Hui-Qiong
2017-08-01
Some existing confidence interval methods and hypothesis testing methods in the analysis of a contingency table with incomplete observations in both margins entirely depend on an underlying assumption that the sampling distribution of the observed counts is a product of independent multinomial/binomial distributions for complete and incomplete counts. However, it can be shown that this independence assumption is incorrect and can result in unreliable conclusions because of the under-estimation of the uncertainty. Therefore, the first objective of this paper is to derive the valid joint sampling distribution of the observed counts in a contingency table with incomplete observations in both margins. The second objective is to provide a new framework for analyzing incomplete contingency tables based on the derived joint sampling distribution of the observed counts by developing a Fisher scoring algorithm to calculate maximum likelihood estimates of parameters of interest, bootstrap confidence interval methods, and bootstrap hypothesis-testing methods. We compare the differences between the valid sampling distribution and the sampling distribution under the independence assumption. Simulation studies showed that average/expected confidence-interval widths of parameters based on the sampling distribution under the independence assumption are shorter than those based on the new sampling distribution, yielding unrealistic results. A real data set is analyzed to illustrate the application of the new sampling distribution for incomplete contingency tables and the analysis results again confirm the conclusions obtained from the simulation studies.
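The paper's bootstrap intervals are tailored to its derived joint sampling distribution, which is beyond this record; the generic percentile-bootstrap machinery underneath can, however, be sketched in a few lines (the function name and defaults are ours, not the authors'):

```python
import random
import statistics

def percentile_bootstrap_ci(data, stat, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for stat(data).

    Resamples the data with replacement n_boot times and returns the
    empirical alpha/2 and 1 - alpha/2 quantiles of the resampled statistic.
    """
    rng = random.Random(seed)
    n = len(data)
    reps = sorted(stat([rng.choice(data) for _ in range(n)])
                  for _ in range(n_boot))
    lo = reps[int((alpha / 2) * n_boot)]
    hi = reps[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lo, hi
```

The paper's point is that resampling must respect the *valid* joint distribution of complete and incomplete counts; a naive resampler like this one, applied under the independence assumption, is exactly what produces the too-short intervals found in the simulation studies.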
42 CFR 402.109 - Statistical sampling.
2010-10-01
... or caused to be presented. (b) Prima facie evidence. The results of the statistical sampling study, if based upon an appropriate sampling and computed by valid statistical methods, constitute prima... § 402.1. (c) Burden of proof. Once CMS or OIG has made a prima facie case, the burden is on the...
Staging Liver Fibrosis with Statistical Observers
Brand, Jonathan Frieman
Chronic liver disease is a worldwide health problem, and hepatic fibrosis (HF) is one of the hallmarks of the disease. Pathology diagnosis of HF is based on textural change in the liver as a lobular collagen network develops within portal triads. The scale of the collagen lobules is characteristically on the order of 1 mm, which is close to the resolution limit of in vivo Gd-enhanced MRI. In this work, methods to collect training and testing images for a Hotelling observer are covered. An observer based on local texture analysis is trained and tested using wet-tissue phantoms. The technique is used to optimize the MRI sequence based on task performance. The final method developed is a two-stage model observer to classify fibrotic and healthy tissue in both phantoms and in vivo MRI images. The first-stage observer tests for the presence of local texture. Test statistics from the first observer are used to train the second-stage observer to globally sample the local observer results. A decision on the disease class is made for an entire MRI image slice using test statistics collected from the second observer. The techniques are tested on wet-tissue phantoms and in vivo clinical patient data.
Statistical sampling approaches for soil monitoring
Brus, D.J.
2014-01-01
This paper describes three statistical sampling approaches for regional soil monitoring, a design-based, a model-based and a hybrid approach. In the model-based approach a space-time model is exploited to predict global statistical parameters of interest such as the space-time mean. In the hybrid
Statistical literacy and sample survey results
McAlevey, Lynn; Sullivan, Charles
2010-10-01
Sample surveys are widely used in the social sciences and business. The news media almost daily quote from them, yet they are widely misused. Using students with prior managerial experience embarking on an MBA course, we show that common sample survey results are misunderstood even by those managers who have previously done a statistics course. In general, they fare no better than managers who have never studied statistics. There are implications for teaching, especially in business schools, as well as for consulting.
Statistical Symbolic Execution with Informed Sampling
Filieri, Antonio; Pasareanu, Corina S.; Visser, Willem; Geldenhuys, Jaco
2014-01-01
Symbolic execution techniques have recently been proposed for the probabilistic analysis of programs. These techniques seek to quantify the likelihood of reaching program events of interest, e.g., assert violations. They have many promising applications but suffer scalability issues due to high computational demand. To address this challenge, we propose a statistical symbolic execution technique that performs Monte Carlo sampling of the symbolic program paths and uses the obtained information for Bayesian estimation and hypothesis testing with respect to the probability of reaching the target events. To speed up the convergence of the statistical analysis, we propose Informed Sampling, an iterative symbolic execution that first explores the paths that have high statistical significance, prunes them from the state space, and guides the execution towards less likely paths. The technique combines Bayesian estimation with a partial exact analysis for the pruned paths, leading to provably improved convergence of the statistical analysis. We have implemented statistical symbolic execution with informed sampling in the Symbolic PathFinder tool. We show experimentally that informed sampling obtains more precise results and converges faster than a purely statistical analysis, and may also be more efficient than an exact symbolic analysis. When the latter does not terminate, symbolic execution with informed sampling can give meaningful results under the same time and memory limits.
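A minimal sketch of the Bayesian-estimation side of this approach, using concrete random inputs in place of symbolic path sampling (the toy program, the Beta prior, and all names are our assumptions, not the Symbolic PathFinder implementation):

```python
import random

def bayesian_hit_probability(run_once, n_samples, a=1.0, b=1.0, seed=0):
    """Monte Carlo estimate of the probability that a run reaches a target
    event, with a Beta(a, b) prior updated to Beta(a + hits, b + misses).

    run_once(rng) must return True when the target event is reached.
    Returns the posterior mean (a + hits) / (a + b + n_samples).
    """
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n_samples) if run_once(rng))
    return (a + hits) / (a + b + n_samples)

# Toy "program": the assert violation occurs when x > 0.9 for x ~ U(0, 1),
# i.e. with true probability 0.1.
estimate = bayesian_hit_probability(lambda rng: rng.random() > 0.9, 10000)
```

The posterior also supports the hypothesis tests mentioned in the abstract (e.g., is the violation probability below a threshold?); informed sampling would additionally replace sampled high-probability paths with their exact contribution.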
Statistical aspects of food safety sampling
Jongenburger, I.; Besten, den H.M.W.; Zwietering, M.H.
2015-01-01
In food safety management, sampling is an important tool for verifying control. Sampling by nature is a stochastic process. However, uncertainty regarding results is made even greater by the uneven distribution of microorganisms in a batch of food. This article reviews statistical aspects of
Statistical benchmark for BosonSampling
Walschaers, Mattia; Mayer, Klaus; Buchleitner, Andreas; Kuipers, Jack; Urbina, Juan-Diego; Richter, Klaus; Tichy, Malte Christopher
2016-01-01
Boson samplers—set-ups that generate complex many-particle output states through the transmission of elementary many-particle input states across a multitude of mutually coupled modes—promise the efficient quantum simulation of a classically intractable computational task, and challenge the extended Church–Turing thesis, one of the fundamental dogmas of computer science. However, as in all experimental quantum simulations of truly complex systems, one crucial problem remains: how to certify that a given experimental measurement record unambiguously results from enforcing the claimed dynamics, on bosons, fermions or distinguishable particles? Here we offer a statistical solution to the certification problem, identifying an unambiguous statistical signature of many-body quantum interference upon transmission across a multimode, random scattering device. We show that statistical analysis of only partial information on the output state allows one to characterise the imparted dynamics through particle type-specific features of the emerging interference patterns. The relevant statistical quantifiers are classically computable, define a falsifiable benchmark for BosonSampling, and reveal distinctive features of many-particle quantum dynamics, which go far beyond mere bunching or anti-bunching effects. (fast track communication)
Statistical sampling method for releasing decontaminated vehicles
Lively, J.W.; Ware, J.A.
1996-01-01
Earth moving vehicles (e.g., dump trucks, belly dumps) commonly haul radiologically contaminated materials from a site being remediated to a disposal site. Traditionally, each vehicle must be surveyed before being released. The logistical difficulties of implementing the traditional approach on a large scale demand that an alternative be devised. A statistical method (MIL-STD-105E, "Sampling Procedures and Tables for Inspection by Attributes") for assessing product quality from a continuous process was adapted to the vehicle decontamination process. This method produced a sampling scheme that automatically compensates for and accommodates fluctuating batch sizes and changing conditions without the need to modify or rectify the sampling scheme in the field. Vehicles are randomly selected (sampled) upon completion of the decontamination process to be surveyed for residual radioactive surface contamination. The frequency of sampling is based on the expected number of vehicles passing through the decontamination process in a given period and the confidence level desired. This process has been successfully used for 1 year at the former uranium mill site in Monticello, Utah (a CERCLA regulated clean-up site). The method forces improvement in the quality of the decontamination process and results in a lower likelihood that vehicles exceeding the surface contamination standards are offered for survey. Implementation of this statistical sampling method on the Monticello Project has resulted in more efficient processing of vehicles through decontamination and radiological release, saved hundreds of hours of processing time, provided a high level of confidence that release limits are met, and improved the radiological cleanliness of vehicles leaving the controlled site
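The MIL-STD-105E tables themselves are not reproduced here, but the zero-acceptance logic behind such attribute plans reduces to a simple sizing rule: choose the smallest sample that would almost surely contain a nonconforming item if the lot's nonconforming fraction exceeded a limit. The formula below is a generic textbook illustration (ours), not the Monticello scheme:

```python
import math

def min_sample_size(confidence, defect_rate):
    """Smallest n such that a simple random sample of n items contains at
    least one nonconforming item with the given confidence, assuming the
    lot's nonconforming fraction is at least defect_rate.

    Derivation: P(no nonconforming item in n draws) = (1 - defect_rate)**n,
    and we require that this be at most 1 - confidence.
    """
    return math.ceil(math.log(1 - confidence) / math.log(1 - defect_rate))
```

For example, detecting a 5% nonconforming fraction with 95% confidence requires 59 surveyed vehicles, which is how sampling frequency scales with throughput and the desired confidence level.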
Innovations in Statistical Observations of Consumer Prices
Olga Stepanovna Oleynik
2016-10-01
This article analyzes innovative changes in the methodology of statistical surveys of consumer prices. These changes are reflected in the "Official statistical methodology for the organization of statistical observation of consumer prices for goods and services and the calculation of the consumer price index", approved by Order no. 734 of the Federal State Statistics Service of December 30, 2014. The essence of the innovation is the use of mathematical methods in determining the range of trade and service outlets to be surveyed and in calculating the sufficient number of observed price quotes, based on price dispersion, the share of the observed good (service) representative in consumer spending, and an indicator of the complexity of price registration. The authors analyzed the calculations of the required number of quotations for observation in the Volgograd region in 2016 and compared the results with the number of quotes actually included in the monitoring. The authors believe that the implementation of these mathematical models has substantially reduced the influence of subjective factors in the organization of consumer price monitoring, and has therefore increased the objectivity of the resulting statistics on consumer prices and inflation. At the same time, the proposed methodology needs further improvement in its treatment of goods (services) whose representatives have a minor share in consumer expenditure.
Statistics and sampling in transuranic studies
Eberhardt, L.L.; Gilbert, R.O.
1980-01-01
The existing data on transuranics in the environment exhibit a remarkably high variability from sample to sample (coefficients of variation of 100% or greater). This chapter stresses the necessity of adequate sample size and suggests various ways to increase sampling efficiency. Objectives in sampling are regarded as being of great importance in making decisions as to sampling methodology. Four different classes of sampling methods are described: (1) descriptive sampling, (2) sampling for spatial pattern, (3) analytical sampling, and (4) sampling for modeling. A number of research needs are identified in the various sampling categories along with several problems that appear to be common to two or more such areas
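Given the chapter's emphasis on coefficients of variation of 100% or greater, a standard normal-approximation sizing rule makes the sample-size point concrete; the rule is a textbook illustration (ours), not taken from the chapter:

```python
import math

def samples_for_relative_error(cv, rel_error, z=1.96):
    """Approximate sample size n such that the confidence half-width of the
    sample mean, expressed relative to the mean, is at most rel_error.

    Based on half-width = z * (cv / sqrt(n)), i.e. n = (z * cv / rel_error)**2,
    where cv is the coefficient of variation (std dev / mean).
    """
    return math.ceil((z * cv / rel_error) ** 2)
```

With CV = 100%, pinning the mean down to ±20% at roughly 95% confidence already takes 97 samples, which is why the chapter stresses adequate sample size and sampling efficiency for transuranic studies.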
Statistical sampling methods for soils monitoring
Ann M. Abbott
2010-01-01
Development of the best sampling design to answer a research question should be an interactive venture between the land manager or researcher and statisticians, and is the result of answering various questions. A series of questions that can be asked to guide the researcher in making decisions that will arrive at an effective sampling plan are described, and a case...
Statistical validation of earthquake related observations
Kossobokov, V. G.
2011-12-01
The confirmed fractal nature of earthquakes and of their distribution in space and time implies that many traditional estimations of seismic hazard (from term-less to short-term ones) are usually based on erroneous assumptions of easily tractable or, conversely, delicately designed models. The widespread practice of deceptive modeling, considered a "reasonable proxy" of the natural seismic process, leads to seismic hazard assessment of unknown quality, whose errors propagate non-linearly into the resulting estimates of risk and, eventually, into unexpected societal losses of unacceptable level. Studies aimed at forecast/prediction of earthquakes must include validation in retrospective (at least) and, eventually, prospective tests. In the absence of such control a suggested "precursor/signal" remains a "candidate", whose link to the target seismic event is a model assumption. Predicting in advance is the only decisive test of forecasts/predictions and, therefore, the score-card of any "established precursor/signal", represented by the empirical probabilities of alarms and failures-to-predict achieved in prospective testing, must prove statistical significance by rejecting the null hypothesis of random coincidental occurrence of target earthquakes. We reiterate the so-called "Seismic Roulette" null hypothesis as the most adequate undisturbed random alternative accounting for the empirical spatial distribution of earthquakes: (i) consider a roulette wheel with as many sectors as the number of earthquake locations from a sample catalog representing the seismic locus, one sector per location; (ii) make your bet according to the prediction (i.e., determine which locations are inside the area of alarm, and put one chip in each of the corresponding sectors); (iii) Nature turns the wheel; (iv) accumulate statistics of wins and losses along with the number of chips spent. If a precursor in charge of prediction exposes an imperfection of Seismic Roulette then, having in mind
Pierre Gy's Sampling Theory and Sampling Practice, Second Edition
Pitard, Francis F
1993-01-01
Pierre Gy's Sampling Theory and Sampling Practice, Second Edition is a concise, step-by-step guide for process variability management and methods. Updated and expanded, this new edition provides a comprehensive study of heterogeneity, covering the basic principles of sampling theory and its various applications. It presents many practical examples to allow readers to select appropriate sampling protocols and assess the validity of sampling protocols from others. The variability of dynamic process streams using variography is discussed to help bridge sampling theory with statistical process control. Many descriptions of good sampling devices, as well as descriptions of poor ones, are featured to educate readers on what to look for when purchasing sampling systems. The book uses its accessible, tutorial style to focus on professional selection and use of methods. The book will be a valuable guide for mineral processing engineers; metallurgists; geologists; miners; chemists; environmental scientists; and practit...
Statistical compilation of NAPAP chemical erosion observations
Mossotti, Victor G.; Eldeeb, A. Raouf; Reddy, Michael M.; Fries, Terry L.; Coombs, Mary Jane; Schmiermund, Ron L.; Sherwood, Susan I.
2001-01-01
In the mid 1980s, the National Acid Precipitation Assessment Program (NAPAP), in cooperation with the National Park Service (NPS) and the U.S. Geological Survey (USGS), initiated a Materials Research Program (MRP) that included a series of field and laboratory studies with the broad objective of providing scientific information on acid rain effects on calcareous building stone. Among the several effects investigated, the chemical dissolution of limestone and marble by rainfall was given particular attention because of the pervasive appearance of erosion effects on cultural materials situated outdoors. In order to track the chemical erosion of stone objects in the field and in the laboratory, the Ca2+ ion concentration was monitored in the runoff solution from a variety of test objects located both outdoors and under more controlled conditions in the laboratory. This report provides a graphical and statistical overview of the Ca2+ chemistry in the runoff solutions from (1) five urban and rural sites (DC, NY, NJ, NC, and OH) established by the MRP for materials studies over the period 1984 to 1989, (2) a subevent study at the New York MRP site, (3) an in situ study of limestone and marble monuments at Gettysburg, (4) laboratory experiments on calcite dissolution conducted by Baedecker, (5) laboratory simulations by Schmiermund, and (6) a laboratory investigation of the surface reactivity of calcareous stone conducted by Fries and Mossotti. The graphical representations provided a means for identifying erroneous data that can randomly appear in a database when field operations are semi-automated; a purged database suitable for the evaluation of quantitative models of stone erosion is appended to this report. An analysis of the sources of statistical variability in the data revealed that the rate of stone erosion is weakly dependent on the type of calcareous stone, the ambient temperature, and the H+ concentration delivered in the incident rain. The analysis also showed
Measuring radioactive half-lives via statistical sampling in practice
Lorusso, G.; Collins, S. M.; Jagan, K.; Hitt, G. W.; Sadek, A. M.; Aitken-Smith, P. M.; Bridi, D.; Keightley, J. D.
2017-10-01
The statistical sampling method for the measurement of radioactive decay half-lives exhibits intriguing features such as that the half-life is approximately the median of a distribution closely resembling a Cauchy distribution. Whilst initial theoretical considerations suggested that in certain cases the method could have significant advantages, accurate measurements by statistical sampling have proven difficult, for they require an exercise in non-standard statistical analysis. As a consequence, no half-life measurement using this method has yet been reported and no comparison with traditional methods has ever been made. We used a Monte Carlo approach to address these analysis difficulties, and present the first experimental measurement of a radioisotope half-life (211Pb) by statistical sampling in good agreement with the literature recommended value. Our work also focused on the comparison between statistical sampling and exponential regression analysis, and concluded that exponential regression achieves generally the highest accuracy.
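The non-standard statistical-sampling analysis itself is beyond this record, but the exponential-regression baseline against which it is compared can be sketched on synthetic, noise-free decay data (all names and numerical values here are illustrative, not from the paper):

```python
import math

def half_life_by_regression(times, counts):
    """Estimate a half-life by ordinary least-squares fit of ln(counts)
    versus time (exponential regression): ln N(t) = ln N0 - (ln 2 / T) * t,
    so the fitted slope gives T = ln 2 / (-slope).
    """
    n = len(times)
    logs = [math.log(c) for c in counts]
    tbar = sum(times) / n
    ybar = sum(logs) / n
    slope = (sum((t - tbar) * (y - ybar) for t, y in zip(times, logs))
             / sum((t - tbar) ** 2 for t in times))
    return math.log(2) / -slope

# Noise-free check: a decay curve with a 10-hour half-life
times = [float(t) for t in range(10)]
counts = [1e6 * 0.5 ** (t / 10.0) for t in times]
```

On real counting data the counts carry Poisson noise and the regression must be weighted accordingly; that, and the Cauchy-like distribution arising in statistical sampling, is where the paper's Monte Carlo analysis comes in.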
Statistical hypothesis tests of some micrometeorological observations
SethuRaman, S.; Tichler, J.
1977-01-01
The chi-square goodness-of-fit test is used to test the hypothesis that the medium scale of turbulence in the atmospheric surface layer is normally distributed. Coefficients of skewness and excess are computed from the data. If the data are not normal, these coefficients are used in Edgeworth's asymptotic expansion of the Gram-Charlier series to determine an alternate probability density function. The observed data are then compared with the modified probability densities and new chi-square values are computed. Seventy percent of the data analyzed was either normal or approximately normal. The coefficient of skewness g1 has a good correlation with the chi-square values. Events with |g1| < 0.43 were approximately normal. Intermittency associated with the formation and breaking of internal gravity waves in surface-based inversions over water is thought to be the reason for the non-normality
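The screening statistics used here, the coefficients of skewness (g1) and excess (g2), are simple functions of the central moments; a small sketch (ours, using the moment-ratio definitions, zero for a normal distribution):

```python
def skewness_and_excess(xs):
    """Coefficient of skewness g1 = m3 / m2**1.5 and coefficient of
    excess g2 = m4 / m2**2 - 3, from the sample central moments m2..m4.
    Both are zero for normally distributed data (in expectation)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0
```

Nonzero g1 and g2 feed directly into the Edgeworth/Gram-Charlier correction terms mentioned in the abstract.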
Exclusive observables from a statistical simulation of energetic nuclear collisions
Fai, G.
1983-01-01
Exclusive observables are calculated in the framework of a statistical model for medium-energy nuclear collisions. The collision system is divided into a few (participant/spectator) sources, that are assumed to disassemble independently. Sufficiently excited sources explode into pions, nucleons, and composite, possibly particle unstable, nuclei. The different final states compete according to their microcanonical weight. Less excited sources, and the unstable explosion products, deexcite via light-particle evaporation. The model has been implemented as a Monte Carlo computer code that is sufficiently efficient to permit generation of large event samples. Some illustrative applications are discussed. (author)
A course in mathematical statistics and large sample theory
Bhattacharya, Rabi; Patrangenaru, Victor
2016-01-01
This graduate-level textbook is primarily aimed at graduate students of statistics, mathematics, science, and engineering who have had an undergraduate course in statistics, an upper-division course in analysis, and some acquaintance with measure-theoretic probability. It provides a rigorous presentation of the core of mathematical statistics. Part I of the book constitutes a one-semester course on basic parametric mathematical statistics. Part II deals with the large sample theory of statistics, parametric and nonparametric, and its contents may be covered in one semester as well. Part III provides brief accounts of a number of topics of current interest for practitioners and other disciplines whose work involves statistical methods. Large sample theory is illustrated with many worked examples, numerical calculations, and simulations; appendices provide ready access to a number of standard results, with many proofs; and solutions are given to a number of selected exercises from Part I. Part II exercises with ...
STATISTICAL ANALYSIS OF TANK 18F FLOOR SAMPLE RESULTS
Harris, S.
2010-09-02
Representative sampling has been completed for characterization of the residual material on the floor of Tank 18F as per the statistical sampling plan developed by Shine [1]. Samples from eight locations have been obtained from the tank floor and two of the samples were archived as a contingency. Six samples, referred to in this report as the current scrape samples, have been submitted to and analyzed by SRNL [2]. This report contains the statistical analysis of the floor sample analytical results to determine if further data are needed to reduce uncertainty. Included are comparisons with the prior Mantis sample results [3] to determine if they can be pooled with the current scrape samples to estimate the upper 95% confidence limits (UCL95%) for concentration. Statistical analysis revealed that the Mantis and current scrape sample results are not compatible. Therefore, the Mantis sample results were not used to support the quantification of analytes in the residual material. Significant spatial variability among the current sample results was not found. Constituent concentrations were similar between the North and South hemispheres as well as between the inner and outer regions of the tank floor. The current scrape sample results from all six samples fall within their 3-sigma limits. In view of the results from numerous statistical tests, the data were pooled from all six current scrape samples. As such, an adequate sample size was provided for quantification of the residual material on the floor of Tank 18F. The uncertainty is quantified in this report by an upper 95% confidence limit (UCL95%) on each analyte concentration. The uncertainty in analyte concentration was calculated as a function of the number of samples, the average, and the standard deviation of the analytical results. The UCL95% was based entirely on the six current scrape sample results (each averaged across three analytical determinations).
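The report describes the UCL95% only as a function of the number of samples, the average, and the standard deviation; assuming the usual one-sided Student-t form (the exact formula is not spelled out in this record), the calculation for six scrape samples looks like this:

```python
import math
import statistics

# One-sided Student-t critical value t_{0.95, df=5}, i.e. for n = 6 samples.
# Hard-coded from standard t tables to keep this sketch stdlib-only.
T95_DF5 = 2.015

def ucl95(results, t_crit=T95_DF5):
    """Upper 95% confidence limit on the mean of n analytical results:
    UCL95 = mean + t * s / sqrt(n), where t must match df = n - 1."""
    n = len(results)
    return (statistics.mean(results)
            + t_crit * statistics.stdev(results) / math.sqrt(n))
```

For six hypothetical concentration results [10, 12, 11, 13, 9, 11], the mean is 11 and the UCL95 is about 12.16; more samples or lower scatter tighten the limit, which is the pooling argument made in the report.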
Statistical Analysis Of Tank 19F Floor Sample Results
Harris, S.
2010-01-01
Representative sampling has been completed for characterization of the residual material on the floor of Tank 19F as per the statistical sampling plan developed by Harris and Shine. Samples from eight locations have been obtained from the tank floor and two of the samples were archived as a contingency. Six samples, referred to in this report as the current scrape samples, have been submitted to and analyzed by SRNL. This report contains the statistical analysis of the floor sample analytical results to determine if further data are needed to reduce uncertainty. Included are comparisons with the prior Mantis sample results to determine if they can be pooled with the current scrape samples to estimate the upper 95% confidence limits (UCL95%) for concentration. Statistical analysis revealed that the Mantis and current scrape sample results are not compatible. Therefore, the Mantis sample results were not used to support the quantification of analytes in the residual material. Significant spatial variability among the current scrape sample results was not found. Constituent concentrations were similar between the North and South hemispheres as well as between the inner and outer regions of the tank floor. The current scrape sample results from all six samples fall within their 3-sigma limits. In view of the results from numerous statistical tests, the data were pooled from all six current scrape samples. As such, an adequate sample size was provided for quantification of the residual material on the floor of Tank 19F. The uncertainty is quantified in this report by a UCL95% on each analyte concentration. The uncertainty in analyte concentration was calculated as a function of the number of samples, the average, and the standard deviation of the analytical results. The UCL95% was based entirely on the six current scrape sample results (each averaged across three analytical determinations).
The Role of the Sampling Distribution in Understanding Statistical Inference
Lipson, Kay
2003-01-01
Many statistics educators believe that few students develop the level of conceptual understanding essential for them to apply correctly the statistical techniques at their disposal and to interpret their outcomes appropriately. It is also commonly believed that the sampling distribution plays an important role in developing this understanding.…
Illustrating Sampling Distribution of a Statistic: Minitab Revisited
Johnson, H. Dean; Evans, Marc A.
2008-01-01
Understanding the concept of the sampling distribution of a statistic is essential for the understanding of inferential procedures. Unfortunately, this topic proves to be a stumbling block for students in introductory statistics classes. In efforts to aid students in their understanding of this concept, alternatives to a lecture-based mode of…
Statistical sampling techniques as applied to OSE inspections
Davis, J.J.; Cote, R.W.
1987-01-01
The need has been recognized for statistically valid methods for gathering information during OSE inspections and for interpretation of results, both from performance testing and from records reviews, interviews, etc. Battelle Columbus Division, under contract to DOE OSE, has performed and is continuing to perform work in the area of statistical methodology for OSE inspections. This paper presents some of the sampling methodology currently being developed for use during OSE inspections. Topics include population definition, sample size requirements, level of confidence, and practical logistical constraints associated with the conduct of an inspection based on random sampling. Sequential sampling schemes and sampling from finite populations are also discussed. The methods described are applicable to various data-gathering activities, ranging from the sampling and examination of classified documents to the sampling of Protective Force security inspectors for skill testing.
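One routine calculation behind the sample-size and confidence topics listed above is the smallest number of randomly sampled records needed to observe at least one nonconforming item with a given confidence. A sketch assuming a large population (so draws are effectively independent) and an illustrative 10% nonconformance rate:

```python
import math

def min_sample_size(defect_rate, confidence):
    """Smallest n such that P(at least one nonconforming item appears
    in the sample) >= confidence, for sampling from a large population:
        1 - (1 - defect_rate)**n >= confidence
    """
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - defect_rate))

# To see at least one nonconforming record with 95% confidence when
# 10% of records are nonconforming:
n = min_sample_size(0.10, 0.95)
```

For finite populations, as the abstract notes, the hypergeometric analogue gives a somewhat smaller required sample.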
Multivariate Statistical Inference of Lightning Occurrence, and Using Lightning Observations
Boccippio, Dennis
2004-01-01
Two classes of multivariate statistical inference using TRMM Lightning Imaging Sensor, Precipitation Radar, and Microwave Imager observations are studied, using nonlinear classification neural networks as inferential tools. The very large and globally representative data sample provided by TRMM allows both training and validation (without overfitting) of neural networks with many degrees of freedom. In the first study, the flashing / non-flashing condition of storm complexes is diagnosed using radar, passive microwave and/or environmental observations as neural network inputs. The diagnostic skill of these simple lightning/no-lightning classifiers can be quite high over land (above 80% Probability of Detection; below 20% False Alarm Rate). In the second, passive microwave and lightning observations are used to diagnose radar reflectivity vertical structure. Because a priori diagnosis of hydrometeor vertical structure is highly important for improved rainfall retrieval from either orbital radars (e.g., the future Global Precipitation Mission "mothership") or radiometers (e.g., operational SSM/I and future Global Precipitation Mission passive microwave constellation platforms), we explore the incremental benefit to such diagnosis provided by lightning observations.
Statistical sampling and modelling for cork oak and eucalyptus stands
Paulo, M.J.
2002-01-01
This thesis focuses on the use of modern statistical methods to solve problems on sampling, optimal cutting time and agricultural modelling in Portuguese cork oak and eucalyptus stands. The results are contained in five chapters that have been submitted for publication
Multivariate statistics high-dimensional and large-sample approximations
Fujikoshi, Yasunori; Shimizu, Ryoichi
2010-01-01
A comprehensive examination of high-dimensional analysis of multivariate methods and their real-world applications Multivariate Statistics: High-Dimensional and Large-Sample Approximations is the first book of its kind to explore how classical multivariate methods can be revised and used in place of conventional statistical tools. Written by prominent researchers in the field, the book focuses on high-dimensional and large-scale approximations and details the many basic multivariate methods used to achieve high levels of accuracy. The authors begin with a fundamental presentation of the basic
Statistical conditional sampling for variable-resolution video compression.
Alexander Wong
In this study, we investigate a variable-resolution approach to video compression based on conditional random field (CRF) modeling and statistical conditional sampling in order to further improve compression rate while maintaining high-quality video. In the proposed approach, representative key-frames within a video shot are identified and stored at full resolution. The remaining frames within the video shot are stored and compressed at a reduced resolution. At the decompression stage, a region-based dictionary is constructed from the key-frames and used to restore the reduced resolution frames to the original resolution via statistical conditional sampling. The sampling approach is based on the conditional probability from the CRF model, using the constructed dictionary. Experimental results show that the proposed variable-resolution approach via statistical conditional sampling has potential for improving compression rates when compared to compressing the video at full resolution, while achieving higher video quality when compared to compressing the video at reduced resolution.
Methods for Statistical Data Analysis of Multivariate Observations
Gnanadesikan, R
1997-01-01
A practical guide for multivariate statistical techniques-- now updated and revised In recent years, innovations in computer technology and statistical methodologies have dramatically altered the landscape of multivariate data analysis. This new edition of Methods for Statistical Data Analysis of Multivariate Observations explores current multivariate concepts and techniques while retaining the same practical focus of its predecessor. It integrates methods and data-based interpretations relevant to multivariate analysis in a way that addresses real-world problems arising in many areas of inte
Elias Chaibub Neto
In this paper we propose a vectorized implementation of the non-parametric bootstrap for statistics based on sample moments. Basically, we adopt the multinomial sampling formulation of the non-parametric bootstrap, and compute bootstrap replications of sample moment statistics by simply weighting the observed data according to multinomial counts instead of evaluating the statistic on a resampled version of the observed data. Using this formulation we can generate a matrix of bootstrap weights and compute the entire vector of bootstrap replications with a few matrix multiplications. Vectorization is particularly important for matrix-oriented programming languages such as R, where matrix/vector calculations tend to be faster than scalar operations implemented in a loop. We illustrate the application of the vectorized implementation in real and simulated data sets, when bootstrapping Pearson's sample correlation coefficient, and compare its performance against two state-of-the-art R implementations of the non-parametric bootstrap, as well as a straightforward one based on a for loop. Our investigations spanned varying sample sizes and numbers of bootstrap replications. The vectorized bootstrap compared favorably against the state-of-the-art implementations in all cases tested, and was remarkably faster for small sample sizes and considerably faster for moderate ones. The same results were observed in the comparison with the straightforward implementation, except for large sample sizes, where the vectorized bootstrap was slightly slower than the straightforward implementation due to increased time expenditures in the generation of weight matrices via multinomial sampling.
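The multinomial weighting idea can be sketched in a few lines of NumPy (the paper's implementations are in R; this is an illustrative Python translation, not the authors' code), here for the simplest moment statistic, the sample mean:

```python
import numpy as np

rng = np.random.default_rng(0)

def vectorized_bootstrap_mean(x, n_boot):
    """Bootstrap replications of the sample mean via the multinomial
    formulation: weight the observed data by multinomial counts / n
    instead of physically resampling."""
    n = len(x)
    # Each row of W holds multinomial counts summing to n.
    W = rng.multinomial(n, np.full(n, 1.0 / n), size=n_boot)
    return (W @ x) / n        # one matrix product -> all replications

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
reps = vectorized_bootstrap_mean(x, n_boot=10_000)
se = reps.std()               # bootstrap standard error of the mean
```

Higher moments (and hence Pearson's correlation) follow the same pattern: every bootstrap moment is a weighted sum of the observed data, so the whole replication vector is a matrix product.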
Statistical Methods and Tools for Hanford Staged Feed Tank Sampling
Fountain, Matthew S. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Brigantic, Robert T. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Peterson, Reid A. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States)
2013-10-01
This report summarizes work conducted by Pacific Northwest National Laboratory to technically evaluate the current approach to staged feed sampling of high-level waste (HLW) sludge to meet waste acceptance criteria (WAC) for transfer from tank farms to the Hanford Waste Treatment and Immobilization Plant (WTP). The current sampling and analysis approach is detailed in the document titled Initial Data Quality Objectives for WTP Feed Acceptance Criteria, 24590-WTP-RPT-MGT-11-014, Revision 0 (Arakali et al. 2011). The goal of this current work is to evaluate and provide recommendations to support a defensible, technical and statistical basis for the staged feed sampling approach that meets WAC data quality objectives (DQOs).
Gene coexpression measures in large heterogeneous samples using count statistics.
Wang, Y X Rachel; Waterman, Michael S; Huang, Haiyan
2014-11-18
With the advent of high-throughput technologies making large-scale gene expression data readily available, developing appropriate computational tools to process these data and distill insights into systems biology has been an important part of the "big data" challenge. Gene coexpression is one of the earliest techniques developed that is still widely in use for functional annotation, pathway analysis, and, most importantly, the reconstruction of gene regulatory networks, based on gene expression data. However, most coexpression measures do not specifically account for local features in expression profiles. For example, it is very likely that the patterns of gene association may change or only exist in a subset of the samples, especially when the samples are pooled from a range of experiments. We propose two new gene coexpression statistics based on counting local patterns of gene expression ranks to take into account the potentially diverse nature of gene interactions. In particular, one of our statistics is designed for time-course data with local dependence structures, such as time series coupled over a subregion of the time domain. We provide asymptotic analysis of their distributions and power, and evaluate their performance against a wide range of existing coexpression measures on simulated and real data. Our new statistics are fast to compute, robust against outliers, and show comparable and often better general performance.
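The paper's count statistics are more elaborate, but the flavor of counting local rank patterns can be illustrated with a toy statistic that counts sliding windows in which two expression profiles share the same rank ordering (function names and data here are illustrative, not the authors'):

```python
def local_rank_match_count(x, y, k=3):
    """Toy count statistic: the number of sliding windows of length k
    in which x and y show identical rank orderings. Local patterns can
    agree in a subregion even when global correlation is weak."""
    def pattern(seq):
        # Indices of the window sorted by value, i.e. its rank ordering.
        return tuple(sorted(range(len(seq)), key=seq.__getitem__))
    n = len(x)
    return sum(
        pattern(x[i:i + k]) == pattern(y[i:i + k])
        for i in range(n - k + 1)
    )

x = [1.0, 2.0, 3.0, 1.5, 0.5, 2.5]
y = [0.2, 0.9, 1.1, 0.8, 0.1, 0.05]   # tracks x only locally
count = local_rank_match_count(x, y)
```

Rank-based counting of this kind is what makes such statistics robust against outliers and fast to compute.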
Weighted statistical parameters for irregularly sampled time series
Rimoldini, Lorenzo
2014-01-01
Unevenly spaced time series are common in astronomy because of the day-night cycle, weather conditions, dependence on the source position in the sky, allocated telescope time and corrupt measurements, for example, or inherent to the scanning law of satellites like Hipparcos and the forthcoming Gaia. Irregular sampling often causes clumps of measurements and gaps with no data which can severely disrupt the values of estimators. This paper aims at improving the accuracy of common statistical parameters when linear interpolation (in time or phase) can be considered an acceptable approximation of a deterministic signal. A pragmatic solution is formulated in terms of a simple weighting scheme, adapting to the sampling density and noise level, applicable to large data volumes at minimal computational cost. Tests on time series from the Hipparcos periodic catalogue led to significant improvements in the overall accuracy and precision of the estimators with respect to the unweighted counterparts and those weighted by inverse-squared uncertainties. Automated classification procedures employing statistical parameters weighted by the suggested scheme confirmed the benefits of the improved input attributes. The classification of eclipsing binaries, Mira, RR Lyrae, Delta Cephei and Alpha2 Canum Venaticorum stars employing exclusively weighted descriptive statistics achieved an overall accuracy of 92 per cent, about 6 per cent higher than with unweighted estimators.
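One simple interval-based weighting scheme in the spirit described above (not necessarily the paper's exact formulation, which also adapts to noise level) weights each observation by half the time span between its neighbours, down-weighting clumped measurements:

```python
def interval_weights(times):
    """Weight each observation by half the time span between its
    neighbours, so points in dense clumps receive less weight.
    Endpoints mirror their single adjacent interval."""
    n = len(times)
    w = []
    for i in range(n):
        left = times[i] - times[i - 1] if i > 0 else times[1] - times[0]
        right = times[i + 1] - times[i] if i < n - 1 else times[-1] - times[-2]
        w.append(0.5 * (left + right))
    total = sum(w)
    return [wi / total for wi in w]

def weighted_mean(values, weights):
    return sum(w * v for w, v in zip(weights, values))

times = [0.0, 0.1, 0.2, 5.0, 10.0]    # a clump, then large gaps
values = [1.0, 1.2, 0.8, 3.0, 3.2]
w = interval_weights(times)
m = weighted_mean(values, w)          # pulled toward the sparse points
```

The unweighted mean of these values is 1.84, dominated by the clump; the interval-weighted mean is about 2.62, closer to what linear interpolation of the signal would give.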
Statistical sampling applied to the radiological characterization of historical waste
Zaffora Biagio
2016-01-01
The evaluation of the activity of radionuclides in radioactive waste is required for its disposal in final repositories. Easy-to-measure nuclides, like γ-emitters and high-energy X-rays, can be measured via non-destructive nuclear techniques from outside a waste package. Some radionuclides are difficult-to-measure (DTM) from outside a package because they are α- or β-emitters. The present article discusses the application of linear regression, scaling factors (SF) and the so-called "mean activity method" to estimate the activity of DTM nuclides on metallic waste produced at the European Organization for Nuclear Research (CERN). Various statistical sampling techniques including simple random sampling, systematic sampling, stratified and authoritative sampling are described and applied to two waste populations of activated copper cables. The bootstrap is introduced as a tool to estimate average activities and standard errors in waste characterization. The analysis of the DTM Ni-63 is used as an example. Experimental and theoretical values of SFs are calculated and compared. Guidelines for sampling historical waste using probabilistic and non-probabilistic sampling are finally given.
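A stratified estimate of mean activity, one of the sampling designs mentioned above, can be sketched as follows (stratum sizes and activities are invented for illustration, not CERN data):

```python
import statistics

def stratified_mean(strata):
    """Stratified estimate of a population mean activity.

    strata: list of (N_h, sample) pairs, where N_h is the stratum size
    and sample the measured activities drawn from that stratum.
    The estimate weights each stratum mean by its size.
    """
    N = sum(N_h for N_h, _ in strata)
    return sum(N_h * statistics.mean(s) for N_h, s in strata) / N

# Two strata of copper cables, e.g. split by irradiation position:
strata = [
    (300, [5.0, 6.0, 5.5]),   # N_h = 300 cables, 3 measured
    (100, [20.0, 22.0]),      # N_h = 100 cables, 2 measured
]
est = stratified_mean(strata)
```

When activity differs strongly between strata, this estimator has a much smaller standard error than simple random sampling with the same total number of measurements.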
Statistical Analysis of Sport Movement Observations: the Case of Orienteering
Amouzandeh, K.; Karimipour, F.
2017-09-01
Study of movement observations is becoming more popular in several applications. Particularly, analyzing sport movement time series has been considered as a demanding area. However, most of the attempts made on analyzing movement sport data have focused on spatial aspects of movement to extract some movement characteristics, such as spatial patterns and similarities. This paper proposes statistical analysis of sport movement observations, which refers to analyzing changes in the spatial movement attributes (e.g. distance, altitude and slope) and non-spatial movement attributes (e.g. speed and heart rate) of athletes. As the case study, an example dataset of movement observations acquired during the "orienteering" sport is presented and statistically analyzed.
Examination of statistical noise in SPECT image and sampling pitch
Takaki, Akihiro; Soma, Tsutomu; Murase, Kenya; Watanabe, Hiroyuki; Murakami, Tomonori; Kawakami, Kazunori; Teraoka, Satomi; Kojima, Akihiro; Matsumoto, Masanori
2008-01-01
Statistical noise in single photon emission computed tomography (SPECT) images was examined for its relation to total count and to sampling pitch, using a simulation and a phantom experiment to obtain projection data under defined conditions. The SPECT simulation assumed a virtual, homogeneous water column (20 cm diameter) as the absorbing mass. The phantom experiment used a 3D-Hoffman brain phantom (Data Spectrum Corp.) filled with 370 MBq of 99mTc-pertechnetate solution and a facing two-detector SPECT machine, E-CAM (Siemens), with a low-energy/high-resolution collimator. Projection data from the two methods were reconstructed by filtered back projection into transaxial images. The noise was evaluated visually; by the root-mean-square uncertainty calculated from the average count and standard deviation (SD) in a region of interest (ROI) defined in the reconstructed images; and by the normalized mean square of the difference between each obtained slice and a reference image reconstructed with a common sampling pitch, for both the simulation and the phantom. In conclusion, the sampling pitch should be set in the machine so as to approximate the value given by the sampling theorem, even though the projection counts per angular direction are then smaller for the same total data acquisition time. (R.T.)
Kim, J.W.
1980-01-01
Observed magnetic resonance curves are statistically re-examined. Typical models of resonance lines are the Lorentzian and Gaussian distribution functions. For metallic, alloy, or intermetallic compound samples, the observed resonance lines are superpositions of an absorption line and a dispersion line. Methods for analyzing such superposed resonance lines are demonstrated. (author)
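The superposition of absorption and dispersion components can be illustrated with Lorentzian lineshapes (an illustrative parametrization, with alpha as the dispersion fraction; not the author's notation):

```python
def lorentz_absorption(x, x0, w):
    """Lorentzian absorption lineshape centered at x0 with half-width w."""
    return w / (w * w + (x - x0) ** 2)

def lorentz_dispersion(x, x0, w):
    """Corresponding dispersion (antisymmetric) component."""
    return (x - x0) / (w * w + (x - x0) ** 2)

def mixed_line(x, x0, w, alpha):
    """Superposed resonance line: a mixture of absorption and
    dispersion, as observed in metallic samples."""
    return ((1 - alpha) * lorentz_absorption(x, x0, w)
            + alpha * lorentz_dispersion(x, x0, w))
```

Fitting alpha along with the center and width is one way to decompose an observed metallic resonance line into its two components.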
In situ statistical observations of EMIC waves by Arase satellite
Nomura, R.; Matsuoka, A.; Teramoto, M.; Nose, M.; Yoshizumi, M.; Fujimoto, A.; Shinohara, M.; Tanaka, Y.
2017-12-01
We present an in situ statistical survey of electromagnetic ion cyclotron (EMIC) waves observed by the Arase satellite from 3 March to 16 July 2017. We identified 64 events using the fluxgate magnetometer (MGF) on the satellite. EMIC waves are key phenomena for understanding the loss dynamics of MeV-energy electrons in the radiation belt. We will show the radial and latitudinal dependence of the wave occurrence rate and the wave parameters (frequency band, coherence, polarization, and ellipticity). In particular, EMIC waves observed in regions of locally weak background magnetic field will be discussed with respect to the wave excitation mechanism in the deep inner magnetosphere.
A Third Moment Adjusted Test Statistic for Small Sample Factor Analysis.
Lin, Johnny; Bentler, Peter M
2012-01-01
Goodness of fit testing in factor analysis is based on the assumption that the test statistic is asymptotically chi-square; but this property may not hold in small samples even when the factors and errors are normally distributed in the population. Robust methods such as Browne's asymptotically distribution-free method and Satorra and Bentler's mean scaling statistic were developed under the presumption of non-normality in the factors and errors. This paper finds new application to the case where factors and errors are normally distributed in the population but the skewness of the obtained test statistic is still high due to sampling error in the observed indicators. An extension of Satorra and Bentler's statistic is proposed that not only scales the mean but also adjusts the degrees of freedom based on the skewness of the obtained test statistic in order to improve its robustness under small samples. A simple simulation study shows that this third moment adjusted statistic asymptotically performs on par with previously proposed methods, and at a very small sample size offers superior Type I error rates under a properly specified model. Data from Mardia, Kent and Bibby's study of students tested for their ability in five content areas that were either open or closed book were used to illustrate the real-world performance of this statistic.
Statistical sampling plan for the TRU waste assay facility
Beauchamp, J.J.; Wright, T.; Schultz, F.J.; Haff, K.; Monroe, R.J.
1983-08-01
Due to limited space, there is a need to dispose appropriately of the Oak Ridge National Laboratory transuranic waste which is presently stored below ground in 55-gal (208-l) drums within weather-resistant structures. Waste containing less than 100 nCi/g transuranics can be removed from the present storage and be buried, while waste containing greater than 100 nCi/g transuranics must continue to be retrievably stored. To make the necessary measurements needed to determine the drums that can be buried, a transuranic Neutron Interrogation Assay System (NIAS) has been developed at Los Alamos National Laboratory and can make the needed measurements much faster than previous techniques which involved γ-ray spectroscopy. The previous techniques are reliable but time consuming. Therefore, a validation study has been planned to determine the ability of the NIAS to make adequate measurements. The validation of the NIAS will be based on a paired comparison of a sample of measurements made by the previous techniques and the NIAS. The purpose of this report is to describe the proposed sampling plan and the statistical analyses needed to validate the NIAS. 5 references, 4 figures, 5 tables
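The paired comparison underlying the validation plan can be sketched with a paired t statistic computed from hypothetical drum measurements (illustrative numbers, not data from the report):

```python
import math
import statistics

def paired_t(x_old, x_new):
    """Paired t statistic for the mean difference between two
    measurement methods applied to the same drums."""
    d = [a - b for a, b in zip(x_old, x_new)]
    n = len(d)
    mean_d = statistics.mean(d)
    sd_d = statistics.stdev(d)
    return mean_d / (sd_d / math.sqrt(n))

# Hypothetical nCi/g readings on the same five drums:
gamma = [80.0, 95.0, 120.0, 60.0, 150.0]   # previous technique
nias  = [78.0, 97.0, 118.0, 62.0, 149.0]   # NIAS
t = paired_t(gamma, nias)
# |t| below the two-sided t critical value for df = 4 would indicate
# no evidence of a systematic disagreement between the methods.
```

Pairing removes the drum-to-drum variability, which here is far larger than the method-to-method differences, so the comparison is much more sensitive than two independent samples would be.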
The nature of the redshift and directly observed quasar statistics.
Segal, I E; Nicoll, J F; Wu, P; Zhou, Z
1991-07-01
The nature of the cosmic redshift is one of the most fundamental questions in modern science. Hubble's discovery of the apparent Expansion of the Universe is derived from observations on a small number of galaxies at very low redshifts. Today, quasar redshifts have a range more than 1000 times greater than those in Hubble's sample, and represent more than 100 times as many objects. A recent comprehensive compilation of published measurements provides the basis for a study indicating that quasar observations are not in good agreement with the original predictions of the Expanding Universe theory, but are well fit by the predictions of an alternative theory having fewer adjustable parameters.
A Statistical Study of Interplanetary Type II Bursts: STEREO Observations
Krupar, V.; Eastwood, J. P.; Magdalenic, J.; Gopalswamy, N.; Kruparova, O.; Szabo, A.
2017-12-01
Coronal mass ejections (CMEs) are the primary cause of the most severe and disruptive space weather events such as solar energetic particle (SEP) events and geomagnetic storms at Earth. Interplanetary type II bursts are generated via the plasma emission mechanism by energetic electrons accelerated at CME-driven shock waves and hence identify CMEs that potentially cause space weather impact. As CMEs propagate outward from the Sun, radio emissions are generated at progressively lower frequencies corresponding to a decreasing ambient solar wind plasma density. We have performed a statistical study of 153 interplanetary type II bursts observed by the two STEREO spacecraft between March 2008 and August 2014. These events have been correlated with manually identified CMEs contained in the Heliospheric Cataloguing, Analysis and Techniques Service (HELCATS) catalogue. Our results confirm that faster CMEs are more likely to produce interplanetary type II radio bursts. We have compared observed frequency drifts with white-light observations to estimate angular deviations of type II burst propagation directions from radial. We have found that interplanetary type II bursts preferably arise from CME flanks. Finally, we discuss the visibility of radio emissions in relation to the CME propagation direction.
Search for lesions in mammograms: Statistical characterization of observer responses
Bochud, Francois O.; Abbey, Craig K.; Eckstein, Miguel P.
2004-01-01
We investigate human performance for visually detecting simulated microcalcifications and tumors embedded in x-ray mammograms as a function of signal contrast and the number of possible signal locations. Our results show that performance degradation with an increasing number of locations is well approximated by signal detection theory (SDT) with the usual Gaussian assumption. However, more stringent statistical analysis finds a departure from Gaussian assumptions for the detection of microcalcifications. We investigated whether these departures from the SDT Gaussian model could be accounted for by an increase in human internal response correlations arising from the image-pixel correlations present in 1/f spectrum backgrounds and/or observer internal response distributions that departed from the Gaussian assumption. Results were consistent with a departure from the Gaussian response distributions and suggested that the human observer internal responses were more compact than the Gaussian distribution. Finally, we conducted a free search experiment where the signal could appear anywhere within the image. Results show that human performance in a multiple-alternative forced-choice experiment can be used to predict performance in the clinically realistic free search experiment when the investigator takes into account the search area and the observers' inherent spatial imprecision to localize the targets.
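Under the equal-variance Gaussian SDT model mentioned above, the proportion correct in an M-alternative forced-choice task has a standard closed integral form; a sketch evaluating it numerically with the standard library:

```python
from statistics import NormalDist

def mafc_pc(dprime, m, lo=-8.0, hi=8.0, steps=4000):
    """Proportion correct in an M-alternative forced-choice task under
    the equal-variance Gaussian SDT model:
        PC = integral of phi(x - d') * Phi(x)**(m - 1) dx
    (the signal response exceeds all m-1 noise responses), evaluated
    with a simple midpoint rule."""
    nd = NormalDist()
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * h
        total += nd.pdf(x - dprime) * nd.cdf(x) ** (m - 1) * h
    return total

pc2 = mafc_pc(1.0, 2)   # two alternatives, d' = 1
pc8 = mafc_pc(1.0, 8)   # more locations -> lower proportion correct
```

This reproduces the location-number degradation the abstract describes: for fixed d', PC drops monotonically as the number of possible signal locations grows.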
Garfield, Joan; Le, Laura; Zieffler, Andrew; Ben-Zvi, Dani
2015-01-01
This paper describes the importance of developing students' reasoning about samples and sampling variability as a foundation for statistical thinking. Research on expert-novice thinking as well as statistical thinking is reviewed and compared. A case is made that statistical thinking is a type of expert thinking, and as such, research…
Hayslett, H T
1991-01-01
Statistics covers the basic principles of statistics. The book starts by tackling the importance and the two kinds of statistics; the presentation of sample data; the definition, illustration and explanation of several measures of location; and the measures of variation. The text then discusses elementary probability, the normal distribution and the normal approximation to the binomial. Testing of statistical hypotheses and tests of hypotheses about the theoretical proportion of successes in a binomial population and about the theoretical mean of a normal population are explained. The text the
Parametric statistical inference for discretely observed diffusion processes
Pedersen, Asger Roer
Part 1: Theoretical results. Part 2: Statistical applications of Gaussian diffusion processes in freshwater ecology.
Statistical Sampling For In-Service Inspection Of Liquid Waste Tanks At The Savannah River Site
Harris, S.
2011-01-01
Savannah River Remediation, LLC (SRR) is implementing a statistical sampling strategy for In-Service Inspection (ISI) of Liquid Waste (LW) Tanks at the United States Department of Energy's Savannah River Site (SRS) in Aiken, South Carolina. As a component of SRS's corrosion control program, the ISI program assesses tank wall structural integrity through the use of ultrasonic testing (UT). The statistical strategy for ISI is based on the random sampling of a number of vertically oriented unit areas, called strips, within each tank. The number of strips to inspect was determined so as to attain, over time, a high probability of observing at least one of the worst 5% in terms of pitting and corrosion across all tanks. The probability estimation to determine the number of strips to inspect was performed using the hypergeometric distribution. Statistical tolerance limits for pit depth and corrosion rates were calculated by fitting the lognormal distribution to the data. In addition to the strip sampling strategy, a single strip within each tank was identified to serve as the baseline for a longitudinal assessment of the tank safe operational life. The statistical sampling strategy enables the ISI program to develop individual profiles of LW tank wall structural integrity that collectively provide a high confidence in their safety and integrity over operational lifetimes.
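The hypergeometric sample-size calculation described above can be sketched directly: find the smallest number of strips whose simple random sample contains at least one of the worst 5% with the desired probability (the strip counts below are illustrative, not SRS figures):

```python
from math import comb

def strips_needed(N, worst_frac=0.05, prob=0.95):
    """Smallest sample size n such that a simple random sample of n
    strips (drawn without replacement from N) contains at least one of
    the worst `worst_frac` strips with probability >= prob.
    P(none of the K worst in the sample) = C(N-K, n) / C(N, n)."""
    K = max(1, round(N * worst_frac))   # number of "worst" strips
    for n in range(1, N + 1):
        p_none = comb(N - K, n) / comb(N, n)
        if 1.0 - p_none >= prob:
            return n
    return N

n = strips_needed(200)   # e.g. 200 candidate strips across the tanks
```

Because sampling is without replacement, the hypergeometric requirement is smaller than the binomial approximation would suggest, especially as n approaches N.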
Exact distributions of two-sample rank statistics and block rank statistics using computer algebra
Wiel, van de M.A.
1998-01-01
We derive generating functions for various rank statistics and we use computer algebra to compute the exact null distribution of these statistics. We present various techniques for reducing time and memory space used by the computations. We use the results to write Mathematica notebooks for
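The generating-function approach can be sketched with a dynamic program that extracts the exact null distribution of the two-sample rank-sum statistic (a small Python analogue of what the authors do with computer algebra in Mathematica):

```python
def rank_sum_null(n, m):
    """Exact null distribution of the two-sample rank-sum statistic
    (sum of the ranks held by the m-sample) for sample sizes n and m.
    Equivalent to extracting coefficients of the generating function
    over all m-subsets of the ranks {1, ..., n+m}, via dynamic
    programming."""
    N = n + m
    max_w = sum(range(N - m + 1, N + 1))     # largest possible rank sum
    # dp[k][w] = number of k-subsets of the ranks seen so far with sum w
    dp = [[0] * (max_w + 1) for _ in range(m + 1)]
    dp[0][0] = 1
    for r in range(1, N + 1):                # add rank r to the pool
        for k in range(min(r, m), 0, -1):    # downward: use r at most once
            for w in range(max_w, r - 1, -1):
                dp[k][w] += dp[k - 1][w - r]
    total = sum(dp[m])                       # = C(N, m) subsets in all
    return {w: c / total for w, c in enumerate(dp[m]) if c}

dist = rank_sum_null(3, 2)   # sample sizes 3 and 2, so 5 ranks in all
```

For these tiny sizes the distribution can be checked by hand: the 10 two-element subsets of {1,...,5} have rank sums 3 through 9, with sums 5, 6, and 7 each attainable in two ways.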
Statistical techniques for sampling and monitoring natural resources
Hans T. Schreuder; Richard Ernst; Hugo Ramirez-Maldonado
2004-01-01
We present the statistical theory of inventory and monitoring from a probabilistic point of view. We start with the basics and show the interrelationships between designs and estimators illustrating the methods with a small artificial population as well as with a mapped realistic population. For such applications, useful open source software is given in Appendix 4....
Ayam, Rufus Tekoh
2011-01-01
PURPOSE: The two approaches to audit sampling, statistical and nonstatistical, are examined in this study. The overall purpose of the study is to explore the extent to which statistical and nonstatistical sampling approaches are utilized by independent auditors during auditing practices. Moreover, the study also seeks to achieve two additional purposes; the first is to find out whether auditors utilize different sampling techniques when auditing SME´s (Small and Medium-Sized Ente...
Statistically robust sampling strategies form an integral component of grain storage and handling activities throughout the world. Developing sampling strategies to target biological pests such as insects in stored grain is inherently difficult due to species biology and behavioral characteristics. ...
Statistical sampling strategies for survey of soil contamination
Brus, D.J.
2011-01-01
This chapter reviews methods for selecting sampling locations in contaminated soils for three situations. In the first situation a global estimate of the soil contamination in an area is required. The result of the survey is a number or a series of numbers per contaminant, e.g. the estimated mean
Small sample approach, and statistical and epidemiological aspects
Offringa, Martin; van der Lee, Hanneke
2011-01-01
In this chapter, the design of pharmacokinetic studies and phase III trials in children is discussed. Classical approaches and relatively novel approaches, which may be more useful in the context of drug research in children, are discussed. The burden of repeated blood sampling in pediatric
Statistical evaluations of current sampling procedures and incomplete core recovery
Heasler, P.G.; Jensen, L.
1994-03-01
This document develops two formulas that describe the effects of incomplete recovery on core sampling results for the Hanford waste tanks. The formulas evaluate incomplete core recovery from a worst-case (i.e., biased) and best-case (i.e., unbiased) perspective. A core sampler is unbiased if the sample material recovered is a random sample of the material in the tank, while any sampler that preferentially recovers a particular type of waste over others is a biased sampler. There is strong evidence to indicate that the push-mode sampler presently used at the Hanford site is a biased one. The formulas presented here show the effects of incomplete core recovery on the accuracy of composition measurements, as functions of the vertical variability in the waste. These equations are evaluated using vertical variability estimates from previously sampled tanks (B110, U110, C109). Assuming that the values of vertical variability used in this study adequately describe the Hanford tank farm, one can use the formulas to compute the effect of incomplete recovery on the accuracy of an average constituent estimate. To determine acceptable recovery limits, we have assumed that the relative error of such an estimate should be no more than 20%.
Microvariability in AGNs: study of different statistical methods - I. Observational analysis
Zibecchi, L.; Andruchow, I.; Cellone, S. A.; Carpintero, D. D.; Romero, G. E.; Combi, J. A.
2017-05-01
We present the results of a study of different statistical methods currently used in the literature to analyse the (micro)variability of active galactic nuclei (AGNs) from ground-based optical observations. In particular, we focus on the comparison between the results obtained by applying the so-called C and F statistics, which are based on the ratio of standard deviations and variances, respectively. The motivation for this is that the implementation of these methods leads to different and contradictory results, making the variability classification of the light curves of a certain source dependent on the statistics implemented. For this purpose, we re-analyse the results on an AGN sample observed along several sessions with the 2.15 m 'Jorge Sahade' telescope (CASLEO), San Juan, Argentina. For each AGN, we constructed the nightly differential light curves. We thus obtained a total of 78 light curves for 39 AGNs, and we then applied the statistical tests mentioned above, in order to re-classify the variability state of these light curves and in an attempt to find the suitable statistical methodology to study photometric (micro)variations. We conclude that, although the C criterion is not a proper statistical test, it could still be a suitable parameter to detect variability, and that its application allows us to get more reliable variability results, in contrast with the F test.
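A minimal sketch of the C and F parameters as commonly defined in the microvariability literature: the ratio of standard deviations, and of variances, of the target and comparison differential light curves (the data and the thresholds in the comments are illustrative conventions, not necessarily the paper's):

```python
import statistics

def c_and_f(target_diff, comparison_diff):
    """C and F variability parameters for differential light curves:
    C is the ratio of standard deviations of the target and comparison
    differential light curves; F is the corresponding variance ratio."""
    sd_t = statistics.stdev(target_diff)
    sd_c = statistics.stdev(comparison_diff)
    return sd_t / sd_c, (sd_t / sd_c) ** 2

# Hypothetical nightly differential magnitudes (target - comparison,
# and comparison - check star):
target = [0.01, -0.03, 0.04, -0.02, 0.05, -0.04]
check  = [0.01, -0.01, 0.01, 0.00, -0.01, 0.01]
C, F = c_and_f(target, check)
# A common convention flags the curve as variable when C >= 2.576
# (a 99.5% criterion), while F is compared against an F-distribution
# critical value with (n_t - 1, n_c - 1) degrees of freedom.
```

Because F is simply C squared, the two parameters rank light curves identically; the contradictory classifications the abstract describes arise from the different critical values against which each is compared.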
Galbraith, Niall D; Manktelow, Ken I; Morris, Neil G
2010-11-01
Previous studies demonstrate that people high in delusional ideation exhibit a data-gathering bias on inductive reasoning tasks. The current study set out to investigate the factors that may underpin such a bias by examining healthy individuals, classified as either high or low scorers on the Peters et al. Delusions Inventory (PDI). More specifically, whether high PDI scorers have a relatively poor appreciation of sample size and heterogeneity when making statistical judgments. In Expt 1, high PDI scorers made higher probability estimates when generalizing from a sample of 1 with regard to the heterogeneous human property of obesity. In Expt 2, this effect was replicated and was also observed in relation to the heterogeneous property of aggression. The findings suggest that delusion-prone individuals are less appreciative of the importance of sample size when making statistical judgments about heterogeneous properties; this may underpin the data gathering bias observed in previous studies. There was some support for the hypothesis that threatening material would exacerbate high PDI scorers' indifference to sample size.
Statistical analysis of archeomagnetic samples of Teotihuacan, Mexico
Soler-Arechalde, A. M.
2012-12-01
Teotihuacan was one of the most important metropolises of Mesoamerica during the Classic Period (1 to 600 AD). The city grew continuously in different stages that usually concluded with a ritual. Fire was an important element: natives would burn entire structures. An example is the Quetzalcoatl pyramid in La Ciudadela (350 AD), which was burned and had a new structure built over it; another is the Big Fire of 570 AD, which marks the city's end. These events are suitable for archaeomagnetic dating. The inclusion of ash in the stucco enhances the magnetic signal of detrital type, which also allows dating. This increases the number of samples that can be processed, as well as the number of dates. The samples have been analyzed according to their type (floor, wall, talud, or painting) and whether or not they were exposed to fire. Sequences of directions obtained in excavations under strict stratigraphic control will be shown. A sequence of images was used to analyze the improvement of the Teotihuacan secular variation curve through more than a decade of continuous work in the area.
50 CFR 222.404 - Observer program sampling.
2010-10-01
... 50 Wildlife and Fisheries 7 2010-10-01 2010-10-01 false Observer program sampling. 222.404 Section 222.404 Wildlife and Fisheries NATIONAL MARINE FISHERIES SERVICE, NATIONAL OCEANIC AND ATMOSPHERIC... Requirement § 222.404 Observer program sampling. (a) During the program design, NMFS would be guided by the...
Piepel, Gregory F.; Matzke, Brett D.; Sego, Landon H.; Amidan, Brett G.
2013-04-27
This report discusses the methodology, formulas, and inputs needed to make characterization and clearance decisions for Bacillus anthracis-contaminated and uncontaminated (or decontaminated) areas using a statistical sampling approach. Specifically, the report includes the methods and formulas for calculating the • number of samples required to achieve a specified confidence in characterization and clearance decisions • confidence in making characterization and clearance decisions for a specified number of samples for two common statistically based environmental sampling approaches. In particular, the report addresses an issue raised by the Government Accountability Office by providing methods and formulas to calculate the confidence that a decision area is uncontaminated (or successfully decontaminated) if all samples collected according to a statistical sampling approach have negative results. Key to addressing this topic is the probability that an individual sample result is a false negative, which is commonly referred to as the false negative rate (FNR). The two statistical sampling approaches currently discussed in this report are 1) hotspot sampling to detect small isolated contaminated locations during the characterization phase, and 2) combined judgment and random (CJR) sampling during the clearance phase. Typically, if contamination is widely distributed in a decision area, it will be detectable via judgment sampling during the characterization phase. Hotspot sampling is appropriate for characterization situations where contamination is not widely distributed and may not be detected by judgment sampling. CJR sampling is appropriate during the clearance phase when it is desired to augment judgment samples with statistical (random) samples. The hotspot and CJR statistical sampling approaches are discussed in the report for four situations: 1. qualitative data (detect and non-detect) when the FNR = 0 or when using statistical sampling methods that account
Scheid Anika
2012-07-01
-case time requirements of such an SCFG-based sampling method without significant accuracy losses. If, on the other hand, the quality of sampled structures can be observed to strongly react to slight disturbances, there is little hope for improving the complexity by heuristic procedures. We hence provide a reliable test for the hypothesis that a heuristic method could be implemented to improve the time scaling of RNA secondary structure prediction in the worst case – without sacrificing much of the accuracy of the results. Conclusions: Our experiments indicate that absolute errors generally lead to the generation of useless sample sets, whereas relative errors seem to have only small negative impact on both the predictive accuracy and the overall quality of resulting structure samples. Based on these observations, we present some useful ideas for developing a time-reduced sampling method guaranteeing an acceptable predictive accuracy. We also discuss some inherent drawbacks that arise in the context of approximation. The key results of this paper are crucial for the design of an efficient and competitive heuristic prediction method based on the increasingly accepted and attractive statistical sampling approach. This has indeed been indicated by the construction of prototype algorithms.
The Statistics of Radio Astronomical Polarimetry: Disjoint, Superposed, and Composite Samples
Straten, W. van [Centre for Astrophysics and Supercomputing, Swinburne University of Technology, Hawthorn, VIC 3122 (Australia); Tiburzi, C., E-mail: willem.van.straten@aut.ac.nz [Max-Planck-Institut für Radioastronomie, Auf dem Hügel 69, D-53121 Bonn (Germany)
2017-02-01
A statistical framework is presented for the study of the orthogonally polarized modes of radio pulsar emission via the covariances between the Stokes parameters. To accommodate the typically heavy-tailed distributions of single-pulse radio flux density, the fourth-order joint cumulants of the electric field are used to describe the superposition of modes with arbitrary probability distributions. The framework is used to consider the distinction between superposed and disjoint modes, with particular attention to the effects of integration over finite samples. If the interval over which the polarization state is estimated is longer than the timescale for switching between two or more disjoint modes of emission, then the modes are unresolved by the instrument. The resulting composite sample mean exhibits properties that have been attributed to mode superposition, such as depolarization. Because the distinction between disjoint modes and a composite sample of unresolved disjoint modes depends on the temporal resolution of the observing instrumentation, the arguments in favor of superposed modes of pulsar emission are revisited, and observational evidence for disjoint modes is described. In principle, the four-dimensional covariance matrix that describes the distribution of sample mean Stokes parameters can be used to distinguish between disjoint modes, superposed modes, and a composite sample of unresolved disjoint modes. More comprehensive and conclusive interpretation of the covariance matrix requires more detailed consideration of various relevant phenomena, including temporally correlated subpulse modulation (e.g., jitter), statistical dependence between modes (e.g., covariant intensities and partial coherence), and multipath propagation effects (e.g., scintillation and scattering).
Finite-sample instrumental variables inference using an asymptotically pivotal statistic
Bekker, P; Kleibergen, F
2003-01-01
We consider the K-statistic, Kleibergen's (2002, Econometrica 70, 1781-1803) adaptation of the Anderson-Rubin (AR) statistic in instrumental variables regression. Whereas Kleibergen (2002) especially analyzes the asymptotic behavior of the statistic, we focus on finite-sample properties.
Statistical analyses to support guidelines for marine avian sampling. Final report
Kinlan, Brian P.; Zipkin, Elise; O'Connell, Allan F.; Caldow, Chris
2012-01-01
Interest in development of offshore renewable energy facilities has led to a need for high-quality, statistically robust information on marine wildlife distributions. A practical approach is described to estimate the amount of sampling effort required to have sufficient statistical power to identify species-specific “hotspots” and “coldspots” of marine bird abundance and occurrence in an offshore environment divided into discrete spatial units (e.g., lease blocks), where “hotspots” and “coldspots” are defined relative to a reference (e.g., regional) mean abundance and/or occurrence probability for each species of interest. For example, a location with average abundance or occurrence that is three times larger than the mean (3x effect size) could be defined as a “hotspot,” and a location that is three times smaller than the mean (1/3x effect size) as a “coldspot.” The choice of the effect size used to define hot and coldspots will generally depend on a combination of ecological and regulatory considerations. A method is also developed for testing the statistical significance of possible hotspots and coldspots. Both methods are illustrated with historical seabird survey data from the USGS Avian Compendium Database. Our approach consists of five main components: 1. A review of the primary scientific literature on statistical modeling of animal group size and avian count data to develop a candidate set of statistical distributions that have been used or may be useful to model seabird counts. 2. Statistical power curves for one-sample, one-tailed Monte Carlo significance tests of differences of observed small-sample means from a specified reference distribution. These curves show the power to detect "hotspots" or "coldspots" of occurrence and abundance at a range of effect sizes, given assumptions which we discuss. 3. A model selection procedure, based on maximum likelihood fits of models in the candidate set, to determine an appropriate statistical
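Component 2 above can be sketched as a toy Monte Carlo power estimate, assuming simple Poisson counts and invented parameter values (the report's candidate distributions and effect sizes are richer):

```python
import math
import random

def poisson(lam, rng):
    """Poisson draw via Knuth's multiplication method (fine for small lam)."""
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

def hotspot_power(lam, effect, n_surveys, alpha=0.05, reps=4000, rng=None):
    """Power of a one-sample, one-tailed Monte Carlo test that the mean
    count in a spatial unit exceeds the regional reference mean `lam`.
    `effect` = 3 corresponds to the abstract's 3x hotspot definition."""
    rng = rng or random.Random(7)
    # Null sampling distribution of the small-sample mean.
    null = sorted(sum(poisson(lam, rng) for _ in range(n_surveys)) / n_surveys
                  for _ in range(reps))
    crit = null[int((1 - alpha) * reps)]  # upper-alpha critical value
    hits = sum(sum(poisson(lam * effect, rng) for _ in range(n_surveys)) / n_surveys > crit
               for _ in range(reps))
    return hits / reps

print(hotspot_power(2.0, 3, n_surveys=1))  # few surveys: modest power
print(hotspot_power(2.0, 3, n_surveys=5))  # power grows with survey effort
```

Sweeping `n_surveys` traces out the kind of power curve the report describes.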
Comparing Simulated and Theoretical Sampling Distributions of the U3 Person-Fit Statistic.
Emons, Wilco H. M.; Meijer, Rob R.; Sijtsma, Klaas
2002-01-01
Studied whether the theoretical sampling distribution of the U3 person-fit statistic is in agreement with the simulated sampling distribution under different item response theory models and varying item and test characteristics. Simulation results suggest that the use of standard normal deviates for the standardized version of the U3 statistic may…
The statistical observation for frequency of occurrence of the anencephalus
Chun, Yong Sun; Suh, Sung Hee; Lee, Chung Sik
1974-01-01
During the period from January 1967 to August 1974, 14,308 delivery cases of Korean women were observed in the Ewha Womans University Hospital to determine congenital anomalies, especially anencephalus. The results obtained are as follows: 1) Fetal congenital anomalies were detected in 109 cases (0.76%) of the total 14,308 delivery cases. 2) The 5 cases (23%) were anencephalus, (84%) caused by medication of the Herb and Occidental drugs in the early pregnancy period within 3 months
On the representativeness of behavior observation samples in classrooms.
Tiger, Jeffrey H; Miller, Sarah J; Mevers, Joanna Lomas; Mintz, Joslyn Cynkus; Scheithauer, Mindy C; Alvarez, Jessica
2013-01-01
School consultants who rely on direct observation typically conduct observational samples (e.g., one 30-min observation per day) with the hope that the sample is representative of performance during the remainder of the day, but the representativeness of these samples is unclear. In the current study, we recorded the problem behavior of 3 referred students for 4 consecutive school days between 9:30 a.m. and 2:30 p.m. using duration recording in consecutive 10-min sessions. We then culled 10-min, 20-min, 30-min, and 60-min observations from the complete record and compared these observations to the true daily mean to assess their accuracy (i.e., how well individual observations represented the daily occurrence of target behaviors). The results indicated that when behavior occurred with low variability, the majority of brief observations were representative of the overall levels; however, when behavior occurred with greater variability, even 60-min observations did not accurately capture the true levels of behavior. © Society for the Experimental Analysis of Behavior.
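The sampling question above — how well brief windows recover a daily mean — can be caricatured with invented duration-recording data (all names and numbers below are hypothetical, not the study's data):

```python
import random
import statistics

def window_error(sessions, n_windows, win_len):
    """Mean absolute error of brief-observation estimates vs the daily mean.

    `sessions` is the full record of per-10-min occurrence levels; a
    window is `win_len` consecutive sessions chosen at random."""
    daily_mean = statistics.mean(sessions)
    rng = random.Random(3)
    errs = []
    for _ in range(n_windows):
        start = rng.randrange(len(sessions) - win_len + 1)
        win = sessions[start:start + win_len]
        errs.append(abs(statistics.mean(win) - daily_mean))
    return statistics.mean(errs)

rng = random.Random(11)
# Hypothetical proportion-of-session data for a 5-hour day (30 sessions).
stable   = [min(max(rng.gauss(0.3, 0.03), 0), 1) for _ in range(30)]
variable = [min(max(rng.gauss(0.3, 0.25), 0), 1) for _ in range(30)]

print(window_error(stable, 500, 3))    # 30-min windows, low-variability day
print(window_error(variable, 500, 3))  # same windows, high-variability day
```

The high-variability day yields much larger window errors, mirroring the study's finding that brief samples misrepresent variable behavior.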
A Unimodal Model for Double Observer Distance Sampling Surveys.
Earl F Becker
Distance sampling is a widely used method to estimate animal population size. Most distance sampling models utilize a monotonically decreasing detection function, such as a half-normal. Recent advances in distance sampling modeling allow for the incorporation of covariates into the distance model and the elimination of the assumption of perfect detection at some fixed distance (usually the transect line) with the use of double-observer models. The assumption of full observer independence in the double-observer model is problematic, but can be addressed by using the point independence assumption, which assumes there is one distance, the apex of the detection function, where the 2 observers are assumed independent. Aerially collected distance sampling data can have a unimodal shape and have been successfully modeled with a gamma detection function. Covariates in gamma detection models cause the apex of detection to shift depending upon covariate levels, making this model incompatible with the point independence assumption when using double-observer data. This paper reports a unimodal detection model based on a two-piece normal distribution that allows covariates, has only one apex, and is consistent with the point independence assumption when double-observer data are utilized. An aerial line-transect survey of black bears in Alaska illustrates how this method can be applied.
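A minimal sketch of the two-piece normal detection shape described above (unnormalized, with invented parameter values; the paper's covariate and double-observer likelihood machinery is omitted):

```python
import math

def two_piece_normal(x, apex, sigma_left, sigma_right):
    """Unimodal detection function built from a two-piece normal:
    different spreads on each side of a single apex, so covariates can
    scale the widths without creating a second mode. Scaled so that
    detection probability equals 1 at the apex."""
    sigma = sigma_left if x < apex else sigma_right
    return math.exp(-0.5 * ((x - apex) / sigma) ** 2)

# Hypothetical aerial-survey shape: detection ramps up away from the
# aircraft, peaks at 120 m, then falls off slowly.
g = [two_piece_normal(x, apex=120.0, sigma_left=40.0, sigma_right=200.0)
     for x in (0.0, 120.0, 400.0)]
print(g)  # apex value is 1; both tails fall below 1
```

Because the apex location is a single explicit parameter, point independence can be anchored there even when covariates rescale `sigma_left` and `sigma_right`.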
Observations in the statistical analysis of NBG-18 nuclear graphite strength tests
Hindley, Michael P.; Mitchell, Mark N.; Blaine, Deborah C.; Groenwold, Albert A.
2012-01-01
Highlights: ► Statistical analysis of NBG-18 nuclear graphite strength test. ► A Weibull distribution and normal distribution is tested for all data. ► A Bimodal distribution in the CS data is confirmed. ► The CS data set has the lowest variance. ► A Combined data set is formed and has Weibull distribution. - Abstract: The purpose of this paper is to report on the selection of a statistical distribution chosen to represent the experimental material strength of NBG-18 nuclear graphite. Three large sets of samples were tested during the material characterisation of the Pebble Bed Modular Reactor and Core Structure Ceramics materials. These sets of samples are tensile strength, flexural strength and compressive strength (CS) measurements. A relevant statistical fit is determined and the goodness of fit is also evaluated for each data set. The data sets are also normalised for ease of comparison, and combined into one representative data set. The validity of this approach is demonstrated. A second failure mode distribution is found on the CS test data. Identifying this failure mode supports the similar observations made in the past. The success of fitting the Weibull distribution through the normalised data sets allows us to improve the basis for the estimates of the variability. This could also imply that the variability on the graphite strength for the different strength measures is based on the same flaw distribution and thus a property of the material.
Effect of the Target Motion Sampling Temperature Treatment Method on the Statistics and Performance
Viitanen, Tuomas; Leppänen, Jaakko
2014-06-01
Target Motion Sampling (TMS) is a stochastic on-the-fly temperature treatment technique that is being developed as a part of the Monte Carlo reactor physics code Serpent. The method provides for modeling of arbitrary temperatures in continuous-energy Monte Carlo tracking routines with only one set of cross sections stored in the computer memory. Previously, only the performance of the TMS method in terms of CPU time per transported neutron has been discussed. Since the effective cross sections are not calculated at any point of a transport simulation with TMS, reaction rate estimators must be scored using sampled cross sections, which is expected to increase the variances and, consequently, to decrease the figures-of-merit. This paper examines the effects of TMS on the statistics and performance in practical calculations involving reaction rate estimation with collision estimators. Against all expectations, it turned out that the use of sampled response values has no practical effect on the performance of reaction rate estimators when using TMS with elevated basis cross section temperatures (EBT), i.e. the usual way. With 0 Kelvin cross sections, a significant increase in the variances of capture rate estimators was observed right below the energy region of unresolved resonances, but at these energies the figures-of-merit could be increased using a simple resampling technique to decrease the variances of the responses. It was, however, noticed that the use of the TMS method increases the statistical deviances of all estimators, including the flux estimator, by tens of percent in the vicinity of very strong resonances. This effect is actually not related to the use of sampled responses, but is instead an inherent property of the TMS tracking method and concerns both EBT and 0 K calculations.
R. Eric Heidel
2016-01-01
Statistical power is the ability to detect a significant effect, given that the effect actually exists in a population. Like most statistical concepts, statistical power tends to induce cognitive dissonance in hepatology researchers. However, planning for statistical power by an a priori sample size calculation is of paramount importance when designing a research study. There are five specific empirical components that make up an a priori sample size calculation: the scale of measurement of the outcome, the research design, the magnitude of the effect size, the variance of the effect size, and the sample size. A framework grounded in the phenomenon of isomorphism, or interdependencies amongst different constructs with similar forms, will be presented to understand the isomorphic effects of decisions made on each of the five aforementioned components of statistical power.
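As an illustrative sketch, the standard normal-approximation form of an a priori sample size calculation for a two-group comparison of means (a simplification of the components listed above, not the article's framework):

```python
import math
from statistics import NormalDist

def n_per_group(effect_size, alpha=0.05, power=0.80):
    """A priori sample size per group for a two-sided, two-sample
    comparison of means, using the normal approximation:
    n = 2 * (z_{1-alpha/2} + z_{power})^2 / d^2,
    where d is the standardized effect size (Cohen's d)."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_b = NormalDist().inv_cdf(power)
    return math.ceil(2 * (z_a + z_b) ** 2 / effect_size ** 2)

# A medium effect (d = 0.5) at 80% power and alpha = .05.
print(n_per_group(0.5))  # 63 per group
```

Smaller effects demand sharply larger samples: `n_per_group(0.2)` is roughly six times `n_per_group(0.5)`.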
[Effect sizes, statistical power and sample sizes in "the Japanese Journal of Psychology"].
Suzukawa, Yumi; Toyoda, Hideki
2012-04-01
This study analyzed the statistical power of research studies published in the "Japanese Journal of Psychology" in 2008 and 2009. Sample effect sizes and sample statistical powers were calculated for each statistical test and analyzed with respect to the analytical methods and the fields of the studies. The results show that in the fields like perception, cognition or learning, the effect sizes were relatively large, although the sample sizes were small. At the same time, because of the small sample sizes, some meaningful effects could not be detected. In the other fields, because of the large sample sizes, meaningless effects could be detected. This implies that researchers who could not get large enough effect sizes would use larger samples to obtain significant results.
Effect of model choice and sample size on statistical tolerance limits
Duran, B.S.; Campbell, K.
1980-03-01
Statistical tolerance limits are estimates of large (or small) quantiles of a distribution, quantities which are very sensitive to the shape of the tail of the distribution. The exact nature of this tail behavior cannot be ascertained from small samples, so statistical tolerance limits are frequently computed using a statistical model chosen on the basis of theoretical considerations or prior experience with similar populations. This report illustrates the effects of such choices on the computations.
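A small illustration of the report's point, under assumed lognormal data: two plausible models fitted to the same sample give different upper-quantile estimates (naive plug-in quantiles here; a real tolerance limit also adds a confidence allowance for the estimated parameters):

```python
import math
import random
import statistics
from statistics import NormalDist

rng = random.Random(9)
# Hypothetical sample from a right-skewed (lognormal) population.
data = [math.exp(rng.gauss(0.0, 0.5)) for _ in range(500)]

def normal_quantile_limit(xs, q=0.99):
    """Estimated upper q-quantile assuming a normal model."""
    return statistics.mean(xs) + NormalDist().inv_cdf(q) * statistics.stdev(xs)

def lognormal_quantile_limit(xs, q=0.99):
    """The same quantile under a lognormal model for the same data."""
    logs = [math.log(x) for x in xs]
    return math.exp(statistics.mean(logs) + NormalDist().inv_cdf(q) * statistics.stdev(logs))

# The model choice changes the tail estimate even though the data agree.
print(normal_quantile_limit(data))
print(lognormal_quantile_limit(data))
```

For skewed data the normal model understates the upper tail, which is exactly the sensitivity to model choice the report examines.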
Improving Statistics Education through Simulations: The Case of the Sampling Distribution.
Earley, Mark A.
This paper presents a summary of action research investigating statistics students' understandings of the sampling distribution of the mean. With four sections of an introductory Statistics in Education course (n=98 students), a computer simulation activity (R. delMas, J. Garfield, and B. Chance, 1999) was implemented and evaluated to show…
CODRUŢA DURA
2010-01-01
The sample represents a particular segment of the statistical population chosen to represent it as a whole. The representativeness of the sample determines the accuracy of estimations made on the basis of calculating the research indicators and the inferential statistics. The method of random sampling is part of the probabilistic methods which can be used within marketing research, and it is characterized by the fact that it imposes the requirement that each unit belonging to the statistical population should have an equal chance of being selected for the sampling process. When simple random sampling is meant to be rigorously put into practice, it is recommended to use the technique of random number tables in order to configure the sample which will provide the information that the marketer needs. The paper also details the practical procedure implemented in order to create a sample for a marketing research by generating random numbers using the facilities offered by Microsoft Excel.
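The random-number-table / Excel procedure described above can be reproduced with any pseudo-random generator; the sampling frame and sizes below are hypothetical:

```python
import random

# Hypothetical sampling frame: every unit has an equal selection chance.
population = [f"respondent_{i:04d}" for i in range(1, 1201)]

rng = random.Random(2024)             # fixed seed for a reproducible sample
sample = rng.sample(population, 120)  # simple random sample, n = 120, without replacement

print(len(sample), len(set(sample)))  # 120 distinct units
```

`random.sample` draws without replacement, so no unit appears twice — the same property a correctly used random number table guarantees.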
M.M. Mohie El-Din
2011-10-01
In this paper, two-sample Bayesian prediction intervals for order statistics (OS) are obtained. This prediction is based on a certain class of the inverse exponential-type distributions using a right censored sample. A general class of prior density functions is used and the predictive cumulative function is obtained in the two-sample case. The class of the inverse exponential-type distributions includes several important distributions such as the inverse Weibull distribution, the inverse Burr distribution, the loglogistic distribution, the inverse Pareto distribution and the inverse paralogistic distribution. Special cases of the inverse Weibull model such as the inverse exponential model and the inverse Rayleigh model are considered.
Why weight? Modelling sample and observational level variability improves power in RNA-seq analyses.
Liu, Ruijie; Holik, Aliaksei Z; Su, Shian; Jansz, Natasha; Chen, Kelan; Leong, Huei San; Blewitt, Marnie E; Asselin-Labat, Marie-Liesse; Smyth, Gordon K; Ritchie, Matthew E
2015-09-03
Variations in sample quality are frequently encountered in small RNA-sequencing experiments, and pose a major challenge in a differential expression analysis. Removal of high variation samples reduces noise, but at a cost of reducing power, thus limiting our ability to detect biologically meaningful changes. Similarly, retaining these samples in the analysis may not reveal any statistically significant changes due to the higher noise level. A compromise is to use all available data, but to down-weight the observations from more variable samples. We describe a statistical approach that facilitates this by modelling heterogeneity at both the sample and observational levels as part of the differential expression analysis. At the sample level this is achieved by fitting a log-linear variance model that includes common sample-specific or group-specific parameters that are shared between genes. The estimated sample variance factors are then converted to weights and combined with observational level weights obtained from the mean-variance relationship of the log-counts-per-million using 'voom'. A comprehensive analysis involving both simulations and experimental RNA-sequencing data demonstrates that this strategy leads to a universally more powerful analysis and fewer false discoveries when compared to conventional approaches. This methodology has wide application and is implemented in the open-source 'limma' package. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
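The sample-level down-weighting idea can be caricatured with inverse-variance weights (illustrative only; the actual 'voom'/'limma' machinery estimates the weights from the data's mean-variance trend):

```python
import statistics

def weighted_mean(values, sample_vars):
    """Down-weight observations from more variable samples using
    inverse-variance weights, the core idea behind the sample-level
    weighting described in the abstract."""
    weights = [1.0 / v for v in sample_vars]
    return sum(w * x for w, x in zip(weights, values)) / sum(weights)

# Hypothetical log-expression of one gene in 4 samples; the last two
# samples are low quality (higher variance) and get less influence.
values      = [5.0, 5.2, 7.0, 7.4]
sample_vars = [0.1, 0.1, 1.0, 1.0]

print(weighted_mean(values, sample_vars))  # pulled toward the precise samples
print(statistics.mean(values))             # unweighted comparison
```

The weighted estimate keeps all four observations, rather than discarding the noisy samples outright — the compromise the abstract advocates.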
Effect of the Target Motion Sampling temperature treatment method on the statistics and performance
Viitanen, Tuomas; Leppänen, Jaakko
2015-01-01
Highlights: • Use of the Target Motion Sampling (TMS) method with collision estimators is studied. • The expected values of the estimators agree with the NJOY-based reference. • In most practical cases also the variances of the estimators are unaffected by TMS. • Transport calculation slow-down due to TMS dominates the impact on figures-of-merit. - Abstract: Target Motion Sampling (TMS) is a stochastic on-the-fly temperature treatment technique that is being developed as a part of the Monte Carlo reactor physics code Serpent. The method provides for modeling of arbitrary temperatures in continuous-energy Monte Carlo tracking routines with only one set of cross sections stored in the computer memory. Previously, only the performance of the TMS method in terms of CPU time per transported neutron has been discussed. Since the effective cross sections are not calculated at any point of a transport simulation with TMS, reaction rate estimators must be scored using sampled cross sections, which is expected to increase the variances and, consequently, to decrease the figures-of-merit. This paper examines the effects of TMS on the statistics and performance in practical calculations involving reaction rate estimation with collision estimators. Against all expectations, it turned out that the use of sampled response values has no practical effect on the performance of reaction rate estimators when using TMS with elevated basis cross section temperatures (EBT), i.e. the usual way. With 0 Kelvin cross sections, a significant increase in the variances of capture rate estimators was observed right below the energy region of unresolved resonances, but at these energies the figures-of-merit could be increased using a simple resampling technique to decrease the variances of the responses. It was, however, noticed that the use of the TMS method increases the statistical deviances of all estimators, including the flux estimator, by tens of percent in the vicinity of very
Dowdy, E.J.; Hansen, G.E.; Robba, A.A.; Pratt, J.C.
1980-01-01
The complete formalism for the use of statistical neutron fluctuation measurements for the nondestructive assay of fissionable materials has been developed. This formalism includes the effect of detector deadtime, neutron multiplicity, random neutron pulse contributions from (α,n) contaminants in the sample, and the sample multiplication of both fission-related and background neutrons
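A toy illustration of why neutron multiplicity shows up in counting statistics (not the paper's formalism): the Feynman-Y excess variance-to-mean statistic separates correlated fission-like bursts from a purely random (α,n)-like background:

```python
import math
import random

def feynman_y(counts):
    """Excess of the variance-to-mean ratio over the Poisson value of 1.
    Correlated (multiplicity) events give Y > 0; a purely random
    source gives Y near 0."""
    mean = sum(counts) / len(counts)
    var = sum((c - mean) ** 2 for c in counts) / (len(counts) - 1)
    return var / mean - 1.0

rng = random.Random(5)

def poisson(lam):
    """Knuth's Poisson sampler (adequate for small lam)."""
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

# Random background: Poisson gate counts. Correlated source: each
# Poisson "event" releases a burst of 3 neutrons (toy multiplicity).
random_gates     = [poisson(4.0) for _ in range(5000)]
correlated_gates = [3 * poisson(4.0 / 3) for _ in range(5000)]

print(feynman_y(random_gates))      # near 0
print(feynman_y(correlated_gates))  # clearly positive
```

Both gate streams have the same mean count, yet only the correlated one shows excess variance — the signature the assay formalism exploits (deadtime and sample multiplication, which the paper treats, are ignored here).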
Statistical evaluation of the data obtained from the K East Basin Sandfilter Backwash Pit samples
Welsh, T.L.
1994-01-01
Samples were obtained from different locations in the K East Sandfilter Backwash Pit to characterize the sludge material. These samples were analyzed chemically for elements, radionuclides, and residual compounds. The analytical results were statistically analyzed to determine the mean analyte content and the associated variability for each mean value.
Bakker, A.; Dierdorp, A.; Maanen, J.A. van; Eijkelhof, H.M.C.
2012-01-01
To stimulate students’ shuttling between contextual and statistical spheres, we based tasks on professional practices. This article focuses on two tasks to support reasoning about sampling by students aged 16-17. The purpose of the tasks was to find out which smaller sample size would have been
Sampling methods to the statistical control of the production of blood components.
Pereira, Paulo; Seghatchian, Jerard; Caldeira, Beatriz; Santos, Paula; Castro, Rosa; Fernandes, Teresa; Xavier, Sandra; de Sousa, Gracinda; de Almeida E Sousa, João Paulo
2017-12-01
The control of blood components specifications is a requirement generalized in Europe by the European Commission Directives and in the US by the AABB standards. The use of a statistical process control methodology is recommended in the related literature, including the EDQM guideline. The reliability of the control depends on the sampling. However, a correct sampling methodology seems not to be systematically applied. Commonly, the sampling is intended to comply uniquely with the 1% specification for the produced blood components. Nevertheless, from a purely statistical viewpoint, this model could be argued not to amount to a consistent sampling technique. This could be a severe limitation in detecting abnormal patterns and in assuring that the production has a non-significant probability of producing nonconforming components. This article discusses what is happening in blood establishments. Three statistical methodologies are proposed: simple random sampling, sampling based on the proportion of a finite population, and sampling based on the inspection level. The empirical results demonstrate that these models are practicable in blood establishments, contributing to the robustness of sampling and of the related statistical process control decisions for the purpose they are suggested for. Copyright © 2017 Elsevier Ltd. All rights reserved.
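One of the three proposed methodologies — sampling based on the proportion of a finite population — can be sketched with the textbook normal-approximation formula plus a finite population correction (illustrative; the article's exact formulas may differ, and the production figure below is invented):

```python
import math
from statistics import NormalDist

def sample_size_proportion(N, p=0.01, margin=0.01, confidence=0.95):
    """Sample size for estimating a nonconformity proportion p within
    +/- margin at the given confidence, corrected for a finite
    population of N produced components."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n0 = z ** 2 * p * (1 - p) / margin ** 2    # infinite-population size
    return math.ceil(n0 / (1 + (n0 - 1) / N))  # finite population correction

# Hypothetical monthly production of 3000 red cell concentrates,
# checking the 1% specification mentioned in the abstract.
print(sample_size_proportion(3000))
```

The finite correction matters: for a small monthly production the required sample is noticeably below the infinite-population figure.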
Comparing simulated and theoretical sampling distributions of the U3 person-fit statistic
Emons, W.H.M.; Meijer, R.R.; Sijtsma, K.
2002-01-01
The accuracy with which the theoretical sampling distribution of van der Flier's person-fit statistic U3 approaches the empirical U3 sampling distribution is affected by the item discrimination. A simulation study showed that for tests with a moderate or a strong mean item discrimination, the Type I
Repetitive Observation of Coniferous Samples in ESEM and SEM
Tihlaříková, Eva; Neděla, Vilém
2015-01-01
Roč. 21, S3 (2015), s. 1695-1696 ISSN 1431-9276 R&D Projects: GA ČR(CZ) GA14-22777S; GA MŠk(CZ) LO1212; GA MŠk ED0017/01/01 Institutional support: RVO:68081731 Keywords : SEM * ESEM * biological samples * repetitive observation Subject RIV: JA - Electronics ; Optoelectronics, Electrical Engineering Impact factor: 1.730, year: 2015
Romero, Vicente J.; Burkardt, John V.; Gunzburger, Max D.; Peterson, Janet S.
2006-01-01
A recently developed centroidal Voronoi tessellation (CVT) sampling method is investigated here to assess its suitability for use in statistical sampling applications. CVT efficiently generates a highly uniform distribution of sample points over arbitrarily shaped M-dimensional parameter spaces. On several 2-D test problems CVT has recently been found to provide exceedingly effective and efficient point distributions for response surface generation. Additionally, for statistical function integration and estimation of response statistics associated with uniformly distributed random-variable inputs (uncorrelated), CVT has been found in initial investigations to provide superior point sets when compared against Latin hypercube and simple-random Monte Carlo methods and Halton and Hammersley quasi-random sequence methods. In this paper, the performance of all these sampling methods and a new variant ('Latinized' CVT) are further compared for non-uniform input distributions. Specifically, given uncorrelated normal inputs in a 2-D test problem, statistical sampling efficiencies are compared for resolving various statistics of response: mean, variance, and exceedance probabilities
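For comparison, the Latin hypercube method mentioned above can be sketched in a few lines (uniform inputs on the unit square; CVT itself requires an iterative tessellation and is not shown):

```python
import random

def latin_hypercube(n, dims, rng=None):
    """Latin hypercube sample of n points in [0, 1)^dims: each axis is
    split into n equal bins and every bin on every axis receives
    exactly one point."""
    rng = rng or random.Random(0)
    cols = []
    for _ in range(dims):
        perm = list(range(n))       # one bin index per point
        rng.shuffle(perm)
        cols.append([(b + rng.random()) / n for b in perm])  # jitter within the bin
    return list(zip(*cols))

pts = latin_hypercube(10, 2)
# Stratification check: the 10 points occupy all 10 bins on each axis.
for d in range(2):
    print(sorted(int(p[d] * 10) for p in pts))
```

This per-axis stratification is what gives LHS its edge over simple random Monte Carlo in the comparisons the abstract describes; CVT additionally enforces uniformity jointly across dimensions.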
Connecting HL Tau to the observed exoplanet sample
Simbulan, Christopher; Tamayo, Daniel; Petrovich, Cristobal; Rein, Hanno; Murray, Norman
2017-08-01
The Atacama Large Millimeter/submillimeter Array (ALMA) recently revealed a set of nearly concentric gaps in the protoplanetary disc surrounding the young star HL Tauri (HL Tau). If these are carved by forming gas giants, this provides the first set of orbital initial conditions for planets as they emerge from their birth discs. Using N-body integrations, we have followed the evolution of the system for 5 Gyr to explore the possible outcomes. We find that HL Tau initial conditions scaled down to the size of typically observed exoplanet orbits naturally produce several populations in the observed exoplanet sample. First, for a plausible range of planetary masses, we can match the observed eccentricity distribution of dynamically excited radial velocity giant planets with eccentricities >0.2. Secondly, we roughly obtain the observed rate of hot Jupiters around FGK stars. Finally, we obtain a large efficiency of planetary ejections of ≈2 per HL Tau-like system, but the small fraction of stars observed to host giant planets makes it hard to match the rate of free-floating planets inferred from microlensing observations. In view of upcoming Gaia results, we also provide predictions for the expected mutual inclination distribution, which is significantly broader than the absolute inclination distributions typically considered by previous studies.
Zeyu Lin
2017-01-01
In this paper, the authors present a study on the discrimination of handlebar grip samples, to provide effective forensic science services for hit-and-run traffic cases. 50 bicycle handlebar grip samples, 49 electric bike handlebar grip samples, and 96 motorcycle handlebar grip samples were randomly collected by the local police in Beijing (China). Fourier transform infrared microspectroscopy (FTIR) was utilized as the analytical technology. Target absorption selection, data pretreatment, and discrimination of linked and unlinked samples were then chosen as three steps to improve the discrimination of FTIR spectra collected from different handlebar grip samples. Principal component analysis and receiver operating characteristic curves were utilized to evaluate different data selection methods and different data pretreatment methods, respectively. It is possible to explore the evidential value of handlebar grip residue evidence through instrumental analysis and statistical treatments. This will provide a universal discrimination method for other forensic science samples as well.
Jędrzej S. Bojanowski
2014-12-01
Cloud property data sets derived from passive sensors onboard polar orbiting satellites (such as the NOAA Advanced Very High Resolution Radiometer) have global coverage and now span a climatological time period. Synoptic surface observations (SYNOP) are often used to characterize the accuracy of satellite-based cloud cover. Infrequent overpasses of polar orbiting satellites combined with the 3- or 6-h SYNOP frequency lead to collocation time differences of up to 3 h. The associated collocation error degrades cloud cover performance statistics such as the Hanssen-Kuiper discriminant (HK) by up to 45%. Limiting the time difference to 10 min, on the other hand, introduces a sampling error due to the lower number of corresponding satellite and SYNOP observations. This error depends on both the length of the validated time series and the SYNOP frequency. The trade-off between collocation and sampling error calls for an optimum collocation time difference, which depends on cloud cover characteristics and SYNOP frequency and cannot be generalized. Instead, a method is presented to reconstruct the unbiased (true) HK from the HK affected by collocation differences, which significantly (t-test p < 0.01) improves the validation results.
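The Hanssen-Kuiper discriminant used as the skill measure above is simply the hit rate minus the false-alarm rate of a binary contingency table. A minimal sketch, with hypothetical cloud/no-cloud counts:

```python
def hanssen_kuiper(hits, false_alarms, misses, correct_negatives):
    """HK (true skill statistic): hit rate minus false-alarm rate."""
    pod = hits / (hits + misses)                              # probability of detection
    pofd = false_alarms / (false_alarms + correct_negatives)  # prob. of false detection
    return pod - pofd

# hypothetical satellite-vs-SYNOP cloud contingency counts
print(hanssen_kuiper(80, 10, 20, 90))   # ≈ 0.7
```

A forecast with no skill (equal hit and false-alarm rates) scores 0; a perfect one scores 1, which is why collocation error that inflates off-diagonal counts degrades HK so strongly.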
Nomogram for sample size calculation on a straightforward basis for the kappa statistic.
Hong, Hyunsook; Choi, Yunhee; Hahn, Seokyung; Park, Sue Kyung; Park, Byung-Joo
2014-09-01
Kappa is a widely used measure of agreement. However, it may not be straightforward in some situations, such as sample size calculation, due to the kappa paradox: high agreement but low kappa. Hence, it seems reasonable in sample size calculation to consider the level of agreement under a certain marginal prevalence in terms of a simple proportion of agreement rather than a kappa value. Therefore, sample size formulae and nomograms using a simple proportion of agreement rather than a kappa under certain marginal prevalences are proposed. A sample size formula was derived using the kappa statistic under the common correlation model and a goodness-of-fit statistic. The nomogram for the sample size formula was developed using SAS 9.3. Sample size formulae using a simple proportion of agreement instead of a kappa statistic, and nomograms to eliminate the inconvenience of using a mathematical formula, were produced. A nomogram for sample size calculation with a simple proportion of agreement should be useful in the planning stages when the focus of interest is on testing the hypothesis of interobserver agreement involving two raters and nominal outcome measures. Copyright © 2014 Elsevier Inc. All rights reserved.
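Cohen's kappa, and the paradox that motivates the paper, can be reproduced directly from a 2x2 agreement table; the counts below are hypothetical:

```python
import numpy as np

def cohen_kappa(table):
    """Cohen's kappa from a square inter-rater contingency table."""
    t = np.asarray(table, dtype=float)
    n = t.sum()
    po = np.trace(t) / n                        # observed proportion of agreement
    pe = (t.sum(0) * t.sum(1)).sum() / n ** 2   # chance agreement from the marginals
    return (po - pe) / (1 - pe)

# the kappa paradox: 90% raw agreement, but skewed prevalence gives kappa ≈ 0.44
k = cohen_kappa([[85, 5], [5, 5]])
print(k)
```

With 90 of 100 cases on the diagonal the simple proportion of agreement is 0.90, yet the skewed marginals push chance agreement to 0.82 and kappa down to about 0.44, which is exactly why the authors base sample size on the proportion of agreement instead.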
On Extrapolating Past the Range of Observed Data When Making Statistical Predictions in Ecology.
Paul B Conn
Ecologists are increasingly using statistical models to predict animal abundance and occurrence in unsampled locations. The reliability of such predictions depends on a number of factors, including sample size, how far prediction locations are from the observed data, and similarity of predictive covariates in locations where data are gathered to locations where predictions are desired. In this paper, we propose extending Cook's notion of an independent variable hull (IVH), developed originally for application with linear regression models, to generalized regression models as a way to help assess the potential reliability of predictions in unsampled areas. Predictions occurring inside the generalized independent variable hull (gIVH) can be regarded as interpolations, while predictions occurring outside the gIVH can be regarded as extrapolations worthy of additional investigation or skepticism. We conduct a simulation study to demonstrate the usefulness of this metric for limiting the scope of spatial inference when conducting model-based abundance estimation from survey counts. In this case, limiting inference to the gIVH substantially reduces bias, especially when survey designs are spatially imbalanced. We also demonstrate the utility of the gIVH in diagnosing problematic extrapolations when estimating the relative abundance of ribbon seals in the Bering Sea as a function of predictive covariates. We suggest that ecologists routinely use diagnostics such as the gIVH to help gauge the reliability of predictions from statistical models (such as generalized linear, generalized additive, and spatio-temporal regression models).
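For ordinary linear regression, Cook's original IVH test reduces to a leverage comparison: a prediction point counts as an interpolation if its leverage does not exceed the largest leverage among the observed data. The sketch below illustrates that special case on simulated data; it is not the generalized gIVH of the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

# hypothetical design: intercept plus one covariate observed on [0, 1]
X = np.column_stack([np.ones(30), rng.uniform(0.0, 1.0, 30)])
XtX_inv = np.linalg.inv(X.T @ X)
h_max = max(x @ XtX_inv @ x for x in X)   # largest leverage among the data

def inside_ivh(x_new):
    """Cook's IVH rule: interpolation iff leverage <= max observed leverage."""
    return bool(x_new @ XtX_inv @ x_new <= h_max)

print(inside_ivh(np.array([1.0, 0.5])))   # covariate inside the sampled range
print(inside_ivh(np.array([1.0, 3.0])))   # far outside: flag as extrapolation
```

The paper's contribution is to generalize this leverage notion (via prediction variance) to generalized linear, generalized additive, and spatio-temporal models, where the hat matrix of ordinary least squares no longer applies directly.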
A homogeneous sample of binary galaxies: Basic observational properties
Karachentsev, I. D.
1990-01-01
A survey of optical characteristics for 585 binary systems, satisfying a condition of apparent isolation on the sky, is presented. Influences of various selection effects distorting the average parameters of the sample are noted. The pair components display mutual similarity over all the global properties: luminosity, diameter, morphological type, mass-to-luminosity ratio, angular momentum, etc., which is not due only to selection effects. The observed correlations must be caused by a common origin of pair members. Some features (nuclear activity, color index) could acquire similarity during synchronous evolution of double galaxies. Despite the observed isolation, the sample of double systems is seriously contaminated by accidental pairs, and also by members of groups and clusters. After removing false pairs, estimates of orbital mass-to-luminosity ratio range from 0 to 30 f(solar), with the mean value (7.8 plus or minus 0.7) f(solar). Binary galaxies possess nearly circular orbits with a typical eccentricity e = 0.25, probably resulting from evolutionary selection driven by component mergers under dynamical friction. The double-galaxy population, with space abundance 0.12 plus or minus 0.02 and characteristic merger timescale 0.2 H(exp -1), may significantly influence the rate of dynamical evolution of galaxies.
Cohen, Ian J.; Mauk, Barry H.; Anderson, Brian J.; Westlake, Joseph H.; Sibeck, David G.; Turner, Drew L.; Fennell, Joseph F.; Blake, J. Bern; Jaynes, Allison N.; Leonard, Trevor W.; Baker, Daniel N.; Spence, Harlan E.; Reeves, Geoff D.; Giles, Barbara J.; Strangeway, Robert J.; Torbert, Roy B.; Burch, James L.
2017-09-01
Observations from the Energetic Particle Detector (EPD) instrument suite aboard the Magnetospheric Multiscale (MMS) spacecraft show that energetic (greater than tens of keV) magnetospheric particle escape into the magnetosheath occurs commonly across the dayside. This includes the surprisingly frequent observation of magnetospheric electrons in the duskside magnetosheath, an unexpected result given assumptions regarding magnetic drift shadowing. The 238 events identified in the 40 keV electron energy channel during the first MMS dayside season exhibit strongly anisotropic pitch angle distributions indicating monohemispheric field-aligned streaming away from the magnetopause. A review of the extremely rich literature of energetic electron observations beyond the magnetopause is provided to place these new observations into historical context. Despite the extensive history of such research, these new observations provide a more comprehensive data set that includes unprecedented magnetic local time (MLT) coverage of the dayside equatorial magnetopause/magnetosheath. These data clearly highlight the common escape of energetic electrons along magnetic field lines concluded to have been reconnected across the magnetopause. While these streaming escape events agree with prior studies which show strong correlation with geomagnetic activity (suggesting a magnetotail source) and occur most frequently during periods of southward IMF, the high number of duskside events is unexpected and previously unobserved. Although the lowest electron energy channel was the focus of this study, the events reported here exhibit pitch angle anisotropies indicative of streaming up to 200 keV, which could represent the magnetopause loss of >1 MeV electrons from the outer radiation belt.
D.P. van der Nest
2015-03-01
This article explores the use by internal audit functions of audit sampling techniques in order to test the effectiveness of controls in the banking sector. The article focuses specifically on the use of statistical and/or non-statistical sampling techniques by internal auditors. The focus of the research for this article was internal audit functions in the banking sector of South Africa. The results discussed in the article indicate that audit sampling is still used frequently as an audit evidence-gathering technique. Non-statistical sampling techniques are used more frequently than statistical sampling techniques for the evaluation of the sample. In addition, both techniques are regarded as important for the determination of the sample size and the selection of the sample items.
Sample Size Requirements for Assessing Statistical Moments of Simulated Crop Yield Distributions
Lehmann, N.; Finger, R.; Klein, T.; Calanca, P.
2013-01-01
Mechanistic crop growth models are becoming increasingly important in agricultural research and are extensively used in climate change impact assessments. In such studies, statistics of crop yields are usually evaluated without the explicit consideration of sample size requirements. The purpose of
Links to sources of cancer-related statistics, including the Surveillance, Epidemiology and End Results (SEER) Program, SEER-Medicare datasets, cancer survivor prevalence data, and the Cancer Trends Progress Report.
Mariano, A.J.; Ryan, E.H.; Huntley, H.S.; Laurindo, L.C.; Coelho, E.; Ozgokmen, TM; Berta, M.; Bogucki, D; Chen, S.S.; Curcic, M.; Drouin, K.L.; Gough, M; Haus, BK; Haza, A.C.; Hogan, P
2016-01-01
The Grand LAgrangian Deployment (GLAD) used multiscale sampling and GPS technology to observe time series of drifter positions with initial drifter separation of O(100 m) to O(10 km), and nominal 5 min sampling, during the summer and fall of 2012 in the northern Gulf of Mexico. Histograms of the velocity field and its statistical parameters are non-Gaussian; most are multimodal. The dominant periods for the surface velocity field are 1–2 days due to inertial oscillations, tides, and the sea b...
Statistical inference for discrete-time samples from affine stochastic delay differential equations
Küchler, Uwe; Sørensen, Michael
2013-01-01
Statistical inference for discrete time observations of an affine stochastic delay differential equation is considered. The main focus is on maximum pseudo-likelihood estimators, which are easy to calculate in practice. A more general class of prediction-based estimating functions is investigated...
Balcazar G, M.; Flores R, J.H.
1992-01-01
As part of the radiometric surface exploration carried out in the Chipilapa geothermal field, El Salvador, geostatistical parameters were derived from the variograms calculated from the field data. The maximum correlation distance of the radon samples along the different observation directions (N-S, E-W, NW-SE, NE-SW) was 121 m, which sets the spacing of the monitoring grid for future prospecting in the same area. Geostatistical techniques thereby allow the field sample spacing to be optimized at minimum cost without losing detection of the anomaly. (Author)
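An empirical semivariogram of the kind used to find such a correlation distance can be sketched as follows; the survey points and field values are synthetic stand-ins, not the Chipilapa data:

```python
import numpy as np

rng = np.random.default_rng(6)

# synthetic survey: 150 stations over a 500 m x 500 m area
pts = rng.uniform(0.0, 500.0, size=(150, 2))
# smooth spatially correlated field plus a small nugget of measurement noise
vals = (np.sin(pts[:, 0] / 80.0) + np.sin(pts[:, 1] / 80.0)
        + rng.normal(0.0, 0.1, 150))

def empirical_variogram(pts, vals, bins):
    """Mean squared half-difference of all station pairs, binned by separation."""
    d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    sq = 0.5 * (vals[:, None] - vals[None, :]) ** 2
    iu = np.triu_indices(len(vals), k=1)        # count each pair once
    d, sq = d[iu], sq[iu]
    idx = np.digitize(d, bins)
    return np.array([sq[idx == k].mean() for k in range(1, len(bins))])

gamma = empirical_variogram(pts, vals, np.arange(0.0, 300.0, 50.0))
print(gamma)   # semivariance grows with lag up to the correlation range
```

The lag at which the semivariance levels off (the range, 121 m in the study) is what fixes the widest sample spacing that still resolves the anomaly.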
Effect of the absolute statistic on gene-sampling gene-set analysis methods.
Nam, Dougu
2017-06-01
Gene-set enrichment analysis and its modified versions have commonly been used for identifying altered functions or pathways in disease from microarray data. In particular, simple gene-sampling gene-set analysis methods have been heavily used for datasets with only a few sample replicates. The biggest problem with this approach is the highly inflated false-positive rate. In this paper, the effect of the absolute gene statistic on gene-sampling gene-set analysis methods is systematically investigated. Thus far, the absolute gene statistic has merely been regarded as a supplementary method for capturing bidirectional changes in each gene set. Here, it is shown that incorporating the absolute gene statistic in gene-sampling gene-set analysis substantially reduces the false-positive rate and improves the overall discriminatory ability. Its effect was investigated by power, false-positive rate, and receiver operating characteristic curves for a number of simulated and real datasets. The performances of gene-set analysis methods in one-tailed (genome-wide association study) and two-tailed (gene expression data) tests were also compared and discussed.
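A gene-sampling null with an absolute gene statistic can be sketched on toy expression data; every dataset detail below (gene counts, shifts, set membership) is invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)

# toy expression matrix: 1000 genes x (5 control + 5 case) samples
n_genes = 1000
data = rng.normal(size=(n_genes, 10))
data[:50, 5:] += 2.0     # genes 0-49 up-regulated in cases
data[50:100, 5:] -= 2.0  # genes 50-99 down-regulated: a bidirectional gene set

diff = data[:, 5:].mean(axis=1) - data[:, :5].mean(axis=1)
t_abs = np.abs(diff)     # absolute gene-level statistic

def gene_sampling_pvalue(member_idx, n_perm=2000):
    """Gene-sampling null: compare the set score against random gene sets."""
    score = t_abs[member_idx].mean()
    null = np.array([t_abs[rng.choice(n_genes, len(member_idx), replace=False)].mean()
                     for _ in range(n_perm)])
    return (1 + (null >= score).sum()) / (1 + n_perm)

p_sig = gene_sampling_pvalue(np.arange(100))         # mixed up/down gene set
p_null = gene_sampling_pvalue(np.arange(900, 1000))  # unperturbed genes
print(p_sig, p_null)
```

Because the statistic is |diff|, the up- and down-regulated halves of the set reinforce rather than cancel each other, so the bidirectional set is detected while an unperturbed set is not.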
Statistical characteristics of L1 carrier phase observations from four low-cost GPS receivers
Cederholm, Jens Peter
2010-01-01
Statistical properties of L1 carrier phase observations from four low-cost GPS receivers are investigated through a case study. The observations are collected on a zero baseline with a frequency of 1 Hz and processed with a double difference model. The carrier phase residuals from an ambiguity...
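Although the abstract is truncated, the double-difference model it refers to has a simple arithmetic core: differencing between receivers and then between satellites cancels both receiver and satellite clock biases. A sketch with made-up clock offsets on a zero baseline:

```python
def phase(geom, dt_rx, dt_sv):
    """Simplified carrier phase (cycles): geometry + receiver clock - satellite clock."""
    return geom + dt_rx - dt_sv

def double_difference(phi_a_i, phi_b_i, phi_a_j, phi_b_j):
    """Between-receiver, then between-satellite difference: clock terms cancel."""
    return (phi_a_i - phi_b_i) - (phi_a_j - phi_b_j)

dt_a, dt_b = 12.3, -4.7   # receiver clock biases (made-up values)
dt_i, dt_j = 3.1, 0.9     # satellite clock biases (made-up values)
# zero baseline: both receivers share the same geometric range to each satellite
dd = double_difference(phase(100.0, dt_a, dt_i), phase(100.0, dt_b, dt_i),
                       phase(250.0, dt_a, dt_j), phase(250.0, dt_b, dt_j))
print(dd)   # ≈ 0: on a zero baseline only double-difference noise remains
```

On real data the double difference retains geometry and integer ambiguities; the zero-baseline setup used in the study makes those cancel too, which is exactly what isolates the receiver noise being characterized.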
Bonetto, Paola; Qi, Jinyi; Leahy, Richard M.
1999-01-01
We describe a method for computing linear observer statistics for maximum a posteriori (MAP) reconstructions of PET images. The method is based on a theoretical approximation for the mean and covariance of MAP reconstructions. In particular, we derive here a closed form for the channelized Hotelling observer (CHO) statistic applied to 2D MAP images. We show reasonably good correspondence between these theoretical results and Monte Carlo studies. The accuracy and low computational cost of the approximation allow us to analyze the observer performance over a wide range of operating conditions and parameter settings for the MAP reconstruction algorithm
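A channelized Hotelling observer statistic can be sketched for a generic detection task; the channels, images, and signal below are toy stand-ins, not MAP-reconstructed PET images or the channel profiles of the paper:

```python
import numpy as np

rng = np.random.default_rng(7)

n_pix, n_ch, n_img = 64, 4, 500
channels = rng.normal(size=(n_pix, n_ch))      # stand-in channel profiles
signal = np.zeros(n_pix)
signal[30:34] = 1.0                            # hypothetical signal to detect

g0 = rng.normal(size=(n_img, n_pix))           # signal-absent images
g1 = rng.normal(size=(n_img, n_pix)) + signal  # signal-present images
v0, v1 = g0 @ channels, g1 @ channels          # channel outputs

# Hotelling template in the low-dimensional channel space
s_cov = 0.5 * (np.cov(v0.T) + np.cov(v1.T))
w = np.linalg.solve(s_cov, v1.mean(axis=0) - v0.mean(axis=0))
t0, t1 = v0 @ w, v1 @ w                        # observer test statistics

# detectability index from the two statistic distributions
dprime = (t1.mean() - t0.mean()) / np.sqrt(0.5 * (t1.var() + t0.var()))
print(dprime)
```

The point of channelization is visible in the linear algebra: the covariance to invert is only `n_ch` by `n_ch`, which is what makes the closed-form mean/covariance approximations of the paper tractable.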
2005-01-01
For the years 2004 and 2005 the figures shown in the tables of Energy Review are partly preliminary. The annual statistics published in Energy Review are presented in more detail in a publication called Energy Statistics that comes out yearly. Energy Statistics also includes historical time-series over a longer period of time (see e.g. Energy Statistics, Statistics Finland, Helsinki 2004.) The applied energy units and conversion coefficients are shown in the back cover of the Review. Explanatory notes to the statistical tables can be found after tables and figures. The figures present: Changes in GDP, energy consumption and electricity consumption, Carbon dioxide emissions from fossil fuels use, Coal consumption, Consumption of natural gas, Peat consumption, Domestic oil deliveries, Import prices of oil, Consumer prices of principal oil products, Fuel prices in heat production, Fuel prices in electricity production, Price of electricity by type of consumer, Average monthly spot prices at the Nord pool power exchange, Total energy consumption by source and CO2 emissions, Supplies and total consumption of electricity GWh, Energy imports by country of origin in January-June 2003, Energy exports by recipient country in January-June 2003, Consumer prices of liquid fuels, Consumer prices of hard coal, natural gas and indigenous fuels, Price of natural gas by type of consumer, Price of electricity by type of consumer, Price of district heating by type of consumer, Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources and Energy taxes, precautionary stock fees and oil pollution fees
Statistical characterization of a large geochemical database and effect of sample size
Zhang, C.; Manheim, F.T.; Hinde, J.; Grossman, J.N.
2005-01-01
The authors investigated statistical distributions for concentrations of chemical elements from the National Geochemical Survey (NGS) database of the U.S. Geological Survey. At the time of this study, the NGS data set encompasses 48,544 stream sediment and soil samples from the conterminous United States analyzed by ICP-AES following a 4-acid near-total digestion. This report includes 27 elements: Al, Ca, Fe, K, Mg, Na, P, Ti, Ba, Ce, Co, Cr, Cu, Ga, La, Li, Mn, Nb, Nd, Ni, Pb, Sc, Sr, Th, V, Y and Zn. The goal and challenge for the statistical overview was to delineate chemical distributions in a complex, heterogeneous data set spanning a large geographic range (the conterminous United States), and many different geological provinces and rock types. After declustering to create a uniform spatial sample distribution with 16,511 samples, histograms and quantile-quantile (Q-Q) plots were employed to delineate subpopulations that have coherent chemical and mineral affinities. Probability groupings are discerned by changes in slope (kinks) on the plots. Major rock-forming elements, e.g., Al, Ca, K and Na, tend to display linear segments on normal Q-Q plots. These segments can commonly be linked to petrologic or mineralogical associations. For example, linear segments on K and Na plots reflect dilution of clay minerals by quartz sand (low in K and Na). Minor and trace element relationships are best displayed on lognormal Q-Q plots. These sensitively reflect discrete relationships in subpopulations within the wide range of the data. For example, small but distinctly log-linear subpopulations for Pb, Cu, Zn and Ag are interpreted to represent ore-grade enrichment of naturally occurring minerals such as sulfides. None of the 27 chemical elements could pass the test for either normal or lognormal distribution on the declustered data set. This partly reflects the presence of mixtures of subpopulations and outliers. Random samples of the data set with successively
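The Q-Q diagnostic described above (log-linear segments for single subpopulations, kinks where subpopulations mix) can be mimicked with a simple linearity score; the distributions below are synthetic, not NGS concentrations:

```python
import numpy as np
from statistics import NormalDist

rng = np.random.default_rng(3)

def qq_r2(x):
    """Squared correlation of a normal Q-Q plot of log-values: near 1 for a
    single lognormal subpopulation, visibly lower when subpopulations mix."""
    y = np.sort(np.log(x))
    n = len(y)
    q = np.array([NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)])
    return np.corrcoef(y, q)[0, 1] ** 2

single = rng.lognormal(mean=0.0, sigma=0.5, size=2000)
mixed = np.concatenate([rng.lognormal(0.0, 0.3, 1800),    # background population
                        rng.lognormal(3.0, 0.3, 200)])    # e.g. ore-grade enrichment
print(qq_r2(single), qq_r2(mixed))
```

The mixture's kink shows up as a markedly lower Q-Q correlation, which is the numerical counterpart of the slope changes the authors read off the plots.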
DWPF Sample Vial Insert Study-Statistical Analysis of DWPF Mock-Up Test Data
Harris, S.P. [Westinghouse Savannah River Company, AIKEN, SC (United States)
1997-09-18
This report is prepared as part of Technical/QA Task Plan WSRC-RP-97-351, which was issued in response to Technical Task Request HLW/DWPF/TTR-970132 submitted by DWPF. Presented in this report is a statistical analysis of DWPF Mock-up test data for evaluation of two new analytical methods which use insert samples from the existing Hydragard™ sampler. The first is a new hydrofluoric acid based method called the Cold Chemical Method (Cold Chem) and the second is a modified fusion method. Either new DWPF analytical method could result in a two to three fold improvement in sample analysis time. Both new methods use the existing Hydragard™ sampler to collect a smaller insert sample from the process sampling system. The insert testing methodology applies to the DWPF Slurry Mix Evaporator (SME) and the Melter Feed Tank (MFT) samples. The insert sample is named after the initial trials, which placed the container inside the sample (peanut) vials. Samples in small 3 ml containers (inserts) are analyzed by either the Cold Chem method or a modified fusion method. The current analytical method uses a Hydragard™ sample station to obtain nearly full 15 ml peanut vials. The samples are prepared by a multi-step process for Inductively Coupled Plasma (ICP) analysis by drying, vitrification, grinding and finally dissolution by either mixed acid or fusion. In contrast, the insert sample is placed directly in the dissolution vessel, thus eliminating the drying, vitrification and grinding operations for the Cold Chem method. Although the modified fusion still requires drying and calcine conversion, the process is rapid due to the decreased sample size and the fact that no vitrification step is required. A slurry feed simulant material was acquired from the TNX pilot facility from the test run designated as PX-7. The Mock-up test data were gathered on the basis of a statistical design presented in SRT-SCS-97004 (Rev. 0). Simulant PX-7 samples were taken in the DWPF Analytical Cell Mock
Chemometric and Statistical Analyses of ToF-SIMS Spectra of Increasingly Complex Biological Samples
Berman, E S; Wu, L; Fortson, S L; Nelson, D O; Kulp, K S; Wu, K J
2007-10-24
Characterizing and classifying molecular variation within biological samples is critical for determining fundamental mechanisms of biological processes that will lead to new insights including improved disease understanding. Towards these ends, time-of-flight secondary ion mass spectrometry (ToF-SIMS) was used to examine increasingly complex samples of biological relevance, including monosaccharide isomers, pure proteins, complex protein mixtures, and mouse embryo tissues. The complex mass spectral data sets produced were analyzed using five common statistical and chemometric multivariate analysis techniques: principal component analysis (PCA), linear discriminant analysis (LDA), partial least squares discriminant analysis (PLSDA), soft independent modeling of class analogy (SIMCA), and decision tree analysis by recursive partitioning. PCA was found to be a valuable first step in multivariate analysis, providing insight both into the relative groupings of samples and into the molecular basis for those groupings. For the monosaccharides, pure proteins and protein mixture samples, all of LDA, PLSDA, and SIMCA were found to produce excellent classification given a sufficient number of compound variables calculated. For the mouse embryo tissues, however, SIMCA did not produce as accurate a classification. The decision tree analysis was found to be the least successful for all the data sets, providing neither as accurate a classification nor chemical insight for any of the tested samples. Based on these results we conclude that as the complexity of the sample increases, so must the sophistication of the multivariate technique used to classify the samples. PCA is a preferred first step for understanding ToF-SIMS data that can be followed by either LDA or PLSDA for effective classification analysis. This study demonstrates the strength of ToF-SIMS combined with multivariate statistical and chemometric techniques to classify increasingly complex biological samples
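PCA as the "valuable first step" described above can be sketched with plain SVD; the toy "spectra" below are simulated stand-ins for ToF-SIMS data, with the class difference planted in the first few channels:

```python
import numpy as np

rng = np.random.default_rng(4)

# toy "spectra": class A differs from class B in the first five channels
a = rng.normal(size=(20, 50))
a[:, :5] += 3.0
b = rng.normal(size=(20, 50))
spectra = np.vstack([a, b])

# PCA via SVD of the mean-centered data matrix
centered = spectra - spectra.mean(axis=0)
u, s, vt = np.linalg.svd(centered, full_matrices=False)
scores = centered @ vt[:2].T   # sample coordinates on the first two PCs
loadings = vt[:2]              # channel weights: which variables drive the grouping

sep = abs(scores[:20, 0].mean() - scores[20:, 0].mean())
print(sep)                     # the two classes separate along PC1
```

This mirrors the paper's two uses of PCA: the scores show the relative grouping of samples, while the loadings point at the channels (here the first five) responsible for that grouping, before any supervised method such as LDA or PLSDA is applied.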
2001-01-01
For the year 2000, part of the figures shown in the tables of the Energy Review are preliminary or estimated. The annual statistics of the Energy Review appear in more detail from the publication Energiatilastot - Energy Statistics issued annually, which also includes historical time series over a longer period (see e.g. Energiatilastot 1999, Statistics Finland, Helsinki 2000, ISSN 0785-3165). The inside of the Review's back cover shows the energy units and the conversion coefficients used for them. Explanatory notes to the statistical tables can be found after tables and figures. The figures present: Changes in the volume of GNP and energy consumption, Changes in the volume of GNP and electricity, Coal consumption, Natural gas consumption, Peat consumption, Domestic oil deliveries, Import prices of oil, Consumer prices of principal oil products, Fuel prices for heat production, Fuel prices for electricity production, Carbon dioxide emissions from the use of fossil fuels, Total energy consumption by source and CO2 emissions, Electricity supply, Energy imports by country of origin in 2000, Energy exports by recipient country in 2000, Consumer prices of liquid fuels, Consumer prices of hard coal, natural gas and indigenous fuels, Average electricity price by type of consumer, Price of district heating by type of consumer, Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources and Energy taxes and precautionary stock fees on oil products
2000-01-01
For the years 1999 and 2000, part of the figures shown in the tables of the Energy Review are preliminary or estimated. The annual statistics of the Energy Review appear in more detail from the publication Energiatilastot - Energy Statistics issued annually, which also includes historical time series over a longer period (see e.g., Energiatilastot 1998, Statistics Finland, Helsinki 1999, ISSN 0785-3165). The inside of the Review's back cover shows the energy units and the conversion coefficients used for them. Explanatory notes to the statistical tables can be found after tables and figures. The figures present: Changes in the volume of GNP and energy consumption, Changes in the volume of GNP and electricity, Coal consumption, Natural gas consumption, Peat consumption, Domestic oil deliveries, Import prices of oil, Consumer prices of principal oil products, Fuel prices for heat production, Fuel prices for electricity production, Carbon dioxide emissions, Total energy consumption by source and CO2 emissions, Electricity supply, Energy imports by country of origin in January-March 2000, Energy exports by recipient country in January-March 2000, Consumer prices of liquid fuels, Consumer prices of hard coal, natural gas and indigenous fuels, Average electricity price by type of consumer, Price of district heating by type of consumer, Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources and Energy taxes and precautionary stock fees on oil products
1999-01-01
For the years 1998 and 1999, part of the figures shown in the tables of the Energy Review are preliminary or estimated. The annual statistics of the Energy Review appear in more detail from the publication Energiatilastot - Energy Statistics issued annually, which also includes historical time series over a longer period (see e.g. Energiatilastot 1998, Statistics Finland, Helsinki 1999, ISSN 0785-3165). The inside of the Review's back cover shows the energy units and the conversion coefficients used for them. Explanatory notes to the statistical tables can be found after tables and figures. The figures present: Changes in the volume of GNP and energy consumption, Changes in the volume of GNP and electricity, Coal consumption, Natural gas consumption, Peat consumption, Domestic oil deliveries, Import prices of oil, Consumer prices of principal oil products, Fuel prices for heat production, Fuel prices for electricity production, Carbon dioxide emissions, Total energy consumption by source and CO2 emissions, Electricity supply, Energy imports by country of origin in January-June 1999, Energy exports by recipient country in January-June 1999, Consumer prices of liquid fuels, Consumer prices of hard coal, natural gas and indigenous fuels, Average electricity price by type of consumer, Price of district heating by type of consumer, Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources and Energy taxes and precautionary stock fees on oil products
B. Lavraud
2010-01-01
Previous work has shown that solar wind suprathermal electrons can display a number of features in terms of their anisotropy. Of importance is the occurrence of counter-streaming electron patterns, i.e., with "beams" both parallel and anti-parallel to the local magnetic field, which is believed to shed light on the heliospheric magnetic field topology. In the present study, we use STEREO data to obtain the statistical properties of counter-streaming suprathermal electrons (CSEs) in the vicinity of corotating interaction regions (CIRs) during the period March–December 2007. Because this period corresponds to a minimum of solar activity, the results are unrelated to the sampling of large-scale coronal mass ejections, which can lead to CSEs owing to their closed magnetic field topology. The present study statistically confirms that CSEs are primarily the result of suprathermal electron leakage from the compressed CIR into the upstream regions, with the combined occurrence of halo depletion at 90° pitch angle. The occurrence rate of CSEs is found to be about 15–20% on average during the period analyzed (depending on the criteria used), but superposed epoch analysis demonstrates that CSEs are preferentially observed both before and after the passage of the stream interface (with peak occurrence rate >35%) in the trailing high speed stream, as well as both inside and outside CIRs. The results quantitatively show that CSEs are common in the solar wind during solar minimum, but they suggest that such distributions would be much more common if pitch angle scattering were absent. We further argue (1) that the formation of shocks contributes to the occurrence of enhanced counter-streaming sunward-directed fluxes, but does not appear to be a necessary condition, and (2) that the presence of small-scale transients with closed-field topologies likely also contributes to the occurrence of counter-streaming patterns, but only in the slow solar wind prior to
Exploring the Connection Between Sampling Problems in Bayesian Inference and Statistical Mechanics
Pohorille, Andrew
2006-01-01
The Bayesian and statistical mechanical communities often share the same objective in their work - estimating and integrating probability distribution functions (pdfs) describing stochastic systems, models or processes. Frequently, these pdfs are complex functions of random variables exhibiting multiple, well-separated local minima. Conventional strategies for sampling such pdfs are inefficient, sometimes leading to an apparent non-ergodic behavior. Several recently developed techniques for handling this problem have been successfully applied in statistical mechanics. In the multicanonical and Wang-Landau Monte Carlo (MC) methods, the correct pdfs are recovered from uniform sampling of the parameter space by iteratively establishing proper weighting factors connecting these distributions. Trivial generalizations allow for sampling from any chosen pdf. The closely related transition matrix method relies on estimating transition probabilities between different states. All these methods proved to generate estimates of pdfs with high statistical accuracy. In another MC technique, parallel tempering, several random walks, each corresponding to a different value of a parameter (e.g. "temperature"), are generated and occasionally exchanged using the Metropolis criterion. This method can be considered as a statistically correct version of simulated annealing. An alternative approach is to represent the set of independent variables as a Hamiltonian system. Considerable progress has been made in understanding how to ensure that the system obeys the equipartition theorem or, equivalently, that coupling between the variables is correctly described. Then a host of techniques developed for dynamical systems can be used. Among them, probably the most powerful is the Adaptive Biasing Force method, in which thermodynamic integration and biased sampling are combined to yield very efficient estimates of pdfs. The third class of methods deals with transitions between states described
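The parallel tempering scheme described above can be sketched in a few lines. This is a generic illustration for a hypothetical bimodal one-dimensional pdf, not code from the work itself; the temperature ladder, proposal width and swap frequency are arbitrary choices.

```python
import numpy as np

def log_target(x):
    # hypothetical bimodal pdf: equal mixture of N(-3, 1) and N(+3, 1)
    return np.logaddexp(-0.5 * (x - 3.0) ** 2, -0.5 * (x + 3.0) ** 2)

def parallel_tempering(n_steps=20000, temps=(1.0, 4.0, 16.0), seed=0):
    rng = np.random.default_rng(seed)
    x = np.zeros(len(temps))              # one walker per temperature
    kept = []
    for step in range(n_steps):
        # Metropolis update for each walker at its own temperature
        for i, T in enumerate(temps):
            prop = x[i] + rng.normal(0.0, 1.5)
            if np.log(rng.random()) < (log_target(prop) - log_target(x[i])) / T:
                x[i] = prop
        # occasional swap of neighbouring walkers, accepted with the
        # Metropolis criterion so each chain keeps its correct distribution
        if step % 10 == 0:
            i = int(rng.integers(len(temps) - 1))
            dlog = (log_target(x[i + 1]) - log_target(x[i])) * \
                   (1.0 / temps[i] - 1.0 / temps[i + 1])
            if np.log(rng.random()) < dlog:
                x[i], x[i + 1] = x[i + 1], x[i]
        kept.append(x[0])                 # record only the T = 1 chain
    return np.array(kept)

samples = parallel_tempering()
print((samples > 0).mean(), (samples < 0).mean())  # fraction in each mode
```

A plain Metropolis chain at T = 1 would typically get trapped in one of the two modes; the hot chains cross the barrier easily and feed mode switches down the ladder via swaps.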
Li, Jin-Na; Er, Meng-Joo; Tan, Yen-Kheng; Yu, Hai-Bin; Zeng, Peng
2015-09-01
This paper investigates an adaptive sampling rate control scheme for networked control systems (NCSs) subject to packet disordering. The main objectives of the proposed scheme are (a) to avoid heavy packet disordering existing in communication networks and (b) to stabilize NCSs with packet disordering, transmission delay and packet loss. First, a novel sampling rate control algorithm based on statistical characteristics of disordering entropy is proposed; second, an augmented closed-loop NCS that consists of a plant, a sampler and a state-feedback controller is transformed into an uncertain and stochastic system, which facilitates the controller design. Then, a sufficient condition for stochastic stability in terms of Linear Matrix Inequalities (LMIs) is given. Moreover, an adaptive tracking controller is designed such that the sampling period tracks a desired sampling period, which represents a significant contribution. Finally, experimental results are given to illustrate the effectiveness and advantages of the proposed scheme. Copyright © 2015 ISA. Published by Elsevier Ltd. All rights reserved.
The autoradiographic observation of neutron activated plant samples
Koyama, Motoko; Tanizaki, Yoshiyuki
2003-01-01
Imaging Plate (IP) is a radiography apparatus applying photostimulable luminescence. IP has some advantages in comparison with X-ray film, for example, high sensitivity, wide latitude and high fidelity for radiations. The high sensitivity of IP makes it possible to observe the distribution of short-lived nuclides. We obtained autoradiographs of Azuki bean cuttings. In the basal region of Azuki bean cuttings, the intensity of autoradiographs of indole acetic acid (IAA)-treated samples was higher than that of water- and gibberellin (GA)-treated ones. The high-intensity parts of IAA-treated cuttings extended upwards. The intense imaging of the IAA-treated basal region indicated that high elemental concentrations were present for adventitious root formation. The measurement results by γ-ray spectrometry showed that the Ca content in the basal region of Azuki bean cuttings increased with IAA treatment. It seems that the cell division for adventitious root formation needs Ca. In Azuki bean epicotyls, Ca content increased toward the basal region, whereas Mg content increased toward the upper region. (author)
Bonetto, P.; Qi, Jinyi; Leahy, R. M.
2000-08-01
Describes a method for computing linear observer statistics for maximum a posteriori (MAP) reconstructions of PET images. The method is based on a theoretical approximation for the mean and covariance of MAP reconstructions. In particular, the authors derive here a closed form for the channelized Hotelling observer (CHO) statistic applied to 2D MAP images. The theoretical analysis models both the Poisson statistics of PET data and the inhomogeneity of tracer uptake. The authors show reasonably good correspondence between these theoretical results and Monte Carlo studies. The accuracy and low computational cost of the approximation allow the authors to analyze the observer performance over a wide range of operating conditions and parameter settings for the MAP reconstruction algorithm.
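The channelized Hotelling observer reduces each image to a small vector of channel outputs and applies a Hotelling (prewhitened matched filter) discriminant in that space. The sketch below illustrates the computation on simulated 1-D "images" with hypothetical difference-of-Gaussian channels; the paper's actual channel model, and its theoretical (rather than sample-based) means and covariances, are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(1)
n_pix = 64
x = np.arange(n_pix) - 32.0

# hypothetical difference-of-Gaussian spatial channels (4 channels x 64 pixels);
# the paper's channel model may differ
widths = [2.0, 4.0, 8.0, 16.0]
U = np.array([np.exp(-0.5 * (x / w) ** 2) - np.exp(-0.5 * (x / (2.0 * w)) ** 2)
              for w in widths])

signal = 0.8 * np.exp(-0.5 * (x / 3.0) ** 2)        # known signal profile
absent = rng.normal(0.0, 1.0, (2000, n_pix))        # signal-absent images
present = rng.normal(0.0, 1.0, (2000, n_pix)) + signal

va, vp = absent @ U.T, present @ U.T                # channel outputs
dv = vp.mean(axis=0) - va.mean(axis=0)              # class mean difference
K = 0.5 * (np.cov(va.T) + np.cov(vp.T))             # pooled channel covariance
w_hot = np.linalg.solve(K, dv)                      # Hotelling template
snr = float(np.sqrt(dv @ w_hot))                    # CHO detectability index
print(snr)
```

The scalar test statistic for an individual image `img` would be `w_hot @ (U @ img)`; `snr` summarizes how well the two classes separate in channel space.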
2003-01-01
For the year 2002, part of the figures shown in the tables of the Energy Review are preliminary. The annual statistics of the Energy Review also include historical time series over a longer period (see e.g. Energiatilastot 2001, Statistics Finland, Helsinki 2002). The applied energy units and conversion coefficients are shown in the inside back cover of the Review. Explanatory notes to the statistical tables can be found after tables and figures. The figures present: Changes in GDP, energy consumption and electricity consumption, Carbon dioxide emissions from fossil fuel use, Coal consumption, Consumption of natural gas, Peat consumption, Domestic oil deliveries, Import prices of oil, Consumer prices of principal oil products, Fuel prices in heat production, Fuel prices in electricity production, Price of electricity by type of consumer, Average monthly spot prices at the Nord Pool power exchange, Total energy consumption by source and CO₂ emissions, Supply and total consumption of electricity GWh, Energy imports by country of origin in January-June 2003, Energy exports by recipient country in January-June 2003, Consumer prices of liquid fuels, Consumer prices of hard coal, natural gas and indigenous fuels, Price of natural gas by type of consumer, Price of electricity by type of consumer, Price of district heating by type of consumer, Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources and Excise taxes, precautionary stock fees and oil pollution fees on energy products
2004-01-01
For the years 2003 and 2004, the figures shown in the tables of the Energy Review are partly preliminary. The annual statistics of the Energy Review also include historical time series over a longer period (see e.g. Energiatilastot, Statistics Finland, Helsinki 2003, ISSN 0785-3165). The applied energy units and conversion coefficients are shown in the inside back cover of the Review. Explanatory notes to the statistical tables can be found after tables and figures. The figures present: Changes in GDP, energy consumption and electricity consumption, Carbon dioxide emissions from fossil fuel use, Coal consumption, Consumption of natural gas, Peat consumption, Domestic oil deliveries, Import prices of oil, Consumer prices of principal oil products, Fuel prices in heat production, Fuel prices in electricity production, Price of electricity by type of consumer, Average monthly spot prices at the Nord Pool power exchange, Total energy consumption by source and CO₂ emissions, Supplies and total consumption of electricity GWh, Energy imports by country of origin in January-March 2004, Energy exports by recipient country in January-March 2004, Consumer prices of liquid fuels, Consumer prices of hard coal, natural gas and indigenous fuels, Price of natural gas by type of consumer, Price of electricity by type of consumer, Price of district heating by type of consumer, Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources and Excise taxes, precautionary stock fees and oil pollution fees
2000-01-01
For the years 1999 and 2000, part of the figures shown in the tables of the Energy Review are preliminary or estimated. The annual statistics of the Energy Review also include historical time series over a longer period (see e.g., Energiatilastot 1999, Statistics Finland, Helsinki 2000, ISSN 0785-3165). The inside of the Review's back cover shows the energy units and the conversion coefficients used for them. Explanatory notes to the statistical tables can be found after tables and figures. The figures present: Changes in the volume of GNP and energy consumption, Changes in the volume of GNP and electricity, Coal consumption, Natural gas consumption, Peat consumption, Domestic oil deliveries, Import prices of oil, Consumer prices of principal oil products, Fuel prices for heat production, Fuel prices for electricity production, Carbon dioxide emissions, Total energy consumption by source and CO₂ emissions, Electricity supply, Energy imports by country of origin in January-June 2000, Energy exports by recipient country in January-June 2000, Consumer prices of liquid fuels, Consumer prices of hard coal, natural gas and indigenous fuels, Average electricity price by type of consumer, Price of district heating by type of consumer, Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources and Energy taxes and precautionary stock fees on oil products
Pavlacky, David C; Lukacs, Paul M; Blakesley, Jennifer A; Skorkowsky, Robert C; Klute, David S; Hahn, Beth A; Dreitz, Victoria J; George, T Luke; Hanni, David J
2017-01-01
Monitoring is an essential component of wildlife management and conservation. However, the usefulness of monitoring data is often undermined by the lack of 1) coordination across organizations and regions, 2) meaningful management and conservation objectives, and 3) rigorous sampling designs. Although many improvements to avian monitoring have been discussed, the recommendations have been slow to emerge in large-scale programs. We introduce the Integrated Monitoring in Bird Conservation Regions (IMBCR) program designed to overcome the above limitations. Our objectives are to outline the development of a statistically defensible sampling design to increase the value of large-scale monitoring data and provide example applications to demonstrate the ability of the design to meet multiple conservation and management objectives. We outline the sampling process for the IMBCR program with a focus on the Badlands and Prairies Bird Conservation Region (BCR 17). We provide two examples for the Brewer's sparrow (Spizella breweri) in BCR 17 demonstrating the ability of the design to 1) determine hierarchical population responses to landscape change and 2) estimate hierarchical habitat relationships to predict the response of the Brewer's sparrow to conservation efforts at multiple spatial scales. The collaboration across organizations and regions provided economy of scale by leveraging a common data platform over large spatial scales to promote the efficient use of monitoring resources. We designed the IMBCR program to address the information needs and core conservation and management objectives of the participating partner organizations. Although it has been argued that probabilistic sampling designs are not practical for large-scale monitoring, the IMBCR program provides a precedent for implementing a statistically defensible sampling design from local to bioregional scales. We demonstrate that integrating conservation and management objectives with rigorous statistical
DWPF Sample Vial Insert Study-Statistical Analysis of DWPF Mock-Up Test Data
Harris, S.P.
1997-01-01
This report is prepared as part of Technical/QA Task Plan WSRC-RP-97-351, which was issued in response to Technical Task Request HLW/DWPF/TTR-970132 submitted by DWPF. Presented in this report is a statistical analysis of DWPF Mock-up test data for evaluation of two new analytical methods which use insert samples from the existing Hydragard™ sampler. The first is a new hydrofluoric acid based method called the Cold Chemical Method (Cold Chem) and the second is a modified fusion method. Both new methods use the existing Hydragard™ sampler to collect a smaller insert sample from the process sampling system. The insert testing methodology applies to the DWPF Slurry Mix Evaporator (SME) and the Melter Feed Tank (MFT) samples. Samples in small 3 ml containers (inserts) are analyzed by either the cold chemical method or a modified fusion method. The current analytical method uses a Hydragard™ sample station to obtain nearly full 15 ml peanut vials. The samples are prepared by a multi-step process for Inductively Coupled Plasma (ICP) analysis by drying, vitrification, grinding and finally dissolution by either mixed acid or fusion. In contrast, the insert sample is placed directly in the dissolution vessel, thus eliminating the drying, vitrification and grinding operations for the Cold Chem method. Although the modified fusion still requires drying and calcine conversion, the process is rapid due to the decreased sample size and the fact that no vitrification step is required. A slurry feed simulant material was acquired from the TNX pilot facility from the test run designated as PX-7. The Mock-up test data were gathered on the basis of a statistical design presented in SRT-SCS-97004 (Rev. 0). Simulant PX-7 samples were taken in the DWPF Analytical Cell Mock-up Facility using 3 ml inserts and 15 ml peanut vials. A number of the insert samples were analyzed by Cold Chem and compared with full peanut vial samples analyzed by the current methods. The remaining inserts were analyzed by
A preliminary study on identification of Thai rice samples by INAA and statistical analysis
Kongsri, S.; Kukusamude, C.
2017-09-01
This study aims to investigate the elemental compositions of 93 Thai rice samples using instrumental neutron activation analysis (INAA) and to identify rice according to type and cultivar using statistical analysis. As, Mg, Cl, Al, Br, Mn, K, Rb and Zn in Thai jasmine rice and Sung Yod rice samples were successfully determined by INAA. The accuracy and precision of the INAA method were verified by SRM 1568a Rice Flour. All elements were found to be in good agreement with the certified values. The precisions in terms of %RSD were lower than 7%. The LODs were obtained in the range of 0.01 to 29 mg kg⁻¹. The concentrations of the 9 elements distributed in Thai rice samples were evaluated and used as chemical indicators to identify the type of rice samples. The results showed that Mg, Cl, As, Br, Mn, K, Rb, and Zn concentrations in Thai jasmine rice samples are significantly different, whereas there was no evidence that Al concentrations differ from those in Sung Yod rice samples at the 95% confidence level. Our results may provide preliminary information for the discrimination of rice samples and may serve as a useful database of Thai rice.
Comparison of statistical sampling methods with ScannerBit, the GAMBIT scanning module
Martinez, Gregory D. [University of California, Physics and Astronomy Department, Los Angeles, CA (United States); McKay, James; Scott, Pat [Imperial College London, Department of Physics, Blackett Laboratory, London (United Kingdom); Farmer, Ben; Conrad, Jan [AlbaNova University Centre, Oskar Klein Centre for Cosmoparticle Physics, Stockholm (Sweden); Stockholm University, Department of Physics, Stockholm (Sweden); Roebber, Elinore [McGill University, Department of Physics, Montreal, QC (Canada); Putze, Antje [LAPTh, Universite de Savoie, CNRS, Annecy-le-Vieux (France); Collaboration: The GAMBIT Scanner Workgroup
2017-11-15
We introduce ScannerBit, the statistics and sampling module of the public, open-source global fitting framework GAMBIT. ScannerBit provides a standardised interface to different sampling algorithms, enabling the use and comparison of multiple computational methods for inferring profile likelihoods, Bayesian posteriors, and other statistical quantities. The current version offers random, grid, raster, nested sampling, differential evolution, Markov Chain Monte Carlo (MCMC) and ensemble Monte Carlo samplers. We also announce the release of a new standalone differential evolution sampler, Diver, and describe its design, usage and interface to ScannerBit. We subject Diver and three other samplers (the nested sampler MultiNest, the MCMC GreAT, and the native ScannerBit implementation of the ensemble Monte Carlo algorithm T-Walk) to a battery of statistical tests. For this we use a realistic physical likelihood function, based on the scalar singlet model of dark matter. We examine the performance of each sampler as a function of its adjustable settings, and the dimensionality of the sampling problem. We evaluate performance on four metrics: optimality of the best fit found, completeness in exploring the best-fit region, number of likelihood evaluations, and total runtime. For Bayesian posterior estimation at high resolution, T-Walk provides the most accurate and timely mapping of the full parameter space. For profile likelihood analysis in less than about ten dimensions, we find that Diver and MultiNest score similarly in terms of best fit and speed, outperforming GreAT and T-Walk; in ten or more dimensions, Diver substantially outperforms the other three samplers on all metrics. (orig.)
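Diver's core algorithm is differential evolution. A minimal, generic rand/1/bin sketch (not Diver's actual interface, parameter names, or defaults) looks like this:

```python
import numpy as np

def diff_evo(f, bounds, pop=30, gens=200, F=0.8, CR=0.9, seed=2):
    """Minimal rand/1/bin differential evolution (generic sketch)."""
    rng = np.random.default_rng(seed)
    lo, hi = np.array(bounds, dtype=float).T
    X = rng.uniform(lo, hi, (pop, len(lo)))          # initial population
    fX = np.array([f(x) for x in X])
    for _ in range(gens):
        for i in range(pop):
            a, b, c = X[rng.choice(pop, size=3, replace=False)]
            mutant = np.clip(a + F * (b - c), lo, hi)    # mutation
            cross = rng.random(len(lo)) < CR             # binomial crossover
            trial = np.where(cross, mutant, X[i])
            f_trial = f(trial)
            if f_trial < fX[i]:                          # greedy selection
                X[i], fX[i] = trial, f_trial
    best = int(np.argmin(fX))
    return X[best], float(fX[best])

# usage: minimise a 2-D Rosenbrock function as a stand-in for a
# negative log-likelihood surface
best, val = diff_evo(lambda x: (1.0 - x[0]) ** 2
                     + 100.0 * (x[1] - x[0] ** 2) ** 2,
                     [(-2.0, 2.0), (-2.0, 2.0)])
print(best, val)
```

In a profile-likelihood scan such as those discussed above, `f` would be the negative log-likelihood and the visited points, not just the best fit, would be retained to map the best-fit region.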
New Hybrid Monte Carlo methods for efficient sampling. From physics to biology and statistics
Akhmatskaya, Elena; Reich, Sebastian
2011-01-01
We introduce a class of novel hybrid methods for detailed simulations of large complex systems in physics, biology, materials science and statistics. These generalized shadow Hybrid Monte Carlo (GSHMC) methods combine the advantages of stochastic and deterministic simulation techniques. They utilize a partial momentum update to retain some of the dynamical information, employ modified Hamiltonians to overcome exponential performance degradation with the system’s size and make use of the multi-scale nature of complex systems. Variants of GSHMC were developed for atomistic simulation, particle simulation and statistics: GSHMC (thermodynamically consistent implementation of constant-temperature molecular dynamics), MTS-GSHMC (multiple-time-stepping GSHMC), meso-GSHMC (Metropolis corrected dissipative particle dynamics (DPD) method), and a generalized shadow Hamiltonian Monte Carlo, GSHmMC (a GSHMC for statistical simulations). All of these are compatible with other enhanced sampling techniques and suitable for massively parallel computing allowing for a range of multi-level parallel strategies. A brief description of the GSHMC approach, examples of its application on high performance computers and comparison with other existing techniques are given. Our approach is shown to resolve such problems as resonance instabilities of the MTS methods and non-preservation of thermodynamic equilibrium properties in DPD, and to outperform known methods in sampling efficiency by an order of magnitude. (author)
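For orientation, plain Hybrid (Hamiltonian) Monte Carlo with leapfrog integration is the baseline that GSHMC extends with partial momentum updates and modified (shadow) Hamiltonians; neither extension is implemented in this sketch. The target, step size and trajectory length below are arbitrary illustrative choices.

```python
import numpy as np

def hmc(log_p, grad_log_p, x0, n=5000, eps=0.2, L=10, seed=3):
    rng = np.random.default_rng(seed)
    x, out = float(x0), []
    for _ in range(n):
        p = rng.normal()                         # full momentum refresh
        xn = x
        pn = p + 0.5 * eps * grad_log_p(xn)      # leapfrog half step
        for _ in range(L - 1):
            xn += eps * pn
            pn += eps * grad_log_p(xn)
        xn += eps * pn
        pn += 0.5 * eps * grad_log_p(xn)         # closing half step
        # Metropolis test on the change in H = -log p(x) + p^2 / 2
        dH = (log_p(xn) - 0.5 * pn ** 2) - (log_p(x) - 0.5 * p ** 2)
        if np.log(rng.random()) < dH:
            x = xn
        out.append(x)
    return np.array(out)

# usage: standard-normal target, log p(x) = -x^2/2 up to a constant
s = hmc(lambda x: -0.5 * x ** 2, lambda x: -x, 0.0)
print(s.mean(), s.std())
```

GSHMC replaces the full momentum refresh with a partial one and accepts with respect to a shadow Hamiltonian conserved far more accurately by the leapfrog map, which is what keeps acceptance high for large systems.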
Fossum, Kristian; Mannseth, Trond
2014-01-01
We assess and compare parameter sampling capabilities of one sequential and one simultaneous Bayesian, ensemble-based, joint state-parameter (JS) estimation method. In the companion paper, part I (Fossum and Mannseth 2014 Inverse Problems 30 114002), analytical investigations lead us to propose three claims, essentially stating that the sequential method can be expected to outperform the simultaneous method for weakly nonlinear forward models. Here, we assess the reliability and robustness of these claims through statistical analysis of results from a range of numerical experiments. Samples generated by the two approximate JS methods are compared to samples from the posterior distribution generated by a Markov chain Monte Carlo method, using four approximate measures of distance between probability distributions. Forward-model nonlinearity is assessed from a stochastic nonlinearity measure allowing for sufficiently large model dimensions. Both toy models (with low computational complexity, and where the nonlinearity is fairly easy to control) and two-phase porous-media flow models (corresponding to down-scaled versions of problems to which the JS methods have been frequently applied recently) are considered in the numerical experiments. Results from the statistical analysis show strong support of all three claims stated in part I. (paper)
Hagell, Peter; Westergren, Albert
Sample size is a major factor in statistical null hypothesis testing, which is the basis for many approaches to testing Rasch model fit. Few sample size recommendations for testing fit to the Rasch model concern the Rasch Unidimensional Measurement Models (RUMM) software, which features chi-square and ANOVA/F-ratio based fit statistics, including Bonferroni and algebraic sample size adjustments. This paper explores the occurrence of Type I errors with RUMM fit statistics, and the effects of algebraic sample size adjustments. Data simulated to fit the Rasch model, for 25-item dichotomous scales with sample sizes ranging from N = 50 to N = 2500, were analysed with and without algebraically adjusted sample sizes. Results suggest the occurrence of Type I errors with N ≤ 500, and that Bonferroni correction as well as downward algebraic sample size adjustment are useful to avoid such errors, whereas upward adjustment of smaller samples falsely signals misfit. Our observations suggest that sample sizes around N = 250 to N = 500 may provide a good balance for the statistical interpretation of the RUMM fit statistics studied here with respect to Type I errors and under the assumption of Rasch model fit within the examined frame of reference (i.e., about 25 item parameters well targeted to the sample).
IR Observations of a Complete Unbiased Sample of Bright Seyfert Galaxies
Malkan, Matthew; Bendo, George; Charmandaris, Vassilis; Smith, Howard; Spinoglio, Luigi; Tommasin, Silvia
2008-03-01
IR spectra will measure the 2 main energy-generating processes by which galactic nuclei shine: black hole accretion and star formation. Both of these play roles in galaxy evolution, and they appear connected. To obtain a complete sample of AGN, covering the range of luminosities and column-densities, we will combine 2 complete all-sky samples with complementary selections, minimally biased by dust obscuration: the 116 IRAS 12um AGN and the 41 Swift/BAT hard Xray AGN. These galaxies have been extensively studied across the entire EM spectrum. Herschel observations have been requested and will be synergistic with the Spitzer database. IRAC and MIPS imaging will allow us to separate the nuclear and galactic continua. We are completing full IR observations of the local AGN population, most of which have already been done. The only remaining observations we request are 10 IRS/HIRES, 57 MIPS-24 and 30 IRAC pointings. These high-quality observations of bright AGN in the bolometric-flux-limited samples should be completed, for the high legacy value of complete uniform datasets. We will measure quantitatively the emission at each wavelength arising from stars and from accretion in each galactic center. Since our complete samples come from flux-limited all-sky surveys in the IR and HX, we will calculate the bivariate AGN and star formation Luminosity Functions for the local population of active galaxies, for comparison with higher redshifts. Our second aim is to understand the physical differences between AGN classes. This requires statistical comparisons of full multiwavelength observations of complete representative samples. If the difference between Sy1s and Sy2s is caused by orientation, their isotropic properties, including those of the surrounding galactic centers, should be similar. In contrast, if they are different evolutionary stages following a galaxy encounter, then we may find observational evidence that the circumnuclear ISM of Sy2s is relatively younger.
New complete sample of identified radio sources. Part 2. Statistical study
Soltan, A.
1978-01-01
The complete sample of radio sources with known redshifts selected in Paper I is studied. Source counts in the sample and the luminosity-volume test show that both quasars and galaxies are subject to evolution. Luminosity functions for different ranges of redshifts are obtained. Due to many uncertainties, only simplified models of the evolution are tested. An exponential decline of the luminosity with time for all the bright sources is in good agreement with both the luminosity-volume test and the N(S) relation over the entire range of observed flux densities. It is shown that sources in the sample are randomly distributed on scales greater than about 17 Mpc. (author)
Inada, Naohisa; /Wako, RIKEN /Tokyo U., ICEPP; Oguri, Masamune; /Natl. Astron. Observ. of Japan /Stanford U., Phys. Dept.; Shin, Min-Su; /Michigan U. /Princeton U. Observ.; Kayo, Issha; /Tokyo U., ICRR; Strauss, Michael A.; /Princeton U. Observ.; Hennawi, Joseph F.; /UC, Berkeley /Heidelberg, Max Planck Inst. Astron.; Morokuma, Tomoki; /Natl. Astron. Observ. of Japan; Becker, Robert H.; /LLNL, Livermore /UC, Davis; White, Richard L.; /Baltimore, Space Telescope Sci.; Kochanek, Christopher S.; /Ohio State U.; Gregg, Michael D.; /LLNL, Livermore /UC, Davis /Exeter U.
2010-05-01
We present the second report of our systematic search for strongly lensed quasars from the data of the Sloan Digital Sky Survey (SDSS). From extensive follow-up observations of 136 candidate objects, we find 36 lenses in the full sample of 77,429 spectroscopically confirmed quasars in the SDSS Data Release 5. We then define a complete sample of 19 lenses, including 11 from our previous search in the SDSS Data Release 3, from the sample of 36,287 quasars with i < 19.1 in the redshift range 0.6 < z < 2.2, where we require the lenses to have image separations of 1'' < θ < 20'' and i-band magnitude differences between the two images smaller than 1.25 mag. Among the 19 lensed quasars, 3 have quadruple-image configurations, while the remaining 16 show double images. This lens sample constrains the cosmological constant to be Ω_Λ = 0.84^{+0.06}_{-0.08} (stat.) ^{+0.09}_{-0.07} (syst.) assuming a flat universe, which is in good agreement with other cosmological observations. We also report the discoveries of 7 binary quasars with separations ranging from 1.1'' to 16.6'', which are identified in the course of our lens survey. This study concludes the construction of our statistical lens sample in the full SDSS-I data set.
Mohd Fo'ad Rohani; Mohd Aizaini Maarof; Ali Selamat; Houssain Kettani
2010-01-01
This paper proposes a Multi-Level Sampling (MLS) approach for continuous Loss of Self-Similarity (LoSS) detection using iterative window. The method defines LoSS based on Second Order Self-Similarity (SOSS) statistical model. The Optimization Method (OM) is used to estimate self-similarity parameter since it is fast and more accurate in comparison with other estimation methods known in the literature. Probability of LoSS detection is introduced to measure continuous LoSS detection performance...
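As an illustration of second-order self-similarity estimation, the classical aggregated-variance method below recovers the Hurst parameter H from the scaling of block-mean variances. This is a simpler stand-in for the Optimization Method the abstract uses; the scales and the synthetic trace are hypothetical.

```python
import numpy as np

def hurst_agg_var(x, scales=(1, 2, 4, 8, 16, 32)):
    """Aggregated-variance estimate of the Hurst parameter H."""
    log_vars = []
    for m in scales:
        k = len(x) // m
        agg = x[: k * m].reshape(k, m).mean(axis=1)   # block means at scale m
        log_vars.append(np.log(agg.var()))
    # for second-order self-similar traffic, Var(m) ~ m^(2H - 2),
    # so the log-log slope is 2H - 2
    slope = np.polyfit(np.log(scales), log_vars, 1)[0]
    return 1.0 + slope / 2.0

rng = np.random.default_rng(6)
trace = rng.normal(size=2 ** 14)     # i.i.d. noise: H should be near 0.5,
print(hurst_agg_var(trace))          # the regime a LoSS detector flags
```

Applied over the sliding/iterative windows described above, a detector would flag LoSS whenever the windowed estimate of H drops toward 0.5.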
Gershgorin, B.; Majda, A.J.
2011-01-01
A statistically exactly solvable model for passive tracers is introduced as a test model for the authors' Nonlinear Extended Kalman Filter (NEKF) as well as other filtering algorithms. The model involves a Gaussian velocity field and a passive tracer governed by the advection-diffusion equation with an imposed mean gradient. The model has direct relevance to engineering problems such as the spread of pollutants in the air or contaminants in the water as well as climate change problems concerning the transport of greenhouse gases such as carbon dioxide with strongly intermittent probability distributions consistent with the actual observations of the atmosphere. One of the attractive properties of the model is the existence of the exact statistical solution. In particular, this unique feature of the model provides an opportunity to design and test fast and efficient algorithms for real-time data assimilation based on rigorous mathematical theory for a turbulence model problem with many active spatiotemporal scales. Here, we extensively study the performance of the NEKF which uses the exact first and second order nonlinear statistics without any approximations due to linearization. The role of partial and sparse observations, the frequency of observations and the observation noise strength in recovering the true signal, its spectrum, and fat tail probability distribution are the central issues discussed here. The results of our study provide useful guidelines for filtering realistic turbulent systems with passive tracers through partial observations.
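The filtering problem above is nonlinear, but its structure is easiest to see in the linear scalar case. The sketch below runs a standard Kalman forecast/analysis cycle on a toy autoregressive signal; all parameters are illustrative, and the NEKF itself uses exact nonlinear first and second order statistics rather than this linear update.

```python
import numpy as np

rng = np.random.default_rng(4)
a, sig_w, sig_v, n = 0.9, 0.5, 1.0, 500    # AR(1) dynamics, model/obs noise

truth = np.zeros(n)
for t in range(1, n):
    truth[t] = a * truth[t - 1] + sig_w * rng.normal()
obs = truth + sig_v * rng.normal(size=n)   # noisy observations of the state

m, P = 0.0, 1.0                            # filter mean and variance
est = np.zeros(n)
for t in range(n):
    m, P = a * m, a * a * P + sig_w ** 2          # forecast step
    K = P / (P + sig_v ** 2)                      # Kalman gain
    m, P = m + K * (obs[t] - m), (1.0 - K) * P    # analysis (update) step
    est[t] = m

mse_filter = float(np.mean((est - truth) ** 2))
mse_obs = float(np.mean((obs - truth) ** 2))
print(mse_filter, mse_obs)   # the filter should beat the raw observations
```

Sparse or partial observations of the kind studied above correspond to skipping the analysis step at unobserved times, which lets the forecast variance P grow until the next observation arrives.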
Liuzzo, E.; Giovannini, G.; Giroletti, M.; Taylor, G. B.
2009-10-01
Aims: To study statistical properties of different classes of sources, it is necessary to observe a sample that is free of selection effects. To do this, we initiated a project to observe a complete sample of radio galaxies selected from the B2 Catalogue of Radio Sources and the Third Cambridge Revised Catalogue (3CR), with no selection constraint on the nuclear properties. We named this sample “the Bologna Complete Sample” (BCS). Methods: We present new VLBI observations at 5 and 1.6 GHz for 33 sources drawn from a sample not biased toward orientation. By combining these data with those in the literature, information on the parsec-scale morphology is available for a total of 76 of 94 radio sources with a range in radio power and kiloparsec-scale morphologies. Results: The fraction of two-sided sources at milliarcsecond resolution is high (30%), compared to the fraction found in VLBI surveys selected at centimeter wavelengths, as expected from the predictions of unified models. The parsec-scale jets are generally found to be straight and to line up with the kiloparsec-scale jets. A few peculiar sources are discussed in detail. Tables 1-4 are only available in electronic form at http://www.aanda.org
Constrained statistical inference: sample-size tables for ANOVA and regression
Leonard eVanbrabant
2015-01-01
Researchers in the social and behavioral sciences often have clear expectations about the order or direction of the parameters in their statistical model. For example, a researcher might expect that regression coefficient beta1 is larger than beta2 and beta3. The corresponding hypothesis is H: beta1 > {beta2, beta3}, known as an order-constrained hypothesis. A major advantage of testing such a hypothesis is that power is gained and hence a smaller sample size is needed. This article discusses this sample-size reduction as an increasing number of constraints is included in the hypothesis. The main goal is to present sample-size tables for constrained hypotheses. A sample-size table contains the necessary sample size at a prespecified power (say, 0.80) for an increasing number of constraints. To obtain the sample-size tables, two Monte Carlo simulations were performed, one for ANOVA and one for multiple regression. Three results are salient. First, in an ANOVA the needed sample size decreases by 30% to 50% when complete ordering of the parameters is taken into account. Second, small deviations from the imposed order have only a minor impact on the power. Third, at the maximum number of constraints, the linear regression results are comparable with the ANOVA results. However, with fewer constraints, ordering the parameters (e.g., beta1 > beta2) results in higher power than assigning a positive or a negative sign to the parameters (e.g., beta1 > 0).
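The power gain from an order constraint can be illustrated with a small Monte Carlo sketch in the spirit of the simulations described above; the effect size, sample size and two-group setup below are illustrative assumptions, not values from the article.

```python
import numpy as np
from scipy import stats

def power(n, delta=0.5, alpha=0.05, reps=2000, one_sided=True, seed=1):
    """Monte Carlo power of a two-group mean comparison with effect size delta.

    one_sided=True imposes the expected order (group 1 mean > group 2 mean),
    analogous to testing an order-constrained hypothesis.
    """
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(reps):
        x = rng.normal(delta, 1.0, n)   # group assumed to have the larger mean
        y = rng.normal(0.0, 1.0, n)
        t, p = stats.ttest_ind(x, y)
        if one_sided:
            # Halve the two-sided p-value when the observed direction matches.
            p = p / 2 if t > 0 else 1 - p / 2
        hits += p < alpha
    return hits / reps

# Imposing the expected order raises power at the same n, which is why a
# smaller sample suffices to reach a prespecified power such as 0.80.
print(power(30, one_sided=True), power(30, one_sided=False))
```

The same mechanism underlies the sample-size tables: for a target power, the constrained test crosses the threshold at a smaller n.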
Statistical methods for detecting differentially abundant features in clinical metagenomic samples.
James Robert White
2009-04-01
Numerous studies are currently underway to characterize the microbial communities inhabiting our world. These studies aim to dramatically expand our understanding of the microbial biosphere and, more importantly, hope to reveal the secrets of the complex symbiotic relationship between us and our commensal bacterial microflora. An important prerequisite for such discoveries is computational tools that are able to rapidly and accurately compare large datasets generated from complex bacterial communities to identify features that distinguish them. We present a statistical method for comparing clinical metagenomic samples from two treatment populations on the basis of count data (e.g., as obtained through sequencing) to detect differentially abundant features. Our method, Metastats, employs the false discovery rate to improve specificity in high-complexity environments, and separately handles sparsely sampled features using Fisher's exact test. Under a variety of simulations, we show that Metastats performs well compared to previously used methods, and significantly outperforms other methods for features with sparse counts. We demonstrate the utility of our method on several datasets, including a 16S rRNA survey of obese and lean human gut microbiomes, COG functional profiles of infant and mature gut microbiomes, and bacterial and viral metabolic subsystem data inferred from random sequencing of 85 metagenomes. The application of our method to the obesity dataset reveals differences between obese and lean subjects not reported in the original study. For the COG and subsystem datasets, we provide the first statistically rigorous assessment of the differences between these populations. The methods described in this paper are the first to address clinical metagenomic datasets comprising samples from multiple subjects. Our methods are robust across datasets of varied complexity and sampling level. While designed for metagenomic applications, our software
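A minimal sketch of the two ingredients named above, Fisher's exact test for sparse features and false-discovery-rate control, might look as follows; the counts and sequencing depths are invented for illustration, and this is not the Metastats implementation itself.

```python
import numpy as np
from scipy import stats

def sparse_feature_test(counts_a, counts_b, depth_a, depth_b):
    """Fisher's exact test per sparsely sampled feature.

    Each feature is tested on a 2x2 table of (feature reads, other reads)
    in the two populations; depths are total read counts per population.
    """
    pvals = []
    for ca, cb in zip(counts_a, counts_b):
        table = [[ca, depth_a - ca], [cb, depth_b - cb]]
        _, p = stats.fisher_exact(table)
        pvals.append(p)
    return np.array(pvals)

def benjamini_hochberg(pvals, q=0.05):
    """Boolean mask of discoveries controlling the FDR at level q."""
    p = np.asarray(pvals)
    order = np.argsort(p)
    m = len(p)
    thresh = q * np.arange(1, m + 1) / m
    passed = p[order] <= thresh
    k = np.max(np.nonzero(passed)[0]) + 1 if passed.any() else 0
    mask = np.zeros(m, dtype=bool)
    mask[order[:k]] = True
    return mask

# Hypothetical counts for three features in two populations of depth 1000.
p = sparse_feature_test([40, 2, 1], [5, 3, 0], 1000, 1000)
print(benjamini_hochberg(p))
```

Only the clearly enriched first feature survives the FDR threshold in this toy example; the two sparse features do not.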
GREER DA; THIEN MG
2012-01-12
The ability to effectively mix, sample, certify, and deliver consistent batches of High Level Waste (HLW) feed from the Hanford Double Shell Tanks (DST) to the Waste Treatment and Immobilization Plant (WTP) presents a significant mission risk, with potential to impact mission length and the quantity of HLW glass produced. DOE's Tank Operations Contractor, Washington River Protection Solutions (WRPS), has previously presented the results of mixing performance in two different sizes of small-scale DSTs to support scale-up estimates of full-scale DST mixing performance. Currently, sufficient sampling of DSTs is one of the largest programmatic risks that could prevent timely delivery of high level waste to the WTP. WRPS has performed small-scale mixing and sampling demonstrations to study the ability to sufficiently sample the tanks. The statistical evaluation of the demonstration results, which leads to the conclusion that the two scales of small DST behave similarly and that full-scale performance is predictable, will be presented. This work is essential to reduce the risk of requiring a new dedicated feed sampling facility and will guide future optimization work to ensure the waste feed delivery mission will be accomplished successfully. This paper focuses on the analytical data collected from mixing, sampling, and batch transfer testing in the small-scale mixing demonstration tanks and how those data are being interpreted to begin to understand the relationship between samples taken prior to transfer and samples from the subsequent batches transferred. An overview of the types of data collected and examples of typical raw data are provided. The paper then discusses the processing and manipulation of the data necessary to begin evaluating sampling and batch transfer performance. This discussion also includes the evaluation of the analytical measurement capability with regard to the simulant material used in the demonstration tests. The
Statistical assessment of fish behavior from split-beam hydro-acoustic sampling
McKinstry, Craig A.; Simmons, Mary Ann; Simmons, Carver S.; Johnson, Robert L.
2005-01-01
Statistical methods are presented for using echo-traces from split-beam hydro-acoustic sampling to assess fish behavior in response to a stimulus. The data presented are from a study designed to assess the response of free-ranging, lake-resident fish, primarily kokanee (Oncorhynchus nerka) and rainbow trout (Oncorhynchus mykiss), to high-intensity strobe lights, conducted at Grand Coulee Dam on the Columbia River in northern Washington State. The lights were deployed immediately upstream from the turbine intakes, in a region exposed to daily alternating periods of high and low flows. The study design included five down-looking split-beam transducers positioned in a line at incremental distances upstream from the strobe lights, with treatments applied in randomized pseudo-replicate blocks. Statistical methods included the use of odds ratios from fitted loglinear models. Fish-track velocity vectors were modeled using circular probability distributions. Both analyses are depicted graphically. Study results suggest large increases in fish activity in the presence of the strobe lights, most notably at night and during periods of low flow. The lights also induced notable bimodality in the angular distributions of the fish-track velocity vectors. Statistical summaries are presented along with interpretations of fish behavior.
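As a rough illustration of the odds-ratio analysis mentioned above, the following sketch computes an odds ratio and a Wald confidence interval from a hypothetical 2x2 table of fish-track counts; the counts and the table layout are assumptions for illustration, not data from the study.

```python
import math

def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table of track counts:
    rows = strobe lights on/off, columns = active/inactive detections."""
    return (a * d) / (b * c)

def or_wald_ci(a, b, c, d, z=1.96):
    """Approximate 95% Wald confidence interval on the odds ratio,
    computed on the log scale."""
    log_or = math.log(odds_ratio(a, b, c, d))
    se = math.sqrt(1/a + 1/b + 1/c + 1/d)
    return math.exp(log_or - z * se), math.exp(log_or + z * se)

# Hypothetical counts: 50 active / 10 inactive tracks with lights on,
# 20 active / 40 inactive with lights off.
print(odds_ratio(50, 10, 20, 40))
print(or_wald_ci(50, 10, 20, 40))
```

In a full loglinear model the odds ratios fall out of the fitted interaction terms; the 2x2 version above is the simplest special case.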
Apparatus for observing a sample with a particle beam and an optical microscope
2010-01-01
An apparatus for observing a sample (1) with a TEM column and an optical high resolution scanning microscope (10). The sample position when observing the sample with the TEM column differs from the sample position when observing the sample with the optical microscope in that in the latter case the
Sparse Power-Law Network Model for Reliable Statistical Predictions Based on Sampled Data
Alexander P. Kartun-Giles
2018-04-01
A projective network model is a model that enables predictions to be made based on a subsample of the network data, with the predictions remaining unchanged if a larger sample is taken into consideration. An exchangeable model is a model that does not depend on the order in which nodes are sampled. Despite a large variety of non-equilibrium (growing) and equilibrium (static) sparse complex network models that are widely used in network science, how to reconcile sparseness (constant average degree) with the desired statistical properties of projectivity and exchangeability is currently an outstanding scientific problem. Here we propose a network process with hidden variables which is projective and can generate sparse power-law networks. Although the model is not exchangeable, it can be closely related to exchangeable uncorrelated networks, as indicated by its information-theoretic characterization and its network entropy. The use of the proposed network process as a null model is here tested on real data, indicating that the model offers a promising avenue for statistical network modelling.
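One common way to generate sparse power-law networks from hidden variables is sketched below; this is a generic hidden-variable construction for illustration, not the specific projective process proposed in the paper.

```python
import numpy as np

def sparse_hidden_variable_graph(n, gamma=2.5, mean_deg=4.0, seed=0):
    """Sample a sparse graph from Pareto-tailed hidden variables.

    Each node i carries a weight theta_i; the edge probability
    theta_i * theta_j / (theta_mean * n) scales as 1/n, which keeps the
    average degree roughly constant (sparseness) as n grows.
    """
    rng = np.random.default_rng(seed)
    theta = (1 - rng.random(n)) ** (-1 / (gamma - 1))  # power-law tail
    theta *= mean_deg / theta.mean()                   # set degree scale
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < min(1.0, theta[i] * theta[j] / (theta.mean() * n)):
                edges.append((i, j))
    return theta, edges

theta, edges = sparse_hidden_variable_graph(600)
print(2 * len(edges) / len(theta))  # empirical average degree stays O(1)
```

The expected degree of node i is approximately theta_i, so the degree distribution inherits the power-law tail of the hidden variables.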
K. Sigsbee
2004-07-01
We present the statistics of Langmuir wave amplitudes in the Earth's foreshock using Cluster Wideband Data (WBD) Plasma Wave Receiver electric field waveforms from spacecraft 2, 3 and 4 on 26 March 2002. The largest-amplitude Langmuir waves were observed by Cluster near the boundary between the foreshock and solar wind, in agreement with earlier studies. The characteristics of the waves were similar for all three spacecraft, suggesting that variations in foreshock structure must occur on scales greater than the 50-100 km spacecraft separations. The electric field amplitude probability distributions constructed using waveforms from the Cluster WBD Plasma Wave Receiver generally followed the log-normal statistics predicted by stochastic growth theory for the event studied. Comparison with WBD receiver data from 17 February 2002, when spacecraft 4 was set in a special manual gain mode, suggests that non-optimal auto-ranging of the instrument may have had some influence on the statistics.
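The log-normal prediction of stochastic growth theory can be checked on amplitude data with a quick diagnostic like the one below; the amplitudes here are synthetic (drawn from a log-normal purely for illustration), not Cluster measurements.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
# Hypothetical wave-amplitude sample in arbitrary units. Under stochastic
# growth theory the log-amplitudes should be approximately Gaussian,
# i.e. the amplitudes themselves log-normal.
amps = rng.lognormal(mean=-8.0, sigma=1.2, size=5000)

log_amps = np.log10(amps)
mu, sigma = log_amps.mean(), log_amps.std()
# Quick diagnostic: Gaussian log-amplitudes imply skewness near zero.
print(round(float(stats.skew(log_amps)), 3))
```

On real waveform data one would histogram log-amplitudes per field-strength bin and compare against the Gaussian with the fitted (mu, sigma).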
NEON terrestrial field observations: designing continental scale, standardized sampling
R. H. Kao; C.M. Gibson; R. E. Gallery; C. L. Meier; D. T. Barnett; K. M. Docherty; K. K. Blevins; P. D. Travers; E. Azuaje; Y. P. Springer; K. M. Thibault; V. J. McKenzie; M. Keller; L. F. Alves; E. L. S. Hinckley; J. Parnell; D. Schimel
2012-01-01
Rapid changes in climate and land use and the resulting shifts in species distributions and ecosystem functions have motivated the development of the National Ecological Observatory Network (NEON). Integrating across spatial scales from ground sampling to remote sensing, NEON will provide data for users to address ecological responses to changes in climate, land use,...
[Statistical study of the incidence of agenesis in a sample of 1529 subjects].
Lo Muzio, L; Mignogna, M D; Bucci, P; Sorrentino, F
1989-09-01
Following a short review of the main aetiopathogenetic theories of dental agenesis, a statistical study of this pathology is reported. 1529 orthopantomographs of juveniles aged between 7 and 14 were examined. 79 cases of hypodontia were observed (5.2%): 32 in males (4.05%) and 47 in females (6.78%). The most frequently affected tooth was the second premolar, with an incidence of 58.9%, followed by the lateral incisor, with an incidence of 26.38%. This is in agreement with the international literature.
Determination of Sr-90 in milk samples from the study of statistical results
Otero-Pazos Alberto
2017-01-01
The determination of 90Sr in milk samples is a central task for radiation monitoring laboratories because of its environmental importance. In this paper, the activity concentration of 39 milk samples was obtained through radiochemical separation based on selective retention of Sr in a cationic resin (Dowex 50WX8, 50-100 mesh) and subsequent determination with a low-level gas proportional counter. The results were checked by measuring the Sr concentration with the flame atomic absorption spectroscopy technique, to finally obtain the mass of 90Sr. A statistical treatment of the data was performed using linear regressions. A reliable estimate of the mass of 90Sr was obtained based, first, on the gravimetric technique and, second, on the counts per minute of the third measurement at 90Sr-90Y equilibrium, without having to perform the full analysis. These estimates were verified with 19 milk samples, obtaining overlapping results. The novelty of the manuscript is the possibility of determining the concentration of 90Sr in milk samples without the need to perform the third measurement at equilibrium.
Statistical issues in reporting quality data: small samples and casemix variation.
Zaslavsky, A M
2001-12-01
To present two key statistical issues that arise in analysis and reporting of quality data. Casemix variation is relevant to quality reporting when the units being measured have differing distributions of patient characteristics that also affect the quality outcome. When this is the case, adjustment using stratification or regression may be appropriate. Such adjustments may be controversial when the patient characteristic does not have an obvious relationship to the outcome. Stratified reporting poses problems for sample size and reporting format, but may be useful when casemix effects vary across units. Although there are no absolute standards of reliability, high reliabilities (inter-unit F ≥ 10 or reliability ≥ 0.9) are desirable for distinguishing above- and below-average units. When small or unequal sample sizes complicate reporting, precision may be improved using indirect estimation techniques that incorporate auxiliary information, and 'shrinkage' estimation can help to summarize the strength of evidence about units with small samples. With broader understanding of casemix adjustment and methods for analyzing small samples, quality data can be analysed and reported more accurately.
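A minimal sketch of the 'shrinkage' idea mentioned above: unit means are pulled toward the grand mean in proportion to their estimated reliability, so poorly sampled units make weaker claims. The variance components and unit data below are illustrative assumptions, not values from the article.

```python
import numpy as np

def shrink_unit_means(means, ns, sigma2_within, tau2_between):
    """Empirical-Bayes style shrinkage of unit means toward the grand mean.

    reliability_i = tau2 / (tau2 + sigma2/n_i), so units with small n
    (noisy estimates) are pulled hardest toward the grand mean.
    """
    means = np.asarray(means, float)
    ns = np.asarray(ns, float)
    grand = np.average(means, weights=ns)
    reliability = tau2_between / (tau2_between + sigma2_within / ns)
    return grand + reliability * (means - grand), reliability

# Two units: one large (n=400), one small (n=10) with an extreme raw score.
est, rel = shrink_unit_means([0.90, 0.60], ns=[400, 10],
                             sigma2_within=0.25, tau2_between=0.01)
print(est, rel)
```

The small unit's estimate moves substantially toward the grand mean, while the large unit's estimate barely changes, which is the intended summary of evidence strength.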
Gilbert, R.O.
1978-01-01
Some statistical aspects of compositing field samples of soils for determining the content of Pu are discussed. Some of the potential problems involved in pooling samples are reviewed. This is followed by more detailed discussions and examples of compositing designs, adequacy of mixing, statistical models and their role in compositing, and related topics.
Hu, Zonghui; Qin, Jing
2018-05-20
Many observational studies adopt what we call retrospective convenience sampling (RCS). With the sample size in each arm prespecified, RCS randomly selects subjects from the treatment-inclined subpopulation into the treatment arm and those from the control-inclined into the control arm. Samples in each arm are representative of the respective subpopulation, but the proportion of the 2 subpopulations is usually not preserved in the sample data. We show in this work that, under RCS, existing causal effect estimators actually estimate the treatment effect over the sample population instead of the underlying study population. We investigate how to correct existing methods for consistent estimation of the treatment effect over the underlying population. Although RCS is adopted in medical studies for ethical and cost-effective purposes, it also has a big advantage for statistical inference: When the tendency to receive treatment is low in a study population, treatment effect estimators under RCS, with proper correction, are more efficient than their parallels under random sampling. These properties are investigated both theoretically and through numerical demonstration. Published 2018. This article is a U.S. Government work and is in the public domain in the USA.
Inada, Naohisa; Oguri, Masamune; Shin, Min-Su; Kayo, Issha; Fukugita, Masataka; Strauss, Michael A.; Gott, J. Richard; Hennawi, Joseph F.; Morokuma, Tomoki; Becker, Robert H.; Gregg, Michael D.; White, Richard L.; Kochanek, Christopher S.; Chiu, Kuenley; Johnston, David E.; Clocchiatti, Alejandro; Richards, Gordon T.; Schneider, Donald P.; Frieman, Joshua A.
2010-01-01
We present the second report of our systematic search for strongly lensed quasars from the data of the Sloan Digital Sky Survey (SDSS). From extensive follow-up observations of 136 candidate objects, we find 36 lenses in the full sample of 77,429 spectroscopically confirmed quasars in the SDSS Data Release 5. We then define a complete sample of 19 lenses, including 11 from our previous search in the SDSS Data Release 3, from the sample of 36,287 quasars, and derive Ω_Λ = 0.84 +0.06/−0.08 (stat.) +0.09/−0.07 (syst.) assuming a flat universe, which is in good agreement with other cosmological observations. We also report the discoveries of seven binary quasars with separations ranging from 1.''1 to 16.''6, which are identified in the course of our lens survey. This study concludes the construction of our statistical lens sample in the full SDSS-I data set.
Enqvist, Andreas
2008-03-01
One particular purpose of nuclear safeguards, in addition to accounting for known materials, is the detection, identification, and quantification of unknown materials, to prevent accidental and clandestine transport and use of nuclear materials. This can be achieved in a non-destructive way through the various physical and statistical properties of particle emission and detection from such materials. This thesis addresses some fundamental aspects of nuclear materials and the ways they can be detected and quantified by such methods. Factorial moments, or multiplicities, have long been used within the safeguards area. These are low-order moments of the underlying number distributions of emission and detection. One objective of the present work was to determine the full probability distribution and its dependence on the sample mass and the detection process. Derivation and analysis of the full probability distribution and its dependence on the above factors constitutes the first part of the thesis. Another possibility for identifying unknown samples lies in the information in the 'fingerprints' (pulse shape distribution) left by a detected neutron or photon. A study of the statistical properties of the interaction of the incoming radiation (neutrons and photons) with the detectors constitutes the second part of the thesis. The interaction between fast neutrons and organic scintillation detectors is derived and compared to Monte Carlo simulations. An experimental approach is also addressed, in which cross-correlation measurements were made using liquid scintillation detectors. First, the dependence of the pulse height distribution on the energy and collision number of an incoming neutron was derived analytically and compared to numerical simulations. Then an algorithm was elaborated which can discriminate neutron pulses from photon pulses. The resulting cross-correlation graphs are analyzed, and it is discussed whether they can be used in applications to distinguish possible sample
Statistical model for degraded DNA samples and adjusted probabilities for allelic drop-out
Tvedebrink, Torben; Eriksen, Poul Svante; Mogensen, Helle Smidt
2012-01-01
DNA samples found at a scene of crime or obtained from the debris of a mass disaster accident are often subject to degradation. When using the STR DNA technology, the DNA profile is observed via a so-called electropherogram (EPG), where the alleles are identified as signal peaks above...... data from degraded DNA, where cases with varying amounts of DNA and levels of degradation are investigated....
Theunissen, Raf; Kadosh, Jesse S.; Allen, Christian B.
2015-06-01
Spatially varying signals are typically sampled by collecting uniformly spaced samples irrespective of the signal content. For signals with inhomogeneous information content, this leads to unnecessarily dense sampling in regions of low interest or insufficient sample density at important features, or both. A new adaptive sampling technique is presented directing sample collection in proportion to local information content, capturing adequately the short-period features while sparsely sampling less dynamic regions. The proposed method incorporates a data-adapted sampling strategy on the basis of signal curvature, sample space-filling, variable experimental uncertainty and iterative improvement. Numerical assessment has indicated a reduction in the number of samples required to achieve a predefined uncertainty level overall while improving local accuracy for important features. The potential of the proposed method has been further demonstrated on the basis of Laser Doppler Anemometry experiments examining the wake behind a NACA0012 airfoil and the boundary layer characterisation of a flat plate.
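The idea of directing sample density by local information content can be sketched as follows, using the curvature of a known test signal as the information measure; the uniform floor fraction and the test function are illustrative choices, not the paper's exact algorithm.

```python
import numpy as np

def adaptive_sample_positions(f, a, b, n_coarse=200, n_samples=40):
    """Place samples with density proportional to local curvature |f''|,
    plus a small uniform floor so low-interest regions are still covered."""
    x = np.linspace(a, b, n_coarse)
    curv = np.abs(np.gradient(np.gradient(f(x), x), x))
    density = curv + 0.05 * curv.max() + 1e-12       # uniform floor
    cdf = np.cumsum(density)
    cdf /= cdf[-1]
    # Invert the sampling CDF at equally spaced quantiles -> positions.
    return np.interp(np.linspace(0.0, 1.0, n_samples), cdf, x)

# Test signal with one sharp feature: samples should cluster near x = 0.5.
xs = adaptive_sample_positions(lambda x: np.tanh(20 * (x - 0.5)), 0.0, 1.0)
print(np.mean((xs > 0.35) & (xs < 0.65)))
```

In an iterative variant, the density would be re-estimated from the samples collected so far rather than from a known function.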
Statistic analyses of the color experience according to the age of the observer.
Hunjet, Anica; Parac-Osterman, Durdica; Vucaj, Edita
2013-04-01
The psychological experience of color is a real state of communication between the environment and color, and it depends on the light source, the viewing angle, and in particular on the observer and his health condition. Hering's theory, or the theory of opponent processes, supposes that the cones situated in the retina of the eye are not sensitive to the three chromatic domains (red, green and purple-blue) independently, but produce a signal based on the principle of opposed pairs of colors. Support for this theory comes from the fact that certain disorders of color eyesight, which include blindness to certain colors, cause blindness to pairs of opponent colors. This paper presents the experience of blue and yellow tones according to the age of the observer. To test for statistically significant differences in color experience according to the color of the background, we use the following statistical tests: the Mann-Whitney U test, Kruskal-Wallis ANOVA, and the median test. The differences were shown to be statistically significant for elderly persons (older than 35 years).
Edjabou, Maklawe Essonanawe; Jensen, Morten Bang; Götze, Ramona; Pivnenko, Kostyantyn; Petersen, Claus; Scheutz, Charlotte; Astrup, Thomas Fruergaard
2015-01-01
Highlights: • Tiered approach to waste sorting ensures flexibility and facilitates comparison of solid waste composition data. • Food and miscellaneous wastes are the main fractions contributing to the residual household waste. • Separation of food packaging from food leftovers during sorting is not critical for determination of the solid waste composition. - Abstract: Sound waste management and optimisation of resource recovery require reliable data on solid waste generation and composition. In the absence of standardised and commonly accepted waste characterisation methodologies, various approaches have been reported in literature. This limits both comparability and applicability of the results. In this study, a waste sampling and sorting methodology for efficient and statistically robust characterisation of solid waste was introduced. The methodology was applied to residual waste collected from 1442 households distributed among 10 individual sub-areas in three Danish municipalities (both single and multi-family house areas). In total 17 tonnes of waste were sorted into 10–50 waste fractions, organised according to a three-level (tiered) approach facilitating comparison of the waste data between individual sub-areas with different fractionation (waste from one municipality was sorted at “Level III”, e.g. detailed, while the two others were sorted only at “Level I”). The results showed that residual household waste mainly contained food waste (42 ± 5%, mass per wet basis) and miscellaneous combustibles (18 ± 3%, mass per wet basis). The residual household waste generation rate in the study areas was 3–4 kg per person per week. Statistical analyses revealed that the waste composition was independent of variations in the waste generation rate. Both waste composition and waste generation rates were statistically similar for each of the three municipalities. While the waste generation rates were similar for each of the two housing types (single
Finite-sample instrumental variables Inference using an Asymptotically Pivotal Statistic
Bekker, P.; Kleibergen, F.R.
2001-01-01
The paper considers the K-statistic, Kleibergen's (2000) adaptation of the Anderson-Rubin (AR) statistic in instrumental variables regression. Compared to the AR statistic, this K-statistic shows improved asymptotic efficiency in terms of degrees of freedom in overidentified models and yet it shares,
ten Veldhuis, Marie-Claire; Schleiss, Marc
2017-04-01
Urban catchments are typically characterised by a more flashy nature of the hydrological response compared to natural catchments. Predicting flow changes associated with urbanisation is not straightforward, as they are influenced by interactions between impervious cover, basin size, drainage connectivity and stormwater management infrastructure. In this study, we present an alternative approach to statistical analysis of hydrological response variability and basin flashiness, based on the distribution of inter-amount times. We analyse inter-amount time distributions of high-resolution streamflow time series for 17 (semi-)urbanised basins in North Carolina, USA, ranging from 13 to 238 km2 in size. We show that in the inter-amount-time framework, sampling frequency is tuned to the local variability of the flow pattern, resulting in a different representation and weighting of high and low flow periods in the statistical distribution. This leads to important differences in the way the distribution quantiles, mean, coefficient of variation and skewness vary across scales and results in lower mean intermittency and improved scaling. Moreover, we show that inter-amount-time distributions can be used to detect regulation effects on flow patterns, identify critical sampling scales and characterise flashiness of hydrological response. The possibility to use both the classical approach and the inter-amount-time framework to identify minimum observable scales and analyse flow data opens up interesting areas for future research.
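The inter-amount-time framework described above can be sketched in a few lines: instead of sampling flow at fixed time steps, one records the time needed to accumulate each successive fixed discharge volume. The sketch below is an illustrative reconstruction, not the authors' implementation; the toy flow series, the 15-min time step, and the threshold amount are invented.

```python
def inter_amount_times(flow, dt, amount):
    """Compute inter-amount times: the time (in seconds) needed to
    accumulate each successive fixed `amount` of discharge volume,
    instead of sampling the flow at fixed time intervals.
    Coarse time resolution is not interpolated: several thresholds
    crossed within one step yield zero inter-amount times."""
    times, accumulated, t_last, t = [], 0.0, 0.0, 0.0
    for q in flow:
        accumulated += q * dt     # volume added during this step
        t += dt
        while accumulated >= amount:
            accumulated -= amount
            times.append(t - t_last)
            t_last = t
    return times

# Toy flow series (m^3/s at 15-min = 900 s resolution); values illustrative.
flow = [1.0] * 4 + [5.0] * 2 + [0.5] * 8
iat = inter_amount_times(flow, dt=900, amount=3600.0)
print(iat)  # short inter-amount times flag the high-flow (flashy) period
```

Note how the sampling adapts to the flow: the high-flow middle of the record produces short inter-amount times, while the low-flow tail produces long ones, which is exactly the reweighting of high and low flow periods the abstract describes.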
Time-of-Flight Measurements as a Possible Method to Observe Anyonic Statistics
Umucalılar, R. O.; Macaluso, E.; Comparin, T.; Carusotto, I.
2018-06-01
We propose a standard time-of-flight experiment as a method for observing the anyonic statistics of quasiholes in a fractional quantum Hall state of ultracold atoms. The quasihole states can be stably prepared by pinning the quasiholes with localized potentials and a measurement of the mean square radius of the freely expanding cloud, which is related to the average total angular momentum of the initial state, offers direct signatures of the statistical phase. Our proposed method is validated by Monte Carlo calculations for ν = 1/2 and 1/3 fractional quantum Hall liquids containing a realistic number of particles. Extensions to quantum Hall liquids of light and to non-Abelian anyons are briefly discussed.
Varekar, Vikas; Karmakar, Subhankar; Jha, Ramakar
2016-02-01
approach outperforms FA/PCA when limited water quality and extensive watershed information is available. The available water quality dataset is limited and FA/PCA-based approach fails to identify monitoring locations with higher variation, as these multivariate statistical approaches are data-driven. The priority/hierarchy and number of sampling sites designed by modified Sanders approach are well justified by the land use practices and observed river basin characteristics of the study area.
Comparison of long-term Moscow and Danish NLC observations: statistical results
P. Dalin
2006-11-01
Noctilucent clouds (NLC) are the highest clouds in the Earth's atmosphere, observed close to the mesopause at 80–90 km altitudes. Systematic NLC observations conducted in Moscow for the period of 1962–2005 and in Denmark for 1983–2005 are compared, and statistical results both for seasonally summarized NLC parameters and for individual NLC appearances are described. Careful attention is paid to the weather conditions during each season of observations; this turns out to be a very important factor both for NLC case studies and for long-term data set analysis. Time series of seasonal values show moderate similarity (taking into account the weather conditions) but, at the same time, the comparison of individual cases of NLC occurrence reveals substantial differences. There are positive trends in the Moscow and Danish normalized NLC brightness as well as a nearly zero trend in the Moscow normalized NLC occurrence frequency, but these long-term changes are not statistically significant. The quasi-ten-year cycle in NLC parameters is about 1 year shorter than the solar cycle during the same period. The characteristic scale of NLC fields is estimated for the first time and is found to be less than 800 km.
Early pack-off diagnosis in drilling using an adaptive observer and statistical change detection
Willersrud, Anders; Imsland, Lars; Blanke, Mogens
2015-01-01
in the well. A model-based adaptive observer is used to estimate these friction parameters as well as flow rates. Detecting changes to these estimates can then be used for pack-off diagnosis, which due to measurement noise is done using statistical change detection. Isolation of incident type and location … is done using a multivariate generalized likelihood ratio test, determining the change direction of the estimated mean values. The method is tested on simulated data from the commercial high-fidelity multi-phase simulator OLGA, where three different pack-offs at different locations and with different …
A Statistical Method to Get Surface Level Air-Temperature from Satellite Observations of Precipitable Water
Pankajakshan, T.; Shikauchi, A.; Sugimori, Y.; Kubota, M.
1993-01-01
… Ta and precipitable water. The rms errors of the SSMI-Ta in this case are found to be reduced to 1.0°C. Satellite-derived surface-level meteorological parameters are considered to be a better alternative to sparse ship … (Vol. 49, pp. 551–558, 1993)
Moddemeijer, R
In the case of two signals with independent pairs of observations (x(n),y(n)) a statistic to estimate the variance of the histogram based mutual information estimator has been derived earlier. We present such a statistic for dependent pairs. To derive this statistic it is necessary to avail of a
Understanding the Sampling Distribution and Its Use in Testing Statistical Significance.
Breunig, Nancy A.
Despite the increasing criticism of statistical significance testing by researchers, particularly in the publication of the 1994 American Psychological Association's style manual, statistical significance test results are still popular in journal articles. For this reason, it remains important to understand the logic of inferential statistics. A…
Singh, J; Jain, D C; Sharma, R S; Verghese, T
1996-06-01
Lot Quality Assurance Sampling (LQAS) and standard EPI methodology (30 cluster sampling) were used to evaluate immunization coverage in a Primary Health Center (PHC) where coverage levels were reported to be more than 85%. Of 27 sub-centers (lots) evaluated by LQAS, only 2 were accepted for child coverage, whereas none was accepted for tetanus toxoid (TT) coverage in mothers. LQAS data were combined to obtain an estimate of coverage in the entire population; 41% (95% CI 36-46) of infants were immunized appropriately for their ages, while 42% (95% CI 37-47) of their mothers had received a second/booster dose of TT. TT coverage in 149 contemporary mothers sampled in the EPI survey was also 42% (95% CI 31-52). Although results by the two sampling methods were consistent with each other, a big gap was evident between reported coverage (in children as well as mothers) and survey results. LQAS was found to be operationally feasible, but it cost 40% more and required 2.5 times more time than the EPI survey. LQAS, therefore, is not a good substitute for the current EPI methodology to evaluate immunization coverage in a large administrative area. However, LQAS has potential as a method to monitor health programs on a routine basis in small population sub-units, especially in areas with high and heterogeneously distributed immunization coverage.
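The LQAS logic summarized above (accept or reject each lot from a small sample, then pool lots for an overall estimate) can be sketched as follows. The sample size and decision value are hypothetical, since the abstract does not give the study's actual design parameters.

```python
# Illustrative sketch of a Lot Quality Assurance Sampling (LQAS) decision
# rule and a pooled coverage estimate. The sample size (19) and decision
# value (3) are invented for illustration.

def lqas_accept(n_immunized: int, n_sampled: int, decision_value: int) -> bool:
    """Accept the lot if the number of NON-immunized children found
    does not exceed the decision value d."""
    failures = n_sampled - n_immunized
    return failures <= decision_value

def combined_coverage(lots: list[tuple[int, int]]) -> float:
    """Pool LQAS samples across lots into a single coverage estimate
    (simple pooling; a weighted estimator would use lot populations)."""
    immunized = sum(x for x, _ in lots)
    sampled = sum(n for _, n in lots)
    return immunized / sampled

# Example: three sub-centers, 19 children sampled in each,
# decision value d = 3 (reject the lot if >3 unimmunized are found).
lots = [(17, 19), (12, 19), (9, 19)]
decisions = [lqas_accept(x, n, 3) for x, n in lots]
print(decisions)
print(round(combined_coverage(lots), 2))
```

This captures the abstract's two uses of LQAS: the per-lot accept/reject decision for local monitoring, and the pooled estimate for the whole PHC population.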
Statistical survey of day-side magnetospheric current flow using Cluster observations: magnetopause
E. Liebert
2017-05-01
We present a statistical survey of current structures observed by the Cluster spacecraft at high-latitude day-side magnetopause encounters in the close vicinity of the polar cusps. Making use of the curlometer technique and the fluxgate magnetometer data, we calculate the 3-D current densities and investigate the magnetopause current direction, location, and magnitude during varying solar wind conditions. We find that the orientation of the day-side current structures is in accordance with existing magnetopause current models. Based on the ambient plasma properties, we distinguish five different transition regions at the magnetopause surface and observe distinctive current properties for each region. Additionally, we find that the location of currents varies with respect to the onset of the changes in the plasma environment during magnetopause crossings.
Jensen, P; Krogsgaard, M R; Christiansen, J
1995-01-01
… of adenomas were assessed twice by three experienced pathologists, with an interval of two months. Results were analyzed using kappa statistics. RESULTS: For agreement between first and second assessment (both type and grade of dysplasia), kappa values for the three specialists were 0.5345, 0.9022, and 0… The kappa values for Observer A vs. B and Observer C vs. B were 0.3480 and 0.3770, respectively (both type and dysplasia). Values for type were better than for dysplasia, but agreement was only fair to moderate. CONCLUSION: The interobserver agreement was moderate to almost perfect, but the intraobserver agreement was only fair to moderate. A simpler classification system or a centralization of assessments would probably increase kappa values.
Schoenwiese, C D [J.W. Goethe Univ., Frankfurt (Germany). Inst. for Meteorology and Geophysics
1996-12-31
It is a well-known fact that human activities lead to an atmospheric concentration increase of some IR-active trace gases (greenhouse gases, GHG) and that this influence enhances the 'greenhouse effect'. However, there are major quantitative and regional uncertainties in the related climate model projections, and the observational data reflect the whole complex of both anthropogenic and natural forcing of the climate system. This contribution aims at the separation of the anthropogenic enhanced greenhouse signal in observed global surface air temperature data versus other forcings, using statistical methods such as multiple (multiforced) regressions and neural networks. The competing natural forcings considered are volcanic and solar activity and, in addition, the ENSO (El Nino/Southern Oscillation) mechanism. This analysis will also be extended to the NAO (North Atlantic Oscillation) and to anthropogenic sulfate formation in the troposphere.
Statistical Analysis of Langmuir Waves Associated with Type III Radio Bursts: I. Wind Observations
Vidojević S.
2011-12-01
Interplanetary electron beams are unstable in the solar wind and they generate Langmuir waves at the local plasma frequency or its harmonic. Radio observations of the waves in the range 4–256 kHz, observed in 1994–2010 with the WAVES experiment onboard the WIND spacecraft, are statistically analyzed. A subset of 36 events with Langmuir waves and type III bursts occurring at the same time was selected. After removal of the background, the remaining power spectral density is modeled by the Pearson system of probability distributions (types I, IV and VI). The Stochastic Growth Theory (SGT) predicts a log-normal distribution for the power spectral density of the Langmuir waves. Our results indicate that SGT possibly requires further verification.
Abbey, Craig K.; Samuelson, Frank W.; Gallas, Brandon D.; Boone, John M.; Niklason, Loren T.
2013-03-01
The receiver operating characteristic (ROC) curve has become a common tool for evaluating diagnostic imaging technologies, and the primary endpoint of such evaluations is the area under the curve (AUC), which integrates sensitivity over the entire false positive range. An alternative figure of merit for ROC studies is expected utility (EU), which focuses on the relevant region of the ROC curve as defined by disease prevalence and the relative utility of the task. However, if this measure is to be used, it must also have desirable statistical properties to keep the burden of observer performance studies as low as possible. Here, we evaluate effect size and variability for EU and AUC. We use two observer performance studies recently submitted to the FDA to compare the EU and AUC endpoints. The studies were conducted using the multi-reader multi-case methodology in which all readers score all cases in all modalities. ROC curves from the study were used to generate both the AUC and EU values for each reader and modality. The EU measure was computed assuming an iso-utility slope of 1.03. We find mean effect sizes, the reader-averaged difference between modalities, to be roughly 2.0 times as large for EU as for AUC. The standard deviation across readers is roughly 1.4 times as large, suggesting better statistical properties for the EU endpoint. In a simple power analysis of paired comparison across readers, the utility measure required 36% fewer readers on average to achieve 80% statistical power compared to AUC.
H. Kojima
1999-01-01
We present the characteristics of the Electrostatic Solitary Waves (ESW) observed by the Geotail spacecraft in the plasma sheet boundary layer, based on statistical analyses. We also discuss the results referring to a model of ESW generation due to electron beams, which is proposed by computer simulations. In this generation model, the nonlinear evolution of Langmuir waves excited by electron bump-on-tail instabilities leads to the formation of isolated electrostatic potential structures corresponding to "electron holes" in the phase space. The statistical analyses of the Geotail data, which we conducted under the assumption that the polarity of ESW potentials is positive, show that most ESW propagate in the same direction as the electron beams observed simultaneously by the plasma instrument. Further, we also find that the ESW potential energy is much smaller than the background electron thermal energy and that the ESW potential widths are typically shorter than 60 local electron Debye lengths when we assume that the ESW potentials travel at the same velocity as the electron beams. These results are very consistent with the ESW generation model in which the nonlinear evolution of the electron bump-on-tail instability leads to the formation of electron holes in the phase space.
Mathematical background and attitudes toward statistics in a sample of Spanish college students.
Carmona, José; Martínez, Rafael J; Sánchez, Manuel
2005-08-01
To examine the relation of mathematical background and initial attitudes toward statistics of Spanish college students in social sciences the Survey of Attitudes Toward Statistics was given to 827 students. Multivariate analyses tested the effects of two indicators of mathematical background (amount of exposure and achievement in previous courses) on the four subscales. Analysis suggested grades in previous courses are more related to initial attitudes toward statistics than the number of mathematics courses taken. Mathematical background was related with students' affective responses to statistics but not with their valuing of statistics. Implications of possible research are discussed.
Statistics of EMIC Rising Tones Observed by the Van Allen Probes
Sigsbee, K. M.; Kletzing, C.; Smith, C. W.; Santolik, O.
2017-12-01
We will present results from an ongoing statistical study of electromagnetic ion cyclotron (EMIC) wave rising tones observed by the Van Allen Probes. Using data from the Electric and Magnetic Field Instrument Suite and Integrated Science (EMFISIS) fluxgate magnetometer, we have identified orbits by both Van Allen Probes with EMIC wave events from the start of the mission in fall 2012 through fall 2016. Orbits with EMIC wave events were further examined for evidence of rising tones. Most EMIC wave rising tones were found during H+ band EMIC wave events. In Fourier time-frequency power spectrograms of the fluxgate magnetometer data, H+ band rising tones generally took the form of triggered emission type events, where the discrete rising tone structures rapidly rise in frequency out of the main band of observed H+ EMIC waves. A smaller percentage of EMIC wave rising tone events were found in the He+ band, where rising tones may appear as discrete structures with a positive slope embedded within the main band of observed He+ EMIC waves, similar in appearance to whistler-mode chorus elements. Understanding the occurrence rate and properties of rising tone EMIC waves will provide observational context for theoretical studies indicating that EMIC waves exhibiting non-linear behavior, such as rising tones, may be more effective at scattering radiation belt electrons than ordinary EMIC waves.
Marke, Tobias; Ebell, Kerstin; Löhnert, Ulrich; Turner, David D.
2016-12-01
In this article, liquid water cloud microphysical properties are retrieved by a combination of microwave and infrared ground-based observations. Clouds containing liquid water occur frequently in most climate regimes and play a significant role in terms of interaction with radiation. Small perturbations in the amount of liquid water contained in the cloud can cause large variations in the radiative fluxes. This effect is enhanced for thin clouds (low liquid water path, LWP), making accurate knowledge of the cloud properties crucial. Due to large relative errors in retrieving low LWP values from observations in the microwave domain, and a high sensitivity of infrared methods when the LWP is low, a synergistic retrieval based on a neural network approach is built to estimate both LWP and cloud effective radius (reff). These statistical retrievals can be applied without high computational demand but imply constraints like prior information on cloud phase and cloud layering. The neural network retrievals are able to retrieve LWP and reff for thin clouds with a mean relative error of 9% and 17%, respectively. This is demonstrated using synthetic observations of a microwave radiometer (MWR) and a spectrally highly resolved infrared interferometer. The accuracy and robustness of the synergistic retrievals is confirmed by a low bias in a radiative closure study for the downwelling shortwave flux, even for marginally invalid scenes. Also, broadband infrared radiance observations, in combination with the MWR, have the potential to retrieve LWP with a higher accuracy than a MWR-only retrieval.
Davids, J. C.; Rutten, M.; Van De Giesen, N.
2016-12-01
Hydrologic data has traditionally been collected with permanent installations of sophisticated and relatively accurate but expensive monitoring equipment at limited numbers of sites. Consequently, the spatial coverage of the data is limited and costs are high. Achieving adequate maintenance of sophisticated monitoring equipment often exceeds local technical and resource capacity, and permanently deployed monitoring equipment is susceptible to vandalism, theft, and other hazards. Rather than using expensive, vulnerable installations at a few points, SmartPhones4Water (S4W), a form of Citizen Hydrology, leverages widely available mobile technology to gather hydrologic data at many sites in a manner that is repeatable and scalable. However, there is currently a limited understanding of the impact of decreased observational frequency on the accuracy of key streamflow statistics like minimum flow, maximum flow, and runoff. As a first step towards evaluating the tradeoffs between traditional continuous monitoring approaches and emerging Citizen Hydrology methods, we randomly selected 50 active U.S. Geological Survey (USGS) streamflow gauges in California. We used historical 15 minute flow data from 01/01/2008 through 12/31/2014 to develop minimum flow, maximum flow, and runoff values (7 year total) for each gauge. In order to mimic lower frequency Citizen Hydrology observations, we developed a bootstrap randomized subsampling with replacement procedure. We calculated the same statistics, along with their respective distributions, from 50 subsample iterations with four different subsampling intervals (i.e. daily, three day, weekly, and monthly). Based on our results we conclude that, depending on the types of questions being asked, and the watershed characteristics, Citizen Hydrology streamflow measurements can provide useful and accurate information. Depending on watershed characteristics, minimum flows were reasonably estimated with subsample intervals ranging from
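The bootstrap subsampling-with-replacement procedure described above can be sketched on a synthetic record: subsample a dense flow series at a coarser interval with a random starting offset, repeat, and examine the spread of the resulting statistics. The diurnal flow pattern and the subsampling steps here are illustrative assumptions, not the USGS data.

```python
import random

def subsample_stats(flow, step, iterations=50, seed=0):
    """Mimic low-frequency citizen observations: subsample a 15-min flow
    series every `step` values with a bootstrapped random starting offset,
    and collect (min, max, estimated total runoff) per iteration."""
    rng = random.Random(seed)
    results = []
    for _ in range(iterations):
        offset = rng.randrange(step)       # random phase, drawn with replacement
        sub = flow[offset::step]
        runoff_estimate = sum(sub) * step  # rescale to the full record length
        results.append((min(sub), max(sub), runoff_estimate))
    return results

# Synthetic flow record with a diurnal cycle (96 15-min values per day).
flow = [10 + 5 * ((i % 96) / 96) for i in range(96 * 30)]
true_runoff = sum(flow)

daily = subsample_stats(flow, step=96)        # ~1 observation per day
errors = [abs(r[2] - true_runoff) / true_runoff for r in daily]
print(f"daily-sampling runoff error range: {min(errors):.1%} to {max(errors):.1%}")
```

On this deliberately cyclic record, daily sampling at a fixed phase aliases the diurnal cycle, which is exactly the kind of tradeoff between observation frequency and statistic accuracy the study quantifies.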
MILLIMETER OBSERVATIONS OF A SAMPLE OF HIGH-REDSHIFT OBSCURED QUASARS
Martinez-Sansigre, Alejo; Karim, Alexander; Schinnerer, Eva
2009-01-01
We present observations at 1.2 mm with the Max-Planck Millimetre Bolometer Array (MAMBO-II) of a sample of z ≳ 2 radio-intermediate obscured quasars, as well as CO observations of two sources with the Plateau de Bure Interferometer. The typical rms noise achieved by the MAMBO observations is 0.55 mJy beam⁻¹ and five out of 21 sources (24%) are detected at a significance of ≥3σ. Stacking all sources leads to a statistical detection of ⟨S_1.2mm⟩ = 0.96 ± 0.11 mJy and stacking only the non-detections also yields a statistical detection, with ⟨S_1.2mm⟩ = 0.51 ± 0.13 mJy. At the typical redshift of the sample, z = 2, 1 mJy corresponds to a far-infrared luminosity L_FIR ∼ 4 × 10¹² L_⊙. If the far-infrared luminosity is powered entirely by star formation, and not by active galactic nucleus heated dust, then the characteristic inferred star formation rate is ∼700 M_⊙ yr⁻¹. This far-infrared luminosity implies a dust mass of M_d ∼ 3 × 10⁸ M_⊙, which is expected to be distributed on ∼kpc scales. We estimate that such large dust masses on kpc scales can plausibly cause the obscuration of the quasars. Combining our observations at 1.2 mm with mid- and far-infrared data, and additional observations for two objects at 350 μm using SHARC-II, we present dust spectral energy distributions (SEDs) for our sample and derive a mean SED for our sample. This mean SED is not well fitted by clumpy torus models, unless additional extinction and far-infrared re-emission due to cool dust are included. This additional extinction can be consistently achieved by the mass of cool dust responsible for the far-infrared emission, provided the bulk of the dust is within a radius ∼2–3 kpc. Comparison of our sample to other samples of z ∼ 2 quasars suggests that obscured quasars have, on average, higher far-infrared luminosities than unobscured quasars. There is a hint that the host galaxies of obscured quasars must have higher cool-dust masses and are therefore often
Thompson, Steven K
2012-01-01
Praise for the Second Edition "This book has never had a competitor. It is the only book that takes a broad approach to sampling . . . any good personal statistics library should include a copy of this book." —Technometrics "Well-written . . . an excellent book on an important subject. Highly recommended." —Choice "An ideal reference for scientific researchers and other professionals who use sampling." —Zentralblatt Math Features new developments in the field combined with all aspects of obtaining, interpreting, and using sample data Sampling provides an up-to-date treat
Small, coded, pill-sized tracers embedded in grain are proposed as a method for grain traceability. A sampling process for a grain traceability system was designed and investigated by applying probability statistics using a science-based sampling approach to collect an adequate number of tracers fo...
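The abstract's science-based sampling question (how much grain must be sampled to recover an adequate number of tracers) can be reasoned about with a simple well-mixed binomial model. This is a generic sketch, not the authors' method; the tracer concentration and confidence level are invented for illustration.

```python
import math

def p_at_least_k_tracers(n_kernels, tracer_fraction, k=1):
    """Probability that a grain sample of n kernels contains at least k
    tracers, assuming tracers are uniformly mixed (binomial model)."""
    p_fewer = sum(
        math.comb(n_kernels, i)
        * tracer_fraction**i * (1 - tracer_fraction)**(n_kernels - i)
        for i in range(k)
    )
    return 1 - p_fewer

def sample_size_for_confidence(tracer_fraction, confidence=0.95):
    """Smallest sample size whose probability of holding >= 1 tracer
    meets the target confidence: n >= log(1 - c) / log(1 - f)."""
    return math.ceil(math.log(1 - confidence) / math.log(1 - tracer_fraction))

# Hypothetical: one tracer per 10,000 kernels, 95% confidence target.
n = sample_size_for_confidence(1e-4, 0.95)
print(n, p_at_least_k_tracers(n, 1e-4) >= 0.95)
```

The closed-form bound comes from requiring the zero-tracer probability (1 − f)ⁿ to drop below 1 − c; real designs would also account for tracer clustering and sampler geometry.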
Exploring Tree Age & Diameter to Illustrate Sample Design & Inference in Observational Ecology
Casady, Grant M.
2015-01-01
Undergraduate biology labs often explore the techniques of data collection but neglect the statistical framework necessary to express findings. Students can be confused about how to use their statistical knowledge to address specific biological questions. Growth in the area of observational ecology requires that students gain experience in…
Lee, Kyung Hoon; Park, Ho Jin; Lee, Chung Chan; Cho, Jin Young [Korea Atomic Energy Research Institute, Daejeon (Korea, Republic of)
2015-10-15
The purpose of this paper is to study the effect on output parameters in the lattice physics calculation due to input uncertainties such as manufacturing deviations from nominal values for material composition and geometric dimensions. In a nuclear design and analysis, the lattice physics calculations are usually employed to generate lattice parameters for the nodal core simulation and pin power reconstruction. These lattice parameters, which consist of homogenized few-group cross-sections, assembly discontinuity factors, and form-functions, can be affected by input uncertainties which arise from three different sources: 1) multi-group cross-section uncertainties, 2) the uncertainties associated with methods and modeling approximations utilized in lattice physics codes, and 3) fuel/assembly manufacturing uncertainties. In this paper, data provided by the light water reactor (LWR) uncertainty analysis in modeling (UAM) benchmark has been used as the manufacturing uncertainties. First, the effect of each input parameter has been investigated through sensitivity calculations at the fuel assembly level. Then, uncertainty in the prediction of the peaking factor due to the most sensitive input parameter has been estimated using the statistical sampling method, often called the brute force method. For our analysis, the two-dimensional transport lattice code DeCART2D and its ENDF/B-VII.1 based 47-group library were used to perform the lattice physics calculation. Sensitivity calculations have been performed in order to study the influence of manufacturing tolerances on the lattice parameters. The manufacturing tolerance that has the largest influence on the k-inf is the fuel density. The second most sensitive parameter is the outer clad diameter.
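The brute-force statistical sampling mentioned above can be illustrated with a toy stand-in for the lattice code: sample each manufacturing parameter from its tolerance distribution, run the model, and read the output spread. The response function and tolerance values below are entirely hypothetical, not DeCART2D or UAM benchmark values.

```python
import random
import statistics

def toy_lattice_model(fuel_density, clad_outer_diam):
    """Stand-in for a lattice physics calculation: maps two manufacturing
    parameters to a 'peaking factor' (entirely hypothetical response)."""
    return 1.45 + 0.8 * (fuel_density - 10.4) - 0.5 * (clad_outer_diam - 0.95)

rng = random.Random(7)
samples = []
for _ in range(1000):
    # Sample each input from a normal distribution centred on its nominal
    # value, with an assumed manufacturing standard deviation.
    density = rng.gauss(10.4, 0.05)    # g/cm^3 (illustrative tolerance)
    diameter = rng.gauss(0.95, 0.003)  # cm (illustrative tolerance)
    samples.append(toy_lattice_model(density, diameter))

mean = statistics.mean(samples)
sigma = statistics.stdev(samples)
print(f"peaking factor: {mean:.3f} +/- {sigma:.3f}")
```

In this toy setup the fuel-density term dominates the output spread, mirroring the paper's finding that fuel density is the most influential tolerance.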
Statistical analysis of tiny SXR flares observed by SphinX
Gryciuk, Magdalena; Siarkowski, Marek; Sylwester, Janusz; Kepa, Anna; Gburek, Szymon; Mrozek, Tomasz; Podgórski, Piotr
2015-08-01
The Solar Photometer in X-rays (SphinX) was designed to observe soft X-ray solar emission in the energy range between ~1 keV and 15 keV with a resolution better than 0.5 keV. The instrument operated from February until November 2009 aboard the CORONAS-Photon satellite, during an exceptionally deep minimum of solar activity. Here we use SphinX data for the analysis of micro-flares and brightenings. Despite the very low activity, more than a thousand small X-ray events have been recognized by semi-automatic inspection of SphinX light curves. A catalogue of temporal and physical characteristics of these events is shown and discussed, and results of the statistical analysis of the catalogue data are presented.
Statistical Study of the Properties of Magnetosheath Lion Roars using MMS observations
Giagkiozis, S.; Wilson, L. B., III
2017-12-01
Intense whistler-mode waves of very short duration are frequently encountered in the magnetosheath. These emissions have been linked to mirror mode waves and the Earth's bow shock. They can efficiently transfer energy between different plasma populations. These electromagnetic waves are commonly referred to as Lion roars (LR), due to the sound generated when the signals are sonified. They are generally observed during dips of the magnetic field that are anti-correlated with increases of density. Using MMS data, we have identified more than 1750 individual LR burst intervals. Each emission was band-pass filtered and further split into >35,000 subintervals, for which the direction of propagation and the polarization were calculated. The analysis of subinterval properties provides a more accurate representation of their true nature than the more commonly used time- and frequency-averaged dynamic spectra analysis. The results of the statistical analysis of the wave properties will be presented.
Song, Myung Sub; Kim, Song Hyun; Kim, Jong Kyung [Hanyang Univ., Seoul (Korea, Republic of); Noh, Jae Man [Korea Atomic Energy Research Institute, Daejeon (Korea, Republic of)
2013-10-15
The uncertainty evaluation with the statistical method is performed by repeating the transport calculation with sampling of the directly perturbed nuclear data. Hence, a reliable uncertainty result can be obtained by analyzing the results of the numerous transport calculations. One known problem in the uncertainty analysis with the statistical approach is that sampling the cross sections from a normal (Gaussian) distribution with a relatively large standard deviation leads to sampling errors, such as the sampling of negative cross sections. Some correction methods have been noted; however, these methods can distort the distribution of the sampled cross sections. In this study, a sampling method for the nuclear data using a lognormal distribution is proposed. The criticality calculations with the sampled nuclear data are then performed and the results are compared with those from the normal distribution conventionally used in previous studies. The statistical sampling method of the cross section with the lognormal distribution was proposed to increase the sampling accuracy without negative sampling errors, and a stochastic cross-section sampling and writing program was developed. For the sensitivity and uncertainty analysis, the cross-section sampling was pursued with the normal and lognormal distributions. The uncertainties caused by the covariance of (n,γ) cross sections were evaluated by solving the GODIVA problem. The results show that the sampling method with the lognormal distribution can efficiently solve the negative sampling problem referred to in previous studies. It is expected that this study will contribute to increasing the accuracy of sampling-based uncertainty analysis.
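The negative-sampling problem and its lognormal fix can be demonstrated in a few lines: moment-match a lognormal to the target mean and standard deviation, so sampled cross sections stay positive by construction. The 60% relative uncertainty used here is an illustrative value, not a real covariance entry.

```python
import math
import random

def lognormal_params(mean, std):
    """Moment-match a lognormal to a target mean and standard deviation,
    returning (mu, sigma) of the underlying normal distribution."""
    sigma2 = math.log(1 + (std / mean) ** 2)
    mu = math.log(mean) - sigma2 / 2
    return mu, math.sqrt(sigma2)

rng = random.Random(42)
mean, std = 1.0, 0.6  # 60% relative uncertainty (illustrative)

# Naive Gaussian sampling of a positive quantity: some draws go negative.
normal_draws = [rng.gauss(mean, std) for _ in range(10000)]

# Lognormal sampling with the same first two moments: always positive.
mu, sigma = lognormal_params(mean, std)
lognormal_draws = [rng.lognormvariate(mu, sigma) for _ in range(10000)]

print("negative normal draws:", sum(x < 0 for x in normal_draws))
print("negative lognormal draws:", sum(x < 0 for x in lognormal_draws))
print("lognormal sample mean:", round(sum(lognormal_draws) / 10000, 2))
```

The moment-matching step matters: naively reusing the normal-space mean and standard deviation as (mu, sigma) would shift the sampled mean, distorting the distribution in exactly the way the abstract warns about for ad hoc corrections.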
7 CFR 52.38a - Definitions of terms applicable to statistical sampling.
2010-01-01
... the number of defects (or defectives), which exceed the sample unit tolerance (“T”), in a series of... accumulation of defects (or defectives) allowed to exceed the sample unit tolerance (“T”) in any sample unit or consecutive group of sample units. (ii) CuSum value. The accumulated number of defects (or defectives) that...
Ghanbari, Y.; Habibnia, A.; Memar, A.
2009-01-01
In a geochemical stream sediment survey of the Moghangegh Region in north-west Iran (1:50,000 sheet), 152 samples were collected. After analysis and processing of the data, it was revealed that the Yb, Sc, Ni, Li, Eu, Cd, Co, and As contents in one sample were far higher than in the other samples. After identifying this sample as an outlier, its effect on multivariate statistical data processing was investigated to assess the destructive influence of outlier samples in geochemical exploration. Pearson and Spearman correlation coefficient methods and cluster analysis were used for the multivariate studies, and scatter plots of selected element pairs together with regression profiles are given for the cases of 152 and 151 samples, and the results are compared. Examination of the multivariate statistical results showed that the presence of an outlier sample may produce the following relations between elements: a true relation between two elements, neither of which has an outlier frequency in the outlier sample; a false relation between two elements, one of which has an outlier frequency in the outlier sample; and a completely false relation between two elements, both of which have outlier frequencies in the outlier sample
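The distorting effect described above is easy to reproduce: a single sample that is extreme in two otherwise unrelated elements drives the Pearson coefficient toward 1, while the rank-based Spearman coefficient is far less affected. A self-contained sketch with hypothetical concentration values (not the survey data):

```python
import math

def pearson(x, y):
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return sxy / (sx * sy)

def ranks(v):
    """Replace each value by its rank (1 = smallest); no ties assumed."""
    order = sorted(range(len(v)), key=lambda i: v[i])
    r = [0.0] * len(v)
    for rank, i in enumerate(order, 1):
        r[i] = float(rank)
    return r

def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Two hypothetical, essentially unrelated element concentrations
# in ten "ordinary" samples:
x = [1.0, 2.0, 1.5, 2.2, 1.8, 1.1, 2.4, 1.3, 2.1, 1.7]
y = [2.0, 1.1, 1.6, 1.4, 1.9, 1.2, 1.5, 2.2, 1.0, 1.8]

# Adding one outlier sample, extreme in BOTH elements, manufactures a
# near-perfect Pearson correlation (a "false relation"); Spearman,
# being rank-based, stays close to the underlying weak correlation.
print(round(pearson(x, y), 2), round(pearson(x + [50.0], y + [50.0]), 2))
print(round(spearman(x + [50.0], y + [50.0]), 2))
```

This mirrors the paper's comparison of the 152-sample and 151-sample results: removing the outlier is equivalent to reverting from the inflated coefficient to the genuine one.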
Bell, Thomas L.; Kundu, Prasun K.; Einaudi, Franco (Technical Monitor)
2000-01-01
Estimates from TRMM satellite data of monthly total rainfall over an area are subject to substantial sampling errors due to the limited number of visits to the area by the satellite during the month. Quantitative comparisons of TRMM averages with data collected by other satellites and by ground-based systems require some estimate of the size of this sampling error. A method of estimating this sampling error based on the actual statistics of the TRMM observations and on some modeling work has been developed. "Sampling error" in TRMM monthly averages is defined here relative to the monthly total a hypothetical satellite permanently stationed above the area would have reported. "Sampling error" therefore includes contributions from the random and systematic errors introduced by the satellite remote sensing system. As part of our long-term goal of providing error estimates for each grid point accessible to the TRMM instruments, sampling error estimates for TRMM based on rain retrievals from TRMM microwave (TMI) data are compared for different times of the year and different oceanic areas (to minimize changes in the statistics due to algorithmic differences over land and ocean). Changes in sampling error estimates due to changes in rain statistics arising 1) from evolution of the official algorithms used to process the data, and 2) from differences relative to other remote sensing systems such as the Defense Meteorological Satellite Program (DMSP) Special Sensor Microwave/Imager (SSM/I), are analyzed.
Bonin, Timothy A.; Newman, Jennifer F.; Klein, Petra M.; Chilson, Phillip B.; Wharton, Sonia
2016-12-01
Since turbulence measurements from Doppler lidars are being increasingly used within wind energy and boundary-layer meteorology, it is important to assess and improve the accuracy of these observations. While turbulent quantities are measured by Doppler lidars in several different ways, the simplest and most frequently used statistic is vertical velocity variance (w'^2) from zenith stares. However, the competing effects of signal noise and resolution volume limitations, which respectively increase and decrease w'^2, reduce the accuracy of these measurements. Herein, an established method that utilises the autocovariance of the signal to remove noise is evaluated and its skill in correcting for volume-averaging effects in the calculation of w'^2 is also assessed. Additionally, this autocovariance technique is further refined by defining the amount of lag time to use for the most accurate estimates of w'^2. Through comparison of observations from two Doppler lidars and sonic anemometers on a 300 m tower, the autocovariance technique is shown to generally improve estimates of w'^2. After the autocovariance technique is applied, values of w'^2 from the Doppler lidars are generally in close agreement (R^2 ≈ 0.95-0.98) with those calculated from sonic anemometer measurements.
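The lag-based noise removal the abstract evaluates can be sketched generically: white instrument noise is uncorrelated between samples, so it inflates only the lag-0 autocovariance; extrapolating the autocovariance at small nonzero lags back to lag 0 recovers a noise-free variance estimate. The AR(1) test signal, noise level, and choice of lags below are illustrative assumptions, not the paper's configuration:

```python
import math
import random

def autocovariance(x, lag):
    """Sample autocovariance of series x at the given lag."""
    n = len(x)
    m = sum(x) / n
    return sum((x[i] - m) * (x[i + lag] - m) for i in range(n - lag)) / (n - lag)

def noise_corrected_variance(w, lags=(1, 2, 3)):
    """Extrapolate the autocovariance at small nonzero lags back to
    lag 0 with a linear fit; white noise does not contribute there."""
    xs = list(lags)
    ys = [autocovariance(w, k) for k in xs]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / \
        sum((a - mx) ** 2 for a in xs)
    return my - slope * mx  # intercept of the fitted line at lag 0

# Synthetic "vertical velocity": AR(1) signal with unit variance,
# plus white measurement noise of variance 0.25.
rng = random.Random(1)
phi, sig_eps = 0.95, math.sqrt(1 - 0.95 ** 2)
w, state = [], 0.0
for _ in range(50000):
    state = phi * state + rng.gauss(0.0, sig_eps)
    w.append(state + rng.gauss(0.0, 0.5))

raw = autocovariance(w, 0)               # inflated by the noise variance
corrected = noise_corrected_variance(w)  # close to the true value of 1
print(round(raw, 2), round(corrected, 2))
```

The gap between `raw` and `corrected` is an estimate of the noise variance itself, which is why the abstract can also ask how many lags give the most accurate w'^2: too few lags overfit noise in the autocovariance estimates, too many bend the extrapolation away from linearity.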
Statistical analysis of temperature data sampled at Station-M in the Norwegian Sea
Lorentzen, Torbjørn
2014-02-01
The paper analyzes sea temperature data sampled at Station-M in the Norwegian Sea. The data cover the period 1948-2010. The following questions are addressed: What type of stochastic process characterizes the temperature series? Are there any changes or patterns which indicate climate change? Are there any characteristics in the data which can be linked to the shrinking sea ice in the Arctic area? Can the series be modeled consistently and applied in forecasting of the future sea temperature? The paper applies the following methods: augmented Dickey-Fuller tests for unit roots and stationarity; ARIMA models for univariate modeling; cointegration and error-correction models for estimating the short- and long-term dynamics of non-stationary series; Granger-causality tests for analyzing the interaction pattern between the deep- and upper-layer temperatures; and simultaneous equation systems for forecasting future temperature. The paper shows that temperature at 2000 m Granger-causes temperature at 150 m, and that the 2000 m series can serve as an important information carrier for the long-term development of the sea temperature in the geographical area. Descriptive statistics show that the temperature level has been on a positive trend since the beginning of the 1980s, a trend also measured in most of the oceans in the North Atlantic. The analysis shows that the temperature series are cointegrated, which means they share the same long-term stochastic trend and do not diverge too far from each other. The measured long-term temperature increase is one of the factors that can explain the shrinking summer sea ice in the Arctic region. The analysis shows that there is a significant negative correlation between the shrinking sea ice and the sea temperature at Station-M. The paper shows that the temperature forecasts are conditioned on the properties of the stochastic processes, the causality pattern between the variables and the specification of the model
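A minimal illustration of the unit-root testing step (a textbook Dickey-Fuller regression, not the paper's full augmented specification with lagged differences): regress Δy_t on y_(t-1) with an intercept and inspect the t-statistic of the slope. Strongly negative values indicate a stationary series; values near zero are consistent with a unit root, i.e. a random-walk-like process. The simulated series are illustrative, not the Station-M data:

```python
import math
import random

def dickey_fuller_t(y):
    """t-statistic of beta in the OLS regression
       dy_t = alpha + beta * y_{t-1} + e_t."""
    x = y[:-1]
    dy = [b - a for a, b in zip(y[:-1], y[1:])]
    n = len(x)
    mx, md = sum(x) / n, sum(dy) / n
    sxx = sum((a - mx) ** 2 for a in x)
    beta = sum((a - mx) * (d - md) for a, d in zip(x, dy)) / sxx
    alpha = md - beta * mx
    resid = [d - alpha - beta * a for a, d in zip(x, dy)]
    s2 = sum(r * r for r in resid) / (n - 2)  # residual variance
    return beta / math.sqrt(s2 / sxx)

rng = random.Random(42)
shocks = [rng.gauss(0.0, 1.0) for _ in range(500)]

# Random walk: cumulative sum of shocks (has a unit root).
walk, s = [], 0.0
for e in shocks:
    s += e
    walk.append(s)

# Stationary AR(1) with phi = 0.5, driven by the same shocks.
ar, s = [], 0.0
for e in shocks:
    s = 0.5 * s + e
    ar.append(s)

t_walk = dickey_fuller_t(walk)  # near zero: cannot reject a unit root
t_ar = dickey_fuller_t(ar)      # strongly negative: stationary
print(round(t_walk, 2), round(t_ar, 2))
```

Under the unit-root null the statistic follows the nonstandard Dickey-Fuller distribution (5% critical value about -2.86 with an intercept), not Student's t, which is why dedicated tables are needed in practice.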
Cabalin, L.M.; Gonzalez, A. [Department of Analytical Chemistry, University of Malaga, E-29071 Malaga (Spain); Ruiz, J. [Department of Applied Physics I, University of Malaga, E-29071 Malaga (Spain); Laserna, J.J., E-mail: laserna@uma.e [Department of Analytical Chemistry, University of Malaga, E-29071 Malaga (Spain)
2010-08-15
Statistical uncertainty in the quantitative analysis of solid samples in motion by laser-induced breakdown spectroscopy (LIBS) has been assessed. For this purpose, a LIBS demonstrator was designed and constructed in our laboratory. The LIBS system consisted of a laboratory-scale conveyor belt, a compact optical module and a Nd:YAG laser operating at 532 nm. The speed of the conveyor belt was variable and could be adjusted up to a maximum speed of 2 m s^-1. Statistical uncertainty in the analytical measurements was estimated in terms of precision (reproducibility and repeatability) and accuracy. The results obtained by LIBS on shredded scrap samples under real conditions have demonstrated that the analytical precision and accuracy of LIBS is dependent on the sample geometry, position on the conveyor belt and surface cleanliness. Flat, relatively clean scrap samples exhibited acceptable reproducibility and repeatability; by contrast, samples with an irregular shape or a dirty surface exhibited a poor relative standard deviation.
Eberhardt, L.L.; Thomas, J.M.
1986-07-01
This project was designed to develop guidance for implementing 10 CFR Part 61 and to determine the overall needs for sampling and statistical work in characterizing, surveying, monitoring, and closing commercial low-level waste sites. When cost-effectiveness and statistical reliability are of prime importance, then double sampling, compositing, and stratification (with optimal allocation) are identified as key issues. If the principal concern is avoiding questionable statistical practice, then the applicability of kriging (for assessing spatial pattern), methods for routine monitoring, and use of standard textbook formulae in reporting monitoring results should be reevaluated. Other important issues identified include sampling for estimating model parameters and the use of data from left-censored (less than detectable limits) distributions
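"Stratification (with optimal allocation)", one of the key issues named above, usually refers to Neyman allocation: sampling effort proportional to N_h·σ_h, the stratum size times the stratum standard deviation. A minimal sketch with made-up stratum figures (not from the project):

```python
def neyman_allocation(n, stratum_sizes, stratum_stds):
    """Split a total sample size n across strata in proportion to
    N_h * sigma_h (Neyman/optimal allocation, equal sampling costs)."""
    weights = [N * s for N, s in zip(stratum_sizes, stratum_stds)]
    total = sum(weights)
    return [round(n * w / total) for w in weights]

# Hypothetical site: three areas; the most variable stratum gets the
# most samples even though it is the smallest.
print(neyman_allocation(100, [500, 300, 200], [1.0, 2.0, 4.0]))
# -> [26, 32, 42]
```

Compared with proportional allocation (which would give 50/30/20 here), concentrating effort in heterogeneous strata minimizes the variance of the stratified mean for a fixed total n, which is the cost-effectiveness argument made above.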
Statistical evolution of quiet-Sun small-scale magnetic features using Sunrise observations
Anusha, L. S.; Solanki, S. K.; Hirzberger, J.; Feller, A.
2017-02-01
The evolution of small magnetic features in quiet regions of the Sun provides a unique window for probing solar magneto-convection. Here we analyze small-scale magnetic features in the quiet Sun, using the high-resolution, seeing-free observations from the Sunrise balloon-borne solar observatory. Our aim is to understand the contribution of different physical processes, such as splitting, merging, emergence and cancellation of magnetic fields, to the rearrangement, addition and removal of magnetic flux in the photosphere. We have employed a statistical approach for the analysis, and the evolution studies are carried out using a feature-tracking technique. In this paper we provide a detailed description of the feature-tracking algorithm that we have newly developed, and we present the results of a statistical study of several physical quantities. The results on the fractions of the flux in emergence, appearance, splitting, merging, disappearance and cancellation qualitatively agree with other recent studies. To summarize, the total flux gained in unipolar appearance is an order of magnitude larger than the total flux gained in emergence. On the other hand, bipolar cancellation contributes nearly as much to the loss of magnetic flux as unipolar disappearance. The total flux lost in cancellation is nearly six to eight times larger than the total flux gained in emergence. One big difference between our study and previous similar studies is that, thanks to the higher spatial resolution of Sunrise, we can track features with fluxes as low as 9 × 10^14 Mx. This flux is nearly an order of magnitude lower than the smallest fluxes of the features tracked in the highest-resolution previous studies based on Hinode data. The area and flux of the magnetic features follow power-law type distributions, while the lifetimes show either power-law or exponential type distributions depending on the exact definitions used for the various birth and death events. We have
Office of Student Financial Assistance (ED), Washington, DC.
A manual on sampling is presented to assist audit and program reviewers, project officers, managers, and program specialists of the U.S. Office of Student Financial Assistance (OSFA). For each of the following types of samples, definitions and examples are provided, along with information on advantages and disadvantages: simple random sampling,…
The Statistics and Mathematics of High Dimension Low Sample Size Asymptotics.
Shen, Dan; Shen, Haipeng; Zhu, Hongtu; Marron, J S
2016-10-01
The aim of this paper is to establish several deep theoretical properties of principal component analysis for multiple-component spike covariance models. Our new results reveal an asymptotic conical structure in critical sample eigendirections under the spike models with distinguishable (or indistinguishable) eigenvalues, when the sample size and/or the number of variables (or dimension) tend to infinity. The consistency of the sample eigenvectors relative to their population counterparts is determined by the ratio between the dimension and the product of the sample size with the spike size. When this ratio converges to a nonzero constant, the sample eigenvector converges to a cone, with a certain angle to its corresponding population eigenvector. In the High Dimension, Low Sample Size case, the angle between the sample eigenvector and its population counterpart converges to a limiting distribution. Several generalizations of the multi-spike covariance models are also explored, and additional theoretical results are presented.
MANAGERIAL DECISION IN INNOVATIVE EDUCATION SYSTEMS STATISTICAL SURVEY BASED ON SAMPLE THEORY
Gheorghe SĂVOIU
2012-12-01
Before formulating the statistical hypotheses and the econometric testing itself, a breakdown of some technical issues is required. These relate to managerial decision-making in innovative educational systems: the educational managerial phenomenon tested through statistical and mathematical methods, namely the significant difference in perceiving current qualities, knowledge, experience, behaviour and desirable health, obtained through a questionnaire applied to a stratified population in the educational environment, either with educational activities or with simultaneously managerial and educational activities. The details of the research, which is focused on survey theory and turns the questionnaires and the statistical data processed from them into working tools, are summarized below.
Nilsson, H.; Waara, M.; Arvelius, S.; Yamauchi, M.; Lundin, R. [Inst. of Space Physics, Kiruna (Sweden); Marghitu, O. [Max-Planck-Inst. fuer Extraterrestriche Physik, Garching (Germany); Inst. for Space Sciences, Bucharest (Romania); Bouhram, M. [Max-Planck-Inst. fuer Extraterrestriche Physik, Garching (Germany); CETP-CNRS, Saint-Maur (France); Hobara, Y. [Inst. of Space Physics, Kiruna (Sweden); Univ. of Sheffield, Sheffield (United Kingdom); Reme, H.; Sauvaud, J.A.; Dandouras, I. [Centre d' Etude Spatiale des Rayonnements, Toulouse (France); Balogh, A. [Imperial Coll. of Science, Technology and Medicine, London (United Kingdom); Kistler, L.M. [Univ. of New Hampshire, Durham (United States); Klecker, B. [Max-Planck-Inst. fuer Extraterrestriche Physik, Garching (Germany); Carlson, C.W. [Space Science Lab., Univ. of California, Berkeley (United States); Bavassano-Cattaneo, M.B. [Ist. di Fisica dello Spazio Interplanetario, Roma (Italy); Korth, A. [Max-Planck-Inst. fuer Sonnensystemforschung, Katlenburg-Lindau (Germany)
2006-07-01
The results of a statistical study of oxygen ion outflow using Cluster data obtained at high altitude above the polar cap are reported. Moment data for both hydrogen ions (H^+) and oxygen ions (O^+) from 3 years (2001-2003) of spring orbits (January to May) have been used. The altitudes covered were mainly in the range 5-12 R_E geocentric distance. It was found that O^+ is significantly transversely energized at high altitudes, indicated both by high perpendicular temperatures for low magnetic field values as well as by a tendency towards higher perpendicular than parallel temperature distributions for the highest observed temperatures. The O^+ parallel bulk velocity increases with altitude, in particular for the lowest observed altitude intervals. O^+ parallel bulk velocities in excess of 60 km s^-1 were found mainly at higher altitudes corresponding to magnetic field strengths of less than 100 nT. For the highest observed parallel bulk velocities of O^+ the thermal velocity exceeds the bulk velocity, indicating that the beam-like character of the distribution is lost. The parallel bulk velocities of H^+ and O^+ were found to typically be close to the same throughout the observation interval when the H^+ bulk velocity was calculated for all pitch angles. When the H^+ bulk velocity was calculated for upward moving particles only, the H^+ parallel bulk velocity was typically higher than that of O^+. The parallel bulk velocity is close to the same for a wide range of relative abundances of the two ion species, including when the O^+ ions dominate. The thermal velocity of O^+ was always well below that of H^+. Thus perpendicular energization that is more effective for O^+ takes place, but this is not enough to explain the close to similar parallel velocities. Further parallel acceleration must occur. The results presented constrain the models of perpendicular heating and parallel
Statistical and observational research of solar flare for total spectra and geometrical features
Nishimoto, S.; Watanabe, K.; Imada, S.; Kawate, T.; Lee, K. S.
2017-12-01
Impulsive energy release phenomena such as solar flares sometimes affect the solar-terrestrial environment. Usually, we use soft X-ray flux (GOES class) as the index of flare scale. However, the magnitude of the effect on the solar-terrestrial environment is not proportional to that scale. To identify the relationship between solar flare phenomena and their influence on the solar-terrestrial environment, we need to understand the full spectrum of solar flares. There is a solar flare irradiance model named the Flare Irradiance Spectral Model (FISM) (Chamberlin et al., 2006, 2007, 2008). The FISM can estimate solar flare spectra with high wavelength resolution. However, this model cannot express the time evolution of emitted plasma during the solar flare, and has low accuracy at short wavelengths, which strongly affect and/or control the total flare spectra. For the purpose of obtaining the time evolution of total solar flare spectra, we are performing statistical analysis of the electromagnetic data of solar flares. In this study, we select solar flare events larger than M-class from the Hinode flare catalogue (Watanabe et al., 2012). First, we focus on the EUV emission observed by SDO/EVE. We examined the intensities and time evolutions of five EUV lines for 55 flare events. As a result, we found a positive correlation between the "soft X-ray flux" and the "EUV peak flux" for all EUV lines. Moreover, we found that hot lines peak earlier than cool lines in the EUV light curves. We also examined the hard X-ray data obtained by RHESSI. When we analyzed 163 events, we found a good correlation between the "hard X-ray intensity" and the "soft X-ray flux". Because the geometrical features of solar flares seem to affect those time evolutions, we also looked into flare ribbons observed by SDO/AIA. We examined 21 flare events, and found a positive correlation between the "GOES duration" and the "ribbon length". We also found a positive correlation between the "ribbon
Moscati, A.F. Jr.; Hediger, E.M.; Rupp, M.J.
1986-01-01
High concentrations of lead in soils along an abandoned railroad line prompted a remedial investigation to characterize the extent of contamination across a 7-acre site. Contamination was thought to be spotty across the site, reflecting its past use in battery recycling operations at discrete locations. A screening technique was employed to delineate the more highly contaminated areas by testing a statistically determined minimum number of random samples from each of seven discrete site areas. The approach not only quickly identified those site areas which would require more extensive grid sampling, but also provided a statistically defensible basis for excluding other site areas from further consideration, thus saving the cost of additional sample collection and analysis. The reduction in the number of samples collected in "clean" areas of the site ranged from 45 to 60%
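A "statistically determined minimum number of random samples" per area is often computed from the normal-approximation sample-size formula for a proportion. A hedged sketch (the z, p, and margin values are illustrative defaults, not figures from this investigation):

```python
import math

def min_sample_size(z=1.96, p=0.5, margin=0.1):
    """Minimum n so that a proportion estimated near p is within
    +/- margin with confidence matching z (1.96 ~ 95%).
    p = 0.5 is the conservative worst case: n = z^2 p (1-p) / margin^2."""
    return math.ceil(z * z * p * (1.0 - p) / (margin * margin))

print(min_sample_size())            # 97 samples at 95% confidence, +/-10%
print(min_sample_size(margin=0.2))  # 25: a coarser margin needs far fewer
```

This is the economic lever the abstract describes: relaxing the margin for screening-level decisions sharply reduces the number of samples needed to defensibly declare an area "clean".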
Using student models to generate feedback in a university course on statistical sampling
Tacoma, S.G.; Drijvers, P.H.M.; Boon, P.B.J.
2017-01-01
Due to the complexity of the topic and a lack of individual guidance, introductory statistics courses at university are often challenging. Automated feedback might help to address this issue. In this study, we explore the use of student models to provide feedback. The research question is how
Constrained statistical inference : sample-size tables for ANOVA and regression
Vanbrabant, Leonard; Van De Schoot, Rens; Rosseel, Yves
2015-01-01
Researchers in the social and behavioral sciences often have clear expectations about the order/direction of the parameters in their statistical model. For example, a researcher might expect that regression coefficient β1 is larger than β2 and β3. The corresponding hypothesis is H: β1 > {β2, β3} and
Spatial scan statistics to assess sampling strategy of antimicrobial resistance monitoring programme
Vieira, Antonio; Houe, Hans; Wegener, Henrik Caspar
2009-01-01
The collection and analysis of data on antimicrobial resistance in human and animal populations are important for establishing a baseline of the occurrence of resistance and for determining trends over time. In animals, targeted monitoring with a stratified sampling plan is normally used. However... sampled by the Danish Integrated Antimicrobial Resistance Monitoring and Research Programme (DANMAP), by identifying spatial clusters of samples and detecting areas with significantly high or low sampling rates. These analyses were performed for each year and for the total 5-year study period for all... by an antimicrobial monitoring program.
Structure of Small and Medium-Sized Business: Results of Total Statistic Observations in Russia
Iuliia S. Pinkovetskaia
2018-03-01
The aim of the research is the estimation of regularities and tendencies characteristic of the modern sectoral structure of small and medium-sized business in Russia. The subject of the research is the set of processes of structural change across the types of economic activity of such enterprises, as well as the differentiation of the number of employees per enterprise. The research methodology considered aggregates of small and medium-sized businesses formed according to sectoral and territorial features. The initial data were the official statistics obtained in the course of the total observations of the activities of small and medium-sized businesses in 2010 and 2015. The study covered indicators characterizing the full range of legal entities and individual entrepreneurs in the country. The materiality of structural changes was assessed on the basis of the Ryabtsev index. Modeling of the differentiation of the number of employees per enterprise was based on fitting normal density distribution functions. The hypothesis is that the differentiation of the number of employees working in enterprises depends on the six main types of economic activity and on the subjects of Russia. The results proved that there were no significant structural changes over the period from 2010 to 2015, in terms of either the number of enterprises or the number of their employees. Based on the simulation results, average values of the number of employees for the six main types of activity were established, as well as intervals of variation of these indicators for the aggregates of small and medium-sized enterprises located in the majority of the country's subjects. The results of the research can be used in scientific work related to justifying the expected number of enterprises and of their employees, and the formation of
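The Ryabtsev index used above to assess the materiality of structural change is, as commonly defined, the square root of the ratio of the summed squared share differences to the summed squared share sums. This sketch assumes that standard definition; the sectoral shares are hypothetical, not the Russian survey figures:

```python
import math

def ryabtsev_index(shares_before, shares_after):
    """Ryabtsev index of structural difference between two share
    vectors (each summing to 1): 0 = identical structures,
    1 = completely opposite structures."""
    num = sum((b - a) ** 2 for a, b in zip(shares_before, shares_after))
    den = sum((b + a) ** 2 for a, b in zip(shares_before, shares_after))
    return math.sqrt(num / den)

# Hypothetical sectoral shares in 2010 vs 2015: a small shift gives a
# small index, consistent with "no significant structural changes".
print(round(ryabtsev_index([0.40, 0.35, 0.25], [0.41, 0.34, 0.25]), 3))
```

Published interpretation scales for the index grade values from "identical structure" near 0 up to "opposite structure" near 1, so a value this small falls in the lowest band.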
Saini, Komal; Singh, Parminder; Bajwa, Bikramjit Singh
2016-12-01
An LED fluorimeter has been used for microanalysis of uranium concentration in groundwater samples collected from six districts of South West (SW), West (W) and North East (NE) Punjab, India. The average value of uranium content in water samples of SW Punjab is observed to be higher than the WHO and USEPA recommended safe limit of 30 µg l^-1 as well as the AERB proposed limit of 60 µg l^-1, whereas for the W and NE regions of Punjab the average level of uranium concentration was within the AERB recommended limit of 60 µg l^-1. The average value observed in SW Punjab is around 3-4 times the value observed in W Punjab, and more than 17 times the average value observed in the NE region of Punjab. Carcinogenic as well as non-carcinogenic risks due to uranium have been statistically evaluated for each studied district.
Ji, Jun; Ling, Jeffrey; Jiang, Helen; Wen, Qiaojun; Whitin, John C; Tian, Lu; Cohen, Harvey J; Ling, Xuefeng B
2013-03-23
Mass spectrometry (MS) has evolved to become the primary high-throughput tool for proteomics-based biomarker discovery. Multiple challenges in protein MS data analysis remain: large-scale and complex data set management; MS peak identification and indexing; and high-dimensional peak differential analysis with the concurrent statistical-test-based false discovery rate (FDR). "Turnkey" solutions are needed for biomarker investigations to rapidly process MS data sets and identify statistically significant peaks for subsequent validation. Here we present an efficient and effective solution, which provides experimental biologists easy access to "cloud" computing capabilities to analyze MS data. The web portal can be accessed at http://transmed.stanford.edu/ssa/. The presented web application supports large-scale MS data online uploading and analysis with a simple user interface. This bioinformatic tool will facilitate the discovery of potential protein biomarkers using MS.
Hsiao, Chiaowen; Liu, Mengya; Stanton, Rick; McGee, Monnie; Qian, Yu
2015-01-01
Abstract Flow cytometry (FCM) is a fluorescence‐based single‐cell experimental technology that is routinely applied in biomedical research for identifying cellular biomarkers of normal physiological responses and abnormal disease states. While many computational methods have been developed that focus on identifying cell populations in individual FCM samples, very few have addressed how the identified cell populations can be matched across samples for comparative analysis. This article presents FlowMap‐FR, a novel method for cell population mapping across FCM samples. FlowMap‐FR is based on the Friedman–Rafsky nonparametric test statistic (FR statistic), which quantifies the equivalence of multivariate distributions. As applied to FCM data by FlowMap‐FR, the FR statistic objectively quantifies the similarity between cell populations based on the shapes, sizes, and positions of fluorescence data distributions in the multidimensional feature space. To test and evaluate the performance of FlowMap‐FR, we simulated the kinds of biological and technical sample variations that are commonly observed in FCM data. The results show that FlowMap‐FR is able to effectively identify equivalent cell populations between samples under scenarios of proportion differences and modest position shifts. As a statistical test, FlowMap‐FR can be used to determine whether the expression of a cellular marker is statistically different between two cell populations, suggesting candidates for new cellular phenotypes by providing an objective statistical measure. In addition, FlowMap‐FR can indicate situations in which inappropriate splitting or merging of cell populations has occurred during gating procedures. We compared the FR statistic with the symmetric version of Kullback–Leibler divergence measure used in a previous population matching method with both simulated and real data. The FR statistic outperforms the symmetric version of KL‐distance in distinguishing
Hsiao, Chiaowen; Liu, Mengya; Stanton, Rick; McGee, Monnie; Qian, Yu; Scheuermann, Richard H
2016-01-01
Flow cytometry (FCM) is a fluorescence-based single-cell experimental technology that is routinely applied in biomedical research for identifying cellular biomarkers of normal physiological responses and abnormal disease states. While many computational methods have been developed that focus on identifying cell populations in individual FCM samples, very few have addressed how the identified cell populations can be matched across samples for comparative analysis. This article presents FlowMap-FR, a novel method for cell population mapping across FCM samples. FlowMap-FR is based on the Friedman-Rafsky nonparametric test statistic (FR statistic), which quantifies the equivalence of multivariate distributions. As applied to FCM data by FlowMap-FR, the FR statistic objectively quantifies the similarity between cell populations based on the shapes, sizes, and positions of fluorescence data distributions in the multidimensional feature space. To test and evaluate the performance of FlowMap-FR, we simulated the kinds of biological and technical sample variations that are commonly observed in FCM data. The results show that FlowMap-FR is able to effectively identify equivalent cell populations between samples under scenarios of proportion differences and modest position shifts. As a statistical test, FlowMap-FR can be used to determine whether the expression of a cellular marker is statistically different between two cell populations, suggesting candidates for new cellular phenotypes by providing an objective statistical measure. In addition, FlowMap-FR can indicate situations in which inappropriate splitting or merging of cell populations has occurred during gating procedures. We compared the FR statistic with the symmetric version of Kullback-Leibler divergence measure used in a previous population matching method with both simulated and real data. The FR statistic outperforms the symmetric version of KL-distance in distinguishing equivalent from nonequivalent cell
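As a rough illustration of the Friedman-Rafsky idea (not the FlowMap-FR implementation itself), the statistic can be obtained by building a Euclidean minimum spanning tree over the pooled samples and counting edges that join points from different samples; few cross-sample edges, relative to the permutation null, indicate differing multivariate distributions. The data below are synthetic:

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.spatial.distance import cdist

def fr_cross_edges(x, y):
    """Count edges joining the two samples in the Euclidean minimum spanning
    tree of the pooled data. Few cross edges (relative to the permutation
    null) suggest the multivariate distributions differ."""
    pooled = np.vstack([x, y])
    labels = np.r_[np.zeros(len(x)), np.ones(len(y))]
    mst = minimum_spanning_tree(cdist(pooled, pooled)).tocoo()
    return int(sum(labels[i] != labels[j] for i, j in zip(mst.row, mst.col)))

rng = np.random.default_rng(0)
a = rng.normal(0, 1, size=(50, 3))
b = rng.normal(0, 1, size=(50, 3))  # same distribution as a: many cross edges
c = rng.normal(5, 1, size=(50, 3))  # well-separated cluster: few cross edges
print(fr_cross_edges(a, b), fr_cross_edges(a, c))
```

A full test would compare the observed count against its permutation distribution; this sketch only shows the statistic itself.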
Takahara, O; Toyoda, S; Tsuno, S; Mukai, H; Uemura, S [Nagasaki Atomic Bomb Hospital (Japan)]
1976-09-01
A statistical survey was made of autopsy cases of atomic-bomb survivors in Nagasaki. From the opening of the Japanese Red Cross Nagasaki Atomic Bomb Hospital in 1968 to December 1975, autopsies totaled 1,486 cases (autopsy rate 65.1%), of which 880 were atomic-bomb survivors (autopsy rate 68.0%). Malignancies were found in 829 cases, 528 of them in survivors. The malignancy cases were divided into three groups: those exposed within 2 km of the hypocenter, those exposed beyond 2 km or who entered the city after the explosion, and the non-exposed. The relationship between the main malignancies and exposure was examined, with the following results. 1) A clear relationship was found between exposure and acute and chronic myeloid leukemia. 2) Malignant lymphoma was scarcely correlated with exposure, but its occurrence rate was higher than the national mean for Japan, reflecting the geographic distribution of the disease. 3) No clear relationship was found between exposure and stomach cancer, lung cancer, cancer of the large intestine, or double cancer, but the occurrence rate of hepatic cancer was higher than the national mean in all three groups, presumably due to a geographical factor. 4) Cases of thyroid cancer in women and of prostate cancer in the group exposed within 2 km were few in number, but their occurrence rates were specifically high.
Dental Calculus Links Statistically to Angina Pectoris: 26-Year Observational Study.
Söder, Birgitta; Meurman, Jukka H; Söder, Per-Östen
2016-01-01
Dental infections, such as periodontitis, are associated with atherosclerosis and its complications. We studied a cohort followed up since 1985 for the incidence of angina pectoris, with the hypothesis that calculus accumulation, a proxy for poor oral hygiene, is linked to this symptom. Our Swedish prospective cohort study comprised 1676 randomly selected subjects followed up for 26 years. In 1985 all subjects underwent clinical oral examination and answered a questionnaire assessing background variables such as socio-economic status and pack-years of smoking. Using data from the Center of Epidemiology, Swedish National Board of Health and Welfare, we analyzed the association of oral health parameters with the prevalence of in-hospital verified angina pectoris, classified according to the WHO International Classification of Diseases, using descriptive statistics and logistic regression analysis. Of the 1676 subjects, 51 (28 women/23 men) had been diagnosed with angina pectoris at a mean age of 59.8 ± 2.9 years. No difference was observed in age and gender between patients with angina pectoris and subjects without. Neither was there any difference between the groups in education level, smoking habits (in pack-years), Gingival index or Plaque index. Angina pectoris patients significantly more often had their first maxillary molar tooth (d. 16) extracted than the other subjects (p = 0.02). Patients also showed significantly higher dental calculus index values than the subjects without angina pectoris (p = 0.01). Multiple regression analysis showed an odds ratio of 2.21 (95% confidence interval 1.17-4.17) for the association between high calculus index and angina pectoris (p = 0.015). Our study hypothesis was confirmed by showing, for the first time, that a high dental calculus score was indeed associated with the incidence of angina pectoris in this cohort study.
Takahara, Osamu; Toyoda, Shigeki; Tsuno, Sumio; Mukai, Hideaki; Uemura, Seiji
1976-01-01
A statistical survey was made of autopsy cases of atomic-bomb survivors in Nagasaki. From the opening of the Japanese Red Cross Nagasaki Atomic Bomb Hospital in 1968 to December 1975, autopsies totaled 1,486 cases (autopsy rate 65.1%), of which 880 were atomic-bomb survivors (autopsy rate 68.0%). Malignancies were found in 829 cases, 528 of them in survivors. The malignancy cases were divided into three groups: those exposed within 2 km of the hypocenter, those exposed beyond 2 km or who entered the city after the explosion, and the non-exposed. The relationship between the main malignancies and exposure was examined, with the following results. 1) A clear relationship was found between exposure and acute and chronic myeloid leukemia. 2) Malignant lymphoma was scarcely correlated with exposure, but its occurrence rate was higher than the national mean for Japan, reflecting the geographic distribution of the disease. 3) No clear relationship was found between exposure and stomach cancer, lung cancer, cancer of the large intestine, or double cancer, but the occurrence rate of hepatic cancer was higher than the national mean in all three groups, presumably due to a geographical factor. 4) Cases of thyroid cancer in women and of prostate cancer in the group exposed within 2 km were few in number, but their occurrence rates were specifically high. (Tsunoda, M.)
Statistics of AUV's Missions for Operational Ocean Observation at the South Brazilian Bight.
dos Santos, F. A.; São Tiago, P. M.; Oliveira, A. L. S. C.; Barmak, R. B.; Miranda, T. C.; Guerra, L. A. A.
2016-02-01
The high costs and logistics limitations of ship-based data collection represent an obstacle to persistent in-situ data collection. Satellite-operated Autonomous Underwater Vehicles (AUVs), or gliders as these AUVs are generally known by the scientific community, are presented as an inexpensive and reliable alternative for long-term, real-time ocean monitoring of important parameters such as temperature, salinity, water quality and acoustics. This work focuses on the performance statistics and the reliability for continuous operation of a fleet of seven gliders navigating in Santos Basin, Brazil, since March 2013. The gliders' performance was evaluated by the number of standby days versus the number of operating days, the number of missions interrupted by (1) equipment failure, (2) weather, or (3) accident versus the number of successful missions, and the amount and quality of data collected. From the start of operations in March 2013 to the preparation of this work (July 2015), a total of 16 glider missions were accomplished, operating on 728 of the 729 days elapsed since then. Of this total, 11 missions were successful, 3 missions were interrupted due to equipment failure, and 2 gliders were lost. Most of the identified issues were observed in the communication with the glider (when recovery was necessary) or the optode sensors (when remote settings solved the problem). The average duration of a successful mission was 103 days, while interrupted ones ended on average after 7 days. The longest mission lasted 139 days, performing 859 continuous profiles and covering a distance of 2,734 km. Together, the two projects performed 6,856 dives, providing an average of 9.5 profiles per day, or one profile every 2.5 hours, during 2 consecutive years.
Mackler, D. A.; Jahn, J.- M.; Perez, J. D.; Pollock, C. J.; Valek, P. W.
2016-01-01
Plasma sheet particles transported Earthward during times of active magnetospheric convection can interact with exospheric/thermospheric neutrals through charge exchange. The resulting Energetic Neutral Atoms (ENAs) are free to leave the influence of the magnetosphere and can be remotely detected. ENAs associated with low-altitude (300-800 km) ion precipitation in the high-latitude atmosphere/ionosphere are termed low-altitude emissions (LAEs). Remotely observed LAEs are highly nonisotropic in velocity space such that the pitch angle distribution at the time of charge exchange is near 90°. The Geomagnetic Emission Cone of LAEs can be mapped spatially, showing where proton energy is deposited during times of varying geomagnetic activity. In this study we present a statistical look at the correlation between LAE flux (intensity and location) and geomagnetic activity. The LAE data are from the MENA imager on the IMAGE satellite over the declining phase of solar cycle 23 (2000-2005). The SYM-H, AE, and Kp indices are used to describe geomagnetic activity. The goal of the study is to evaluate properties of LAEs in ENA images and determine if those images can be used to infer properties of ion precipitation. Results indicate a general positive correlation to LAE flux for all three indices, with the SYM-H showing the greatest sensitivity. The magnetic local time distribution of LAEs is centered about midnight and spreads with increasing activity. The invariant latitude for all indices has a slightly negative correlation. The combined results indicate LAE behavior similar to that of ion precipitation.
Mackler, D. A.; Jahn, J.-M.; Perez, J. D.; Pollock, C. J.; Valek, P. W.
2016-03-01
Plasma sheet particles transported Earthward during times of active magnetospheric convection can interact with exospheric/thermospheric neutrals through charge exchange. The resulting Energetic Neutral Atoms (ENAs) are free to leave the influence of the magnetosphere and can be remotely detected. ENAs associated with low-altitude (300-800 km) ion precipitation in the high-latitude atmosphere/ionosphere are termed low-altitude emissions (LAEs). Remotely observed LAEs are highly nonisotropic in velocity space such that the pitch angle distribution at the time of charge exchange is near 90°. The Geomagnetic Emission Cone of LAEs can be mapped spatially, showing where proton energy is deposited during times of varying geomagnetic activity. In this study we present a statistical look at the correlation between LAE flux (intensity and location) and geomagnetic activity. The LAE data are from the MENA imager on the IMAGE satellite over the declining phase of solar cycle 23 (2000-2005). The SYM-H, AE, and Kp indices are used to describe geomagnetic activity. The goal of the study is to evaluate properties of LAEs in ENA images and determine if those images can be used to infer properties of ion precipitation. Results indicate a general positive correlation to LAE flux for all three indices, with the SYM-H showing the greatest sensitivity. The magnetic local time distribution of LAEs is centered about midnight and spreads with increasing activity. The invariant latitude for all indices has a slightly negative correlation. The combined results indicate LAE behavior similar to that of ion precipitation.
Gilbert, R.O.; Bernhardt, D.E.; Hahn, P.B.
1983-01-01
A summary of a field soil sampling study conducted around the Rocky Flats Colorado plant in May 1977 is presented. Several different soil sampling techniques that had been used in the area were applied at four different sites. One objective was to compare the average 239-240Pu concentration values obtained by the various soil sampling techniques used. There was also interest in determining whether there are differences in the reproducibility of the various techniques, and how the techniques compared with the proposed EPA technique of sampling to a 1 cm depth. Statistically significant differences in average concentrations between the techniques were found. The differences could be largely related to the differences in sampling depth, the primary physical variable between the techniques. The reproducibility of the techniques was evaluated by comparing coefficients of variation. Differences between coefficients of variation were not statistically significant. Average (median) coefficients ranged from 21 to 42 percent for the five sampling techniques. A laboratory study indicated that various sample treatment and particle sizing techniques could increase the concentration of plutonium in the less-than-10-micrometer size fraction by up to a factor of about 4 compared to the 2 mm size fraction.
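The coefficient of variation used above to compare reproducibility is simply the sample standard deviation expressed as a percentage of the mean; a minimal sketch with hypothetical replicate concentrations (the values below are illustrative, not from the study):

```python
import statistics

def coefficient_of_variation(samples):
    """CV as a percentage: sample standard deviation over the mean."""
    return 100 * statistics.stdev(samples) / statistics.mean(samples)

# Hypothetical replicate Pu concentrations (arbitrary units) from two techniques
technique_a = [1.10, 0.95, 1.30, 0.88, 1.02]   # deeper, more variable cores
technique_b = [1.05, 1.00, 1.10, 0.98, 1.07]   # shallower, more repeatable cores
print(round(coefficient_of_variation(technique_a), 1),
      round(coefficient_of_variation(technique_b), 1))
```

Comparing CVs rather than raw standard deviations puts techniques with different mean recoveries on a common reproducibility scale, which is what the study's comparison requires.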
Shin'ya Nakano
2014-05-01
A hybrid algorithm that combines the ensemble transform Kalman filter (ETKF) and the importance sampling approach is proposed. Since the ETKF assumes a linear Gaussian observation model, the estimate obtained by the ETKF can be biased in cases with nonlinear or non-Gaussian observations. The particle filter (PF) is based on the importance sampling technique, and is applicable to problems with nonlinear or non-Gaussian observations. However, the PF usually requires an unrealistically large sample size in order to achieve a good estimate, and thus it is computationally prohibitive. In the proposed hybrid algorithm, we obtain a proposal distribution similar to the posterior distribution by using the ETKF. A large number of samples are then drawn from the proposal distribution, and these samples are weighted to approximate the posterior distribution according to the importance sampling principle. Since importance sampling provides an estimate of the probability density function (PDF) without assuming linearity or Gaussianity, we can resolve the bias due to the nonlinear or non-Gaussian observations. Finally, in the next forecast step, we reduce the sample size to achieve computational efficiency based on the Gaussian assumption, while we use a relatively large number of samples in the importance sampling in order to capture the non-Gaussian features of the posterior PDF. The use of the ETKF is also beneficial in terms of the computational simplicity of generating a number of random samples from the proposal distribution and of weighting each of the samples. The proposed algorithm is not necessarily effective when the ensemble is located far from the true state; however, monitoring the effective sample size and tuning the factor for covariance inflation can mitigate this problem. In this paper, the proposed hybrid algorithm is introduced and its performance is evaluated through experiments with non-Gaussian observations.
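The weighting step of such a hybrid follows the standard importance-sampling identity: samples drawn from a Gaussian proposal are reweighted by the ratio of (unnormalized) target density to proposal density, and the effective sample size is monitored. The sketch below uses a toy scalar nonlinear observation model, not the paper's ETKF system; all numbers are illustrative:

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy setup: prior x ~ N(0, 1), nonlinear observation y = h(x) + noise with
# h(x) = x + 0.1*x**3, observed y = 1.0, observation noise variance 0.5.
def log_target(x, y=1.0, obs_var=0.5):
    h = x + 0.1 * x**3
    return -0.5 * x**2 - 0.5 * (y - h) ** 2 / obs_var

# Gaussian proposal, standing in for what an ETKF analysis would supply
mu, sigma = 0.7, 0.6
x = rng.normal(mu, sigma, size=20000)
log_q = -0.5 * ((x - mu) / sigma) ** 2 - np.log(sigma)

# Importance weights: target over proposal, normalized in log space for stability
log_w = log_target(x) - log_q
w = np.exp(log_w - log_w.max())
w /= w.sum()

posterior_mean = float(np.sum(w * x))
ess = float(1.0 / np.sum(w**2))  # effective sample size, monitored as in the paper
print(posterior_mean, ess)
```

When the proposal is close to the posterior, as the ETKF-derived proposal is intended to be, the effective sample size stays near the nominal sample size and the weighted mean is a low-variance posterior estimate.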
Pullin, A N; Pairis-Garcia, M D; Campbell, B J; Campler, M R; Proudfoot, K L
2017-11-01
When considering methodologies for collecting behavioral data, continuous sampling provides the most complete and accurate data set, whereas instantaneous sampling can provide similar results while increasing the efficiency of data collection. However, instantaneous time intervals require validation to ensure accurate estimation of the data. Therefore, the objective of this study was to validate scan sampling intervals for lambs housed in a feedlot environment. Feeding, lying, standing, drinking, locomotion, and oral manipulation were measured on 18 crossbred lambs housed in an indoor feedlot facility for 14 h (0600-2000 h). Data from continuous sampling were compared with data from instantaneous scan sampling intervals of 5, 10, 15, and 20 min using a linear regression analysis. Three criteria determined whether a time interval accurately estimated behaviors: 1) R² ≥ 0.90, 2) slope not statistically different from 1 (P > 0.05), and 3) intercept not statistically different from 0 (P > 0.05). Estimations for lying behavior were accurate up to 20-min intervals, whereas feeding and standing behaviors were accurate only at 5-min intervals (i.e., met all 3 regression criteria). Drinking, locomotion, and oral manipulation demonstrated poor associations for all tested intervals. The results from this study suggest that a 5-min instantaneous sampling interval will accurately estimate lying, feeding, and standing behaviors for lambs housed in a feedlot, whereas continuous sampling is recommended for the remaining behaviors. This methodology will contribute toward the efficiency, accuracy, and transparency of future behavioral data collection in lamb behavior research.
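The three regression criteria described in this abstract can be checked mechanically; the following is a hedged sketch (the function and data are illustrative, not the study's code):

```python
import numpy as np
from scipy import stats

def validate_interval(continuous, scans):
    """Apply the study's three criteria to a scan-sampling interval:
    R^2 >= 0.90, slope not different from 1 (P > 0.05),
    intercept not different from 0 (P > 0.05)."""
    res = stats.linregress(continuous, scans)
    n = len(continuous)
    p_slope = 2 * stats.t.sf(abs((res.slope - 1.0) / res.stderr), n - 2)
    p_icept = 2 * stats.t.sf(abs(res.intercept / res.intercept_stderr), n - 2)
    return bool(res.rvalue**2 >= 0.90 and p_slope > 0.05 and p_icept > 0.05)

# Hypothetical minutes of lying per lamb (n = 18): continuous record vs. two
# candidate scan estimates, one unbiased and one systematically biased
minutes_lying = np.linspace(100, 400, 18)
noise = np.tile([3.0, -3.0], 9)                    # small, zero-mean perturbation
good_scan = minutes_lying + noise                  # should pass all 3 criteria
biased_scan = 0.6 * minutes_lying + 50 + noise     # high R^2, but slope != 1
print(validate_interval(minutes_lying, good_scan),
      validate_interval(minutes_lying, biased_scan))
```

The biased case shows why the slope and intercept tests matter: a scan interval can correlate almost perfectly with the continuous record and still misestimate the behavior's duration.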
De Spiegelaere, Ward; Malatinkova, Eva; Lynch, Lindsay; Van Nieuwerburgh, Filip; Messiaen, Peter; O'Doherty, Una; Vandekerckhove, Linos
2014-06-01
Quantification of integrated proviral HIV DNA by repetitive-sampling Alu-HIV PCR is a candidate virological tool for monitoring the HIV reservoir in patients. However, the experimental procedures and data analysis of the assay are complex and hinder its widespread use. Here, we provide an improved and simplified data analysis method by adopting binomial and Poisson statistics. A modified analysis method based on Poisson statistics was used to analyze the binomial data of positive and negative reactions from a 42-replicate Alu-HIV PCR, using dilutions of an integration standard and samples from 57 HIV-infected patients. Results were compared with the quantitative output of the previously described Alu-HIV PCR method. Poisson-based quantification of the Alu-HIV PCR was linearly correlated with the standard dilution series, indicating that absolute quantification with the Poisson method is a valid alternative for data analysis of repetitive-sampling Alu-HIV PCR data. Quantitative outputs of patient samples assessed by the Poisson method correlated with the previously described Alu-HIV PCR analysis, indicating that this method is a valid alternative for quantifying integrated HIV DNA. Poisson-based analysis of the Alu-HIV PCR data enables absolute quantification without the need for a standard dilution curve. Implementation of confidence interval estimation permits improved qualitative analysis of the data and provides a statistical basis for the required minimal number of technical replicates. © 2014 The American Association for Clinical Chemistry.
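Poisson-based quantification of all-or-none replicate data rests on the single-hit model: if template copies are Poisson-distributed across reactions, the fraction of negative replicates estimates e^(-λ). A minimal sketch, assuming the 42-replicate design described above (the replicate counts are hypothetical):

```python
import math

def poisson_copies_per_reaction(n_negative, n_total=42):
    """Single-hit Poisson model for an all-or-none PCR readout: the fraction
    of negative replicates estimates exp(-lambda), so the mean template
    count per reaction is lambda = -ln(n_negative / n_total)."""
    if n_negative == 0:
        raise ValueError("all replicates positive: above the quantifiable range")
    return -math.log(n_negative / n_total)

# e.g. 21 of 42 replicates negative -> about 0.69 copies per reaction
print(poisson_copies_per_reaction(21))
```

This is why the method needs no standard dilution curve: the estimate is absolute, derived directly from the binomial positive/negative pattern.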
Hacker, Joshua; Vandenberghe, Francois; Jung, Byoung-Jo; Snyder, Chris
2017-04-01
Effective assimilation of cloud-affected radiance observations from space-borne imagers, with the aim of improving cloud analysis and forecasting, has proven to be difficult. Large observation biases, nonlinear observation operators, and non-Gaussian innovation statistics present many challenges. Ensemble-variational data assimilation (EnVar) systems offer the benefits of flow-dependent background error statistics from an ensemble, and the ability of variational minimization to handle nonlinearity. The specific benefits of ensemble statistics, relative to the static background errors more commonly used in variational systems, have not been quantified for the problem of assimilating cloudy radiances. A simple experiment framework is constructed with a regional NWP model and an operational variational data assimilation system, to provide the basis for understanding the importance of ensemble statistics in cloudy radiance assimilation. Restricting the observations to those corresponding to clouds in the background forecast leads to innovations that are more Gaussian. The number of large innovations is reduced compared to the more general case of all observations, but not eliminated. The Huber norm is investigated to handle the fat tails of the distributions and to allow more observations to be assimilated without the need for strict background checks that would eliminate them. Comparing assimilation using only ensemble background error statistics with assimilation using only static background error statistics elucidates the importance of the ensemble statistics. Although the cost functions in both experiments converge to similar values after sufficient outer-loop iterations, the resulting cloud water, ice, and snow content are greater in the ensemble-based analysis. The subsequent forecasts from the ensemble-based analysis also retain more condensed water species, indicating that the local environment is more supportive of clouds. In this presentation we provide details that explain the
Cho, Su Gil; Jang, Jun Yong; Kim, Ji Hoon; Lee, Tae Hee [Hanyang University, Seoul (Korea, Republic of); Lee, Min Uk [Romax Technology Ltd., Seoul (Korea, Republic of); Choi, Jong Su; Hong, Sup [Korea Research Institute of Ships and Ocean Engineering, Daejeon (Korea, Republic of)
2015-04-15
Sequential surrogate-model-based global optimization algorithms, such as super-EGO, have been developed to increase the efficiency of commonly used global optimization techniques as well as to ensure the accuracy of the optimization. However, earlier studies have drawbacks: the optimization loop contains three phases, and empirical parameters are required. We propose a united sampling criterion to simplify the algorithm and to achieve the global optimum of constrained problems without any empirical parameters. It is able to select points located in a feasible region with high model uncertainty as well as points along the boundary of a constraint at the lowest objective value. The mean squared error determines which criterion is more dominant between the infill sampling criterion and the boundary sampling criterion. The method also guarantees the accuracy of the surrogate model because, unlike in super-EGO, the sample points are not concentrated within extremely small regions. The performance of the proposed method, such as the solvability of a problem, convergence properties, and efficiency, is validated through nonlinear numerical examples with disconnected feasible regions.
G. Lointier
2013-02-01
The Cluster mission offers an excellent opportunity to investigate the evolution of the plasma population in a large part of the inner magnetosphere, explored near its orbit's perigee, over a complete solar cycle. The WHISPER sounder, on board each satellite of the mission, is particularly suitable for studying the electron density in this region, between 0.2 and 80 cm−3. Compiling WHISPER observations during 1339 perigee passes distributed over more than three years of the Cluster mission, we present first results of a statistical analysis dedicated to the study of electron density morphology and dynamics along and across magnetic field lines between L = 2 and L = 10. In this study, we examine a specific topic: the refilling of the plasmasphere and trough regions during extended periods of quiet magnetic conditions. To do so, we survey the evolution of the ap index during the days preceding each perigee crossing and sort the electron density profiles along the orbit into three classes: after less than 2 days, between 2 and 4 days, and more than 4 days of quiet magnetic conditions (ap ≤ 15 nT) following an active episode (ap > 15 nT). This leads to three independent data subsets. Comparisons between density distributions in the 3-D plasmasphere and trough regions at the three stages of quiet magnetosphere provide novel views of the distribution of matter inside the inner magnetosphere during several days of low activity. Clear signatures of a refilling process inside an expanding plasmasphere in formation are noted. A plasmapause-like boundary, at L ~ 6 for all MLT sectors, is formed after 3 to 4 days and expands somewhat further after that. In the outer part of the plasmasphere (L ~ 8), latitudinal profiles of median density values vary essentially according to the MLT sector considered rather than according to the refilling duration. The shape of these density profiles indicates that magnetic flux tubes are not
Observed ozone exceedances in Italy: statistical analysis and modelling in the period 2002-2015
Falasca, Serena; Curci, Gabriele; Candeloro, Luca; Conte, Annamaria; Ippoliti, Carla
2017-04-01
concentrations. On the other hand, high-temperature events have similar duration and higher mean temperature with respect to recent years, pointing out that temperature is not the only driver of high-ozone events. The statistical model confirms a significant impact of the meteorological variables (positive for temperature and pressure, negative for humidity and wind speed) on the probability of ozone events. Significant predictors are also the altitude (negative) and the number of inhabitants (positive). The decreasing trend observed in recent years is explained by the introduction of the Euro regulations rather than by natural variability. However, we find an inversion of the trend for the most recent period under Euro6 (from September 2014), but we cautiously await confirmation from additional data, at least for the year 2016.
Peter M Visscher
2014-04-01
We have recently developed analysis methods (GREML) to estimate the genetic variance of a complex trait/disease and the genetic correlation between two complex traits/diseases using genome-wide single nucleotide polymorphism (SNP) data in unrelated individuals. Here we use analytical derivations and simulations to quantify the sampling variance of the estimate of the proportion of phenotypic variance captured by all SNPs for quantitative traits and case-control studies. We also derive the approximate sampling variance of the estimate of a genetic correlation in a bivariate analysis, when two complex traits are measured either on the same or on different individuals. We show that the sampling variance is inversely proportional to the number of pairwise contrasts in the analysis and to the variance in SNP-derived genetic relationships. For bivariate analysis, the sampling variance of the genetic correlation additionally depends on the harmonic mean of the proportion of variance explained by the SNPs for the two traits and on the genetic correlation between the traits, and depends on the phenotypic correlation when the traits are measured on the same individuals. We provide an online tool for calculating the power of detecting genetic (co)variation using genome-wide SNP data. The new theory and online tool will be helpful in planning experimental designs to estimate the missing heritability that has not yet been fully revealed through genome-wide association studies, and to estimate the genetic overlap between complex traits (diseases), in particular when the traits (diseases) are not measured on the same samples.
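The abstract's central result, that the sampling variance is inversely proportional to the number of pairwise contrasts (about N²/2) and to the variance in SNP-derived relationships, can be sketched numerically. The value of roughly 2 × 10⁻⁵ for the relationship variance among conventionally unrelated individuals is an assumption commonly quoted for common SNPs, not taken from this record:

```python
import math

def se_snp_heritability(n, var_relatedness=2e-5):
    """Approximate standard error of a GREML SNP-heritability estimate in
    unrelated individuals: sampling variance ~ 2 / (N^2 * var(relatedness)).
    var_relatedness = 2e-5 is an assumed typical value for common SNPs."""
    return math.sqrt(2.0 / (n**2 * var_relatedness))

print(round(se_snp_heritability(10000), 3))  # roughly 0.032 for N = 10,000
```

Under this assumption the standard error shrinks linearly in N, which is why sample sizes in the tens of thousands are needed before SNP-heritability estimates become precise.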
Cotruta, Bogdan; Gheorghe, Cristian; Iacob, Razvan; Dumbrava, Mona; Radu, Cristina; Bancila, Ion; Becheanu, Gabriel
2017-12-01
Evaluation of the severity and extension of gastric atrophy and intestinal metaplasia is recommended to identify subjects at high risk for gastric cancer. The inter-observer agreement for the assessment of gastric atrophy is reported to be low. The aim of the study was to evaluate the inter-observer agreement for the assessment of severity and extension of gastric atrophy using oriented and unoriented gastric biopsy samples. Furthermore, the quality of biopsy specimens in oriented and unoriented samples was analyzed. A total of 35 subjects with dyspeptic symptoms referred for gastrointestinal endoscopy who agreed to enter the study were prospectively enrolled. The OLGA/OLGIM gastric biopsy protocol was used. From each subject two sets of biopsies were obtained (four from the antrum, two oriented and two unoriented; two from the gastric incisure, one oriented and one unoriented; four from the gastric body, two oriented and two unoriented). The orientation of the biopsy samples was performed using nitrocellulose filters (Endokit®, BioOptica, Milan, Italy). The samples were blindly examined by two experienced pathologists. Inter-observer agreement was evaluated using the kappa statistic for inter-rater agreement. The quality of histopathology specimens in oriented vs. unoriented samples was analyzed, taking into account the identification of the lamina propria. Samples with detectable lamina propria mucosae were defined as good-quality specimens. Categorical data were analyzed using the chi-square test, and a two-sided p value <0.05 was considered statistically significant. A total of 350 biopsy samples were analyzed (175 oriented / 175 unoriented). The kappa index values for oriented/unoriented samples for OLGA stages 0/I/II/III/IV were 0.62/0.13, 0.70/0.20, 0.61/0.06, 0.62/0.46, and 0.77/0.50, respectively. For OLGIM stages 0/I/II/III, the kappa index values for oriented/unoriented samples were 0.83/0.83, 0.88/0.89, 0.70/0.88 and 0.83/1, respectively. No case of OLGIM IV
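The kappa statistic used here for inter-rater agreement corrects observed agreement for the agreement expected by chance from each rater's marginal frequencies. A minimal implementation with hypothetical stage assignments (not the study's data):

```python
from collections import Counter

def cohens_kappa(rater1, rater2):
    """Cohen's kappa: (observed - expected) / (1 - expected) agreement."""
    assert len(rater1) == len(rater2)
    n = len(rater1)
    observed = sum(a == b for a, b in zip(rater1, rater2)) / n
    c1, c2 = Counter(rater1), Counter(rater2)
    # Chance agreement from the product of the two raters' marginal frequencies
    expected = sum(c1[k] * c2[k] for k in c1) / n**2
    return (observed - expected) / (1 - expected)

# Hypothetical OLGA stages assigned by two pathologists to ten samples
p1 = [0, 1, 1, 2, 0, 3, 2, 1, 0, 4]
p2 = [0, 1, 2, 2, 0, 3, 2, 1, 1, 4]
print(round(cohens_kappa(p1, p2), 2))  # → 0.74
```

Raw percent agreement here is 80%, but kappa discounts the agreement two raters would reach by chance alone, which is why it is preferred for ordinal staging systems like OLGA/OLGIM.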
Mariano, A.J.; Ryan, E.H.; Huntley, H.S.; Laurindo, L.C.; Coelho, E.; Ozgokmen, TM; Berta, M.; Bogucki, D; Chen, S.S.; Curcic, M.; Drouin, K.L.; Gough, M; Haus, BK; Haza, A.C.; Hogan, P; Iskandarani, M; Jacobs, G; Kirwan Jr., A.D.; Laxague, N; Lipphardt Jr., B.; Magaldi, M.G.; Novelli, G.; Reniers, A.J.H.M.; Restrepo, J.M.; Smith, C; Valle-Levinson, A.; Wei, M.
2016-01-01
The Grand LAgrangian Deployment (GLAD) used multiscale sampling and GPS technology to observe time series of drifter positions with initial drifter separation of O(100 m) to O(10 km), and nominal 5 min sampling, during the summer and fall of 2012 in the northern Gulf of Mexico. Histograms of the
Statistical Methods and Sampling Design for Estimating Step Trends in Surface-Water Quality
Hirsch, Robert M.
1988-01-01
This paper addresses two components of the problem of estimating the magnitude of step trends in surface water quality. The first is finding a robust estimator appropriate to the data characteristics expected in water-quality time series. The J. L. Hodges-E. L. Lehmann class of estimators is found to be robust in comparison to other nonparametric and moment-based estimators. A seasonal Hodges-Lehmann estimator is developed and shown to have desirable properties. Second, the effectiveness of various sampling strategies is examined using Monte Carlo simulation coupled with application of this estimator. The simulation is based on a large set of total phosphorus data from the Potomac River. To assure that the simulated records have realistic properties, the data are modeled in a multiplicative fashion incorporating flow, hysteresis, seasonal, and noise components. The results demonstrate the importance of balancing the length of the two sampling periods and balancing the number of data values between the two periods.
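In its unseasonal form, the Hodges-Lehmann step-trend estimator discussed above is the median of all pairwise differences between observations from the two sampling periods, which makes it robust to outliers. A minimal sketch with invented concentrations (not the Potomac River data):

```python
from statistics import median

def hodges_lehmann_shift(before, after):
    """Hodges-Lehmann step-trend estimate: the median of all
    pairwise differences after[j] - before[i]."""
    return median(b - a for a in before for b in after)

# hypothetical concentrations before and after a management change
before = [0.12, 0.15, 0.11, 0.40, 0.14]   # note one high outlier
after = [0.09, 0.10, 0.08, 0.11, 0.10]
print(hodges_lehmann_shift(before, after))  # robust to the outlier in `before`
```

A seasonal version, as developed in the paper, would compute the pairwise differences within each season and combine them; this sketch shows only the basic estimator.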
Torimitsu, Suguru; Nishida, Yoshifumi; Takano, Tachio; Koizumi, Yoshinori; Makino, Yohsuke; Yajima, Daisuke; Hayakawa, Mutsumi; Inokuchi, Go; Motomura, Ayumi; Chiba, Fumiko; Otsuka, Katsura; Kobayashi, Kazuhiro; Odo, Yuriko; Iwase, Hirotaro
2014-01-01
The purpose of this research was to investigate the biomechanical properties of the adult human skull and the structural changes that occur with age in both sexes. The heads of 94 Japanese cadavers (54 male cadavers, 40 female cadavers) autopsied in our department were used in this research. A total of 376 cranial samples, four from each skull, were collected. Sample fracture load was measured by a bending test. A statistically significant negative correlation between the sample fracture load and cadaver age was found. This indicates that the stiffness of cranial bones in Japanese individuals decreases with age, and the risk of skull fracture thus probably increases with age. Prior to the bending test, the sample mass, the sample thickness, the ratio of the sample thickness to cadaver stature (ST/CS), and the sample density were measured and calculated. Significant negative correlations between cadaver age and sample thickness, ST/CS, and the sample density were observed only among the female samples. Computerized tomographic (CT) images of 358 cranial samples were available. The computed tomography value (CT value) of cancellous bone which refers to a quantitative scale for describing radiodensity, cancellous bone thickness and cortical bone thickness were measured and calculated. Significant negative correlation between cadaver age and the CT value or cortical bone thickness was observed only among the female samples. These findings suggest that the skull is substantially affected by decreased bone metabolism resulting from osteoporosis. Therefore, osteoporosis prevention and treatment may increase cranial stiffness and reinforce the skull structure, leading to a decrease in the risk of skull fractures. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
Statistical inference for the additive hazards model under outcome-dependent sampling.
Yu, Jichang; Liu, Yanyan; Sandler, Dale P; Zhou, Haibo
2015-09-01
Cost-effective study designs and proper inference procedures for data from such designs are always of particular interest to study investigators. In this article, we propose a biased sampling scheme, an outcome-dependent sampling (ODS) design, for survival data with right censoring under the additive hazards model. We develop a weighted pseudo-score estimator for the regression parameters for the proposed design and derive the asymptotic properties of the proposed estimator. We also provide some suggestions for using the proposed method by evaluating its relative efficiency against the simple random sampling design, and derive the optimal allocation of the subsamples for the proposed design. Simulation studies show that the proposed ODS design is more powerful than other existing designs and the proposed estimator is more efficient than other estimators. We apply our method to a cancer study conducted at NIEHS, the Cancer Incidence and Mortality of Uranium Miners Study, to examine the cancer risk associated with radon exposure.
Matsuda, Hideharu; Minato, Susumu
2002-01-01
The accuracy of statistical quantities such as the mean value and the contour map obtained by measurement of the environmental gamma-ray dose rate was evaluated by random sampling of 5 different model distribution maps made with the mean slope, -1.3, of power spectra calculated from the actually measured values. The values were derived from 58 natural gamma dose rate data sets reported worldwide, with means ranging over 10-100 Gy/h and areas of 10⁻³-10⁷ km². The accuracy of the mean value was found to be around ±7% even for 60 or 80 samplings (the most frequent number), and the standard deviation had an accuracy less than 1/4-1/3 of the means. The correlation coefficient of the frequency distribution was found to be 0.860 or more for 200-400 samplings (the most frequent number), but for the contour map, 0.502-0.770. (K.H.)
Liem, Franziskus; Mérillat, Susan; Bezzola, Ladina; Hirsiger, Sarah; Philipp, Michel; Madhyastha, Tara; Jäncke, Lutz
2015-03-01
FreeSurfer is a tool to quantify cortical and subcortical brain anatomy automatically and noninvasively. Previous studies have reported reliability and statistical power analyses in relatively small samples or only selected one aspect of brain anatomy. Here, we investigated reliability and statistical power of cortical thickness, surface area, volume, and the volume of subcortical structures in a large sample (N=189) of healthy elderly subjects (64+ years). Reliability (intraclass correlation coefficient) of cortical and subcortical parameters is generally high (cortical: ICCs>0.87, subcortical: ICCs>0.95). Surface-based smoothing increases reliability of cortical thickness maps, while it decreases reliability of cortical surface area and volume. Nevertheless, statistical power of all measures benefits from smoothing. When aiming to detect a 10% difference between groups, the number of subjects required to test effects with sufficient power over the entire cortex varies between cortical measures (cortical thickness: N=39, surface area: N=21, volume: N=81; 10mm smoothing, power=0.8, α=0.05). For subcortical regions this number is between 16 and 76 subjects, depending on the region. We also demonstrate the advantage of within-subject designs over between-subject designs. Furthermore, we publicly provide a tool that allows researchers to perform a priori power analysis and sensitivity analysis to help evaluate previously published studies and to design future studies with sufficient statistical power. Copyright © 2014 Elsevier Inc. All rights reserved.
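The a priori power analysis described above can be approximated with a standard normal-approximation sample size formula for a two-group comparison of means. The sketch below is not the authors' published tool; the 0.25 mm effect and SD values are invented for illustration (roughly a 10% difference on an assumed 2.5 mm mean cortical thickness).

```python
from math import ceil
from statistics import NormalDist

def n_per_group(effect, sd, alpha=0.05, power=0.8):
    """Normal-approximation sample size per group for a two-sided,
    two-sample comparison of means."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)   # critical value for alpha
    z_b = NormalDist().inv_cdf(power)           # quantile for desired power
    return ceil(2 * ((z_a + z_b) * sd / effect) ** 2)

# hypothetical: detect a 0.25 mm group difference with SD 0.25 mm
print(n_per_group(effect=0.25, sd=0.25))  # 16 subjects per group
```

An exact calculation would use the noncentral t distribution and gives slightly larger n; the normal approximation is adequate for a first pass.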
Steffen, Jason H.; Ford, Eric B.; Rowe, Jason F.; Borucki, William J.; Bryson, Steve; Caldwell, Douglas A.; Jenkins, Jon M.; Koch, David G.; Sanderfer, Dwight T.; Seader, Shawn; Twicken, Joseph D.; Fabrycky, Daniel C.; Holman, Matthew J.; Welsh, William F.; Batalha, Natalie M.; Ciardi, David R.; Kjeldsen, Hans; Prša, Andrej
2012-01-01
We analyze the deviations of transit times from a linear ephemeris for the Kepler Objects of Interest (KOI) through quarter six of science data. We conduct two statistical tests for all KOIs and a related statistical test for all pairs of KOIs in multi-transiting systems. These tests identify several systems which show potentially interesting transit timing variations (TTVs). Strong TTV systems have been valuable for the confirmation of planets and their mass measurements. Many of the systems identified in this study should prove fruitful for detailed TTV studies.
Nedela, V
2008-01-01
This work addresses the possibility of in-situ observation of non-conductive biological samples, free of charging artefacts, under dynamically changing environmental conditions. The observed biological sample, the tongue of a rat, was placed on a cooled Peltier stage. We studied the visibility of topographical structure depending on the transition of water between liquid and gas states in the specimen chamber of the VP SEM.
Dubovik, O; Herman, M.; Holdak, A.; Lapyonok, T.; Tanré, D.; Deuzé, J. L.; Ducos, F.; Sinyuk, A.
2011-01-01
The proposed development is an attempt to enhance aerosol retrieval by emphasizing statistical optimization in the inversion of advanced satellite observations. This optimization concept improves retrieval accuracy by relying on knowledge of the measurement error distribution. Efficient application of such optimization requires pronounced data redundancy (an excess of the number of measurements over the number of unknowns), which is not common in satellite observations. The POLDER imager on board the PARASOL microsatellite registers spectral polarimetric characteristics of the reflected atmospheric radiation at up to 16 viewing directions over each observed pixel. The completeness of such observations is notably higher than for most currently operating passive satellite aerosol sensors. This provides an opportunity for profound utilization of statistical optimization principles in satellite data inversion. The proposed retrieval scheme is designed as a statistically optimized multi-variable fitting of all available angular observations obtained by the POLDER sensor in the window spectral channels where absorption by gas is minimal. The total number of such observations by PARASOL always exceeds a hundred over each pixel, so the statistical optimization concept promises to be efficient even if the algorithm retrieves several tens of aerosol parameters. Based on this idea, the proposed algorithm uses a large number of unknowns and is aimed at retrieving an extended set of parameters affecting the measured radiation.
Adaptive list sequential sampling method for population-based observational studies
Hof, Michel H.; Ravelli, Anita C. J.; Zwinderman, Aeilko H.
2014-01-01
In population-based observational studies, non-participation and delayed response to the invitation to participate are complications that often arise during the recruitment of a sample. When both are not properly dealt with, the composition of the sample can be different from the desired
Observed Characteristics and Teacher Quality: Impacts of Sample Selection on a Value Added Model
Winters, Marcus A.; Dixon, Bruce L.; Greene, Jay P.
2012-01-01
We measure the impact of observed teacher characteristics on student math and reading proficiency using a rich dataset from Florida. We expand upon prior work by accounting directly for nonrandom attrition of teachers from the classroom in a sample selection framework. We find evidence that sample selection is present in the estimation of the…
Davis-Sharts, J
1986-10-01
Maslow's hierarchy of basic human needs provides a major theoretical framework in nursing science. The purpose of this study was to empirically test Maslow's need theory, specifically at the levels of physiological and security needs, using a hologeistic comparative method. Thirty cultures taken from the 60 cultural units in the Health Relations Area Files (HRAF) Probability Sample were found to have data available for examining hypotheses about thermoregulatory (physiological) and protective (security) behaviors practiced prior to sleep onset. The findings demonstrate there is initial worldwide empirical evidence to support Maslow's need hierarchy.
Testard, D.; Centre National de la Recherche Scientifique, 13 - Marseille
1977-09-01
For a finite non-zero temperature state in Statistical Mechanics, it is proved that the factor obtained in the corresponding representation of the quasilocal algebra has the property of Araki. The same result also holds for the 'wedge-algebras' of a hermitian scalar Wightman field.
Juang, K.-W.; Lee, D.-Y.; Teng, Y.-L.
2005-01-01
Correctly classifying 'contaminated' areas in soils, based on the threshold for a contaminated site, is important for determining effective clean-up actions. Pollutant mapping by means of kriging is increasingly being used for the delineation of contaminated soils. However, those areas where the kriged pollutant concentrations are close to the threshold have a high possibility for being misclassified. In order to reduce the misclassification due to the over- or under-estimation from kriging, an adaptive sampling using the cumulative distribution function of order statistics (CDFOS) was developed to draw additional samples for delineating contaminated soils, while kriging. A heavy-metal contaminated site in Hsinchu, Taiwan was used to illustrate this approach. The results showed that compared with random sampling, adaptive sampling using CDFOS reduced the kriging estimation errors and misclassification rates, and thus would appear to be a better choice than random sampling, as additional sampling is required for delineating the 'contaminated' areas. - A sampling approach was derived for drawing additional samples while kriging
Smith, G. L.; Bess, T. D.; Minnis, P.
1983-01-01
The processes which determine the weather and climate are driven by the radiation received by the earth and the radiation subsequently emitted. A knowledge of the absorbed and emitted components of radiation is thus fundamental for the study of these processes. In connection with the desire to improve the quality of long-range forecasting, NASA is developing the Earth Radiation Budget Experiment (ERBE), consisting of a three-channel scanning radiometer and a package of nonscanning radiometers. A set of these instruments is to be flown on both the NOAA-F and NOAA-G spacecraft, in sun-synchronous orbits, and on an Earth Radiation Budget Satellite. The purpose of the scanning radiometer is to obtain measurements from which the average reflected solar radiant exitance and the average earth-emitted radiant exitance at a reference level can be established. The estimate of regional average exitance obtained will not exactly equal the true value of the regional average exitance, but will differ due to spatial sampling. A method is presented for evaluating this spatial sampling error.
Kanekawa, Hideo
2012-01-01
Actual Situation and Statistical Observation on Home Custody of Mental Patients (1918) by Kure and Kashida has diverse content but contains many contradictions. This book is a record of investigations performed by 15 psychiatrists regarding home custody of mental patients in 15 prefectures between 1910 and 1916. The book is written in archaic Japanese and contains a mixture of old Kanji characters and Katakana, so few people have read the entire book in recent years. We thoroughly read the book over 2 years, and presented the results of our investigation and analysis. The contents were initially published in Tokyo Journal of Medical Sciences as a series of 4 articles, and published as a book in 1918. The Department of the Interior distributed 100 copies of the book to relevant personnel. Until its dissolution in 1947, the Department of the Interior included the Police Department and had a great deal of authority. The Health and Welfare Ministry became independent from the Department of the Interior in 1938. Therefore, mental institutions were under the supervision of the police force for many years. At the time, an important task for police officers was to search for infectious disease patients and to seclude and restrain them. Thus, home custody for mental patients was also supervised under the direction of the Police Department. This book is a record of an external investigation performed by psychiatrists on home custody supervised by the police. When investigating the conditions, one of the psychiatrists obtained a copy of "Documents for mental patients under confinement" at the local police station. The contents of these documents included records of hearings by the police, as well as applications for confinement submitted by family members, as well as detailed specifications and drawings of the confinement room. With a local photographer, they traveled deep into the mountains to investigate the conditions under which mental patients were living. The book
An X-Ray/SDSS Sample: Observational Characterization of The Outflowing Gas
Perna, Michele; Brusa, M.; Lanzuisi, G.; Mignoli, M.
2016-10-01
Powerful ionised AGN-driven outflows, commonly detected both locally and at high redshift, are invoked to contribute to the co-evolution of SMBHs and galaxies through feedback phenomena. Our recent works (Brusa+2015; 2016; Perna+2015a,b) have shown that the XMM-COSMOS targets with evidence of outflows collected so far (~10 sources) appear to be associated with low X-ray kbol corrections (Lbol/LX ~ 18), in spite of their spread in obscuration, in their locations on the SFR-Mstar diagram, and in their radio emission. Higher statistical significance is required to validate a connection between outflow phenomena and X-ray loudness. Moreover, in order to validate their binding nature to the galaxy fate, it is crucial to correctly determine the outflow energetics. This requires time-consuming integral field spectroscopic (IFS) observations, which are, at present, mostly limited to high-luminosity objects. The study of SDSS data offers a complementary strategy to IFS efforts. I will present a physical and demographic characterization of the AGN-galaxy system during the feedback phase, obtained by studying a sample of 500 X-ray/SDSS AGNs, at zdispersion) and X-ray properties (intrinsic X-ray luminosity, obscuration and X-ray kbol correction), to determine what drives ionised winds. Several diagnostic line ratios have been used to infer the physical properties of the ionised outflowing gas. The knowledge of these properties can reduce the actual uncertainties in the outflow energetics by a factor of ten, helping to improve our understanding of the AGN outflow phenomenon and its impact on galaxy evolution.
Statistical inferences with jointly type-II censored samples from two Pareto distributions
Abu-Zinadah, Hanaa H.
2017-08-01
In several industrial settings the product comes from more than one production line, which calls for comparative life tests. This requires sampling from the different production lines, leading to a joint censoring scheme. In this article we consider the lifetime Pareto distribution under a jointly type-II censoring scheme. The maximum likelihood estimators (MLEs) and the corresponding approximate confidence intervals, as well as bootstrap confidence intervals, of the model parameters are obtained. Bayesian point estimates and credible intervals of the model parameters are also presented. A lifetime data set is analyzed for illustrative purposes. Monte Carlo simulation results are presented to assess the performance of the proposed methods.
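To illustrate the censored-likelihood machinery, consider the simpler one-sample case: a Pareto(shape a, scale 1) sample under ordinary type-II censoring, where only the r smallest of n lifetimes are observed. The shape MLE then has the closed form a&#770; = r / (Σ ln x(i) + (n-r) ln x(r)). This sketch demonstrates that one-sample case only, not the article's joint two-population scheme; the data are simulated.

```python
import math
import random

def pareto_shape_mle_type2(sample, n):
    """MLE of the Pareto(shape, scale=1) shape parameter from a
    type-II censored sample: the r smallest of n observations."""
    x = sorted(sample)
    r = len(x)
    # total "log-exposure": observed logs plus n-r survivors at x_(r)
    t = sum(math.log(v) for v in x) + (n - r) * math.log(x[-1])
    return r / t

random.seed(1)
true_a, n, r = 2.0, 2000, 1500
full = sorted(random.paretovariate(true_a) for _ in range(n))
ahat = pareto_shape_mle_type2(full[:r], n)
print(round(ahat, 2))  # close to the true shape 2.0
```

With r = 1500 effective observations the estimator's standard error is roughly a/√r ≈ 0.05, so the recovered shape should sit very near 2.0.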
Wang, Ling; van Meerveld, Ilja; Seibert, Jan
2016-04-01
Streamflow isotope samples taken during rainfall-runoff events are very useful for multi-criteria model calibration because they can help decrease parameter uncertainty and improve internal model consistency. However, the number of samples that can be collected and analysed is often restricted by practical and financial constraints. It is, therefore, important to choose an appropriate sampling strategy and to obtain samples that have the highest information content for model calibration. We used the Birkenes hydrochemical model and synthetic rainfall, streamflow and isotope data to explore which samples are most informative for model calibration. Starting with error-free observations, we investigated how many samples are needed to obtain a certain model fit. Based on different parameter sets, representing different catchments, and different rainfall events, we also determined which sampling times provide the most informative data for model calibration. Our results show that simulation performance for models calibrated with the isotopic data from two intelligently selected samples was comparable to simulations based on isotopic data for all 100 time steps. The models calibrated with the intelligently selected samples also performed better than the model calibrations with two benchmark sampling strategies (random selection and selection based on hydrologic information). Surprisingly, samples on the rising limb and at the peak were less informative than expected and, generally, samples taken at the end of the event were most informative. The timing of the most informative samples depends on the proportion of different flow components (baseflow, slow response flow, fast response flow and overflow). For events dominated by baseflow and slow response flow, samples taken at the end of the event after the fast response flow has ended were most informative; when the fast response flow was dominant, samples taken near the peak were most informative. However when overflow
Observer-Based Stabilization of Spacecraft Rendezvous with Variable Sampling and Sensor Nonlinearity
Zhuoshi Li
2013-01-01
This paper addresses the observer-based control problem of spacecraft rendezvous with nonuniform sampling period. The relative dynamic model is based on the classical Clohessy-Wiltshire equation, and sensor nonlinearity and sampling are considered together in a unified framework. The purpose of this paper is to perform an observer-based controller synthesis by using sampled and saturated output measurements, such that the resulting closed-loop system is exponentially stable. A time-dependent Lyapunov functional is developed which depends on time and the upper bound of the sampling period and also does not grow along the input update times. The controller design problem is solved in terms of the linear matrix inequality method, and the obtained results are less conservative than using the traditional Lyapunov functionals. Finally, a numerical simulation example is built to show the validity of the developed sampled-data control strategy.
Thomas, J.M.; Eberhardt, L.L.; Skalski, J.R.; Simmons, M.A.
1984-05-01
As part of a larger study funded by the US Nuclear Regulatory Commission we have been investigating field sampling strategies and compositing as a means of detecting spills or migration at commercial low-level radioactive waste disposal sites. The overall project is designed to produce information for developing guidance on implementing 10 CFR part 61. Compositing (pooling samples) for detection is discussed first, followed by our development of a statistical test to allow a decision as to whether any component of a composite exceeds a prescribed maximum acceptable level. The question of optimal field sampling designs and an Apple computer program designed to show the difficulties in constructing efficient field designs and using compositing schemes are considered. 6 references, 3 figures, 3 tables
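One simple worst-case screening rule for composites illustrates why pooling works for detection (this is a sketch of the general idea, not necessarily the statistical test developed in the study): if k equal-volume aliquots are pooled, a single aliquot can contribute at most k times the composite concentration, so a composite reading below limit/k guarantees no individual sample exceeds the limit.

```python
def composite_screen(composite_conc, k, limit):
    """Worst-case screen for a composite of k equal-volume aliquots.

    A single aliquot can contribute at most k * composite_conc
    (when the other k-1 aliquots contain none of the contaminant),
    so the composite clears the limit only if k * composite_conc < limit.
    Returns True when no individual aliquot can exceed the limit.
    """
    return k * composite_conc < limit

# hypothetical: pooling 4 field samples against a limit of 10 (arbitrary units)
print(composite_screen(2.0, 4, 10))  # 4 * 2.0 = 8 < 10: all clear
print(composite_screen(3.0, 4, 10))  # 4 * 3.0 = 12 >= 10: re-test individually
```

A statistical test, as in the study, would additionally account for measurement error and the expected distribution of concentrations rather than the pure worst case.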
Dresing, N.; Gómez-Herrero, R.; Heber, B.; Klassen, A.; Malandraki, O.; Dröge, W.; Kartavykh, Y.
2014-07-01
Context. In February 2011, the two STEREO spacecraft reached a separation of 180 degrees in longitude, offering a complete view of the Sun for the first time ever. When the full Sun surface is visible, source active regions of solar energetic particle (SEP) events can be identified unambiguously. STEREO, in combination with near-Earth observatories such as ACE or SOHO, provides three well separated viewpoints, which build an unprecedented platform from which to investigate the longitudinal variations of SEP events. Aims: We show an ensemble of SEP events that were observed between 2009 and mid-2013 by at least two spacecraft and that show a remarkably wide particle spread in longitude (wide-spread events). The main selection criterion for these events was a longitudinal separation of at least 80 degrees between the active region and the spacecraft magnetic footpoint for the most widely separated spacecraft. We investigate the events statistically in terms of peak intensities, onset delays, and rise times, and determine the longitudinal spread of the events, i.e., the longitude range filled by SEPs during the events. Energetic electron anisotropies are investigated to distinguish the source and transport mechanisms that lead to the observed wide particle spreads. Methods: According to the anisotropy distributions, we divided the events into three classes corresponding to different source and transport scenarios. One potential mechanism for wide-spread events is efficient perpendicular transport in the interplanetary medium; a competing scenario is a wide particle spread that occurs close to the Sun. In the latter case, the observations at 1 AU during the early phase of the events are expected to show significant anisotropies because of the wide injection range at the Sun and particle focusing during the outward propagation, while in the first case only low anisotropies are anticipated. Results: We find events for both of these scenarios in our sample that match the
Huang Guangli; Nakajima, Hiroshi
2009-01-01
Twenty-four events with looplike structures at 17 and 34 GHz are selected from the flare list of Nobeyama Radioheliograph. We obtained the brightness temperatures at 17 and 34 GHz, the polarization degrees at 17 GHz, and the power-law spectral indices at the radio peak time for one looptop (LT) and two footpoints (FPs) of each event. We also calculated the magnetic field strengths and the column depths of nonthermal electrons in the LT and FPs of each event, using the equations modified from the gyrosynchrotron equations by Dulk. The main statistical results from those data are summarized as follows. (1) The spectral indices, the brightness temperatures at 17 and 34 GHz, the polarization degrees at 17 GHz, the calculated magnetic field strengths, and the calculated column densities of nonthermal electrons are always positively correlated between the LT and the two FPs of the selected events. (2) About one-half of the events have the brightest LT at 17 and 34 GHz. (3) The spectral indices in the two FPs are larger (softer) than those in the corresponding LT in most events. (4) The calculated magnetic field strengths in the two FPs are always larger than those in the corresponding LT. (5) Most of the events have the same positive or negative polarization sense in the LT and the two FPs. (6) The brightness temperatures at 17 and 34 GHz in each of the LT and the two FPs statistically decrease with their spectral indices and the calculated magnetic field strengths, but increase with their calculated column densities of nonthermal electrons. Moreover, we try to discuss the possible causes of the present statistical results.
Hansen, John P
2003-01-01
Healthcare quality improvement professionals need to understand and use inferential statistics to interpret sample data from their organizations. In quality improvement and healthcare research studies all the data from a population often are not available, so investigators take samples and make inferences about the population by using inferential statistics. This three-part series will give readers an understanding of the concepts of inferential statistics as well as the specific tools for calculating confidence intervals for samples of data. This article, Part 2, describes probability, populations, and samples. The uses of descriptive and inferential statistics are outlined. The article also discusses the properties and probability of normal distributions, including the standard normal distribution.
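The confidence-interval machinery outlined in the series can be sketched with the standard normal distribution. The code below is a minimal illustration (the waiting-time numbers are invented, and a t-based interval would be more appropriate for samples this small):

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def ci_mean(sample, confidence=0.95):
    """Normal-approximation confidence interval for a population mean."""
    z = NormalDist().inv_cdf((1 + confidence) / 2)   # e.g. 1.96 for 95%
    m = mean(sample)
    se = stdev(sample) / sqrt(len(sample))           # standard error of the mean
    return (m - z * se, m + z * se)

# hypothetical waiting times (minutes) for 16 clinic visits
times = [12, 15, 9, 14, 11, 13, 16, 10, 12, 14, 13, 11, 15, 12, 10, 13]
lo, hi = ci_mean(times)
print(round(lo, 2), round(hi, 2))  # interval centered on the sample mean 12.5
```

The interpretation is the inferential one described in the article: across repeated samples, about 95% of such intervals would contain the true population mean.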
Savastru, D.; Dontu, Simona; Savastru, Roxana; Sterian, Andreea Rodica
2013-01-01
Our knowledge of our surroundings is obtained through observations and measurements, but both are affected by errors (noise). Therefore one of the first tasks is to try to eliminate the noise by constructing instruments with high accuracy. But any real observed and measured system is characterized by natural limits due to the deterministic nature of the measured information. The present work is dedicated to the identification of these limits. We have analyzed some algorithms for selection and ...
PLANETARY CANDIDATES OBSERVED BY KEPLER IV: PLANET SAMPLE FROM Q1-Q8 (22 MONTHS)
Burke, Christopher J.; Mullally, F.; Rowe, Jason F.; Thompson, Susan E.; Coughlin, Jeffrey L.; Caldwell, Douglas A.; Jenkins, Jon M.; Bryson, Stephen T.; Haas, Michael R.; Batalha, Natalie M.; Borucki, William J.; Christiansen, Jessie L.; Ciardi, David R.; Still, Martin; Barclay, Thomas; Chaplin, William J.; Clarke, Bruce D.; Cochran, William D.; Demory, Brice-Olivier; Esquerdo, Gilbert A.
2014-01-01
We provide updates to the Kepler planet candidate sample based upon nearly two years of high-precision photometry (i.e., Q1-Q8). From an initial list of nearly 13,400 threshold crossing events, 480 new host stars are identified from their flux time series as consistent with hosting transiting planets. Potential transit signals are subjected to further analysis using the pixel-level data, which allows background eclipsing binaries to be identified through small image position shifts during transit. We also re-evaluate Kepler Objects of Interest (KOIs) 1-1609, which were identified early in the mission, using substantially more data to test for background false positives and to find additional multiple systems. Combining the new and previous KOI samples, we provide updated parameters for 2738 Kepler planet candidates distributed across 2017 host stars. From the combined Kepler planet candidates, 472 are new from the Q1-Q8 data examined in this study. The new Kepler planet candidates represent ∼40% of the sample with R_P ∼ 1 R_⊕ and represent ∼40% of the low equilibrium temperature (T_eq < 300 K) sample. We review the known biases in the current sample of Kepler planet candidates relevant to evaluating planet population statistics with the current Kepler planet candidate sample.
Moses E. Emetere
2016-07-01
The problem of under- or overestimating the aerosol loading over Kano is becoming a global challenge. The health effects of extensive aerosol pollution have started to manifest in Kano. The aim of the research is to estimate the aerosol loading and retention over Kano. Thirteen years of aerosol optical depth (AOD) data were obtained from the Multi-angle Imaging SpectroRadiometer (MISR). Statistical tools, as well as an analytically derived model for aerosol loading, were used to obtain the aerosol retention and loading over the area. It was discovered that the average aerosol retention over Kano is 4.9%. The atmospheric constants over Kano were documented. Given the volume of aerosols over Kano, it is necessary to revise the ITU model that relates to signal budgeting.
Statistical characteristics of Doppler spectral width as observed by the conjugate SuperDARN radars
K. Hosokawa
We performed a statistical analysis of the occurrence distribution of Doppler spectral width around the dayside high-latitude ionosphere using data from the conjugate radar pair composed of the CUTLASS Iceland-East radar in the Northern Hemisphere and the SENSU Syowa-East radar in the Southern Hemisphere. Three types of spectral width distribution were identified: (1) an exponential-like distribution at lower magnetic latitudes (below 72°), (2) a Gaussian-like distribution spanning a few degrees of magnetic latitude, centered on 78°, and (3) another type of distribution at higher magnetic latitudes (above 80°). The first two are considered to represent geophysical regimes such as the LLBL and the cusp, respectively, because they are similar to the spectral width distributions within the LLBL and the cusp as classified by Baker et al. (1995). The distribution found above 80° magnetic latitude has been clarified for the first time in this study. This distribution has similarities to the exponential-like distribution of the lower-latitude part, although clear differences also exist in their characteristics. These three spectral width distributions are commonly identified in conjugate hemispheres. The latitudinal transition from one distribution to another exhibits basically the same trend between the two hemispheres. There is, however, an interhemispheric difference in the form of the distribution around the cusp latitudes, such that spectral width values obtained from Syowa-East are larger than those from Iceland-East. On the basis of the spectral width characteristics, the average locations of the cusp and the open/closed field line boundary are estimated statistically.
Key words. Ionosphere (ionosphere-magnetosphere interactions; plasma convection) – Magnetospheric physics (magnetopause, cusp, and boundary layers)
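The abstract distinguishes spectral-width populations by the shape of their occurrence distributions (exponential-like vs. Gaussian-like). A minimal sketch of that kind of shape test, comparing maximum-likelihood exponential and Gaussian fits on synthetic data (this is an illustration of the distribution comparison, not the authors' actual procedure):

```python
import numpy as np

def classify_spectral_width(widths):
    """Label a spectral-width sample 'exponential' or 'gaussian' by
    comparing the log-likelihoods of the two maximum-likelihood fits."""
    w = np.asarray(widths, dtype=float)
    # Exponential MLE: rate = 1/mean; log-likelihood = sum(log(rate) - rate*w)
    rate = 1.0 / w.mean()
    ll_exp = np.sum(np.log(rate) - rate * w)
    # Gaussian MLE: sample mean and standard deviation
    mu, sigma = w.mean(), w.std()
    ll_gauss = np.sum(-0.5 * np.log(2 * np.pi * sigma**2)
                      - (w - mu) ** 2 / (2 * sigma**2))
    return "exponential" if ll_exp > ll_gauss else "gaussian"

rng = np.random.default_rng(0)
print(classify_spectral_width(rng.exponential(50.0, 5000)))    # skewed, LLBL-like sample
print(classify_spectral_width(rng.normal(220.0, 40.0, 5000)))  # symmetric, cusp-like sample
```

In practice one would also test against the latitude bins used in the paper; the sample sizes and parameters here are arbitrary.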
Saini, Komal; Singh, Parminder; Bajwa, Bikramjit Singh
2016-01-01
An LED fluorimeter has been used for microanalysis of uranium concentration in groundwater samples collected from six districts of South West (SW), West (W) and North East (NE) Punjab, India. The average uranium content in water samples from SW Punjab is observed to be higher than the WHO and USEPA recommended safe limit of 30 µg l −1 as well as the AERB proposed limit of 60 µg l −1 , whereas for the W and NE regions of Punjab the average uranium concentration was within the AERB recommended limit of 60 µg l −1 . The average value observed in SW Punjab is around 3–4 times that observed in W Punjab, and more than 17 times the average value observed in the NE region of Punjab. Statistical analyses of the carcinogenic as well as non-carcinogenic risks due to uranium have been carried out for each studied district. - Highlights: • Uranium levels in groundwater samples have been assessed in different regions of Punjab. • A comparative study of carcinogenic and non-carcinogenic effects of uranium has been done. • Wide variation has been found across different geological regions. • South West Punjab is found to be worst affected by uranium contamination of its water. • For the west and north east regions of Punjab, uranium levels in groundwater lay within recommended safe limits.
Continuous quality control of the blood sampling procedure using a structured observation scheme
Seemann, T. L.; Nybo, M.
2015-01-01
Background: An important preanalytical factor is the blood sampling procedure and its adherence to the guidelines, i.e. CLSI and ISO 15189, in order to ensure a consistent quality of the blood collection. Therefore, it is critically important to introduce quality control on this part of the process. … As suggested by the EFLM working group on the preanalytical phase, we introduced continuous quality control of the blood sampling procedure using a structured observation scheme to monitor the quality of blood sampling performed on an everyday basis. Materials and methods: Based on our own routines the EFLM … Conclusion: It is possible to establish continuous quality control of blood sampling. It has been well accepted by the staff, and we have already been able to identify critical areas in the sampling process. We find that continuous auditing increases focus on the quality of blood collection, which ensures …
Suga, Mitsuo; Nishiyama, Hidetoshi; Konyuba, Yuji; Iwamatsu, Shinnosuke; Watanabe, Yoshiyuki; Yoshiura, Chie; Ueda, Takumi; Sato, Chikara
2011-12-01
Although conventional electron microscopy (EM) requires samples to be in vacuum, most chemical and physical reactions occur in liquid or gas. The Atmospheric Scanning Electron Microscope (ASEM) can observe dynamic phenomena in liquid or gas under atmospheric pressure in real time. An electron-permeable window made of pressure-resistant 100 nm-thick silicon nitride (SiN) film, set into the bottom of the open ASEM sample dish, allows an electron beam to be projected from underneath the sample. A detector positioned below captures backscattered electrons. Using the ASEM, we observed the radiation-induced self-organization process of particles, as well as phenomena accompanying volume change, including evaporation-induced crystallization. Using the electrochemical ASEM dish, we observed tree-like electrochemical depositions on the cathode. In silver nitrate solution, we observed silver depositions near the cathode forming incidental internal voids. The heated ASEM dish allowed observation of patterns of contrast in melting and solidifying solder. Finally, to demonstrate its applicability for monitoring and control of industrial processes, silver paste and solder paste were examined at high throughput. High resolution, imaging speed, flexibility, adaptability, and ease of use facilitate the observation of previously difficult-to-image phenomena, and make the ASEM applicable to various fields. Copyright © 2011 Elsevier B.V. All rights reserved.
Dicken, D.; Tadhunter, C.; Axon, D.; Morganti, R.; Inskip, K. J.; Holt, J.; Delgado, R. Gonzalez; Groves, B.
2009-01-01
We present an analysis of deep mid- to far-infrared (MFIR) Spitzer photometric observations of the southern 2Jy sample of powerful radio sources (0.05
Yamamoto, K.; Murata, K.; Kimura, E.; Honda, R.
2006-12-01
number of files and the elapsed time, parallel and distributed processing shortened the elapsed time to about 1/5 of that of sequential processing. On the other hand, sequential processing was faster in another experiment, in which file sizes were smaller than 100 KB. In that case the elapsed time to scan one file is within one second, which implies that disk swapping took place on each node during parallel processing. We note that the operation became unstable when the number of files exceeded 1000. To overcome problem (iii), we developed an original data class. This class supports reading data files in various formats by converting them into a common internal format: it defines schemata for every type of data and encapsulates the structure of the data files. In addition, since this class provides a time re-sampling function, users can easily convert multiple data arrays with different time resolutions onto a common time-resolution array. Finally, using the Gfarm, we achieved a high-performance environment for large-scale statistical data analyses. It should be noted that the present method is effective only when each data file is large enough. At present, we are restructuring the new Gfarm environment with 8 nodes: each node has an Athlon 64 X2 dual-core 2 GHz CPU, 2 GB of memory and 1.2 TB of disk (using RAID0). Our original class is to be implemented on the new Gfarm environment. In the present talk, we show the latest results of applying the present system to analyses of a huge number of satellite observation data files.
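The time re-sampling function described for the data class can be illustrated with a small sketch; the function name and the choice of linear interpolation are assumptions for illustration, not the Gfarm data class API:

```python
import numpy as np

def resample_to_common_grid(series, t_common):
    """Re-sample several (time, value) series with different native
    resolutions onto one common time grid by linear interpolation."""
    return {name: np.interp(t_common, t, v) for name, (t, v) in series.items()}

t_common = np.arange(0.0, 10.0, 1.0)                         # 1-s target grid
series = {
    "mag":    (np.arange(0.0, 10.0, 0.25), np.arange(40)),   # 0.25-s resolution data
    "plasma": (np.arange(0.0, 10.0, 2.0),  np.arange(5)),    # 2-s resolution data
}
out = resample_to_common_grid(series, t_common)
print(out["mag"].shape, out["plasma"].shape)                 # both arrays now (10,)
```

With every instrument's series on the same grid, element-wise statistics across datasets become straightforward, which is the point of the re-sampling step in the abstract.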
Moyé, Lemuel A; Lai, Dejian; Jing, Kaiyan; Baraniuk, Mary Sarah; Kwak, Minjung; Penn, Marc S; Wu, Colon O
2011-01-01
The assumptions that anchor large clinical trials are rooted in smaller, Phase II studies. In addition to specifying the target population, intervention delivery, and patient follow-up duration, physician-scientists who design these Phase II studies must select the appropriate response variables (endpoints). However, endpoint measures can be problematic. If the endpoint assesses the change in a continuous measure over time, then the occurrence of an intervening significant clinical event (SCE), such as death, can preclude the follow-up measurement. Finally, the ideal continuous endpoint measurement may be contraindicated in a fraction of the study patients, a change that requires a less precise substitution in this subset of participants. A score function that is based on the U-statistic can address these issues of 1) intercurrent SCEs and 2) response variable ascertainments that use different measurements of different precision. The scoring statistic is easy to apply, clinically relevant, and provides flexibility for the investigators' prospective design decisions. Sample size and power formulations for this statistic are provided as functions of clinical event rates and effect size estimates that are easy for investigators to identify and discuss. Examples are provided from current cardiovascular cell therapy research.
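A hedged sketch of such a pairwise score function, in the spirit of Finkelstein-Schoenfeld-style comparisons: each treated-control pair is compared on the clinical event first, and the continuous endpoint is used only when the event does not decide the pair. The scoring rules and numbers below are illustrative assumptions, not the paper's exact statistic:

```python
import numpy as np

def pairwise_score(death_time_a, delta_a, death_time_b, delta_b, follow_up=365.0):
    """Score one treated/control pair: the clinical event dominates;
    the continuous endpoint change breaks ties among survivors."""
    a_died = death_time_a < follow_up
    b_died = death_time_b < follow_up
    if a_died != b_died:                  # event decides the comparison
        return 1 if b_died else -1
    if a_died and b_died:                 # both died: later death scores better
        return np.sign(death_time_a - death_time_b)
    return np.sign(delta_a - delta_b)     # neither died: use the measurement

def u_statistic(treated, control, follow_up=365.0):
    """Average pairwise score over all treated x control pairs."""
    scores = [pairwise_score(*a, *b, follow_up) for a in treated for b in control]
    return float(np.mean(scores))

# (death_time, endpoint change); NaN marks a measurement precluded by death
treated = [(400.0, 5.0), (400.0, 2.0), (120.0, np.nan)]
control = [(400.0, 1.0), (90.0, np.nan)]
print(u_statistic(treated, control))
```

Note how the score is defined for every pair even when the follow-up measurement is missing, which is exactly the problem the abstract raises for change-over-time endpoints.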
G. Lointier
2008-02-01
Identifying and tracking the projection of magnetospheric regions on the high-latitude ionosphere is of primary importance for studying the Solar Wind-Magnetosphere-Ionosphere system and for space weather applications. By its unique spatial coverage and temporal resolution, the Super Dual Auroral Radar Network (SuperDARN) provides key parameters, such as the Doppler spectral width, which allows the monitoring of the ionospheric footprint of some magnetospheric boundaries in near real-time. In this study, we present the first results of a statistical approach for monitoring these magnetospheric boundaries. The singular value decomposition is used as a data reduction tool to describe the backscattered echoes with a small set of parameters. One of these is strongly correlated with the Doppler spectral width, and can thus be used as a proxy for it. Based on this, we propose a Bayesian classifier for identifying the spectral width boundary, which is classically associated with the Polar Cap boundary. The results are in good agreement with previous studies. Two advantages of the method are the possibility to apply it in near real-time and its capacity to select the appropriate threshold level for the boundary detection.
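The two-step pipeline described above (SVD for data reduction, then a Bayesian classifier on the reduced parameters) can be sketched on synthetic data. The Gaussian equal-prior classifier and all numbers below are assumptions for illustration, not the paper's implementation:

```python
import numpy as np

# Synthetic "echo records": two classes of 20-feature observations.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (50, 20)),    # class 0, e.g. narrow-spectrum echoes
               rng.normal(2, 1, (50, 20))])   # class 1, e.g. broad-spectrum echoes
y = np.repeat([0, 1], 50)

# Data reduction: keep the first k singular coefficients per record.
U, s, Vt = np.linalg.svd(X - X.mean(axis=0), full_matrices=False)
k = 2
Z = U[:, :k] * s[:k]                          # small set of parameters per echo

def classify(z):
    """Gaussian (equal-prior) Bayesian classification on the reduced
    parameters: pick the class with the higher log-likelihood."""
    lls = []
    for c in (0, 1):
        mu, var = Z[y == c].mean(axis=0), Z[y == c].var(axis=0)
        lls.append(np.sum(-0.5 * np.log(2 * np.pi * var)
                          - (z - mu) ** 2 / (2 * var)))
    return int(np.argmax(lls))

acc = np.mean([classify(Z[i]) == y[i] for i in range(len(y))])
print(acc)
```

On well-separated synthetic classes the reduced two-parameter description is enough for near-perfect classification; the paper's contribution is showing that a comparable reduction works on real SuperDARN echoes.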
Statistical analysis of midlatitude spread F using multi-station digisonde observations
Bhaneja, P.; Earle, G. D.; Bullett, T. W.
2018-01-01
A comprehensive statistical study of midlatitude spread F (MSF) is presented for five midlatitude stations in the North American sector. These stations include Ramey AFB, Puerto Rico (18.5°N, 67.1°W, -14° declination angle), Wallops Island, Virginia (37.95°N, 75.5°W, -11° declination angle), Dyess, Texas (32.4°N, 99.8°W, 6.9° declination angle), Boulder, Colorado (40°N, 105.3°W, 10° declination angle), and Vandenberg AFB, California (34.8°N, 120.5°W, 13° declination angle). Pattern recognition algorithms are used to determine the presence of both range and frequency spread F. Data from 1996 to 2011 are analyzed, covering all of Solar Cycle 23 and the beginning of Solar Cycle 24. Variations with respect to season and solar activity are presented, including the effects of the extended minimum between cycles 23 and 24.
Liu Kai; Wang Yuming; Wang Shui; Shen Chenglong, E-mail: ymwang@ustc.edu.cn [CAS Key Laboratory of Geospace Environment, Department of Geophysics and Planetary Sciences, University of Science and Technology of China, Hefei, Anhui 230026 (China)
2012-01-10
At which height is a prominence inclined to be unstable, or where is the most probable critical height for prominence destabilization? This question was statistically studied based on 362 solar limb prominences well recognized by the Solar Limb Prominence Catcher and Tracker from 2007 April to the end of 2009. We found that about 71% were disrupted prominences (DPs), among which about 42% did not erupt successfully and about 89% experienced a sudden destabilization process. After a comprehensive analysis of the DPs, we discovered the following: (1) Most DPs become unstable at a height of 0.06-0.14 R☉ from the solar surface, and there are two most probable critical heights at which a prominence is very likely to become unstable; the first is 0.13 R☉ and the second is 0.19 R☉. (2) An upper limit for the erupting velocity of eruptive prominences (EPs) exists, which decreases following a power law with increasing height and mass; accordingly, the kinetic energy of EPs has an upper limit too, which decreases as the critical height increases. (3) Stable prominences are generally longer and heavier than DPs, and not higher than 0.4 R☉. (4) About 62% of the EPs were associated with coronal mass ejections (CMEs); but there is no difference in apparent properties between EPs associated with CMEs and those that are not.
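A power-law dependence like the reported velocity-height upper limit is conventionally estimated by a least-squares fit in log-log space. A sketch on synthetic numbers (the heights, velocities, and exponent below are illustrative, not the catalog's data):

```python
import numpy as np

def fit_power_law(h, v):
    """Least-squares fit of v = a * h**b in log-log space:
    log v = log a + b * log h is linear, so np.polyfit suffices."""
    b, log_a = np.polyfit(np.log(h), np.log(v), 1)
    return np.exp(log_a), b

h = np.array([0.06, 0.10, 0.14, 0.19, 0.30])   # critical height [R_sun]
v = 80.0 * h ** -0.9                            # synthetic decreasing power law
a, b = fit_power_law(h, v)
print(round(a, 1), round(b, 2))                 # recovers 80.0 and -0.9
```

For an upper limit rather than a mean relation one would fit the envelope of the scatter (e.g. upper quantiles per height bin) instead of all points, but the log-log transformation is the same.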
Willersrud, Anders; Blanke, Mogens; Imsland, Lars
2015-01-01
Downhole abnormal incidents during oil and gas drilling cause costly delays, and may also potentially lead to dangerous scenarios. Different incidents will cause changes to different parts of the physics of the process. Estimating the changes in physical parameters, and correlating these with the changes expected from various defects, can be used to diagnose faults while in development. This paper shows how estimated friction parameters and flow rates can detect and isolate the type of incident, as well as isolating the position of a defect. Estimates are shown to be subject to non-Gaussian, t-distributed noise, and a dedicated multivariate statistical change detection approach is used that detects and isolates faults by detecting simultaneous changes in estimated parameters and flow rates. The properties of the multivariate diagnosis method are analyzed, and it is shown how detection and false alarm probabilities are assessed and optimized using data-based learning to obtain thresholds for hypothesis testing. Data from a 1400 m horizontal flow loop is used to test the method, and successful diagnosis of the incidents drillstring washout (pipe leakage), lost circulation, gas influx, and drill bit plugging is demonstrated.
Funk, Chris; Verdin, James P.; Husak, Gregory
2007-01-01
Famine early warning in Africa presents unique challenges and rewards. Hydrologic extremes must be tracked and anticipated over complex and changing climate regimes. The successful anticipation and interpretation of hydrologic shocks can initiate effective government response, saving lives and softening the impacts of droughts and floods. While both monitoring and forecast technologies continue to advance, discontinuities between monitoring and forecast systems inhibit effective decision making. Monitoring systems typically rely on high-resolution satellite remote-sensed normalized difference vegetation index (NDVI) and rainfall imagery. Forecast systems provide information on a variety of scales and in a variety of formats. Non-meteorologists are often unable or unwilling to connect the dots between these disparate sources of information. To mitigate these problems, researchers at UCSB's Climate Hazard Group, NASA GIMMS and USGS/EROS are implementing a NASA-funded integrated decision support system that combines the monitoring of precipitation and NDVI with statistical one-to-three month forecasts. We present the monitoring/forecast system, assess its accuracy, and demonstrate its application in food-insecure sub-Saharan Africa.
Improved radiograph measurement inter-observer reliability by use of statistical shape models
Pegg, E.C., E-mail: elise.pegg@ndorms.ox.ac.uk [University of Oxford, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, Nuffield Orthopaedic Centre, Windmill Road, Oxford OX3 7LD (United Kingdom); Mellon, S.J., E-mail: stephen.mellon@ndorms.ox.ac.uk [University of Oxford, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, Nuffield Orthopaedic Centre, Windmill Road, Oxford OX3 7LD (United Kingdom); Salmon, G. [University of Oxford, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, Nuffield Orthopaedic Centre, Windmill Road, Oxford OX3 7LD (United Kingdom); Alvand, A., E-mail: abtin.alvand@ndorms.ox.ac.uk [University of Oxford, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, Nuffield Orthopaedic Centre, Windmill Road, Oxford OX3 7LD (United Kingdom); Pandit, H., E-mail: hemant.pandit@ndorms.ox.ac.uk [University of Oxford, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, Nuffield Orthopaedic Centre, Windmill Road, Oxford OX3 7LD (United Kingdom); Murray, D.W., E-mail: david.murray@ndorms.ox.ac.uk [University of Oxford, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, Nuffield Orthopaedic Centre, Windmill Road, Oxford OX3 7LD (United Kingdom); Gill, H.S., E-mail: richie.gill@ndorms.ox.ac.uk [University of Oxford, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, Nuffield Orthopaedic Centre, Windmill Road, Oxford OX3 7LD (United Kingdom)
2012-10-15
Pre- and post-operative radiographs of patients undergoing joint arthroplasty are often examined for a variety of purposes including preoperative planning and patient assessment. This work examines the feasibility of using active shape models (ASM) to semi-automate measurements from post-operative radiographs for the specific case of the Oxford™ Unicompartmental Knee. Measurements of the proximal tibia and the position of the tibial tray were made using the ASM model and manually. Data were obtained by four observers and one observer took four sets of measurements to allow assessment of the inter- and intra-observer reliability, respectively. The parameters measured were the tibial tray angle, the tray overhang, the tray size, the sagittal cut position, the resection level and the tibial width. Results demonstrated improved reliability (average of 27% and 11.2% increase for intra- and inter-reliability, respectively) and equivalent accuracy (p > 0.05 for compared data values) for all of the measurements using the ASM model, with the exception of the tray overhang (p = 0.0001). Less time (15 s) was required to take measurements using the ASM model compared with manual measurements; this difference was significant. These encouraging results indicate that semi-automated measurement techniques could improve the reliability of radiographic measurements.
Observation of statistics of screening for unruptured cerebral aneurysms in Tochigi Prefecture
Matsumoto, Eiji; Shinoda, Souji; Masuzawa, Toshio; Nakamura, Kouichi
2002-01-01
Screening for unruptured cerebral aneurysms (UCAs) using magnetic resonance imaging (MRI) and angiography (MRA) is prevalent in Japan. To reveal the prevalence of UCAs found during screening, we collected data on the results in 1999 in Tochigi prefecture. In the prefecture, whose population was about 2 million, 26 institutions had been established by 1999, and 5,222 persons had been screened, corresponding to 0.26% of all inhabitants of Tochigi prefecture. Of the 26 institutions, 24 cooperated in this study, and data were collected for 4,961 persons. We investigated the prevalence of UCAs and compared it with that of subarachnoid hemorrhage (SAH) in Japan using existing statistics. UCAs were found in 143 (2.9%) of the 4,961 cases, 69 men and 74 women, with a mean age of 59.2 years. The prevalence of UCAs at screening and the prevalence of SAH in Japan correlate, in that both increase with age. However, after the age of 75, the prevalence of SAH decreases. People found with UCAs at screening were mainly in their 50s, but the number of those found with SAH increased gradually after that age. The rate of screening of women was lower than that of men, although both the prevalence of UCAs at screening and that of SAH are higher in women than in men. We recommend that middle-aged persons, in their 40s and older, should request screening for UCAs. (author)
Schneider, Jesper Wiborg
2012-01-01
In this paper we discuss and question the use of statistical significance tests in relation to university rankings as recently suggested. We outline the assumptions behind and interpretations of statistical significance tests and relate this to examples from the recent SCImago Institutions Rankin...
THE zCOSMOS-SINFONI PROJECT. I. SAMPLE SELECTION AND NATURAL-SEEING OBSERVATIONS
Mancini, C.; Renzini, A. [INAF-OAPD, Osservatorio Astronomico di Padova, Vicolo Osservatorio 5, I-35122 Padova (Italy); Foerster Schreiber, N. M.; Hicks, E. K. S.; Genzel, R.; Tacconi, L.; Davies, R. [Max-Planck-Institut fuer Extraterrestrische Physik, Giessenbachstrasse, D-85748 Garching (Germany); Cresci, G. [Osservatorio Astrofisico di Arcetri (OAF), INAF-Firenze, Largo E. Fermi 5, I-50125 Firenze (Italy); Peng, Y.; Lilly, S.; Carollo, M.; Oesch, P. [Institute of Astronomy, Department of Physics, Eidgenossische Technische Hochschule, ETH Zurich CH-8093 (Switzerland); Vergani, D.; Pozzetti, L.; Zamorani, G. [INAF-Bologna, Via Ranzani, I-40127 Bologna (Italy); Daddi, E. [CEA-Saclay, DSM/DAPNIA/Service d' Astrophysique, F-91191 Gif-Sur Yvette Cedex (France); Maraston, C. [Institute of Cosmology and Gravitation, University of Portsmouth, Dennis Sciama Building, Burnaby Road, PO1 3HE Portsmouth (United Kingdom); McCracken, H. J. [IAP, 98bis bd Arago, F-75014 Paris (France); Bouche, N. [Department of Physics, University of California, Santa Barbara, CA 93106 (United States); Shapiro, K. [Aerospace Research Laboratories, Northrop Grumman Aerospace Systems, Redondo Beach, CA 90278 (United States); and others
2011-12-10
The zCOSMOS-SINFONI project is aimed at studying the physical and kinematical properties of a sample of massive z ∼ 1.4-2.5 star-forming galaxies, through SINFONI near-infrared integral field spectroscopy (IFS), combined with the multiwavelength information from the zCOSMOS (COSMOS) survey. The project is based on one hour of natural-seeing observations per target, and adaptive optics (AO) follow-up for a major part of the sample, which includes 30 galaxies selected from the zCOSMOS/VIMOS spectroscopic survey. This first paper presents the sample selection, and the global physical characterization of the target galaxies from multicolor photometry, i.e., star formation rate (SFR), stellar mass, age, etc. The Hα integrated properties, such as flux, velocity dispersion, and size, are derived from the natural-seeing observations, while the follow-up AO observations will be presented in the next paper of this series. Our sample appears to be well representative of star-forming galaxies at z ∼ 2, covering a wide range in mass and SFR. The Hα integrated properties of the 25 Hα detected galaxies are similar to those of other IFS samples at the same redshifts. Good agreement is found among the SFRs derived from Hα luminosity and other diagnostic methods, provided the extinction affecting the Hα luminosity is about twice that affecting the continuum. A preliminary kinematic analysis, based on the maximum observed velocity difference across the source and on the integrated velocity dispersion, indicates that the sample splits nearly 50-50 into rotation-dominated and velocity-dispersion-dominated galaxies, in good agreement with previous surveys.
Statistic Analyses of the Color Experience According to the Age of the Observer
Hunjet, Anica; Parac-Osterman, Đurđica; Vučaj, Edita
2013-01-01
Psychological experience of color is a real state of communication between the environment and color, and it depends on the light source, the viewing angle, and particularly on the observer and his health condition. Hering's theory, or the theory of opponent processes, supposes that the cones situated in the retina of the eye are not sensitive to three chromatic domains (areas, fields, zones) (red, green and purple-blue), but produce a signal based on the principl...
Second multiannual statistics of the meteorological observations at Ispra (1959-1973)
Bollini, G.
1975-01-01
The mean and extreme values, taken over a period of 15 years at the Meteorological Observatory near Ispra, are combined into 40 tables and 18 graphs for the following elements: cloud amount, sunshine and solar radiation, temperature, relative humidity, water vapour pressure, precipitation, atmospheric phenomena, atmospheric pressure, wind and some other observations. The purpose of this work is to present a climatological base for the requirements of the Joint Nuclear Research Center, while detailed safety analyses are reported on the monographs, enumerated after the index
Zeldis, L J; Jablon, S; Ishida, Morihiro
1963-01-01
Studies in Hiroshima and Nagasaki of a possible carcinogenic effect of radiation in survivors of the atomic bombings are included in programs conducted jointly by the Atomic Bomb Casualty Commission (ABCC) and the Japanese National Institute of Health (JNIH) with the collaboration of physicians and medical organizations in both cities. In order to cope with epidemiologic problems that attend these, in common with other studies of human populations, ABCC-JNIH programs are now oriented to the intensive surveillance of health, morbidity, and mortality principally in known, fixed cohorts of the survivors. The data reported here are derived from 3 interrelated programs in Hiroshima and Nagasaki: the JNIH-ABCC Life Span Study, Tumor Registry Studies, and Joint ABCC-JNIH Pathology Studies. The population samples utilized in these studies are defined along with summarizing pertinent information concerning their exposure to ionizing radiation.
F. Raicich
2003-01-01
For the first time in the Mediterranean Sea various temperature sampling strategies are studied and compared to each other by means of the Observing System Simulation Experiment technique. Their usefulness in the framework of the Mediterranean Forecasting System (MFS) is assessed by quantifying their impact in a Mediterranean General Circulation Model in numerical twin experiments via univariate data assimilation of temperature profiles in summer and winter conditions. Data assimilation is performed by means of the optimal interpolation algorithm implemented in the SOFA (System for Ocean Forecasting and Analysis) code. The sampling strategies studied here include various combinations of eXpendable BathyThermograph (XBT) profiles collected along Volunteer Observing Ship (VOS) tracks, Airborne XBTs (AXBTs) and sea surface temperatures. The actual sampling strategy adopted in the MFS Pilot Project during the Targeted Operational Period (TOP, winter-spring 2000) is also studied. The data impact is quantified by the error reduction relative to the free run. The most effective sampling strategies determine 25–40% error reduction, depending on the season, the geographic area and the depth range. A qualitative relationship can be recognized, in terms of the spread of information from the data positions, between basin circulation features and spatial patterns of the error reduction fields, as a function of different spatial and seasonal characteristics of the dynamics. The largest error reductions are observed when samplings are characterized by extensive spatial coverage, as in the cases of AXBTs and the combination of XBTs and surface temperatures. The sampling strategy adopted during the TOP is characterized by little impact, as a consequence of a sampling frequency that is too low. Key words. Oceanography: general (marginal and semi-enclosed seas); numerical modelling
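The impact metric used here, error reduction relative to the free run, is a one-line computation; a sketch with illustrative RMS errors (the values below are assumptions, not the paper's results):

```python
import numpy as np

def error_reduction(free_run_err, assim_err):
    """Percentage error reduction of an assimilation experiment
    relative to the free (no-assimilation) run."""
    return 100.0 * (free_run_err - assim_err) / free_run_err

free = np.array([0.80, 0.60, 0.50])    # RMS temperature error, free run (per depth range)
axbt = np.array([0.50, 0.42, 0.36])    # RMS error with a sampling strategy assimilated
print(np.round(error_reduction(free, axbt), 1))
```

Computing this per season, region, and depth range, as in the abstract, is a matter of applying the same formula to each stratified error field.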
Huang Guangli; Song Qiwu; Huang Yu
2010-01-01
Two footpoint (FP) emissions are compared in a total of 24 events with loop-like structures imaged by the Nobeyama Radioheliograph (NoRH), which are divided into two groups: when the optically thin radio spectrum in the looptop is harder than those in the two FPs (group 1) and when it is softer than those in at least one FP (group 2). There are always correlative variations of the brightness temperatures and polarization degrees, the spectral indices, the column densities of nonthermal electrons, and magnetic field strengths in the two FPs. The maximum differences of these parameters in the two FPs may reach one or two orders of magnitude (except the polarization degree). The logarithm of the ratio of the magnetic field strengths in the two FPs is always anti-correlated with the logarithms of the ratios of the brightness temperatures in the two FPs, but correlated with the differences of the spectral indices in the two FPs. Only two anti-correlations exist in group 1, between the difference of the absolute polarization degrees in the two FPs and the logarithm of the ratio of the brightness temperatures in the two FPs and between the difference of the spectral indices in the two FPs and the logarithm of the ratio of the column densities of nonthermal electrons in the two FPs. Only two positive correlations appear in group 1, between the difference of the absolute polarization degrees in the two FPs and the relative ratio of magnetic field strengths in the two FPs and between the logarithm of the ratio of the column densities of nonthermal electrons in the two FPs and the logarithm of the ratio of the brightness temperatures in the two FPs. These four statistics in group 2 are just opposite to those in group 1, which may be directly explained by gyrosynchrotron theory. Moreover, the asymmetry of the two FP emissions in group 2 is more evident than that in group 1, which may be explained by two kinds of flare models, respectively, in the two groups of events.
B. M. A. Brito
Over the past few years, considerable research has been conducted using the techniques of mixture delineation and statistical modeling. Through this methodology, applications in various technological fields have been found and optimized, especially in clay technology, leading to greater efficiency and reliability. This work studied the influence of carboxymethylcellulose on the rheological and filtration properties of bentonite dispersions to be applied in water-based drilling fluids, using experimental planning and statistical analysis for clay mixtures. The dispersions were prepared according to Petrobras standard EP-1EP-00011-A, which deals with the testing of water-based drilling fluid viscosifiers for oil prospecting. The clay mixtures were transformed into sodic compounds, and carboxymethylcellulose additives of high and low molar mass were added in order to improve their rheology and filtrate volume. Experimental planning and statistical analysis were used to verify the effect. The regression models were calculated for the relation between the compositions and the following rheological properties: apparent viscosity, plastic viscosity, and filtrate volume. The significance and validity of the models were confirmed. The results showed that the 3D response surfaces of the compositions with high molecular weight carboxymethylcellulose added were the ones that most contributed to the rise in apparent viscosity and plastic viscosity, and that those with low molecular weight were the ones that most helped in the reduction of the filtrate volume. Another important observation is that experimental planning and statistical analysis can be used as an important auxiliary tool to optimize the rheological properties and filtrate volume of bentonite clay dispersions for use in drilling fluids when carboxymethylcellulose is added.
Hayes, Andrew F; Rockwood, Nicholas J
2017-11-01
There have been numerous treatments in the clinical research literature about various design, analysis, and interpretation considerations when testing hypotheses about mechanisms and contingencies of effects, popularly known as mediation and moderation analysis. In this paper we address the practice of mediation and moderation analysis using linear regression in the pages of Behaviour Research and Therapy and offer some observations and recommendations, debunk some popular myths, describe some new advances, and provide an example of mediation, moderation, and their integration as conditional process analysis using the PROCESS macro for SPSS and SAS. Our goal is to nudge clinical researchers away from historically significant but increasingly old school approaches toward modifications, revisions, and extensions that characterize more modern thinking about the analysis of the mechanisms and contingencies of effects. Copyright © 2016 Elsevier Ltd. All rights reserved.
Inferring spatial clouds statistics from limited field-of-view, zenith observations
Sun, C.H.; Thorne, L.R. [Sandia National Labs., Livermore, CA (United States)]
1996-04-01
Many of the Cloud and Radiation Testbed (CART) measurements produce a time series of zenith observations, but spatial averages are often the desired data product. One possible approach to deriving spatial averages from temporal averages is to invoke Taylor's hypothesis where and when it is valid. Taylor's hypothesis states that when the turbulence is small compared with the mean flow, the covariance in time is related to the covariance in space by the speed of the mean flow. For cloud fields, Taylor's hypothesis would apply when the "local" turbulence is small compared with the advective flow (mean wind). The objective of this study is to determine under what conditions Taylor's hypothesis holds or does not hold true for broken cloud fields.
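The mechanics of the Taylor mapping are simple: the temporal autocovariance at lag τ is reinterpreted as the spatial autocovariance at separation r = Uτ. A minimal sketch, with an assumed mean wind speed and a synthetic zenith time series standing in for real CART data:

```python
import numpy as np

def temporal_autocov(x, max_lag):
    """Biased sample autocovariance at lags 0..max_lag."""
    x = np.asarray(x, float) - np.mean(x)
    n = len(x)
    return np.array([np.dot(x[:n - k], x[k:]) / n for k in range(max_lag + 1)])

# Frozen-turbulence (Taylor) mapping: a pattern advected at mean wind U makes
# the covariance at time lag tau equal the covariance at distance r = U * tau.
U = 10.0                                        # mean wind speed (m/s), assumed
dt = 60.0                                       # zenith sampling interval (s)
rng = np.random.default_rng(0)
series = np.cumsum(rng.standard_normal(500))    # synthetic zenith time series
cov_t = temporal_autocov(series, 20)            # covariance vs time lag
space_lags = U * dt * np.arange(21)             # reinterpreted as metres
```

Under the hypothesis, `cov_t[k]` is read as the spatial covariance at separation `space_lags[k]`; the study's question is precisely when this reinterpretation is justified for broken cloud fields.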
Cluster observations of near-Earth magnetospheric lobe plasma densities – a statistical study
K. R. Svenes
2008-09-01
The Cluster mission has enabled a study of the near-Earth magnetospheric lobes throughout the waning part of solar cycle 23. During the first seven years of the mission the satellites crossed this region of space regularly from about July to October. We have obtained new and more accurate plasma densities in this region based on spacecraft potential measurements from the EFW instrument. The plasma density measurements are found by converting the potential measurements using a functional relationship between these two parameters. Our observations have shown that throughout this period a full two thirds of the measurements were contained in the range 0.007–0.092 cm^{−3}, irrespective of solar wind conditions or geomagnetic activity. In fact, the most probable density encountered was 0.047 cm^{−3}, staying roughly constant throughout the entire observation period. The plasma population in this region seems to reflect an equilibrium situation in which the density is independent of the solar wind conditions or geomagnetic activity. However, the high-density tail of the population (n_{e} > 0.2 cm^{−3}) seemed to decrease with the waning solar cycle. This points to a source region influenced by the diminishing solar UV/EUV intensity. Noting that the quiet-time polar wind has just such a development and that it is magnetically coupled to the lobes, it seems reasonable to assume that this is a prominent source for the lobe plasma.
Xu, Y.; Fan, M.; Huang, Z.; Zheng, J.; Chen, L.
2017-12-01
Open biomass burning, which has adverse effects on air quality and human health, is an important source of gas and particulate matter (PM) in China. Current emission estimates of open biomass burning are generally based on a single source (either statistical data or satellite-derived data) and thus contain large uncertainty due to the limitations of the data. In this study, to quantify the amount of open biomass burning in 2015, we established a new estimation method for open biomass burning activity levels by combining bottom-up statistical data and top-down MODIS observations. Three sub-category sources, which used different activity data, were considered. For open crop residue burning, the "best estimate" of activity data was obtained by averaging the statistical data from China statistical yearbooks and satellite observations from the MODIS burned area product MCD64A1, weighted by their uncertainties. For forest and grassland fires, activity levels were represented by the combination of statistical data and the MODIS active fire product MCD14ML. Using fire radiative power (FRP), which is considered a better indicator of active fire level, as the spatial allocation surrogate, coarse gridded emissions were reallocated into 3 km × 3 km grids to obtain a high-resolution emission inventory. Our results showed that emissions of CO, NOx, SO2, NH3, VOCs, PM2.5, PM10, BC and OC in mainland China were 6607, 427, 84, 79, 1262, 1198, 1222, 159 and 686 Gg/yr, respectively. Among all provinces of China, Henan, Shandong and Heilongjiang were the top three contributors to the total emissions. The developed high-resolution open biomass burning emission inventory could support air quality modeling and policy-making for pollution control.
Cheng, Yuan; He, Ke-bin
2015-01-01
Current understanding of organic aerosol (OA) is challenged by the large gap between simulation results and observational data. Based on six campaigns conducted in a representative mega city in China, this study provided an annual perspective of the uncertainties in observational OA data caused by sampling artifacts. Our results suggest that for the commonly-used sampling approach that involves collection of particles on a bare quartz filter, the positive artifact could result in a 20–40 % overestimation of OA concentrations. Based on an evaluation framework that includes four criteria, an activated carbon denuder was demonstrated to be able to effectively eliminate the positive artifact with a long useful time of at least one month, and hence it was recommended to be a good choice for routine measurement of carbonaceous aerosol. - Highlights: • Positive artifact can cause an overestimation of OA concentrations by up to 40%. • It remains a challenge to measure semivolatile OA based on filter sampling. • The positive artifact can be effectively removed by an ACM denuder. • The ACM denuder is small in size, easy to use and multi-functional. • The ACM denuder is recommended for routine measurement of OA. - Accounting for sampling artifacts can help to bridge the gap between simulated and observed OA concentrations.
A method for three-dimensional quantitative observation of the microstructure of biological samples
Wang, Pengfei; Chen, Dieyan; Ma, Wanyun; Wu, Hongxin; Ji, Liang; Sun, Jialin; Lv, Danyu; Zhang, Lu; Li, Ying; Tian, Ning; Zheng, Jinggao; Zhao, Fengying
2009-07-01
Contemporary biology has developed into the era of cell biology and molecular biology, and researchers now try to study the mechanisms of all kinds of biological phenomena at the microcosmic level. Accurate description of the microstructure of biological samples is an exigent need in many biomedical experiments. This paper introduces a method for 3-dimensional quantitative observation of the microstructure of vital biological samples based on two-photon laser scanning microscopy (TPLSM). TPLSM is a novel kind of fluorescence microscopy, which excels in its low optical damage, high resolution, deep penetration depth and suitability for 3-dimensional (3D) imaging. Fluorescently stained samples were observed by TPLSM, and their original shapes were then obtained through 3D image reconstruction. The spatial distribution of all objects in the samples, as well as their volumes, could be derived by image segmentation and mathematical calculation. Thus the 3-dimensionally and quantitatively depicted microstructure of the samples was finally derived. We applied this method to quantitative analysis of the spatial distribution of chromosomes in meiotic mouse oocytes at metaphase, with excellent results.
ARUN, K.
2016-05-01
A modified digital signal processing procedure is described for the on-line estimation of the DC, fundamental and harmonic components of a periodic signal. A frequency-locked loop (FLL) incorporated within the parallel structure of observers is proposed to accommodate a wide range of frequency drift. The error in frequency generated under drifting frequencies has been used for changing the sampling frequency of the composite observer, so that the number of samples per cycle of the periodic waveform remains constant. A standard coupled oscillator with automatic gain control is used as a numerically controlled oscillator (NCO) to generate the enabling pulses for the digital observer. The NCO gives an integer multiple of the fundamental frequency, making it suitable for power quality applications. Another observer, with DC and second-harmonic blocks in the feedback path, acts as a filter and reduces the double-frequency content. A systematic study of the FLL is carried out and a method has been proposed to design the controller. The performance of the FLL is validated through simulation and experimental studies. To illustrate applications of the new FLL, estimation of individual harmonics from a nonlinear load and the design of a variable-sampling resonant controller for a single-phase grid-connected inverter are presented.
Zhang, Qian; Harman, Ciaran J.; Kirchner, James W.
2018-02-01
River water-quality time series often exhibit fractal scaling, which here refers to autocorrelation that decays as a power law over some range of scales. Fractal scaling presents challenges to the identification of deterministic trends because (1) fractal scaling has the potential to lead to false inference about the statistical significance of trends and (2) the abundance of irregularly spaced data in water-quality monitoring networks complicates efforts to quantify fractal scaling. Traditional methods for estimating fractal scaling - in the form of spectral slope (β) or other equivalent scaling parameters (e.g., Hurst exponent) - are generally inapplicable to irregularly sampled data. Here we consider two types of estimation approaches for irregularly sampled data and evaluate their performance using synthetic time series. These time series were generated such that (1) they exhibit a wide range of prescribed fractal scaling behaviors, ranging from white noise (β = 0) to Brown noise (β = 2) and (2) their sampling gap intervals mimic the sampling irregularity (as quantified by both the skewness and mean of gap-interval lengths) in real water-quality data. The results suggest that none of the existing methods fully account for the effects of sampling irregularity on β estimation. First, the results illustrate the danger of using interpolation for gap filling when examining autocorrelation, as the interpolation methods consistently underestimate or overestimate β under a wide range of prescribed β values and gap distributions. Second, the widely used Lomb-Scargle spectral method also consistently underestimates β. A previously published modified form, using only the lowest 5 % of the frequencies for spectral slope estimation, has very poor precision, although the overall bias is small. Third, a recent wavelet-based method, coupled with an aliasing filter, generally has the smallest bias and root-mean-squared error among all methods for a wide range of
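The Lomb-Scargle approach the authors evaluate can be sketched directly: compute the periodogram of an irregularly sampled series and fit the log-log slope of power against frequency to estimate β. The synthetic Brownian path (true β = 2) and the frequency band below are illustrative choices, not the paper's experimental design.

```python
import numpy as np
from scipy.signal import lombscargle

rng = np.random.default_rng(42)

# Irregularly sampled Brown noise (beta = 2): Brownian motion at random times.
t = np.sort(rng.uniform(0.0, 1000.0, 2000))
dt = np.diff(t, prepend=0.0)
y = np.cumsum(np.sqrt(dt) * rng.standard_normal(t.size))

# Lomb-Scargle periodogram over a low-frequency band.
f = np.logspace(-2.3, -1.3, 40)                 # cycles per time unit
power = lombscargle(t, y - y.mean(), 2 * np.pi * f)  # expects angular freqs

# Spectral scaling power ~ f^(-beta), so beta is minus the log-log slope.
slope, _ = np.polyfit(np.log(f), np.log(power), 1)
beta_hat = -slope                                # ~2 here, typically biased low
```

Consistent with the abstract's finding, `beta_hat` from the plain Lomb-Scargle periodogram tends to come out below the prescribed β when sampling is irregular.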
STATISTICS OF FLARING LOOPS OBSERVED BY NOBEYAMA RADIOHELIOGRAPH. II. SPECTRAL EVOLUTION
Huang Guangli; Nakajima, Hiroshi
2009-01-01
The spectral evolution of solar microwave bursts is studied in 10 impulsive events with loop-like structures, selected from the flare list of the Nobeyama Radioheliograph. Most events have a brighter and harder looptop (LT) with a maximum time later than that of at least one of its two footpoints (FPs), and have a common feature of the spectral evolution in the LT and the two FPs. There are five simple impulsive bursts with the well-known pattern of soft-hard-soft or soft-hard-harder (SHH). It is first found that the other five events have multiple subpeaks in their impulsive phase, and mostly show a new feature of hard-soft-hard (HSH) in each subpeak, but the well-known tendency of SHH is still maintained in the total spectral evolution of these events. All of these features in the spectral evolution of the 10 selected events are consistent with the full-Sun observations of the Nobeyama Radio Polarimeters in these events. The new feature of HSH may be explained by the thermal free-free emission before, during, and after these bursts, together with multiple injections of nonthermal electrons, while the SHH pattern over the total duration may be directly caused by the trapping effect.
Lee Tae-Hoon
2016-12-01
In many cases, an $\overline{X}$ control chart based on a performance variable is used in industrial fields. Typically, the control chart monitors the measurements of the performance variable itself. However, if the performance variable is too costly or impossible to measure, and a less expensive surrogate variable is available, the process may be controlled more efficiently using surrogate variables. In this paper, we present a model for the economic statistical design of a VSI (Variable Sampling Interval) $\overline{X}$ control chart using a surrogate variable that is linearly correlated with the performance variable. We derive the total average profit model from an economic viewpoint, apply the model to a Very High Temperature Reactor (VHTR) nuclear fuel measurement system, and derive the optimal result using genetic algorithms. Compared with the control chart based on a performance variable, the proposed model gives a larger expected net income per unit of time in the long run if the correlation between the performance variable and the surrogate variable is relatively high. The proposed model was confined to the sample mean control chart under the assumption that a single assignable cause occurs according to a Poisson process. However, the model may also be extended to other types of control charts using single or multiple assignable-cause assumptions, such as the VSS (Variable Sample Size) $\overline{X}$ control chart, EWMA and CUSUM charts, and so on.
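The VSI logic itself is simple to sketch: a warning zone around the center line triggers a shorter sampling interval, while points near the center line allow a longer one. The zone width, control limit, and interval lengths below are illustrative defaults, not the paper's economically optimized values.

```python
import numpy as np

def vsi_next_interval(xbar, mu0, sigma, n, w=1.0, k=3.0,
                      h_short=0.5, h_long=2.0):
    """Variable-sampling-interval rule for an X-bar chart:
    sample again sooner when the subgroup mean lands in the warning zone.
    w, k are zone half-widths in sigma/sqrt(n) units (assumed values)."""
    z = abs(xbar - mu0) / (sigma / np.sqrt(n))
    if z > k:
        return 0.0              # beyond control limit: investigate immediately
    return h_short if z > w else h_long
```

With subgroup size n = 4, sigma = 1 and center line 10, a mean of 10.0 yields the long interval, 10.8 (z = 1.6) the short one, and 12.0 (z = 4) signals out of control.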
Page, G P; Amos, C I; Boerwinkle, E
1998-04-01
We present a test statistic, the quantitative LOD (QLOD) score, for the testing of both linkage and exclusion of quantitative-trait loci in randomly selected human sibships. As with the traditional LOD score, the boundary values of 3, for linkage, and -2, for exclusion, can be used for the QLOD score. We investigated the sample sizes required for inferring exclusion and linkage, for various combinations of linked genetic variance, total heritability, recombination distance, and sibship size, using fixed-size sampling. The sample sizes required for both linkage and exclusion were not qualitatively different and depended on the percentage of variance being linked or excluded and on the total genetic variance. Information regarding linkage and exclusion in sibships larger than size 2 increased approximately as all possible pairs, n(n-1)/2, up to sibships of size 6. Increasing the recombination distance (θ) between the marker and the trait loci empirically reduced the power for both linkage and exclusion, as a function of approximately (1-2θ)^4.
Rui Xu
2013-01-01
Minimum description length (MDL) based group-wise registration was a state-of-the-art method to determine the corresponding points of 3D shapes for the construction of statistical shape models (SSMs). However, it suffered from the problem that the determined corresponding points did not spread uniformly on the original shapes, since corresponding points were obtained by uniformly sampling the aligned shape on the parameterized space of the unit sphere. We proposed a particle-system based method to obtain adaptive sampling positions on the unit sphere to resolve this problem. Here, a set of particles was placed on the unit sphere to construct a particle system whose energy was related to the distortions of the parameterized meshes. By minimizing this energy, each particle was moved on the unit sphere. When the system became steady, the particles were treated as vertices to build a spherical mesh, which was then relaxed to slightly adjust the vertices to obtain optimal sampling positions. We used 47 cases of (left and right) lungs and 50 cases of livers, (left and right) kidneys, and spleens for evaluation. Experiments showed that the proposed method was able to resolve the problem of the original MDL method, and it performed better in the generalization and specificity tests.
Regnier, S.; Hohenberg, C.M.; Marti, K.; Reedy, R.C.
1979-01-01
New sets of cross sections for the production of krypton isotopes from targets of Rb, Sr, Y, and Zr were constructed, primarily on the basis of experimental excitation functions for Kr production from Y. These cross sections were used to calculate galactic-cosmic-ray and solar-proton production rates for Kr isotopes in the moon. Spallation Kr data obtained from ilmenite separates of rocks 10017 and 10047 are reported. Production rates and isotopic ratios for cosmogenic Kr observed in ten well-documented lunar samples and in ilmenite separates and bulk samples from several lunar rocks with long but unknown irradiation histories were compared with predicted rates and ratios. The agreements were generally quite good. Erosion of rock surfaces affected rates or ratios only for near-surface samples, where solar-proton production is important. There were considerable spreads in predicted-to-observed production rates of 83Kr, due at least in part to uncertainties in chemical abundances. The 78Kr/83Kr ratios were predicted quite well for samples with a wide range of Zr/Sr abundance ratios. The calculated 80Kr/83Kr ratios were greater than the observed ratios when production by the 79Br(n,γ) reaction was included, but were slightly undercalculated if the Br reaction was omitted; these results suggest that Br(n,γ)-produced Kr is not retained well by lunar rocks. The production of 81Kr and 82Kr was overcalculated by approximately 10% relative to 83Kr. Predicted-to-observed 84Kr/83Kr ratios scattered considerably, possibly because of uncertainties in corrections for trapped and fission components and in cross sections for 84Kr production. Most predicted 84Kr and 86Kr production rates were lower than observed. Shielding depths of several Apollo 11 rocks were determined from the measured 78Kr/83Kr ratios of ilmenite separates. 4 figures, 5 tables
van Lier-Walqui, M.; Morrison, H.; Kumjian, M. R.; Prat, O. P.
2016-12-01
Microphysical parameterization schemes have reached an impressive level of sophistication: numerous prognostic hydrometeor categories, and either size-resolved (bin) particle size distributions or multiple prognostic moments of the size distribution. Yet, uncertainty in the model representation of microphysical processes and the effects of microphysics on numerical simulation of weather has not shown an improvement commensurate with the advanced sophistication of these schemes. We posit that this may be caused by unconstrained assumptions of these schemes, such as ad hoc parameter value choices and structural uncertainties (e.g., the choice of a particular form for the size distribution). We present work on the development and observational constraint of a novel microphysical parameterization approach, the Bayesian Observationally-constrained Statistical-physical Scheme (BOSS), which seeks to address these sources of uncertainty. Our framework avoids unnecessary a priori assumptions and instead relies on observations to provide probabilistic constraint of the scheme structure and sensitivities to environmental and microphysical conditions. We harness the rich microphysical information content of polarimetric radar observations to develop and constrain BOSS within a Bayesian inference framework using a Markov chain Monte Carlo sampler (see Kumjian et al., this meeting, for details on development of an associated polarimetric forward operator). Our work shows how knowledge of microphysical processes is provided by polarimetric radar observations of diverse weather conditions, and which processes remain highly uncertain, even after considering observations.
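The Markov chain Monte Carlo machinery behind such observational constraint can be illustrated with a minimal random-walk Metropolis sampler over one hypothetical parameter; the Gaussian "posterior" below is a stand-in for the real radar-constrained likelihood, and all values are illustrative.

```python
import numpy as np

def metropolis(log_post, x0, n_steps, step, rng):
    """Random-walk Metropolis: propose x' = x + step*N(0,1), accept with
    probability min(1, exp(log_post(x') - log_post(x)))."""
    x, lp = x0, log_post(x0)
    chain = np.empty(n_steps)
    for i in range(n_steps):
        xp = x + step * rng.standard_normal()
        lpp = log_post(xp)
        if np.log(rng.uniform()) < lpp - lp:   # Metropolis acceptance test
            x, lp = xp, lpp
        chain[i] = x
    return chain

# Toy "posterior" over one microphysical parameter: N(2.0, 0.5^2).
rng = np.random.default_rng(1)
chain = metropolis(lambda x: -0.5 * ((x - 2.0) / 0.5) ** 2,
                   0.0, 20000, 0.5, rng)
```

After discarding burn-in, the chain's histogram approximates the posterior, giving the probabilistic parameter constraint the abstract describes.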
Kerr, Laura T.; Adams, Aine; O'Dea, Shirley; Domijan, Katarina; Cullen, Ivor; Hennelly, Bryan M.
2014-05-01
Raman microspectroscopy can be applied to the urinary bladder for highly accurate classification and diagnosis of bladder cancer. This technique can be applied in vitro to bladder epithelial cells obtained from urine cytology or in vivo as an "optical biopsy" to provide results in real time with higher sensitivity and specificity than current clinical methods. However, there exists a high degree of variability across experimental parameters which needs to be standardised before this technique can be utilized in an everyday clinical environment. In this study, we investigate different laser wavelengths (473 nm and 532 nm), sample substrates (glass, fused silica and calcium fluoride) and multivariate statistical methods in order to gain insight into how these various experimental parameters impact the sensitivity and specificity of Raman cytology.
Leppäranta, Matti; Lewis, John E; Heini, Anniina; Arvola, Lauri
2018-06-04
Spatial variability, an essential characteristic of lake ecosystems, has often been neglected in field research and monitoring. In this study, we apply spatial statistical methods to the key physics and chemistry variables and chlorophyll a over eight sampling dates in two consecutive years in a large (area 103 km²) eutrophic boreal lake in southern Finland. On the four summer sampling dates, the water body was vertically and horizontally heterogeneous except for color and DOC; on the two winter ice-covered dates DO was vertically stratified, while on the two autumn dates no significant spatial differences in any of the measured variables were found. Chlorophyll a concentration was one order of magnitude lower under the ice cover than in open water. The Moran statistic for spatial correlation was significant for chlorophyll a and NO2+NO3-N in all summer situations and for dissolved oxygen and pH in three cases. In summer, the mass centers of the chemicals were within 1.5 km of the geometric center of the lake, and the 2nd-moment radius ranged over 3.7-4.1 km, compared with 3.9 km for the homogeneous situation. The lateral length scales of the studied variables were 1.5-2.5 km, about 1 km longer in the surface layer. The detected spatial "noise" strongly suggests that besides vertical variation, the horizontal variation in eutrophic lakes in particular should also be considered when these ecosystems are monitored.
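The Moran statistic used for the spatial-correlation test can be sketched for a handful of stations; the transect weight matrix and values below are hypothetical, not the lake's sampling design.

```python
import numpy as np

def morans_i(x, W):
    """Moran's I spatial autocorrelation:
    I = (n / sum(W)) * (z^T W z) / (z^T z), with z the centered values."""
    x = np.asarray(x, float)
    z = x - x.mean()
    n = len(x)
    return (n / W.sum()) * (z @ W @ z) / (z @ z)

# Binary adjacency weights for 6 stations along a transect.
n = 6
W = np.zeros((n, n))
for i in range(n - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0

clustered = [1, 1, 1, 5, 5, 5]     # similar values adjacent -> I > 0
alternating = [1, 5, 1, 5, 1, 5]   # dissimilar values adjacent -> I < 0
```

Values that cluster in space give a significantly positive I, which is the pattern the study reports for chlorophyll a and NO2+NO3-N in summer.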
Guala, M.; Liu, M.
2017-12-01
The kinematics of sediment particles is investigated by non-intrusive imaging methods to provide a statistical description of bedload transport in conditions near the threshold of motion. In particular, we focus on the cyclic transition between motion and rest regimes to quantify the waiting time statistics inferred to be responsible for anomalous diffusion, and so far elusive. Despite obvious limitations in the spatio-temporal domain of the observations, we are able to identify the probability distributions of the particle step time and length, velocity, acceleration, waiting time, and thus distinguish which quantities exhibit well converged mean values, based on the thickness of their respective tails. The experimental results shown here for four different transport conditions highlight the importance of the waiting time distribution and represent a benchmark dataset for the stochastic modeling of bedload transport.
Mitigating Observation Perturbation Sampling Errors in the Stochastic EnKF
Hoteit, Ibrahim; Pham, D.-T.; El Gharamti, Mohamad; Luo, X.
2015-03-17
The stochastic ensemble Kalman filter (EnKF) updates its ensemble members with observations perturbed by noise sampled from the distribution of the observational errors. This has been shown to introduce noise into the system, which may become pronounced when the ensemble size is smaller than the rank of the observational error covariance, as is often the case in real oceanic and atmospheric data assimilation applications. This work introduces an efficient serial scheme to mitigate the impact of observation perturbation sampling in the analysis step of the EnKF, which should provide more accurate ensemble estimates of the analysis error covariance matrices. The new scheme is simple to implement within the serial EnKF algorithm, requiring only the approximation of the EnKF sample forecast error covariance matrix by a matrix of rank one less. The new EnKF scheme is implemented and tested with the Lorenz-96 model. Numerical experiments are conducted to compare its performance with the EnKF and two standard deterministic EnKFs. This study shows that the new scheme enhances the behavior of the EnKF and may lead to better performance than the deterministic EnKFs even when implemented with relatively small ensembles.
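The perturbed-observation update that the scheme above targets can be sketched as follows. This is the textbook stochastic EnKF analysis step on an invented toy state, not the authors' serial mitigation scheme:

```python
import numpy as np

rng = np.random.default_rng(42)

def enkf_analysis(Xf, y, H, R):
    """Textbook stochastic EnKF analysis step: each forecast member is
    updated with its own perturbed copy of the observation vector y.
    Xf is (state dim x ensemble size)."""
    Ne = Xf.shape[1]
    A = Xf - Xf.mean(axis=1, keepdims=True)          # forecast anomalies
    Pf = A @ A.T / (Ne - 1)                          # sample forecast covariance
    K = Pf @ H.T @ np.linalg.inv(H @ Pf @ H.T + R)   # Kalman gain
    # Perturb the observations with noise drawn from N(0, R); this is
    # the sampling step whose noise the serial scheme above mitigates.
    Y = y[:, None] + rng.multivariate_normal(np.zeros(y.size), R, size=Ne).T
    return Xf + K @ (Y - H @ Xf)

# Toy 3-variable state, 50 members, observing the first two components.
Ne = 50
truth = np.array([1.0, 2.0, 3.0])
Xf = truth[:, None] + rng.normal(0.0, 1.0, size=(3, Ne))
H = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
R = 0.1 * np.eye(2)
y = H @ truth + rng.multivariate_normal(np.zeros(2), R)
Xa = enkf_analysis(Xf, y, H, R)
print(np.round(Xa.mean(axis=1), 2))   # analysis mean, pulled toward y
```

With a small ensemble, the sample mean and covariance of the drawn perturbations deviate from 0 and R, which is exactly the sampling error the paper addresses.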
Price-Whelan, Adrian M.; Agüeros, Marcel A. [Department of Astronomy, Columbia University, 550 W 120th Street, New York, NY 10027 (United States); Fournier, Amanda P. [Department of Physics, Broida Hall, University of California, Santa Barbara, CA 93106 (United States); Street, Rachel [Las Cumbres Observatory Global Telescope Network, Inc., 6740 Cortona Drive, Suite 102, Santa Barbara, CA 93117 (United States); Ofek, Eran O. [Benoziyo Center for Astrophysics, Weizmann Institute of Science, 76100 Rehovot (Israel); Covey, Kevin R. [Lowell Observatory, 1400 West Mars Hill Road, Flagstaff, AZ 86001 (United States); Levitan, David; Sesar, Branimir [Division of Physics, Mathematics, and Astronomy, California Institute of Technology, Pasadena, CA 91125 (United States); Laher, Russ R.; Surace, Jason, E-mail: adrn@astro.columbia.edu [Spitzer Science Center, California Institute of Technology, Mail Stop 314-6, Pasadena, CA 91125 (United States)
2014-01-20
Many photometric time-domain surveys are driven by specific goals, such as searches for supernovae or transiting exoplanets, which set the cadence with which fields are re-imaged. In the case of the Palomar Transient Factory (PTF), several sub-surveys are conducted in parallel, leading to non-uniform sampling over its ∼20,000 deg² footprint. While the median 7.26 deg² PTF field has been imaged ∼40 times in the R band, ∼2300 deg² have been observed >100 times. We use PTF data to study the trade-off between searching for microlensing events in a survey whose footprint is much larger than that of typical microlensing searches, but with far-from-optimal time sampling. To examine the probability that microlensing events can be recovered in these data, we test statistics used on uniformly sampled data to identify variables and transients. We find that the von Neumann ratio performs best for identifying simulated microlensing events in our data. We develop a selection method using this statistic and apply it to data from fields with >10 R-band observations, 1.1 × 10⁹ light curves in total, uncovering three candidate microlensing events. We lack simultaneous, multi-color photometry to confirm these as microlensing events. However, their number is consistent with predictions for the event rate in the PTF footprint over the survey's three years of operations, as estimated from near-field microlensing models. This work can help constrain all-sky event rate predictions and tests microlensing signal recovery in large data sets, which will be useful to future time-domain surveys, such as that planned with the Large Synoptic Survey Telescope.
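The von Neumann ratio used for candidate selection is straightforward to compute. The sketch below contrasts a simulated flat light curve with a simulated point-lens (Paczynski) event; all numbers (noise level, impact parameter, timescale) are invented for illustration:

```python
import numpy as np

def von_neumann_ratio(mags):
    """von Neumann ratio: mean squared successive difference over the
    sample variance. It is near 2 for uncorrelated noise and well below
    2 for a smooth, correlated signal such as a microlensing bump."""
    m = np.asarray(mags, dtype=float)
    return np.mean(np.diff(m) ** 2) / np.var(m, ddof=1)

rng = np.random.default_rng(1)
t = np.linspace(-50, 50, 200)                 # days
noise = rng.normal(0, 0.05, t.size)           # 0.05 mag photometric noise
flat = 15.0 + noise                           # constant-brightness star
# Point-lens magnification curve with impact parameter u0 = 0.3 and
# Einstein timescale tE = 10 d (assumed toy values)
u = np.hypot(0.3, t / 10.0)
amp = (u**2 + 2) / (u * np.sqrt(u**2 + 4))
event = 15.0 - 2.5 * np.log10(amp) + noise    # microlensing light curve
print(von_neumann_ratio(flat), von_neumann_ratio(event))
```

The event curve's smooth bump inflates the variance without inflating successive differences, driving the ratio far below the ~2 expected for pure noise; thresholding on this ratio is one simple selection strategy consistent with the description above.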
Estimation of sampling error uncertainties in observed surface air temperature change in China
Hua, Wei; Shen, Samuel S. P.; Weithmann, Alexander; Wang, Huijun
2017-08-01
This study examines the sampling error uncertainties in the monthly surface air temperature (SAT) change in China over recent decades, focusing on the uncertainties of gridded data, national averages, and linear trends. Results indicate that large sampling error variances appear in the station-sparse areas of northern and western China, with maximum values exceeding 2.0 K², while small sampling error variances are found in the station-dense areas of southern and eastern China, with most grid values less than 0.05 K². In general, negative temperature anomalies existed in each month prior to the 1980s, and a warming began thereafter, which accelerated in the early and mid-1990s. An increasing trend in the SAT series was observed for each month of the year, with the largest temperature increase and highest uncertainty of 0.51 ± 0.29 K (10 year)⁻¹ occurring in February and the weakest trend and smallest uncertainty of 0.13 ± 0.07 K (10 year)⁻¹ in August. The sampling error uncertainties in the national average annual mean SAT series are not sufficiently large to alter the conclusion of persistent warming in China. In addition, the sampling error uncertainties in the SAT series differ clearly from those of other uncertainty estimation methods, which is a plausible reason for the inconsistency between our estimate and other studies during this period.
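A trend with an attached uncertainty, in the K (10 year)⁻¹ units used above, can be obtained from ordinary least squares. The sketch below uses synthetic anomaly data (a 0.02 K/yr warming plus noise), not the Chinese SAT series:

```python
import numpy as np

def trend_with_uncertainty(years, temps):
    """OLS linear trend and a ~95% (2-sigma) uncertainty, reported in
    K per decade as in the abstract above."""
    x = np.asarray(years, dtype=float)
    y = np.asarray(temps, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    resid = y - (slope * x + intercept)
    # Standard error of the slope: s / sqrt(Sxx), with s^2 = SSR/(n-2)
    se = np.sqrt(resid.var(ddof=2) / ((x - x.mean()) ** 2).sum())
    return 10 * slope, 10 * 2 * se

rng = np.random.default_rng(3)
years = np.arange(1960, 2011)
# Synthetic SAT anomalies: 0.02 K/yr warming plus interannual noise
temps = 0.02 * (years - 1960) + rng.normal(0, 0.2, years.size)
trend, unc = trend_with_uncertainty(years, temps)
print(f"{trend:.2f} +/- {unc:.2f} K per decade")
```

Note this captures only the regression uncertainty; the sampling error variance discussed in the abstract (from incomplete station coverage) is an additional, separate term.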
Fraley, R. Chris; Vazire, Simine
2014-01-01
The authors evaluate the quality of research reported in major journals in social-personality psychology by ranking those journals with respect to their N-pact Factors (NF): the statistical power of the empirical studies they publish to detect typical effect sizes. Power is a particularly important attribute for evaluating research quality because, relative to studies that have low power, studies that have high power are more likely to (a) provide accurate estimates of effects, (b) produce literatures with low false positive rates, and (c) lead to replicable findings. The authors show that the average sample size in social-personality research is 104 and that the power to detect the typical effect size in the field is approximately 50%. Moreover, they show that there is considerable variation among journals in the sample sizes and power of the studies they publish, with some journals consistently publishing higher-power studies than others. The authors hope that these rankings will be of use to authors who are choosing where to submit their best work, provide hiring and promotion committees with a superior way of quantifying journal quality, and encourage competition among journals to improve their NF rankings. PMID:25296159
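A power figure of this kind can be reproduced approximately with a Fisher z-transform calculation. In the sketch below, the typical effect size r = .21 is an assumption on our part (a commonly cited value for the field), not a number taken from this abstract:

```python
from math import sqrt, erf, atanh

def norm_cdf(z):
    # Standard normal CDF via the error function (stdlib only)
    return 0.5 * (1 + erf(z / sqrt(2)))

def power_correlation(r, n, alpha=0.05):
    """Approximate power of a two-tailed test of H0: rho = 0 for a
    true correlation r at sample size n, via the Fisher z-transform."""
    z_crit = 1.959964                 # two-tailed 5% critical value
    se = 1 / sqrt(n - 3)              # SE of atanh(r) under the transform
    z_effect = atanh(r) / se
    return norm_cdf(z_effect - z_crit) + norm_cdf(-z_effect - z_crit)

# Assumed typical effect (r ~ .21) at the median sample size reported
# above (N = 104): power comes out in the rough neighbourhood of the
# ~50% figure quoted in the abstract.
print(round(power_correlation(0.21, 104), 2))
```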
Kittiwisit, Piyanat; Bowman, Judd D.; Jacobs, Daniel C.; Beardsley, Adam P.; Thyagarajan, Nithyanandan
2018-03-01
We present a baseline sensitivity analysis of the Hydrogen Epoch of Reionization Array (HERA) and its build-out stages to one-point statistics (variance, skewness, and kurtosis) of redshifted 21 cm intensity fluctuation from the Epoch of Reionization (EoR) based on realistic mock observations. By developing a full-sky 21 cm light-cone model, taking into account the proper field of view and frequency bandwidth, utilizing a realistic measurement scheme, and assuming perfect foreground removal, we show that HERA will be able to recover statistics of the sky model with high sensitivity by averaging over measurements from multiple fields. All build-out stages will be able to detect variance, while skewness and kurtosis should be detectable for HERA128 and larger. We identify sample variance as the limiting constraint of the measurements at the end of reionization. The sensitivity can also be further improved by performing frequency windowing. In addition, we find that strong sample variance fluctuation in the kurtosis measured from an individual field of observation indicates the presence of outlying cold or hot regions in the underlying fluctuations, a feature that can potentially be used as an EoR bubble indicator.
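The three one-point statistics named above are simple moments of the pixel distribution. A sketch on an invented toy brightness-temperature field (not the HERA mock observations): a Gaussian background with a few "ionized" zero-temperature regions, which skew the distribution:

```python
import numpy as np

def one_point_stats(field):
    """One-point statistics of a brightness-temperature field:
    variance, skewness, and excess kurtosis of the pixel values."""
    x = np.asarray(field, dtype=float).ravel()
    z = (x - x.mean()) / x.std()
    return x.var(), np.mean(z**3), np.mean(z**4) - 3.0

rng = np.random.default_rng(7)
# Toy field: Gaussian fluctuations around 20 mK with sigma = 5 mK,
# plus ~5% of pixels set to zero to mimic ionized bubbles.
field = rng.normal(20.0, 5.0, size=(128, 128))
field[rng.random(field.shape) < 0.05] = 0.0
var, skew, kurt = one_point_stats(field)
print(round(var, 1), round(skew, 2), round(kurt, 2))
```

The cold regions pull the skewness strongly negative and raise the kurtosis, which is the qualitative behaviour behind the "EoR bubble indicator" idea mentioned above.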
Impact of sampling frequency in the analysis of tropospheric ozone observations
M. Saunois
2012-08-01
Measurements of ozone vertical profiles are valuable for the evaluation of atmospheric chemistry models and contribute to the understanding of the processes controlling the distribution of tropospheric ozone. The longest record of ozone vertical profiles is provided by ozone sondes, which have a typical frequency of 4 to 12 profiles a month. Here we quantify the uncertainty introduced by low-frequency sampling in the determination of means and trends. To do this, the high-frequency MOZAIC (Measurements of OZone, water vapor, carbon monoxide and nitrogen oxides by in-service AIrbus airCraft) profiles over airports, such as Frankfurt, have been subsampled at two typical ozone sonde frequencies of 4 and 12 profiles per month. We found the lowest sampling uncertainty on seasonal means at 700 hPa over Frankfurt, around 5% for a frequency of 12 profiles per month and 10% for 4 profiles per month. However, the uncertainty can reach up to 15 and 29% at the lowest altitude levels. As a consequence, the sampling uncertainty at the lowest frequency could be higher than the typical 10% accuracy of the ozone sondes and should be carefully considered for observation comparison and model evaluation. We found that the 95% confidence limit on the seasonal mean derived from the subsample created is similar to the sampling uncertainty, and we suggest using it as an estimate of the sampling uncertainty. Similar results are found at six other Northern Hemisphere sites. We show that the sampling substantially impacts the inter-annual variability and the trend derived over the period 1998-2008, both in magnitude and in sign, throughout the troposphere. A tropical case is also discussed using the MOZAIC profiles taken over Windhoek, Namibia between 2005 and 2008. For this site, we found that the sampling uncertainty in the free troposphere is around 8 and 12% at 12 and 4 profiles a month, respectively.
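The subsampling experiment described above can be mimicked on any high-frequency series. The sketch below uses a synthetic daily "ozone" series in place of the MOZAIC profiles; the drift and noise amplitudes are invented:

```python
import numpy as np

rng = np.random.default_rng(11)

# Synthetic daily ozone series for one 90-day season: a slow seasonal
# drift plus day-to-day variability (stand-in for MOZAIC profiles).
days = np.arange(90)
ozone = 50 + 0.1 * days + rng.normal(0, 8, days.size)   # ppb

def sampling_uncertainty(series, n_profiles, n_draws=5000):
    """Relative uncertainty (std of subsample means / full-series mean,
    in percent) when only n_profiles randomly timed observations are
    retained per season."""
    means = [rng.choice(series, size=n_profiles, replace=False).mean()
             for _ in range(n_draws)]
    return np.std(means) / series.mean() * 100

for per_month in (4, 12):   # typical ozone-sonde frequencies
    u = sampling_uncertainty(ozone, per_month * 3)   # 3-month season
    print(f"{per_month * 3:2d} profiles/season: {u:.1f}%")
```

As in the study, the 4-profile-per-month frequency roughly doubles the seasonal-mean uncertainty relative to 12 profiles per month.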
Corsaro, Enrico; De Ridder, Joris
2015-09-01
The peak bagging analysis, namely the fitting and identification of single oscillation modes in stars' power spectra, coupled to the very high-quality light curves of red giant stars observed by Kepler, can play a crucial role for studying stellar oscillations of different flavor with an unprecedented level of detail. A thorough study of stellar oscillations would thus allow for deeper testing of stellar structure models and new insights in stellar evolution theory. However, peak bagging inferences are in general very challenging problems due to the large number of observed oscillation modes, hence of free parameters that can be involved in the fitting models. Efficiency and robustness in performing the analysis is what may be needed to proceed further. For this purpose, we developed a new code implementing the Nested Sampling Monte Carlo (NSMC) algorithm, a powerful statistical method well suited for Bayesian analyses of complex problems. In this talk we show the peak bagging of a sample of high signal-to-noise red giant stars by exploiting recent Kepler datasets and a new criterion for the detection of an oscillation mode based on the computation of the Bayesian evidence. Preliminary results for frequencies and lifetimes for single oscillation modes, together with acoustic glitches, are therefore presented.
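Nested sampling itself can be illustrated compactly. The sketch below estimates the Bayesian evidence of a one-dimensional Gaussian toy likelihood (standing in for a peak-bagging model) under a uniform prior, using naive rejection sampling for the constrained prior draws; it is a minimal illustration of the algorithm, not the authors' code:

```python
import numpy as np

rng = np.random.default_rng(2)

def log_like(theta):
    # Toy standard-normal likelihood in place of a peak-bagging model
    return -0.5 * theta**2 - 0.5 * np.log(2 * np.pi)

# Nested sampling: N live points, uniform prior on [-5, 5]
N, n_iter = 100, 600
live = rng.uniform(-5, 5, N)
live_logL = log_like(live)
Z, X_prev = 0.0, 1.0
for i in range(1, n_iter + 1):
    worst = np.argmin(live_logL)
    X = np.exp(-i / N)                          # expected prior-volume shrinkage
    Z += np.exp(live_logL[worst]) * (X_prev - X)
    X_prev = X
    # Replace the worst point by a prior draw above the L threshold
    # (rejection sampling: fine in 1-D, far too slow in real problems)
    while True:
        cand = rng.uniform(-5, 5)
        if log_like(cand) > live_logL[worst]:
            live[worst], live_logL[worst] = cand, log_like(cand)
            break
Z += X_prev * np.mean(np.exp(live_logL))        # remaining-volume term
print(round(Z, 3))   # analytic evidence here is ~0.1 (Gaussian mass / prior width 10)
```

Real implementations replace the rejection step with constrained sampling strategies; the evidence Z is what the Bayesian detection criterion mentioned above compares between models with and without a given mode.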
A statistical study of gravity waves from radiosonde observations at Wuhan (30° N, 114° E), China
S. D. Zhang
2005-03-01
Several works concerning the dynamical and thermal structures and inertial gravity wave activities in the troposphere and lower stratosphere (TLS) from radiosonde observations have been reported before, but these works concentrated on either equatorial or polar regions. In this paper, the background atmosphere and gravity wave activities in the TLS over Wuhan (30° N, 114° E), a medium-latitude region, were statistically studied using data from radiosonde observations on a twice-daily basis at 08:00 and 20:00 LT in the period between 2000 and 2002. The monthly averaged temperature and horizontal winds exhibit the essential dynamic and thermal structures of the background atmosphere. To avoid the extreme values of background winds and temperature in the height range of 11-18 km, we studied gravity waves separately in two height regions, one from the ground surface to 10 km (lower part) and the other within 18-25 km (upper part). In total, 791 and 1165 quasi-monochromatic inertial gravity waves were extracted from our data set for the lower and upper parts, respectively. The gravity wave parameters (intrinsic frequencies, amplitudes, wavelengths, intrinsic phase velocities and wave energies) were calculated and statistically studied. The statistical results revealed that in the lower part 49.4% of gravity waves propagated upward, while the percentage was 76.4% in the upper part. Moreover, the average wave amplitudes and energies are less than those at lower latitudes, which indicates that the gravity wave parameters have a latitudinal dependence. The correlated temporal evolution of the monthly averaged wave energies in the lower and upper parts and a subsequent quantitative analysis strongly suggest that, at the observation site, dynamical instability (strong wind shear) induced by the tropospheric jet is the main excitation source of inertial gravity waves in the TLS.
SWIFT X-RAY OBSERVATIONS OF CLASSICAL NOVAE. II. THE SUPER SOFT SOURCE SAMPLE
Schwarz, Greg J. [American Astronomical Society, 2000 Florida Avenue, NW, Suite 400, Washington, DC 20009-1231 (United States); Ness, Jan-Uwe [XMM-Newton Science Operations Centre, ESAC, Apartado 78, 28691 Villanueva de la Canada, Madrid (Spain); Osborne, J. P.; Page, K. L.; Evans, P. A.; Beardmore, A. P. [Department of Physics and Astronomy, University of Leicester, Leicester LE1 7RH (United Kingdom); Walter, Frederick M. [Department of Physics and Astronomy, Stony Brook University, Stony Brook, NY 11794-3800 (United States); Andrew Helton, L. [SOFIA Science Center, USRA, NASA Ames Research Center, M.S. N211-3, Moffett Field, CA 94035 (United States); Woodward, Charles E. [Minnesota Institute of Astrophysics, 116 Church Street S.E., University of Minnesota, Minneapolis, MN 55455 (United States); Bode, Mike [Astrophysics Research Institute, Liverpool John Moores University, Birkenhead CH41 1LD (United Kingdom); Starrfield, Sumner [School of Earth and Space Exploration, Arizona State University, P.O. Box 871404, Tempe, AZ 85287-1404 (United States); Drake, Jeremy J., E-mail: Greg.Schwarz@aas.org [Smithsonian Astrophysical Observatory, 60 Garden Street, MS 3, Cambridge, MA 02138 (United States)
2011-12-01
The Swift gamma-ray burst satellite is an excellent facility for studying novae. Its rapid response time and sensitive X-ray detector provide an unparalleled opportunity to investigate the previously poorly sampled evolution of novae in the X-ray regime. This paper presents Swift observations of 52 Galactic/Magellanic Cloud novae. We include the X-Ray Telescope (0.3-10 keV) instrument count rates and the UltraViolet and Optical Telescope (1700-8000 Å) filter photometry. Also included in the analysis are the publicly available pointed observations of 10 additional novae from the X-ray archives. This is the largest X-ray sample of Galactic/Magellanic Cloud novae yet assembled, consisting of 26 novae with Super Soft X-ray emission, 19 from Swift observations. The data set shows that the faster novae have an early hard X-ray phase that is usually missing in slower novae. The Super Soft X-ray phase occurs earlier and does not last as long in fast novae compared to slower novae. All the Swift novae with sufficient observations show that novae are highly variable, with rapid variability and different periodicities. In the majority of cases, nuclear burning ceases less than three years after the outburst begins. Previous relationships, such as the nuclear burning duration versus t₂ or the expansion velocity of the ejecta, and nuclear burning duration versus the orbital period, are shown to be poorly correlated with the full sample, indicating that additional factors beyond the white dwarf mass and binary separation play important roles in the evolution of a nova outburst. Finally, we confirm two optical phenomena that are correlated with strong, soft X-ray emission and can be used to further increase the efficiency of X-ray campaigns.
Picchini, Umberto; Forman, Julie Lyng
2016-01-01
In recent years, dynamical modelling has been provided with a range of breakthrough methods to perform exact Bayesian inference. However, it is often computationally unfeasible to apply exact statistical methodologies in the context of large data sets and complex models. This paper considers a nonlinear stochastic differential equation model observed with correlated measurement errors, with an application to protein folding modelling. An approximate Bayesian computation (ABC)-MCMC algorithm is suggested to allow inference for model parameters within reasonable time constraints. The ABC algorithm … applications. A simulation study is conducted to compare our strategy with exact Bayesian inference, the latter resulting two orders of magnitude slower than ABC-MCMC for the considered set-up. Finally, the ABC algorithm is applied to a large protein data set. The suggested methodology is fairly general …
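The flavour of ABC-MCMC can be conveyed with a toy example. Below, a plain Gaussian mean stands in for the intractable SDE model, and the summary statistic, tolerance, prior, and proposal scale are all invented for illustration; this is not the paper's algorithm:

```python
import numpy as np

rng = np.random.default_rng(5)

# "Observed" data from a model with unknown theta
theta_true = 1.5
data = rng.normal(theta_true, 1.0, size=200)
s_obs = data.mean()                          # summary statistic

def simulate(theta):
    # Forward simulation of the model; in ABC this replaces likelihood
    # evaluation entirely.
    return rng.normal(theta, 1.0, size=200).mean()

eps, n_steps = 0.05, 20000
# Initialise from a rejection-ABC draw so the chain starts in a region
# of non-negligible acceptance.
theta = rng.uniform(-3, 3)
while abs(simulate(theta) - s_obs) >= eps:
    theta = rng.uniform(-3, 3)
chain = []
for _ in range(n_steps):
    prop = theta + rng.normal(0, 0.5)
    # Flat prior on [-3, 3] and symmetric proposal: accept iff the
    # simulated summary lands within eps of the observed one.
    if -3 < prop < 3 and abs(simulate(prop) - s_obs) < eps:
        theta = prop
    chain.append(theta)
posterior = np.array(chain[2000:])           # discard burn-in
print(round(posterior.mean(), 2))            # should sit near theta_true
```

The speed advantage reported above comes from each step requiring only a forward simulation, at the cost of an approximation controlled by the tolerance eps and the choice of summary statistics.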
Soft X-Ray Observations of a Complete Sample of X-Ray--selected BL Lacertae Objects
Perlman, Eric S.; Stocke, John T.; Wang, Q. Daniel; Morris, Simon L.
1996-01-01
We present the results of ROSAT PSPC observations of the X-ray-selected BL Lacertae objects (XBLs) in the complete Einstein Extended Medium Sensitivity Survey (EMSS) sample. None of the objects is resolved in their respective PSPC images, but all are easily detected. All BL Lac objects in this sample are well fitted by single power laws. Their X-ray spectra exhibit a variety of spectral slopes, with best-fit energy power-law spectral indices between α = 0.5-2.3. The PSPC spectra of this sample are slightly steeper than those typical of flat radio-spectrum quasars. Because almost all of the individual PSPC spectral indices are equal to or slightly steeper than the overall optical to X-ray spectral indices for these same objects, we infer that BL Lac soft X-ray continua are dominated by steep-spectrum synchrotron radiation from a broad X-ray jet, rather than flat-spectrum inverse Compton radiation linked to the narrower radio/millimeter jet. The softness of the X-ray spectra of these XBLs revives the possibility proposed by Guilbert, Fabian, & McCray (1983) that BL Lac objects are lineless because the circumnuclear gas cannot be heated sufficiently to permit two stable gas phases, the cooler of which would comprise the broad emission-line clouds. Because unified schemes predict that hard self-Compton radiation is beamed only into a small solid angle in BL Lac objects, the steep-spectrum synchrotron tail controls the temperature of the circumnuclear gas at r ≤ 10¹⁸ cm and prevents broad-line cloud formation. We use these new ROSAT data to recalculate the X-ray luminosity function and cosmological evolution of the complete EMSS sample by determining accurate K-corrections for the sample and estimating the effects of variability and the possibility of incompleteness in the sample. Our analysis confirms that XBLs are evolving "negatively," opposite in sense to quasars, with Ve/Va = 0.331±0.060. The statistically significant difference between the values for X
The contribution of simple random sampling to observed variations in faecal egg counts.
Torgerson, Paul R; Paul, Michaela; Lewis, Fraser I
2012-09-10
It has been over 100 years since the classical paper published by Gosset in 1907, under the pseudonym "Student", demonstrated that yeast cells suspended in a fluid and counted with a haemocytometer conform to a Poisson process. Similarly, parasite eggs in a faecal suspension also conform to a Poisson process. Despite this, there are common misconceptions about how to analyse or interpret observations from the McMaster or similar quantitative parasitic diagnostic techniques, widely used for evaluating parasite eggs in faeces. The McMaster technique can easily be shown, from a theoretical perspective, to give variable results that inevitably arise from the random distribution of parasite eggs in a well mixed faecal sample. The Poisson processes that lead to this variability are described, with illustrative examples of the potentially large confidence intervals that can arise from faecal egg counts calculated from the observations on a McMaster slide. Attempts to modify the McMaster technique, or indeed other quantitative techniques, to ensure uniform egg counts are doomed to failure and belie ignorance of Poisson processes. A simple method to immediately identify excess variation/poor sampling from replicate counts is provided. Copyright © 2012 Elsevier B.V. All rights reserved.
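The potentially large confidence intervals follow directly from Poisson statistics. The sketch below computes the exact two-sided interval for a raw McMaster count; the multiplication factor of 50 eggs-per-gram per egg counted is an assumed, typical value, not one taken from this paper:

```python
from math import exp, lgamma, log

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam), summed in log space for stability."""
    return sum(exp(i * log(lam) - lam - lgamma(i + 1)) for i in range(k + 1))

def solve(f, lo, hi, tol=1e-9):
    # Bisection for a monotone f with a sign change on [lo, hi]
    flo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if (f(mid) > 0) == (flo > 0):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def poisson_ci(k, level=0.95):
    """Exact two-sided confidence interval for the Poisson mean given an
    observed count k (the classical 'Garwood' interval)."""
    a = (1 - level) / 2
    lower = 0.0 if k == 0 else solve(
        lambda lam: (1 - poisson_cdf(k - 1, lam)) - a, 1e-9, 5.0 * k + 10)
    upper = solve(lambda lam: poisson_cdf(k, lam) - a, 1e-9, 5.0 * k + 20)
    return lower, upper

# Example: 4 eggs counted on a McMaster slide; assumed multiplication
# factor of 50 epg per egg counted.
k, factor = 4, 50
lo, hi = poisson_ci(k)
print(f"{k * factor} epg, 95% CI ({lo * factor:.0f}, {hi * factor:.0f}) epg")
```

The interval spans roughly a factor of ten for a count of 4, which illustrates why single low McMaster counts should not be over-interpreted.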
Zhang, Chidong [Univ. of Miami, Coral Gables, FL (United States)
2016-08-14
Motivated by the success of the AMIE/DYNAMO field campaign, which collected unprecedented observations of cloud and precipitation from the tropical Indian Ocean in October 2011 - March 2012, this project explored how such observations can be applied to assist the development of global cloud-permitting models through evaluating and correcting model biases in cloud statistics. The main accomplishments of this project fall into four categories: generating observational products for model evaluation, using AMIE/DYNAMO observations to validate global model simulations, using AMIE/DYNAMO observations in numerical studies of cloud-permitting models, and providing leadership in the field. Results from this project provide valuable information for building a seamless bridge between the DOE ASR program's component on process-level understanding of cloud processes in the tropics and the RGCM focus on global variability and regional extremes. In particular, experience gained from this project would be directly applicable to the evaluation and improvement of ACME, especially as it transitions to a non-hydrostatic variable-resolution model.
Brown, Richard J.C., E-mail: richard.brown@npl.co.uk; Webb, William R.; Goddard, Sharon L.
2014-05-01
Highlights: • A matrix effect for CV-AFS measurement of mercury in emissions samples is reported. • This results from the different efficiencies of liberation of reduced mercury. • There is a good correlation between solution density and the size of the effect. • Several methods to overcome the bias are presented and discussed. - Abstract: The observation of a physical matrix effect during the cold vapour generation-atomic fluorescence measurement of mercury in emissions samples is reported. The effect is a result of the different efficiencies of liberation of reduced mercury from solution as the matrix of the solution under test varies. As a result, peak area to peak height ratios decrease as matrix concentration increases, passing through a minimum, before the ratio increases again as matrix concentration further increases. In the test matrices examined (acidified potassium dichromate and sodium chloride solutions) the possible biases caused by differences between the calibration standard matrix and the test sample matrix were as large as 2.8% (relative), representing peak area to peak height ratios for calibration standards and matrix samples of 45 and 43.75, respectively. For the system considered there is a good correlation between the density of the matrix and the point of optimum liberation of dissolved mercury for both matrix types. Several methods employing matrix matching and mathematical correction to overcome the bias are presented and their relative merits discussed; the most promising is the use of peak area, rather than peak height, for quantification.
Y. Narita
2004-07-01
We statistically study various properties of low-frequency waves, such as frequencies, wave numbers, phase velocities, and polarization in the plasma rest frame, in the terrestrial foreshock. Using Cluster observations, the wave telescope or k-filtering technique is applied to investigate wave numbers and rest frame frequencies. We find that most of the foreshock waves propagate upstream along the magnetic field at a phase velocity close to the Alfvén velocity. We identify that frequencies are around 0.1 × Ωcp and wave numbers are around 0.1 × Ωcp/VA, where Ωcp is the proton cyclotron frequency and VA is the Alfvén velocity. Our results confirm the conclusions drawn from ISEE observations and strongly support the existence of Alfvén waves in the foreshock.
Farrell, S. L.; Kurtz, N. T.; Richter-Menge, J.; Harbeck, J. P.; Onana, V.
2012-12-01
Satellite-derived estimates of ice thickness and observations of ice extent over the last decade point to a downward trend in the basin-scale ice volume of the Arctic Ocean. This loss has broad-ranging impacts on the regional climate and ecosystems, as well as implications for regional infrastructure, marine navigation, national security, and resource exploration. New observational datasets at small spatial and temporal scales are now required to improve our understanding of physical processes occurring within the ice pack and advance parameterizations in the next generation of numerical sea-ice models. Airborne and satellite observations of the sea ice are now available at meter-scale resolution or better, providing new details on the properties and morphology of the ice pack across basin scales. For example, the NASA IceBridge airborne campaign routinely surveys the sea ice of the Arctic and Southern Oceans with an advanced sensor suite, including laser and radar altimeters and digital cameras, that together provide high-resolution measurements of sea ice freeboard, thickness, snow depth and lead distribution. Here we present statistical analyses of the ice pack primarily derived from the following IceBridge instruments: the Digital Mapping System (DMS), a nadir-looking, high-resolution digital camera; the Airborne Topographic Mapper, a scanning lidar; and the University of Kansas snow radar, a novel instrument designed to estimate snow depth on sea ice. Together these instruments provide data from which a wide range of sea ice properties may be derived. We provide statistics on lead distribution and spacing, lead width and area, floe size and distance between floes, as well as ridge height, frequency and distribution. The goals of this study are to (i) identify unique statistics that can be used to describe the characteristics of specific ice regions, for example first-year/multi-year ice, diffuse ice edge/consolidated ice pack, and convergent
Olson James M
2006-04-01
Abstract. Background: Alternative splicing of pre-messenger RNA results in RNA variants with combinations of selected exons. It is one of the essential biological functions and regulatory components in higher eukaryotic cells. Some of these variants are detectable with the Affymetrix GeneChip®, which uses multiple oligonucleotide probes (i.e. a probe set), since the target sequences for the multiple probes are adjacent within each gene. Hybridization intensity from a probe correlates with the abundance of the corresponding transcript. Although the multiple-probe feature in the current GeneChip® was designed to assess expression values of individual genes, it also measures transcriptional abundance for a sub-region of a gene sequence. This additional capacity motivated us to develop a method to predict alternative splicing, taking advantage of extensive repositories of GeneChip® gene expression array data. Results: We developed a two-step approach to predict alternative splicing from GeneChip® data. First, we clustered the probes from a probe set into pseudo-exons based on similarity of probe intensities and physical adjacency. A pseudo-exon is defined as a sequence in the gene within which multiple probes have comparable probe intensity values. Second, for each pseudo-exon, we assessed the statistical significance of the difference in probe intensity between two groups of samples. Differentially expressed pseudo-exons are predicted to be alternatively spliced. We applied our method to empirical data generated from GeneChip® Hu6800 arrays, which include 7129 probe sets and twenty probes per probe set. The dataset consists of sixty-nine medulloblastoma (27 metastatic and 42 non-metastatic) samples and four cerebellum samples as normal controls. We predicted that 577 genes would be alternatively spliced when we compared normal cerebellum samples to medulloblastomas, and predicted that thirteen genes would be alternatively spliced when we compared metastatic
Zhang, Z.; Hammel, P.C.; Wigen, P.E.
1996-01-01
We report the observation of a ferromagnetic resonance signal arising from a microscopic (∼20 μm × 40 μm) particle of thin (3 μm) yttrium iron garnet film using magnetic resonance force microscopy (MRFM). The large signal intensity in the resonance spectra suggests that MRFM could become a powerful microscopic ferromagnetic resonance technique with micron or sub-micron resolution. We also observe a very strong nonresonance signal which occurs in the field regime where the sample magnetization readily reorients in response to the modulation of the magnetic field. This signal will be the main noise source in applications where a magnet is mounted on the cantilever. Copyright 1996 American Institute of Physics
Solar Ion Processing of Itokawa Grains: Reconciling Model Predictions with Sample Observations
Christoffersen, Roy; Keller, L. P.
2014-01-01
Analytical TEM observations of Itokawa grains reported to date show complex solar wind ion processing effects in the outer 30-100 nm of pyroxene and olivine grains. The effects include loss of long-range structural order, formation of isolated internal cavities or "bubbles", and other nanoscale compositional/microstructural variations. None of the effects so far described have, however, included complete ion-induced amorphization. To link the array of observed relationships to grain surface exposure times, we have adapted our previous numerical model for progressive solar ion processing effects in lunar regolith grains to the Itokawa samples. The model uses SRIM ion collision damage and implantation calculations within a framework of a constant-deposited-energy model for amorphization. Inputs include experimentally measured amorphization fluences, a π-steradian variable ion incidence geometry required for a rotating asteroid, and a numerical flux-versus-velocity solar wind spectrum.
Kroos, J.; Westkaemper, G.; Stein, J.
1999-01-01
At Salzgitter AG, several monitoring systems have been installed to check scrap transported by rail and by road. At the moment, scrap arriving by ship is reloaded onto wagons and monitored afterwards. In the future, a detection system will be mounted on a crane for a direct check of the scrap upon departure of the ship. Furthermore, at the Salzgitter AG Central Chemical Laboratory, a fully automated gamma spectrometry gauge has been installed to detect possible radioactive contamination of the products. The gamma spectrometer is integrated into the automated OE spectrometry line and tests melting-shop samples after the OE spectrometry has been performed. With this technique the specific activity of selected nuclides and the dose rate are determined. The activity observation is part of the release procedure. The corresponding measurement data are stored in a database for quality management purposes. (author)
Comnes, G.A.; Belden, T.N.; Kahn, E.P.
1995-02-01
The market for long-term bulk power is becoming increasingly competitive and mature. Given that many privately developed power projects have been or are being developed in the US, it is possible to begin to evaluate the performance of the market by analyzing its revealed prices. Using a consistent method, this paper presents levelized contract prices for a sample of privately developed US generation properties. The sample includes 26 projects with a total capacity of 6,354 MW. Contracts are described in terms of their choice of technology, choice of fuel, treatment of fuel price risk, geographic location, dispatchability, expected dispatch niche, and size. The contract price analysis shows that gas technologies clearly stand out as the most attractive. At an 80% capacity factor, coal projects have an average 20-year levelized price of $0.092/kWh, whereas natural gas combined cycle and/or cogeneration projects have an average price of $0.069/kWh. Within each technology type subsample, however, there is considerable variation. Prices for natural gas combustion turbines and one wind project are also presented. A preliminary statistical analysis is conducted to understand the relationship between price and four categories of explanatory factors including product heterogeneity, geographic heterogeneity, economic and technological change, and other buyer attributes (including avoided costs). Because of residual price variation, we are unable to accept the hypothesis that electricity is a homogeneous product. Instead, the analysis indicates that buyer value still plays an important role in the determination of price for competitively-acquired electricity.
Continuous quality control of the blood sampling procedure using a structured observation scheme.
Seemann, Tine Lindberg; Nybo, Mads
2016-10-15
An observational study was conducted using a structured observation scheme to assess compliance with the local phlebotomy guideline, to identify necessary focus items, and to investigate whether adherence to the phlebotomy guideline improved. The questionnaire from the EFLM Working Group for the Preanalytical Phase was adapted to local procedures. A pilot study of three months duration was conducted. Based on this, corrective actions were implemented and a follow-up study was conducted. All phlebotomists at the Department of Clinical Biochemistry and Pharmacology were observed. Three blood collections by each phlebotomist were observed at each session conducted at the phlebotomy ward and the hospital wards, respectively. Error frequencies were calculated for the phlebotomy ward and the hospital wards and for the two study phases. A total of 126 blood drawings by 39 phlebotomists were observed in the pilot study, while 84 blood drawings by 34 phlebotomists were observed in the follow-up study. In the pilot study, the three major error items were hand hygiene (42% error), mixing of samples (22%), and order of draw (21%). Minor significant differences were found between the two settings. After focus on the major aspects, the follow-up study showed significant improvement for all three items at both settings (P < 0.01, P < 0.01, and P = 0.01, respectively). Continuous quality control of the phlebotomy procedure revealed a number of items not conducted in compliance with the local phlebotomy guideline. It supported significant improvements in the adherence to the recommended phlebotomy procedures and facilitated documentation of the phlebotomy quality.
Spiric, Aurelija; Trbovic, Dejana; Vranic, Danijela; Djinovic, Jasna; Petronijevic, Radivoj; Matekalo-Sverak, Vesna
2010-07-05
the second principal component (PC2) is recorded by C18:3 n-3 and C20:3 n-6, being present in higher amounts in the samples treated by the modified Soxhlet extraction, while C22:5 n-3, C20:3 n-3, C22:1 and C20:4, C16 and C18 negatively influence the score values of PC2, showing significantly increased levels in the samples treated by the ASE method. Hotelling's paired T-square test, used on the first three principal components to confirm differences in individual fatty acid content obtained by the ASE and Soxhlet methods in carp muscle, showed a statistically significant difference between these two data sets (T² = 161.308, p < 0.001). Copyright 2010 Elsevier B.V. All rights reserved.
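For reference, a paired Hotelling's T-square test of the kind invoked above can be sketched as follows. This is a generic textbook implementation, not the authors' code; the F-approximation assumes more paired observations than variables (n > p).

```python
import numpy as np
from scipy import stats

def paired_hotelling_t2(x, y):
    """Paired Hotelling's T-square test: x and y are (n, p) matrices of
    the same n objects measured under two treatments (here, e.g., the
    first few principal-component scores under ASE vs. Soxhlet)."""
    d = x - y                           # paired differences
    n, p = d.shape
    dbar = d.mean(axis=0)
    s = np.cov(d, rowvar=False)         # covariance of the differences
    t2 = n * dbar @ np.linalg.solve(s, dbar)
    # T^2 maps to an F statistic with (p, n - p) degrees of freedom
    f = (n - p) / (p * (n - 1)) * t2
    p_value = stats.f.sf(f, p, n - p)
    return t2, p_value
```

A large T² with a small p-value, as in the abstract's (T² = 161.308, p < 0.001), indicates that the two extraction methods yield systematically different multivariate profiles.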
Mohammed, Mohammed A; Panesar, Jagdeep S; Laney, David B; Wilson, Richard
2013-04-01
The use of statistical process control (SPC) charts in healthcare is increasing. The primary purpose of SPC is to distinguish between common-cause variation, which is attributable to the underlying process, and special-cause variation, which is extrinsic to the underlying process. This is important because improvement under common-cause variation requires action on the process, whereas special-cause variation merits an investigation to first find the cause. Nonetheless, when dealing with attribute or count data (e.g., number of emergency admissions) involving very large sample sizes, traditional SPC charts often produce tight control limits with most of the data points appearing outside the control limits. This can give a false impression of common and special-cause variation, and potentially misguide the user into taking the wrong actions. Given the growing availability of large datasets from routinely collected databases in healthcare, there is a need to present a review of this problem (which arises because traditional attribute charts only consider within-subgroup variation) and its solutions (which consider within and between-subgroup variation), which involve the use of the well-established measurements chart and the more recently developed attribute charts based on Laney's innovative approach. We close by making some suggestions for practice.
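As a concrete illustration of the solution the review points to, here is a minimal sketch of control limits for Laney's p' chart, which widens the traditional p-chart limits by a between-subgroup factor sigma_z estimated from the moving range of the z-scores. The function name and details are assumptions based on Laney's published method, not code from the review.

```python
import numpy as np

def laney_p_prime_limits(counts, sizes):
    """Control limits for Laney's p' chart. A traditional p-chart uses
    only within-subgroup (binomial) variation; Laney inflates the limits
    by sigma_z, the between-subgroup variation of the z-scores, estimated
    from the average moving range divided by d2 = 1.128."""
    counts = np.asarray(counts, dtype=float)
    sizes = np.asarray(sizes, dtype=float)
    p = counts / sizes
    pbar = counts.sum() / sizes.sum()
    sigma_p = np.sqrt(pbar * (1 - pbar) / sizes)    # within-subgroup sd
    z = (p - pbar) / sigma_p
    sigma_z = np.mean(np.abs(np.diff(z))) / 1.128   # between-subgroup factor
    ucl = pbar + 3 * sigma_p * sigma_z
    lcl = np.maximum(pbar - 3 * sigma_p * sigma_z, 0.0)
    return pbar, lcl, ucl, sigma_z
```

With very large subgroup sizes, sigma_z is typically well above 1, so the p' limits are much wider than the classic p-chart limits and most points fall back inside them.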
Aslan, B.; Zech, G.
2005-01-01
We introduce the novel concept of statistical energy as a statistical tool. We define statistical energy of statistical distributions in a similar way as for electric charge distributions. Charges of opposite sign are in a state of minimum energy if they are equally distributed. This property is used to check whether two samples belong to the same parent distribution, to define goodness-of-fit tests and to unfold distributions distorted by measurement. The approach is binning-free and especially powerful in multidimensional applications
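A binning-free two-sample test in this spirit can be sketched with the Euclidean-distance form of the energy statistic together with a permutation p-value. Note this is an illustrative variant (Székely's energy distance); the authors' own statistic weights the inter-point distances differently (e.g., logarithmically), and all names below are assumptions.

```python
import numpy as np

def energy_statistic(x, y):
    """Energy statistic between samples x (n, d) and y (m, d): analogous
    to the electrostatic energy of two equal and opposite charge clouds.
    It is near zero when both samples share the same parent distribution."""
    def mean_dist(a, b):
        diff = a[:, None, :] - b[None, :, :]
        return np.sqrt((diff ** 2).sum(-1)).mean()  # V-statistic form
    return 2 * mean_dist(x, y) - mean_dist(x, x) - mean_dist(y, y)

def energy_test(x, y, n_perm=200, seed=0):
    """Permutation p-value for the two-sample energy test."""
    rng = np.random.default_rng(seed)
    observed = energy_statistic(x, y)
    pooled = np.vstack([x, y])
    n = len(x)
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(len(pooled))
        if energy_statistic(pooled[perm[:n]], pooled[perm[n:]]) >= observed:
            count += 1
    return observed, (count + 1) / (n_perm + 1)
```

Because the statistic is built from pairwise distances rather than histogram bins, it extends directly to multidimensional samples, which is where the approach is claimed to be especially powerful.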
Martinez Aviles, G.; Johnston-Hollitt, M.; Ferrari, C.; Venturi, T.; Democles, J.; Dallacasa, D.; Cassano, R.; Brunetti, G.; Giacintucci, S.; Pratt, G. W.; Arnaud, M.; Aghanim, N.; Brown, S.; Douspis, M.; Hurier, J.; Intema, H. T.; Langer, M.; Macario, G.; Pointecouteau, E.
2018-04-01
Aim. A fraction of galaxy clusters host diffuse radio sources whose origins are investigated through multi-wavelength studies of cluster samples. We investigate the presence of diffuse radio emission in a sample of seven galaxy clusters in the largely unexplored intermediate redshift range (0.3 < z). Full tables are available at the CDS via http://cdsarc.u-strasbg.fr (http://130.79.128.5) or via http://cdsarc.u-strasbg.fr/viz-bin/qcat?J/A+A/611/A94
Reiser, I; Lu, Z
2014-01-01
Purpose: Recently, task-based assessment of diagnostic CT systems has attracted much attention. Detection task performance can be estimated using human observers or mathematical observer models. While most models are well established, considerable bias can be introduced when performance is estimated from a limited number of image samples. Thus, the purpose of this work was to assess the effect of sample size on bias and uncertainty of two channelized Hotelling observers and a template-matching observer. Methods: The image data used for this study consisted of 100 signal-present and 100 signal-absent regions-of-interest, which were extracted from CT slices. The experimental conditions included two signal sizes and five different x-ray beam current settings (mAs). Human observer performance for these images was determined in 2-alternative forced choice experiments. These data were provided by the Mayo Clinic in Rochester, MN. Detection performance was estimated from three observer models, including channelized Hotelling observers (CHO) with Gabor or Laguerre-Gauss (LG) channels, and a template-matching observer (TM). Different sample sizes were generated by randomly selecting a subset of image pairs (N = 20, 40, 60, 80). Observer performance was quantified as the proportion of correct responses (PC). Bias was quantified as the relative difference of PC for 20 and 80 image pairs. Results: For N = 100, all observer models predicted human performance across mAs and signal sizes. Bias was 23% for CHO (Gabor), 7% for CHO (LG), and 3% for TM. The relative standard deviation, σ(PC)/PC, at N = 20 was highest for the TM observer (11%) and lowest for the CHO (Gabor) observer (5%). Conclusion: In order to make image quality assessment feasible in clinical practice, a statistically efficient observer model that can predict performance from few samples is needed. Our results identified two observer models that may be suited for this task
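The performance metric used above, proportion of correct responses (PC) in a 2-alternative forced choice task, can be computed from model-observer ratings as below. The bootstrap over image pairs mimics the sample-size study; all function names are illustrative, not from the cited work.

```python
import numpy as np

def proportion_correct(signal_scores, noise_scores):
    """2-AFC proportion correct for a model observer: the fraction of
    signal-present/signal-absent pairs in which the observer rates the
    signal image higher than the noise image (ties count half).
    Numerically identical to the area under the ROC curve."""
    s = np.asarray(signal_scores)[:, None]
    n = np.asarray(noise_scores)[None, :]
    return (s > n).mean() + 0.5 * (s == n).mean()

def bootstrap_pc(signal_scores, noise_scores, n_boot=500, seed=0):
    """Resample image sets to gauge the uncertainty of PC at a given
    sample size (as when comparing N = 20 vs. 80 image pairs)."""
    rng = np.random.default_rng(seed)
    sig = np.asarray(signal_scores)
    noi = np.asarray(noise_scores)
    pcs = []
    for _ in range(n_boot):
        si = rng.integers(0, len(sig), len(sig))
        ni = rng.integers(0, len(noi), len(noi))
        pcs.append(proportion_correct(sig[si], noi[ni]))
    return float(np.mean(pcs)), float(np.std(pcs))
```

The relative standard deviation reported in the abstract, σ(PC)/PC, would then be the bootstrap standard deviation divided by the bootstrap mean at each sample size.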
A.R. Mason; H.G. Paul
1994-01-01
Procedures for monitoring larval populations of the Douglas-fir tussock moth and the western spruce budworm are recommended based on many years' experience in sampling these species in eastern Oregon and Washington. It is shown that statistically reliable estimates of larval density can be made for a population by sampling host trees in a series of permanent plots in a...
Lunsford, M. Leigh; Rowell, Ginger Holmes; Goodson-Espy, Tracy
2006-01-01
We applied a classroom research model to investigate student understanding of sampling distributions of sample means and the Central Limit Theorem in post-calculus introductory probability and statistics courses. Using a quantitative assessment tool developed by previous researchers and a qualitative assessment tool developed by the authors, we…
Oeberg, Karin I.; Qi Chunhua; Andrews, Sean M.; Espaillat, Catherine; Wilner, David J.; Fogel, Jeffrey K. J.; Bergin, Edwin A.; Pascucci, Ilaria; Kastner, Joel H.
2011-01-01
This is the second in a series of papers based on data from DISCS, a Submillimeter Array observing program aimed at spatially and spectrally resolving the chemical composition of 12 protoplanetary disks. We present data on six Southern sky sources (IM Lup, SAO 206462 (HD 135344b), HD 142527, AS 209, AS 205, and V4046 Sgr) which complement the six sources in the Taurus star-forming region reported previously. CO 2-1 and HCO+ 3-2 emission are detected and resolved in all disks and show velocity patterns consistent with Keplerian rotation. Where detected, the emission from DCO+ 3-2, N2H+ 3-2, H2CO 3(03)-2(02) and 4(14)-3(13), HCN 3-2, and CN 2(33/4/2)-1(22/3/1) is also generally spatially resolved. The detection rates are highest toward the M and K stars, while the F star SAO 206462 has only weak CN and HCN emission, and H2CO alone is detected toward HD 142527. These findings, together with the statistics from the previous Taurus disks, support the hypothesis that high detection rates of many small molecules depend on the presence of a cold and protected disk midplane, which is less common around F and A stars compared to M and K stars. Disk-averaged variations in the proposed radiation tracer CN/HCN are found to be small, despite a two-orders-of-magnitude range of spectral types and accretion rates. In contrast, the resolved images suggest that the CN/HCN emission ratio varies with disk radius in at least two of the systems. There are no clear observational differences in the disk chemistry between the classical/full T Tauri disks and transitional disks. Furthermore, the observed line emission does not depend on the measured accretion luminosities or the number of infrared lines detected, which suggests that the chemistry outside of 100 AU is not coupled to the physical processes that drive the chemistry in the innermost few AU.
Wu, Hao-Yi; Hahn, Oliver; Wechsler, Risa H.; Mao, Yao-Yuan; Behroozi, Peter S.
2013-01-01
We present the first results from the RHAPSODY cluster re-simulation project: a sample of 96 'zoom-in' simulations of dark matter halos of 10^(14.8±0.05) h^-1 M_☉, selected from a 1 h^-3 Gpc^3 volume. This simulation suite is the first to resolve this many halos with ∼5 × 10^6 particles per halo in the cluster mass regime, allowing us to statistically characterize the distribution of and correlation between halo properties at fixed mass. We focus on the properties of the main halos and how they are affected by formation history, which we track back to z = 12, over five decades in mass. We give particular attention to the impact of the formation history on the density profiles of the halos. We find that the deviations from the Navarro-Frenk-White (NFW) model and the Einasto model depend on formation time. Late-forming halos tend to have considerable deviations from both models, partly due to the presence of massive subhalos, while early-forming halos deviate less but still significantly from the NFW model and are better described by the Einasto model. We find that the halo shapes depend only moderately on formation time. Departure from spherical symmetry impacts the density profiles through the anisotropic distribution of massive subhalos. Further evidence of the impact of subhalos is provided by analyzing the phase-space structure. A detailed analysis of the properties of the subhalo population in RHAPSODY is presented in a companion paper.
A. Rozhnoi
2008-10-01
In our earlier papers we found an effect of VLF transmitter signal depression over the epicenters of large earthquakes in observations from the French DEMETER satellite, which can be considered a new method of global diagnostics of seismic influence on the ionosphere. In the present paper we investigate the possibility of an interaction between the VLF signal and ionospheric turbulence using an additional characteristic of the VLF signal: spectrum broadening. This characteristic is important for estimating the type of interaction: linear or nonlinear scattering. Our main results are the following:
– There are two zones of increased spectrum broadening, centered near magnetic latitudes Φ=±10° and Φ=±40°. Based on our previous case-study research and ground ionosonde registrations, this is probably evidence of nonlinear (active) scattering of the VLF signal on the ionospheric turbulence. However, the occurrence rate of spectrum broadening in the middle-latitude area is higher than in the near-equatorial zone (~15–20% in comparison with ~100% in the former area), which probably coincides with the rate of ionospheric turbulence.
– From two years of observation statistics in three selected low-latitude regions and one middle-latitude region inside the reception area of the VLF signal from the NWC transmitter, we find that spectrum broadening correlates neither with ion-cyclotron noise (f=150–500 Hz), which possibly means the noise represents the turbulence poorly due to its mixture with natural ELF emission (which correlates with whistlers), nor with magnetic storm activity.
– We find a rather evident correlation of ion-cyclotron frequency noise and of VLF signal depression, and a weak correlation of spectrum broadening, with seismicity in the middle-latitude region over Japan. In the low-latitude regions we do not find such correlations. The statistical decrease of the VLF signal supports our previous case-study results. However, rather weak spectrum broadening
Savalei, Victoria
2010-01-01
Incomplete nonnormal data are common occurrences in applied research. Although these two problems are often dealt with separately by methodologists, they often co-occur. Very little has been written about statistics appropriate for evaluating models with such data. This article extends several existing statistics for complete nonnormal data to…
Y. Narita
2004-07-01
We statistically study various properties of low-frequency waves such as frequencies, wave numbers, phase velocities, and polarization in the plasma rest frame in the terrestrial foreshock. Using Cluster observations, the wave telescope (k-filtering) technique is applied to investigate wave numbers and rest-frame frequencies. We find that most of the foreshock waves propagate upstream along the magnetic field at a phase velocity close to the Alfvén velocity. We identify that frequencies are around 0.1 Ω_cp and wave numbers are around 0.1 Ω_cp/V_A, where Ω_cp is the proton cyclotron frequency and V_A is the Alfvén velocity. Our results confirm the conclusions drawn from ISEE observations and strongly support the existence of Alfvén waves in the foreshock.
Statistics of high-altitude and high-latitude O+ ion outflows observed by Cluster/CIS
A. Korth
2005-07-01
The persistent outflows of O+ ions observed by the Cluster CIS/CODIF instrument were studied statistically in the high-altitude (from 3 up to 11 RE) and high-latitude (from 70 to ~90° invariant latitude, ILAT) polar region. The principal results are: (1) outflowing O+ ions with more than 1 keV are observed above 10 RE geocentric distance and above 85° ILAT; (2) at 6-8 RE geocentric distance, the latitudinal distribution of O+ ion outflow is consistent with velocity filter dispersion from a source equatorward of and below the spacecraft (e.g. the cusp/cleft); (3) however, at 8-12 RE geocentric distance the distribution of O+ outflows cannot be explained by the velocity filter effect alone. The results suggest that additional energization or acceleration processes for outflowing O+ ions occur at high altitudes and high latitudes in the dayside polar region. Keywords. Magnetospheric physics (magnetospheric configuration and dynamics; solar wind-magnetosphere interactions)
Sommer, Philipp; Kaplan, Jed
2016-04-01
Accurate modelling of large-scale vegetation dynamics, hydrology, and other environmental processes requires meteorological forcing on daily timescales. While meteorological data with high temporal resolution are becoming increasingly available, simulations for the future or distant past are limited by lack of data and poor performance of climate models, e.g., in simulating daily precipitation. To overcome these limitations, we may temporally downscale monthly summary data to a daily time step using a weather generator. Parameterization of such statistical models has traditionally been based on a limited number of observations. Recent developments in the archiving, distribution, and analysis of "big data" datasets provide new opportunities for the parameterization of a temporal downscaling model that is applicable over a wide range of climates. Here we parameterize a WGEN-type weather generator using more than 50 million individual daily meteorological observations, from over 10,000 stations covering all continents, based on the Global Historical Climatology Network (GHCN) and Synoptic Cloud Reports (EECRA) databases. Using the resulting "universal" parameterization and driven by monthly summaries, we downscale mean temperature (minimum and maximum), cloud cover, and total precipitation, to daily estimates. We apply a hybrid gamma-generalized Pareto distribution to calculate daily precipitation amounts, which overcomes much of the inability of earlier weather generators to simulate high amounts of daily precipitation. Our globally parameterized weather generator has numerous applications, including vegetation and crop modelling for paleoenvironmental studies.
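The hybrid gamma-generalized Pareto idea mentioned above can be sketched as follows: a gamma body for ordinary wet days and a GPD tail above a threshold for heavy events. This is a simplified illustration of the general technique; the threshold handling, the tail-probability mechanism, and every parameter name are assumptions, not the cited model's actual formulation.

```python
import numpy as np
from scipy import stats

def sample_daily_precip(n_wet, gamma_shape, gamma_scale, threshold,
                        gpd_shape, gpd_scale, p_tail, seed=0):
    """Draw n_wet daily precipitation amounts (e.g., mm) from a hybrid
    distribution: a gamma body for ordinary wet days, capped at
    `threshold`, and a generalized Pareto tail above `threshold` for
    heavy events, drawn with probability `p_tail`."""
    rng = np.random.default_rng(seed)
    amounts = np.empty(n_wet)
    tail = rng.random(n_wet) < p_tail
    # body: gamma draws, capped at the tail threshold
    body = stats.gamma.rvs(gamma_shape, scale=gamma_scale,
                           size=int((~tail).sum()), random_state=rng)
    amounts[~tail] = np.minimum(body, threshold)
    # tail: GPD excesses added on top of the threshold
    exc = stats.genpareto.rvs(gpd_shape, scale=gpd_scale,
                              size=int(tail.sum()), random_state=rng)
    amounts[tail] = threshold + exc
    return amounts
```

The heavy GPD tail is what lets such a generator produce the rare very large daily totals that a pure gamma model systematically underestimates.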
Radar Observations of Asteroid 101955 Bennu and the OSIRIS-REx Sample Return Mission
Nolan, M. C.; Benner, L.; Giorgini, J. D.; Howell, E. S.; Kerr, R.; Lauretta, D. S.; Magri, C.; Margot, J. L.; Scheeres, D. J.
2017-12-01
On September 24, 2023, the OSIRIS-REx spacecraft will return a sample of asteroid (101955) Bennu to the Earth. We chose the target of this mission in part because of the work we did over more than a decade using the Arecibo and Goldstone planetary radars to observe this asteroid. We observed Bennu (then known as 1999 RQ36) at Arecibo and Goldstone in 1999 and 2005, and at Arecibo in 2011. Radar imaging from the first two observing epochs provided a shape and size for Bennu, which greatly simplified mission planning. We know that the spacecraft will encounter a roundish asteroid 500 m in diameter with a distinct equatorial ridge [Nolan et al., 2013]. Bennu does not have the dramatic concavities seen in Itokawa and comet 67P/Churyumov-Gerasimenko, the Hayabusa and Rosetta mission targets, respectively, which would have been obvious in radar imaging. Further radar ranging in 2011 provided a detection of the Yarkovsky effect, allowing us to constrain Bennu's mass and bulk density from radar measurement of non-gravitational forces acting on its orbit [Chesley et al., 2014]. The 2011 observations were particularly challenging, occurring during a management transition at the Arecibo Observatory, and would not have been possible without significant extra cooperation between the old and new managing organizations. As a result, we can predict Bennu's position to within a few km over the next 100 years, until its close encounter with the Earth in 2135. We know its shape to within ± 10 m (1σ) on the long and intermediate axes and ± 52 m on the polar diameter, and its pole orientation to within 5 degrees. The bulk density is 1260 ± 70 kg/m3 and the rotation is retrograde with a 4.297 ± 0.002 h period. The OSIRIS-REx team is using these constraints to preplan the initial stages of proximity operations and dramatically reduce risk. The Figure shows the model and Arecibo radar images from 1999 (left), 2005 (center), and 2011 (right). Bennu is the faint dot near the center of
Segal, I.E.
1989-01-01
The directly observed average apparent magnitude (or in one case, angular diameter) as a function of redshift in each of a number of large complete galaxy samples is compared with the predictions of hypothetical redshift-distance power laws, as a systematic statistical question. Due account is taken of observational flux limits by an entirely objective and reproducible optimal statistical procedure, and no assumptions are made regarding the distribution of the galaxies in space. The laws considered are of the form z ∝ r^p, where r denotes the distance, for p = 1, 2, and 3. The comparative fits of the various redshift-distance laws are similar in all the samples. Overall, the cubic law fits better than the linear law, but each shows substantial systematic deviations from observation. The quadratic law fits extremely well except at high redshifts in some of the samples, where no power law fits closely and the correlation of apparent magnitude with redshift is small or negative. In all cases, the luminosity function required for theoretical prediction was estimated from the sample by the non-parametric procedure ROBUST, whose intrinsic neutrality as programmed was checked by comprehensive computer simulations. (author)
Dombrowski, M. P.; Labelle, J. W.; Kletzing, C.; Bounds, S. R.; Kaeppler, S. R.
2014-12-01
Langmuir-mode electron plasma waves are frequently observed by spacecraft in active plasma environments such as the ionosphere. Ionospheric Langmuir waves may be excited by the bump-on-tail instability generated by impinging beams of electrons traveling parallel to the background magnetic field (B). The Correlation of High-frequencies and Auroral Roar Measurement (CHARM II) sounding rocket was launched into a substorm at 9:49 UT on 17 February 2010, from the Poker Flat Research Range in Alaska. The primary instruments included the University of Iowa Wave-Particle Correlator (WPC), the Dartmouth High-Frequency Experiment (HFE), several charged particle detectors, low-frequency wave instruments, and a magnetometer. The HFE is a receiver system which effectively yields continuous (100% duty cycle) electric-field waveform measurements from 100 kHz to 5 MHz, and which had its detection axis aligned nominally parallel to B. The HFE output was fed on-payload to the WPC, which uses a phase-locked loop to track the incoming wave frequency with the most power, then sorting incoming electrons at eight energy levels into sixteen wave-phase bins. CHARM II encountered several regions of strong Langmuir wave activity throughout its 15-minute flight, and the WPC showed wave-lock and statistically significant particle correlation distributions during several time periods. We show results of an in-depth analysis of the CHARM II WPC data for the entire flight, including statistical analysis of correlations which show evidence of direct interaction with the Langmuir waves, indicating (at various times) trapping of particles and both driving and damping of Langmuir waves by particles. In particular, the sign of the gradient in particle flux appears to correlate with the phase relation between the electrons and the wave field, with possible implications for the wave physics.
Coleman, C.F.; Smith, F.A.; Hughes, A.E.
1976-11-01
This report describes the development of equipment for measuring annihilation line broadening in cylindrical samples a few millimetres in diameter, suitable for use in fatigue testing programs. A detached positron source is employed, allowing the samples to be scanned both longitudinally (resolution approximately 1 cm) and in azimuth. Some of the advantages of and problems associated with this configuration are discussed. The statistical precision of a number of parameters
Relatores, Nicole C.; Newman, Andrew B.; Simon, Joshua D.; Ellis, Richard; Truong, Phuongmai N.; Blitz, Leo
2018-01-01
We present high quality Hα velocity fields for a sample of nearby dwarf galaxies (log M/M⊙ = 8.4-9.8) obtained as part of the Dark Matter in Dwarf Galaxies survey. The purpose of the survey is to investigate the cusp-core discrepancy by quantifying the variation of the inner slope of the dark matter distributions of 26 dwarf galaxies, which were selected as likely to have regular kinematics. The data were obtained with the Palomar Cosmic Web Imager, located on the Hale 5m telescope. We extract rotation curves from the velocity fields and use optical and infrared photometry to model the stellar mass distribution. We model the total mass distribution as the sum of a generalized Navarro-Frenk-White dark matter halo along with the stellar and gaseous components. We present the distribution of inner dark matter density profile slopes derived from this analysis. For a subset of galaxies, we compare our results to an independent analysis based on CO observations. In future work, we will compare the scatter in inner density slopes, as well as their correlations with galaxy properties, to theoretical predictions for dark matter core creation via supernovae feedback.
Barreto, Goncalo; Soininen, Antti; Sillat, Tarvo; Konttinen, Yrjö T; Kaivosoja, Emilia
2014-01-01
Time-of-flight secondary ion mass spectrometry (ToF-SIMS) is increasingly being used in analysis of biological samples. For example, it has been applied to distinguish healthy and osteoarthritic human cartilage. This chapter discusses ToF-SIMS principle and instrumentation including the three modes of analysis in ToF-SIMS. ToF-SIMS sets certain requirements for the samples to be analyzed; for example, the samples have to be vacuum compatible. Accordingly, sample processing steps for different biological samples, i.e., proteins, cells, frozen and paraffin-embedded tissues and extracellular matrix for the ToF-SIMS are presented. Multivariate analysis of the ToF-SIMS data and the necessary data preprocessing steps (peak selection, data normalization, mean-centering, and scaling and transformation) are discussed in this chapter.
New color-photographic observation of thermoluminescence from sliced rock samples
Hashimoto, Tetsuo; Kimura, Kenichi; Koyanagi, Akira; Takahashi, Kuniaki; Sotobayashi, Takeshi
1983-01-01
A new observation technique has been established for thermoluminescence photography using extremely high-sensitivity color films. With future application to the geological fields in mind, a granite was selected as the test material. The sliced specimens (0.5-0.7 mm in thickness), which were irradiated with a 60Co source, were mounted on a heater fitted with a thermocouple connected to a microcomputer for measuring the temperature. The samples were heated in the temperature range of 80-400 °C while the camera shutter, controlled by the microcomputer, was operated. Four commercially available films (Kodak-1000 (ASA), -400, Sakura-400, Fuji-400) gave detectable color images of artificial thermoluminescence above a total absorbed dose of 880 Gy (88 krad). Specimens irradiated up to 8.4 kGy (840 krad) made it easy to distinguish the distinct appearance of the thermoluminescence images depending on the white mineral constituents. Moreover, these color images changed with the heating temperature. Sakura-400 film proved to give the most colorful images in terms of color tone, although Kodak-1000 film showed the highest sensitivity. Using Kodak-1000, it was found that a characteristic color image due to natural thermoluminescence was clearly observed on the Precambrian granite, which had been exposed to natural radiation alone since its formation. This simple technique, which yields surface information reflecting impurities and local crystal defects in addition to small mineral constituents, was named the thermoluminescence color imaging (TLCI) technique by the authors, and its versatile applications are discussed. (author)
Kyle, Jennifer E.; Casey, Cameron P.; Zink, Erika M.; Kim, Young-Mo; Zheng, Xueyun; Monroe, Matthew E.; Weitz, Karl K.; Bloodsworth, Kent J.; Orton, Daniel J.; Ibrahim, Yehia M.; Moore, Ronald J.; Smith, Richard D.; Burnum-Johnson, Kristin E.; Baker, Erin S. (Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland, WA, USA); Stratton, Kelly G. (National Security Directorate, Pacific Northwest National Laboratory, Richland, WA, USA); Lee, Christine G. (Department of Medicine, Bone and Mineral Unit, Oregon Health and Science University, Portland, OR, USA; Research Service, Portland Veterans Affairs Medical Center, Portland, OR, USA); Pedersen, Catherine; Orwoll, Eric (Department of Medicine, Bone and Mineral Unit, Oregon Health and Science University, Portland, OR, USA)
2017-02-05
The use of dried blood spots (DBS) has many advantages over traditional plasma and serum samples, such as the smaller blood volume required, storage at room temperature, and the ability to sample in remote locations. However, understanding the robustness of different analytes in DBS samples is essential, especially in older samples collected for longitudinal studies. Here we analyzed DBS samples collected in 2000-2001 and stored at room temperature and compared them to matched serum samples stored at -80°C to determine if they could be effectively used as specific time points in a longitudinal study following metabolic disease. Four hundred small molecules were identified in both the serum and DBS samples using gas chromatography-mass spectrometry (GC-MS), liquid chromatography-MS (LC-MS) and LC-ion mobility spectrometry-MS (LC-IMS-MS). The identified polar metabolites overlapped well between the sample types, though only one statistically significant polar metabolite in a case-control study was conserved, indicating that degradation in the DBS samples affects quantitation. Differences in the lipid identifications indicated that some oxidation occurs in the DBS samples. However, thirty-six statistically significant lipids correlated in both sample types, indicating that lipid quantitation was more stable across the sample types.
Liu, Xiaofeng
2003-01-01
This article considers optimal sample allocation between the treatment and control condition in multilevel designs when the costs per sampling unit vary due to treatment assignment. Optimal unequal allocation may reduce the cost from that of a balanced design without sacrificing any power. The optimum sample allocation ratio depends only on the…
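The cost-driven allocation described here follows a standard design result: for comparing two means under a fixed budget, the variance-minimizing ratio is n_t/n_c = sqrt(c_c/c_t). The sketch below illustrates that textbook single-level rule in Python (not the article's multilevel derivation); the budget and cost figures are invented.

```python
import math

def optimal_allocation(budget, cost_treat, cost_ctrl):
    """Split a fixed budget between treatment and control so that
    Var(mean_t - mean_c), proportional to 1/n_t + 1/n_c, is minimized.
    Variance-minimizing ratio: n_t / n_c = sqrt(cost_ctrl / cost_treat)."""
    ratio = math.sqrt(cost_ctrl / cost_treat)  # treated units per control unit
    n_ctrl = budget / (cost_treat * ratio + cost_ctrl)
    n_treat = ratio * n_ctrl
    return n_treat, n_ctrl

# Invented costs: a treated unit costs four times a control unit,
# so the optimum assigns half as many treated as control units.
n_t, n_c = optimal_allocation(budget=10000, cost_treat=400, cost_ctrl=100)
```

With these numbers the unequal design achieves a smaller variance than spending the same budget on a balanced design, which is the point the abstract makes.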
Sara E Moore
Improving the treatment of trauma, a leading cause of death worldwide, is of great clinical and public health interest. This analysis introduces flexible statistical methods for estimating center-level effects on individual outcomes in the context of highly variable patient populations, such as those of the PRospective, Observational, Multi-center Major Trauma Transfusion study. Ten US level I trauma centers enrolled a total of 1,245 trauma patients who survived at least 30 minutes after admission and received at least one unit of red blood cells. Outcomes included death, multiple organ failure, substantial bleeding, and transfusion of blood products. The centers involved were classified as either large- or small-volume based on the number of massive transfusion patients enrolled during the study period. We focused on estimation of parameters inspired by causal inference, specifically estimated impacts on patient outcomes related to the volume of the trauma hospital that treated them. We defined this association as the change in mean outcomes of interest that would be observed if, contrary to fact, subjects from large-volume sites were treated at small-volume sites (the effect of treatment among the treated). We estimated this parameter using three different methods, some of which use data-adaptive machine learning tools to derive the outcome models, minimizing residual confounding by reducing model misspecification. Unadjusted and adjusted estimates sometimes differed dramatically, demonstrating the need to account for differences in patient characteristics in center comparisons. In addition, the estimators based on robust adjustment methods showed potential impacts of hospital volume. For instance, we estimated a survival benefit for patients who were treated at large-volume sites, which was not apparent in simpler, unadjusted comparisons. By removing arbitrary modeling decisions from the estimation process and concentrating
Liu Huigen; Zhou Jilin; Wang Su
2011-01-01
During the late stage of planet formation, when Mars-sized cores appear, interactions among planetary cores can excite their orbital eccentricities, accelerate their merging, and thus sculpt their final orbital architecture. This study contributes to the final assembling of planetary systems with N-body simulations, including the type I or II migration of planets and gas accretion of massive cores in a viscous disk. Statistics on the final distributions of planetary masses, semimajor axes, and eccentricities are derived and are comparable to those of the observed systems. Our simulations predict some new orbital signatures of planetary systems around solar mass stars: 36% of the surviving planets are giant planets (>10 M⊕). Most of the massive giant planets (>30 M⊕) are located at 1-10 AU. Terrestrial planets are distributed more or less evenly, some in highly eccentric orbits (e > 0.3-0.4). The average eccentricity (∼0.15) of the giant planets (>10 M⊕) is greater than that (∼0.05) of the terrestrial planets (≤10 M⊕). A planetary system with more planets tends to have smaller planet masses and orbital eccentricities on average.
Investigation of a sample of carbon-enhanced metal-poor stars observed with FORS and GMOS
Caffau, E.; Gallagher, A. J.; Bonifacio, P.; Spite, M.; Duffau, S.; Spite, F.; Monaco, L.; Sbordone, L.
2018-06-01
Aims: Carbon-enhanced metal-poor (CEMP) stars represent a sizeable fraction of all known metal-poor stars in the Galaxy. Their formation and composition remain a significant topic of investigation within the stellar astrophysics community. Methods: We analysed a sample of low-resolution spectra of 30 dwarf stars, obtained using the visual and near-UV FOcal Reducer and low dispersion Spectrograph for the Very Large Telescope (FORS/VLT) of the European Southern Observatory (ESO) and the Gemini Multi-Object Spectrographs (GMOS) at the GEMINI telescope, to derive their metallicity and carbon abundance. Results: We derived C and Ca from all spectra, and Fe and Ba from the majority of the stars. Conclusions: We have extended the population statistics of CEMP stars and have confirmed that, in general, stars with a high C abundance belonging to the high-C band show a high Ba content (CEMP-s or -r/s), while stars with a normal C abundance, or that are C-rich but belong to the low-C band, are normal in Ba (CEMP-no). Based on observations made with ESO Telescopes at the La Silla Paranal Observatory under programme ID 099.D-0791. Based on observations obtained at the Gemini Observatory (processed using the Gemini IRAF package), which is operated by the Association of Universities for Research in Astronomy, Inc., under a cooperative agreement with the NSF on behalf of the Gemini partnership: the National Science Foundation (United States), the National Research Council (Canada), CONICYT (Chile), Ministerio de Ciencia, Tecnología e Innovación Productiva (Argentina), and Ministério da Ciência, Tecnologia e Inovação (Brazil). Tables 1 and 2 are also available at the CDS via anonymous ftp to http://cdsarc.u-strasbg.fr (ftp://130.79.128.5) or via http://cdsarc.u-strasbg.fr/viz-bin/qcat?J/A+A/614/A68
Ghisellini, G.; Maraschi, L.; Treves, A.; Tanzi, E. G.; Milano Universita, Italy; CNR, Istituto di Fisica Cosmica, Milan, Italy)
1986-01-01
All blazars observed with the IUE are studied and shown to form a well-defined subgroup according to their spectral properties. These properties are discussed with respect to theoretical models and are compared with those of quasars. Radio, ultraviolet, and X-ray fluxes are used to construct composite spectral indices, and systematic differences between X-ray selected and otherwise selected objects are discussed. It is confirmed that X-ray selected objects have flatter overall spectra, and are therefore weaker radio emitters relative to their X-ray emission than objects selected otherwise. It is found that X-ray selected blazars have the same average X-ray luminosity as blazars selected otherwise and are underluminous at UV and radio frequencies. This finding is used to argue that the radio-weak, X-ray selected BL Lac objects are, in terms of space density, the dominant members of the blazar population. The results are interpreted in the framework of synchrotron emission models involving relativistic plasma jets. 134 references
Observations related to hydrogen in powder and single crystal samples of YBa2Cu3O7-δ
Porath, D.; Grayevsky, A.; Kaplan, N.; Shaltiel, D.; Yaron, U.; Walker, E.
1994-01-01
New observations related to the hydrogenation of YBa₂Cu₃O₇₋δ (YBCO) are reported: (a) The effect of sample preparation on the H concentration in ''uncharged'' YBCO samples is investigated, and it is shown through nuclear magnetic resonance measurements that samples of YBCO prepared by ''standard'' solid-state reaction procedures may contain ab initio up to 0.2 H atoms per formula unit. (b) It is demonstrated that one may introduce up to 0.3 H atoms per formula unit into single crystal samples of YBCO without destroying the macroscopic crystal. The significance of the above observations is discussed briefly. (orig.)
G. J. Wang
2010-06-01
The temporal variations of low-latitude nighttime spread F (SF) observed by the DPS-4 digisonde at the low-latitude Hainan station (geog. 19.5° N, 109.1° E; dip lat. 9.5° N) during the declining solar cycle 23, from March 2002 to February 2008, are studied. The spread F measured by the digisonde was classified into four types: frequency SF (FSF), range SF (RSF), mixed SF (MSF), and strong range SF (SSF). The statistical results show that MSF and SSF are the dominant irregularities in Hainan: MSF mainly occurs during summer and low solar activity years, whereas SSF mainly occurs during equinoxes and high solar activity years. SSF has a diurnal peak before midnight and usually appears during 20:00-02:00 LT, whereas MSF peaks near or after midnight and occurs during 22:00-06:00 LT. The time of maximum occurrence of SSF is later in summer than in equinoxes; this delay can be caused by the later reversal time of the E×B drift in summer. The SunSpot Number (SSN) dependence of each SF type differs by season: FSF is independent of SSN in every season; RSF is positively related to SSN during equinoxes and summer and unrelated during winter; MSF depends significantly on SSN during summer and winter but not during the equinoxes; SSF clearly increases with SSN during equinoxes and summer, while it is independent of SSN during winter. The occurrence numbers of each SF type and of total SF follow the same trend, increasing as Kp increases from 0 to 1 and then decreasing with further increasing Kp. The correlation with Kp is negative for RSF, MSF, SSF, and total SF, but is vague for FSF.
Handique, Bijoy K; Khan, Siraj A; Mahanta, J; Sudhakar, S
2014-09-01
Japanese encephalitis (JE) is one of the dreaded mosquito-borne viral diseases, mostly prevalent in south Asian countries including India. Early warning of the disease in terms of disease intensity is crucial for taking adequate and appropriate intervention measures. The present study was carried out in Dibrugarh district in the state of Assam, located in the northeastern region of India, to assess the accuracy of selected forecasting methods based on historical morbidity patterns of JE incidence during the past 22 years (1985-2006). Four selected forecasting methods, viz. seasonal average (SA), seasonal adjustment with the last three observations (SAT), a modified method adjusting long-term and cyclic trend (MSAT), and autoregressive integrated moving average (ARIMA), were employed, and the accuracy of each was assessed. The forecasting methods were validated over five consecutive years, 2007-2012. The method using seasonal adjustment with long-term and cyclic trend emerged as the best of the four selected methods and outperformed even the statistically more advanced ARIMA method. The peak of disease incidence could be predicted effectively with all the methods, but there are significant variations in the magnitude of forecast errors among them. As expected, variation in forecasts at the primary health centre (PHC) level is wide compared to that of district-level forecasts. The study showed that the adopted forecasting techniques could reasonably forecast the intensity of JE cases at the PHC level without considering external variables. The results indicate that understanding the long-term and cyclic trend of disease intensity will improve the accuracy of the forecasts, but there is a need to make the forecast models more robust to explain sudden variation in disease intensity with detailed analysis of parasite and host population
Nadine Chlass; Jens J. Krueger
2007-01-01
This Monte Carlo study investigates the sensitivity of the Wilcoxon signed rank test to certain assumption violations in small samples. Emphasis is put on within-sample dependence, between-sample dependence, and the presence of ties. Our results show that both assumption violations induce severe size distortions and entail power losses. Surprisingly, these consequences vary substantially with other properties the data may display. Results provided are particularly relevant for experimental set...
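The test statistic under study, the Wilcoxon signed-rank W+, is computed from the ranks of the absolute differences, with average ranks assigned to ties (the case the study highlights). The sketch below is a minimal stdlib-only illustration of the statistic itself, not the authors' Monte Carlo design; the sample differences are invented.

```python
def signed_rank_wplus(diffs):
    """Wilcoxon signed-rank statistic W+: sum of ranks of |d|
    (average ranks for ties) over the positive differences.
    Zero differences are dropped, as in the standard treatment."""
    nonzero = [d for d in diffs if d != 0]
    abs_sorted = sorted(abs(d) for d in nonzero)
    # Average rank for each distinct |d| value (handles ties).
    rank = {}
    i = 0
    while i < len(abs_sorted):
        j = i
        while j < len(abs_sorted) and abs_sorted[j] == abs_sorted[i]:
            j += 1
        rank[abs_sorted[i]] = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        i = j
    return sum(rank[abs(d)] for d in nonzero if d > 0)

# All-positive differences 1..5 give the maximal W+ = 15;
# a tie on |d| = 2 yields the average rank 2.5.
w_max = signed_rank_wplus([1, 2, 3, 4, 5])
w_tie = signed_rank_wplus([1, -2, 2, 3])
```

A size-distortion experiment of the kind the paper runs would wrap this statistic in a loop over simulated (possibly dependent or rounded) samples and compare rejection rates to the nominal level.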
Radio Observations of the S5 Sample
Jun Liu; Xiang Liu
density in flux monitoring, standard deviation, the modulation index, the variability amplitude and the reduced χ². The last three columns (m, Y, χ²_red) are used to judge the degree of variability, and the definitions of these parameters are described in Kraus et al. (2003). We studied the statistics of the variability and ...
Continuous quality control of the blood sampling procedure using a structured observation scheme
Seemann, Tine Lindberg; Nybo, Mads
2016-01-01
INTRODUCTION: An observational study was conducted using a structured observation scheme to assess compliance with the local phlebotomy guideline, to identify necessary focus items, and to investigate whether adherence to the phlebotomy guideline improved. MATERIALS AND METHODS: The questionnaire...
Dekoulis, G.; Honary, F.
2005-01-01
This paper describes the feasibility study and simulation results for the unique multi-frequency, multi-bandwidth, Programmable Riometer for in-depth Ionospheric And Magnetospheric ObservationS (PRIAMOS) based on direct sampling digital signal processing (DSP) techniques. This novel architecture is based on sampling the cosmic noise wavefront at the antenna. It eliminates the usage of any intermediate frequency (IF) mixer stages (-6 dB) and the noise balancing technique (-3 dB), providing a m...
Kurtz, S.E.; Fields, D.E.
1983-10-01
The KSTEST code presented here is designed to perform the Kolmogorov-Smirnov one-sample test. The code may be used as a stand-alone program or the principal subroutines may be excerpted and used to service other programs. The Kolmogorov-Smirnov one-sample test is a nonparametric goodness-of-fit test. A number of codes to perform this test are in existence, but they suffer from the inability to provide meaningful results in the case of small sample sizes (number of values less than or equal to 80). The KSTEST code overcomes this inadequacy by using two distinct algorithms. If the sample size is greater than 80, an asymptotic series developed by Smirnov is evaluated. If the sample size is 80 or less, a table of values generated by Birnbaum is referenced. Valid results can be obtained from KSTEST when the sample contains from 3 to 300 data points. The program was developed on a Digital Equipment Corporation PDP-10 computer using the FORTRAN-10 language. The code size is approximately 450 card images and the typical CPU execution time is 0.19 s.
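The two regimes described above (a table of exact values for small samples, Smirnov's asymptotic series otherwise) can be illustrated with a minimal one-sample KS sketch in Python. This is a hedged illustration of the general method, not a port of the FORTRAN KSTEST code; the Uniform(0,1) example and the small-n correction factor in the series argument are assumptions borrowed from common practice.

```python
import math

def ks_statistic(sample, cdf):
    """One-sample Kolmogorov-Smirnov statistic D: the largest gap
    between the empirical CDF and the hypothesized CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        d = max(d, i / n - f, f - (i - 1) / n)
    return d

def ks_pvalue_asymptotic(d, n):
    """Smirnov's asymptotic series
    Q(lam) = 2 * sum_{j>=1} (-1)^(j-1) * exp(-2 j^2 lam^2),
    with a commonly used finite-n correction in the argument."""
    lam = (math.sqrt(n) + 0.12 + 0.11 / math.sqrt(n)) * d
    return 2.0 * sum((-1) ** (j - 1) * math.exp(-2.0 * j * j * lam * lam)
                     for j in range(1, 101))

# Test a toy sample against the Uniform(0, 1) null, F(x) = x.
sample = [0.1, 0.2, 0.3, 0.4, 0.5]
d = ks_statistic(sample, lambda x: x)  # D = 0.5 for this sample
p = ks_pvalue_asymptotic(d, len(sample))
```

For sample sizes in the 3-80 range the abstract describes, a production implementation would consult exact tabulated critical values instead of the series, which is precisely the design decision KSTEST makes.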
Qayyum, M.; Zaman, W.U.; Rehman, R.; Ahmad, B.; Ahmad, M.; Ali, S.; Murtaza, S
2013-01-01
Increasing fluoride levels in the drinking water of fluorinated areas of the world lead to fluorosis. For bio-monitoring of fluorosis patients, fluoride levels were determined in drinking water and human urine samples of individuals having dental fluorosis and bony deformities from a fluorotic area of Punjab (Sham Ki Bhatiyan, Pakistan) and compared with reference samples from a non-fluorotic area (Queens Road, Lahore, Pakistan) using ion-selective electrode methodology. Fluoride levels in the fluorinated area differed significantly from the control group (p < 0.05). In drinking water and human urine samples, fluoride levels in the fluorinated area were 136.192 ± 67.836 and 94.484 ± 36.572 µmol L⁻¹ respectively, whereas in control samples the fluoride concentrations were 19.306 ± 2.109 and 47.154 ± 22.685 µmol L⁻¹ in water and urine samples, respectively. Pearson's correlation data indicated that human urine and water fluoride concentrations have a significant positive dose-response relationship with the prevalence of dental and skeletal fluorosis in fluorotic areas having higher fluoride levels in drinking water. (author)
Tewari, Surendra; Rajamure, Ravi; Grugel, Richard; Erdmann, Robert; Poirier, David
2012-01-01
The influence of natural convection on primary dendrite array morphology during directional solidification is being investigated under a collaborative European Space Agency-NASA joint research program, "Microstructure Formation in Castings of Technical Alloys under Diffusive and Magnetically Controlled Convective Conditions (MICAST)". Two Aluminum-7 wt pct Silicon alloy samples, MICAST6 and MICAST7, were directionally solidified in microgravity on the International Space Station. Terrestrially grown dendritic monocrystal cylindrical samples were remelted and directionally solidified at 18 K/cm (MICAST6) and 28 K/cm (MICAST7). Directional solidification involved a growth-speed step increase (MICAST6, from 5 to 50 micron/s) and a speed decrease (MICAST7, from 20 to 10 micron/s). The distribution and morphology of primary dendrites are currently being characterized in these samples, and also in samples solidified on Earth under nominally similar thermal gradients and growth speeds. Primary dendrite spacing and trunk diameter measurements from this investigation will be presented.
Fluck, Elody
2015-04-01
Hail statistics in Western Europe based on a hybrid cell-tracking algorithm combining radar signals with hailstone observations. Elody Fluck¹, Michael Kunz¹, Peter Geissbühler², Stefan P. Ritz². With hail damage estimated at billions of Euros for a single event (e.g., hailstorm Andreas on 27/28 July 2013), hail constitutes one of the major atmospheric risks in various parts of Europe. The project HAMLET (Hail Model for Europe), in cooperation with the insurance company Tokio Millennium Re, aims at estimating hail probability, hail hazard and, combined with vulnerability, hail risk for several European countries (Germany, Switzerland, France, the Netherlands, Austria, Belgium and Luxembourg). Hail signals are obtained from radar reflectivity, since this proxy is available with high temporal and spatial resolution. The focus in the first step is on Germany and France for the periods 2005-2013 and 1999-2013, respectively; in the next step, the methods will be transferred and extended to other regions. The cell-tracking algorithm TRACE2D was adjusted and applied to two-dimensional radar reflectivity data from different radars operated by European weather services such as the German weather service (DWD) and the French weather service (Météo-France). Strong convective cells are detected by considering 3 connected pixels over 45 dBZ (Reflectivity Cores, RCs) in a radar scan. Afterwards, the algorithm tries to find the same RCs in the next 5-minute radar scan and thus track the RC centers over time and space. Additional information about hailstone diameters provided by the ESWD (European Severe Weather Database) is used to determine the hail intensity of the detected hail swaths. Maximum hailstone diameters are interpolated along and close to the individual hail tracks, giving an estimate of mean diameters for the detected hail swaths. Furthermore, a stochastic event set is created by randomizing the parameters obtained from the
Continuous quality control of the blood sampling procedure using a structured observation scheme
Seemann, T. L.; Nybo, M.
2015-01-01
. All observations were performed by the same person (TLS). Results: Already after three months critical issues can be pinpointed, where correction or educational steps are necessary, for example hand hygiene. However, at the meeting we will be able to present results from a six-month observation period...
Oesterle, S.N.; Norman, J.E. Jr.
1980-01-01
Total peripheral blood lymphocytes were evaluated by age and exposure status in the Adult Health Study population during three examination cycles between 1958 and 1972. No radiation effect was observed, but a significant drop in the absolute lymphocyte counts of those aged 70 years and over and a corresponding maximum for persons aged 50 - 59 was observed. (author)
2010-01-01
... FRUITS AND VEGETABLES, PROCESSED PRODUCTS THEREOF, AND CERTAIN OTHER PROCESSED FOOD PRODUCTS 1... Section 52.38c, Agriculture Regulations of the... inspection of processed fruits and vegetables by attributes. (a) General. Single sampling plans shall be used...
H.E. Anderson; J. Breidenbach
2007-01-01
Airborne laser scanning (LIDAR) can be a valuable tool in double-sampling forest survey designs. LIDAR-derived forest structure metrics are often highly correlated with important forest inventory variables, such as mean stand biomass, and LIDAR-based synthetic regression estimators have the potential to be highly efficient compared to single-stage estimators, which...
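The double-sampling (two-phase) regression estimator alluded to here combines a large, cheap phase-1 sample of an auxiliary variable (a LIDAR metric) with a small phase-2 subsample carrying field measurements: ŷ_reg = ȳ + b(X̄₁ − x̄₂). The Python sketch below illustrates this classic estimator under invented numbers; it is not the authors' estimator or data.

```python
def regression_estimate(aux_large, aux_small, y_small):
    """Two-phase (double) sampling regression estimator of mean y.
    aux_large: auxiliary values (e.g., a LIDAR height metric) on the big phase-1 sample
    aux_small, y_small: paired auxiliary and field values on the phase-2 subsample"""
    n = len(y_small)
    xbar_s = sum(aux_small) / n
    ybar_s = sum(y_small) / n
    # Ordinary least-squares slope of y on x within the subsample
    sxy = sum((x - xbar_s) * (y - ybar_s) for x, y in zip(aux_small, y_small))
    sxx = sum((x - xbar_s) ** 2 for x in aux_small)
    b = sxy / sxx
    xbar_l = sum(aux_large) / len(aux_large)
    # Adjust the field mean by the slope times the auxiliary-mean difference
    return ybar_s + b * (xbar_l - xbar_s)

# Invented data: in the subsample, biomass follows 2 * metric + 5 exactly.
aux_large = [8, 10, 12, 9, 11, 10, 13, 9]
aux_small = [8, 10, 12]
y_small = [21, 25, 29]
est = regression_estimate(aux_large, aux_small, y_small)
```

The efficiency gain the abstract mentions comes from exactly this mechanism: the better the LIDAR metric correlates with biomass, the more the large phase-1 sample tightens the estimate.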
Statistics 101 for Radiologists.
Anvari, Arash; Halpern, Elkan F; Samir, Anthony E
2015-10-01
Diagnostic tests have wide clinical applications, including screening, diagnosis, measuring treatment effect, and determining prognosis. Interpreting diagnostic test results requires an understanding of key statistical concepts used to evaluate test efficacy. This review explains descriptive statistics and discusses probability, including mutually exclusive and independent events and conditional probability. In the inferential statistics section, a statistical perspective on study design is provided, together with an explanation of how to select appropriate statistical tests. Key concepts in recruiting study samples are discussed, including representativeness and random sampling. Variable types are defined, including predictor, outcome, and covariate variables, and the relationship of these variables to one another. In the hypothesis testing section, we explain how to determine if observed differences between groups are likely to be due to chance. We explain type I and II errors, statistical significance, and study power, followed by an explanation of effect sizes and how confidence intervals can be used to generalize observed effect sizes to the larger population. Statistical tests are explained in four categories: t tests and analysis of variance, proportion analysis tests, nonparametric tests, and regression techniques. We discuss sensitivity, specificity, accuracy, receiver operating characteristic analysis, and likelihood ratios. Measures of reliability and agreement, including κ statistics, intraclass correlation coefficients, and Bland-Altman graphs and analysis, are introduced. © RSNA, 2015.
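The diagnostic-test measures this review covers (sensitivity, specificity, accuracy, likelihood ratios) all derive from a 2×2 confusion matrix. The sketch below uses the standard definitions with invented counts; it is not data or code from the article.

```python
def diagnostic_measures(tp, fp, fn, tn):
    """Standard diagnostic-test measures from a 2x2 table:
    tp/fp/fn/tn = true positives, false positives, false negatives, true negatives."""
    sens = tp / (tp + fn)                  # sensitivity (true positive rate)
    spec = tn / (tn + fp)                  # specificity (true negative rate)
    acc = (tp + tn) / (tp + fp + fn + tn)  # overall accuracy
    lr_pos = sens / (1 - spec)             # positive likelihood ratio
    lr_neg = (1 - sens) / spec             # negative likelihood ratio
    return sens, spec, acc, lr_pos, lr_neg

# Invented counts: 90 diseased (80 detected), 110 healthy (99 correctly ruled out).
sens, spec, acc, lr_pos, lr_neg = diagnostic_measures(tp=80, fp=11, fn=10, tn=99)
```

A useful test shifts disease probability up when positive (LR+ well above 1) and down when negative (LR− well below 1), which is how these ratios feed into post-test probability.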
Continuous quality control of the blood sampling procedure using a structured observation scheme
Lindberg Seemann, Tine; Nybo, Mads
2016-01-01
INTRODUCTION: An observational study was conducted using a structured observation scheme to assess compliance with the local phlebotomy guideline, to identify necessary focus items, and to investigate whether adherence to the phlebotomy guideline improved.MATERIALS AND METHODS: The questionnaire from the EFLM Working Group for the Preanalytical Phase was adapted to local procedures. A pilot study of three months duration was conducted. Based on this, corrective actions were implemented and a ...
Drichoutis, Andreas C; Lusk, Jayson L
2014-01-01
Despite the fact that conceptual models of individual decision making under risk are deterministic, attempts to econometrically estimate risk preferences require some assumption about the stochastic nature of choice. Unfortunately, the consequences of making different assumptions are, at present, unclear. In this paper, we compare three popular error specifications (Fechner, contextual utility, and Luce error) for three different preference functionals (expected utility, rank-dependent utility, and a mixture of those two) using in- and out-of-sample selection criteria. We find drastically different inferences about structural risk preferences across the competing functionals and error specifications. Expected utility theory is least affected by the selection of the error specification. A mixture model combining the two conceptual models assuming contextual utility provides the best fit of the data both in- and out-of-sample.
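Of the three error specifications compared, the Fechner specification is commonly implemented by adding normal noise to the utility difference, giving choice probability Φ((EU_A − EU_B)/σ). Below is a hedged Python sketch under a simple CRRA expected-utility assumption; the lotteries, parameter values, and function names are illustrative, not the paper's.

```python
import math

def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def crra(x, r):
    """CRRA utility, u(x) = x^(1-r) / (1-r), for r != 1."""
    return x ** (1.0 - r) / (1.0 - r)

def expected_utility(lottery, r):
    """lottery: list of (probability, payoff) pairs."""
    return sum(p * crra(x, r) for p, x in lottery)

def prob_choose_a(lottery_a, lottery_b, r, sigma):
    """Fechner error: P(choose A) = Phi((EU_A - EU_B) / sigma)."""
    diff = expected_utility(lottery_a, r) - expected_utility(lottery_b, r)
    return normal_cdf(diff / sigma)

# Illustrative choice: a safe 10 versus a 50/50 gamble over 4 and 18.
safe = [(1.0, 10.0)]
risky = [(0.5, 4.0), (0.5, 18.0)]
p_a = prob_choose_a(safe, risky, r=0.5, sigma=0.1)
```

Estimation then maximizes the product of such probabilities over observed choices; the contextual-utility and Luce specifications the paper compares change only how the utility difference is scaled.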
DeLeon, Iser G.; Toole, Lisa M.; Gutshall, Katharine A.; Bowman, Lynn G.
2005-01-01
Recent studies have used pretreatment analyses, termed competing stimulus assessments, to identify items that most effectively displace the aberrant behavior of individuals with developmental disabilities. In most studies, there appeared to have been no systematic basis for selecting the sampling period (ranging from 30 s to 10 min) in which items…
Dependability of Data Derived from Time Sampling Methods with Multiple Observation Targets
Johnson, Austin H.; Chafouleas, Sandra M.; Briesch, Amy M.
2017-01-01
In this study, generalizability theory was used to examine the extent to which (a) time-sampling methodology, (b) number of simultaneous behavior targets, and (c) individual raters influenced variance in ratings of academic engagement for an elementary-aged student. Ten graduate-student raters, with an average of 7.20 hr of previous training in…
Meusinger, H.; Balafkan, N.
2014-08-01
Aims: A tiny fraction of the quasar population shows remarkably weak emission lines. Several hypotheses have been developed, but the weak-line quasar (WLQ) phenomenon still remains puzzling. The aim of this study was to create a sizeable sample of WLQs and WLQ-like objects and to evaluate various properties of this sample. Methods: We performed a search for WLQs in the spectroscopic data from the Sloan Digital Sky Survey Data Release 7 based on Kohonen self-organising maps for nearly 10^5 quasar spectra. The final sample consists of 365 quasars in the redshift range z = 0.6-4.2 (mean z = 1.50 ± 0.45) and includes in particular a subsample of 46 WLQs selected by their Mg II equivalent widths; particular attention was paid to selection effects. Results: The WLQs have, on average, significantly higher luminosities, Eddington ratios, and accretion rates. About half of the excess comes from a selection bias, but an intrinsic excess remains, probably caused primarily by higher accretion rates. The spectral energy distribution shows a bluer continuum at rest-frame wavelengths ≳1500 Å. The variability in the optical and UV is relatively low, even taking the variability-luminosity anti-correlation into account. The percentage of radio-detected quasars and of core-dominant radio sources is significantly higher than for the control sample, whereas the mean radio-loudness is lower. Conclusions: The properties of our WLQ sample can be consistently understood assuming that it consists of a mix of quasars at the beginning of a stage of increased accretion activity and of beamed radio-quiet quasars. The higher luminosities and Eddington ratios in combination with a bluer spectral energy distribution can be explained by hotter continua, i.e. higher accretion rates. If quasar activity consists of subphases with different accretion rates, a change towards a higher rate is probably accompanied by an only slow development of the broad line region. The composite WLQ spectrum can be reasonably matched by the
Knudsen, Anders Dahl; Bennike, Tue; Kjeldal, Henrik; Birkelund, Svend; Otzen, Daniel Erik; Stensballe, Allan
2014-05-30
We describe Condenser, a freely available, comprehensive open-source tool for merging multidimensional quantitative proteomics data from the Matrix Science Mascot Distiller Quantitation Toolbox into a common format ready for subsequent bioinformatic analysis. A number of different relative quantitation technologies, such as metabolic (15)N and amino acid stable isotope incorporation, label-free and chemical-label quantitation are supported. The program features multiple options for curative filtering of the quantified peptides, allowing the user to choose data quality thresholds appropriate for the current dataset, and ensure the quality of the calculated relative protein abundances. Condenser also features optional global normalization, peptide outlier removal, multiple testing and calculation of t-test statistics for highlighting and evaluating proteins with significantly altered relative protein abundances. Condenser provides an attractive addition to the gold-standard quantitative workflow of Mascot Distiller, allowing easy handling of larger multi-dimensional experiments. Source code, binaries, test data set and documentation are available at http://condenser.googlecode.com/. Copyright © 2014 Elsevier B.V. All rights reserved.
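The normalization and significance-testing steps described above can be sketched generically. The following is a minimal illustration (not Condenser's actual implementation) of global median normalization followed by per-protein Welch t-tests; the data are made up.

```python
import numpy as np
from scipy import stats

def median_normalize(log2_matrix):
    """Global normalization: subtract each run's (column's) median so
    systematic loading differences between runs are removed."""
    m = np.asarray(log2_matrix, dtype=float)
    return m - np.median(m, axis=0, keepdims=True)

def per_protein_ttest(matrix, cols_a, cols_b):
    """Welch's t-test per protein (row) between two groups of runs."""
    return stats.ttest_ind(matrix[:, cols_a], matrix[:, cols_b],
                           axis=1, equal_var=False)

# made-up data: 100 proteins x 6 runs; runs 3-5 carry a global loading
# offset, and protein 0 is genuinely up-regulated in the second condition
rng = np.random.default_rng(1)
data = rng.normal(0.0, 0.1, size=(100, 6))
data[:, 3:] += 0.5          # systematic offset, removed by normalization
data[0, 3:] += 2.0          # true biological change, kept by normalization
t_stat, p_val = per_protein_ttest(median_normalize(data), [0, 1, 2], [3, 4, 5])
```

In a real workflow the p-values would additionally be corrected for multiple testing, as the abstract notes.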
Hirvonen, H. [Teollisuuden Voima Oyj, Eurajoki (Finland)
2005-11-15
Groundwater sampling from the shallow boreholes and groundwater observation tubes was performed in summer 2004 (PP2, PP3, PP7, PP8, PR1, PVP1, PVP3A, PVP3B, PVP4A and PVP4B) and in autumn 2004 (PP2, PP3, PP5, PP7, PP8, PP9, PP36, PP37, PP39, PR1, PR2, PVP1, PVP3A, PVP3B, PVP4A, PVP8A, PVP9A, PVP9B, PVP10B, PVP11, PVP12, PVP13, PVP14 and PVP20). The results from previous samplings have been used in the hydrogeochemical baseline characterization at Olkiluoto, and some of the latest results have also been part of the ONKALO monitoring program. This study contains data on preliminary pumping of the sampling points and pumping for groundwater sampling and chemical analyses in the laboratory. This study also includes a comparison with analytical results obtained between 1995 and 2004. The total dissolved solids (TDS) of the groundwater samples were mainly below 1000 mg/L. According to Davis's TDS classification, these waters were fresh waters. The only exception was the water sample from shallow borehole PP7 (1400 mg/L and 1450 mg/L), which was brackish. Several different groundwater types were observed, but the most common water type was Ca-HCO3 (five samples). Analytical results from 1995 to 2003 were compared. During 2001-2003, in groundwater samples from sampling points PVP1, PVP9A and PP7, all measured main parameters changed considerably, but from summer 2003 to autumn 2004 the greatest alterations occurred in the PR2, PVP1, PVP3A and PVP3B waters. These changes can be seen in almost all parameters. For the other samples only minor changes in the results were observed during the reference period. (orig.)
Preliminary observations on the metal content in some milk samples from an acid geoenvironment
Alhonen, P.
1997-12-01
The metal content of some milk samples was analyzed from areas of acid sulphate soils along the course of the river Kyrönjoki in western Finland. Comparative analyses were made with samples from the Artjärvi-Porlammi area. The variations of the analyzed metals Al, Ba, Ca, Cr, Cu, Fe, K, Mg, Mo, Na, Sr and Zn are small in both areas, except for Al, which is clearly associated with the acid environment in the Kyrönjoki valley. The concentrations of these elements in milk are relatively high compared with data from the literature. It is obvious that they reflect environmental contamination. Under acid conditions the metals in milk may create serious geomedical problems.
Neděla, Vilém
2008-01-01
Vol. 126 (2008), 012046:1-4. ISSN 1742-6588. [Electron Microscopy and Analysis Group Conference 2007 (EMAG 2007), Glasgow, 03.09.2007-07.09.2007.] R&D Projects: GA ČR(CZ) GA102/05/0886; GA AV ČR KJB200650602. Institutional research plan: CEZ:AV0Z20650511. Keywords: biological sample; VP-SEM; dynamical experiments. Subject RIV: JA - Electronics; Optoelectronics, Electrical Engineering
Rees, E.V.L.; Kneafsey, T.J.; Seol, Y.
2010-07-01
To study physical properties of methane gas hydrate-bearing sediments, it is necessary to synthesize laboratory samples due to the limited availability of cores from natural deposits. X-ray computed tomography (CT) and other observations have shown gas hydrate to occur in a number of morphologies over a variety of sediment types. To aid in understanding formation and growth patterns of hydrate in sediments, methane hydrate was repeatedly formed in laboratory-packed sand samples and in a natural sediment core from the Mount Elbert Stratigraphic Test Well. CT scanning was performed during hydrate formation and decomposition steps, and periodically while the hydrate samples remained under stable conditions for up to 60 days. The investigation revealed the impact of water saturation on location and morphology of hydrate in both laboratory and natural sediments during repeated hydrate formations. Significant redistribution of hydrate and water in the samples was observed over both the short and long term.
Maxima estimate of non gaussian process from observation of time history samples
Borsoi, L.
1987-01-01
The problem constitutes a formidable task but is essential for industrial applications: extreme value design, fatigue analysis, etc. Even in the linear Gaussian case, process ergodicity does not remove the requirement that the observation duration be long enough to make reliable estimates. As is well known, this duration is closely related to the process autocorrelation. A subterfuge, which distorts the problem a little, consists in considering a periodic random process and adjusting the observation duration to a complete period. In the nonlinear case, the stated problem is all the more important because time history simulation is presently the only practicable way of analysing structures. Thus it is always interesting to fit a tractable model to raw time history observations. In some cases this can be done with a Gumbel-Poisson model. The difficulty is then to make reliable estimates of the parameters involved in the model. Unfortunately, it seems that even the use of sophisticated Bayesian methods does not reduce the necessary observation duration as much as desired. One of the difficulties lies in process ergodicity, which is often assumed on physical grounds but is not always rigorously established. Another difficulty is the confusion between hidden information - which can be extracted - and missing information - which cannot. Finally, it must be recalled that the obligation to consider sufficiently long time histories is not always a handicap, given current reductions in computing costs. (orig./HP)
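As a minimal illustration of estimating extremes from time-history samples, the sketch below fits a plain Gumbel distribution to block maxima of a synthetic stationary Gaussian signal (a simpler setting than the Gumbel-Poisson model discussed above; all data and block sizes are invented):

```python
import numpy as np
from scipy import stats

# synthetic stationary Gaussian "time history", split into 200 blocks
rng = np.random.default_rng(42)
blocks = rng.normal(0.0, 1.0, size=(200, 1000))
maxima = blocks.max(axis=1)            # one observed extreme per block

# fit a Gumbel distribution to the observed block maxima
loc, scale = stats.gumbel_r.fit(maxima)

# design level exceeded on average once every 100 blocks (return level)
level_100 = stats.gumbel_r.ppf(1.0 - 1.0 / 100.0, loc=loc, scale=scale)
```

The reliability of `level_100` depends on how many independent block maxima are available, which is exactly the observation-duration issue the abstract raises.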
Planetary Candidates Observed by Kepler IV: Planet Sample from Q1-Q8 (22 Months)
Burke, Christopher J.; Christensen, Jessie L.; Ciardi, David R.; Morton, Timothy D.; Shporer, Avi
2014-01-01
We provide updates to the Kepler planet candidate sample based upon nearly two years of high-precision photometry (i.e., Q1-Q8). From an initial list of nearly 13,400 threshold crossing events, 480 new host stars are identified from their flux time series as consistent with hosting transiting planets. Potential transit signals are subjected to further analysis using the pixel-level data, which allows background eclipsing binaries to be identified through small image position shifts during tra...
Serdobolskii, Vadim Ivanovich
2007-01-01
This monograph presents the mathematical theory of statistical models described by an essentially large number of unknown parameters, comparable with the sample size or even much larger. In this sense, the proposed theory can be called "essentially multiparametric". It is developed on the basis of the Kolmogorov asymptotic approach, in which the sample size increases along with the number of unknown parameters. This theory opens a way to the solution of central problems of multivariate statistics, which up until now have not been solved. Traditional statistical methods based on the idea of infinite sampling often break down in the solution of real problems and, depending on the data, can be inefficient, unstable and even inapplicable. In this situation, practical statisticians are forced to use various heuristic methods in the hope that they will find a satisfactory solution. The mathematical theory developed in this book presents a regular technique for implementing new, more efficient versions of statistical procedures. ...
High-resolution observations of quasars from the Parkes +- 40 sample
Booth, R.S.; Spencer, R.E.; Stannard, D.; Baath, L.B.
1979-01-01
VLBI observations of 20 compact quasars have been made between Jodrell Bank and Onsala at a frequency of 1666 MHz. Twelve of the quasars have inverted or peaked spectra at centimetre wavelengths and these are all unresolved, having angular diameters of < 0.015 arcsec. Two out of five quasars with overall flat spectra are partially resolved on this scale size, as are three steep-spectrum quasars. (author)
Pollock, D.A.; Brown, G.; Capone, D.W. II; Christopherson, D.; Seuntjens, J.M.; Woltz, J.
1992-01-01
This work has demonstrated the statistical concepts behind the XBAR-R method for determining sample limits to verify billet I_c performance and process uniformity. Using a preliminary population estimate for μ and σ from a stable production lot of only 5 billets, we have shown that reasonable sensitivity to systematic process drift and random within-billet variation may be achieved by using per-billet subgroup sizes of moderate proportions. The effects of subgroup size (n) and sampling risk (α and β) on the calculated control limits have been shown to be important factors that need to be carefully considered when selecting the actual number of measurements to be used per billet for each supplier process. Given the present method of testing, in which individual wire samples are ramped to I_c only once, with measurement uncertainty due to repeatability and reproducibility (typically > 1.4%), large subgroups (i.e. > 30 per billet) appear to be unnecessary, except as an inspection tool to confirm wire process history for each spool. The introduction of the XBAR-R method or a similar Statistical Quality Control procedure is recommended for use in the superconducting wire production program, particularly when the program transitions from requiring tests for all pieces of wire to sampling each production unit.
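The XBAR-R limits discussed above follow the standard Shewhart construction. A sketch, using the conventional control-chart constants for subgroups of n = 5; the simulated measurements and units are illustrative, not values from the study:

```python
import numpy as np

# Standard Shewhart control-chart constants for subgroup size n = 5
A2, D3, D4 = 0.577, 0.0, 2.114

def xbar_r_limits(subgroups):
    """X-bar and R chart limits from an (m, n) array of m rational
    subgroups with n measurements each."""
    x = np.asarray(subgroups, dtype=float)
    xbar = x.mean(axis=1)                 # subgroup means
    r = x.max(axis=1) - x.min(axis=1)     # subgroup ranges
    xbarbar, rbar = xbar.mean(), r.mean()
    return {
        "xbar_center": xbarbar,
        "xbar_ucl": xbarbar + A2 * rbar,  # mean chart limits
        "xbar_lcl": xbarbar - A2 * rbar,
        "r_center": rbar,
        "r_ucl": D4 * rbar,               # range chart limits
        "r_lcl": D3 * rbar,
    }

# simulated critical-current measurements: 25 billets, 5 wires per billet
rng = np.random.default_rng(7)
limits = xbar_r_limits(rng.normal(100.0, 2.0, size=(25, 5)))
```

A subgroup mean outside the X-bar limits signals systematic drift between billets; a range outside the R limits signals excess within-billet variation.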
Observed mass distribution of spontaneous fission fragments from samples of lime - an SSNTD study
Paul, D; Ghose, D; Sastri, R C
1999-01-01
SSNTD is one of the most commonly used detectors in studies involving nuclear phenomena. The ease of registering the presence of alpha particles and fission fragments has made it particularly suitable in studies where stable long exposures are needed to extract reliable information. Studies on the presence of alpha-emitting nuclides in the environment assume importance since they are found to be carcinogenic. Lime samples from Silchar in Assam of Eastern India have shown the presence of spontaneous fission fragments besides alphas. In the present study we look at the ratio of the average mass distribution of these fission fragments, which gives an indication of the presence of traces of transuranic elements.
Superwind Outflow in Seyfert Galaxies? : Optical Observations of an Edge-On Sample
Colbert, E.; Gallimore, J.; Baum, S.; O'Dea, C.; Lehnert, M.
1994-12-01
Large-scale galactic winds (superwinds) are commonly found flowing out of the nuclear region of ultraluminous infrared and powerful starburst galaxies. Stellar winds and supernovae from the nuclear starburst are thought to provide the energy to drive these superwinds. The outflowing gas escapes along the rotation axis, sweeping up and shock-heating clouds in the halo, which produces optical line emission, X-rays and radio synchrotron emission. These features can most easily be studied in edge-on systems, so that the wind emission is not confused by that from the disk. Diffuse radio emission has been found (Baum et al. 1993, ApJ, 419, 553) to extend out to kpc-scales in a number of edge-on Seyfert galaxies. We have therefore launched a systematic search for superwind outflows in Seyferts. We present here narrow-band optical images and optical spectra for a sample of edge-on Seyferts. These data have been used to estimate the frequency of occurrence of superwinds. Approximately half of the sample objects show evidence for extended emission-line regions which are preferentially oriented perpendicular to the galaxy disk. It is possible that these emission-line regions may be energized by a superwind outflow from a circumnuclear starburst, although there may also be a contribution from the AGN itself. A goal of this work is to find a diagnostic that can be used to distinguish between large-scale outflows that are driven by starbursts and those that are driven by an AGN. The presence of starburst-driven superwinds in Seyferts, if established, would have important implications for the connection between starburst galaxies and AGN.
Ten-Doménech, Isabel; Beltrán-Iturat, Eduardo; Herrero-Martínez, José Manuel; Sancho-Llopis, Juan Vicente; Simó-Alfonso, Ernesto Francisco
2015-06-24
In this work, a method for the separation of triacylglycerols (TAGs) present in human milk and from other mammalian species by reversed-phase high-performance liquid chromatography using a core-shell particle packed column with UV and evaporative light-scattering detectors is described. Under optimal conditions, a mobile phase containing acetonitrile/n-pentanol at 10 °C gave an excellent resolution among more than 50 TAG peaks. A small-scale method for fat extraction in these milks (particularly of interest for human milk samples) using minimal amounts of sample and reagents was also developed. The proposed extraction protocol and the traditional method were compared, giving similar results, with respect to the total fat and relative TAG contents. Finally, a statistical study based on linear discriminant analysis on the TAG composition of different types of milks (human, cow, sheep, and goat) was carried out to differentiate the samples according to their mammalian origin.
Meng, Su; Chen, Jie; Sun, Jian
2017-10-01
This paper investigates the problem of observer-based output feedback control for networked control systems with non-uniform sampling and time-varying transmission delay. The sampling intervals are assumed to vary within a given interval. The transmission delay belongs to a known interval. A discrete-time model is first established, which contains time-varying delay and norm-bounded uncertainties coming from non-uniform sampling intervals. It is then converted to an interconnection of two subsystems in which the forward channel is delay-free. The scaled small gain theorem is used to derive the stability condition for the closed-loop system. Moreover, the observer-based output feedback controller design method is proposed by utilising a modified cone complementary linearisation algorithm. Finally, numerical examples illustrate the validity and superiority of the proposed method.
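A much-simplified sketch of observer-based output feedback is given below, with uniform sampling and no transmission delay (so it omits the paper's main difficulty); the double-integrator plant, gains, and step size are invented for illustration and merely assumed stabilizing:

```python
import numpy as np

# Double-integrator plant discretized at a fixed 0.1 s step:
# x+ = A x + B u,  y = C x
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.005], [0.1]])
C = np.array([[1.0, 0.0]])

K = np.array([[1.0, 1.5]])    # state-feedback gain (assumed stabilizing)
L = np.array([[0.5], [0.5]])  # observer gain (assumed stabilizing)

x = np.array([[1.0], [0.0]])  # true (unmeasured) state
xhat = np.zeros((2, 1))       # observer estimate

for _ in range(200):
    u = -K @ xhat                                   # control uses the estimate only
    y = C @ x                                       # sampled plant output
    xhat = A @ xhat + B @ u + L @ (y - C @ xhat)    # Luenberger observer update
    x = A @ x + B @ u                               # plant update

est_error = float(np.linalg.norm(x - xhat))
state_norm = float(np.linalg.norm(x))
```

With non-uniform sampling, A and B would vary from step to step, which is exactly the uncertainty the paper models with the scaled small gain theorem.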
Kooijman, Pieter C; Kok, Sander J; Weusten, Jos J A M; Honing, Maarten
2016-05-05
Preparation of samples according to an optimized method is crucial for accurate determination of polymer sample characteristics by Matrix-Assisted Laser Desorption Ionization (MALDI) analysis. Sample preparation conditions such as matrix choice, cationization agent, deposition technique or even the deposition volume should be chosen to suit the sample of interest. Many sample preparation protocols have been developed and employed, yet finding the optimal sample preparation protocol remains a challenge. Because an objective comparison between the results of diverse protocols is not possible, "gut-feeling" or "good enough" is often decisive in the search for an optimum. This implies that sub-optimal protocols are used, leading to a loss of mass spectral information quality. To address this problem a novel analytical strategy based on MALDI imaging and statistical data processing was developed in which eight parameters were formulated to objectively quantify the quality of sample deposition and optimal MALDI matrix composition and finally sum up to an overall quality score of the sample deposition. These parameters can be established in a fully automated way using commercially available mass spectrometry imaging instruments without any hardware adjustments. With the newly developed analytical strategy the highest quality MALDI spots were selected, resulting in more reproducible and more valuable spectra for PEG in a variety of matrices. Moreover, our method enables an objective comparison of sample preparation protocols for any analyte and opens up new fields of investigation by presenting MALDI performance data in a clear and concise way. Copyright © 2016 Elsevier B.V. All rights reserved.
Herschel and SCUBA-2 observations of dust emission in a sample of Planck cold clumps
Juvela, Mika; He, Jinhua; Pattle, Katherine; Liu, Tie; Bendo, George; Eden, David J.; Fehér, Orsolya; Michel, Fich; Fuller, Gary; Hirano, Naomi; Kim, Kee-Tae; Li, Di; Liu, Sheng-Yuan; Malinen, Johanna; Marshall, Douglas J.; Paradis, Deborah; Parsons, Harriet; Pelkonen, Veli-Matti; Rawlings, Mark G.; Ristorcelli, Isabelle; Samal, Manash R.; Tatematsu, Ken'ichi; Thompson, Mark; Traficante, Alessio; Wang, Ke; Ward-Thompson, Derek; Wu, Yuefang; Yi, Hee-Weon; Yoo, Hyunju
2018-04-01
Context. Analysis of all-sky Planck submillimetre observations and the IRAS 100 μm data has led to the detection of a population of Galactic cold clumps. The clumps can be used to study star formation and dust properties in a wide range of Galactic environments. Aims: Our aim is to measure dust spectral energy distribution (SED) variations as a function of the spatial scale and the wavelength. Methods: We examined the SEDs at large scales using IRAS, Planck, and Herschel data. At smaller scales, we compared JCMT/SCUBA-2 850 μm maps with Herschel data that were filtered using the SCUBA-2 pipeline. Clumps were extracted using the Fellwalker method, and their spectra were modelled as modified blackbody functions. Results: According to IRAS and Planck data, most fields have dust colour temperatures T_C ~ 14-18 K and opacity spectral index values of β = 1.5-1.9. The clumps and cores identified in SCUBA-2 maps have T ~ 13 K and similar β values. There are some indications of the dust emission spectrum becoming flatter at wavelengths longer than 500 μm. In fits involving Planck data, the significance is limited by the uncertainty of the corrections for CO line contamination. The fits to the SPIRE data give a median β value that is slightly above 1.8. In the joint SPIRE and SCUBA-2 850 μm fits, the value decreases to β ~ 1.6. Most of the observed T-β anticorrelation can be explained by noise. Conclusions: The typical submillimetre opacity spectral index β of cold clumps is found to be ~1.7. This is above the values of diffuse clouds, but lower than in some previous studies of dense clumps. There is only tentative evidence of a T-β anticorrelation and of β decreasing at millimetre wavelengths. Planck (http://www.esa.int/Planck) is a project of the European Space Agency - ESA - with instruments provided by two scientific consortia funded by ESA member states (in particular the lead countries: France and Italy) with contributions from NASA (USA), and telescope reflectors
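Modified-blackbody fitting of the kind described above can be sketched as a least-squares fit for T and β. The photometry below is synthetic (generated from the model itself) and the parametrization, normalized at 250 μm with a free log-amplitude, is a convenience, not the paper's pipeline:

```python
import numpy as np
from scipy.optimize import curve_fit

H, K_B, C_LIGHT = 6.626e-34, 1.381e-23, 2.998e8   # SI constants
NU0 = C_LIGHT / 250e-6                             # reference frequency (250 um)

def modified_blackbody(wl_um, temp, beta, log_amp):
    """Modified blackbody F_nu ~ B_nu(T) * nu**beta, normalised at 250 um."""
    nu = C_LIGHT / (wl_um * 1e-6)
    return 10.0**log_amp * (nu / NU0)**(3.0 + beta) / np.expm1(H * nu / (K_B * temp))

# synthetic SPIRE + SCUBA-2 style photometry generated with T = 14 K, beta = 1.8
wl = np.array([250.0, 350.0, 500.0, 850.0])
flux = modified_blackbody(wl, 14.0, 1.8, 0.0)

popt, _ = curve_fit(modified_blackbody, wl, flux, p0=[20.0, 1.5, 0.0])
temp_fit, beta_fit, _ = popt
```

With noisy data, T and β become strongly anticorrelated in such fits, which is why the abstract attributes most of the observed T-β anticorrelation to noise.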
Emma Lightfoot
Oxygen isotope analysis of archaeological skeletal remains is an increasingly popular tool to study past human migrations. It is based on the assumption that human body chemistry preserves the δ18O of precipitation in such a way as to be a useful technique for identifying migrants and, potentially, their homelands. In this study, the first such global survey, we draw on published human tooth enamel and bone bioapatite data to explore the validity of using oxygen isotope analyses to identify migrants in the archaeological record. We use human δ18O results to show that there are large variations in human oxygen isotope values within a population sample. This may relate to physiological factors influencing the preservation of the primary isotope signal, or to human activities (such as brewing, boiling, stewing, differential access to water sources, and so on) causing variation in ingested water and food isotope values. We compare the number of outliers identified using various statistical methods. We determine that the most appropriate method for identifying migrants is dependent on the data, but is likely to be the IQR or the median absolute deviation from the median under most archaeological circumstances. Finally, through a spatial assessment of the dataset, we show that the degree of overlap in human isotope values from different locations across Europe is such that identifying individuals' homelands on the basis of oxygen isotope analysis alone is not possible for the regions analysed to date. Oxygen isotope analysis is a valid method for identifying first-generation migrants from an archaeological site when used appropriately; however, it is difficult to identify migrants using statistical methods for a sample size of less than c. 25 individuals. In the absence of local previous analyses, each sample should be treated as an individual dataset and statistical techniques can be used to identify migrants, but in most cases pinpointing a specific
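The IQR and MAD outlier rules recommended above can be sketched as follows; the δ18O values (permil) are invented for illustration, with one migrant-like outlier:

```python
import numpy as np

def outliers_iqr(values, k=1.5):
    """Flag points beyond k*IQR from the quartiles (Tukey fences)."""
    v = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(v, [25, 75])
    iqr = q3 - q1
    return (v < q1 - k * iqr) | (v > q3 + k * iqr)

def outliers_mad(values, threshold=3.0):
    """Flag points more than `threshold` scaled median absolute deviations
    from the median (the factor 1.4826 makes MAD consistent with the
    standard deviation for normally distributed data)."""
    v = np.asarray(values, dtype=float)
    med = np.median(v)
    mad = 1.4826 * np.median(np.abs(v - med))
    return np.abs(v - med) > threshold * mad

# hypothetical tooth-enamel d18O values with one isotopically distinct individual
d18o = np.array([26.1, 26.4, 25.9, 26.2, 26.0, 26.3, 25.8, 26.1, 29.5])
iqr_flags = outliers_iqr(d18o)
mad_flags = outliers_mad(d18o)
```

Both rules use robust location and spread estimates, so a single extreme value cannot mask itself the way it can with mean-and-standard-deviation rules.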
Pietroniro, Al; Korhonen, Johanna; Looser, Ulrich; Hardardóttir, Jórunn; Johnsrud, Morten; Vuglinsky, Valery; Gustafsson, David; Lins, Harry F.; Conaway, Jeffrey S.; Lammers, Richard; Stewart, Bruce; Abrate, Tommaso; Pilon, Paul; Sighomnou, Daniel; Arheimer, Berit
2015-04-01
The Arctic region is an important regulating component of the global climate system, and has also experienced considerable change during recent decades. More than 10% of the world's river runoff flows to the Arctic Ocean, and there is evidence of changes in its fresh-water balance. However, about 30% of the Arctic basin is still ungauged, with differing monitoring practices and data availability among the countries in the region. A consistent system for monitoring and sharing hydrological information throughout the Arctic region is thus of the highest interest for further studies and for monitoring of the freshwater flux to the Arctic Ocean. The purpose of the Arctic-HYCOS project is to allow for the collection and sharing of hydrological data. A preliminary set of 616 stations with long-term daily discharge data was identified, and around 250 of these already provide data online in near real time. This large sample will be used in the following scientific analyses: 1) to evaluate the freshwater flux to the Arctic Ocean and Seas, 2) to monitor changes and enhance understanding of the hydrological regime and 3) to estimate flows in ungauged regions and develop models for enhanced hydrological prediction in the Arctic region. The project is intended as a component of the WMO (World Meteorological Organization) WHYCOS (World Hydrological Cycle Observing System) initiative, covering the area of the expansive transnational Arctic basin with participation from Canada, Denmark, Finland, Iceland, Norway, the Russian Federation, Sweden and the United States of America. The overall objective is to regularly collect, manage and share high quality data from a defined basic network of hydrological stations in the Arctic basin. The project focuses on collecting data on discharge and possibly sediment transport and temperature. Data should be provisional in near-real time if available, whereas time-series of historical data should be provided once quality assurance has been completed. The
Amini, Mehdi; Pourshahbaz, Abbas; Mohammadkhani, Parvaneh; Ardakani, Mohammad-Reza Khodaie; Lotfi, Mozhgan
2014-12-01
The goal of this study was to examine the construct validity of the Diagnostic and Statistical Manual of Mental Disorders-5 (DSM-5) conceptual model of antisocial and borderline personality disorders (PDs). More specifically, the aim was to determine whether the DSM-5 five-factor structure of pathological personality trait domains replicated in an independently collected sample that differs culturally from the derivation sample. The study was conducted on a sample of 346 individuals with antisocial (n = 122) or borderline PD (n = 130), and nonclinical subjects (n = 94). Participants were randomly selected from prisoners, out-patient, and in-patient clients, and were recruited from Tehran prisons and the clinical psychology and psychiatry clinics of Razi and Taleghani Hospitals, Tehran, Iran. The SCID-II-PQ, SCID-II, and DSM-5 Personality Trait Rating Form (Clinician's PTRF) were used to diagnose PDs and to assess pathological traits. The data were analyzed by exploratory factor analysis. Factor analysis revealed a 5-factor solution for the DSM-5 personality traits. The results showed that the DSM-5 has adequate construct validity in an Iranian sample with antisocial and borderline PDs. The factors were similar in number to those found in other studies, but differed in content. Exploratory factor analysis revealed five homogeneous components of antisocial and borderline PDs, which may represent personality, behavioral, and affective features central to the disorders. Furthermore, the present study helps in understanding the adequacy of the DSM-5 dimensional approach to the evaluation of personality pathology, specifically in an Iranian sample.
Mehdi Amini
2014-01-01
Background: The goal of this study was to examine the construct validity of the Diagnostic and Statistical Manual of Mental Disorders-5 (DSM-5) conceptual model of antisocial and borderline personality disorders (PDs). More specifically, the aim was to determine whether the DSM-5 five-factor structure of pathological personality trait domains replicated in an independently collected sample that differs culturally from the derivation sample. Methods: The study was conducted on a sample of 346 individuals with antisocial (n = 122) or borderline PD (n = 130), and nonclinical subjects (n = 94). Participants were randomly selected from prisoners, out-patient, and in-patient clients, and were recruited from Tehran prisons and the clinical psychology and psychiatry clinics of Razi and Taleghani Hospitals, Tehran, Iran. The SCID-II-PQ, SCID-II, and DSM-5 Personality Trait Rating Form (Clinician's PTRF) were used to diagnose PDs and to assess pathological traits. The data were analyzed by exploratory factor analysis. Results: Factor analysis revealed a 5-factor solution for the DSM-5 personality traits. The results showed that the DSM-5 has adequate construct validity in an Iranian sample with antisocial and borderline PDs. The factors were similar in number to those found in other studies, but differed in content. Conclusions: Exploratory factor analysis revealed five homogeneous components of antisocial and borderline PDs, which may represent personality, behavioral, and affective features central to the disorders. Furthermore, the present study helps in understanding the adequacy of the DSM-5 dimensional approach to the evaluation of personality pathology, specifically in an Iranian sample.
Gasparini, Patrizia; Di Cosmo, Lucio; Cenni, Enrico; Pompei, Enrico; Ferretti, Marco
2013-07-01
In the frame of a process aiming at harmonizing the National Forest Inventory (NFI) and ICP Forests Level I Forest Condition Monitoring (FCM) in Italy, we investigated (a) the long-term consistency between FCM sample points (a subsample of the first NFI, 1985, NFI_1) and recent forest area estimates (after the second NFI, 2005, NFI_2) and (b) the effect of the tree selection method (tree-based or plot-based) on sample composition and defoliation statistics. The two investigations were carried out on 261 and 252 FCM sites, respectively. Results show that some individual forest categories (larch and stone pine, Norway spruce, other coniferous, beech, temperate oaks and cork oak forests) are over-represented and others (hornbeam and hophornbeam, other deciduous broadleaved and holm oak forests) are under-represented in the FCM sample. This is probably due to a change in forest cover, which has increased by 1,559,200 ha from 1985 to 2005. In the case of a shift from a tree-based to a plot-based selection method, 3,130 (46.7%) of the original 6,703 sample trees would be abandoned, and 1,473 new trees would be selected. The balance between the exclusion of former sample trees and the inclusion of new ones would be particularly unfavourable for conifers (with only 16.4% of excluded trees replaced by new ones) and less so for deciduous broadleaves (with 63.5% of excluded trees replaced). The total number of tree species surveyed would not be impacted, while the number of trees per species would, and the resulting (plot-based) sample composition would have a much larger frequency of deciduous broadleaved trees. The newly selected trees have, in general, smaller diameters at breast height (DBH) and lower defoliation scores. Given the larger rate of turnover, the deciduous broadleaved part of the sample would be more impacted. Our results suggest that both a revision of the FCM network to account for forest area change and a plot-based approach to permit statistical inference and avoid bias in the tree sample
Feiveson, Alan H.; Foy, Millennia; Ploutz-Snyder, Robert; Fiedler, James
2014-01-01
Do you have elevated p-values? Is the data analysis process getting you down? Do you experience anxiety when you need to respond to criticism of statistical methods in your manuscript? You may be suffering from Insufficient Statistical Support Syndrome (ISSS). For symptomatic relief of ISSS, come for a free consultation with JSC biostatisticians at our help desk during the poster sessions at the HRP Investigators Workshop. Get answers to common questions about sample size, missing data, multiple testing, when to trust the results of your analyses and more. Side effects may include sudden loss of statistics anxiety, improved interpretation of your data, and increased confidence in your results.
Banhatti, D.G.; Ananthakrishnan, S.
1989-01-01
We present 327-MHz interplanetary scintillation (IPS) observations of an unbiased sample of 90 extragalactic radio sources selected from the ninth Ooty lunar occultation list. The sources are brighter than 0.75 Jy at 327 MHz and lie outside the galactic plane. We derive the fraction of scintillating flux density and the equivalent Gaussian diameter of the scintillating structure. Various correlations are found between the observed parameters. In particular, the scintillating component weakens and broadens with increasing largest angular size, and stronger scintillators have more compact scintillating components. (author)
Yan, Zhiqiang; Jerabkova, Tereza; Kroupa, Pavel
2017-11-01
Here we present a full description of the integrated galaxy-wide initial mass function (IGIMF) theory in terms of optimal sampling and compare it with available observations. Optimal sampling is the method we use to discretize the IMF deterministically into stellar masses. Evidence indicates that nature may be closer to deterministic sampling, as observations suggest a smaller scatter of various relevant observables than random sampling would give, which may result from a high level of self-regulation during the star formation process. We document the variation of IGIMFs under various assumptions. The results of the IGIMF theory are consistent with the empirical relation between the total mass of a star cluster and the mass of its most massive star, and the empirical relation between the star formation rate (SFR) of a galaxy and the mass of its most massive cluster. In particular, we note a natural agreement with the empirical relation between the IMF power-law index and the SFR of a galaxy. The IGIMF also results in a relation between the SFR of a galaxy and the mass of its most massive star such that, if there were no binaries, galaxies with SFR ... For the first time, we show optimally sampled galaxy-wide IMFs (OSGIMF) that mimic the IGIMF with an additional serrated feature. Finally, a Python module, GalIMF, is provided allowing the calculation of the IGIMF and OSGIMF dependent on the galaxy-wide SFR and metallicity. A copy of the Python code is available at the CDS via anonymous ftp to http://cdsarc.u-strasbg.fr (http://130.79.128.5) or via http://cdsarc.u-strasbg.fr/viz-bin/qcat?J/A+A/607/A126
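For contrast with optimal (deterministic) sampling, random sampling of a power-law IMF by inverse transform can be sketched as below. This uses a plain single-slope power law, not the full IGIMF; the Salpeter index and mass limits are conventional textbook choices, not values from the paper:

```python
import numpy as np

def sample_powerlaw_imf(n, alpha=2.35, m_min=0.1, m_max=100.0, rng=None):
    """Draw n stellar masses from a power-law IMF dN/dm ~ m**(-alpha)
    via inverse-transform sampling (i.e. random, not optimal, sampling)."""
    rng = rng or np.random.default_rng()
    u = rng.uniform(size=n)
    a = 1.0 - alpha
    # invert the truncated power-law CDF analytically
    return (m_min**a + u * (m_max**a - m_min**a)) ** (1.0 / a)

rng = np.random.default_rng(3)
masses = sample_powerlaw_imf(10_000, rng=rng)

# scatter of the most massive star across many random realisations;
# under optimal sampling this quantity would be deterministic
maxima = [sample_powerlaw_imf(1000, rng=rng).max() for _ in range(200)]
scatter = float(np.std(maxima))
```

The nonzero `scatter` of the most-massive-star mass is exactly the random-sampling spread that the abstract argues is larger than what observations show.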
Understanding Statistics - Cancer Statistics
Annual reports of U.S. cancer statistics including new cases, deaths, trends, survival, prevalence, lifetime risk, and progress toward Healthy People targets, plus statistical summaries for a number of common cancer types.
Bowden, Jack; Del Greco M, Fabiola; Minelli, Cosetta; Davey Smith, George; Sheehan, Nuala A; Thompson, John R
2016-12-01
MR-Egger regression has recently been proposed as a method for Mendelian randomization (MR) analyses incorporating summary-data estimates of causal effect from multiple individual variants, which is robust to invalid instruments. It can be used to test for directional pleiotropy and provides an estimate of the causal effect adjusted for its presence. MR-Egger regression provides a useful additional sensitivity analysis to the standard inverse-variance weighted (IVW) approach, which assumes all variants are valid instruments. Both methods use weights that consider the single nucleotide polymorphism (SNP)-exposure associations to be known, rather than estimated. We call this the 'NO Measurement Error' (NOME) assumption. Causal effect estimates from the IVW approach exhibit weak instrument bias whenever the genetic variants utilized violate the NOME assumption, which can be reliably measured using the F-statistic. The effect of NOME violation on MR-Egger regression has yet to be studied. An adaptation of the I² statistic from the field of meta-analysis is proposed to quantify the strength of NOME violation for MR-Egger. It lies between 0 and 1, and indicates the expected relative bias (or dilution) of the MR-Egger causal estimate in the two-sample MR context. We call it I²GX. The method of simulation extrapolation is also explored to counteract the dilution. Their joint utility is evaluated using simulated data and applied to a real MR example. In simulated two-sample MR analyses we show that, when a causal effect exists, the MR-Egger estimate of causal effect is biased towards the null when NOME is violated, and the stronger the violation (as indicated by lower values of I²GX), the stronger the dilution. When additionally all genetic variants are valid instruments, the type I error rate of the MR-Egger test for pleiotropy is inflated and the causal effect underestimated. Simulation extrapolation is shown to substantially mitigate these adverse effects.
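The two estimators and the dilution diagnostic can be sketched directly from summary statistics. The toy numbers below are invented; the IVW weights are the standard first-order weights, and the I-squared diagnostic is written here as a plain weighted Q statistic over the SNP-exposure estimates — treat the exact form as an assumption rather than the paper's definition.

```python
import math

def ivw(bx, by, sey):
    """Inverse-variance weighted estimate: weighted regression through the origin."""
    w = [1.0 / s**2 for s in sey]
    return sum(wi * x * y for wi, x, y in zip(w, bx, by)) / \
           sum(wi * x * x for wi, x in zip(w, bx))

def mr_egger(bx, by, sey):
    """Weighted least squares with an intercept (the directional-pleiotropy term)."""
    w = [1.0 / s**2 for s in sey]
    sw = sum(w)
    xb = sum(wi * x for wi, x in zip(w, bx)) / sw
    yb = sum(wi * y for wi, y in zip(w, by)) / sw
    sxx = sum(wi * (x - xb)**2 for wi, x in zip(w, bx))
    sxy = sum(wi * (x - xb) * (y - yb) for wi, x, y in zip(w, bx, by))
    slope = sxy / sxx
    return yb - slope * xb, slope          # (intercept, causal slope)

def i2_gx(bx, sex):
    """Higgins-style I² over the SNP-exposure estimates: values near 1 mean NOME
    is a good approximation; low values signal dilution of the MR-Egger slope."""
    w = [1.0 / s**2 for s in sex]
    xb = sum(wi * x for wi, x in zip(w, bx)) / sum(w)
    q = sum(wi * (x - xb)**2 for wi, x in zip(w, bx))
    return max(0.0, (q - (len(bx) - 1)) / q)
```

With exactly proportional summary data the IVW and Egger slopes coincide and the Egger intercept vanishes, which is the no-pleiotropy limit.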
New low-spin states of 122Xe observed via high-statistics β-decay of 122Cs
Jigmeddorj, B.; Garrett, P. E.; Andreoiu, C.; Ball, G. C.; Bruhn, T.; Cross, D. S.; Garnsworthy, A. B.; Hadinia, B.; Moukaddam, M.; Park, J.; Pore, J. L.; Radich, A. J.; Rajabali, M. M.; Rand, E. T.; Rizwan, U.; Svensson, C. E.; Voss, P.; Wang, Z. M.; Wood, J. L.; Yates, S. W.
2018-05-01
Excited states of 122Xe were studied via the β+/EC decay of 122Cs with the 8π γ-ray spectrometer at the TRIUMF-ISAC facility. Compton-suppressed HPGe detectors were used for measurements of γ-ray intensities, γγ coincidences, and γ-γ angular correlations. Two sets of data were collected to optimize the decays of the ground (21.2 s) and isomeric (3.7 min) states of 122Cs. The data collected have enabled the observation of about 505 new transitions and about 250 new levels, including 51 new low-spin states. Spin assignments have been made for 58 low-spin states based on the deduced β-decay feeding and γ-γ angular correlation analyses.
Babcock, Chad; Finley, Andrew O.; Andersen, Hans-Erik; Pattison, Robert; Cook, Bruce D.; Morton, Douglas C.; Alonzo, Michael; Nelson, Ross; Gregoire, Timothy; Ene, Liviu; Gobakken, Terje; Næsset, Erik
2018-06-01
The goal of this research was to develop and examine the performance of a geostatistical coregionalization modeling approach for combining field inventory measurements, strip samples of airborne lidar and Landsat-based remote sensing data products to predict aboveground biomass (AGB) in interior Alaska's Tanana Valley. The proposed modeling strategy facilitates pixel-level mapping of AGB density predictions across the entire spatial domain. Additionally, the coregionalization framework allows for statistically sound estimation of total AGB for arbitrary areal units within the study area---a key advance to support diverse management objectives in interior Alaska. This research focuses on appropriate characterization of prediction uncertainty in the form of posterior predictive coverage intervals and standard deviations. Using the framework detailed here, it is possible to quantify estimation uncertainty for any spatial extent, ranging from pixel-level predictions of AGB density to estimates of AGB stocks for the full domain. The lidar-informed coregionalization models consistently outperformed their counterpart lidar-free models in terms of point-level predictive performance and total AGB precision. Additionally, the inclusion of Landsat-derived forest cover as a covariate further improved estimation precision in regions with lower lidar sampling intensity. Our findings also demonstrate that model-based approaches that do not explicitly account for residual spatial dependence can grossly underestimate uncertainty, resulting in falsely precise estimates of AGB. On the other hand, in a geostatistical setting, residual spatial structure can be modeled within a Bayesian hierarchical framework to obtain statistically defensible assessments of uncertainty for AGB estimates.
Weiming Tang
Respondent-driven sampling (RDS) is well recognized as a method for sampling from hard-to-reach populations such as commercial sex workers, drug users, and men who have sex with men (MSM). However, the feasibility of this sampling strategy for recruiting a diverse spectrum of these hidden populations is not yet well understood in developing countries. In a cross-sectional study in Nanjing city of Jiangsu province of China, 430 MSM, including 9 seeds, were recruited using RDS over a 14-week study period. Information regarding socio-demographic characteristics and sexual risk behavior was collected, and testing was done for HIV and syphilis. Duration, completion, participant characteristics, and the equilibrium of key factors were used for assessing the feasibility of RDS. Homophily of key variables, socio-demographic distribution, and social network size were used as indicators of diversity. In the study sample, adjusted HIV and syphilis prevalence were 6.6% and 14.6%, respectively. The majority (96.3%) of the participants were recruited by members of their own social network. Although there was a tendency for recruitment within the same self-identified group (homosexuals recruited 60.0% homosexuals), considerable cross-group recruitment (bisexuals recruited 52.3% homosexuals) was also seen. Homophily of the self-identified sexual orientations was 0.111 for homosexuals. Upon completion of the recruitment process, participant characteristics and the equilibrium of key factors indicated that RDS was feasible for sampling MSM in Nanjing. Participants recruited by RDS were found to be diverse after assessing the homophily of key variables in successive waves of recruitment, the proportion of characteristics after reaching equilibrium, and the social network size. The observed design effects were nearly the same or even better than the theoretical design effect of 2. RDS was found to be an efficient and feasible sampling method for recruiting a diverse [...]
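The homophily figures quoted above can be illustrated with one simple index used in the RDS literature (a Heckathorn-style affiliation index; whether this study used exactly this form is an assumption). It compares in-group recruitment against the group's share of the sample.

```python
def homophily(in_group, total, group_share):
    """Affiliation-style homophily index.

    in_group:     recruits who share the recruiter's group
    total:        all recruits made by members of that group
    group_share:  the group's proportion of the sample

    Returns 0 for random mixing, +1 for pure in-group recruitment,
    and negative values for heterophily (out-group preference).
    """
    s = in_group / total
    if s >= group_share:
        return (s - group_share) / (1 - group_share)
    return (s - group_share) / group_share
```

For example, if a group made up half the sample but drew 60% of its recruits from its own members, this index would be 0.2 — a mild in-group preference.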
Nelson, N.; Munoz-Carpena, R.; Phlips, E. J.
2017-12-01
Diversity in the eco-physiological adaptations of cyanobacteria genera creates challenges for water managers who are tasked with developing appropriate actions for controlling not only the intensity and frequency of cyanobacteria blooms, but also reducing the potential for blooms of harmful taxa (e.g., toxin producers, N2 fixers). Compounding these challenges, the efficacy of nutrient management strategies (phosphorus-only versus nitrogen-and-phosphorus) for cyanobacteria bloom abatement is the subject of an ongoing debate, which increases uncertainty associated with bloom mitigation decision-making. In this work, we analyze a unique long-term (17-year) dataset composed of monthly observations of cyanobacteria genera abundances, zooplankton abundances, water quality, and flow from Lake George, a bloom-impacted flow-through lake of the St. Johns River (FL, USA). Using the Random Forests machine learning algorithm, an assumption-free ensemble modeling approach, the dataset was evaluated to quantify and characterize relationships between environmental conditions and seven cyanobacteria groupings: five genera (Anabaena, Cylindrospermopsis, Lyngbya, Microcystis, and Oscillatoria) and two functional groups (N2 fixers and non-fixers). Results highlight the selectivity of nitrogen in describing genera and functional group dynamics, and potential for physical effects to limit the efficacy of nutrient management as a mechanism for cyanobacteria bloom mitigation.
Lentz, C. L.; Baker, D. N.; Jaynes, A. N.; Dewey, R. M.; Lee, C. O.; Halekas, J. S.; Brain, D. A.
2018-02-01
Normal solar wind flows and intense solar transient events interact directly with the upper Martian atmosphere due to the absence of an intrinsic global planetary magnetic field. Since the launch of the Mars Atmosphere and Volatile EvolutioN (MAVEN) mission, there are now new means to directly observe solar wind parameters at the planet's orbital location for limited time spans. Due to MAVEN's highly elliptical orbit, in situ measurements cannot be taken while MAVEN is inside Mars' magnetosheath. To model solar wind conditions during these atmospheric and magnetospheric passages, this research project utilized the solar wind forecasting capabilities of the WSA-ENLIL+Cone model. The model was used to simulate solar wind parameters that included magnetic field magnitude, plasma particle density, dynamic pressure, proton temperature, and velocity during a four Carrington rotation-long segment. An additional simulation that lasted 18 Carrington rotations was then conducted. The precision of each simulation was examined for intervals when MAVEN was in the upstream solar wind, that is, with no exospheric or magnetospheric phenomena altering in situ measurements. It was determined that generalized, extensive simulations have comparable prediction capabilities as shorter, more comprehensive simulations. Generally, this study aimed to quantify the loss of detail in long-term simulations and to determine if extended simulations can provide accurate, continuous upstream solar wind conditions when there is a lack of in situ measurements.
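Quantifying how well the long WSA-ENLIL+Cone runs track upstream measurements comes down to standard point-comparison metrics such as RMSE and linear correlation. A minimal sketch with synthetic numbers (not MAVEN data):

```python
import math

def rmse(model, obs):
    """Root-mean-square error between modelled and observed series."""
    return math.sqrt(sum((m - o) ** 2 for m, o in zip(model, obs)) / len(obs))

def pearson(a, b):
    """Linear correlation: does the model reproduce the timing of structures?"""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

# toy check: a model with small offsets still correlates while RMSE records the bias
obs   = [400.0, 420.0, 600.0, 520.0, 450.0]   # e.g. solar wind speed, km/s
model = [410.0, 430.0, 590.0, 530.0, 440.0]
```

Comparing such scores between the 4-rotation and 18-rotation runs is one way to make "comparable prediction capabilities" concrete.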
inference and finite population sampling. Sudhakar Kunte. Elements of statistical computing are discussed in this series. ... which captain gets an option to decide whether to field first or bat first ... may of course not be fair, in the sense that the team which wins ... describe two methods of drawing a random number between 0.
Jacobsen, Martin; Martinussen, Torben
2016-01-01
Pseudo-values have proven very useful in censored data analysis in complex settings such as multi-state models. They were originally suggested by Andersen et al. (Biometrika, 90, 2003, 335), who also proposed estimating standard errors using classical generalized estimating equation results. These results were studied more formally in Graw et al. (Lifetime Data Anal., 15, 2009, 241), which derived some key results based on a second-order von Mises expansion. However, results concerning large-sample properties of estimates based on regression models for pseudo-values still seem unclear. In this paper, we study these large-sample properties in the simple setting of survival probabilities and show that the estimating function can be written as a U-statistic of second order, giving rise to an additional term that does not vanish asymptotically. We further show that the previously advocated standard error [...]
Bel-Peña, N; Mérida-de la Torre, F J
2015-01-01
To check whether an intervention based on direct observation and complementary information to nurses helps reduce haemolysis when drawing blood specimens. Random sampling study in primary care centres in the Serranía de Málaga health management area, using a cross-sectional, longitudinal pre- and post-intervention design. The study period was from August 2012 to January 2015. The level of free haemoglobin was measured by direct spectrophotometry in the specimens extracted. It was then checked whether the intervention influenced the level of haemolysis, and whether this was maintained over time. The mean haemolysis measured pre-intervention was 17%, and after the intervention it was 6.1%. A year later, under the same conditions, the frequency of haemolysis was measured again in the samples analysed; the percentage was 9%. This result is low compared to the level obtained pre-intervention, but higher than the level obtained immediately after the intervention. The transport and analysis conditions were the same. An intervention based on direct and informative observation during the collection of blood samples contributes significantly to reducing the level of haemolysis. This effect is maintained over time, but the intervention needs to be repeated to maintain its effectiveness. Audits and continuing education programmes are useful for quality assurance procedures and for maintaining the level of care needed for good quality of care. Copyright © 2015 SECA. Published by Elsevier España. All rights reserved.
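Whether a drop from 17% to 6.1% haemolysis is statistically significant can be checked with a two-proportion z-test. The sample sizes below are hypothetical, since the abstract does not report them.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """z statistic for H0: p1 == p2, using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# hypothetical counts matching the reported rates (17% pre, 6.1% post)
z = two_proportion_z(170, 1000, 61, 1000)
```

With samples of this assumed size, |z| far exceeds 1.96, so the reduction would be significant at the 5% level.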
USING STATISTICAL SURVEY IN ECONOMICS
Delia TESELIOS
2012-01-01
Statistical survey is an effective method of statistical investigation that involves gathering quantitative data. It is often preferred in statistical reports because of the information that can be obtained about an entire population by observing only a part of it. Because of the information they provide, surveys are used in many research areas. In economics, they support decision-making: choosing competitive strategies, analysing certain economic phenomena, and formulating forecasts. The economic study presented in this paper illustrates how simple random sampling is used to analyse the parking-space situation in a given locality.
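A simple-random-sampling estimate of this kind, with the usual finite-population correction, can be sketched as follows. The parking-space numbers are invented for illustration and are not the paper's data.

```python
import math
import random

def srs_proportion(population, n, seed=0):
    """Estimate a population proportion from a simple random sample,
    with a 95% CI using the finite population correction."""
    rng = random.Random(seed)
    sample = rng.sample(population, n)
    p = sum(sample) / n
    N = len(population)
    se = math.sqrt((N - n) / (N - 1) * p * (1 - p) / n)
    return p, (p - 1.96 * se, p + 1.96 * se)

# hypothetical locality: 1000 parking spaces, 60% occupied; audit a random 200
spaces = [1] * 600 + [0] * 400
p_hat, (lo, hi) = srs_proportion(spaces, 200)
```

The finite population correction shrinks the interval as the sampling fraction n/N grows, which matters when a survey covers a sizeable share of a small locality.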
Blasques, F.; Koopman, S.J.; Lasak, K.A.; Lucas, A.
2016-01-01
We study the performance of alternative methods for calculating in-sample confidence and out-of-sample forecast bands for time-varying parameters. The in-sample bands reflect parameter uncertainty, while the out-of-sample bands reflect not only parameter uncertainty, but also innovation [...]
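Out-of-sample bands of this kind are often built by simulation: propagate many future paths of the model and read off quantiles at each horizon. A minimal AR(1) sketch of the mechanics (the paper's models are observation-driven and more elaborate; all parameter values here are illustrative):

```python
import random

def ar1_forecast_bands(y_last, phi, sigma, horizon, n_sims=2000, seed=1):
    """95% forecast bands for an AR(1) by Monte Carlo path simulation."""
    rng = random.Random(seed)
    paths = []
    for _ in range(n_sims):
        y, path = y_last, []
        for _ in range(horizon):
            y = phi * y + rng.gauss(0.0, sigma)   # innovation uncertainty
            path.append(y)
        paths.append(path)
    bands = []
    for h in range(horizon):
        vals = sorted(p[h] for p in paths)
        bands.append((vals[int(0.025 * n_sims)], vals[int(0.975 * n_sims)]))
    return bands
```

Parameter uncertainty would be added on top by also drawing (phi, sigma) from their estimated sampling distribution before simulating each path; the bands then widen accordingly.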
Lafontaine, Sean J V; Sawada, M; Kristjansson, Elizabeth
2017-02-16
With the expansion and growth of research on neighbourhood characteristics, there is an increased need for direct observational field audits. Herein, we introduce a novel direct observational audit method and systematic social observation instrument (SSOI) for efficiently assessing neighbourhood aesthetics over large urban areas. Our audit method uses spatial random sampling stratified by residential zoning and incorporates both mobile geographic information systems technology and virtual environments. The reliability of our method was tested in two ways: first, in 15 Ottawa neighbourhoods, we compared results at audited locations over two subsequent years, and second, we audited every residential block (167 blocks) in one neighbourhood and compared the distribution of SSOI aesthetics index scores with results from the randomly audited locations. Finally, we present interrater reliability and consistency results on all observed items. The observed neighbourhood average aesthetics index score estimated from four or five stratified random audit locations is sufficient to characterize the average neighbourhood aesthetics. The SSOI was internally consistent and demonstrated good to excellent interrater reliability. At the neighbourhood level, aesthetics is positively related to SES and physical activity and negatively correlated with BMI. The proposed approach to direct neighbourhood auditing performs sufficiently and has the advantage of financial and temporal efficiency when auditing a large city.
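Spatial random sampling stratified by zoning can be sketched as follows, with strata idealized as rectangles and audit points allocated proportionally to stratum area. The real instrument works on zoning polygons in a GIS; rectangles and the zone names below are simplifying assumptions.

```python
import random

def stratified_points(strata, n_total, seed=0):
    """strata: {name: (xmin, ymin, xmax, ymax)}.
    Returns audit locations per stratum, allocated proportionally to area."""
    rng = random.Random(seed)
    areas = {k: (x2 - x1) * (y2 - y1) for k, (x1, y1, x2, y2) in strata.items()}
    total_area = sum(areas.values())
    points = {}
    for k, (x1, y1, x2, y2) in strata.items():
        n_h = max(1, round(n_total * areas[k] / total_area))   # proportional allocation
        points[k] = [(rng.uniform(x1, x2), rng.uniform(y1, y2))
                     for _ in range(n_h)]
    return points

zones = {"low_density": (0, 0, 4, 4), "high_density": (4, 0, 6, 4)}
pts = stratified_points(zones, 30)
```

Stratifying by zoning guarantees every zone type contributes audit locations, so small but distinctive zones are not missed by chance.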
Park, Subok; Zhang, George Z.; Zeng, Rongping; Myers, Kyle J.
2014-03-01
A task-based assessment of image quality [1] for digital breast tomosynthesis (DBT) can be done in either the projected or reconstructed data space. As the choice of observer models and feature selection methods can vary depending on the type of task and data statistics, we previously investigated the performance of two channelized-Hotelling observer models in conjunction with 2D Laguerre-Gauss (LG) and two implementations of partial least squares (PLS) channels along with that of the Hotelling observer in binary detection tasks involving DBT projections [2, 3]. The difference in these observers lies in how the spatial correlation in DBT angular projections is incorporated in the observer's strategy to perform the given task. In the current work, we extend our method to the reconstructed data space of DBT. We investigate how various model observers including the aforementioned compare for performing the binary detection of a spherical signal embedded in structured breast phantoms with the use of DBT slices reconstructed via filtered back projection. We explore how well the model observers incorporate the spatial correlation between different numbers of reconstructed DBT slices while varying the number of projections. For this, relatively small and large scan angles (24° and 96°) are used for comparison. Our results indicate that 1) given a particular scan angle, the number of projections needed to achieve the best performance for each observer is similar across all observer/channel combinations, i.e., Np = 25 for scan angle 96° and Np = 13 for scan angle 24°, and 2) given these sufficient numbers of projections, the number of slices for each observer to achieve the best performance differs depending on the channel/observer types, which is more pronounced in the narrow scan angle case.
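The figure of merit behind these comparisons is the channelized Hotelling SNR: images are projected onto a few channels, and SNR² = Δv̄ᵀ K⁻¹ Δv̄, with Δv̄ the mean signal difference in channel space and K the channel covariance. A two-channel sketch with Cramer's rule standing in for a general linear solver (real LG or PLS channels and estimated covariances would replace the toy inputs):

```python
def solve2(K, b):
    """Solve the 2x2 system K x = b by Cramer's rule."""
    det = K[0][0] * K[1][1] - K[0][1] * K[1][0]
    return [(K[1][1] * b[0] - K[0][1] * b[1]) / det,
            (K[0][0] * b[1] - K[1][0] * b[0]) / det]

def hotelling_snr(dv, K):
    """Channelized Hotelling SNR for channel-space signal dv, covariance K."""
    x = solve2(K, dv)                        # x = K^-1 dv
    return (dv[0] * x[0] + dv[1] * x[1]) ** 0.5
```

With uncorrelated unit-variance channels the SNR reduces to the Euclidean norm of the signal; correlated noise can help or hurt depending on how the signal aligns with the noise correlation, which is exactly what the prewhitening step K⁻¹ accounts for.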
Tennant, M; Kruger, E
2014-01-01
In Australia, over the past 30 years, the prevalence of dental decay in children has reduced significantly: today 60-70% of all 12-year-olds are caries free, and only 10% of children have more than two decayed teeth. However, many studies continue to report a small but significant subset of children suffering severe levels of decay. The present study applies Monte Carlo simulation to examine, at the national level, 12-year-old decayed, missing or filled teeth, and to shed light on both the statistical limitations of Australia's reporting to date and the problem of targeting high-risk children. A simulation for 273 000 Australian 12-year-old children found that moving between different levels of geographic clustering produced different statistical influences that drive different conclusions. At the high scale (i.e. state level), the gross averaging of the non-normally distributed disease burden masks the small subset of disease-bearing children. At much higher acuity of analysis (i.e. local government area), the risk of low numbers in the sample becomes a significant issue. The results clearly highlight, first, the importance of care when examining the existing data and, second, opportunities for far greater targeting of services to children in need. The sustainability (and fairness) of universal coverage systems needs to be examined to ensure they remain highly targeted at disease burden, and not just focused on the children that are easy to reach (and suffer the least disease).
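The scale effect described above is easy to reproduce with a toy Monte Carlo: a zero-inflated, skewed decay distribution averaged over state-sized versus LGA-sized groupings. The 65% caries-free rate and the exponential tail are illustrative assumptions, not the study's fitted model.

```python
import random
import statistics

rng = random.Random(42)

def child_dmft():
    """Toy zero-inflated decay count: most children caries-free,
    a small subset carrying a high burden."""
    if rng.random() < 0.65:
        return 0
    return min(16, int(rng.expovariate(1 / 3)) + 1)

population = [child_dmft() for _ in range(273_000)]

def area_means(area_size):
    """Mean decay count per contiguous 'area' of the given size."""
    return [statistics.fmean(population[i:i + area_size])
            for i in range(0, len(population), area_size)]

state_sd = statistics.pstdev(area_means(34_000))  # few large areas: stable means
lga_sd = statistics.pstdev(area_means(500))       # many small areas: noisy means
```

Large-area means are stable but hide the skewed tail; small-area means expose the tail but are dominated by sampling noise — the same tension the study identifies between state-level and LGA-level reporting.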
Short time-scale optical variability properties of the largest AGN sample observed with Kepler/K2
Aranzana, E.; Körding, E.; Uttley, P.; Scaringi, S.; Bloemen, S.
2018-05-01
We present the first short time-scale (~hours to days) optical variability study of a large sample of active galactic nuclei (AGNs) observed with the Kepler/K2 mission. The sample contains 252 AGN observed over four campaigns with ~30 min cadence, selected from the Million Quasar Catalogue with R magnitude <19. We performed time series analysis to determine their variability properties by means of the power spectral densities (PSDs) and applied Monte Carlo techniques to find the best model parameters that fit the observed power spectra. A power-law model is sufficient to describe all the PSDs of our sample. A variety of power-law slopes were found, indicating that there is not a universal slope for all AGNs. We find that the rest-frame amplitude variability in the frequency range of 6 × 10⁻⁶-10⁻⁴ Hz varies from 1 to 10 per cent with an average of 1.7 per cent. We explore correlations between the variability amplitude and key parameters of the AGN, finding a significant correlation of rest-frame short-term variability amplitude with redshift. We attribute this effect to the known 'bluer when brighter' variability of quasars combined with the fixed bandpass of Kepler data. This study also enables us to distinguish between Seyferts and blazars and confirm AGN candidates. For our study, we have compared results obtained from light curves extracted using different aperture sizes and with and without detrending. We find that limited detrending of the optimal photometric precision light curve is the best approach, although some systematic effects still remain present.
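The PSD analysis reduces to computing a periodogram and fitting a power-law slope in log-log space. A compact sketch using an O(N²) DFT for clarity (the study would use FFTs and full Monte Carlo fitting of the slope):

```python
import cmath
import math

def periodogram(x, dt=1.0):
    """One-sided periodogram of an evenly sampled series (slow DFT for clarity)."""
    n = len(x)
    mean = sum(x) / n
    freqs, power = [], []
    for k in range(1, n // 2):
        s = sum((x[j] - mean) * cmath.exp(-2j * math.pi * k * j / n)
                for j in range(n))
        freqs.append(k / (n * dt))
        power.append(2.0 * dt * abs(s) ** 2 / n)
    return freqs, power

def loglog_slope(freqs, power):
    """Least-squares slope of log(power) vs log(freq): the PSD power-law index."""
    lx = [math.log(f) for f in freqs]
    ly = [math.log(p) for p in power]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    return sum((a - mx) * (b - my) for a, b in zip(lx, ly)) / \
           sum((a - mx) ** 2 for a in lx)

# a sinusoid at DFT bin k=10 of a 128-point series shows up as a single PSD peak
sine = [math.sin(2 * math.pi * 10 * j / 128) for j in range(128)]
freqs, power = periodogram(sine)
```

In practice the slope fit is biased by red-noise leakage and aliasing, which is why the study fits model PSDs to the observed periodogram via Monte Carlo rather than by a direct log-log regression.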
Rohatgi, Vijay K
2003-01-01
Unified treatment of probability and statistics examines and analyzes the relationship between the two fields, exploring inferential issues. Numerous problems, examples, and diagrams--some with solutions--plus clear-cut, highlighted summaries of results. Advanced undergraduate to graduate level. Contents: 1. Introduction. 2. Probability Model. 3. Probability Distributions. 4. Introduction to Statistical Inference. 5. More on Mathematical Expectation. 6. Some Discrete Models. 7. Some Continuous Models. 8. Functions of Random Variables and Random Vectors. 9. Large-Sample Theory. 10. General Meth
勢理客, 友子; 名嘉嶺, 苗子; 喜友名, 千佳子; 又吉, 重光; 野田, 寛; Serikyaku, Tomoko; Nakamine, Naeko; Kiyuna, Chikako; Matayoshi, Shigemitsu; Noda, Yutaka; 琉球大学医学部附属病院耳鼻咽喉科
1982-01-01
Statistical analyses are presented regarding the 69 patients with vertigo seen in the Oto-Rhino-Laryngological Department of the Ryukyus University Hospital, and the following features were observed: 1. There were on average 7.7 patients with vertigo per month, an increase compared with the average of 4.9 patients a month in 1979. 2. Many of the patients presented in the third to fifth decades of age in both sexes, and female patients were 2.5 times more numerous than male patients. 3. The p...
Observed Statistics of Extreme Waves
2006-12-01
hardware used, the reader is referred to Tinder (2000) and Ardhuin et al. (2003). SHOWEX took place during a particularly active hurricane season... c10, 5925-5938. Tinder, C. V. (2000). Swell Transformation Across the Continental Shelf. Master's Thesis, Naval Postgraduate School, Monterey
Marković, Snežana; Kerč, Janez; Horvat, Matej
2017-03-01
We present a new approach to identifying sources of variability within a manufacturing process by NIR measurements of samples of intermediate material after each consecutive unit operation (interprocess NIR sampling technique). In addition, we summarize the development of a multivariate statistical process control (MSPC) model for the production of an enteric-coated pellet product of the proton-pump inhibitor class. By developing provisional NIR calibration models, the identification of critical process points yields results comparable to the established MSPC modeling procedure. Both approaches lead to the same conclusion, identifying parameters of extrusion/spheronization and characteristics of lactose that have the greatest influence on the end-product's enteric-coating performance. The proposed approach enables quicker and easier identification of variability sources during the manufacturing process, especially when historical process data are not readily available. In the presented case, changes in lactose characteristics influence the performance of the extrusion/spheronization process step. The pellet cores produced using one (less suitable) lactose source were on average larger and more fragile, leading to breakage of the cores during subsequent fluid-bed operations. These results were confirmed by additional experimental analyses illuminating the underlying mechanism of fracture of oblong pellets during the pellet-coating process, which leads to compromised film coating.
Shizuma, Kiyoshi (shizuma@hiroshima-u.ac.jp); Endo, Satoru; Shinozaki, Kenji; Fukushima, Hiroshi (Graduate School of Engineering, Hiroshima University, Higashi-Hiroshima 739-8527, Japan)
2013-05-01
Fast-neutron activation data for 63Ni in copper samples exposed to the Hiroshima atomic bomb are important for evaluating neutron doses to the survivors. Until now, accelerator mass spectrometry (AMS) and liquid scintillation counting methods have been applied to 63Ni measurements, and data have been accumulated within 1500 m of the hypocenter. The slope of the activation curve versus distance shows reasonable agreement with the calculated result; however, data near the hypocenter are scarce. In the present work, two copper samples obtained from the Atomic Bomb Dome (155 m from the hypocenter) and the Bank of Japan building (392 m) were used for 63Ni beta-ray measurement with a Si surface-barrier detector. Additionally, microscopic observation of the metal surfaces was performed for the first time. Only an upper limit on 63Ni production was obtained for the copper sample from the Atomic Bomb Dome. The result of the 63Ni measurement for the Bank of Japan building shows reasonable agreement with the AMS measurement and with fast-neutron activation calculations based on the Dosimetry System 2002 (DS02) neutrons.
Xu, C. Kevin; Cheng Yiwen; Lu Nanyao; Mazzarella, Joseph M.; Cutri, Roc; Domingue, Donovan; Huang Jiasheng; Gao Yu; Sun, W.-H.; Surace, Jason
2010-01-01
We present Spitzer observations for a sample of close major-merger galaxy pairs (KPAIR sample) selected from cross-matches between the Two Micron All Sky Survey and Sloan Digital Sky Survey Data Release 3. The goals are to study the star formation activity in these galaxies and to set a local benchmark for the cosmic evolution of close major mergers. The Spitzer KPAIR sample (27 pairs, 54 galaxies) includes all spectroscopically confirmed spiral-spiral (S+S) and spiral-elliptical (S+E) pairs in a parent sample that is complete for primaries brighter than K = 12.5 mag, projected separations of 5 h⁻¹ kpc ≤ s ≤ 20 h⁻¹ kpc, and mass ratios ≤ 2.5. The Spitzer data, consisting of images in seven bands (3.6, 4.5, 5.8, 8, 24, 70, 160 μm), show very diversified IR emission properties. Compared to single spiral galaxies in a control sample, only spiral galaxies in S+S pairs show significantly enhanced specific star formation rate (sSFR = SFR/M), whereas spiral galaxies in S+E pairs do not. Furthermore, the SFR enhancement of spiral galaxies in S+S pairs is highly mass-dependent. Only those with M ≳ 10^10.5 M_sun show significant enhancement. Relatively low-mass (M ~ 10^10 M_sun) spirals in S+S pairs have about the same SFR/M compared to their counterparts in the control sample, while those with M ~ 10^11 M_sun have on average a ~3 times higher SFR/M than single spirals. There is evidence for a correlation between the global star formation activities (but not the nuclear activities) of the component galaxies in massive S+S major-merger pairs (the Holmberg effect). There is no significant difference in the SFR/M between the primaries and the secondaries, nor between spirals of [...] KPAIR = 2.54 × 10⁻⁴ M_sun yr⁻¹ Mpc⁻³.
Holoien, Thomas W.-S. (Ohio State U.; KIPAC/SLAC); Marshall, Philip J.; Wechsler, Risa H. (KIPAC/SLAC)
2017-05-11
We describe two new open-source tools written in Python for performing extreme deconvolution Gaussian mixture modeling (XDGMM) and using a conditioned model to re-sample observed supernova and host galaxy populations. XDGMM is a new program that uses Gaussian mixtures to perform density estimation of noisy data using extreme deconvolution (XD) algorithms. Additionally, it has functionality not available in other XD tools. It allows the user to select between the AstroML and Bovy et al. fitting methods and is compatible with scikit-learn machine learning algorithms. Most crucially, it allows the user to condition a model based on the known values of a subset of parameters. This gives the user the ability to produce a tool that can predict unknown parameters based on a model that is conditioned on known values of other parameters. EmpiriciSN is an exemplary application of this functionality, which can be used to fit an XDGMM model to observed supernova/host data sets and predict likely supernova parameters using a model conditioned on observed host properties. It is primarily intended to simulate realistic supernovae for LSST data simulations based on empirical galaxy properties.
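The conditioning step that makes XDGMM useful for prediction follows from the standard conditional-Gaussian formulas, applied per component with the mixture weights re-scaled by each component's likelihood of the observed values. A two-dimensional pure-Python sketch (XDGMM itself operates on full covariance matrices of arbitrary dimension):

```python
import math

def condition_component(mu, cov, x2):
    """Condition a 2-D Gaussian over (x1, x2) on an observed value of x2."""
    m1, m2 = mu
    s11, s12 = cov[0]
    s21, s22 = cov[1]
    mu_c = m1 + s12 / s22 * (x2 - m2)     # conditional mean of x1
    var_c = s11 - s12 * s21 / s22         # conditional variance of x1
    return mu_c, var_c

def condition_mixture(weights, mus, covs, x2):
    """Re-weight each component by N(x2 | mu2, s22) and condition it on x2."""
    new_w, comps = [], []
    for w, mu, cov in zip(weights, mus, covs):
        m2, s22 = mu[1], cov[1][1]
        like = math.exp(-(x2 - m2) ** 2 / (2 * s22)) / math.sqrt(2 * math.pi * s22)
        new_w.append(w * like)
        comps.append(condition_component(mu, cov, x2))
    z = sum(new_w)
    return [w / z for w in new_w], comps
```

The re-weighting is what lets an observed host property pick out the mixture components most consistent with it, from which likely supernova parameters can then be sampled.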
Chowdhury, P. K.
2000-10-01
On IR multiphoton excitation, vibrationally highly excited acrolein molecules undergo concerted dissociation, generating CO and ethylene. The vibrationally excited products, CO and ethylene, are detected immediately following the CO₂ laser pulse by observing IR fluorescence at 4.7 and 3.2 μm, respectively. The nascent CO is formed with significant vibrational excitation, with a Boltzmann population distribution for the v = 1-12 levels corresponding to Tv = 12 950 ± 50 K. The average vibrational energy in the product CO is found to be 26 kcal mol⁻¹, in contrast to its statistical share of 5 kcal mol⁻¹ available from the product energy distribution. The nascent vibrationally excited ethylene either dissociates by absorbing further infrared laser photons from the tail of the CO₂ laser pulse or relaxes by collisional deactivation. The ethylene IR-fluorescence excitation spectrum showed structure in the quasi-continuum, with a facile resonance at 10.53 μm corresponding to the 10P(14) CO₂ laser line, which explains the higher acetylene yield observed at higher pressure. A hydrogen-atom transfer mechanism followed by impulsive C-C bond rupture in the acrolein transition state may be responsible for such a non-statistical product energy distribution.
Ott, Julien G.; Becce, Fabio; Monnin, Pascal; Schmidt, Sabine; Bochud, François O.; Verdun, Francis R.
2014-08-01
The state of the art in describing image quality in medical imaging is to assess the performance of an observer conducting a task of clinical interest. This can be done by using a model observer leading to a figure of merit such as the signal-to-noise ratio (SNR). Using the non-prewhitening (NPW) model observer, we objectively characterised the evolution of its figure of merit under various acquisition conditions. The NPW model observer usually requires the modulation transfer function (MTF) as well as noise power spectra. However, although the computation of the MTF poses no problem when dealing with the traditional filtered back-projection (FBP) algorithm, this is not the case when using iterative reconstruction (IR) algorithms, such as adaptive statistical iterative reconstruction (ASIR) or model-based iterative reconstruction (MBIR). Given that the target transfer function (TTF) had already been shown to accurately express the system resolution even with non-linear algorithms, we tuned the NPW model observer, replacing the standard MTF by the TTF. The TTF was estimated using a custom-made phantom containing cylindrical inserts surrounded by water. The contrast differences between the inserts and water were plotted for each acquisition condition, and mathematical transformations were then performed leading to the TTF. As expected, the first results showed a dependency of the TTF on the image contrast and noise levels for both ASIR and MBIR. Moreover, FBP also proved to be dependent on contrast and noise when using the lung kernel. These results were then introduced in the NPW model observer. We observed an enhancement of the SNR every time we switched from FBP to ASIR to MBIR. IR algorithms greatly improve image quality, especially in low-dose conditions. Based on our results, the use of MBIR could lead to further dose reduction in several clinical applications.
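The NPW figure of merit with the TTF substituted for the MTF can be sketched in the Fourier domain as follows. This is a generic discrete-sum form of the usual NPW expression, with illustrative names; the authors' exact implementation is not given in the abstract.

```python
import numpy as np

def snr_npw(signal_spectrum, ttf, nps):
    """Non-prewhitening model-observer SNR via discrete Fourier-domain sums.

    signal_spectrum : |Delta S(f)|, amplitude spectrum of the task signal
    ttf             : target transfer function on the same frequency grid
    nps             : noise power spectrum on the same frequency grid
    """
    w2 = (signal_spectrum * ttf) ** 2   # |Delta S * TTF|^2, the NPW template power
    num = w2.sum() ** 2                 # (template energy)^2
    den = (w2 * nps).sum()              # noise power seen through the template
    return np.sqrt(num / den)
```

With a flat unit TTF and white noise of unit power this collapses to the matched-filter SNR, which is a useful sanity check.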
A Catalog Sample of Low-mass Galaxies Observed in X-Rays with Central Candidate Black Holes
Nucita, A. A.; Manni, L.; Paolis, F. De; Giordano, M.; Ingrosso, G., E-mail: nucita@le.infn.it [Department of Mathematics and Physics “E. De Giorgi”, University of Salento, Via per Arnesano, CP 193, I-73100, Lecce (Italy)
2017-03-01
We present a sample of X-ray-selected candidate black holes in 51 low-mass galaxies with z ≤ 0.055 and masses up to 10^10 M_⊙, obtained by cross-correlating the NASA-Sloan Atlas with the 3XMM catalog. We have also searched the available catalogs for radio counterparts of the black hole candidates and find that 19 of the previously selected sources also have a radio counterpart. Our results show that about 37% of the galaxies in our sample host an X-ray source (associated with a radio counterpart) spatially coincident with the galaxy center, in agreement with other recent works. For these nuclear sources, the X-ray/radio fundamental plane relation allows one to estimate the mass of the (central) candidate black holes, which are in the range 10^4 to 2 × 10^8 M_⊙ (with a median value of ≃3 × 10^7 M_⊙ and eight candidates having masses below 10^7 M_⊙). This result, while suggesting that X-ray emitting black holes in low-mass galaxies may have had a key role in the evolution of such systems, makes it even more urgent to explain how such massive objects formed in galaxies. Of course, dedicated follow-up observations in the X-ray and radio bands, as well as in the optical, are necessary in order to confirm our results.
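The mass estimates quoted come from inverting an X-ray/radio fundamental-plane relation. A hedged sketch, using the Merloni et al. (2003) coefficients as defaults (an assumption; the abstract does not state which calibration the authors adopted):

```python
def log_bh_mass(log_lr, log_lx, a=0.60, b=0.78, c=7.33):
    """Black-hole mass from the X-ray/radio fundamental plane.

    Inverts log L_R = a*log L_X + b*log M + c for log M.
    Coefficient defaults are the Merloni et al. (2003) values, an
    assumption here, not taken from this paper. Luminosities in erg/s,
    mass in solar masses.
    """
    return (log_lr - a * log_lx - c) / b
```

Given measured 5 GHz radio and 2-10 keV X-ray luminosities for a nuclear source, this returns log10 of the implied black-hole mass.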
Nicholls, David C.; Dopita, Michael A.; Sutherland, Ralph S.; Jerjen, Helmut; Kewley, Lisa J.; Basurah, Hassan
2014-01-01
In this paper we present the results of observations of 17 H II regions in thirteen galaxies from the SIGRID sample of isolated gas-rich irregular dwarf galaxies. The spectra of all but one of the galaxies exhibit the auroral [O III] 4363 Å line, from which we calculate the electron temperature, T_e, and gas-phase oxygen abundance. Five of the objects are blue compact dwarf galaxies, of which four have not previously been analyzed spectroscopically. We include one unusual galaxy which exhibits no evidence of the [N II] λλ6548,6584 lines, suggesting a particularly low metallicity (< Z_⊙/30). We compare the electron-temperature-based abundances with those derived using eight of the new strong-line diagnostics presented by Dopita et al. Using a method derived from first principles for calculating total oxygen abundance, we show that the discrepancy between the T_e-based and strong-line gas-phase abundances has now been reduced to within ∼0.07 dex. The chemical abundances are consistent with what is expected from the luminosity-metallicity relation. We derive estimates of the electron densities and find them to be between ∼5 and ∼100 cm⁻³. We find no evidence for a nitrogen plateau for objects in this sample with metallicities between 0.5 and 0.15 Z_⊙.
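The T_e method rests on the temperature sensitivity of the [O III] (λ4959 + λ5007)/λ4363 line ratio. A sketch using the classic Osterbrock fitting formula for this ratio (an assumption for illustration; the paper uses its own atomic data and method), inverted numerically by bisection:

```python
import math

def ratio_oiii(T, ne=100.0):
    """[O III] (I4959 + I5007)/I4363 ratio vs. T_e (K) and n_e (cm^-3),
    Osterbrock's low-density fitting formula (assumed here)."""
    return 7.90 * math.exp(3.29e4 / T) / (1.0 + 4.5e-4 * ne / math.sqrt(T))

def t_electron(R, ne=100.0, lo=5e3, hi=3e4):
    """Invert the ratio for T_e by bisection (the ratio falls with T)."""
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if ratio_oiii(mid, ne) > R:
            lo = mid   # ratio too high means temperature guess is too low
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

A measured ratio of order 100 corresponds to T_e of roughly 10^4 K, the regime typical of these H II regions.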
Michael Robert Cunningham
2016-10-01
The limited resource model states that self-control is governed by a relatively finite set of inner resources on which people draw when exerting willpower. Once self-control resources have been used up or depleted, they are less available for other self-control tasks, leading to a decrement in subsequent self-control success. The depletion effect has been studied for over 20 years, tested or extended in more than 600 studies, and supported in an independent meta-analysis (Hagger, Wood, Stiff, and Chatzisarantis, 2010). Meta-analyses are supposed to reduce bias in literature reviews. Carter, Kofler, Forster, and McCullough's (2015) meta-analysis, by contrast, included a series of questionable decisions involving sampling, methods, and data analysis. We provide quantitative analyses of key sampling issues: exclusion of many of the best depletion studies based on idiosyncratic criteria, and the emphasis on mini meta-analyses with low statistical power as opposed to the overall depletion effect. We discuss two key methodological issues: failure to code for research quality, and the quantitative impact of weak studies by novice researchers. We discuss two key data analysis issues: questionable interpretation of the results of trim-and-fill and funnel plot asymmetry test procedures, and the use and misinterpretation of the untested Precision Effect Test (PET) and Precision Effect Estimate with Standard Error (PEESE) procedures. Despite these serious problems, the Carter et al. meta-analysis results actually indicate that there is a real depletion effect, contrary to their title.
Gilbert, R.O.; Klover, W.J.
1988-09-01
Radiation detection surveys are used at the US Department of Energy's Hanford Reservation near Richland, Washington, to determine areas that need posting as radiation zones or to measure dose rates in the field. The relationship between measurements made by sodium iodide (NaI) detectors mounted on the mobile Road Monitor vehicle and those made by hand-held GM P-11 probes and Micro-R meters is of particular interest, because the Road Monitor can survey land areas in much less time than hand-held detectors. Statistical regression methods are used here to develop simple equations to predict GM P-11 probe gross gamma count-per-minute (cpm) and Micro-R meter μR/h measurements on the basis of NaI gross gamma count-per-second (cps) measurements obtained using the Road Monitor. These equations were estimated using data collected near the 116-K-2 Trench in the 100-K area on the Hanford Reservation. Equations are also obtained for estimating upper and lower limits within which the GM P-11 or Micro-R meter measurement corresponding to a given NaI Road Monitor measurement at a new location is expected to fall with high probability. An equation and limits for predicting GM P-11 measurements on the basis of Micro-R meter measurements are also estimated. Also, we estimate an equation that may be useful for approximating the 90Sr measurement of a surface soil sample on the basis of a spectroscopy measurement for 137Cs on that sample. 3 refs., 16 figs., 44 tabs.
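The calibration regression with prediction limits described above can be sketched as ordinary least squares with a new-observation prediction interval. This generic sketch uses a normal quantile in place of Student's t (a large-sample approximation), and is not the authors' exact procedure:

```python
import numpy as np
from statistics import NormalDist

def fit_and_predict(x, y, x_new, conf=0.95):
    """Fit y = a + b*x by least squares and return the prediction at x_new
    plus approximate lower/upper limits for a NEW observation there."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    b, a = np.polyfit(x, y, 1)                       # slope, intercept
    resid = y - (a + b * x)
    s2 = (resid ** 2).sum() / (n - 2)                # residual variance
    # standard error for predicting one new observation at x_new
    se = np.sqrt(s2 * (1 + 1 / n + (x_new - x.mean()) ** 2
                       / ((x - x.mean()) ** 2).sum()))
    z = NormalDist().inv_cdf(0.5 + conf / 2)         # approx. t quantile
    y_hat = a + b * x_new
    return y_hat, y_hat - z * se, y_hat + z * se
```

Applied to paired (NaI cps, GM P-11 cpm) calibration data, the interval gives the range within which a new hand-held reading is expected to fall for a given Road Monitor reading.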
Trull, Timothy J; Vergés, Alvaro; Wood, Phillip K; Jahng, Seungmin; Sher, Kenneth J
2012-10-01
We examined the latent structure underlying the criteria for DSM-IV-TR (American Psychiatric Association, 2000, Diagnostic and statistical manual of mental disorders (4th ed., text revision). Washington, DC: Author.) personality disorders in a large nationally representative sample of U.S. adults. Personality disorder symptom data were collected using a structured diagnostic interview from approximately 35,000 adults assessed over two waves of data collection in the National Epidemiologic Survey on Alcohol and Related Conditions. Our analyses suggested that a seven-factor solution provided the best fit for the data, and these factors were marked primarily by one or at most two personality disorder criteria sets. A series of regression analyses that used external validators tapping Axis I psychopathology, treatment for mental health problems, functioning scores, interpersonal conflict, and suicidal ideation and behavior provided support for the seven-factor solution. We discuss these findings in the context of previous studies that have examined the structure underlying the personality disorder criteria as well as the current proposals for DSM-5 personality disorders.
Kartal, Senol; Aydin, Zeki; Tokalioglu, Serife
2006-01-01
The concentrations of metals (Cd, Co, Cr, Cu, Fe, Mn, Ni, Pb, and Zn) in street sediment samples were determined by flame atomic absorption spectrometry (FAAS) using the modified BCR (European Community Bureau of Reference) sequential extraction procedure. Following the BCR protocol for extracting the metals from the relevant target phases, a 1.0 g specimen of the sample was treated sequentially with 0.11 M acetic acid (exchangeable and carbonate-bound), 0.5 M hydroxylamine hydrochloride (bound to iron and manganese oxides), and 8.8 M hydrogen peroxide plus 1 M ammonium acetate (bound to sulphides and organics). The residue was treated with aqua regia solution for recovery studies, although this step is not part of the BCR procedure. The mobility sequence based on the sum of the BCR sequential extraction stages was: Cd ∼ Zn (∼90%) > Pb (∼84%) > Cu (∼75%) > Mn (∼70%) > Co (∼57%) > Ni (∼43%) > Cr (∼40%) > Fe (∼17%). Enrichment factors, as criteria for examining the impact of anthropogenic emission sources of heavy metals, were calculated; the most enriched elements in the dust samples were Cd, Pb, and Zn, with average values of 190, 111, and 20, respectively. Correlation analysis (CA) and principal component analysis (PCA) were applied to the data matrix to evaluate the analytical results and to identify possible pollution sources of the metals. PCA revealed that the sampling area was mainly influenced by three pollution sources, namely traffic, industry, and natural sources. The results show that chemical sequential extraction is a valuable operational tool. The analytical results were validated by both recovery studies and analysis of a standard reference material (NIST SRM 2711 Montana Soil).
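The abstract reports enrichment factors but not their definition. The conventional double-normalization formula, with a conservative reference element such as Fe (an assumption; the authors' exact choice of reference element and background values is not stated), is:

```python
def enrichment_factor(c_metal, c_ref, bg_metal, bg_ref):
    """EF = (C_metal / C_ref)_sample / (C_metal / C_ref)_background.

    c_metal, c_ref   : metal and reference-element concentrations in the sample
    bg_metal, bg_ref : the same pair in the background (local soil or crust)
    EF near 1 suggests a crustal origin; EF >> 1 suggests anthropogenic input.
    """
    return (c_metal / c_ref) / (bg_metal / bg_ref)
```

An EF of ~190 for Cd, as reported, would indicate strong anthropogenic enrichment relative to the natural background.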
Krivoruchenko, M.I.
1989-01-01
A detailed statistical analysis of the angular distribution of neutrino events observed in the Kamiokande II and IMB detectors at 07:35 UT on 23 February 1987 is carried out. Distribution functions of the mean scattering angles in the reactions ν̄_e p → e⁺n and νe⁻ → νe⁻ are constructed, taking into account multiple Coulomb scattering and the experimental angular errors. The Smirnov and Wald-Wolfowitz run tests are used to test the hypothesis that the angular distributions of events from the two detectors agree with each other. We test, using the Kolmogorov and von Mises statistical criteria, the hypothesis that the recorded events all represent ν̄_e p → e⁺n inelastic scatterings. The Neyman-Pearson test is then applied to each event to test the hypothesis ν̄_e p → e⁺n against the alternative νe⁻ → νe⁻. The hypotheses that the number of elastic events equals s = 0, 1, 2, ... against the alternatives s ≠ 0, 1, 2, ... are tested on the basis of the generalized likelihood-ratio criterion. Confidence intervals for the number of elastic events are also constructed. Current supernova models fail to give a satisfactory account of the angular distribution data.
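The two-sample (Smirnov) comparison used to test agreement between the two detectors' angular distributions reduces to the maximum gap between empirical distribution functions. A minimal NumPy sketch of the statistic itself (the significance level would then come from the Kolmogorov-Smirnov distribution):

```python
import numpy as np

def ks_two_sample(a, b):
    """Two-sample Kolmogorov-Smirnov statistic D = sup |F_a(x) - F_b(x)|."""
    a, b = np.sort(a), np.sort(b)
    grid = np.concatenate([a, b])
    # empirical CDFs of each sample evaluated on the pooled values
    f_a = np.searchsorted(a, grid, side="right") / len(a)
    f_b = np.searchsorted(b, grid, side="right") / len(b)
    return np.abs(f_a - f_b).max()
```

Applied to the Kamiokande II and IMB scattering-angle samples, a small D is consistent with the two detectors seeing the same underlying angular distribution.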
A statistical study of sporadic sodium layers observed by sodium lidar at Hefei (31.8° N, 117.3° E)
X.-K. Dou
2009-06-01
Sodium lidar observations of sporadic sodium layers (SSLs) during the past 3 years at a mid-latitude location (Hefei, China, 31.8° N, 117.3° E) are reported in this paper. From 64 SSL events detected in about 900 h of observation, an SSL occurrence rate of 1 event every 14 h at our location was obtained. This result, combined with previous studies, reveals that SSL occurrence can be relatively frequent at some mid-latitude locations. Statistical analysis of the main parameters of the 64 SSL events was performed. By examining the corresponding data from an ionosonde, a considerable correlation was found, with a Pearson coefficient of 0.66, between seasonal variations of SSLs and those of sporadic E (Es) during nighttime, in line with the research by Nagasawa and Abo (1995). From a comparison between observations from the University of Science and Technology of China (USTC) lidar and the Wuhan Institute of Physics and Mathematics (WIPM) lidar (Wuhan, China, 31° N, 114° E), the minimum horizontal range for some events was estimated to be over 500 km.
Kim, J.; Kim, J. H.; Jee, G.; Lee, C.; Kim, Y.
2017-12-01
The Spectral Airglow Temperature Imager (SATI) installed at King Sejong Station (62.22° S, 58.78° W), Antarctica, has continuously measured airglow emissions from the OH (6-2) Meinel and O2 (0-1) atmospheric bands since 2002, in order to investigate the dynamics of the polar MLT region. The measurements allow us to derive the rotational temperature at the peak emission heights of about 87 km and 94 km for the OH and O2 airglows, respectively. In this study, we briefly introduce an improved analysis technique that modifies the original analysis code. The major change from the original program is an improved function for finding the exact center position in the observed image. In addition to this brief introduction of the improved technique, we also present the results of a statistical investigation of the periodic variations in the temperatures of the two layers during the period 2002 through 2011, and compare our results with temperatures measured by satellite.
Lejoly, Cassandra; Howell, Ellen S.; Taylor, Patrick A.; Springmann, Alessondra; Virkki, Anne; Nolan, Michael C.; Rivera-Valentin, Edgard G.; Benner, Lance A. M.; Brozovic, Marina; Giorgini, Jon D.
2017-10-01
The Near-Earth Asteroid (NEA) population ranges in size from a few meters to more than 10 kilometers. NEAs have a wide variety of taxonomic classes, surface features, and shapes, including spheroids, binary objects, contact binaries, elongated, as well as irregular bodies. Using the Arecibo Observatory planetary radar system, we have measured apparent rotation rate, radar reflectivity, apparent diameter, and radar albedos for over 350 NEAs. The radar albedo is defined as the radar cross-section divided by the geometric cross-section. If a shape model is available, the actual cross-section is known at the time of the observation. Otherwise we derive a geometric cross-section from a measured diameter. When radar imaging is available, the diameter was measured from the apparent range depth. However, when radar imaging was not available, we used the continuous wave (CW) bandwidth radar measurements in conjunction with the period of the object. The CW bandwidth provides apparent rotation rate, which, given an independent rotation measurement, such as from lightcurves, constrains the size of the object. We assumed an equatorial view unless we knew the pole orientation, which gives a lower limit on the diameter. The CW also provides the polarization ratio, which is the ratio of the SC and OC cross-sections.We confirm the trend found by Benner et al. (2008) that taxonomic types E and V have very high polarization ratios. We have obtained a larger sample and can analyze additional trends with spin, size, rotation rate, taxonomic class, polarization ratio, and radar albedo to interpret the origin of the NEAs and their dynamical processes. The distribution of radar albedo and polarization ratio at the smallest diameters (≤50 m) differs from the distribution of larger objects (>50 m), although the sample size is limited. Additionally, we find more moderate radar albedos for the smallest NEAs when compared to those with diameters 50-150 m. We will present additional trends we
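The bandwidth-to-size conversion described above (CW bandwidth plus an independent rotation period, with an equatorial view assumed as a lower limit) follows the standard relation B = 4πD cos δ / (λP). A sketch, where the Arecibo S-band wavelength of about 0.126 m used in the usage note is a typical value, not taken from the abstract:

```python
import math

def diameter_from_bandwidth(B_hz, wavelength_m, period_s, subradar_lat_deg=0.0):
    """Diameter from CW echo bandwidth: B = 4*pi*D*cos(delta)/(lambda*P).

    delta is the sub-radar latitude; assuming an equatorial view
    (delta = 0) yields a lower limit on D, as described in the abstract.
    """
    delta = math.radians(subradar_lat_deg)
    return B_hz * wavelength_m * period_s / (4.0 * math.pi * math.cos(delta))
```

For example, `diameter_from_bandwidth(1.0, 0.126, 3600.0)` would give the lower-limit diameter of an object with a 1 Hz echo bandwidth and a 1 h rotation period at S band.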
Cross, J. N.; Meinig, C.; Mordy, C. W.; Lawrence-Slavas, N.; Cokelet, E. D.; Jenkins, R.; Tabisola, H. M.; Stabeno, P. J.
2016-12-01
New autonomous sensors have dramatically increased the resolution and accuracy of oceanographic data collection, enabling rapid sampling over extremely fine scales. Innovative new autonomous platforms like floats, gliders, drones, and crawling moorings leverage the full potential of these new sensors by extending spatiotemporal reach across varied environments. During 2015 and 2016, the Innovative Technology for Arctic Exploration Program at the Pacific Marine Environmental Laboratory tested several new types of fully autonomous platforms with increased speed, durability, and power and payload capacity, designed to deliver cutting-edge ecosystem assessment sensors to remote or inaccessible environments. The Expendable Ice-Tracking (EXIT) float developed by the NOAA Pacific Marine Environmental Laboratory (PMEL) is moored near bottom during the ice-free season and released on an autonomous timer beneath the ice during the following winter. The float collects a rapid profile during ascent, and continues to collect critical, poorly accessible under-ice data until melt, when the data are transmitted via satellite. The autonomous Oculus sub-surface glider developed by the University of Washington and PMEL has a large power and payload capacity and an enhanced buoyancy engine. This 'coastal truck' is designed for the rapid water-column ascent required by optical imaging systems. The Saildrone is a wind- and solar-powered unmanned surface vessel (USV) developed by Saildrone, Inc. in partnership with PMEL. This large-payload (200 lbs), fast (1-7 kts), durable (46 kts winds) platform was equipped with 15 sensors designed for ecosystem assessment during 2016, including passive and active acoustic systems specially redesigned for autonomous vehicle deployments. The sensors deployed on these platforms achieved rigorous accuracy and precision standards. These innovative platforms provide new sampling capabilities and cost efficiencies in high-resolution sensor deployment.
Sanders, N. E.; Soderberg, A. M.; Chornock, R.; Berger, E.; Challis, P.; Drout, M.; Kirshner, R. P.; Lunnan, R.; Marion, G. H.; Margutti, R.; McKinnon, R.; Milisavljevic, D. [Harvard-Smithsonian Center for Astrophysics, 60 Garden Street, Cambridge, MA 02138 (United States); Gezari, S. [Department of Astronomy, University of Maryland, College Park, MD 20742-2421 (United States); Betancourt, M. [Department of Statistics, University of Warwick, Coventry (United Kingdom); Foley, R. J. [Astronomy Department, University of Illinois at Urbana-Champaign, 1002 West Green Street, Urbana, IL 61801 (United States); Narayan, G. [National Optical Astronomy Observatory, 950 North Cherry Avenue, Tucson, AZ 85719 (United States); Rest, A. [Department of Physics and Astronomy, Johns Hopkins University, 3400 North Charles Street, Baltimore, MD 21218 (United States); Kankare, E.; Mattila, S. [Finnish Centre for Astronomy with ESO (FINCA), University of Turku, Väisäläntie 20, 21500 Piikkiö (Finland); Smartt, S. J., E-mail: nsanders@cfa.harvard.edu [Astrophysics Research Centre, School of Mathematics and Physics, Queens University, BT7 1NN, Belfast (United Kingdom); and others
2015-02-01
In recent years, wide-field sky surveys providing deep multiband imaging have presented a new path for indirectly characterizing the progenitor populations of core-collapse supernovae (SNe): systematic light-curve studies. We assemble a set of 76 grizy-band Type IIP SN light curves from Pan-STARRS1, obtained over a constant survey program of 4 yr and classified using both spectroscopy and machine-learning-based photometric techniques. We develop and apply a new Bayesian model for the full multiband evolution of each light curve in the sample. We find no evidence of a subpopulation of fast-declining explosions (historically referred to as "Type IIL" SNe). However, we identify a highly significant relation between the plateau phase decay rate and peak luminosity among our SNe IIP. These results argue in favor of a single parameter, likely determined by initial stellar mass, predominantly controlling the explosions of red supergiants. This relation could also be applied for SN cosmology, offering a standardizable candle good to an intrinsic scatter of ≲ 0.2 mag. We compare each light curve to physical models from hydrodynamic simulations to estimate progenitor initial masses and other properties of the Pan-STARRS1 Type IIP SN sample. We show that correction of systematic discrepancies between modeled and observed SN IIP light-curve properties and an expanded grid of progenitor properties are needed to enable robust progenitor inferences from multiband light-curve samples of this kind. This work will serve as a pathfinder for photometric studies of core-collapse SNe to be conducted through future wide-field transient searches.
C. Foster
A statistical analysis of F-region and topside auroral ion upflow events is presented. The study is based on observations from EISCAT Common Programmes (CP-1 and CP-2) made between 1984 and 1996, and Common Programme 7 (CP-7) observations taken between 1990 and 1995. The occurrence frequency of ion upflow events (IUEs) is examined over the altitude range 200 to 500 km, using field-aligned observations from CP-1 and CP-2. The study is extended in altitude with vertical measurements from CP-7. Ion upflow events were identified by consideration of both velocity and flux, with threshold values of 100 m s⁻¹ and 10¹³ m⁻² s⁻¹, respectively. The frequency of occurrence of IUEs is seen to increase with increasing altitude. Further analysis of the field-aligned observations reveals that the number and nature of ion upflow events vary diurnally and with season and solar activity. In particular, the diurnal distribution of upflows is strongly dependent on solar cycle. Furthermore, events identified by the velocity selection criterion dominate at solar minimum, whilst events identified by the upward field-aligned flux criterion dominate at solar maximum. The study also provides a quantitative estimate of the proportion of upflows that are associated with enhanced plasma temperature. Between 50 and 60% of upflows are simultaneous with enhanced ion temperature, and approximately 80% of events are associated with either increased F-region ion or electron temperatures.
Key words. Ionosphere (auroral ionosphere; particle acceleration)
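The event-selection logic described (an event qualifies by exceeding either the velocity or the field-aligned flux threshold) can be sketched as boolean masks over the measured profiles; function and variable names here are illustrative:

```python
import numpy as np

def select_upflow_events(v_par, flux, v_thresh=100.0, flux_thresh=1e13):
    """Flag ion upflow events by velocity OR field-aligned flux criterion.

    v_par : field-aligned ion velocity in m/s (positive upward)
    flux  : field-aligned ion flux in m^-2 s^-1
    Thresholds follow the abstract: 100 m/s and 1e13 m^-2 s^-1.
    Returns (either criterion, velocity-selected, flux-selected) masks.
    """
    v_par, flux = np.asarray(v_par), np.asarray(flux)
    by_velocity = v_par >= v_thresh
    by_flux = flux >= flux_thresh
    return by_velocity | by_flux, by_velocity, by_flux
```

Keeping the two masks separate is what allows the solar-cycle comparison reported above: counting velocity-selected versus flux-selected events at solar minimum and maximum.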
Ben Ayed, Rayda; Ennouri, Karim; Ercişli, Sezai; Ben Hlima, Hajer; Hanana, Mohsen; Smaoui, Slim; Rebai, Ahmed; Moreau, Fabienne
2018-04-10
Virgin olive oil is appreciated for its particular aroma and taste and is recognized worldwide for its nutritional value and health benefits. Olive oil contains a vast range of healthy compounds, such as monounsaturated free fatty acids, especially oleic acid. The SAD.1 polymorphism, localized in the stearoyl-acyl carrier protein desaturase gene (SAD), was genotyped and shown to be associated with the oleic acid composition of olive oil samples. However, the effect of polymorphisms in fatty acid-related genes on the distribution of monounsaturated and saturated fatty acids in Tunisian olive oil varieties is not understood. Seventeen Tunisian olive-tree varieties were selected for fatty acid content analysis by gas chromatography. The association of SAD.1 genotypes with fatty acid composition was studied by statistical and Bayesian modeling analyses. Fatty acid content analysis showed, interestingly, that some Tunisian virgin olive oil varieties could be classified as functional foods and nutraceuticals due to their particular richness in oleic acid. In fact, the TT-SAD.1 genotype was found to be associated with a higher proportion of monounsaturated fatty acids (MUFA), mainly oleic acid (C18:1) (r = −0.79), and a significant SAD.1 association with the oleic acid composition of olive oil was identified among the studied varieties. This correlation fluctuated between the studied varieties, which might elucidate the variability in lipidic composition among them, reflecting genetic diversity through differences in gene expression and biochemical pathways. The SAD locus would represent an excellent marker for identifying interesting lipidic compositions among virgin olive oils.
Lidia Retamal P
2007-04-01
In this work we describe a contextual didactic approach, using the software @risk, for teaching sample distributions in a statistics course for engineers. Using the theory of semiotic functions developed at the Universidad de Granada in Spain, we characterize the meaning elements of the main properties of sample distributions. Then, using algebraic and simulation problems, we determine and evaluate the different errors and difficulties that emerge when the students simulate processes in engineering. After considering the meaning elements obtained from the students' answers, we propose simulation as a first approach towards the construction of the meaning of sample distributions for small and large samples, using intuitive forms by means of a graphic language with computer support, so as to analyze with the students the algebraic form, depending on the nature of the random variables.
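The simulation-first approach described, building the sampling distribution of the mean empirically before treating it algebraically, can be sketched with plain NumPy (a generic stand-in for the @risk software used in the course):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_mean_distribution(n, reps=20000, mu=0.0, sigma=1.0):
    """Simulate the sampling distribution of the mean for samples of size n:
    draw `reps` samples of size n and return the `reps` sample means."""
    draws = rng.normal(mu, sigma, size=(reps, n))
    return draws.mean(axis=1)

# The spread of the simulated sample means should shrink like sigma/sqrt(n),
# which is the algebraic result students are led towards.
means = sample_mean_distribution(n=25)
```

A histogram of `means` makes the shape and the sigma/sqrt(n) spread visible before the formula is ever written down.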
1993-08-01
An independent technical review (peer review) was conducted during the period of September 15-17, 1992. The review was conducted by C. Warren Ankerberg (Geraghty and Miller, Inc., Tampa, Florida) and Don Messinger (Roy F. Weston, Inc., West Chester, Pennsylvania). The review was held at Jacobs Engineering in Albuquerque, New Mexico, and at the Shiprock, New Mexico, site. The peer review included a review of written documentation [water sampling standard operating procedures (SOPs)], an inspection of technical reports and other deliverables, a review of staff qualifications and training, and a field visit to evaluate the compliance of field procedures with SOPs. Upon completion of the peer review, each reviewer independently prepared a report of findings from the review. The reports listed findings and recommended actions. This document responds to the observations, comments, and recommendations submitted by Don Messinger following his review. The format of this document is to present the findings and recommendations verbatim from Mr. Messinger's report, followed by responses from the UMTRA Project staff. Included in the responses from the UMTRA Project staff are recommended changes to SOPs and strategies for implementing the changes.
Dicken, D.; Tadhunter, C.; Axon, D.; Morganti, R.; Inskip, K. J.; Holt, J.; Groves, B.; Delgado, R. Gonzalez
2009-01-01
We present an analysis of deep mid- to far-infrared (MFIR) Spitzer photometric observations of the southern 2Jy sample of powerful radio sources (0.05 < z < 0.7), conducting a statistical investigation of the links between radio jet, active galactic nucleus (AGN), starburst activity and MFIR properties. This is part of an ongoing extensive study of powerful radio galaxies that benefits from both complete optical emission line information and a uniquely high detection rate in the far-infrared (far-IR). We find tight correlations between the MFIR and [O III]λ5007 emission luminosities, which are significantly better than those between MFIR and extended radio luminosities, or between radio and [O III] luminosities. Since [O III] is a known indicator of intrinsic AGN power, these correlations confirm AGN illumination of the circumnuclear dust as the primary heating mechanism for the dust producing thermal MFIR emission at both 24 and 70 μm. We demonstrate that AGN heating is energetically feasible, and identify the narrow-line region clouds as the most likely location of the cool, far-IR emitting dust. Starbursts make a major contribution to the heating of the cool dust in only 15%-28% of our targets. We also investigate the orientation dependence of the continuum properties, finding that the broad- and narrow-line objects in our sample with strong emission lines have similar distributions of MFIR luminosities and colors. Therefore our results are entirely consistent with the orientation-based unified schemes for powerful radio galaxies. However, the weak line radio galaxies form a separate class of objects with intrinsically low-luminosity AGNs in which both the optical emission lines and the MFIR continuum are weak.
Analysis of statistical misconception in terms of statistical reasoning
Maryati, I.; Priatna, N.
2018-05-01
Reasoning skill is needed by everyone in the globalization era, because every person has to be able to manage and use information from all over the world, which can be obtained easily. Statistical reasoning skill is the ability to collect, group, process, interpret, and draw conclusions from information. Developing this skill can be done through various levels of education. However, the skill is low because many people, students included, assume that statistics is just the ability to count and use formulas. Students also still have a negative attitude toward courses related to research. The purpose of this research is to analyze students' misconceptions in a descriptive statistics course in relation to statistical reasoning skill. The observation was done by analyzing the results of a misconception test and a statistical reasoning skill test, and by examining the effect of students' misconceptions on statistical reasoning skill. The sample of this research was 32 students of the mathematics education department who had taken the descriptive statistics course. The mean value of the misconception test was 49.7 with a standard deviation of 10.6, whereas the mean value of the statistical reasoning skill test was 51.8 with a standard deviation of 8.5. Taking 65 as the minimal value for standard achievement of course competence, the students' mean values fall below the standard. The misconception results indicate which subtopics need attention. Based on the assessment results, students' misconceptions occurred in: 1) writing mathematical sentences and symbols well, 2) understanding basic definitions, and 3) determining the concept to be used in solving a problem. For statistical reasoning skill, the assessment measured reasoning about: 1) data, 2) representation, 3) statistical format, 4) probability, 5) sample, and 6) association.
Norris, Peter M.; da Silva, Arlindo M.
2018-01-01
A method is presented to constrain a statistical model of sub-gridcolumn moisture variability using high-resolution satellite cloud data. The method can be used for large-scale model parameter estimation or cloud data assimilation. The gridcolumn model includes assumed probability density function (PDF) intra-layer horizontal variability and a copula-based inter-layer correlation model. The observables used in the current study are Moderate Resolution Imaging Spectroradiometer (MODIS) cloud-top pressure, brightness temperature and cloud optical thickness, but the method should be extensible to direct cloudy radiance assimilation for a small number of channels. The algorithm is a form of Bayesian inference with a Markov chain Monte Carlo (MCMC) approach to characterizing the posterior distribution. This approach is especially useful in cases where the background state is clear but cloudy observations exist. In traditional linearized data assimilation methods, a subsaturated background cannot produce clouds via any infinitesimal equilibrium perturbation, but the Monte Carlo approach is not gradient-based and allows jumps into regions of non-zero cloud probability. The current study uses a skewed-triangle distribution for layer moisture. The article also includes a discussion of the Metropolis and multiple-try Metropolis versions of MCMC. PMID:29618847
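The random-walk Metropolis step at the heart of the MCMC approach described above can be sketched in a few lines. This is a minimal illustration only: the standard-normal target below stands in for the far more complex sub-gridcolumn moisture posterior, and the names and tuning values are arbitrary.

```python
import math
import random

def metropolis(log_post, x0, prop_sd, n_steps, seed=0):
    """Random-walk Metropolis: propose x' ~ N(x, prop_sd^2) and accept with
    probability min(1, post(x')/post(x)); only log densities are needed."""
    rng = random.Random(seed)
    x, lp = x0, log_post(x0)
    chain = []
    for _ in range(n_steps):
        xp = x + rng.gauss(0.0, prop_sd)
        lpp = log_post(xp)
        if math.log(rng.random()) < lpp - lp:  # accept/reject step
            x, lp = xp, lpp
        chain.append(x)  # rejected proposals repeat the current state
    return chain

# Toy target: standard normal. The deliberately bad start (x0 = 3.0) shows the
# finite "jump" behaviour that a gradient-based linearized method lacks.
chain = metropolis(lambda x: -0.5 * x * x, x0=3.0, prop_sd=1.0, n_steps=20000)
burned = chain[2000:]                       # discard burn-in
mean = sum(burned) / len(burned)
```

After burn-in the chain mean should sit near the target mean of 0 despite the distant starting point.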
Norris, Peter M.; Da Silva, Arlindo M.
2016-01-01
A method is presented to constrain a statistical model of sub-gridcolumn moisture variability using high-resolution satellite cloud data. The method can be used for large-scale model parameter estimation or cloud data assimilation. The gridcolumn model includes assumed probability density function (PDF) intra-layer horizontal variability and a copula-based inter-layer correlation model. The observables used in the current study are Moderate Resolution Imaging Spectroradiometer (MODIS) cloud-top pressure, brightness temperature and cloud optical thickness, but the method should be extensible to direct cloudy radiance assimilation for a small number of channels. The algorithm is a form of Bayesian inference with a Markov chain Monte Carlo (MCMC) approach to characterizing the posterior distribution. This approach is especially useful in cases where the background state is clear but cloudy observations exist. In traditional linearized data assimilation methods, a subsaturated background cannot produce clouds via any infinitesimal equilibrium perturbation, but the Monte Carlo approach is not gradient-based and allows jumps into regions of non-zero cloud probability. The current study uses a skewed-triangle distribution for layer moisture. The article also includes a discussion of the Metropolis and multiple-try Metropolis versions of MCMC.
Hiroo Hayashi
2009-01-01
The GPS radio occultation (RO) soundings by the FORMOSAT-3/COSMIC (Taiwan's Formosa Satellite Mission #3/Constellation Observing System for Meteorology, Ionosphere and Climate) satellites launched in mid-April 2006 are compared with high-resolution balloon-borne (radiosonde and ozonesonde) observations. This paper presents preliminary results of validation of the COSMIC RO measurements in terms of refractivity through the troposphere and lower stratosphere. Using COSMIC RO soundings within 2 hours and 300 km of sonde profiles, statistical comparisons between the collocated refractivity profiles are performed for some tropical regions (Malaysia and Western Pacific islands), where moisture-rich air is expected in the lower troposphere, and for both northern and southern polar areas with a very dry troposphere. The comparisons show good agreement between COSMIC RO and sonde refractivity profiles throughout the troposphere (1 - 1.5% difference at most), with a positive bias generally becoming larger at progressively higher altitudes in the lower stratosphere (1 - 2% difference around 25 km), and a very small standard deviation (about 0.5% or less) for a few kilometers below the tropopause level. A large standard deviation of fractional differences in the lowermost troposphere, reaching as much as 3.5 - 5% at 3 km, is seen in the tropics, while a much smaller standard deviation (1 - 2% at most) is evident throughout the polar troposphere.
Tzou, Chia-Yu; altwegg, kathrin; Bieler, Andre; Calmonte, Ursina; Gasc, Sébastien; Le Roy, Léna; Rubin, Martin
2016-10-01
ROSINA is the in situ Rosetta Orbiter Spectrometer for Ion and Neutral Analysis on board Rosetta, one of the cornerstone missions of the European Space Agency (ESA) to orbit and land on the Jupiter-family comet 67P/Churyumov-Gerasimenko (67P). ROSINA consists of two mass spectrometers and a pressure sensor. The Reflectron Time of Flight Spectrometer (RTOF) and the Double Focusing Mass Spectrometer (DFMS) complement each other in mass and time resolution. The Comet Pressure Sensor (COPS) provides density measurements of the neutral molecules in the cometary coma of 67P. COPS has two gauges: a nude gauge that measures the total neutral density and a ram gauge that measures the dynamic pressure from the comet. Combining the two, COPS is also capable of providing gas dynamic information such as gas velocity and gas temperature of the coma. While Rosetta started orbiting 67P in August 2014, COPS observed diurnal and seasonal variations of the neutral gas density in the coma. Surprisingly, in addition to these major density variation patterns, COPS occasionally observed small spikes in the density that are associated with dust. These dust signals can be interpreted as cometary dust releasing volatiles while heated up near COPS. A statistical analysis of dust signals detected by COPS will be presented.
Harrison, R. A.; Davies, J. A.; Barnes, D.; Byrne, J. P.; Perry, C. H.; Bothmer, V.; Eastwood, J. P.; Gallagher, P. T.; Kilpua, E. K. J.; Möstl, C.; Rodriguez, L.; Rouillard, A. P.; Odstrčil, D.
2018-05-01
We present a statistical analysis of coronal mass ejections (CMEs) imaged by the Heliospheric Imager (HI) instruments on board NASA's twin-spacecraft STEREO mission between April 2007 and August 2017 for STEREO-A and between April 2007 and September 2014 for STEREO-B. The analysis exploits a catalogue that was generated within the FP7 HELCATS project. Here, we focus on the observational characteristics of CMEs imaged in the heliosphere by the inner (HI-1) cameras, while following papers will present analyses of CME propagation through the entire HI fields of view. More specifically, in this paper we present distributions of the basic observational parameters - namely occurrence frequency, central position angle (PA) and PA span - derived from nearly 2000 detections of CMEs in the heliosphere by HI-1 on STEREO-A or STEREO-B from the minimum between Solar Cycles 23 and 24 to the maximum of Cycle 24; STEREO-A analysis includes a further 158 CME detections from the descending phase of Cycle 24, by which time communication with STEREO-B had been lost. We compare heliospheric CME characteristics with properties of CMEs observed at coronal altitudes, and with sunspot number. As expected, heliospheric CME rates correlate with sunspot number, and are not inconsistent with coronal rates once instrumental factors/differences in cataloguing philosophy are considered. As well as being more abundant, heliospheric CMEs, like their coronal counterparts, tend to be wider during solar maximum. Our results confirm previous coronagraph analyses suggesting that CME launch sites do not simply migrate to higher latitudes with increasing solar activity. At solar minimum, CMEs tend to be launched from equatorial latitudes, while at maximum, CMEs appear to be launched over a much wider latitude range; this has implications for understanding the CME/solar source association. Our analysis provides some supporting evidence for the systematic dragging of CMEs to lower latitude as they propagate
Lim, Gyeong Hui
2008-03-01
This book consists of 15 chapters covering: basic concepts and the meaning of statistical thermodynamics, Maxwell-Boltzmann statistics, ensembles, thermodynamic functions and fluctuations, statistical dynamics of independent particle systems, ideal molecular systems, chemical equilibrium and chemical reaction rates in ideal gas mixtures, classical statistical thermodynamics, the ideal lattice model, lattice statistics and nonideal lattice models, imperfect gas theory of liquids, theory of solutions, statistical thermodynamics of interfaces, statistical thermodynamics of high-molecule systems, and quantum statistics.
Geology of the Alarcón Rise Based on 1-m Resolution Bathymetry and ROV Observations and Sampling
Clague, D. A.; Caress, D. W.; Lundsten, L.; Martin, J. F.; Paduan, J. B.; Portner, R. A.; Bowles, J. A.; Castillo, P. R.; Dreyer, B. M.; Guardado-France, R.; Nieves-Cardoso, C.; Rivera-Huerta, H.; Santa Rosa-del Rio, M.; Spelz-Madero, R.
2012-12-01
Alarcón Rise is a ~50 km-long segment of the northernmost East Pacific Rise, bounded on the north and south by the Pescadero and Tamayo Fracture Zones. In April 2012, the MBARI AUV D. Allan B. completed a 1.5-3.1-km wide bathymetric map along the neovolcanic zone between the two fracture zones during 10 surveys. A single AUV survey was also completed on Alarcón Seamount, a near-ridge seamount with 4 offset calderas. Bathymetric data have 1 m lateral and 0.2 m vertical resolution. The maps guided 8 dives of the ROV Doc Ricketts on the ridge and 1 on the seamount. The morphology of the rise changes dramatically along strike and includes an inflated zone, centered ~14 km from the southern end, paved by a young sheet flow erupted from an 8-km-long en echelon fissure system. A young flat-topped volcano and an older shield volcano occur near the center of the ridge segment. Areas nearer the fracture zones are mainly pillow mounds and ridges, some strongly cut by faults and fissures, but others have few structural disruptions. More than 150 of the 194 lava samples recovered from the neovolcanic zone are aphyric to plagioclase-phyric to ultraphyric N-MORB with glass MgO ranging up to 8.5%. The basal centimeters of 87 short cores contain common limu o Pele and adequate foraminifers to provide minimum radiocarbon ages for the underlying lava flows. A rugged lava dome of rhyolite (based on glass compositions) is surrounded by large pillow flows of dacite, centered ~8 km from the north end of the Rise. Pillow flows are steeply uptilted for 2-3 km north and south of the dome, possibly reflecting intrusion of viscous rhyolitic dikes along strike. Near the southern end of this deformed zone, an andesite flow crops out in a fault scarp. Mapping data also reveal the presence of about 110 apparent hydrothermal chimney structures as tall as 18 m, scattered along roughly the central half of the Rise. Subsequent ROV dives observed 70 of these structures and found active venting at 22 of them
Hearty, Thomas J.; Savtchenko, Andrey K.; Tian, Baijun; Fetzer, Eric; Yung, Yuk L.; Theobald, Michael; Vollmer, Bruce; Fishbein, Evan; Won, Young-In
2014-01-01
We use MERRA (Modern Era Retrospective-Analysis for Research Applications) temperature and water vapor data to estimate the sampling biases of climatologies derived from the AIRS/AMSU-A (Atmospheric Infrared Sounder/Advanced Microwave Sounding Unit-A) suite of instruments. We separate the total sampling bias into temporal and instrumental components. The temporal component is caused by the AIRS/AMSU-A orbit and swath that are not able to sample all of time and space. The instrumental component is caused by scenes that prevent successful retrievals. The temporal sampling biases are generally smaller than the instrumental sampling biases except in regions with large diurnal variations, such as the boundary layer, where the temporal sampling biases of temperature can be +/- 2 K and water vapor can be 10% wet. The instrumental sampling biases are the main contributor to the total sampling biases and are mainly caused by clouds. They are up to 2 K cold and greater than 30% dry over mid-latitude storm tracks and tropical deep convective cloudy regions and up to 20% wet over stratus regions. However, other factors such as surface emissivity and temperature can also influence the instrumental sampling bias over deserts where the biases can be up to 1 K cold and 10% wet. Some instrumental sampling biases can vary seasonally and/or diurnally. We also estimate the combined measurement uncertainties of temperature and water vapor from AIRS/AMSU-A and MERRA by comparing similarly sampled climatologies from both data sets. The measurement differences are often larger than the sampling biases and have longitudinal variations.
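The decomposition above rests on a simple idea: a sampling bias is the mean over the scenes an instrument actually samples minus the mean over all scenes. A toy numerical sketch, with hypothetical temperatures and a hypothetical retrieval mask (not AIRS/MERRA data):

```python
# Hypothetical "full field" of scene temperatures (K) and a mask marking which
# scenes yield successful retrievals; two cloudy scenes are lost.
truth = [280.0, 282.0, 285.0, 279.0, 290.0, 288.0, 284.0, 281.0]
retrieved = [True, True, False, True, False, True, True, True]

full_mean = sum(truth) / len(truth)                      # the "true" climatology
obs = [t for t, ok in zip(truth, retrieved) if ok]       # what the sampler sees
obs_mean = sum(obs) / len(obs)
sampling_bias = obs_mean - full_mean                     # negative: warm scenes lost
```

Here the two missing scenes happen to be warm, so the retrieved climatology is biased cold, analogous to the cold bias the abstract reports over cloudy storm-track regions.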
Broët, Philippe; Tsodikov, Alexander; De Rycke, Yann; Moreau, Thierry
2010-01-01
This paper presents two-sample statistics suited for testing equality of survival functions against improper semi-parametric accelerated failure time alternatives. These tests are designed for comparing either the short- or the long-term effect of a prognostic factor, or both. These statistics are obtained as partial likelihood score statistics from a time-dependent Cox model. As a consequence, the proposed tests can be very easily implemented using widely available software. A breast cancer clinical trial is presented as an example to demonstrate the utility of the proposed tests. PMID:15293627
Broët, Philippe; Tsodikov, Alexander; De Rycke, Yann; Moreau, Thierry
2004-06-01
This paper presents two-sample statistics suited for testing equality of survival functions against improper semi-parametric accelerated failure time alternatives. These tests are designed for comparing either the short- or the long-term effect of a prognostic factor, or both. These statistics are obtained as partial likelihood score statistics from a time-dependent Cox model. As a consequence, the proposed tests can be very easily implemented using widely available software. A breast cancer clinical trial is presented as an example to demonstrate the utility of the proposed tests.
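The classical two-sample log-rank test is the simplest member of the family of partial-likelihood score statistics discussed above (the paper's tests generalize it to short- and long-term effects). A minimal self-contained computation on toy survival data:

```python
def logrank(times1, events1, times2, events2):
    """Two-sample log-rank statistic: the Cox partial-likelihood score test
    for a binary group covariate, accumulated as observed-minus-expected
    events over the distinct event times."""
    data = ([(t, e, 0) for t, e in zip(times1, events1)] +
            [(t, e, 1) for t, e in zip(times2, events2)])
    o_minus_e, var = 0.0, 0.0
    for t in sorted({t for t, e, _ in data if e}):          # distinct event times
        at_risk = [r for r in data if r[0] >= t]
        n = len(at_risk)
        n1 = sum(1 for _, _, g in at_risk if g == 1)        # group-2 subjects at risk
        d = sum(1 for tt, ee, _ in at_risk if tt == t and ee)
        d1 = sum(1 for tt, ee, g in at_risk if tt == t and ee and g == 1)
        o_minus_e += d1 - d * n1 / n                        # observed minus expected
        if n > 1:                                           # hypergeometric variance
            var += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return o_minus_e, var

# Toy data: every subject has an event; the second group survives longer.
ome, v = logrank([1, 2, 3], [1, 1, 1], [4, 5, 6], [1, 1, 1])
chi2 = ome * ome / v  # compare against a chi-square with 1 d.f. (3.84 at 5%)
```

The negative observed-minus-expected total reflects that group 2 had fewer events than expected early on, i.e. better survival.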
Micro-organism distribution sampling for bioassays
Nelson, B. A.
1975-01-01
The purpose of a sampling distribution is to characterize sample-to-sample variation so that statistical tests may be applied, to estimate the error due to sampling (confidence limits), and to evaluate observed differences between samples. The distribution could be used for bioassays taken in hospitals, breweries, food-processing plants, and pharmaceutical plants.
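The bootstrap is one concrete way to characterize the sample-to-sample variation described above; the colony counts below are hypothetical:

```python
import random

def bootstrap_dist(sample, stat, n_boot=5000, seed=1):
    """Approximate the sampling distribution of `stat` by drawing resamples
    of the same size, with replacement, from the observed sample."""
    rng = random.Random(seed)
    n = len(sample)
    return sorted(stat([rng.choice(sample) for _ in range(n)])
                  for _ in range(n_boot))

def mean(xs):
    return sum(xs) / len(xs)

counts = [12, 15, 9, 22, 17, 11, 14, 19, 8, 16]  # hypothetical colony counts per plate
dist = bootstrap_dist(counts, mean)
# 95% percentile confidence limits for the mean count
lo, hi = dist[int(0.025 * len(dist))], dist[int(0.975 * len(dist))]
```

The sorted `dist` is an empirical stand-in for the sampling distribution of the mean, from which confidence limits are read off directly.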
Statistical intervals a guide for practitioners
Hahn, Gerald J
2011-01-01
Presents a detailed exposition of statistical intervals and emphasizes applications in industry. The discussion differentiates at an elementary level among different kinds of statistical intervals and gives instruction with numerous examples and simple math on how to construct such intervals from sample data. This includes confidence intervals to contain a population percentile, confidence intervals on probability of meeting specified threshold value, and prediction intervals to include observation in a future sample. Also has an appendix containing computer subroutines for nonparametric stati
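As a small illustration of the distinction the book draws, the sketch below computes a t-based confidence interval for a mean alongside the wider prediction interval for a single future observation. The data and the critical value t_{0.975, df=7} = 2.365 are illustrative, not taken from the book:

```python
import math
import statistics

def t_intervals(sample, t_crit):
    """Two-sided confidence interval for the mean and prediction interval for
    one future observation, given the t critical value for the chosen
    confidence level and n-1 degrees of freedom."""
    n = len(sample)
    m = statistics.mean(sample)
    s = statistics.stdev(sample)
    ci = (m - t_crit * s / math.sqrt(n),
          m + t_crit * s / math.sqrt(n))            # interval for the mean
    pi = (m - t_crit * s * math.sqrt(1 + 1 / n),
          m + t_crit * s * math.sqrt(1 + 1 / n))    # interval for a new observation
    return ci, pi

data = [9.8, 10.2, 10.0, 9.9, 10.4, 10.1, 9.7, 10.3]
ci, pi = t_intervals(data, t_crit=2.365)  # t quantile for 95%, 7 d.f.
```

The prediction interval is always wider than the confidence interval, since it must cover the scatter of a single future value in addition to the uncertainty in the mean.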
Morii, Yuta; Ohkubo, Yusaku; Watanabe, Sanae
2018-05-13
Citizen science is a powerful tool that can be used to resolve the problems of introduced species. An amateur naturalist and author of this paper, S. Watanabe, recorded the total number of Limax maximus (Limacidae, Pulmonata) individuals along a fixed census route almost every day for two years on Hokkaido Island, Japan. L. maximus is an invasive slug considered a pest of horticultural and agricultural crops. We investigated how weather conditions were correlated with the intensity of slug activity using Bayesian regularization regression, comparing Laplace, Horseshoe and Horseshoe+ priors, applied here for the first time in ecology. The slug counts were compared with meteorological data from 5:00 in the morning on the day of observation (OT- and OD-models) and from the day before observation (DBOD-models). The OT- and OD-models were better supported than the DBOD-models based on WAIC scores, and the meteorological predictors selected in the OT-, OD- and DBOD-models differed. The probability of slug appearance increased on mornings with higher than 20-year-average humidity (%) and lower than average wind velocity (m/s) and precipitation (mm) values in the OT-models. The OD-models showed a pattern similar to the OT-models in the probability of slug appearance, but also suggested other meteorological predictors of slug activity, for example a positive effect of solar radiation (MJ). Five meteorological predictors, mean and highest temperature (°C), wind velocity (m/s), precipitation amount (mm) and atmospheric pressure (hPa), were selected as effective factors for the counts in the DBOD-models. Therefore, the DBOD-models will be valuable for predicting slug activity in the future, much like a weather forecast. Copyright © 2018 Elsevier B.V. All rights reserved.
Norris, Peter M.; da Silva, Arlindo M.
2018-01-01
Part 1 of this series presented a Monte Carlo Bayesian method for constraining a complex statistical model of global circulation model (GCM) sub-gridcolumn moisture variability using high-resolution Moderate Resolution Imaging Spectroradiometer (MODIS) cloud data, thereby permitting parameter estimation and cloud data assimilation for large-scale models. This article performs some basic testing of this new approach, verifying that it does indeed reduce mean and standard deviation biases significantly with respect to the assimilated MODIS cloud optical depth, brightness temperature and cloud-top pressure and that it also improves the simulated rotational–Raman scattering cloud optical centroid pressure (OCP) against independent (non-assimilated) retrievals from the Ozone Monitoring Instrument (OMI). Of particular interest, the Monte Carlo method does show skill in the especially difficult case where the background state is clear but cloudy observations exist. In traditional linearized data assimilation methods, a subsaturated background cannot produce clouds via any infinitesimal equilibrium perturbation, but the Monte Carlo approach allows non-gradient-based jumps into regions of non-zero cloud probability. In the example provided, the method is able to restore marine stratocumulus near the Californian coast, where the background state has a clear swath. This article also examines a number of algorithmic and physical sensitivities of the new method and provides guidance for its cost-effective implementation. One obvious difficulty for the method, and other cloud data assimilation methods as well, is the lack of information content in passive-radiometer-retrieved cloud observables on cloud vertical structure, beyond cloud-top pressure and optical thickness, thus necessitating strong dependence on the background vertical moisture structure. It is found that a simple flow-dependent correlation modification from Riishojgaard provides some help in this respect, by
da Silva, Arlindo M.; Norris, Peter M.
2013-01-01
Part I presented a Monte Carlo Bayesian method for constraining a complex statistical model of GCM sub-gridcolumn moisture variability using high-resolution MODIS cloud data, thereby permitting large-scale model parameter estimation and cloud data assimilation. This part performs some basic testing of this new approach, verifying that it does indeed significantly reduce mean and standard deviation biases with respect to the assimilated MODIS cloud optical depth, brightness temperature and cloud top pressure, and that it also improves the simulated rotational-Raman scattering cloud optical centroid pressure (OCP) against independent (non-assimilated) retrievals from the OMI instrument. Of particular interest, the Monte Carlo method does show skill in the especially difficult case where the background state is clear but cloudy observations exist. In traditional linearized data assimilation methods, a subsaturated background cannot produce clouds via any infinitesimal equilibrium perturbation, but the Monte Carlo approach allows finite jumps into regions of non-zero cloud probability. In the example provided, the method is able to restore marine stratocumulus near the Californian coast where the background state has a clear swath. This paper also examines a number of algorithmic and physical sensitivities of the new method and provides guidance for its cost-effective implementation. One obvious difficulty for the method, and other cloud data assimilation methods as well, is the lack of information content in the cloud observables on cloud vertical structure, beyond cloud top pressure and optical thickness, thus necessitating strong dependence on the background vertical moisture structure. It is found that a simple flow-dependent correlation modification due to Riishojgaard (1998) provides some help in this respect, by better honoring inversion structures in the background state.
Norris, Peter M.; da Silva, Arlindo M.
2016-01-01
Part 1 of this series presented a Monte Carlo Bayesian method for constraining a complex statistical model of global circulation model (GCM) sub-gridcolumn moisture variability using high-resolution Moderate Resolution Imaging Spectroradiometer (MODIS) cloud data, thereby permitting parameter estimation and cloud data assimilation for large-scale models. This article performs some basic testing of this new approach, verifying that it does indeed reduce mean and standard deviation biases significantly with respect to the assimilated MODIS cloud optical depth, brightness temperature and cloud-top pressure and that it also improves the simulated rotational-Raman scattering cloud optical centroid pressure (OCP) against independent (non-assimilated) retrievals from the Ozone Monitoring Instrument (OMI). Of particular interest, the Monte Carlo method does show skill in the especially difficult case where the background state is clear but cloudy observations exist. In traditional linearized data assimilation methods, a subsaturated background cannot produce clouds via any infinitesimal equilibrium perturbation, but the Monte Carlo approach allows non-gradient-based jumps into regions of non-zero cloud probability. In the example provided, the method is able to restore marine stratocumulus near the Californian coast, where the background state has a clear swath. This article also examines a number of algorithmic and physical sensitivities of the new method and provides guidance for its cost-effective implementation. One obvious difficulty for the method, and other cloud data assimilation methods as well, is the lack of information content in passive-radiometer-retrieved cloud observables on cloud vertical structure, beyond cloud-top pressure and optical thickness, thus necessitating strong dependence on the background vertical moisture structure. It is found that a simple flow-dependent correlation modification from Riishojgaard provides some help in this respect, by
National Aeronautics and Space Administration — This dataset consists of ground-based Global Navigation Satellite System (GNSS) Observation Data (1-second sampling, sub-hourly files) from the NASA Crustal Dynamics...
Goodman, Joseph W.
2000-07-01
The Wiley Classics Library consists of selected books that have become recognized classics in their respective fields. With these new unabridged and inexpensive editions, Wiley hopes to extend the life of these important works by making them available to future generations of mathematicians and scientists. Currently available in the Series: T. W. Anderson The Statistical Analysis of Time Series T. S. Arthanari & Yadolah Dodge Mathematical Programming in Statistics Emil Artin Geometric Algebra Norman T. J. Bailey The Elements of Stochastic Processes with Applications to the Natural Sciences Robert G. Bartle The Elements of Integration and Lebesgue Measure George E. P. Box & Norman R. Draper Evolutionary Operation: A Statistical Method for Process Improvement George E. P. Box & George C. Tiao Bayesian Inference in Statistical Analysis R. W. Carter Finite Groups of Lie Type: Conjugacy Classes and Complex Characters R. W. Carter Simple Groups of Lie Type William G. Cochran & Gertrude M. Cox Experimental Designs, Second Edition Richard Courant Differential and Integral Calculus, Volume I Richard Courant Differential and Integral Calculus, Volume II Richard Courant & D. Hilbert Methods of Mathematical Physics, Volume I Richard Courant & D. Hilbert Methods of Mathematical Physics, Volume II D. R. Cox Planning of Experiments Harold S. M. Coxeter Introduction to Geometry, Second Edition Charles W. Curtis & Irving Reiner Representation Theory of Finite Groups and Associative Algebras Charles W. Curtis & Irving Reiner Methods of Representation Theory with Applications to Finite Groups and Orders, Volume I Charles W. Curtis & Irving Reiner Methods of Representation Theory with Applications to Finite Groups and Orders, Volume II Cuthbert Daniel Fitting Equations to Data: Computer Analysis of Multifactor Data, Second Edition Bruno de Finetti Theory of Probability, Volume I Bruno de Finetti Theory of Probability, Volume 2 W. Edwards Deming Sample Design in Business Research
Explorations in statistics: the log transformation.
Curran-Everett, Douglas
2018-06-01
Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This thirteenth installment of Explorations in Statistics explores the log transformation, an established technique that rescales the actual observations from an experiment so that the assumptions of some statistical analysis are better met. A general assumption in statistics is that the variability of some response Y is homogeneous across groups or across some predictor variable X. If the variability-the standard deviation-varies in rough proportion to the mean value of Y, a log transformation can equalize the standard deviations. Moreover, if the actual observations from an experiment conform to a skewed distribution, then a log transformation can make the theoretical distribution of the sample mean more consistent with a normal distribution. This is important: the results of a one-sample t test are meaningful only if the theoretical distribution of the sample mean is roughly normal. If we log-transform our observations, then we want to confirm the transformation was useful. We can do this if we use the Box-Cox method, if we bootstrap the sample mean and the statistic t itself, and if we assess the residual plots from the statistical model of the actual and transformed sample observations.
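A minimal numerical illustration of the variance-stabilizing effect described above; the two groups are made-up numbers whose spread grows roughly in proportion to the mean:

```python
import math
import statistics

# Illustrative data with multiplicative noise: the high-mean group has a
# standard deviation roughly 23 times that of the low-mean group.
low = [10.2, 9.1, 11.3, 10.8, 9.6]
high = [98.0, 121.5, 87.3, 140.2, 110.1]

sd_raw = (statistics.stdev(low), statistics.stdev(high))
sd_log = (statistics.stdev([math.log(x) for x in low]),
          statistics.stdev([math.log(x) for x in high]))
```

On the raw scale the standard deviations differ by more than an order of magnitude; after the log transformation they are within a small factor of each other, which is exactly the homogeneity that t-type analyses assume.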
Martin Fisher
To determine uptake of a home sampling kit (HSK) for STI/HIV compared to clinic-based testing, and whether the availability of HSK would increase STI testing rates amongst HIV-infected MSM, and those attending a community-based HIV testing clinic, compared to historical controls. A prospective observational study was conducted in three facilities providing STI/HIV testing services in Brighton, UK. Adult MSM attending/contacting a GUM clinic requesting an STI screen (group 1), HIV-infected MSM attending a routine outpatient clinic (group 2), and MSM attending a community-based rapid HIV testing service (group 3) were eligible. Participants were required to have no symptomatology consistent with STI and to be known to be immune to hepatitis A and B (group 1). Eligible men were offered an HSK to obtain self-collected specimens as an alternative to routine testing. HSK uptake compared to conventional clinic-based STI/HIV testing in group 1, the increase in STI testing rates due to HSK availability compared to historical controls in groups 2 and 3, and HSK return rates in all settings were calculated. Among the 128 eligible men in group 1, HSK acceptance was higher (62.5%; 95% CI: 53.5-70.9) than GUM clinic-based testing (37.5%; 95% CI: 29.1-46.5) (p = 0.0004). Two thirds of eligible MSM offered an HSK in all three groups accepted it, but HSK return rates varied (highest in group 1, 77.5%; lowest in group 3, 16%). HSK for HIV testing was acceptable to 81% of men in group 1. Compared to historical controls, availability of HSK increased the proportion of MSM testing for STIs in group 2 but not in group 3. HSK for STI/HIV offers an alternative to conventional clinic-based testing for MSM seeking STI screening. It significantly increases STI testing uptake in HIV-infected MSM. HSK could be considered as an adjunct to clinic-based services to further improve STI/HIV testing in MSM.
Bruce, Iain P.; Karaman, M. Muge; Rowe, Daniel B.
2012-01-01
The acquisition of sub-sampled data from an array of receiver coils has become a common means of reducing data acquisition time in MRI. Of the various techniques used in parallel MRI, SENSitivity Encoding (SENSE) is one of the most common, making use of a complex-valued weighted least squares estimation to unfold the aliased images. It was recently shown in Bruce et al. [Magn. Reson. Imag. 29(2011):1267–1287] that when the SENSE model is represented in terms of a real-valued isomorphism, it assumes a skew-symmetric covariance between receiver coils, as well as an identity covariance structure between voxels. In this manuscript, we show that not only is the skew-symmetric coil covariance unlike that of real data, but the estimated covariance structure between voxels over a time series of experimental data is not an identity matrix. As such, a new model, entitled SENSE-ITIVE, is described with both revised coil and voxel covariance structures. Both the SENSE and SENSE-ITIVE models are represented in terms of real-valued isomorphisms, allowing for a statistical analysis of reconstructed voxel means, variances, and correlations resulting from the use of different coil and voxel covariance structures used in the reconstruction processes to be conducted. It is shown through both theoretical and experimental illustrations that the misspecification of the coil and voxel covariance structures in the SENSE model results in a lower standard deviation in each voxel of the reconstructed images, and thus an artificial increase in SNR, compared to the standard deviation and SNR of the SENSE-ITIVE model where both the coil and voxel covariances are appropriately accounted for. It is also shown that there are differences in the correlations induced by the reconstruction operations of both models, and consequently there are differences in the correlations estimated throughout the course of reconstructed time series. These differences in correlations could result in meaningful
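The unfolding step that SENSE performs is, at its core, a complex-valued least-squares solve. The sketch below is a deliberate simplification: it assumes an identity noise covariance and made-up coil sensitivities for two aliased voxels, so it ignores precisely the coil and voxel covariance structures that the SENSE-ITIVE model addresses:

```python
def sense_unfold(S, a):
    """Least-squares SENSE unfolding for two aliased voxels: solve
    min ||S x - a||^2 with S an (n_coils x 2) complex sensitivity matrix,
    via the normal equations (S^H S) x = S^H a, inverting the 2x2 system."""
    h00 = sum(abs(s[0]) ** 2 for s in S)                    # S^H S entries
    h11 = sum(abs(s[1]) ** 2 for s in S)
    h01 = sum(s[0].conjugate() * s[1] for s in S)
    b0 = sum(s[0].conjugate() * y for s, y in zip(S, a))    # S^H a entries
    b1 = sum(s[1].conjugate() * y for s, y in zip(S, a))
    det = h00 * h11 - h01 * h01.conjugate()                 # real and positive
    x0 = (h11 * b0 - h01 * b1) / det
    x1 = (h00 * b1 - h01.conjugate() * b0) / det
    return x0, x1

# Hypothetical sensitivities for 4 coils over the 2 aliased voxel positions.
S = [(1 + 0j, 0.5 + 0.5j),
     (0.8 - 0.2j, 1 + 0j),
     (0.3 + 0.1j, 0.9 - 0.4j),
     (1.1 + 0j, 0.2 + 0.3j)]
x_true = (1.5 + 0.5j, -0.7 + 1.2j)                          # true voxel values
a = [s0 * x_true[0] + s1 * x_true[1] for s0, s1 in S]       # noiseless folded signal
x_hat = sense_unfold(S, a)
```

With noiseless data the unfolding is exact; the manuscript's point is that once noise with a realistic coil/voxel covariance enters, the implied variances and correlations of `x_hat` depend on which covariance model is assumed.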
Liu, Chuang; Lam, H. K.
2015-01-01
In this paper, we propose a polynomial fuzzy observer-controller for nonlinear systems, where the design is achieved through the stability analysis of the polynomial-fuzzy-model-based (PFMB) observer-control system. The polynomial fuzzy observer estimates the system states using estimated premise variables. The estimated states are then employed by the polynomial fuzzy controller for feedback control of nonlinear systems represented by the polynomial fuzzy model. The system stability of the P...
Brus, D.J.
2015-01-01
In balanced sampling, a linear relation between the soil property of interest and one or more covariates with known means is exploited when selecting the sampling locations. Recent developments make this sampling design attractive for statistical soil surveys. This paper introduces balanced sampling
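A brute-force sketch of the balancing idea: among many candidate samples, keep the one whose covariate mean is closest to the known population mean. Proper balanced sampling designs (e.g. the cube method) achieve this by construction; the code below, with a made-up covariate, only illustrates the target property:

```python
import random

def best_balanced_sample(covariate, n, seed=2, n_try=2000):
    """Pick, from many random candidates, the size-n sample whose covariate
    mean is closest to the known population mean: a brute-force stand-in
    for a proper balanced sampling design."""
    rng = random.Random(seed)
    pop_mean = sum(covariate) / len(covariate)
    idx = list(range(len(covariate)))
    best, best_gap = None, float("inf")
    for _ in range(n_try):
        s = rng.sample(idx, n)                              # candidate sample
        gap = abs(sum(covariate[i] for i in s) / n - pop_mean)
        if gap < best_gap:
            best, best_gap = s, gap
    return best, best_gap

elevation = [float(i) for i in range(30)]   # hypothetical covariate, mean 14.5
sample_idx, gap = best_balanced_sample(elevation, n=6)
```

The selected locations have a covariate mean nearly equal to the population mean, which is what lets the linear relation with the soil property be exploited in estimation.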
Aschwanden, Markus J.; Freeland, Samuel L.
2012-01-01
We analyzed the soft X-ray light curves from the Geostationary Operational Environmental Satellites over the last 37 years (1975-2011) and measured with an automated flare detection algorithm over 300,000 solar flare events (amounting to ≈5 times higher sensitivity than the NOAA flare catalog). We find a power-law slope of α_F = 1.98 ± 0.11 for the (background-subtracted) soft X-ray peak fluxes that is invariant through three solar cycles and agrees with the theoretical prediction α_F = 2.0 of the fractal-diffusive self-organized criticality (FD-SOC) model. For the soft X-ray flare rise times, we find a power-law slope of α_T = 2.02 ± 0.04 during solar cycle minima years, which is also consistent with the prediction α_T = 2.0 of the FD-SOC model. During solar cycle maxima years, the power-law slope is steeper in the range of α_T ≈ 2.0-5.0, which can be modeled by a solar-cycle-dependent flare pile-up bias effect. These results corroborate the FD-SOC model, which predicts a power-law slope of α_E = 1.5 for flare energies and thus rules out significant nanoflare heating. While the FD-SOC model predicts the probability distribution functions of spatio-temporal scaling laws of nonlinear energy dissipation processes, additional physical models are needed to derive the scaling laws between the geometric SOC parameters and the observed emissivity in different wavelength regimes, as we derive here for soft X-ray emission. The FD-SOC model also yields statistical probabilities for solar flare forecasting.
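A power-law slope such as α_F can be estimated from a sample of peak fluxes by maximum likelihood. The sketch below uses the standard continuous-case estimator (as in Clauset et al.) on synthetic fluxes; it is a generic illustration, not the paper's automated detection pipeline.

```python
import numpy as np

def powerlaw_mle(x, xmin):
    """Maximum-likelihood estimate of the slope alpha of a power-law
    density ~ x^(-alpha) above xmin (continuous case), plus its
    standard error."""
    x = np.asarray(x, dtype=float)
    x = x[x >= xmin]
    n = x.size
    alpha = 1.0 + n / np.sum(np.log(x / xmin))
    sigma = (alpha - 1.0) / np.sqrt(n)
    return alpha, sigma

# Synthetic "peak fluxes" with true slope 2.0 via inverse-CDF sampling:
# P(X > x) = (x / xmin)^(1 - alpha)
rng = np.random.default_rng(1)
alpha_true = 2.0
u = rng.uniform(size=200_000)
x = (1.0 - u) ** (-1.0 / (alpha_true - 1.0))   # xmin = 1
alpha, sigma = powerlaw_mle(x, xmin=1.0)
print(round(alpha, 2))   # close to 2.0
```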
Tate, Z.; Dusenge, D.; Elliot, T. S.; Hafashimana, P.; Medley, S.; Porter, R. P.; Rajappan, R.; Rodriguez, P.; Spangler, J.; Swaminathan, R. S.; VanGundy, R. D.
2014-12-01
The majority of the population in southwest Virginia depends economically on coal mining. In 2011, coal mining generated $2,000,000 in tax revenue for Wise County alone. However, surface mining completely removes land cover and leaves the land exposed to erosion. The destruction of the forest cover directly impacts local species, as some are displaced and others perish in the mining process. Even though surface mining has a negative impact on the environment, land reclamation efforts are in place to either restore mined areas to their natural vegetated state or to transform these areas for economic purposes. This project aimed to monitor the progress of land reclamation and its effect on the return of local species. By incorporating NASA Earth observations, such as Landsat 8 Operational Land Imager (OLI) and Landsat 5 Thematic Mapper (TM), the re-vegetation process in reclamation sites was estimated through a time series analysis using the Normalized Difference Vegetation Index (NDVI). A continuous source of cloud-free images was obtained by utilizing the Spatial and Temporal Adaptive Reflectance Fusion Model (STAR-FM). This model developed synthetic Landsat imagery by integrating the high-frequency temporal information from the Terra/Aqua Moderate Resolution Imaging Spectroradiometer (MODIS) and the high-resolution spatial information from Landsat sensors. In addition, Maximum Entropy Modeling (MaxENT), an eco-niche model, was used to estimate the adaptation of animal species to the newly formed habitats. By combining factors such as land type, precipitation from the Tropical Rainfall Measuring Mission (TRMM), and slope from the Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER), the MaxENT model produced a statistical analysis of the probability of suitable species habitat. Altogether, the project compiled ecological information that can be used to identify suitable habitats for local species in reclaimed mined areas.
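The NDVI used for the time series is a simple per-pixel band ratio. A minimal sketch with toy reflectance values (not actual Landsat data):

```python
import numpy as np

def ndvi(nir, red, eps=1e-12):
    """Normalized Difference Vegetation Index, (NIR - red) / (NIR + red),
    computed per pixel (Landsat 8 OLI: band 5 = NIR, band 4 = red).
    Values near +1 indicate dense vegetation; values near 0, bare soil."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    return (nir - red) / (nir + red + eps)

# Toy 2x2 reflectance scene: three vegetated pixels, one bare patch
nir = np.array([[0.45, 0.50], [0.20, 0.48]])
red = np.array([[0.05, 0.08], [0.18, 0.06]])
print(ndvi(nir, red).round(2))
```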
Liu, W.; Ofman, L.; Title, A. M.; Zhao, J.; Aschwanden, M. J.
2011-12-01
Recent EUV imaging observations from SDO/AIA led to the discovery of quasi-periodic fast (~2000 km/s) propagating (QFP) waves in active regions (Liu et al. 2011). They were interpreted as fast-mode magnetosonic waves and reproduced in 3D MHD simulations (Ofman et al. 2011). Since then, we have extended our study to a sample of more than a dozen such waves observed during the SDO mission (2010/04-now). We will present the statistical properties of these waves, including: (1) Their projected speeds measured in the plane of the sky are about 400-2200 km/s, which, as the lower limits of their true speeds in 3D space, fall in the expected range of coronal Alfven or fast-mode speeds. (2) They usually originate near flare kernels, often in the wake of a coronal mass ejection, and propagate in narrow funnels of coronal loops that serve as waveguides. (3) These waves are launched repeatedly with quasi-periodicities in the 30-200 second range, often lasting for more than one hour; some frequencies coincide with those of the quasi-periodic pulsations (QPPs) in the accompanying flare, suggestive of a common excitation mechanism. We obtained the k-omega diagrams and dispersion relations of these waves using Fourier analysis. We estimate their energy fluxes and discuss their contribution to coronal heating as well as their diagnostic potential for coronal seismology.
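Recovering a quasi-periodicity from an intensity time series is a one-dimensional Fourier exercise. The sketch below finds the dominant period of a synthetic signal sampled at AIA's 12 s cadence; it is an illustration of the principle, not the k-omega analysis itself.

```python
import numpy as np

def dominant_period(signal, dt):
    """Return the period (in the units of dt) of the strongest
    nonzero-frequency Fourier component of a detrended 1-D signal."""
    sig = np.asarray(signal, dtype=float)
    sig = sig - sig.mean()
    power = np.abs(np.fft.rfft(sig)) ** 2
    freqs = np.fft.rfftfreq(sig.size, d=dt)
    k = 1 + np.argmax(power[1:])   # skip the DC bin
    return 1.0 / freqs[k]

# Synthetic EUV intensity: 180 s periodicity, 12 s cadence, 1 h duration
t = np.arange(0, 3600, 12.0)
signal = np.sin(2 * np.pi * t / 180.0)
print(round(dominant_period(signal, dt=12.0)))   # → 180
```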
Near-ultraviolet properties of a large sample of Type Ia supernovae as observed with the Swift UVOT
Milne, Peter A.; Brown, Peter J.; Roming, Peter W. A.; Vanden Berk, Daniel; Holland, Stephen T.; Immler, Stefan; Bufano, Filomena; Gehrels, Neil; Filippenko, Alexei V.; Ganeshalingam, Mohan; Li Weidong; Stritzinger, Maximilian; Phillips, Mark M.; Hicken, Malcolm; Kirshner, Robert P.; Challis, Peter J.; Mazzali, Paolo; Schmidt, Brian P.
2010-01-01
We present ultraviolet (UV) and optical photometry of 26 Type Ia supernovae (SNe Ia) observed from 2005 March to 2008 March with the NASA Swift Ultraviolet and Optical Telescope (UVOT). The dataset consists of 2133 individual observations, making it by far the most complete study of the UV emission from SNe Ia to date. Grouping the SNe into three subclasses as derived from optical observations, we investigate the evolution of the colors of these SNe, finding a high degree of homogeneity within the normal subclass, but dramatic differences between that group and the subluminous and SN 2002cx-like groups. For the normal events, the redder UV filters on UVOT (u, uvw1) show more homogeneity than do the bluer UV filters (uvm2, uvw2). Searching for purely UV characteristics to determine existing optically based groupings, we find the peak width to be a poor discriminant, but we do see a variation in the time delay between peak emission and the late, flat phase of the light curves. The UV light curves peak a few days before the B band for most subclasses (as was previously reported by Jha et al.), although the SN 2002cx-like objects peak at a very early epoch in the UV. That group also features the bluest emission observed among SNe Ia. As the observational campaign is ongoing, we discuss the critical times to observe, as determined by this study, in order to maximize the scientific output of future observations.
Mevik, Kjersti; Griffin, Frances A; Hansen, Tonje E; Deilkås, Ellen T; Vonen, Barthold
2016-04-25
To investigate the impact of increasing the sample size of records reviewed bi-weekly with the Global Trigger Tool method to identify adverse events in hospitalised patients. Retrospective observational study. A Norwegian 524-bed general hospital trust. 1920 medical records selected from 1 January to 31 December 2010. Rate, type and severity of adverse events identified in two different sample sizes of records, 10 and 70, selected bi-weekly. In the large sample, 1.45 (95% CI 1.07 to 1.97) times more adverse events per 1000 patient days (39.3 adverse events/1000 patient days) were identified than in the small sample (27.2 adverse events/1000 patient days). Hospital-acquired infections were the most common category of adverse events in both samples, and the distributions of the other categories of adverse events did not differ significantly between the samples. The distribution of severity levels of adverse events did not differ between the samples. The findings suggest that while the distribution of categories and severity is not dependent on the sample size, the rate of adverse events is. Further studies are needed to conclude whether the optimal sample size may need to be adjusted based on hospital size in order to detect a more accurate rate of adverse events. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://www.bmj.com/company/products-services/rights-and-licensing/
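The headline comparison is a ratio of two event rates. A minimal sketch of such a rate ratio with a Wald-type confidence interval on the log scale, assuming Poisson event counts; the counts below are illustrative, not the study's raw data.

```python
import math

def rate_ratio_ci(events_a, persontime_a, events_b, persontime_b, z=1.96):
    """Ratio of two event rates (per unit person-time) with a Wald-type
    95% CI computed on the log scale, assuming Poisson event counts."""
    ra = events_a / persontime_a
    rb = events_b / persontime_b
    rr = ra / rb
    se = math.sqrt(1.0 / events_a + 1.0 / events_b)
    lo = math.exp(math.log(rr) - z * se)
    hi = math.exp(math.log(rr) + z * se)
    return rr, lo, hi

# Illustrative numbers only: a large sample observing ~39.3 events per
# 1000 patient days versus ~27.2 in a small sample
rr, lo, hi = rate_ratio_ci(events_a=393, persontime_a=10_000,
                           events_b=68, persontime_b=2_500)
print(round(rr, 2), round(lo, 2), round(hi, 2))
```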
VLBI observations of the nuclei of a mixed sample of bright galaxies and quasars at 327 MHz
Ananthakrishnan, S.; Kulkarni, V.K.
1989-01-01
The first VLBI observations using the Ooty telescope are presented. An array consisting of telescopes at Ooty (India), Crimea (USSR), Torun (Poland), Westerbork (Netherlands) and Jodrell Bank (United Kingdom) was operated in 1983 December at a frequency of 327 MHz. Nearby galaxies, compact quasars and SS433 were observed in this pilot experiment. Most of the galaxies were found to be well resolved. The structure of SS433 (visible only on the shortest baseline) is consistent with that obtained in previous high-frequency VLBI work. The visibilities of the compact quasars indicate that large-scale scattering may be taking place in the interplanetary medium. (author)
Ala-Houhala, M; Koukila-Kähkölä, P; Antikainen, J; Valve, J; Kirveskari, J; Anttila, V-J
2018-03-01
To assess the clinical use of panfungal PCR for diagnosis of invasive fungal diseases (IFDs), focusing on deep tissue samples. We first described the design of the panfungal PCR assay in clinical use at Helsinki University Hospital, then retrospectively evaluated the results of 307 fungal PCR tests performed from 2013 to 2015. Samples were taken from normally sterile tissues and fluids; the patient population was nonselected. We classified the likelihood of IFD according to the criteria of the European Organization for Research and Treatment of Cancer/Invasive Fungal Infections Cooperative Group and the National Institute of Allergy and Infectious Diseases Mycoses Study Group (EORTC/MSG), comparing the fungal PCR results to the likelihood of IFD along with culture and microscopy results. There were 48 positive (16%) and 259 negative (84%) PCR results. The sensitivity and specificity of PCR for diagnosing IFDs were 60.5% and 91.7%, respectively, while the negative and positive predictive values were 93.4% and 54.2%, respectively. The concordance of PCR with culture was 86%, and with microscopy 87%. Of the 48 patients with positive PCR results, 23 had a proven or probable IFD. Fungal PCR can be useful for diagnosing IFDs in deep tissue samples, and it is beneficial to combine it with culture and microscopy. Copyright © 2017 European Society of Clinical Microbiology and Infectious Diseases. Published by Elsevier Ltd. All rights reserved.
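The four summary measures follow directly from a 2×2 confusion matrix. The counts below are back-calculated from the reported percentages purely for illustration; they are not the study's raw data.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV and NPV from a 2x2 confusion matrix
    (test result versus classified disease status)."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }

# Counts chosen to reproduce the reported percentages
# (48 positive and 259 negative tests, 307 in total)
m = diagnostic_metrics(tp=26, fp=22, fn=17, tn=242)
print({k: round(v * 100, 1) for k, v in m.items()})
# → {'sensitivity': 60.5, 'specificity': 91.7, 'ppv': 54.2, 'npv': 93.4}
```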
Snellen, I. A. G.; Zhang, M.; Schilizzi, R. T.; Rottgering, H. J. A.; de Bruyn, A. G.; Miley, G. K.
We present 2 and 20 cm observations with the VLA of 25 candidate peaked spectrum radio sources. These data combined with those from earlier surveys have allowed us to construct radio spectra spanning a range of frequency from 0.3 to 15 GHz. Ten of the 25 sources are found to be variable with no
Tramontana, Alfonso; Sorge, Roberto; Page, Juan Carlos Miangolarra
2016-12-30
Background and aims: Intervertebral disk degeneration is a pathological process marked by a decrease of mucopolysaccharides in the nucleus pulposus, with consequent dehydration and degeneration of the elastic fibers in the annulus fibrosus of the disk. The laser is a therapeutic tool whose biostimulation effects on treated tissues increase oxidative phosphorylation and ATP production, accelerating mucopolysaccharide synthesis with consequent rehydration, biostimulation and production of new elastic fibers. The goal of this project is to study whether laser stimulation may treat degenerated intervertebral disks. Materials and methods: 60 subjects with the same anthropometric parameters were selected and divided into two randomized groups: 30 subjects underwent laser stimulation, whereas 30 underwent placebo. All 60 subjects underwent discectomy surgery, and the intraoperative findings were examined in a lab, studying the positivity of the PAS reaction and the presence of potential newly formed elastic fibers. Results: A higher number of mucopolysaccharides and young newly formed elastic fibers was found in the group treated with laser irradiation, with a statistically significant difference compared to the placebo group (p… disks.
Grugel, Richard N.; Erdman, Robert; Van Hoose, James R.; Tewari, Surendra; Poirier, David
2012-01-01
Electron Back Scattered Diffraction results from cross-sections of directionally solidified aluminum 7wt% silicon alloys unexpectedly revealed tertiary dendrite arms that were detached and mis-oriented from their parent arm. More surprisingly, the same phenomenon was observed in a sample similarly processed in the quiescent microgravity environment aboard the International Space Station (ISS) in support of the joint US-European MICAST investigation. The work presented here includes a brief introduction to MICAST and the directional solidification facilities, and their capabilities, available aboard the ISS. Results from the ground-based and microgravity processed samples are compared and possible mechanisms for the observed tertiary arm detachment are suggested.
Guignard, P.A.; Chan, W. (Royal Melbourne Hospital, Parkville (Australia). Dept. of Nuclear Medicine)
1984-09-01
Several techniques for processing a series of curves derived from two left ventricular time-activity curves, acquired at rest and during exercise with a nuclear stethoscope, were evaluated: three- and five-point time smoothing, Fourier filtering preserving one to four harmonics (H), truncated-curve Fourier filtering, and third-degree polynomial curve fitting. Each filter's ability to recover systolic and diastolic function parameters with fidelity was evaluated under increasingly 'noisy' conditions and at several sampling rates. Third-degree polynomial curve fitting and truncated Fourier filters exhibited very high sensitivity to noise. Three- and five-point time smoothing had moderate sensitivity to noise, but was highly affected by sampling rate. Fourier filtering preserving 2H or 3H produced the best compromise, with high resilience to noise and independence of sampling rate as far as recovery of these functional parameters is concerned.
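The harmonic-preserving Fourier filter the study favours can be sketched as follows: keep the DC term and the first few harmonics of the cycle, zero the rest. The curve here is synthetic, a minimal illustration rather than the nuclear-stethoscope pipeline.

```python
import numpy as np

def fourier_harmonic_filter(curve, n_harmonics):
    """Smooth a periodic time-activity curve by keeping only the DC term
    and the first n_harmonics Fourier harmonics (e.g. 2 or 3), zeroing
    all higher-frequency bins."""
    spectrum = np.fft.rfft(curve)
    spectrum[n_harmonics + 1:] = 0.0
    return np.fft.irfft(spectrum, n=len(curve))

# Synthetic ventricular volume curve: fundamental plus second harmonic
# buried in noise
rng = np.random.default_rng(2)
t = np.linspace(0, 2 * np.pi, 64, endpoint=False)
clean = 100 + 30 * np.cos(t) + 8 * np.cos(2 * t)
noisy = clean + rng.normal(0, 5, t.size)
smooth = fourier_harmonic_filter(noisy, n_harmonics=2)
print(np.abs(smooth - clean).max() < np.abs(noisy - clean).max())
```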
The statistical-inference approach to generalized thermodynamics
Lavenda, B.H.; Scherer, C.
1987-01-01
Limit theorems, such as the central-limit theorem and the weak law of large numbers, are applicable to statistical thermodynamics for sufficiently large sample sizes of independent and identically distributed observations performed on extensive thermodynamic (chance) variables. The estimation of the intensive thermodynamic quantities is a problem in parametric statistical estimation. The normal approximation to the Gibbs distribution is justified by the analysis of large deviations. Statistical thermodynamics is generalized to include the statistical estimation of variance as well as mean values.
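The limit behaviour invoked here is easy to check numerically: means of n i.i.d. draws concentrate around the true mean, with spread shrinking as 1/√n. A generic illustration with exponential draws standing in for observations on an extensive variable:

```python
import random
import statistics

# Means of n = 400 i.i.d. exponential(1) draws: the sampling distribution
# is approximately normal with mean 1 and standard deviation 1/sqrt(400) = 0.05
random.seed(4)
n, reps = 400, 2000
means = [statistics.fmean(random.expovariate(1.0) for _ in range(n))
         for _ in range(reps)]
spread = statistics.stdev(means)
print(abs(statistics.fmean(means) - 1.0) < 0.01, abs(spread - 0.05) < 0.01)
```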
Pollock, D.A.; Brown, G.; Capone, D.W. II; Christopherson, D.; Seuntjens, J.M.; Woltz, J.
1992-03-01
The purpose of this paper is to demonstrate a statistical method for verifying superconducting wire process stability as represented by the critical current I_c. The paper does not propose changing the I_c testing frequency for wire during Phase 1 of the present Vendor Qualification Program. The actual statistical limits demonstrated for one supplier's data are not expected to be suitable for all suppliers; however, the method used to develop the limits, and the potential for process improvement through their use, may be applied equally. Implementing the demonstrated method implies that the current practice of testing all pieces of wire from each billet to detect manufacturing process errors (e.g. missing a heat-treatment cycle for part of the billet) can be replaced by other, less costly process control measures. As used in this paper, process control limits for critical current are quantitative indicators of the uniformity of the source manufacturing process. The limits serve as alarms indicating the need for manufacturing process investigation.
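Process control limits of this kind can be sketched as Shewhart-style limits at the mean ± 3 standard deviations. The I_c values below are hypothetical, chosen only to illustrate the alarm logic, not the supplier data analysed in the paper.

```python
import statistics

def control_limits(samples, n_sigma=3.0):
    """Shewhart-style control limits for a process variable such as the
    critical current I_c: mean +/- n_sigma sample standard deviations.
    A measurement outside the limits flags the process for investigation."""
    mu = statistics.fmean(samples)
    sd = statistics.stdev(samples)
    return mu - n_sigma * sd, mu + n_sigma * sd

# Hypothetical I_c measurements (amperes) from one supplier's billets
ic = [142, 145, 139, 141, 144, 143, 140, 146, 142, 141]
lcl, ucl = control_limits(ic)
print(all(lcl < x < ucl for x in ic))   # the history is in control
print(128 < lcl)   # a billet with a missed heat treatment would trip the alarm
```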
Spearing, Debra; Woehlke, Paula
To assess the effect on discriminant analysis in terms of correct classification into two groups, the following parameters were systematically altered using Monte Carlo techniques: sample sizes; proportions of one group to the other; number of independent variables; and covariance matrices. The pairing of the off diagonals (or covariances) with…
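One Monte Carlo replicate of such a study can be sketched with Fisher's linear discriminant on two Gaussian groups, scoring the correct-classification rate. This is a generic sketch of the technique, not the study's actual simulation design; group sizes and mean separation are illustrative.

```python
import numpy as np

def lda_correct_rate(n_a, n_b, delta, n_features, rng):
    """One Monte Carlo replicate: fit Fisher's linear discriminant to two
    Gaussian groups whose means differ by delta in every feature, and
    return the in-sample correct-classification rate."""
    Xa = rng.standard_normal((n_a, n_features))
    Xb = rng.standard_normal((n_b, n_features)) + delta
    X = np.vstack([Xa, Xb])
    y = np.array([0] * n_a + [1] * n_b)
    mu_a, mu_b = Xa.mean(0), Xb.mean(0)
    # Pooled within-group covariance
    Sw = (np.cov(Xa, rowvar=False) * (n_a - 1)
          + np.cov(Xb, rowvar=False) * (n_b - 1)) / (n_a + n_b - 2)
    w = np.linalg.solve(Sw, mu_b - mu_a)       # discriminant direction
    threshold = w @ (mu_a + mu_b) / 2.0        # midpoint decision rule
    pred = (X @ w > threshold).astype(int)
    return (pred == y).mean()

rng = np.random.default_rng(5)
rates = [lda_correct_rate(50, 50, delta=2.0, n_features=3, rng=rng)
         for _ in range(200)]
print(np.mean(rates) > 0.9)   # well-separated groups classify reliably
```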
Fuse, Norikazu; Kanegami, Masaki; Misaka, Hideki; Homma, Hiroya; Okamoto, Tatsuki
2013-01-01
Polymeric insulations used in nuclear power plant safety cables are known to have their service life sometimes estimated conservatively by the present prediction model. To refine the model so that it reflects aging in containments, discrepancies between predictions and reality need to be clarified. In the present paper, a statistical analysis was carried out on the aging status of various insulations removed from domestic containments. Aging in the operational environment is found to be slower than expected from accelerated aging test results. The temperature dependence of estimated lifetime on an Arrhenius plot also suggests that the dominant elementary chemical reaction differs between the two aging conditions, resulting in apparent differences in activation energies and pre-exponential factors. Two kinds of issues need to be clarified to refine the model: the temperature-driven change in the predominant degradation chemistry, and the effect of the low-oxygen environment in boiling water reactor type containments. (author)
Understanding Computational Bayesian Statistics
Bolstad, William M
2011-01-01
A hands-on introduction to computational statistics from a Bayesian point of view Providing a solid grounding in statistics while uniquely covering the topics from a Bayesian perspective, Understanding Computational Bayesian Statistics successfully guides readers through this new, cutting-edge approach. With its hands-on treatment of the topic, the book shows how samples can be drawn from the posterior distribution when the formula giving its shape is all that is known, and how Bayesian inferences can be based on these samples from the posterior. These ideas are illustrated on common statistic
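The book's central idea, drawing samples from a posterior whose shape is known only up to a normalising constant, can be sketched with a random-walk Metropolis step. This is a generic illustration, not code from the book; the target here is an unnormalised standard normal.

```python
import math
import random

def metropolis(log_density, x0, n_samples, step=1.0, seed=0):
    """Random-walk Metropolis sampler: draws from a distribution known
    only up to a constant through its log density."""
    rng = random.Random(seed)
    x, logp = x0, log_density(x0)
    out = []
    for _ in range(n_samples):
        prop = x + rng.gauss(0.0, step)
        logp_prop = log_density(prop)
        # Accept with probability min(1, density(prop) / density(x))
        if math.log(rng.random()) < logp_prop - logp:
            x, logp = prop, logp_prop
        out.append(x)
    return out

# Unnormalised standard normal: log density -x^2/2, constant dropped
draws = metropolis(lambda x: -0.5 * x * x, x0=0.0, n_samples=50_000)
mean = sum(draws) / len(draws)
print(abs(mean) < 0.1)   # sample mean near the true mean of 0
```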
Pestman, Wiebe R
2009-01-01
This textbook provides a broad and solid introduction to mathematical statistics, including the classical subjects hypothesis testing, normal regression analysis, and normal analysis of variance. In addition, non-parametric statistics and vectorial statistics are considered, as well as applications of stochastic analysis in modern statistics, e.g., Kolmogorov-Smirnov testing, smoothing techniques, robustness and density estimation. For students with some elementary mathematical background. With many exercises. Prerequisites from measure theory and linear algebra are presented.
Apel, Heiko; Abdykerimova, Zharkinay; Agalhanova, Marina; Baimaganbetov, Azamat; Gavrilenko, Nadejda; Gerlitz, Lars; Kalashnikova, Olga; Unger-Shayesteh, Katy; Vorogushyn, Sergiy; Gafurov, Abror
2018-04-01
The semi-arid regions of Central Asia crucially depend on the water resources supplied by the mountainous areas of the Tien Shan and Pamir and Altai mountains. During the summer months the snow-melt- and glacier-melt-dominated river discharge originating in the mountains provides the main water resource available for agricultural production, but also for storage in reservoirs for energy generation during the winter months. Thus a reliable seasonal forecast of the water resources is crucial for sustainable management and planning of water resources. In fact, seasonal forecasts are mandatory tasks of all national hydro-meteorological services in the region. In order to support the operational seasonal forecast procedures of hydro-meteorological services, this study aims to develop a generic tool for deriving statistical forecast models of seasonal river discharge based solely on observational records. The generic model structure is kept as simple as possible in order to be driven by meteorological and hydrological data readily available at the hydro-meteorological services, and to be applicable for all catchments in the region. As snow melt dominates summer runoff, the main meteorological predictors for the forecast models are monthly values of winter precipitation and temperature, satellite-based snow cover data, and antecedent discharge. This basic predictor set was further extended by multi-monthly means of the individual predictors, as well as composites of the predictors. Forecast models are derived based on these predictors as linear combinations of up to four predictors. A user-selectable number of the best models is extracted automatically by the developed model fitting algorithm, which includes a test for robustness by a leave-one-out cross-validation. Based on the cross-validation the predictive uncertainty was quantified for every prediction model. Forecasts of the mean seasonal discharge of the period April to September are derived every month from
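The leave-one-out cross-validation used to rank candidate predictor combinations can be sketched as follows: each observation is predicted from a linear model fitted to all the others, and the resulting out-of-sample RMSE scores the model. The data and predictor names below are synthetic illustrations.

```python
import numpy as np

def loocv_rmse(X, y):
    """Leave-one-out cross-validation RMSE of an ordinary least squares
    model y ~ [1, X]: each observation is predicted from a fit to all
    the other observations."""
    X = np.column_stack([np.ones(len(y)), X])
    errors = []
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        beta, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        errors.append(y[i] - X[i] @ beta)
    return float(np.sqrt(np.mean(np.square(errors))))

# Synthetic seasonal discharge driven by two informative predictors
# (say, winter precipitation and antecedent discharge) plus noise
rng = np.random.default_rng(3)
predictors = rng.standard_normal((30, 3))
discharge = 2.0 * predictors[:, 0] + predictors[:, 1] + rng.normal(0, 0.1, 30)
rmse_good = loocv_rmse(predictors[:, :2], discharge)
rmse_bad = loocv_rmse(predictors[:, 2:], discharge)
print(rmse_good < rmse_bad)   # informative predictors cross-validate better
```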