Statistical error estimation of the Feynman-α method using the bootstrap method
International Nuclear Information System (INIS)
Endo, Tomohiro; Yamamoto, Akio; Yagi, Takahiro; Pyeon, Cheol Ho
2016-01-01
The applicability of the bootstrap method is investigated for estimating the statistical error of the Feynman-α method, one of the subcritical measurement techniques based on reactor noise analysis. In the Feynman-α method, the statistical error can be estimated simply from multiple measurements of reactor noise; however, repeating the measurement requires additional measurement time. Using a resampling technique called the 'bootstrap method', the standard deviation and confidence interval of measurement results obtained by the Feynman-α method can be estimated as the statistical error from only a single measurement of reactor noise. To validate the proposed technique, we carried out a passive measurement of reactor noise without any external source, i.e. with only the inherent neutron source from spontaneous fission and (α,n) reactions in nuclear fuels, at the Kyoto University Criticality Assembly. The measurement confirms that the bootstrap method can approximately estimate the statistical error of measurement results obtained by the Feynman-α method. (author)
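The resampling idea in this abstract can be sketched as follows. This is only an illustration, not the authors' procedure: it assumes the gate counts can be resampled as if independent (real reactor-noise counts are correlated in time, and the abstract does not say how that is handled), it uses the textbook definition of the Feynman Y value (variance-to-mean ratio minus one), and the function names, the percentile interval and the synthetic counts are all hypothetical.

```python
import random
import statistics


def feynman_y(counts):
    """Feynman Y value: variance-to-mean ratio of the gate counts, minus one."""
    mean = statistics.fmean(counts)
    return statistics.pvariance(counts) / mean - 1.0


def bootstrap_error(counts, statistic, n_resamples=1000, level=0.95, seed=0):
    """Return (std_dev, (lo, hi)) of `statistic` over bootstrap resamples.

    Each resample draws len(counts) values from `counts` with replacement.
    The interval is a simple percentile interval (an assumption of this sketch).
    """
    rng = random.Random(seed)
    n = len(counts)
    replicates = sorted(
        statistic(rng.choices(counts, k=n)) for _ in range(n_resamples)
    )
    tail = (1.0 - level) / 2.0
    lo = replicates[int(tail * (n_resamples - 1))]
    hi = replicates[int((1.0 - tail) * (n_resamples - 1))]
    return statistics.stdev(replicates), (lo, hi)


# Synthetic gate counts for illustration only (not measured data).
_rng = random.Random(1)
counts = [_rng.randint(0, 20) for _ in range(500)]
sd, (lo, hi) = bootstrap_error(counts, feynman_y)
```

A single list of counts is enough to obtain both `sd` and the interval `(lo, hi)`, which is the point made in the abstract: no repeated measurement is needed.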
Order statistics & inference estimation methods
Balakrishnan, N
1991-01-01
The literature on order statistics and inference is quite extensive and covers a large number of fields, but most of it is dispersed throughout numerous publications. This volume is the consolidation of the most important results and places an emphasis on estimation. Both theoretical and computational procedures are presented to meet the needs of researchers, professionals, and students. The methods of estimation discussed are well-illustrated with numerous practical examples from both the physical and life sciences, including sociology, psychology, and electrical and chemical engineering. A co
Methods for estimating low-flow statistics for Massachusetts streams
Ries, Kernell G.; Friesz, Paul J.
2000-01-01
Methods and computer software are described in this report for determining flow duration, low-flow frequency statistics, and August median flows. These low-flow statistics can be estimated for unregulated streams in Massachusetts using different methods depending on whether the location of interest is at a streamgaging station, a low-flow partial-record station, or an ungaged site where no data are available. Low-flow statistics for streamgaging stations can be estimated using standard U.S. Geological Survey methods described in the report. The MOVE.1 mathematical method and a graphical correlation method can be used to estimate low-flow statistics for low-flow partial-record stations. The MOVE.1 method is recommended when the relation between measured flows at a partial-record station and daily mean flows at a nearby, hydrologically similar streamgaging station is linear, and the graphical method is recommended when the relation is curved. Equations are presented for computing the variance and equivalent years of record for estimates of low-flow statistics for low-flow partial-record stations when either a single or multiple index stations are used to determine the estimates. The drainage-area ratio method or regression equations can be used to estimate low-flow statistics for ungaged sites where no data are available. The drainage-area ratio method is generally as accurate as or more accurate than regression estimates when the drainage-area ratio for an ungaged site is between 0.3 and 1.5 times the drainage area of the index data-collection site. Regression equations were developed to estimate the natural, long-term 99-, 98-, 95-, 90-, 85-, 80-, 75-, 70-, 60-, and 50-percent duration flows; the 7-day, 2-year and the 7-day, 10-year low flows; and the August median flow for ungaged sites in Massachusetts. Streamflow statistics and basin characteristics for 87 to 133 streamgaging stations and low-flow partial-record stations were used to develop the equations. The
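The drainage-area ratio method mentioned in this abstract can be sketched as below. The sketch assumes simple proportional scaling (an exponent of 1 on the area ratio), which the abstract does not state; the function name and parameters are hypothetical. The default validity range of 0.3 to 1.5 is the one quoted in the abstract.

```python
def drainage_area_ratio_estimate(q_index, area_index, area_ungaged,
                                 ratio_range=(0.3, 1.5)):
    """Scale a flow statistic from an index site to an ungaged site.

    Assumes the statistic scales linearly with drainage area (an assumption
    of this sketch). Raises ValueError when the area ratio falls outside
    `ratio_range`, where another method would be the safer choice.
    """
    if q_index < 0 or area_index <= 0 or area_ungaged <= 0:
        raise ValueError("flow must be non-negative and areas positive")
    ratio = area_ungaged / area_index
    low, high = ratio_range
    if not low <= ratio <= high:
        raise ValueError(
            f"drainage-area ratio {ratio:.2f} is outside {low}-{high}"
        )
    return q_index * ratio


# Hypothetical numbers: index-site statistic of 10.0 flow units,
# index drainage area 50.0 and ungaged drainage area 75.0 (same area units).
estimate = drainage_area_ratio_estimate(10.0, 50.0, 75.0)
```

Passing a different `ratio_range` lets the same sketch follow other regional guidance on when the method is considered reliable.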
The estimation of the measurement results with using statistical methods
International Nuclear Information System (INIS)
Velychko, O (State Enterprise Ukrmetrteststandard, 4, Metrologichna Str., 03680, Kyiv (Ukraine)); Gordiyenko, T (State Scientific Institution UkrNDIspirtbioprod, 3, Babushkina Lane, 03190, Kyiv (Ukraine))
2015-01-01
A number of international standards and guides describe various statistical methods that are applied to the management, control and improvement of processes for the purpose of analysing technical measurement results. An analysis of international standards and guides on statistical methods for the estimation of measurement results, with recommendations for their application in laboratories, is described. For this analysis, cause-and-effect (Ishikawa) diagrams concerning the application of statistical methods to the estimation of measurement results are constructed.
The estimation of the measurement results with using statistical methods
Velychko, O.; Gordiyenko, T.
2015-02-01
A number of international standards and guides describe various statistical methods that are applied to the management, control and improvement of processes for the purpose of analysing technical measurement results. An analysis of international standards and guides on statistical methods for the estimation of measurement results, with recommendations for their application in laboratories, is described. For this analysis, cause-and-effect (Ishikawa) diagrams concerning the application of statistical methods to the estimation of measurement results are constructed.
Statistical methods of parameter estimation for deterministically chaotic time series
Pisarenko, V. F.; Sornette, D.
2004-03-01
We discuss the possibility of applying some standard statistical methods (the least-square method, the maximum likelihood method, and the method of statistical moments for estimation of parameters) to a deterministically chaotic low-dimensional dynamic system (the logistic map) containing observational noise. A “segmentation fitting” maximum likelihood (ML) method is suggested to estimate the structural parameter of the logistic map along with the initial value x1, considered as an additional unknown parameter. The segmentation fitting method, called “piece-wise” ML, is similar in spirit to the previously proposed “multiple shooting” method but is simpler and has smaller bias. Comparisons with different previously proposed techniques on simulated numerical examples give favorable results (at least, for the investigated combinations of sample size N and noise level). Moreover, unlike some suggested techniques, our method does not require a priori knowledge of the noise variance. We also clarify the nature of the inherent difficulties in the statistical analysis of deterministically chaotic time series and the status of previously proposed Bayesian approaches. We note the trade-off between the need to use a large number of data points in the ML analysis to decrease the bias (to guarantee consistency of the estimation) and the unstable nature of dynamical trajectories with exponentially fast loss of memory of the initial condition. The method of statistical moments for the estimation of the parameter of the logistic map is discussed. This method seems to be the only one whose consistency for deterministically chaotic time series has so far been proved theoretically (not only numerically).
Statistical Methods for Estimating the Uncertainty in the Best Basis Inventories
International Nuclear Information System (INIS)
WILMARTH, S.R.
2000-01-01
This document describes the statistical methods used to determine sample-based uncertainty estimates for the Best Basis Inventory (BBI). For each waste phase, the equation for the inventory of an analyte in a tank is Inventory (Kg or Ci) = Concentration x Density x Waste Volume. The total inventory is the sum of the inventories in the different waste phases. Using tank sample data, statistical methods are used to obtain estimates of the mean concentration of an analyte, the density of the waste, and their standard deviations. The volumes of waste in the different phases, and their standard deviations, are estimated based on other types of data. The three estimates are multiplied to obtain the inventory estimate. The standard deviations are combined to obtain a standard deviation of the inventory. The uncertainty estimate for the Best Basis Inventory (BBI) is the approximate 95% confidence interval on the inventory.
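One common way to combine the standard deviations of a product is first-order error propagation, sketched below. The abstract does not say which combination rule the document uses, so this is an assumption: the three estimates are treated as independent, their relative variances are added, and the approximate 95% interval is taken as the estimate plus or minus 1.96 standard deviations. The function name and the example numbers are hypothetical.

```python
import math


def inventory_with_uncertainty(conc, sd_conc, density, sd_density,
                               volume, sd_volume, z=1.96):
    """Inventory = concentration x density x volume, with propagated error.

    Assumes independent estimates and first-order propagation:
    (sd_I / I)^2 = (sd_C / C)^2 + (sd_rho / rho)^2 + (sd_V / V)^2.
    Returns (inventory, sd_inventory, (lower, upper)).
    """
    inventory = conc * density * volume
    rel_var = ((sd_conc / conc) ** 2
               + (sd_density / density) ** 2
               + (sd_volume / volume) ** 2)
    sd_inventory = abs(inventory) * math.sqrt(rel_var)
    interval = (inventory - z * sd_inventory, inventory + z * sd_inventory)
    return inventory, sd_inventory, interval


# Hypothetical values, each with a 10% relative standard deviation.
inv, sd_inv, ci = inventory_with_uncertainty(2.0, 0.2, 1.5, 0.15, 100.0, 10.0)
```

For a tank with several waste phases, the phase inventories would be summed; under the same independence assumption their variances (not standard deviations) would add.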
Statistical methods of estimating mining costs
Long, K.R.
2011-01-01
Until it was defunded in 1995, the U.S. Bureau of Mines maintained a Cost Estimating System (CES) for prefeasibility-type economic evaluations of mineral deposits and estimating costs at producing and non-producing mines. This system had a significant role in mineral resource assessments to estimate costs of developing and operating known mineral deposits and predicted undiscovered deposits. For legal reasons, the U.S. Geological Survey cannot update and maintain CES. Instead, statistical tools are under development to estimate mining costs from basic properties of mineral deposits such as tonnage, grade, mineralogy, depth, strip ratio, distance from infrastructure, rock strength, and work index. The first step was to reestimate "Taylor's Rule" which relates operating rate to available ore tonnage. The second step was to estimate statistical models of capital and operating costs for open pit porphyry copper mines with flotation concentrators. For a sample of 27 proposed porphyry copper projects, capital costs can be estimated from three variables: mineral processing rate, strip ratio, and distance from nearest railroad before mine construction began. Of all the variables tested, operating costs were found to be significantly correlated only with strip ratio.
Southard, Rodney E.
2013-01-01
The weather and precipitation patterns in Missouri vary considerably from year to year. In 2008, the statewide average rainfall was 57.34 inches and in 2012, the statewide average rainfall was 30.64 inches. This variability in precipitation and resulting streamflow in Missouri underlies the necessity for water managers and users to have reliable streamflow statistics and a means to compute select statistics at ungaged locations for a better understanding of water availability. Knowledge of surface-water availability is dependent on the streamflow data that have been collected and analyzed by the U.S. Geological Survey for more than 100 years at approximately 350 streamgages throughout Missouri. The U.S. Geological Survey, in cooperation with the Missouri Department of Natural Resources, computed streamflow statistics at streamgages through the 2010 water year, defined periods of drought and defined methods to estimate streamflow statistics at ungaged locations, and developed regional regression equations to compute selected streamflow statistics at ungaged locations. Streamflow statistics and flow durations were computed for 532 streamgages in Missouri and in neighboring States of Missouri. For streamgages with more than 10 years of record, Kendall’s tau was computed to evaluate for trends in streamflow data. If trends were detected, the variable length method was used to define the period of no trend. Water years were removed from the dataset from the beginning of the record for a streamgage until no trend was detected. Low-flow frequency statistics were then computed for the entire period of record and for the period of no trend if 10 or more years of record were available for each analysis. Three methods are presented for computing selected streamflow statistics at ungaged locations. The first method uses power curve equations developed for 28 selected streams in Missouri and neighboring States that have multiple streamgages on the same streams. Statistical
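The trend screening described in this abstract (test for a trend, then drop water years from the beginning of the record until none is detected) can be sketched as follows. The details are assumptions, not taken from the report: the significance test is the normal approximation to the Kendall S statistic without a tie correction, the significance level is 0.05, and the function names are hypothetical. The 10-year minimum record length is the one quoted in the abstract.

```python
import math
from statistics import NormalDist


def kendall_trend_p_value(values):
    """Two-sided p-value for a monotonic trend (normal approximation, no tie correction)."""
    n = len(values)
    # S counts concordant minus discordant pairs in time order.
    s = sum(
        (values[j] > values[i]) - (values[j] < values[i])
        for i in range(n - 1)
        for j in range(i + 1, n)
    )
    if s == 0:
        return 1.0
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    z = (abs(s) - 1) / math.sqrt(var_s)  # continuity-corrected
    return 2.0 * (1.0 - NormalDist().cdf(z))


def period_of_no_trend(annual_values, alpha=0.05, min_years=10):
    """Index of the first year of the longest trailing trend-free period.

    Years are removed from the start of the record until the remaining
    series shows no significant trend. Returns None if no trend-free
    period of at least `min_years` exists.
    """
    for start in range(len(annual_values) - min_years + 1):
        if kendall_trend_p_value(annual_values[start:]) > alpha:
            return start
    return None


steadily_rising = list(range(12))   # hypothetical annual statistic with a clear trend
alternating = [1, 2] * 5            # hypothetical record with no trend
```

With these toy inputs, `period_of_no_trend(steadily_rising)` finds no acceptable period, while `period_of_no_trend(alternating)` keeps the whole record.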
Testing a statistical method of global mean paleotemperature estimations in a long climate simulation
Energy Technology Data Exchange (ETDEWEB)
Zorita, E.; Gonzalez-Rouco, F. [GKSS-Forschungszentrum Geesthacht GmbH (Germany). Inst. fuer Hydrophysik
2001-07-01
Current statistical methods of reconstructing the climate of the last centuries are based on statistical models linking climate observations (temperature, sea-level pressure) and proxy-climate data (tree-ring chronologies, ice-core isotope concentrations, varved sediments, etc.). These models are calibrated in the instrumental period, and the longer time series of proxy data are then used to estimate the past evolution of the climate variables. Using such methods, the global mean temperature of the last 600 years has recently been estimated. In this work, this method of reconstruction is tested using data from a very long simulation with a climate model. This testing allows the errors of the estimations to be estimated as a function of the number of proxy data, as well as the time scale at which the estimations are probably reliable. (orig.)
Statistical estimation Monte Carlo for unreliability evaluation of highly reliable system
International Nuclear Information System (INIS)
Xiao Gang; Su Guanghui; Jia Dounan; Li Tianduo
2000-01-01
Based on analog Monte Carlo simulation, statistical Monte Carlo methods for unreliability evaluation of highly reliable systems are constructed, including the direct statistical estimation Monte Carlo method and the weighted statistical estimation Monte Carlo method. The basal element is given, and the statistical estimation Monte Carlo estimators are derived. The direct Monte Carlo simulation method, bounding-sampling method, forced transitions Monte Carlo method, direct statistical estimation Monte Carlo and weighted statistical estimation Monte Carlo are used to evaluate the unreliability of the same system. The comparison shows that the weighted statistical estimation Monte Carlo estimator has the smallest variance and the highest computational efficiency.
Kim, Yoonsang; Choi, Young-Ku; Emery, Sherry
2013-08-01
Several statistical packages are capable of estimating generalized linear mixed models and these packages provide one or more of three estimation methods: penalized quasi-likelihood, Laplace, and Gauss-Hermite. Many studies have investigated these methods' performance for the mixed-effects logistic regression model. However, the authors focused on models with one or two random effects and assumed a simple covariance structure between them, which may not be realistic. When there are multiple correlated random effects in a model, the computation becomes intensive, and often an algorithm fails to converge. Moreover, in our analysis of smoking status and exposure to anti-tobacco advertisements, we have observed that when a model included multiple random effects, parameter estimates varied considerably from one statistical package to another even when using the same estimation method. This article presents a comprehensive review of the advantages and disadvantages of each estimation method. In addition, we compare the performances of the three methods across statistical packages via simulation, which involves two- and three-level logistic regression models with at least three correlated random effects. We apply our findings to a real dataset. Our results suggest that two packages, SAS GLIMMIX Laplace and SuperMix Gaussian quadrature, perform well in terms of accuracy, precision, convergence rates, and computing speed. We also discuss the strengths and weaknesses of the two packages in regard to sample sizes.
Kim, Yoonsang; Emery, Sherry
2013-01-01
Several statistical packages are capable of estimating generalized linear mixed models and these packages provide one or more of three estimation methods: penalized quasi-likelihood, Laplace, and Gauss-Hermite. Many studies have investigated these methods’ performance for the mixed-effects logistic regression model. However, the authors focused on models with one or two random effects and assumed a simple covariance structure between them, which may not be realistic. When there are multiple correlated random effects in a model, the computation becomes intensive, and often an algorithm fails to converge. Moreover, in our analysis of smoking status and exposure to anti-tobacco advertisements, we have observed that when a model included multiple random effects, parameter estimates varied considerably from one statistical package to another even when using the same estimation method. This article presents a comprehensive review of the advantages and disadvantages of each estimation method. In addition, we compare the performances of the three methods across statistical packages via simulation, which involves two- and three-level logistic regression models with at least three correlated random effects. We apply our findings to a real dataset. Our results suggest that two packages—SAS GLIMMIX Laplace and SuperMix Gaussian quadrature—perform well in terms of accuracy, precision, convergence rates, and computing speed. We also discuss the strengths and weaknesses of the two packages in regard to sample sizes. PMID:24288415
Methods for estimating flow-duration and annual mean-flow statistics for ungaged streams in Oklahoma
Esralew, Rachel A.; Smith, S. Jerrod
2010-01-01
Flow statistics can be used to provide decision makers with surface-water information needed for activities such as water-supply permitting, flow regulation, and other water rights issues. Flow statistics could be needed at any location along a stream. Most often, streamflow statistics are needed at ungaged sites, where no flow data are available to compute the statistics. Methods are presented in this report for estimating flow-duration and annual mean-flow statistics for ungaged streams in Oklahoma. Flow statistics included the (1) annual (period of record), (2) seasonal (summer-autumn and winter-spring), and (3) 12 monthly duration statistics, including the 20th, 50th, 80th, 90th, and 95th percentile flow exceedances, and the annual mean-flow (mean of daily flows for the period of record). Flow statistics were calculated from daily streamflow information collected from 235 streamflow-gaging stations throughout Oklahoma and areas in adjacent states. A drainage-area ratio method is the preferred method for estimating flow statistics at an ungaged location that is on a stream near a gage. The method generally is reliable only if the drainage-area ratio of the two sites is between 0.5 and 1.5. Regression equations that relate flow statistics to drainage-basin characteristics were developed for the purpose of estimating selected flow-duration and annual mean-flow statistics for ungaged streams that are not near gaging stations on the same stream. Regression equations were developed from flow statistics and drainage-basin characteristics for 113 unregulated gaging stations. Separate regression equations were developed by using U.S. Geological Survey streamflow-gaging stations in regions with similar drainage-basin characteristics. These equations can increase the accuracy of regression equations used for estimating flow-duration and annual mean-flow statistics at ungaged stream locations in Oklahoma. Streamflow-gaging stations were grouped by selected drainage
A statistical method to estimate low-energy hadronic cross sections
Balassa, Gábor; Kovács, Péter; Wolf, György
2018-02-01
In this article we propose a model based on the Statistical Bootstrap approach to estimate the cross sections of different hadronic reactions up to a few GeV in c.m.s. energy. The method is based on the idea that when two particles collide, a so-called fireball is formed, which after a short time decays statistically into a specific final state. To calculate the probabilities we use a phase space description extended with quark combinatorial factors and the possibility of more than one fireball formation. In a few simple cases the probability of a specific final state can be calculated analytically, and for these we show that the model is able to reproduce the ratios of the considered cross sections. We also show that the model is able to describe proton-antiproton annihilation at rest. In the latter case we used a numerical method to calculate the more complicated final state probabilities. Additionally, we examined the formation of strange and charmed mesons, where we used existing data to fit the relevant model parameters.
Statistics of Parameter Estimates: A Concrete Example
Aguilar, Oscar
2015-01-01
© 2015 Society for Industrial and Applied Mathematics. Most mathematical models include parameters that need to be determined from measurements. The estimated values of these parameters and their uncertainties depend on assumptions made about noise levels, models, or prior knowledge. But what can we say about the validity of such estimates, and the influence of these assumptions? This paper is concerned with methods to address these questions, and for didactic purposes it is written in the context of a concrete nonlinear parameter estimation problem. We will use the results of a physical experiment conducted by Allmaras et al. at Texas A&M University [M. Allmaras et al., SIAM Rev., 55 (2013), pp. 149-167] to illustrate the importance of validation procedures for statistical parameter estimation. We describe statistical methods and data analysis tools to check the choices of likelihood and prior distributions, and provide examples of how to compare Bayesian results with those obtained by non-Bayesian methods based on different types of assumptions. We explain how different statistical methods can be used in complementary ways to improve the understanding of parameter estimates and their uncertainties.
Estimation of In Situ Stresses with Hydro-Fracturing Tests and a Statistical Method
Lee, Hikweon; Ong, See Hong
2018-03-01
At great depths, where borehole-based field stress measurements such as hydraulic fracturing are challenging due to difficult downhole conditions or prohibitive costs, in situ stresses can be indirectly estimated using wellbore failures such as borehole breakouts and/or drilling-induced tensile failures detected by an image log. As part of such efforts, a statistical method has been developed in which borehole breakouts detected on an image log are used for this purpose (Song et al. in Proceedings on the 7th international symposium on in situ rock stress, 2016; Song and Chang in J Geophys Res Solid Earth 122:4033-4052, 2017). The method employs a grid-searching algorithm in which the least and maximum horizontal principal stresses (S_h and S_H) are varied, and the corresponding simulated depth-related breakout width distribution as a function of the breakout angle (θ_B = 90° - half of breakout width) is compared to that observed along the borehole to determine a set of S_h and S_H having the lowest misfit between them. An important advantage of the method is that S_h and S_H can be estimated simultaneously in vertical wells. To validate the statistical approach, the method is applied to a vertical hole where a set of field hydraulic fracturing tests have been carried out. The stress estimations using the proposed method were found to be in good agreement with the results interpreted from the hydraulic fracturing test measurements.
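The grid-searching step described in this abstract can be sketched generically as below. Only the search loop is illustrated: the forward model that turns a stress pair into a breakout width is supplied by the caller, and the `toy_width` function used here is a made-up placeholder with no rock-mechanics meaning. The mean-squared misfit, the function names and all numbers are assumptions of this sketch.

```python
from itertools import product


def grid_search_stresses(depths, observed_widths, predict_width,
                         sh_grid, sH_grid):
    """Return (misfit, S_h, S_H) for the grid point with the lowest misfit.

    `predict_width(sh, sH, depth)` is a caller-supplied forward model.
    The misfit is the mean squared difference between simulated and
    observed breakout widths (an assumption of this sketch).
    """
    best = None
    for sh, sH in product(sh_grid, sH_grid):
        if sH < sh:  # keep S_H as the larger horizontal stress
            continue
        misfit = sum(
            (predict_width(sh, sH, d) - w) ** 2
            for d, w in zip(depths, observed_widths)
        ) / len(depths)
        if best is None or misfit < best[0]:
            best = (misfit, sh, sH)
    return best


def toy_width(sh, sH, depth):
    """Placeholder forward model for demonstration only (not physical)."""
    return sH * (1.0 + depth / 1000.0) - sh


depths = [1000.0, 2000.0, 3000.0]
observed = [toy_width(20, 30, d) for d in depths]  # synthetic "observations"
best = grid_search_stresses(depths, observed, toy_width,
                            range(10, 41, 5), range(10, 41, 5))
```

Because both stresses are varied on the same grid pass, the pair is recovered simultaneously, which mirrors the advantage noted in the abstract.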
Walker, Martin; Basáñez, María-Gloria; Ouédraogo, André Lin; Hermsen, Cornelus; Bousema, Teun; Churcher, Thomas S
2015-01-16
Quantitative molecular methods (QMMs) such as quantitative real-time polymerase chain reaction (q-PCR), reverse-transcriptase PCR (qRT-PCR) and quantitative nucleic acid sequence-based amplification (QT-NASBA) are increasingly used to estimate pathogen density in a variety of clinical and epidemiological contexts. These methods are often classified as semi-quantitative, yet estimates of reliability or sensitivity are seldom reported. Here, a statistical framework is developed for assessing the reliability (uncertainty) of pathogen densities estimated using QMMs and the associated diagnostic sensitivity. The method is illustrated with quantification of Plasmodium falciparum gametocytaemia by QT-NASBA. The reliability of pathogen (e.g. gametocyte) densities, and the accompanying diagnostic sensitivity, estimated by two contrasting statistical calibration techniques, are compared; a traditional method and a mixed model Bayesian approach. The latter accounts for statistical dependence of QMM assays run under identical laboratory protocols and permits structural modelling of experimental measurements, allowing precision to vary with pathogen density. Traditional calibration cannot account for inter-assay variability arising from imperfect QMMs and generates estimates of pathogen density that have poor reliability, are variable among assays and inaccurately reflect diagnostic sensitivity. The Bayesian mixed model approach assimilates information from replica QMM assays, improving reliability and inter-assay homogeneity, providing an accurate appraisal of quantitative and diagnostic performance. Bayesian mixed model statistical calibration supersedes traditional techniques in the context of QMM-derived estimates of pathogen density, offering the potential to improve substantially the depth and quality of clinical and epidemiological inference for a wide variety of pathogens.
Statistically Efficient Methods for Pitch and DOA Estimation
DEFF Research Database (Denmark)
Jensen, Jesper Rindom; Christensen, Mads Græsbøll; Jensen, Søren Holdt
2013-01-01
Traditionally, direction-of-arrival (DOA) and pitch estimation of multichannel, periodic sources have been considered as two separate problems. Separate estimation may render the task of resolving sources with similar DOA or pitch impossible, and it may decrease the estimation accuracy. Therefore, it was recently considered to estimate the DOA and pitch jointly. In this paper, we propose two novel methods for DOA and pitch estimation. They both yield maximum-likelihood estimates in white Gaussian noise scenarios, where the SNR may be different across channels, as opposed to state-of-the-art methods...
[Flavouring estimation of quality of grape wines with use of methods of mathematical statistics].
Yakuba, Yu F; Khalaphyan, A A; Temerdashev, Z A; Bessonov, V V; Malinkin, A D
2016-01-01
The formation of an integral flavour estimate for wine during tasting is discussed, and the advantages and disadvantages of the procedures are described. The materials investigated were natural white and red wines from Russian manufacturers, made by traditional technologies from Vitis vinifera and straight hybrids, as well as blended and experimental wines (more than 300 different samples). The aim of the research was to establish, by methods of mathematical statistics, the correlation between the content of nonvolatile matter in a wine and its tasting quality rating. The contents of organic acids, amino acids and cations in the wines were considered the main factors influencing flavour; in essence, they define the quality of the beverage. These components were determined in the wine samples by the electrophoretic method «CAPEL». Alongside the analytical quality control of the wine samples, a representative group of specialists carried out a tasting evaluation using a 100-point system. The possibility of statistically modelling the correlation between the tasting score and analytical data on amino acids and cations that reasonably describe the wine's flavour was examined. The statistical modelling of the correlation between the tasting score and the content of major cations (ammonium, potassium, sodium, magnesium, calcium) and free amino acids (proline, threonine, arginine), taking into account their level of influence on flavour and the analytical valuation within fixed limits of quality accordance, was done with Statistica. Adequate statistical models have been constructed that can predict the tasting score, that is, determine the wine's quality, from the content of the components forming its flavour properties. It is emphasized that along with aromatic (volatile) substances the nonvolatile matter - mineral substances and organic substances - amino acids such as proline, threonine, arginine
Wedemeyer, Gary A.; Nelson, Nancy C.
1975-01-01
Gaussian and nonparametric (percentile estimate and tolerance interval) statistical methods were used to estimate normal ranges for blood chemistry (bicarbonate, bilirubin, calcium, hematocrit, hemoglobin, magnesium, mean cell hemoglobin concentration, osmolality, inorganic phosphorus, and pH) for juvenile rainbow trout (Salmo gairdneri, Shasta strain) held under defined environmental conditions. The percentile estimate and Gaussian methods gave similar normal ranges, whereas the tolerance interval method gave consistently wider ranges for all blood variables except hemoglobin. If the underlying frequency distribution is unknown, the percentile estimate procedure would be the method of choice.
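The difference between the Gaussian and percentile-estimate approaches can be sketched as follows. The abstract does not state the coverage used, so a central 95% range is assumed here (mean plus or minus 1.96 standard deviations for the Gaussian method, and the 2.5th and 97.5th percentiles for the nonparametric one); the tolerance-interval method is not shown. The function names and the sample values are hypothetical.

```python
import statistics


def gaussian_range(values, z=1.96):
    """Normal range as mean +/- z sample standard deviations (assumes normality)."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return mean - z * sd, mean + z * sd


def percentile_range(values):
    """Normal range as the 2.5th and 97.5th percentiles (no distribution assumed)."""
    # quantiles(n=200) returns 199 cut points; cut point k (1-based) is the
    # k/200 quantile, so indices 4 and 194 are the 2.5% and 97.5% points.
    cuts = statistics.quantiles(values, n=200, method="inclusive")
    return cuts[4], cuts[194]


# Hypothetical measurements for illustration only: the integers 1..101.
sample = list(range(1, 102))
g_lo, g_hi = gaussian_range(sample)
p_lo, p_hi = percentile_range(sample)
```

When the data are roughly normal the two ranges should be close; the percentile version stays meaningful for skewed data, which matches the recommendation in the abstract for unknown distributions.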
Directory of Open Access Journals (Sweden)
A. R. Rote
2010-01-01
Three new simple, economical spectrophotometric methods were developed and validated for the estimation of nabumetone in bulk and tablet dosage form. The first method determines nabumetone at its absorption maximum of 330 nm, the second uses the area under the curve in the wavelength range of 326-334 nm, and the third uses first-order derivative spectra with a scaling factor of 4. Beer's law was obeyed in the concentration range of 10-30 μg/mL for all three methods. The correlation coefficients were 0.9997, 0.9998 and 0.9998 for the absorption maximum, area under curve and first-order derivative methods, respectively. The results of the analysis were validated statistically and by recovery studies. The mean percent recoveries were satisfactory for all three methods. The developed methods were also compared statistically using one-way ANOVA. The proposed methods have been successfully applied to the estimation of nabumetone in bulk and pharmaceutical tablet dosage form.
Statistical Methods for Environmental Pollution Monitoring
Energy Technology Data Exchange (ETDEWEB)
Gilbert, Richard O. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States)
1987-01-01
The application of statistics to environmental pollution monitoring studies requires a knowledge of statistical analysis methods particularly well suited to pollution data. This book fills that need by providing sampling plans, statistical tests, parameter estimation procedures, and references to pertinent publications. Most of the statistical techniques are relatively simple, and examples, exercises, and case studies are provided to illustrate procedures. The book is logically divided into three parts. Chapters 1, 2, and 3 are introductory chapters. Chapters 4 through 10 discuss field sampling designs and Chapters 11 through 18 deal with a broad range of statistical analysis procedures. Some statistical techniques given here are not commonly seen in statistics books. For example, see methods for handling correlated data (Sections 4.5 and 11.12), for detecting hot spots (Chapter 10), and for estimating a confidence interval for the mean of a lognormal distribution (Section 13.2). Also, Appendix B lists a computer code that estimates and tests for trends over time at one or more monitoring stations using nonparametric methods (Chapters 16 and 17). Unfortunately, some important topics could not be included because of their complexity and the need to limit the length of the book. For example, only brief mention could be made of time series analysis using Box-Jenkins methods and of kriging techniques for estimating spatial and spatial-time patterns of pollution, although multiple references on these topics are provided. Also, no discussion of methods for assessing risks from environmental pollution could be included.
Statistical methods and materials characterisation
International Nuclear Information System (INIS)
Wallin, K.R.W.
2010-01-01
Statistics is a wide mathematical area, which covers a myriad of analysis and estimation options, some of which suit special cases better than others. A comprehensive coverage of the whole area of statistics would be an enormous effort and would also be outside the capabilities of this author. Therefore, this work is not intended to be a textbook on the statistical methods available for general data analysis and decision making. Instead, it highlights a certain special statistical case applicable to mechanical materials characterization. The methods presented here do not in any way rule out other statistical methods for analyzing mechanical property data of materials. (orig.)
Application of Statistical Methods of Rain Rate Estimation to Data From The TRMM Precipitation Radar
Meneghini, R.; Jones, J. A.; Iguchi, T.; Okamoto, K.; Liao, L.; Busalacchi, Antonio J. (Technical Monitor)
2000-01-01
The TRMM Precipitation Radar is well suited to statistical methods in that the measurements over any given region are sparsely sampled in time. Moreover, the instantaneous rain rate estimates are often of limited accuracy at high rain rates because of attenuation effects and at light rain rates because of receiver sensitivity. For the estimation of the time-averaged rain characteristics over an area, both errors are relevant. By enlarging the space-time region over which the data are collected, the sampling error can be reduced. However, the bias and distortion of the estimated rain distribution generally will remain if estimates at the high and low rain rates are not corrected. In this paper we use the TRMM PR data to investigate the behavior of two statistical methods whose purpose is to estimate the rain rate over large space-time domains. Examination of large-scale rain characteristics provides a useful starting point. The high correlation between the mean and standard deviation of rain rate implies that the conditional distribution of this quantity can be approximated by a one-parameter distribution. This property is used to explore the behavior of the area-time-integral (ATI) methods, where fractional area above a threshold is related to the mean rain rate. In the usual application of the ATI method a correlation is established between these quantities. However, if a particular form of the rain rate distribution is assumed and if the ratio of the mean to standard deviation is known, then not only the mean but the full distribution can be extracted from a measurement of fractional area above a threshold. The second method is an extension of this idea where the distribution is estimated from data over a range of rain rates chosen in an intermediate range where the effects of attenuation and poor sensitivity can be neglected. The advantage of estimating the distribution itself rather than the mean value is that it yields the fraction of rain contributed by
2016-08-17
Specialized Finite Set Statistics (FISST)-based Estimation Methods to Enhance Space Situational Awareness in Medium Earth Orbit (MEO) and Geostationary...terms of specialized Geostationary Earth Orbit (GEO) elements to estimate the state of resident space objects in the geostationary regime. Justification... Report AFRL-RV-PS-TR-2016-0114.
Wang, Yikai; Kang, Jian; Kemmer, Phebe B; Guo, Ying
2016-01-01
Network-oriented analysis of fMRI data has become an important tool for understanding brain organization and brain networks. Among the range of network modeling methods, partial correlation has shown great promise in accurately detecting true brain network connections. However, the application of partial correlation in investigating brain connectivity, especially in large-scale brain networks, has been limited so far due to the technical challenges in its estimation. In this paper, we propose an efficient and reliable statistical method for estimating partial correlation in large-scale brain network modeling. Our method derives partial correlation based on the precision matrix estimated via the Constrained L1-minimization Approach (CLIME), a recently developed statistical method that is more efficient and demonstrates better performance than the existing methods. To help select an appropriate tuning parameter for sparsity control in the network estimation, we propose a new Dens-based selection method that provides a more informative and flexible tool, allowing users to select the tuning parameter based on the desired sparsity level. Another appealing feature of the Dens-based method is that it is much faster than the existing methods, which provides an important advantage in neuroimaging applications. Simulation studies show that the Dens-based method demonstrates comparable or better performance with respect to the existing methods in network estimation. We applied the proposed partial correlation method to investigate resting state functional connectivity using rs-fMRI data from the Philadelphia Neurodevelopmental Cohort (PNC) study. Our results show that partial correlation analysis removed a considerable number of between-module marginal connections identified by full correlation analysis, suggesting these connections were likely caused by global effects or common connection to other nodes. Based on partial correlation, we find that the most significant
ASYMPTOTIC COMPARISONS OF U-STATISTICS, V-STATISTICS AND LIMITS OF BAYES ESTIMATES BY DEFICIENCIES
Toshifumi, Nomachi; Hajime, Yamato; Graduate School of Science and Engineering, Kagoshima University:Miyakonojo College of Technology; Faculty of Science, Kagoshima University
2001-01-01
As estimators of estimable parameters, we consider three statistics: the U-statistic, the V-statistic and the limit of Bayes estimate. This limit of Bayes estimate, called the LB-statistic in this paper, is obtained from the Bayes estimate of an estimable parameter based on the Dirichlet process, by letting its parameter tend to zero. For the estimable parameter with non-degenerate kernel, the asymptotic relative efficiencies of LB-statistic with respect to U-statistic and V-statistic and that of V-statistic w...
Directory of Open Access Journals (Sweden)
Pedro L. Valencia
2017-04-01
We provide initial rate data from enzymatic reaction experiments and its processing to estimate the kinetic parameters of the substrate uncompetitive inhibition equation using the median method published by Eisenthal and Cornish-Bowden (Cornish-Bowden and Eisenthal, 1974; Eisenthal and Cornish-Bowden, 1974). The method was named the direct linear plot and consists of calculating the median from a dataset of the kinetic parameters Vmax and Km of the Michaelis–Menten equation. Here we present the procedure for applying the direct linear plot to the substrate uncompetitive inhibition equation, a three-parameter equation. The median method is characterized by its robustness and its insensitivity to outliers. The calculations are presented in an Excel datasheet, and a computational algorithm was developed in the free software Python. The kinetic parameters of the substrate uncompetitive inhibition equation, Vmax, Km and Ks, were calculated using three experimental points at a time from a dataset of 13 experimental points. All 286 combinations were calculated. The dataset of kinetic parameters resulting from these combinations was used to calculate the median, which serves as the statistical estimator of the real kinetic parameters. A comparative statistical analysis of the median method and least squares was published in Valencia et al. [3].
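The median procedure described in this abstract can be illustrated with a short Python sketch. It is not the authors' published algorithm: it assumes the rate law v = Vmax·s / (Km + s + s²/Ks), which is a common form of the substrate inhibition equation, and the function names `rate` and `median_estimates` are hypothetical. Rearranging the assumed rate law gives an equation that is linear in (Vmax, Km, 1/Ks), so every triple of experimental points yields one 3×3 linear system, and 13 points give C(13, 3) = 286 triples.

```python
from itertools import combinations

import numpy as np


def rate(s, vmax, km, ks):
    """Assumed substrate-inhibition rate law: v = Vmax*s / (Km + s + s^2/Ks)."""
    return vmax * s / (km + s + s**2 / ks)


def median_estimates(s, v):
    """Return the medians of (Vmax, Km, Ks) over all triples of points.

    The assumed rate law rearranges to a form linear in (Vmax, Km, 1/Ks):
        Vmax*s - Km*v - (1/Ks)*v*s^2 = v*s
    so each triple of (s, v) observations gives one 3x3 linear system.
    """
    s = np.asarray(s, dtype=float)
    v = np.asarray(v, dtype=float)
    estimates = []
    for triple in combinations(range(len(s)), 3):
        idx = list(triple)
        a = np.column_stack([s[idx], -v[idx], -v[idx] * s[idx] ** 2])
        b = v[idx] * s[idx]
        try:
            vmax, km, inv_ks = np.linalg.solve(a, b)
        except np.linalg.LinAlgError:
            continue  # skip triples that give a singular system
        if inv_ks != 0.0:
            estimates.append((vmax, km, 1.0 / inv_ks))
    estimates = np.array(estimates)
    vmax_med, km_med, ks_med = np.median(estimates, axis=0)
    return vmax_med, km_med, ks_med, len(estimates)
```

Calling `median_estimates` on 13 noise-free points generated with `rate` should return the generating parameters together with a count of 286 solved triples. With noisy data the per-triple estimates scatter, and the median is what gives the method its resistance to outliers.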
Estimation of global network statistics from incomplete data.
Directory of Open Access Journals (Sweden)
Catherine A Bliss
Complex networks underlie an enormous variety of social, biological, physical, and virtual systems. A profound complication for the science of complex networks is that in most cases, observing all nodes and all network interactions is impossible. Previous work addressing the impacts of partial network data is surprisingly limited, focuses primarily on missing nodes, and suggests that network statistics derived from subsampled data are not suitable estimators for the same network statistics describing the overall network topology. We generate scaling methods to predict true network statistics, including the degree distribution, from only partial knowledge of nodes, links, or weights. Our methods are transparent and do not assume a known generating process for the network, thus enabling prediction of network statistics for a wide variety of applications. We validate analytical results on four simulated network classes and empirical data sets of various sizes. We perform subsampling experiments by varying proportions of sampled data and demonstrate that our scaling methods can provide very good estimates of true network statistics while acknowledging limits. Lastly, we apply our techniques to a set of rich and evolving large-scale social networks, Twitter reply networks. Based on 100 million tweets, we use our scaling techniques to propose a statistical characterization of the Twitter Interactome from September 2008 to November 2008. Our treatment allows us to find support for Dunbar's hypothesis in detecting an upper threshold for the number of active social contacts that individuals maintain over the course of one week.
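As a rough illustration of what a scaling estimator can look like, the sketch below rescales the edge count of a node-induced subsample to estimate the edge count of the full network. It is an assumption-laden example, not the estimators derived in the paper: it assumes nodes are sampled uniformly without replacement, in which case each edge survives with probability n(n-1)/(N(N-1)), and all function names are hypothetical.

```python
import random


def induced_edge_count(edges, sampled_nodes):
    """Count the edges whose two endpoints both fall in the node sample."""
    sampled = set(sampled_nodes)
    return sum(1 for u, v in edges if u in sampled and v in sampled)


def scaled_edge_estimate(edges, n_total, sampled_nodes):
    """Estimate the total edge count from a node-induced subsample.

    Assumes uniform node sampling without replacement, so that an edge is
    observed with probability n(n-1) / (N(N-1)); dividing the observed
    count by that probability gives an unbiased estimate.
    """
    n = len(set(sampled_nodes))
    if n < 2:
        raise ValueError("need at least two sampled nodes")
    observed = induced_edge_count(edges, sampled_nodes)
    return observed * n_total * (n_total - 1) / (n * (n - 1))


def sample_nodes(n_total, fraction, seed=0):
    """Draw a uniform node sample of the requested fraction (at least 2 nodes)."""
    rng = random.Random(seed)
    n = max(2, round(fraction * n_total))
    return rng.sample(range(n_total), n)
```

For a complete graph the estimate is exact for any sample, which makes a convenient sanity check. For sparse or heterogeneous networks a single estimate can vary considerably at small sampling fractions.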
Emoto, K.; Saito, T.; Shiomi, K.
2017-12-01
Short-period (2 s) seismograms. We found that the energy of the coda of long-period seismograms shows a spatially flat distribution. This phenomenon is well known in short-period seismograms and results from the scattering by small-scale heterogeneities. We estimate the statistical parameters that characterize the small-scale random heterogeneity by modelling the spatiotemporal energy distribution of long-period seismograms. We analyse three moderate-size earthquakes that occurred in southwest Japan. We calculate the spatial distribution of the energy density recorded by a dense seismograph network in Japan at the period bands of 8-16 s, 4-8 s and 2-4 s and model them by using 3-D finite difference (FD) simulations. Compared to conventional methods based on statistical theories, we can calculate more realistic synthetics by using the FD simulation. It is not necessary to assume a uniform background velocity, body or surface waves and scattering properties considered in general scattering theories. By taking the ratio of the energy of the coda area to that of the entire area, we can separately estimate the scattering and the intrinsic absorption effects. Our result reveals the spectrum of the random inhomogeneity in a wide wavenumber range including the intensity around the corner wavenumber as P(m) = 8πε²a³/(1 + a²m²)², where ε = 0.05 and a = 3.1 km, even though past studies analysing higher-frequency records could not detect the corner. Finally, we estimate the intrinsic attenuation by modelling the decay rate of the energy. The method proposed in this study is suitable for quantifying the statistical properties of long-wavelength subsurface random inhomogeneity, which leads the way to characterizing a wider wavenumber range of spectra, including the corner wavenumber.
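The spectrum quoted in this abstract is easy to evaluate numerically. The sketch below simply codes P(m) = 8πε²a³/(1 + a²m²)² with the reported values ε = 0.05 and a = 3.1 km as defaults. The function name `psdf` and the choice of units (a in km, m in 1/km) are assumptions made for illustration.

```python
import math


def psdf(m, eps=0.05, a=3.1):
    """Power spectral density P(m) = 8*pi*eps^2*a^3 / (1 + a^2*m^2)^2.

    m   : wavenumber (assumed here to be in 1/km)
    eps : RMS fractional fluctuation
    a   : correlation distance (assumed here to be in km)
    """
    return 8.0 * math.pi * eps**2 * a**3 / (1.0 + (a * m) ** 2) ** 2


def corner_wavenumber(a=3.1):
    """Wavenumber 1/a at which the spectrum bends from flat to power-law decay."""
    return 1.0 / a
```

Two properties follow directly from the formula: at the corner wavenumber m = 1/a the spectrum has dropped to one quarter of its value at m = 0, and for m much larger than 1/a it decays as m⁻⁴.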
Statistical Methods for Particle Physics (4/4)
CERN. Geneva
2012-01-01
The series of four lectures will introduce some of the important statistical methods used in Particle Physics, and should be particularly relevant to those involved in the analysis of LHC data. The lectures will include an introduction to statistical tests, parameter estimation, and the application of these tools to searches for new phenomena. Both frequentist and Bayesian methods will be described, with particular emphasis on treatment of systematic uncertainties. The lectures will also cover unfolding, that is, estimation of a distribution in binned form where the variable in question is subject to measurement errors.
Statistical Methods for Particle Physics (1/4)
CERN. Geneva
2012-01-01
The series of four lectures will introduce some of the important statistical methods used in Particle Physics, and should be particularly relevant to those involved in the analysis of LHC data. The lectures will include an introduction to statistical tests, parameter estimation, and the application of these tools to searches for new phenomena. Both frequentist and Bayesian methods will be described, with particular emphasis on treatment of systematic uncertainties. The lectures will also cover unfolding, that is, estimation of a distribution in binned form where the variable in question is subject to measurement errors.
Statistical Methods for Particle Physics (2/4)
CERN. Geneva
2012-01-01
The series of four lectures will introduce some of the important statistical methods used in Particle Physics, and should be particularly relevant to those involved in the analysis of LHC data. The lectures will include an introduction to statistical tests, parameter estimation, and the application of these tools to searches for new phenomena. Both frequentist and Bayesian methods will be described, with particular emphasis on treatment of systematic uncertainties. The lectures will also cover unfolding, that is, estimation of a distribution in binned form where the variable in question is subject to measurement errors.
Statistical Methods for Particle Physics (3/4)
CERN. Geneva
2012-01-01
The series of four lectures will introduce some of the important statistical methods used in Particle Physics, and should be particularly relevant to those involved in the analysis of LHC data. The lectures will include an introduction to statistical tests, parameter estimation, and the application of these tools to searches for new phenomena. Both frequentist and Bayesian methods will be described, with particular emphasis on treatment of systematic uncertainties. The lectures will also cover unfolding, that is, estimation of a distribution in binned form where the variable in question is subject to measurement errors.
Workshop on Analytical Methods in Statistics
Jurečková, Jana; Maciak, Matúš; Pešta, Michal
2017-01-01
This volume collects authoritative contributions on analytical methods and mathematical statistics. The methods presented include resampling techniques; the minimization of divergence; estimation theory and regression, possibly under shape or other constraints or long memory; and iterative approximations when the optimal solution is difficult to achieve. It also investigates probability distributions with respect to their stability, heavy-tailedness, Fisher information and other aspects, both asymptotically and non-asymptotically. The book not only presents the latest mathematical and statistical methods and their extensions, but also offers solutions to real-world problems including option pricing. The selected, peer-reviewed contributions were originally presented at the workshop on Analytical Methods in Statistics, AMISTAT 2015, held in Prague, Czech Republic, November 10-13, 2015.
International Nuclear Information System (INIS)
Espinoza-Ojeda, O M; Santoyo, E; Andaverde, J
2011-01-01
Approximate and rigorous solutions of seven heat transfer models were statistically examined, for the first time, to estimate stabilized formation temperatures (SFT) of geothermal and petroleum boreholes. Constant linear and cylindrical heat source models were used to describe the heat flow (either conductive or conductive/convective) involved during a borehole drilling. A comprehensive statistical assessment of the major error sources associated with the use of these models was carried out. The mathematical methods (based on approximate and rigorous solutions of heat transfer models) were thoroughly examined by using four statistical analyses: (i) the use of linear and quadratic regression models to infer the SFT; (ii) the application of statistical tests of linearity to evaluate the actual relationship between bottom-hole temperatures and time function data for each selected method; (iii) the comparative analysis of SFT estimates between the approximate and rigorous predictions of each analytical method using a β ratio parameter to evaluate the similarity of both solutions, and (iv) the evaluation of accuracy in each method using statistical tests of significance, and deviation percentages between 'true' formation temperatures and SFT estimates (predicted from approximate and rigorous solutions). The present study also enabled us to determine the sensitivity parameters that should be considered for a reliable calculation of SFT, as well as to define the main physical and mathematical constraints where the approximate and rigorous methods could provide consistent SFT estimates
Robust statistical methods with R
Jureckova, Jana
2005-01-01
Robust statistical methods were developed to supplement the classical procedures when the data violate classical assumptions. They are ideally suited to applied research across a broad spectrum of study, yet most books on the subject are narrowly focused, overly theoretical, or simply outdated. Robust Statistical Methods with R provides a systematic treatment of robust procedures with an emphasis on practical application. The authors work from underlying mathematical tools to implementation, paying special attention to the computational aspects. They cover the whole range of robust methods, including differentiable statistical functions, distance of measures, influence functions, and asymptotic distributions, in a rigorous yet approachable manner. Highlighting hands-on problem solving, many examples and computational algorithms using the R software supplement the discussion. The book examines the characteristics of robustness, estimators of a real parameter, large sample properties, and goodness-of-fit tests. It...
Pattern statistics on Markov chains and sensitivity to parameter estimation
Directory of Open Access Journals (Sweden)
Nuel Grégory
2006-10-01
Background: In order to compute pattern statistics in computational biology, a Markov model is commonly used to take into account the sequence composition. Usually its parameters must be estimated. The aim of this paper is to determine how sensitive these statistics are to parameter estimation, and what the consequences of this variability are for pattern studies (finding the most over-represented words in a genome, the most significant common words to a set of sequences, ...). Results: In the particular case where pattern statistics (overlap counting only) are computed through binomial approximations, we use the delta method to give an explicit expression of σ, the standard deviation of a pattern statistic. This result is validated using simulations, and a simple pattern study is also considered. Conclusion: We establish that the use of a high-order Markov model could easily lead to major mistakes due to the high sensitivity of pattern statistics to parameter estimation.
Directory of Open Access Journals (Sweden)
Mamykin A. V.
2017-10-01
The authors propose a method for determining the electro-physical characteristics of electrical insulating liquids, using different types of gasoline as an example. The method is based on spectral impedance measurements of a capacitor electrochemical cell filled with the liquid under study. Applying a sinusoidal test voltage in the frequency range of 0.1-10 Hz provides more accurate measurements than known traditional methods. A portable device for measuring the total electrical resistance (impedance) of dielectric liquids was designed and constructed. An approach for express estimation of the octane number of automobile gasoline using spectroimpedance measurements and multivariate statistical methods of data analysis has been proposed and tested.
Academic Training Lecture: Statistical Methods for Particle Physics
PH Department
2012-01-01
2, 3, 4 and 5 April 2012. Academic Training Lecture, Regular Programme, from 11:00 to 12:00, Bldg. 222-R-001 (Filtration Plant). Statistical Methods for Particle Physics, by Glen Cowan (Royal Holloway). The series of four lectures will introduce some of the important statistical methods used in Particle Physics, and should be particularly relevant to those involved in the analysis of LHC data. The lectures will include an introduction to statistical tests, parameter estimation, and the application of these tools to searches for new phenomena. Both frequentist and Bayesian methods will be described, with particular emphasis on treatment of systematic uncertainties. The lectures will also cover unfolding, that is, estimation of a distribution in binned form where the variable in question is subject to measurement errors.
On estimating perturbative coefficients in quantum field theory and statistical physics
International Nuclear Information System (INIS)
Samuel, M.A.; Stanford Univ., CA
1994-05-01
The authors present a method for estimating perturbative coefficients in quantum field theory and statistical physics. They are able to obtain reliable error bars for each estimate. The results, in all cases, are excellent.
Statistically and Computationally Efficient Estimating Equations for Large Spatial Datasets
Sun, Ying; Stein, Michael L.
2014-01-01
For Gaussian process models, likelihood-based methods are often difficult to use with large irregularly spaced spatial datasets, because exact calculations of the likelihood for n observations require O(n³) operations and O(n²) memory. Various approximation methods have been developed to address the computational difficulties. In this paper, we propose new unbiased estimating equations based on score equation approximations that are both computationally and statistically efficient. We replace the inverse covariance matrix that appears in the score equations by a sparse matrix to approximate the quadratic forms, then set the resulting quadratic forms equal to their expected values to obtain unbiased estimating equations. The sparse matrix is constructed by a sparse inverse Cholesky approach to approximate the inverse covariance matrix. The statistical efficiency of the resulting unbiased estimating equations is evaluated both in theory and by numerical studies. Our methods are applied to nearly 90,000 satellite-based measurements of water vapor levels over a region in the Southeast Pacific Ocean.
Statistically and Computationally Efficient Estimating Equations for Large Spatial Datasets
Sun, Ying
2014-11-07
For Gaussian process models, likelihood-based methods are often difficult to use with large irregularly spaced spatial datasets, because exact calculations of the likelihood for n observations require O(n³) operations and O(n²) memory. Various approximation methods have been developed to address the computational difficulties. In this paper, we propose new unbiased estimating equations based on score equation approximations that are both computationally and statistically efficient. We replace the inverse covariance matrix that appears in the score equations by a sparse matrix to approximate the quadratic forms, then set the resulting quadratic forms equal to their expected values to obtain unbiased estimating equations. The sparse matrix is constructed by a sparse inverse Cholesky approach to approximate the inverse covariance matrix. The statistical efficiency of the resulting unbiased estimating equations is evaluated both in theory and by numerical studies. Our methods are applied to nearly 90,000 satellite-based measurements of water vapor levels over a region in the Southeast Pacific Ocean.
Advanced statistical methods in data science
Chen, Jiahua; Lu, Xuewen; Yi, Grace; Yu, Hao
2016-01-01
This book gathers invited presentations from the 2nd Symposium of the ICSA-CANADA Chapter held at the University of Calgary from August 4-6, 2015. The aim of this Symposium was to promote advanced statistical methods in big-data sciences, to allow researchers to exchange ideas on statistics and data science, and to embrace the challenges and opportunities of statistics and data science in the modern world. It addresses diverse themes in advanced statistical analysis in big-data sciences, including methods for administrative data analysis, survival data analysis, missing data analysis, high-dimensional and genetic data analysis, longitudinal and functional data analysis, the design and analysis of studies with response-dependent and multi-phase designs, time series and robust statistics, and statistical inference based on likelihood, empirical likelihood and estimating functions. The editorial group selected 14 high-quality presentations from this successful symposium and invited the presenters to prepare a fu...
Switzer, P.; Harden, J.W.; Mark, R.K.
1988-01-01
A statistical method for estimating rates of soil development in a given region based on calibration from a series of dated soils is used to estimate ages of soils in the same region that are not dated directly. The method is designed specifically to account for sampling procedures and uncertainties that are inherent in soil studies. Soil variation and measurement error, uncertainties in calibration dates and their relation to the age of the soil, and the limited number of dated soils are all considered. Maximum likelihood (ML) is employed to estimate a parametric linear calibration curve, relating soil development to time or age on suitably transformed scales. Soil variation on a geomorphic surface of a certain age is characterized by replicate sampling of soils on each surface; such variation is assumed to have a Gaussian distribution. The age of a geomorphic surface is described by older and younger bounds. This technique allows age uncertainty to be characterized by either a Gaussian distribution or by a triangular distribution using minimum, best-estimate, and maximum ages. The calibration curve is taken to be linear after suitable (in certain cases logarithmic) transformations, if required, of the soil parameter and age variables. Soil variability, measurement error, and departures from linearity are described in a combined fashion using Gaussian distributions with variances particular to each sampled geomorphic surface and the number of sample replicates. Uncertainty in age of a geomorphic surface used for calibration is described using three parameters by one of two methods. In the first method, upper and lower ages are specified together with a coverage probability; this specification is converted to a Gaussian distribution with the appropriate mean and variance. In the second method, "absolute" older and younger ages are specified together with a most probable age; this specification is converted to an asymmetric triangular distribution with mode at the
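The two ways of specifying age uncertainty described in the abstract above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the authors' procedure: the first function assumes the younger and older bounds form a symmetric central interval of the Gaussian with the given coverage probability, the second uses the standard mean and variance formulas for a triangular distribution, and both function names are hypothetical.

```python
from statistics import NormalDist


def gaussian_from_bounds(younger, older, coverage):
    """Convert age bounds plus a coverage probability to (mean, variance).

    Assumes [younger, older] is a symmetric central interval of a Gaussian
    that contains the true age with probability `coverage`.
    """
    if not 0.0 < coverage < 1.0:
        raise ValueError("coverage must lie strictly between 0 and 1")
    if older <= younger:
        raise ValueError("older bound must exceed younger bound")
    z = NormalDist().inv_cdf(0.5 + coverage / 2.0)  # two-sided quantile
    mean = 0.5 * (younger + older)
    sd = (older - younger) / (2.0 * z)
    return mean, sd**2


def triangular_moments(youngest, most_probable, oldest):
    """Mean and variance of a triangular distribution (min, mode, max)."""
    a, c, b = youngest, most_probable, oldest
    if not a <= c <= b or a == b:
        raise ValueError("require youngest <= most_probable <= oldest")
    mean = (a + b + c) / 3.0
    variance = (a * a + b * b + c * c - a * b - a * c - b * c) / 18.0
    return mean, variance
```

For example, bounds of 8,000 and 12,000 years with a coverage probability of 0.95 give a mean of 10,000 years and a standard deviation of roughly 1,020 years under the symmetric-interval assumption.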
Directory of Open Access Journals (Sweden)
Brion Philippe
2015-12-01
Using as much administrative data as possible is a general trend among most national statistical institutes. Different kinds of administrative sources, from tax authorities or other administrative bodies, are very helpful material in the production of business statistics. However, these sources often have to be completed by information collected through statistical surveys. This article describes the way Insee has implemented such a strategy in order to produce French structural business statistics. The originality of the French procedure is that administrative and survey variables are used jointly for the same enterprises, unlike the majority of multisource systems, in which the two kinds of sources generally complement each other for different categories of units. The idea is to use, as much as possible, the richness of the administrative sources combined with the timeliness of a survey, even if the latter is conducted only on a sample of enterprises. One main issue is the classification of enterprises within the NACE nomenclature, which is a cornerstone variable in producing the breakdown of the results by industry. At a given date, two values of the corresponding code may coexist: the value of the register, not necessarily up to date, and the value resulting from the data collected via the survey, but only from a sample of enterprises. Using all this information together requires the implementation of specific statistical estimators combining some properties of the difference estimators with calibration techniques. This article presents these estimators, as well as their statistical properties, and compares them with those of other methods.
Statistical inference for remote sensing-based estimates of net deforestation
Ronald E. McRoberts; Brian F. Walters
2012-01-01
Statistical inference requires expression of an estimate in probabilistic terms, usually in the form of a confidence interval. An approach to constructing confidence intervals for remote sensing-based estimates of net deforestation is illustrated. The approach is based on post-classification methods using two independent forest/non-forest classifications because...
International Nuclear Information System (INIS)
Kang, Won-Hee; Kliese, Alyce
2014-01-01
Lifeline networks, such as transportation, water supply, sewers, telecommunications, and electrical and gas networks, are essential elements for the economic and societal functions of urban areas, but their components are highly susceptible to natural or man-made hazards. In this context, it is essential to provide effective pre-disaster hazard mitigation strategies and prompt post-disaster risk management efforts based on rapid system reliability assessment. This paper proposes a rapid reliability estimation method for node-pair connectivity analysis of lifeline networks, especially when the network components are statistically correlated. Recursive procedures are proposed to compound all network nodes until they become a single super node representing the connectivity between the origin and destination nodes. The proposed method is applied to numerical network examples and benchmark interconnected power and water networks in Memphis, Shelby County. The connectivity analysis results show the proposed method's reasonable accuracy and remarkable efficiency as compared to Monte Carlo simulations.
Estimation of social value of statistical life using willingness-to-pay method in Nanjing, China.
Yang, Zhao; Liu, Pan; Xu, Xin
2016-10-01
Rational decision making regarding safety-related investment programs greatly depends on the economic valuation of traffic crashes. The primary objective of this study was to estimate the social value of statistical life in the city of Nanjing in China. A stated preference survey was conducted to investigate travelers' willingness to pay for traffic risk reduction. Face-to-face interviews were conducted at stations, shopping centers, schools, and parks in different districts in the urban area of Nanjing. The respondents were categorized into two groups, motorists and non-motorists. Both the binary logit model and the mixed logit model were developed for the two groups of people. The results revealed that the mixed logit model is superior to the fixed coefficient binary logit model. The factors that significantly affect people's willingness to pay for risk reduction include income, education, gender, age, drive age (for motorists), occupation, whether the charged fees were used to improve private vehicle equipment (for motorists), reduction in fatality rate, and change in travel cost. The Monte Carlo simulation method was used to generate the distribution of value of statistical life (VSL). Based on the mixed logit model, the VSL had a mean value of 3,729,493 RMB ($586,610) with a standard deviation of 2,181,592 RMB ($343,142) for motorists; and a mean of 3,281,283 RMB ($505,318) with a standard deviation of 2,376,975 RMB ($366,054) for non-motorists. Using the tax system to illustrate the contribution of different income groups to social funds, the social value of statistical life was estimated. The average social value of statistical life was found to be 7,184,406 RMB ($1,130,032).
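A Monte Carlo step of the kind mentioned in this abstract can be sketched as follows. The sketch is not the study's model: it assumes a utility that is linear in fatality risk and travel cost, a normally distributed risk coefficient, and a fixed cost coefficient, so that each simulated VSL is the ratio of the two coefficients. The function name and every coefficient value passed to it are hypothetical.

```python
import random
from statistics import fmean, pstdev


def simulate_vsl(beta_risk_mean, beta_risk_sd, beta_cost, n_draws=10000, seed=0):
    """Simulate a VSL distribution as beta_risk / beta_cost.

    Assumes (for illustration only) a random, normally distributed risk
    coefficient and a fixed, non-zero cost coefficient. With risk measured
    as a fatality probability and cost in RMB, the ratio is in RMB per
    statistical life. Returns (mean, standard deviation) of the draws.
    """
    if beta_cost == 0:
        raise ValueError("cost coefficient must be non-zero")
    rng = random.Random(seed)
    draws = [rng.gauss(beta_risk_mean, beta_risk_sd) / beta_cost
             for _ in range(n_draws)]
    return fmean(draws), pstdev(draws)
```

With a hypothetical risk coefficient of -30,000 and a cost coefficient of -0.01, the ratio is 3,000,000 RMB; adding a non-zero standard deviation to the risk coefficient spreads the simulated values around that figure.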
Understanding advanced statistical methods
Westfall, Peter
2013-01-01
Introduction: Probability, Statistics, and Science; Reality, Nature, Science, and Models; Statistical Processes: Nature, Design and Measurement, and Data; Models; Deterministic Models; Variability; Parameters; Purely Probabilistic Statistical Models; Statistical Models with Both Deterministic and Probabilistic Components; Statistical Inference; Good and Bad Models; Uses of Probability Models; Random Variables and Their Probability Distributions; Introduction; Types of Random Variables: Nominal, Ordinal, and Continuous; Discrete Probability Distribution Functions; Continuous Probability Distribution Functions; Some Calculus-Derivatives and Least Squares; More Calculus-Integrals and Cumulative Distribution Functions; Probability Calculation and Simulation; Introduction; Analytic Calculations, Discrete and Continuous Cases; Simulation-Based Approximation; Generating Random Numbers; Identifying Distributions; Introduction; Identifying Distributions from Theory Alone; Using Data: Estimating Distributions via the Histogram; Quantiles: Theoretical and Data-Based Estimate...
Statistical estimation of process holdup
International Nuclear Information System (INIS)
Harris, S.P.
1988-01-01
Estimates of potential process holdup and their random and systematic error variances are derived to improve the inventory difference (ID) estimate and its associated measure of uncertainty for a new process at the Savannah River Plant. Since the process is in a start-up phase, data have not yet accumulated for statistical modelling. The material produced in the facility will be a very pure, highly enriched 235U with very small isotopic variability. Therefore, data published in LANL's unclassified report on Estimation Methods for Process Holdup of a Special Nuclear Materials was used as a starting point for the modelling process. LANL's data were gathered through a series of designed measurements of special nuclear material (SNM) holdup at two of their materials-processing facilities. Also, they had taken steps to improve the quality of data through controlled, larger scale, experiments outside of LANL at highly enriched uranium processing facilities. The data they have accumulated are on an equipment component basis. Our modelling has been restricted to the wet chemistry area. We have developed predictive models for each of our process components based on the LANL data. 43 figs
Statistical Model-Based Face Pose Estimation
Institute of Scientific and Technical Information of China (English)
GE Xinliang; YANG Jie; LI Feng; WANG Huahua
2007-01-01
A robust face pose estimation approach is proposed that uses a statistical model of face shape and represents the pose parameters by trigonometric functions. The face shape statistical model is first built by analyzing the face shapes of different people under varying poses. Shape alignment is vital in the process of building the statistical model. Then, six trigonometric functions are employed to represent the face pose parameters. Lastly, the mapping function between face image and face pose is constructed by linearly relating different parameters. The proposed approach is able to estimate different face poses using a few face training samples. Experimental results are provided to demonstrate its efficiency and accuracy.
A new method to determine the number of experimental data using statistical modeling methods
Energy Technology Data Exchange (ETDEWEB)
Jung, Jung-Ho; Kang, Young-Jin; Lim, O-Kaung; Noh, Yoojeong [Pusan National University, Busan (Korea, Republic of)
2017-06-15
For analyzing the statistical performance of physical systems, statistical characteristics of physical parameters such as material properties need to be estimated by collecting experimental data. For accurate statistical modeling, many such experiments may be required, but data are usually quite limited owing to the cost and time constraints of experiments. In this study, a new method for determining a reasonable number of experimental data is proposed using an area metric, after obtaining statistical models using the information on the underlying distribution, the Sequential statistical modeling (SSM) approach, and the Kernel density estimation (KDE) approach. The area metric is used as a convergence criterion to determine the necessary and sufficient number of experimental data to be acquired. The proposed method is validated in simulations, using different statistical modeling methods, different true models, and different convergence criteria. An example data set with 29 data describing the fatigue strength coefficient of SAE 950X is used for demonstrating the performance of the obtained statistical models that use a pre-determined number of experimental data in predicting the probability of failure for a target fatigue life.
Estimating the thickness of ultra thin sections for electron microscopy by image statistics
DEFF Research Database (Denmark)
Sporring, Jon; Khanmohammadi, Mahdieh; Darkner, Sune
2014-01-01
We propose a method for estimating the thickness of ultra thin histological sections by image statistics alone. Our method works for images, that are the realisations of a stationary and isotropic stochastic process, and it relies on the existence of statistical image-measures that are strictly m...
Szulc, Stefan
1965-01-01
Statistical Methods provides a discussion of the principles of the organization and technique of research, with emphasis on its application to the problems in social statistics. This book discusses branch statistics, which aims to develop practical ways of collecting and processing numerical data and to adapt general statistical methods to the objectives in a given field. Organized into five parts encompassing 22 chapters, this book begins with an overview of how to organize the collection of such information on individual units, primarily as accomplished by government agencies. This text then
Statistical delay estimation in digital circuits using VHDL
Directory of Open Access Journals (Sweden)
Milić Miljana Lj.
2014-01-01
The most important feature of a modern integrated circuit is its speed, which depends on the circuit's delay. For the design of high-speed digital circuits, it is necessary to evaluate delays in the earliest stages of design, thus making it easy to modify and redesign a circuit if it is too slow. This paper gives an approach for efficient delay estimation in the description phase of the circuit design. The method can statistically estimate the minimum and maximum delay of all possible paths and signal transitions in the circuit, considering the practical implementation of circuits, and information about the parameters' tolerances. The method uses a VHDL description and is verified on ISCAS85 benchmark circuits. Matlab was used for data processing.
Experimental uncertainty estimation and statistics for data having interval uncertainty.
Energy Technology Data Exchange (ETDEWEB)
Kreinovich, Vladik (Applied Biomathematics, Setauket, New York); Oberkampf, William Louis (Applied Biomathematics, Setauket, New York); Ginzburg, Lev (Applied Biomathematics, Setauket, New York); Ferson, Scott (Applied Biomathematics, Setauket, New York); Hajagos, Janos (Applied Biomathematics, Setauket, New York)
2007-05-01
This report addresses the characterization of measurements that include epistemic uncertainties in the form of intervals. It reviews the application of basic descriptive statistics to data sets which contain intervals rather than exclusively point estimates. It describes algorithms to compute various means, the median and other percentiles, variance, interquartile range, moments, confidence limits, and other important statistics and summarizes the computability of these statistics as a function of sample size and characteristics of the intervals in the data (degree of overlap, size and regularity of widths, etc.). It also reviews the prospects for analyzing such data sets with the methods of inferential statistics such as outlier detection and regressions. The report explores the tradeoff between measurement precision and sample size in statistical results that are sensitive to both. It also argues that an approach based on interval statistics could be a reasonable alternative to current standard methods for evaluating, expressing and propagating measurement uncertainties.
Methods and statistics for combining motif match scores.
Bailey, T L; Gribskov, M
1998-01-01
Position-specific scoring matrices are useful for representing and searching for protein sequence motifs. A sequence family can often be described by a group of one or more motifs, and an effective search must combine the scores for matching a sequence to each of the motifs in the group. We describe three methods for combining match scores and estimating the statistical significance of the combined scores and evaluate the search quality (classification accuracy) and the accuracy of the estimate of statistical significance of each. The three methods are: 1) sum of scores, 2) sum of reduced variates, 3) product of score p-values. We show that method 3) is superior to the other two methods in both regards, and that combining motif scores indeed gives better search accuracy. The MAST sequence homology search algorithm utilizing the product of p-values scoring method is available for interactive use and downloading at URL http://www.sdsc.edu/MEME.
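As an illustration of the third method named in the abstract above, the significance of a product of p-values can be sketched as follows. The sketch assumes the individual p-values are independent and uniformly distributed under the null hypothesis, in which case the probability that the product of n of them is at most p equals p times the sum of (-ln p)^i / i! for i = 0 to n-1. The function name `product_pvalue` is hypothetical and this is not the MAST source code.

```python
import math

def product_pvalue(pvalues):
    """Combined p-value for the product of independent uniform(0, 1) p-values."""
    n = len(pvalues)
    if n == 0:
        raise ValueError("need at least one p-value")
    prod = math.prod(pvalues)
    if prod <= 0.0:
        return 0.0                      # a zero p-value makes the product zero
    x = -math.log(prod)
    term, total = 1.0, 1.0              # i = 0 term of the series
    for i in range(1, n):
        term *= x / i                   # builds x**i / i! incrementally
        total += term
    return prod * total

if __name__ == "__main__":
    # Two moderately small motif p-values combined into one significance value.
    print(product_pvalue([0.01, 0.03]))
```

With a single p-value the function returns that p-value unchanged, which is a convenient sanity check on the series.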
Thermodynamic properties of organic compounds estimation methods, principles and practice
Janz, George J
1967-01-01
Thermodynamic Properties of Organic Compounds: Estimation Methods, Principles and Practice, Revised Edition focuses on the progression of practical methods in computing the thermodynamic characteristics of organic compounds. Divided into two parts with eight chapters, the book concentrates first on the methods of estimation. Topics presented are statistical and combined thermodynamic functions; free energy change and equilibrium conversions; and estimation of thermodynamic properties. The next discussions focus on the thermodynamic properties of simple polyatomic systems by statistical the
International Nuclear Information System (INIS)
Roth, D.J.; Swickard, S.M.; Stang, D.B.; Deguire, M.R.
1990-03-01
A review and statistical analysis of the ultrasonic velocity method for estimating the porosity fraction in polycrystalline materials is presented. Initially, a semi-empirical model is developed showing the origin of the linear relationship between ultrasonic velocity and porosity fraction. Then, from a compilation of data produced by many researchers, scatter plots of velocity versus percent porosity data are shown for Al2O3, MgO, porcelain-based ceramics, PZT, SiC, Si3N4, steel, tungsten, UO2,(U0.30Pu0.70)C, and YBa2Cu3O(7-x). Linear regression analysis produced predicted slope, intercept, correlation coefficient, level of significance, and confidence interval statistics for the data. Velocity values predicted from regression analysis for fully-dense materials are in good agreement with those calculated from elastic properties
Using statistical inference for decision making in best estimate analyses
International Nuclear Information System (INIS)
Sermer, P.; Weaver, K.; Hoppe, F.; Olive, C.; Quach, D.
2008-01-01
For broad classes of safety analysis problems, one needs to make decisions when faced with randomly varying quantities which are also subject to errors. The means for doing this involves a statistical approach which takes into account the nature of the physical problems, and the statistical constraints they impose. We describe the methodology for doing this which has been developed at Nuclear Safety Solutions, and we draw some comparisons to other methods which are commonly used in Canada and internationally. Our methodology has the advantages of being robust and accurate and compares favourably to other best estimate methods. (author)
Statistical method for resolving the photon-photoelectron-counting inversion problem
International Nuclear Information System (INIS)
Wu Jinlong; Li Tiejun; Peng, Xiang; Guo Hong
2011-01-01
A statistical inversion method is proposed for the photon-photoelectron-counting statistics in quantum key distribution experiments. From a statistical viewpoint, this problem is equivalent to parameter estimation for an infinite binomial mixture model. The coarse-graining idea and Bayesian methods are applied to deal with this ill-posed problem, which is a good simple example of the successful application of statistical methods to an inverse problem. Numerical results show the applicability of the proposed strategy. The coarse-graining idea for infinite mixture models should be general enough to be used in future work.
Petersson, K M; Nichols, T E; Poline, J B; Holmes, A P
1999-01-01
Functional neuroimaging (FNI) provides experimental access to the intact living brain making it possible to study higher cognitive functions in humans. In this review and in a companion paper in this issue, we discuss some common methods used to analyse FNI data. The emphasis in both papers is on assumptions and limitations of the methods reviewed. There are several methods available to analyse FNI data indicating that none is optimal for all purposes. In order to make optimal use of the methods available it is important to know the limits of applicability. For the interpretation of FNI results it is also important to take into account the assumptions, approximations and inherent limitations of the methods used. This paper gives a brief overview over some non-inferential descriptive methods and common statistical models used in FNI. Issues relating to the complex problem of model selection are discussed. In general, proper model selection is a necessary prerequisite for the validity of the subsequent statistical inference. The non-inferential section describes methods that, combined with inspection of parameter estimates and other simple measures, can aid in the process of model selection and verification of assumptions. The section on statistical models covers approaches to global normalization and some aspects of univariate, multivariate, and Bayesian models. Finally, approaches to functional connectivity and effective connectivity are discussed. In the companion paper we review issues related to signal detection and statistical inference. PMID:10466149
Statistical estimation of nuclear reactor dynamic parameters
International Nuclear Information System (INIS)
Cummins, J.D.
1962-02-01
This report discusses the study of the noise in nuclear reactors and associated power plant. The report is divided into three distinct parts. In the first part parameters which influence the dynamic behaviour of some reactors will be specified and their effect on dynamic performance described. Methods of estimating dynamic parameters using statistical signals will be described in detail together with descriptions of the usefulness of the results, the accuracy and related topics. Some experiments which have been and which might be performed on nuclear reactors will be described. In the second part of the report a digital computer programme will be described. The computer programme derives the correlation functions and the spectra of signals. The programme will compute the frequency response (both gain and phase) for physical items of plant for which simultaneous recordings of input and output signal variations have been made. Estimations of the accuracy of the correlation functions and the spectra may be computed using the programme and the amplitude distribution of signals may also be computed. The programme is written in autocode for the Ferranti Mercury computer. In the third part of the report a practical example of the use of the method and the digital programme is presented. In order to eliminate difficulties of interpretation a very simple plant model was chosen, i.e. a simple first order lag. Several interesting properties of statistical signals were measured and will be discussed. (author)
Statistical distributions applications and parameter estimates
Thomopoulos, Nick T
2017-01-01
This book gives a description of the group of statistical distributions that have ample application to studies in statistics and probability. Understanding statistical distributions is fundamental for researchers in almost all disciplines. The informed researcher will select the statistical distribution that best fits the data in the study at hand. Some of the distributions are well known to the general researcher and are in use in a wide variety of ways. Other useful distributions are less understood and are not in common use. The book describes when and how to apply each of the distributions in research studies, with a goal to identify the distribution that best applies to the study. The distributions are for continuous, discrete, and bivariate random variables. In most studies, the parameter values are not known a priori, and sample data is needed to estimate parameter values. In other scenarios, no sample data is available, and the researcher seeks some insight that allows the estimate of ...
The application of statistical methods to assess economic assets
Directory of Open Access Journals (Sweden)
D. V. Dianov
2017-01-01
The article is devoted to consideration and evaluation of machinery, equipment and special equipment, methodological aspects of the use of standards for assessment of buildings and structures in current prices, the valuation of residential, specialized houses, office premises, assessment and reassessment of existing and inactive military assets, the application of statistical methods to obtain the relevant cost estimates. The objective of the scientific article is to consider possible application of statistical tools in the valuation of the assets composing the core group of elements of national wealth: the fixed assets. Firstly, capital tangible assets constitute the material base for the creation of new value, products and non-financial services. The accumulated gain in tangible assets of a capital nature is a part of the gross domestic product, and from its volume and specific weight in the composition of GDP we can judge the scope of reproductive processes in the country. Based on the methodological materials of the state statistics bodies of the Russian Federation, regulations of the theory of statistics, which describe the methods of statistical analysis such as the index, average values, regression, the methodical approach is structured in the application of statistical tools to obtain value estimates of property, plant and equipment with significant accumulated depreciation. Until now, the use of statistical methodology in the practice of economic assessment of assets is only fragmentary. This applies to both Federal Legislation (Federal law № 135 «On valuation activities in the Russian Federation» dated 16.07.1998 in edition 05.07.2016) and the methodological documents and regulations of the estimated activities, in particular, the valuation activities’ standards. A particular problem is the use of a digital database of Rosstat (Federal State Statistics Service), as to the specific fixed assets the comparison should be carried
A Method of Nuclear Software Reliability Estimation
International Nuclear Information System (INIS)
Park, Gee Yong; Eom, Heung Seop; Cheon, Se Woo; Jang, Seung Cheol
2011-01-01
A method for estimating software reliability for nuclear safety software is proposed. This method is based on the software reliability growth model (SRGM), where the behavior of software failure is assumed to follow the non-homogeneous Poisson process. Several modeling schemes are presented in order to estimate and predict more precisely the number of software defects based on a small amount of software failure data. Bayesian statistical inference is employed to estimate the model parameters by incorporating the software test cases into the model. It is identified that this method is capable of accurately estimating the remaining number of software defects which are of the on-demand type directly affecting safety trip functions. The software reliability can be estimated from a model equation, and one method of obtaining the software reliability is proposed.
Improved air ventilation rate estimation based on a statistical model
International Nuclear Information System (INIS)
Brabec, M.; Jilek, K.
2004-01-01
A new approach to air ventilation rate estimation from CO measurement data is presented. The approach is based on a state-space dynamic statistical model, allowing for quick and efficient estimation. Underlying computations are based on Kalman filtering, whose practical software implementation is rather easy. The key property is the flexibility of the model, allowing various artificial regimens of CO level manipulation to be treated. The model is semi-parametric in nature and can efficiently handle time-varying ventilation rate. This is a major advantage, compared to some of the methods which are currently in practical use. After a formal introduction of the statistical model, its performance is demonstrated on real data from routine measurements. It is shown how the approach can be utilized in a more complex situation of major practical relevance, when time-varying air ventilation rate and radon entry rate are to be estimated simultaneously from concurrent radon and CO measurements
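The Kalman filtering step mentioned in the abstract above can be illustrated with a much simpler model than the one the authors describe. The sketch below assumes a scalar random-walk ("local level") state observed with additive noise; the function name `kalman_local_level` and the parameters `q` (process noise variance) and `r` (measurement noise variance) are hypothetical, and the gas-balance dynamics of the actual ventilation model are not reproduced.

```python
def kalman_local_level(observations, q, r, x0=0.0, p0=1e6):
    """Filter a scalar random-walk state observed with noise.

    q  : variance of the random-walk step (how fast the state may drift)
    r  : variance of the measurement noise
    x0 : initial state guess, p0 : its variance (large = vague prior)
    """
    x, p = x0, p0
    estimates = []
    for y in observations:
        p = p + q                 # predict: state unchanged, uncertainty grows
        k = p / (p + r)           # Kalman gain
        x = x + k * (y - x)       # update with the innovation y - x
        p = (1.0 - k) * p         # posterior variance
        estimates.append(x)
    return estimates

if __name__ == "__main__":
    noisy = [0.48, 0.55, 0.51, 0.60, 0.58, 0.65, 0.63]   # made-up readings
    print(kalman_local_level(noisy, q=0.001, r=0.01))
```

A nonzero `q` is what lets the filtered estimate follow a time-varying quantity instead of converging to a fixed average.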
International Nuclear Information System (INIS)
Molchan, G.M.; Kronrod, T.L.; Dmitrieva, O.E.
1995-03-01
The catalog of earthquakes of Italy (1900-1993) is analyzed in the present work. The following problems have been considered: 1) a choice of the operating magnitude, 2) an analysis of data completeness, and 3) a grouping (in time and in space). The catalog has been separated into main shocks and aftershocks. Statistical estimations of seismicity parameters (a,b) are performed for the seismogenetic zones defined by GNDT. The non-standard elements of the analysis performed are: (a) statistical estimation and comparison of seismicity parameters under the condition of arbitrary data grouping in magnitude, time and space; (b) use of an unconventional statistical method for the aftershock identification; the method is based on the idea of optimizing two kinds of errors in the aftershock identification process; (c) use of the aftershock zones to reveal seismically interrelated seismogenic zones. This procedure contributes to the stability of the estimation of the "b-value". Refs, 25 figs, tabs
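For orientation, the textbook maximum-likelihood estimate of the b-value can be sketched as below. This is the standard Aki estimator with the usual half-bin correction for binned magnitudes, not the estimator for arbitrarily grouped data developed in the report above; the function name `aki_b_value` and the sample magnitudes are hypothetical.

```python
import math

def aki_b_value(magnitudes, m_c, bin_width=0.0):
    """Aki maximum-likelihood b-value for events with magnitude >= m_c.

    bin_width is the magnitude rounding step; half of it is subtracted
    from m_c to correct for binning (use 0.0 for continuous magnitudes).
    Returns (b, standard_error) with standard_error = b / sqrt(N).
    """
    sample = [m for m in magnitudes if m >= m_c]
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two events above the completeness magnitude")
    mean_mag = sum(sample) / n
    denom = mean_mag - (m_c - bin_width / 2.0)
    if denom <= 0.0:
        raise ValueError("mean magnitude must exceed the corrected threshold")
    b = math.log10(math.e) / denom
    return b, b / math.sqrt(n)

if __name__ == "__main__":
    mags = [3.0, 3.1, 3.1, 3.3, 3.4, 3.6, 3.9, 4.2, 4.8]   # made-up catalog
    print(aki_b_value(mags, m_c=3.0, bin_width=0.1))
```

The estimate is only meaningful above the completeness magnitude, which is why the abstract treats data completeness as a separate problem.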
ESTIMATING RELIABILITY OF DISTURBANCES IN SATELLITE TIME SERIES DATA BASED ON STATISTICAL ANALYSIS
Directory of Open Access Journals (Sweden)
Z.-G. Zhou
2016-06-01
Normally, the status of land cover is inherently dynamic and changes continuously on a temporal scale. However, disturbances or abnormal changes of land cover, caused by events such as forest fire, flood, deforestation, and plant diseases, occur worldwide at unknown times and locations. Timely detection and characterization of these disturbances is of importance for land cover monitoring. Recently, many time-series-analysis methods have been developed for near real-time or online disturbance detection, using satellite image time series. However, the detection results were only labelled with “Change/ No change” by most of the present methods, while few methods focus on estimating the reliability (or confidence level) of the detected disturbances in image time series. To this end, this paper proposes a statistical analysis method for estimating the reliability of disturbances in newly available remote sensing image time series, through analysis of the full temporal information contained in the time series data. The method consists of three main steps. (1) Segmenting and modelling of historical time series data based on Breaks for Additive Seasonal and Trend (BFAST). (2) Forecasting and detecting disturbances in new time series data. (3) Estimating the reliability of each detected disturbance using statistical analysis based on Confidence Interval (CI) and Confidence Levels (CL). The method was validated by estimating the reliability of disturbance regions caused by a recent severe flood that occurred around the border of Russia and China. Results demonstrated that the method can estimate the reliability of disturbances detected in satellite images with an estimation error of less than 5% and an overall accuracy of up to 90%.
Multivariate statistical methods and data mining in particle physics (4/4)
CERN. Geneva
2008-01-01
The lectures will cover multivariate statistical methods and their applications in High Energy Physics. The methods will be viewed in the framework of a statistical test, as used e.g. to discriminate between signal and background events. Topics will include an introduction to the relevant statistical formalism, linear test variables, neural networks, probability density estimation (PDE) methods, kernel-based PDE, decision trees and support vector machines. The methods will be evaluated with respect to criteria relevant to HEP analyses such as statistical power, ease of computation and sensitivity to systematic effects. Simple computer examples that can be extended to more complex analyses will be presented.
Multivariate statistical methods and data mining in particle physics (2/4)
CERN. Geneva
2008-01-01
The lectures will cover multivariate statistical methods and their applications in High Energy Physics. The methods will be viewed in the framework of a statistical test, as used e.g. to discriminate between signal and background events. Topics will include an introduction to the relevant statistical formalism, linear test variables, neural networks, probability density estimation (PDE) methods, kernel-based PDE, decision trees and support vector machines. The methods will be evaluated with respect to criteria relevant to HEP analyses such as statistical power, ease of computation and sensitivity to systematic effects. Simple computer examples that can be extended to more complex analyses will be presented.
Multivariate statistical methods and data mining in particle physics (1/4)
CERN. Geneva
2008-01-01
The lectures will cover multivariate statistical methods and their applications in High Energy Physics. The methods will be viewed in the framework of a statistical test, as used e.g. to discriminate between signal and background events. Topics will include an introduction to the relevant statistical formalism, linear test variables, neural networks, probability density estimation (PDE) methods, kernel-based PDE, decision trees and support vector machines. The methods will be evaluated with respect to criteria relevant to HEP analyses such as statistical power, ease of computation and sensitivity to systematic effects. Simple computer examples that can be extended to more complex analyses will be presented.
Kruschke, John K; Liddell, Torrin M
2018-02-01
In the practice of data analysis, there is a conceptual distinction between hypothesis testing, on the one hand, and estimation with quantified uncertainty on the other. Among frequentists in psychology, a shift of emphasis from hypothesis testing to estimation has been dubbed "the New Statistics" (Cumming 2014). A second conceptual distinction is between frequentist methods and Bayesian methods. Our main goal in this article is to explain how Bayesian methods achieve the goals of the New Statistics better than frequentist methods. The article reviews frequentist and Bayesian approaches to hypothesis testing and to estimation with confidence or credible intervals. The article also describes Bayesian approaches to meta-analysis, randomized controlled trials, and power analysis.
DEFF Research Database (Denmark)
Petersen, JH; Holst, KK; Budtz-Jørgensen, Esben
2010-01-01
The Hubble constant enters big bang cosmology by quantifying the expansion rate of the universe. Existing statistical methods used to estimate Hubble’s constant only partially take into account random measurement errors. As a consequence, estimates of Hubble’s constant are statistically...
Statistics of Monte Carlo methods used in radiation transport calculation
International Nuclear Information System (INIS)
Datta, D.
2009-01-01
Radiation transport calculation can be carried out by using either deterministic or statistical methods. Radiation transport calculation based on statistical methods is the basic theme of the Monte Carlo methods. The aim of this lecture is to describe the fundamental statistics required to build the foundations of the Monte Carlo technique for radiation transport calculation. The lecture note is organized in the following way. Section (1) will describe the introduction of Basic Monte Carlo and its classification towards the respective field. Section (2) will describe the random sampling methods, a key component of Monte Carlo radiation transport calculation. Section (3) will provide the statistical uncertainty of Monte Carlo estimates. Section (4) will describe in brief the importance of variance reduction techniques while sampling particles such as photons or neutrons in the process of radiation transport.
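The two ingredients named in the abstract above, random sampling and the statistical uncertainty of the resulting estimate, can be sketched on a deliberately simple case. The example assumes a purely absorbing slab, so the exact uncollided transmission is exp(-sigma_t * thickness); the function names and the numerical values are hypothetical and not taken from the lecture.

```python
import math
import random

def mean_and_stderr(scores):
    """Return the sample mean and its estimated standard error."""
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two scores")
    mean = sum(scores) / n
    var = sum((s - mean) ** 2 for s in scores) / (n - 1)   # unbiased sample variance
    return mean, math.sqrt(var / n)

def slab_transmission(sigma_t, thickness, n_histories, seed=0):
    """Estimate the uncollided transmission through a purely absorbing slab."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_histories):
        xi = 1.0 - rng.random()                 # uniform on (0, 1], avoids log(0)
        path = -math.log(xi) / sigma_t          # sampled free path length
        scores.append(1.0 if path > thickness else 0.0)
    return mean_and_stderr(scores)

if __name__ == "__main__":
    est, err = slab_transmission(sigma_t=1.0, thickness=2.0, n_histories=10000)
    print(f"estimate {est:.4f} +/- {err:.4f}, exact {math.exp(-2.0):.4f}")
```

The standard error shrinks as one over the square root of the number of histories, which is the motivation for the variance reduction techniques mentioned for Section (4).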
Ore reserve estimation: a summary of principles and methods
International Nuclear Information System (INIS)
Marques, J.P.M.
1985-01-01
The mining industry has experienced substantial improvements with the increasing utilization of computerized and electronic devices throughout the last few years. In the ore reserve estimation field the main methods have undergone recent advances in order to improve their overall efficiency. This paper presents the three main groups of ore reserve estimation methods presently used worldwide: Conventional, Statistical and Geostatistical, and elaborates a detailed description and comparative analysis of each. The Conventional Methods are the oldest, least complex and most widely employed. The Geostatistical Methods are the most recent, most precise and most complex. The Statistical Methods are intermediate to the others in complexity, diffusion and chronological order. (D.J.M.) [pt]
Multiple Illuminant Colour Estimation via Statistical Inference on Factor Graphs.
Mutimbu, Lawrence; Robles-Kelly, Antonio
2016-08-31
This paper presents a method to recover a spatially varying illuminant colour estimate from scenes lit by multiple light sources. Starting with the image formation process, we formulate the illuminant recovery problem in a statistically data-driven setting. To do this, we use a factor graph defined across the scale space of the input image. In the graph, we utilise a set of illuminant prototypes computed using a data-driven approach. As a result, our method delivers a pixelwise illuminant colour estimate without requiring libraries or user input. The use of a factor graph also allows the illuminant estimates to be recovered using a maximum a posteriori (MAP) inference process. Moreover, we compute the probability marginals by performing a Delaunay triangulation on our factor graph. We illustrate the utility of our method for pixelwise illuminant colour recovery on widely available datasets and compare against a number of alternatives. We also show sample colour correction results on real-world images.
Ziegeweid, Jeffrey R.; Lorenz, David L.; Sanocki, Chris A.; Czuba, Christiana R.
2015-12-24
Knowledge of the magnitude and frequency of low flows in streams, which are flows in a stream during prolonged dry weather, is fundamental for water-supply planning and design; waste-load allocation; reservoir storage design; and maintenance of water quality and quantity for irrigation, recreation, and wildlife conservation. This report presents the results of a statewide study for which regional regression equations were developed for estimating 13 flow-duration curve statistics and 10 low-flow frequency statistics at ungaged stream locations in Minnesota. The 13 flow-duration curve statistics estimated by regression equations include the 0.0001, 0.001, 0.02, 0.05, 0.1, 0.25, 0.50, 0.75, 0.9, 0.95, 0.99, 0.999, and 0.9999 exceedance-probability quantiles. The low-flow frequency statistics include annual and seasonal (spring, summer, fall, winter) 7-day mean low flows, seasonal 30-day mean low flows, and summer 122-day mean low flows for a recurrence interval of 10 years. Estimates of the 13 flow-duration curve statistics and the 10 low-flow frequency statistics are provided for 196 U.S. Geological Survey continuous-record streamgages using streamflow data collected through September 30, 2012.
Statistical inference based on latent ability estimates
Hoijtink, H.J.A.; Boomsma, A.
The quality of approximations to first and second order moments (e.g., statistics like means, variances, regression coefficients) based on latent ability estimates is discussed. The ability estimates are obtained using either the Rasch or the two-parameter logistic model. Straightforward use
Energy Technology Data Exchange (ETDEWEB)
Lee, Kyung Hoon; Park, Ho Jin; Lee, Chung Chan; Cho, Jin Young [Korea Atomic Energy Research Institute, Daejeon (Korea, Republic of)
2015-10-15
The purpose of this paper is to study the effect on output parameters in the lattice physics calculation due to the last input uncertainty such as manufacturing deviations from nominal value for material composition and geometric dimensions. In a nuclear design and analysis, the lattice physics calculations are usually employed to generate lattice parameters for the nodal core simulation and pin power reconstruction. These lattice parameters which consist of homogenized few-group cross-sections, assembly discontinuity factors, and form-functions can be affected by input uncertainties which arise from three different sources: 1) multi-group cross-section uncertainties, 2) the uncertainties associated with methods and modeling approximations utilized in lattice physics codes, and 3) fuel/assembly manufacturing uncertainties. In this paper, data provided by the light water reactor (LWR) uncertainty analysis in modeling (UAM) benchmark has been used as the manufacturing uncertainties. First, the effect of each input parameter has been investigated through sensitivity calculations at the fuel assembly level. Then, uncertainty in prediction of peaking factor due to the most sensitive input parameter has been estimated using the statistical sampling method, often called the brute force method. For our analysis, the two-dimensional transport lattice code DeCART2D and its ENDF/B-VII.1 based 47-group library were used to perform the lattice physics calculation. Sensitivity calculations have been performed in order to study the influence of manufacturing tolerances on the lattice parameters. The manufacturing tolerance that has the largest influence on the k-inf is the fuel density. The second most sensitive parameter is the outer clad diameter.
Statistical Software for State Space Methods
Directory of Open Access Journals (Sweden)
Jacques J. F. Commandeur
2011-05-01
In this paper we review the state space approach to time series analysis and establish the notation that is adopted in this special volume of the Journal of Statistical Software. We first provide some background on the history of state space methods for the analysis of time series. This is followed by a concise overview of linear Gaussian state space analysis including the modelling framework and appropriate estimation methods. We discuss the important class of unobserved component models which incorporate a trend, a seasonal, a cycle, and fixed explanatory and intervention variables for the univariate and multivariate analysis of time series. We continue the discussion by presenting methods for the computation of different estimates for the unobserved state vector: filtering, prediction, and smoothing. Estimation approaches for the other parameters in the model are also considered. Next, we discuss how the estimation procedures can be used for constructing confidence intervals, detecting outlier observations and structural breaks, and testing model assumptions of residual independence, homoscedasticity, and normality. We then show how ARIMA and ARIMA components models fit into the state space framework for time series analysis. We also provide a basic introduction to non-Gaussian state space models. Finally, we present an overview of the software tools currently available for the analysis of time series with state space methods as they are discussed in the other contributions to this special volume.
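The filtering step mentioned in this abstract can be illustrated for the simplest unobserved component model, the local level model. The following is a minimal sketch and not code from the reviewed software: the function name `local_level_filter`, its parameter names, and the large initial variance `p1` (a rough stand-in for a diffuse prior) are assumptions made for illustration.

```python
def local_level_filter(y, sigma2_eps, sigma2_eta, a1=0.0, p1=1e7):
    """Kalman filter for the local level model (illustrative sketch).

    Observation: y_t      = mu_t + eps_t,  eps_t ~ N(0, sigma2_eps)
    State:       mu_{t+1} = mu_t + eta_t,  eta_t ~ N(0, sigma2_eta)
    Returns a list of (filtered mean, filtered variance) pairs.
    """
    a, p = a1, p1  # predicted state mean and variance
    filtered = []
    for obs in y:
        f = p + sigma2_eps            # prediction-error variance
        k = p / f                     # Kalman gain
        a_filt = a + k * (obs - a)    # filtered state mean
        p_filt = p * (1.0 - k)        # filtered state variance
        filtered.append((a_filt, p_filt))
        # One-step-ahead prediction for the next time point.
        a, p = a_filt, p_filt + sigma2_eta
    return filtered
```

With a constant series the filtered level settles on the observed value, and the filtered variance stays below the observation variance once the large initial variance has been absorbed.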
A chronicle of permutation statistical methods 1920–2000, and beyond
Berry, Kenneth J; Mielke Jr., Paul W
2014-01-01
The focus of this book is on the birth and historical development of permutation statistical methods from the early 1920s to the near present. Beginning with the seminal contributions of R.A. Fisher, E.J.G. Pitman, and others in the 1920s and 1930s, permutation statistical methods were initially introduced to validate the assumptions of classical statistical methods. Permutation methods have advantages over classical methods in that they are optimal for small data sets and non-random samples, are data-dependent, and are free of distributional assumptions. Permutation probability values may be exact, or estimated via moment- or resampling-approximation procedures. Because permutation methods are inherently computationally-intensive, the evolution of computers and computing technology that made modern permutation methods possible accompanies the historical narrative. Permutation analogs of many well-known statistical tests are presented in a historical context, including multiple correlation and regression, ana...
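As an illustration of the resampling-approximation procedures mentioned in this abstract, the sketch below estimates a two-sided permutation probability value for a difference in group means. It is not taken from the book: the function name `permutation_p_value`, the choice of test statistic, and the "+1" correction in the estimate are assumptions made for illustration.

```python
import random


def permutation_p_value(x, y, n_resamples=10000, seed=0):
    """Resampling estimate of a two-sided permutation p-value.

    The test statistic is the absolute difference in group means.
    Group labels are shuffled n_resamples times (illustrative sketch).
    """
    rng = random.Random(seed)
    n_x = len(x)
    pooled = list(x) + list(y)
    n_y = len(pooled) - n_x
    observed = abs(sum(x) / n_x - sum(y) / n_y)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        diff = abs(sum(pooled[:n_x]) / n_x - sum(pooled[n_x:]) / n_y)
        # Small tolerance guards against floating-point ties.
        if diff >= observed - 1e-12:
            hits += 1
    # Counting the observed arrangement keeps the estimate above zero.
    return (hits + 1) / (n_resamples + 1)
```

No distributional assumption enters the calculation; only the exchangeability of observations under the null hypothesis is used.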
New methods of testing nonlinear hypothesis using iterative NLLS estimator
Mahaboob, B.; Venkateswarlu, B.; Mokeshrayalu, G.; Balasiddamuni, P.
2017-11-01
This research paper discusses methods of testing nonlinear hypotheses using the iterative Nonlinear Least Squares (NLLS) estimator. Takeshi Amemiya [1] explained this method. In the present paper, however, a modified Wald test statistic due to Engle, Robert [6] is proposed to test nonlinear hypotheses using the iterative NLLS estimator. An alternative method for testing nonlinear hypotheses using the iterative NLLS estimator based on nonlinear studentized residuals is also proposed. In addition, an innovative method of testing nonlinear hypotheses using the iterative restricted NLLS estimator is derived. Pesaran and Deaton [10] explained the methods of testing nonlinear hypotheses. This paper uses the asymptotic properties of the nonlinear least squares estimator proposed by Jennrich [8]. The main purpose of this paper is to provide innovative methods of testing nonlinear hypotheses using the iterative NLLS estimator, the iterative NLLS estimator based on nonlinear studentized residuals, and the iterative restricted NLLS estimator. Eakambaram et al. [12] discussed least absolute deviation estimation versus the nonlinear regression model with heteroscedastic errors, and also studied the problem of heteroscedasticity with reference to nonlinear regression models with a suitable illustration. William Greene [13] examined the interaction effect in nonlinear models discussed by Ai and Norton [14] and suggested ways to examine the effects that do not involve statistical testing. Peter [15] provided guidelines for identifying composite hypotheses and addressing the probability of false rejection for multiple hypotheses.
International Nuclear Information System (INIS)
Vincent, C.H.
1982-01-01
Bayes' principle is applied to the differential counting measurement of a positive quantity in which the statistical errors are not necessarily small in relation to the true value of the quantity. The methods of estimation derived are found to give consistent results and to avoid the anomalous negative estimates sometimes obtained by conventional methods. One of the methods given provides a simple means of deriving the required estimates from conventionally presented results and appears to have wide potential applications. Both methods provide the actual posterior probability distribution of the quantity to be measured. A particularly important potential application is the correction of counts on low radioactivity samples for background. (orig.)
Statistical Estimation of Heterogeneities: A New Frontier in Well Testing
Neuman, S. P.; Guadagnini, A.; Illman, W. A.; Riva, M.; Vesselinov, V. V.
2001-12-01
Well-testing methods have traditionally relied on analytical solutions of groundwater flow equations in relatively simple domains, consisting of one or at most a few units having uniform hydraulic properties. Recently, attention has been shifting toward methods and solutions that would allow one to characterize subsurface heterogeneities in greater detail. On one hand, geostatistical inverse methods are being used to assess the spatial variability of parameters, such as permeability and porosity, on the basis of multiple cross-hole pressure interference tests. On the other hand, analytical solutions are being developed to describe the mean and variance (first and second statistical moments) of flow to a well in a randomly heterogeneous medium. Geostatistical inverse interpretation of cross-hole tests yields a smoothed but detailed "tomographic" image of how parameters actually vary in three-dimensional space, together with corresponding measures of estimation uncertainty. Moment solutions may soon allow one to interpret well tests in terms of statistical parameters such as the mean and variance of log permeability, its spatial autocorrelation and statistical anisotropy. The idea of geostatistical cross-hole tomography is illustrated through pneumatic injection tests conducted in unsaturated fractured tuff at the Apache Leap Research Site near Superior, Arizona. The idea of using moment equations to interpret well-tests statistically is illustrated through a recently developed three-dimensional solution for steady state flow to a well in a bounded, randomly heterogeneous, statistically anisotropic aquifer.
Gray bootstrap method for estimating frequency-varying random vibration signals with small samples
Directory of Open Access Journals (Sweden)
Wang Yanqing
2014-04-01
During environment testing, the estimation of random vibration signals (RVS) is an important technique for airborne platform safety and reliability. However, the available methods, including the extreme value envelope method (EVEM), the statistical tolerances method (STM) and the improved statistical tolerance method (ISTM), require large samples and a typical probability distribution. Moreover, the frequency-varying characteristic of RVS is usually not taken into account. The gray bootstrap method (GBM) is proposed to solve the problem of estimating frequency-varying RVS with small samples. Firstly, the estimated indexes are obtained, including the estimated interval, the estimated uncertainty, the estimated value, the estimated error and the estimated reliability. In addition, GBM is applied to estimating the single flight testing of a certain aircraft. Finally, in order to evaluate the estimation performance, GBM is compared with the bootstrap method (BM) and the gray method (GM) in testing analysis. The results show that GBM is superior for estimating dynamic signals with small samples, and the estimated reliability is shown to be 100% at the given confidence level.
Kansas's forests, 2005: statistics, methods, and quality assurance
Patrick D. Miles; W. Keith Moser; Charles J. Barnett
2011-01-01
The first full annual inventory of Kansas's forests was completed in 2005 after 8,868 plots were selected and 468 forested plots were visited and measured. This report includes detailed information on forest inventory methods and data quality estimates. Important resource statistics are included in the tables. A detailed analysis of Kansas inventory is presented...
Nebraska's forests, 2005: statistics, methods, and quality assurance
Patrick D. Miles; Dacia M. Meneguzzo; Charles J. Barnett
2011-01-01
The first full annual inventory of Nebraska's forests was completed in 2005 after 8,335 plots were selected and 274 forested plots were visited and measured. This report includes detailed information on forest inventory methods, and data quality estimates. Tables of various important resource statistics are presented. Detailed analysis of the inventory data are...
Bin mode estimation methods for Compton camera imaging
International Nuclear Information System (INIS)
Ikeda, S.; Odaka, H.; Uemura, M.; Takahashi, T.; Watanabe, S.; Takeda, S.
2014-01-01
We study the image reconstruction problem of a Compton camera which consists of semiconductor detectors. The image reconstruction is formulated as a statistical estimation problem. We employ a bin-mode estimation (BME) and extend an existing framework to a Compton camera with multiple scatterers and absorbers. Two estimation algorithms are proposed: an accelerated EM algorithm for the maximum likelihood estimation (MLE) and a modified EM algorithm for the maximum a posteriori (MAP) estimation. Numerical simulations demonstrate the potential of the proposed methods
Chaudhuri, Probal
1992-01-01
We consider a class of $U$-statistics type estimates for multivariate location. The estimates extend some $R$-estimates to multivariate data. In particular, the class of estimates includes the multivariate median considered by Gini and Galvani (1929) and Haldane (1948) and a multivariate extension of the well-known Hodges-Lehmann (1963) estimate. We explore large sample behavior of these estimates by deriving a Bahadur type representation for them. In the process of developing these asymptoti...
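For orientation, the univariate Hodges-Lehmann estimate that this abstract extends to multivariate data is the median of all pairwise (Walsh) averages. The sketch below covers only that univariate case; it does not implement the multivariate estimates studied in the paper, and the function name `hodges_lehmann` is an assumption made for illustration.

```python
from statistics import median


def hodges_lehmann(sample):
    """Univariate Hodges-Lehmann location estimate (illustrative sketch).

    Returns the median of the Walsh averages (x_i + x_j) / 2 taken
    over all pairs with i <= j.
    """
    n = len(sample)
    walsh = [(sample[i] + sample[j]) / 2
             for i in range(n) for j in range(i, n)]
    return median(walsh)
```

For the sample [1, 2, 3, 100] this gives 2.75, whereas the sample mean is 26.5, which shows the robustness to a single outlier.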
Application of pedagogy reflective in statistical methods course and practicum statistical methods
Julie, Hongki
2017-08-01
The courses Elementary Statistics, Statistical Methods and Statistical Methods Practicum aim to equip students of Mathematics Education with descriptive and inferential statistics. Understanding descriptive and inferential statistics is important for students in the Mathematics Education Department, especially for those whose final project involves quantitative research. In quantitative research, students are required to present and describe quantitative data appropriately, to draw conclusions from their data, and to establish relationships between the independent and dependent variables defined in their research. In practice, students working on final projects involving quantitative research often made mistakes when drawing conclusions and when choosing the hypothesis testing procedure. As a result, they reached incorrect conclusions, which is a serious error in quantitative research. Several things were gained from implementing reflective pedagogy in the teaching and learning process of the Statistical Methods and Statistical Methods Practicum courses, namely: 1. Twenty-two students passed the course and one student did not. 2. The most commonly achieved grade was A, which was earned by 18 students. 3. According to all students, they were able to develop their critical stance and build care for one another through the learning process in this course. 4. All students agreed that, through the learning process they underwent in the course, they could build care for one another.
Unemployment estimation: Spatial point referenced methods and models
Pereira, Soraia
2017-06-26
From the 4th quarter of 2014 onwards, the Portuguese Labor Force Survey started geo-referencing its sampling units, namely the dwellings in which the surveys are carried out. This opens new possibilities for analysing and estimating unemployment and its spatial distribution across any region. The labor force survey chooses, according to preestablished sampling criteria, a certain number of dwellings across the nation and surveys the number of unemployed in these dwellings. Based on this survey, the National Statistical Institute of Portugal presently uses direct estimation methods to estimate the national unemployment figures. Recently, there has been increased interest in estimating these figures in smaller areas. Direct estimation methods, due to reduced sample sizes in small areas, tend to produce fairly large sampling variations; therefore model based methods, which tend to
Illinois' Forests, 2005: Statistics, Methods, and Quality Assurance
Susan J. Crocker; Charles J. Barnett; Mark A. Hatfield
2013-01-01
The first full annual inventory of Illinois' forests was completed in 2005. This report contains 1) descriptive information on methods, statistics, and quality assurance of data collection, 2) a glossary of terms, 3) tables that summarize quality assurance, and 4) a core set of tabular estimates for a variety of forest resources. A detailed analysis of inventory...
An Introduction to Confidence Intervals for Both Statistical Estimates and Effect Sizes.
Capraro, Mary Margaret
This paper summarizes methods of estimating confidence intervals, including classical intervals and intervals for effect sizes. The recent American Psychological Association (APA) Task Force on Statistical Inference report suggested that confidence intervals should always be reported, and the fifth edition of the APA "Publication Manual"…
Zheng Hui; Schanzer Dena L; Gilmore Jason
2011-01-01
Background: As many respiratory viruses are responsible for influenza-like symptoms, accurate measures of the disease burden are not available and estimates are generally based on statistical methods. The objective of this study was to estimate absenteeism rates and hours lost due to seasonal influenza and compare these estimates with estimates of absenteeism attributable to the two H1N1 pandemic waves that occurred in 2009. Methods: Key absenteeism variables were extracted from Statis...
Information-theoretic methods for estimating of complicated probability distributions
Zong, Zhi
2006-01-01
Mixing various disciplines frequently produces something profound and far-reaching. Cybernetics is an often-quoted example. The mix of information theory, statistics and computing technology has proved very useful and has led to the recent development of information-theory based methods for estimating complicated probability distributions. Estimating the probability distribution of a random variable is a fundamental task in many fields besides statistics, such as reliability, probabilistic risk analysis (PSA), machine learning, pattern recognition, image processing, neur
Statistical Estimation of the Age of the Universe
DEFF Research Database (Denmark)
Petersen, Jørgen Holm
The Hubble constant enters big bang cosmology by quantifying the expansion rate of the universe. It is shown that the standard technique for estimation of Hubble's constant is statistically inconsistent and results in a systematically too low value. An alternative, consistent estimator of Hubble...
Mendoza-Rosas, Ana Teresa; De la Cruz-Reyna, Servando
2008-09-01
The probabilistic analysis of volcanic eruption time series is an essential step for the assessment of volcanic hazard and risk. Such series describe complex processes involving different types of eruptions over different time scales. A statistical method linking geological and historical eruption time series is proposed for calculating the probabilities of future eruptions. The first step of the analysis is to characterize the eruptions by their magnitudes. As is the case in most natural phenomena, lower magnitude events are more frequent, and the behavior of the eruption series may be biased by such events. On the other hand, eruptive series are commonly studied using conventional statistics and treated as homogeneous Poisson processes. However, time-dependent series, or sequences including rare or extreme events, represented by very few data of large eruptions require special methods of analysis, such as the extreme-value theory applied to non-homogeneous Poisson processes. Here we propose a general methodology for analyzing such processes attempting to obtain better estimates of the volcanic hazard. This is done in three steps: Firstly, the historical eruptive series is complemented with the available geological eruption data. The linking of these series is done assuming an inverse relationship between the eruption magnitudes and the occurrence rate of each magnitude class. Secondly, we perform a Weibull analysis of the distribution of repose time between successive eruptions. Thirdly, the linked eruption series are analyzed as a non-homogeneous Poisson process with a generalized Pareto distribution as intensity function. As an application, the method is tested on the eruption series of five active polygenetic Mexican volcanoes: Colima, Citlaltépetl, Nevado de Toluca, Popocatépetl and El Chichón, to obtain hazard estimates.
The Impact of Statistical Leakage Models on Design Yield Estimation
Directory of Open Access Journals (Sweden)
Rouwaida Kanj
2011-01-01
Device mismatch and process variation models play a key role in determining the functionality and yield of sub-100 nm designs. Average characteristics are often of interest, such as the average leakage current or the average read delay. However, detecting rare functional fails is critical for memory design, and designers often seek techniques that enable accurate modeling of such events. Extremely leaky devices can inflict functionality fails. The plurality of leaky devices on a bitline increases the dimensionality of the yield estimation problem. Simplified models are possible by adopting approximations to the underlying sum of lognormals. The implications of such approximations on tail probabilities may in turn bias the yield estimate. We review different closed form approximations and compare against the CDF matching method, which is shown to be the most effective method for accurate statistical leakage modeling.
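The abstract does not name the closed form approximations it reviews. One widely known example is the Fenton-Wilkinson approximation, which replaces a sum of independent lognormals by a single lognormal with the same mean and variance. The sketch below shows that moment matching as an assumed illustration only; the function name `fenton_wilkinson` and its input format are hypothetical.

```python
import math


def fenton_wilkinson(params):
    """Moment-matched lognormal for a sum of independent lognormals.

    params: iterable of (mu, sigma) pairs, the mean and standard
    deviation of the logarithm of each term (illustrative sketch).
    Returns (mu_sum, sigma_sum) of the approximating lognormal.
    """
    params = list(params)
    # Exact mean and variance of the sum of independent terms.
    mean = sum(math.exp(mu + 0.5 * s * s) for mu, s in params)
    var = sum((math.exp(s * s) - 1.0) * math.exp(2.0 * mu + s * s)
              for mu, s in params)
    # Lognormal parameters that reproduce those two moments.
    sigma2 = math.log(1.0 + var / mean ** 2)
    mu_sum = math.log(mean) - 0.5 * sigma2
    return mu_sum, math.sqrt(sigma2)
```

Because only the first two moments are matched, the tail probabilities of the fitted lognormal can differ from those of the true sum, which is the kind of bias in yield estimates that the abstract warns about.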
Efficient bootstrap estimates for tail statistics
Breivik, Øyvind; Aarnes, Ole Johan
2017-03-01
Bootstrap resamples can be used to investigate the tail of empirical distributions as well as return value estimates from the extremal behaviour of the sample. Specifically, the confidence intervals on return value estimates or bounds on in-sample tail statistics can be obtained using bootstrap techniques. However, non-parametric bootstrapping from the entire sample is expensive. It is shown here that it suffices to bootstrap from a small subset consisting of the highest entries in the sequence to make estimates that are essentially identical to bootstraps from the entire sample. Similarly, bootstrap estimates of confidence intervals of threshold return estimates are found to be well approximated by using a subset consisting of the highest entries. This has practical consequences in fields such as meteorology, oceanography and hydrology where return values are calculated from very large gridded model integrations spanning decades at high temporal resolution or from large ensembles of independent and identically distributed model fields. In such cases the computational savings are substantial.
THE GROWTH POINTS OF STATISTICAL METHODS
Orlov A. I.
2014-01-01
On the basis of a new paradigm of applied mathematical statistics, data analysis and economic-mathematical methods are identified. We also discuss five topical areas in which modern applied statistics and other statistical methods are developing, i.e. five "growth points": nonparametric statistics, robustness, computer-statistical methods, statistics of interval data, and statistics of non-numeric data.
An improved method for statistical analysis of raw accelerator mass spectrometry data
International Nuclear Information System (INIS)
Gutjahr, A.; Phillips, F.; Kubik, P.W.; Elmore, D.
1987-01-01
Hierarchical statistical analysis is an appropriate method for statistical treatment of raw accelerator mass spectrometry (AMS) data. Using Monte Carlo simulations we show that this method yields more accurate estimates of isotope ratios and analytical uncertainty than the generally used propagation of errors approach. The hierarchical analysis is also useful in design of experiments because it can be used to identify sources of variability. 8 refs., 2 figs
History by history statistical estimators in the BEAM code system
International Nuclear Information System (INIS)
Walters, B.R.B.; Kawrakow, I.; Rogers, D.W.O.
2002-01-01
A history by history method for estimating uncertainties has been implemented in the BEAMnrc and DOSXYZnrc codes, replacing the method of statistical batches. This method groups scored quantities (e.g., dose) by primary history. When phase-space sources are used, this method groups incident particles according to the primary histories that generated them. This necessitated adding markers (negative energy) to phase-space files to indicate the first particle generated by a new primary history. The new method greatly reduces the uncertainty in the uncertainty estimate. The new method eliminates one dimension (which kept the results for each batch) from all scoring arrays, reducing the memory requirement by a factor of 2. Correlations between particles in phase-space sources are taken into account. The only correlations with any significant impact on uncertainty are those introduced by particle recycling. Failure to account for these correlations can result in a significant underestimate of the uncertainty. The previous method of accounting for correlations due to recycling by placing all recycled particles in the same batch did work. Neither the new method nor the batch method takes into account correlations between incident particles when a phase-space source is restarted, so one must avoid restarts.
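The grouping by primary history described in this abstract can be sketched as follows. This is an illustrative reconstruction, not code from BEAMnrc or DOSXYZnrc: the function name `history_by_history`, the `(history_id, value)` input format, and the assumption that scores from one history arrive consecutively are all hypothetical. The uncertainty uses the standard estimate of the standard error of a mean over N independent histories.

```python
import math


def history_by_history(scores, n_histories):
    """Mean score per history and its standard error (illustrative sketch).

    scores: iterable of (history_id, value) pairs, with all entries of
    one primary history appearing consecutively.
    n_histories: total number of primary histories, including those
    that scored nothing (they contribute zero).
    """
    sum_x = 0.0    # running sum of per-history totals
    sum_x2 = 0.0   # running sum of squared per-history totals
    current_id = None
    current_total = 0.0
    for history_id, value in scores:
        if history_id != current_id:
            # A new primary history starts: commit the previous total.
            sum_x += current_total
            sum_x2 += current_total ** 2
            current_id, current_total = history_id, 0.0
        current_total += value
    # Commit the last history.
    sum_x += current_total
    sum_x2 += current_total ** 2

    mean = sum_x / n_histories
    var_of_mean = (sum_x2 / n_histories - mean ** 2) / (n_histories - 1)
    return mean, math.sqrt(max(var_of_mean, 0.0))
```

Only two running sums are kept per scored quantity, so no per-batch storage is needed. Particles that share a primary history are summed before squaring, which is how their correlation enters the estimate.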
Engineer’s estimate reliability and statistical characteristics of bids
Directory of Open Access Journals (Sweden)
Fariborz M. Tehrani
2016-12-01
The objective of this report is to provide a methodology for examining bids and evaluating the performance of engineer’s estimates in capturing the true cost of projects. This study reviews the cost development for transportation projects in addition to two sources of uncertainties in a cost estimate, including modeling errors and inherent variability. Sample projects are highway maintenance projects with a similar scope of the work, size, and schedule. Statistical analysis of engineering estimates and bids examines the adaptability of statistical models for sample projects. Further, the variation of engineering cost estimates from inception to implementation has been presented and discussed for selected projects. Moreover, the applicability of extreme values theory is assessed for available data. The results indicate that the performance of engineer’s estimate is best evaluated based on trimmed average of bids, excluding discordant bids.
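A trimmed average of bids, as referred to in this abstract, might be computed as sketched below. The abstract does not state how discordant bids are identified, so dropping a fixed number of the lowest and highest bids is an assumption made here, and the names `trimmed_mean` and `estimate_ratio` are hypothetical.

```python
def trimmed_mean(bids, n_trim=1):
    """Average of the bids after dropping the n_trim lowest and
    n_trim highest values (illustrative sketch)."""
    if len(bids) <= 2 * n_trim:
        raise ValueError("not enough bids left after trimming")
    kept = sorted(bids)[n_trim:len(bids) - n_trim]
    return sum(kept) / len(kept)


def estimate_ratio(engineers_estimate, bids, n_trim=1):
    """Ratio of the engineer's estimate to the trimmed average of bids;
    values near 1 indicate an estimate close to the bulk of the bids."""
    return engineers_estimate / trimmed_mean(bids, n_trim)
```

For the bids 100, 102, 98, 101 and 250, trimming one bid from each end leaves 100, 101 and 102, so the trimmed average is 101 and the single very high bid has no effect.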
Influence of the statistical distribution of bioassay measurement errors on the intake estimation
International Nuclear Information System (INIS)
Lee, T. Y; Kim, J. K
2006-01-01
The purpose of this study is to provide guidance for selecting error distributions by analyzing the influence of the statistical distribution of bioassay measurement errors on intake estimation. For this purpose, intakes were estimated using the maximum likelihood method for cases in which the error distributions are normal and lognormal, and the estimated intakes for the two distributions were compared. According to the results of this study, when the measurement results for lung retention are somewhat greater than the limit of detection, the distribution type appears to have negligible influence on the results. For measurement results of the daily excretion rate, however, the results obtained assuming a lognormal distribution were 10% higher than those obtained assuming a normal distribution. In view of these facts, where the uncertainty component is governed by counting statistics, the distribution type is considered to have no influence on intake estimation, whereas where other components are predominant, it is concluded that it is clearly desirable to estimate the intake assuming a lognormal distribution.
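To make the comparison in this abstract concrete, the sketch below gives closed-form maximum likelihood intake estimates under two simplifying assumptions that are not taken from the paper: each measurement is modelled as intake times a known predicted value per unit intake (called `irf` here, a hypothetical name), and the error spread is the same for all measurements (constant standard deviation for the normal case, constant geometric standard deviation for the lognormal case).

```python
import math


def intake_mle_normal(measurements, irf):
    """ML intake for m_i = I * f_i + e_i, e_i ~ N(0, sigma^2) with a
    common sigma: the least-squares slope through the origin."""
    num = sum(m * f for m, f in zip(measurements, irf))
    den = sum(f * f for f in irf)
    return num / den


def intake_mle_lognormal(measurements, irf):
    """ML intake for ln(m_i) ~ N(ln(I * f_i), s^2) with a common s:
    the geometric mean of the per-measurement ratios m_i / f_i."""
    logs = [math.log(m / f) for m, f in zip(measurements, irf)]
    return math.exp(sum(logs) / len(logs))
```

With error-free data both estimators return the same intake. They diverge when the measurements scatter, because the normal model weights absolute deviations while the lognormal model weights relative deviations.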
Estimation methods for process holdup of special nuclear materials
International Nuclear Information System (INIS)
Pillay, K.K.S.; Picard, R.R.; Marshall, R.S.
1984-06-01
The US Nuclear Regulatory Commission sponsored a research study at the Los Alamos National Laboratory to explore the possibilities of developing statistical estimation methods for materials holdup at highly enriched uranium (HEU)-processing facilities. Attempts at using historical holdup data from processing facilities and selected holdup measurements at two operating facilities confirmed the need for high-quality data and reasonable control over process parameters in developing statistical models for holdup estimations. A major effort was therefore directed at conducting large-scale experiments to demonstrate the value of statistical estimation models from experimentally measured data of good quality. Using data from these experiments, we developed statistical models to estimate residual inventories of uranium in large process equipment and facilities. Some of the important findings of this investigation are the following: prediction models for the residual holdup of special nuclear material (SNM) can be developed from good-quality historical data on holdup; holdup data from several of the equipment used at HEU-processing facilities, such as air filters, ductwork, calciners, dissolvers, pumps, pipes, and pipe fittings, readily lend themselves to statistical modeling of holdup; holdup profiles of process equipment such as glove boxes, precipitators, and rotary drum filters can change with time; therefore, good estimation of residual inventories in these types of equipment requires several measurements at the time of inventory; although measurement of residual holdup of SNM in large facilities is a challenging task, reasonable estimates of the hidden inventories of holdup to meet the regulatory requirements can be accomplished through a combination of good measurements and the use of statistical models. 44 references, 62 figures, 43 tables
South Dakota's forests, 2005: statistics, methods, and quality assurance
Patrick D. Miles; Ronald J. Piva; Charles J. Barnett
2011-01-01
The first full annual inventory of South Dakota's forests was completed in 2005 after 8,302 plots were selected and 325 forested plots were visited and measured. This report includes detailed information on forest inventory methods and data quality estimates. Important resource statistics are included in the tables. A detailed analysis of the South Dakota...
North Dakota's forests, 2005: statistics, methods, and quality assurance
Patrick D. Miles; David E. Haugen; Charles J. Barnett
2011-01-01
The first full annual inventory of North Dakota's forests was completed in 2005 after 7,622 plots were selected and 164 forested plots were visited and measured. This report includes detailed information on forest inventory methods and data quality estimates. Important resource statistics are included in the tables. A detailed analysis of the North Dakota...
A Fast LMMSE Channel Estimation Method for OFDM Systems
Directory of Open Access Journals (Sweden)
Zhou Wen
2009-01-01
A fast linear minimum mean square error (LMMSE) channel estimation method has been proposed for Orthogonal Frequency Division Multiplexing (OFDM) systems. In comparison with conventional LMMSE channel estimation, the proposed channel estimation method does not require statistical knowledge of the channel in advance and avoids the inversion of a large-dimension matrix by using the fast Fourier transform (FFT) operation. Therefore, the computational complexity can be reduced significantly. The normalized mean square errors (NMSEs) of the proposed method and the conventional LMMSE estimation have been derived. Numerical results show that the NMSE of the proposed method is very close to that of the conventional LMMSE method, which is also verified by computer simulation. In addition, computer simulation shows that the performance of the proposed method is almost the same as that of the conventional LMMSE method in terms of bit error rate (BER).
A statistical approach to estimate the LYAPUNOV spectrum in disc brake squeal
Oberst, S.; Lai, J. C. S.
2015-01-01
The estimation of squeal propensity of a brake system from the prediction of unstable vibration modes using the linear complex eigenvalue analysis (CEA) in the frequency domain has its fair share of successes and failures. While the CEA is almost standard practice for the automotive industry, time domain methods and the estimation of LYAPUNOV spectra have not received much attention in brake squeal analyses. One reason is the challenge in estimating the true LYAPUNOV exponents and their discrimination against spurious ones in experimental data. A novel method based on the application of the ECKMANN-RUELLE matrices is proposed here to estimate LYAPUNOV exponents by using noise in a statistical procedure. It is validated with respect to parameter variations and dimension estimates. By counting the number of non-overlapping confidence intervals for LYAPUNOV exponent distributions obtained by moving a window of increasing size over bootstrapped same-length estimates of an observation function, a dispersion measure's width is calculated and fed into a BAYESIAN beta-binomial model. Results obtained using this method for benchmark models of white and pink noise as well as the classical HENON map indicate that true LYAPUNOV exponents can be isolated from spurious ones with high confidence. The method is then applied to accelerometer and microphone data obtained from brake squeal tests. Estimated LYAPUNOV exponents indicate that the pad's out-of-plane vibration behaves quasi-periodically on the brink of chaos while the microphone's squeal signal remains periodic.
International Nuclear Information System (INIS)
Zhang Youcai; Yang Xiaohu; Springel, Volker
2010-01-01
We study the topology of cosmic large-scale structure through the genus statistics, using galaxy catalogs generated from the Millennium Simulation and observational data from the latest Sloan Digital Sky Survey Data Release (SDSS DR7). We introduce a new method for constructing galaxy density fields and for measuring the genus statistics of its isodensity surfaces. It is based on a Delaunay tessellation field estimation (DTFE) technique that allows the definition of a piece-wise continuous density field and the exact computation of the topology of its polygonal isodensity contours, without introducing any free numerical parameter. Besides this new approach, we also employ the traditional approaches of smoothing the galaxy distribution with a Gaussian of fixed width, or by adaptively smoothing with a kernel that encloses a constant number of neighboring galaxies. Our results show that the Delaunay-based method extracts the largest amount of topological information. Unlike the traditional approach for genus statistics, it is able to discriminate between the different theoretical galaxy catalogs analyzed here, both in real space and in redshift space, even though they are based on the same underlying simulation model. In particular, the DTFE approach detects with high confidence a discrepancy of one of the semi-analytic models studied here compared with the SDSS data, while the other models are found to be consistent.
Permutation statistical methods an integrated approach
Berry, Kenneth J; Johnston, Janis E
2016-01-01
This research monograph provides a synthesis of a number of statistical tests and measures, which, at first consideration, appear disjoint and unrelated. Numerous comparisons of permutation and classical statistical methods are presented, and the two methods are compared via probability values and, where appropriate, measures of effect size. Permutation statistical methods, compared to classical statistical methods, do not rely on theoretical distributions, avoid the usual assumptions of normality and homogeneity of variance, and depend only on the data at hand. This text takes a unique approach to explaining statistics by integrating a large variety of statistical methods, and establishing the rigor of a topic that to many may seem to be a nascent field in statistics. This topic is new in that it took modern computing power to make permutation methods available to people working in the mainstream of research. This research monograph addresses a statistically-informed audience, and can also easily serve as a ...
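The idea that a permutation test depends only on the data at hand can be sketched as an exact two-sample test for a difference in means. This is an illustrative sketch, not taken from the monograph: the function name and the sample values are hypothetical, and full enumeration is only practical for small samples.

```python
from itertools import combinations


def exact_permutation_pvalue(x, y):
    """Two-sided exact permutation p-value for a difference in means.

    Every way of splitting the pooled data into groups of the original
    sizes is enumerated; no theoretical distribution is assumed.
    """
    pooled = list(x) + list(y)
    n_x = len(x)
    n_y = len(pooled) - n_x
    total = sum(pooled)
    observed = abs(sum(x) / n_x - sum(y) / n_y)

    n_extreme = 0
    n_splits = 0
    for idx in combinations(range(len(pooled)), n_x):
        sum_x = sum(pooled[i] for i in idx)
        diff = abs(sum_x / n_x - (total - sum_x) / n_y)
        # Small tolerance so floating-point ties count as "at least as extreme".
        if diff >= observed - 1e-12:
            n_extreme += 1
        n_splits += 1
    return n_extreme / n_splits


# Hypothetical measurements for two small groups.
group_a = [12.1, 14.3, 13.8, 15.0]
group_b = [9.9, 10.4, 11.2, 10.8]
p_value = exact_permutation_pvalue(group_a, group_b)
print(f"exact two-sided p-value: {p_value:.4f}")
```

For larger samples the enumeration is typically replaced by a random subset of permutations, which is the step that needed modern computing power.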
Conventional estimating method of earthquake response of mechanical appendage system
International Nuclear Information System (INIS)
Aoki, Shigeru; Suzuki, Kohei
1981-01-01
The earthquake response of an appendage structure system installed in a main structure system is generally estimated by floor response analysis, using the response spectra at the point where the appendage system is installed. Research on estimating the earthquake response of appendage systems by statistical procedures based on stochastic process theory has also been reported. The development of a practical method for simply estimating the response is an important subject in aseismic engineering. In this study, a method of estimating the earthquake response of an appendage system was investigated for the general case in which the natural frequencies of the two structure systems differ. First, it was shown that the floor response amplification factor (FRAF) can be estimated simply from the ratio of the natural frequencies of the two structure systems, and its statistical properties were clarified. Next, it was shown that the procedure of expressing acceleration, velocity and displacement responses simultaneously with tri-axial response spectra can be applied to the expression of the FRAF. The applicability of this procedure to nonlinear systems was examined. (Kako, I.)
On the estimation of multiple random integrals and U-statistics
Major, Péter
2013-01-01
This work starts with the study of those limit theorems in probability theory for which classical methods do not work. In many cases some form of linearization can help to solve the problem, because the linearized version is simpler. But in order to apply such a method we have to show that the linearization causes a negligible error. The estimation of this error leads to some important large deviation type problems, and the main subject of this work is their investigation. We provide sharp estimates of the tail distribution of multiple integrals with respect to a normalized empirical measure and so-called degenerate U-statistics and also of the supremum of appropriate classes of such quantities. The proofs apply a number of useful techniques of modern probability that enable us to investigate the non-linear functionals of independent random variables. This lecture note yields insights into these methods, and may also be useful for those who only want some new tools to help them prove limit theorems when stand...
Estimation of Lithological Classification in Taipei Basin: A Bayesian Maximum Entropy Method
Wu, Meng-Ting; Lin, Yuan-Chien; Yu, Hwa-Lung
2015-04-01
In environmental and other scientific applications, a certain understanding of the geological lithological composition is required. Because of practical restrictions, only a limited amount of data can be acquired. To determine the lithological distribution in a study area, many spatial statistical methods are used to estimate the lithological composition at unsampled points or grids. This study applied the Bayesian Maximum Entropy (BME) method, an emerging method in the field of geological spatiotemporal statistics. The BME method can identify the spatiotemporal correlation of the data and combine not only hard data but also soft data to improve estimation. Lithological classification data are discrete categorical data. Therefore, this research applied Categorical BME to establish a complete three-dimensional lithological estimation model. The limited hard data from the cores and the soft data generated from the geological dating data and the virtual wells were used to estimate the three-dimensional lithological classification in Taipei Basin. Keywords: Categorical Bayesian Maximum Entropy method, Lithological Classification, Hydrogeological Setting
A Comparison of Methods for Estimating the Determinant of High-Dimensional Covariance Matrix
Hu, Zongliang; Dong, Kai; Dai, Wenlin; Tong, Tiejun
2017-01-01
The determinant of the covariance matrix for high-dimensional data plays an important role in statistical inference and decision. It has many real applications including statistical tests and information theory. Due to the statistical and computational challenges with high dimensionality, little work has been proposed in the literature for estimating the determinant of high-dimensional covariance matrix. In this paper, we estimate the determinant of the covariance matrix using some recent proposals for estimating high-dimensional covariance matrix. Specifically, we consider a total of eight covariance matrix estimation methods for comparison. Through extensive simulation studies, we explore and summarize some interesting comparison results among all compared methods. We also provide practical guidelines based on the sample size, the dimension, and the correlation of the data set for estimating the determinant of high-dimensional covariance matrix. Finally, from a perspective of the loss function, the comparison study in this paper may also serve as a proxy to assess the performance of the covariance matrix estimation.
DEFF Research Database (Denmark)
Badger, Jake; Frank, Helmut; Hahmann, Andrea N.
2014-01-01
This paper demonstrates that a statistical dynamical method can be used to accurately estimate the wind climate at a wind farm site. In particular, postprocessing of mesoscale model output allows an efficient calculation of the local wind climate required for wind resource estimation at a wind...
The large break LOCA evaluation method with the simplified statistic approach
International Nuclear Information System (INIS)
Kamata, Shinya; Kubo, Kazuo
2004-01-01
The USNRC published the Code Scaling, Applicability and Uncertainty (CSAU) evaluation methodology for large break LOCA, which supported the revised rule for Emergency Core Cooling System performance, in 1989. USNRC Regulatory Guide 1.157 requires that the peak cladding temperature (PCT) not exceed 2200 °F with high probability (95th percentile). In recent years, overseas countries have developed statistical methodologies and best estimate codes, based on the CSAU evaluation methodology, with models that provide more realistic simulation of the phenomena. In order to calculate the PCT probability distribution by Monte Carlo trials, there are approaches such as the response surface technique using polynomials, the order statistics method, etc. For the purpose of performing rational statistical analysis, Mitsubishi Heavy Industries, Ltd. (MHI) tried to develop a statistical LOCA method using the best estimate LOCA code MCOBRA/TRAC and the simplified code HOTSPOT. HOTSPOT is a Monte Carlo heat conduction solver used to evaluate the uncertainties of the significant fuel parameters at the PCT positions of the hot rod. Direct uncertainty sensitivity studies can be performed without a response surface because the Monte Carlo simulation for key parameters can be performed in a short time using HOTSPOT. With regard to the parameter uncertainties, MHI established a treatment in which bounding conditions are given for the LOCA boundary and plant initial conditions, while the Monte Carlo simulation using HOTSPOT is applied to the significant fuel parameters. The paper describes the large break LOCA evaluation method with the simplified statistical approach and the results of applying the method to a representative four-loop nuclear power plant. (author)
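The order statistics method mentioned in the abstract is commonly associated with Wilks' formula, which gives the number of random code runs needed so that the largest computed PCT bounds a chosen percentile with a chosen confidence. The sketch below illustrates that general idea only; it is not MHI's procedure, and the function name and default levels are assumptions made for illustration.

```python
def wilks_sample_size(coverage=0.95, confidence=0.95):
    """Smallest number of random runs n such that the maximum of the n
    results exceeds the `coverage` quantile with probability of at least
    `confidence` (first-order, one-sided Wilks criterion).

    The condition is 1 - coverage**n >= confidence.
    """
    n = 1
    while 1.0 - coverage ** n < confidence:
        n += 1
    return n


n_runs = wilks_sample_size()
print(f"runs needed for a one-sided 95/95 bound: {n_runs}")
```

The sample size depends only on the two probability levels, not on the number of uncertain input parameters, which is why this approach avoids fitting a response surface.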
Conditional maximum-entropy method for selecting prior distributions in Bayesian statistics
Abe, Sumiyoshi
2014-11-01
The conditional maximum-entropy method (abbreviated here as C-MaxEnt) is formulated for selecting prior probability distributions in Bayesian statistics for parameter estimation. This method is inspired by a statistical-mechanical approach to systems governed by dynamics with largely separated time scales and is based on three key concepts: conjugate pairs of variables, dimensionless integration measures with coarse-graining factors and partial maximization of the joint entropy. The method enables one to calculate a prior purely from a likelihood in a simple way. It is shown, in particular, how it not only yields Jeffreys's rules but also reveals new structures hidden behind them.
Directory of Open Access Journals (Sweden)
Victor V. Nikitin
2013-01-01
The article introduces an algorithm for estimating the investment potential of Russia’s regions, developed by means of multivariate statistical methods, and determines the factors reflecting the regions’ investment state. An integral indicator was developed on their basis, using statistical data. The article presents a classification of the regions on the basis of the integral index.
Ridge regression estimator: combining unbiased and ordinary ridge regression methods of estimation
Directory of Open Access Journals (Sweden)
Sharad Damodar Gore
2009-10-01
Statistical literature has several methods for coping with multicollinearity. This paper introduces a new shrinkage estimator, called modified unbiased ridge (MUR). This estimator is obtained from unbiased ridge regression (URR) in the same way that ordinary ridge regression (ORR) is obtained from ordinary least squares (OLS). Properties of MUR are derived. Results on its matrix mean squared error (MMSE) are obtained. MUR is compared with ORR and URR in terms of MMSE. These results are illustrated with an example based on data generated by Hoerl and Kennard (1975).
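The relationship among the four estimators can be written out as follows. This is a sketch based on the standard definitions of OLS, ORR and URR; the MUR line is one reading of "in the same way" and should be checked against the paper. The symbols (design matrix X, response y, ridge parameter k, prior vector J) are assumptions introduced here.

```latex
% Ordinary least squares
\hat{\beta} = (X'X)^{-1} X'y

% Ordinary ridge regression, rewritten as a shrinkage of OLS
\hat{\beta}(k) = (X'X + kI)^{-1} X'y
              = \left[ I - k\,(X'X + kI)^{-1} \right] \hat{\beta}

% Unbiased ridge regression with prior information J
\hat{\beta}(k, J) = (X'X + kI)^{-1} (X'y + kJ)

% Modified unbiased ridge: the same shrinkage operator applied to URR
\hat{\beta}_{J}(k) = \left[ I - k\,(X'X + kI)^{-1} \right] \hat{\beta}(k, J)
```

The second equality for ORR follows from (X'X + kI)^{-1} X'X = I - k (X'X + kI)^{-1}.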
Coalescent methods for estimating phylogenetic trees.
Liu, Liang; Yu, Lili; Kubatko, Laura; Pearl, Dennis K; Edwards, Scott V
2009-10-01
We review recent models to estimate phylogenetic trees under the multispecies coalescent. Although the distinction between gene trees and species trees has come to the fore of phylogenetics, only recently have methods been developed that explicitly estimate species trees. Of the several factors that can cause gene tree heterogeneity and discordance with the species tree, deep coalescence due to random genetic drift in branches of the species tree has been modeled most thoroughly. Bayesian approaches to estimating species trees utilize two likelihood functions, one of which has been widely used in traditional phylogenetics and involves the model of nucleotide substitution, and the second of which is less familiar to phylogeneticists and involves the probability distribution of gene trees given a species tree. Other recent parametric and nonparametric methods for estimating species trees involve parsimony criteria, summary statistics, supertree and consensus methods. Species tree approaches are an appropriate goal for systematics, appear to work well in some cases where concatenation can be misleading, and suggest that sampling many independent loci will be paramount. Such methods can also be challenging to implement because of the complexity of the models and computational time. In addition, further elaboration of the simplest of coalescent models will be required to incorporate commonly known issues such as deviation from the molecular clock, gene flow and other genetic forces.
A simple method for estimating the convection-dispersion equation ...
African Journals Online (AJOL)
Jane
2011-08-31
Aug 31, 2011 ... approach of modeling solute transport in porous media uses the deterministic ... Methods of estimating CDE transport parameters can be divided into statistical ..... diffusion-type model for longitudinal mixing of fluids in flow.
Statistical methods towards more efficient infiltration measurements.
Franz, T; Krebs, P
2006-01-01
Comprehensive knowledge of the infiltration situation in a catchment is required for operation and maintenance. Due to the high expenditures, an optimisation of the necessary measurement campaigns is essential. Methods based on multivariate statistics were developed to improve the information yield of measurements by identifying appropriate gauge locations. The methods have a high degree of freedom with respect to data needs. They were successfully tested on real and artificial data. For suitable catchments, it is estimated that the optimisation potential amounts to an accuracy improvement of up to 30% compared to non-optimised gauge distributions. Besides this, a correlation between independent reach parameters and dependent infiltration rates could be identified, which is not dominated by the groundwater head.
Statistical image reconstruction methods for simultaneous emission/transmission PET scans
International Nuclear Information System (INIS)
Erdogan, H.; Fessler, J.A.
1996-01-01
Transmission scans are necessary for estimating the attenuation correction factors (ACFs) to yield quantitatively accurate PET emission images. To reduce the total scan time, post-injection transmission scans have been proposed in which one can simultaneously acquire emission and transmission data using rod sources and sinogram windowing. However, since the post-injection transmission scans are corrupted by emission coincidences, accurate correction for attenuation becomes more challenging. Conventional methods (emission subtraction) for ACF computation from post-injection scans are suboptimal and require relatively long scan times. We introduce statistical methods based on penalized-likelihood objectives to compute ACFs and then use them to reconstruct lower-noise PET emission images from simultaneous transmission/emission scans. Simulations show the efficacy of the proposed methods. These methods improve image quality and SNR of the estimates as compared to conventional methods.
Statistical methods applied to gamma-ray spectroscopy algorithms in nuclear security missions.
Fagan, Deborah K; Robinson, Sean M; Runkle, Robert C
2012-10-01
Gamma-ray spectroscopy is a critical research and development priority to a range of nuclear security missions, specifically the interdiction of special nuclear material involving the detection and identification of gamma-ray sources. We categorize existing methods by the statistical methods on which they rely and identify methods that have yet to be considered. Current methods estimate the effect of counting uncertainty but in many cases do not address larger sources of decision uncertainty, which may be significantly more complex. Thus, significantly improving algorithm performance may require greater coupling between the problem physics that drives data acquisition and statistical methods that analyze such data. Untapped statistical methods, such as Bayes Modeling Averaging and hierarchical and empirical Bayes methods, could reduce decision uncertainty by rigorously and comprehensively incorporating all sources of uncertainty. Application of such methods should further meet the needs of nuclear security missions by improving upon the existing numerical infrastructure for which these analyses have not been conducted.
A SOFTWARE RELIABILITY ESTIMATION METHOD TO NUCLEAR SAFETY SOFTWARE
Directory of Open Access Journals (Sweden)
GEE-YONG PARK
2014-02-01
A method for estimating software reliability for nuclear safety software is proposed in this paper. This method is based on the software reliability growth model (SRGM), where the behavior of software failure is assumed to follow a non-homogeneous Poisson process. Two types of modeling schemes based on a particular underlying method are proposed in order to more precisely estimate and predict the number of software defects based on very rare software failure data. Bayesian statistical inference is employed to estimate the model parameters by incorporating software test cases as a covariate into the model. It was found that these models are capable of reasonably estimating the remaining number of software defects that directly affect the reactor trip functions. The software reliability might be estimated from these modeling equations, and one approach to obtaining a software reliability value is proposed in this paper.
Register-based statistics statistical methods for administrative data
Wallgren, Anders
2014-01-01
This book provides a comprehensive and up to date treatment of theory and practical implementation in Register-based statistics. It begins by defining the area, before explaining how to structure such systems, as well as detailing alternative approaches. It explains how to create statistical registers, how to implement quality assurance, and the use of IT systems for register-based statistics. Further to this, clear details are given about the practicalities of implementing such statistical methods, such as protection of privacy and the coordination and coherence of such an undertaking. Thi
Moddemeijer, R
In the case of two signals with independent pairs of observations (x(n),y(n)) a statistic to estimate the variance of the histogram based mutual information estimator has been derived earlier. We present such a statistic for dependent pairs. To derive this statistic it is necessary to avail of a
Evolutionary Computation Methods and their applications in Statistics
Directory of Open Access Journals (Sweden)
Francesco Battaglia
2013-05-01
A brief discussion of the genesis of evolutionary computation methods, their relationship to artificial intelligence, and the contribution of genetics and Darwin’s theory of natural evolution is provided. Then, the main evolutionary computation methods are illustrated: evolution strategies, genetic algorithms, estimation of distribution algorithms, differential evolution, and a brief description of some evolutionary behavior methods such as ant colony and particle swarm optimization. We also discuss the role of the genetic algorithm for multivariate probability distribution random generation, rather than as a function optimizer. Finally, some relevant applications of genetic algorithm to statistical problems are reviewed: selection of variables in regression, time series model building, outlier identification, cluster analysis, design of experiments.
Performance evaluation of the spectral centroid downshift method for attenuation estimation.
Samimi, Kayvan; Varghese, Tomy
2015-05-01
Estimation of frequency-dependent ultrasonic attenuation is an important aspect of tissue characterization. Along with other acoustic parameters studied in quantitative ultrasound, the attenuation coefficient can be used to differentiate normal and pathological tissue. The spectral centroid downshift (CDS) method is one of the most common frequency-domain approaches applied to this problem. In this study, a statistical analysis of this method's performance was carried out based on a parametric model of the signal power spectrum in the presence of electronic noise. The parametric model used for the power spectrum of received RF data assumes a Gaussian spectral profile for the transmit pulse, and incorporates effects of attenuation, windowing, and electronic noise. Spectral moments were calculated and used to estimate second-order centroid statistics. A theoretical expression for the variance of a maximum likelihood estimator of attenuation coefficient was derived in terms of the centroid statistics and other model parameters, such as transmit pulse center frequency and bandwidth, RF data window length, SNR, and number of regression points. Theoretically predicted estimation variances were compared with experimentally estimated variances on RF data sets from both computer-simulated and physical tissue-mimicking phantoms. Scan parameter ranges for this study were electronic SNR from 10 to 70 dB, transmit pulse standard deviation from 0.5 to 4.1 MHz, transmit pulse center frequency from 2 to 8 MHz, and data window length from 3 to 17 mm. Acceptable agreement was observed between theoretical predictions and experimentally estimated values with differences smaller than 0.05 dB/cm/MHz across the parameter ranges investigated. This model helps predict the best attenuation estimation variance achievable with the CDS method, in terms of said scan parameters.
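The principle behind the centroid downshift method can be sketched with a noise-free toy model. The assumptions here are not taken from the paper: a Gaussian power spectrum with center frequency f0 and standard deviation sigma, attenuation linear in frequency, and a round-trip power loss factor exp(-4 * beta * f * z) with beta in Np/cm/MHz. Under these assumptions the centroid falls linearly with depth, f_c(z) = f0 - 4 * beta * sigma**2 * z, so beta follows from the regression slope. Windowing and electronic noise, which the paper models, are omitted, and all names and values are hypothetical.

```python
import math


def spectral_centroid(freqs, power):
    """First moment of the power spectrum divided by its zeroth moment."""
    return sum(f * p for f, p in zip(freqs, power)) / sum(power)


def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line through (xs, ys)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


# Hypothetical scan parameters (MHz, Np/cm/MHz, cm).
f0, sigma, beta_true = 5.0, 1.0, 0.05
freqs = [i * 0.01 for i in range(1001)]   # 0 to 10 MHz
depths = [i * 0.5 for i in range(9)]      # 0 to 4 cm

centroids = []
for z in depths:
    # Gaussian transmit spectrum multiplied by the assumed attenuation term.
    power = [
        math.exp(-((f - f0) ** 2) / (2 * sigma ** 2) - 4 * beta_true * f * z)
        for f in freqs
    ]
    centroids.append(spectral_centroid(freqs, power))

# Invert f_c(z) = f0 - 4 * beta * sigma^2 * z for beta.
beta_hat = -least_squares_slope(depths, centroids) / (4 * sigma ** 2)
beta_hat_db = beta_hat * 20 / math.log(10)  # Np -> dB conversion

print(f"estimated attenuation: {beta_hat:.4f} Np/cm/MHz "
      f"({beta_hat_db:.3f} dB/cm/MHz)")
```

The number of depths plays the role of the "number of regression points" named in the abstract; with noisy centroids, the variance of the slope estimate would depend on it.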
Fast and Statistically Efficient Fundamental Frequency Estimation
DEFF Research Database (Denmark)
Nielsen, Jesper Kjær; Jensen, Tobias Lindstrøm; Jensen, Jesper Rindom
2016-01-01
Fundamental frequency estimation is a very important task in many applications involving periodic signals. For computational reasons, fast autocorrelation-based estimation methods are often used despite parametric estimation methods having superior estimation accuracy. However, these parametric...... a recursive solver. Via benchmarks, we demonstrate that the computation time is reduced by approximately two orders of magnitude. The proposed fast algorithm is available for download online....
Statistical methods for nuclear material management
International Nuclear Information System (INIS)
Bowen, W.M.; Bennett, C.A.
1988-12-01
This book is intended as a reference manual of statistical methodology for nuclear material management practitioners. It describes statistical methods currently or potentially important in nuclear material management, explains the choice of methods for specific applications, and provides examples of practical applications to nuclear material management problems. Together with the accompanying training manual, which contains fully worked out problems keyed to each chapter, this book can also be used as a textbook for courses in statistical methods for nuclear material management. It should provide increased understanding and guidance to help improve the application of statistical methods to nuclear material management problems.
International Nuclear Information System (INIS)
Yamaoka, Naoto; Watanabe, Wataru; Hontani, Hidekata
2010-01-01
When constructing a statistical point cloud model, corresponding points must usually be calculated, and the resulting statistical model differs depending on the method used to calculate them. This article examines how different methods of calculating corresponding points affect a statistical model of a human organ. We validated the performance of the statistical models by registering the surface of an organ in a 3D medical image. We compare two methods of calculating corresponding points. The first, 'Generalized Multi-Dimensional Scaling (GMDS)', determines the corresponding points from the shapes of two curved surfaces. The second, the 'Entropy-based Particle system', chooses corresponding points by treating a number of curved surfaces statistically. We constructed statistical models with each method and used them to perform registration with the medical image. For the estimation, we use non-parametric belief propagation, which estimates not only the position of the organ but also the probability density of the organ position. We evaluate how the two methods of calculating corresponding points affect the statistical model through the change in the probability density of each point. (author)
Targeted estimation of nuisance parameters to obtain valid statistical inference.
van der Laan, Mark J
2014-01-01
In order to obtain concrete results, we focus on estimation of the treatment specific mean, controlling for all measured baseline covariates, based on observing independent and identically distributed copies of a random variable consisting of baseline covariates, a subsequently assigned binary treatment, and a final outcome. The statistical model only assumes possible restrictions on the conditional distribution of treatment, given the covariates, the so-called propensity score. Estimators of the treatment specific mean involve estimation of the propensity score and/or estimation of the conditional mean of the outcome, given the treatment and covariates. In order to make these estimators asymptotically unbiased at any data distribution in the statistical model, it is essential to use data-adaptive estimators of these nuisance parameters such as ensemble learning, and specifically super-learning. Because such estimators involve optimal trade-off of bias and variance w.r.t. the infinite dimensional nuisance parameter itself, they result in a sub-optimal bias/variance trade-off for the resulting real-valued estimator of the estimand. We demonstrate that additional targeting of the estimators of these nuisance parameters guarantees that this bias for the estimand is second order and thereby allows us to prove theorems that establish asymptotic linearity of the estimator of the treatment specific mean under regularity conditions. These insights result in novel targeted minimum loss-based estimators (TMLEs) that use ensemble learning with additional targeted bias reduction to construct estimators of the nuisance parameters. In particular, we construct collaborative TMLEs (C-TMLEs) with known influence curve allowing for statistical inference, even though these C-TMLEs involve variable selection for the propensity score based on a criterion that measures how effective the resulting fit of the propensity score is in removing bias for the estimand. As a particular special
Kergadallan, Xavier; Bernardara, Pietro; Benoit, Michel; Andreewsky, Marc; Weiss, Jérôme
2013-04-01
Estimating the probability of occurrence of extreme sea levels is a central issue for the protection of the coast. Return periods of sea level with wave set-up contribution are estimated here at one site: Cherbourg, France, in the English Channel. The methodology follows two steps: the first is computation of the joint probability of simultaneous wave height and still sea level; the second is interpretation of those joint probabilities to assess a sea level for a given return period. Two different approaches were evaluated to compute the joint probability of simultaneous wave height and still sea level: the first is multivariate extreme value distributions of logistic type, in which all components of the variables become large simultaneously; the second is a conditional approach for multivariate extreme values, in which only one component of the variables has to be large. Two different methods were applied to estimate sea level with wave set-up contribution for a given return period: Monte Carlo simulation, in which estimation is more accurate but needs higher calculation time, and classical ocean engineering design contours of inverse-FORM type, in which the method is simpler and allows more complex estimation of the wave set-up part (wave propagation to the coast, for example). We compare results from the two different approaches with the two different methods. To be able to use both the Monte Carlo simulation and design contour methods, wave set-up is estimated with a simple empirical formula. We show advantages of the conditional approach compared to the multivariate extreme values approach when extreme sea level occurs when either surge or wave height is large. We discuss the validity of the ocean engineering design contours method, which is an alternative when computation of sea levels is too complex to use the Monte Carlo simulation method.
Gagnon, Pieter; Margolis, Robert; Melius, Jennifer; Phillips, Caleb; Elmore, Ryan
2018-02-01
We provide a detailed estimate of the technical potential of rooftop solar photovoltaic (PV) electricity generation throughout the contiguous United States. This national estimate is based on an analysis of select US cities that combines light detection and ranging (lidar) data with a validated analytical method for determining rooftop PV suitability employing geographic information systems. We use statistical models to extend this analysis to estimate the quantity and characteristics of roofs in areas not covered by lidar data. Finally, we model PV generation for all rooftops to yield technical potential estimates. At the national level, 8.13 billion m² of suitable roof area could host 1118 GW of PV capacity, generating 1432 TWh of electricity per year. This would equate to 38.6% of the electricity that was sold in the contiguous United States in 2013. This estimate is substantially higher than a previous estimate made by the National Renewable Energy Laboratory. The difference can be attributed to increases in PV module power density, improved estimation of building suitability, higher estimates of total number of buildings, and improvements in PV performance simulation tools that previously tended to underestimate productivity. Also notable, the nationwide percentage of buildings suitable for at least some PV deployment is high: 82% for buildings smaller than 5000 ft² and over 99% for buildings larger than that. In most states, rooftop PV could enable small, mostly residential buildings to offset the majority of average household electricity consumption. Even in some states with a relatively poor solar resource, such as those in the Northeast, the residential sector has the potential to offset around 100% of its total electricity consumption with rooftop PV.
Arevalo, P. A.; Olofsson, P.; Woodcock, C. E.
2017-12-01
Unbiased estimation of the areas of conversion between land categories ("activity data") and their uncertainty is crucial for providing more robust calculations of carbon emissions to the atmosphere, as well as their removals. This is particularly important for the REDD+ mechanism of UNFCCC where an economic compensation is tied to the magnitude and direction of such fluxes. Dense time series of Landsat data and statistical protocols are becoming an integral part of forest monitoring efforts, but there are relatively few studies in the tropics focused on using these methods to advance operational MRV systems (Monitoring, Reporting and Verification). We present the results of a prototype methodology for continuous monitoring and unbiased estimation of activity data that is compliant with the IPCC Approach 3 for representation of land. We used a break detection algorithm (Continuous Change Detection and Classification, CCDC) to fit pixel-level temporal segments to time series of Landsat data in the Colombian Amazon. The segments were classified using a Random Forest classifier to obtain annual maps of land categories between 2001 and 2016. Using these maps, a biannual stratified sampling approach was implemented and unbiased stratified estimators constructed to calculate area estimates with confidence intervals for each of the stable and change classes. Our results provide evidence of a decrease in primary forest as a result of conversion to pastures, as well as increase in secondary forest as pastures are abandoned and the forest allowed to regenerate. Estimating areas of other land transitions proved challenging because of their very small mapped areas compared to stable classes like forest, which corresponds to almost 90% of the study area. Implications on remote sensing data processing, sample allocation and uncertainty reduction are also discussed.
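The stratified area estimation described in the abstract above can be illustrated with a minimal Python sketch. It uses the textbook stratified estimator of a proportion (stratum weights times within-stratum sample proportions) and a normal-approximation confidence interval; it is not the authors' implementation. The function name `stratified_area_estimate`, the stratum weights, sample counts, and total area are all hypothetical values chosen for illustration.

```python
import math

def stratified_area_estimate(strata, total_area, z=1.96):
    """Estimate the area of one class from a stratified random sample.

    strata: list of (weight, n_sampled, n_in_class) tuples, where weight is
            the stratum's share of the mapped area (weights sum to 1).
    Returns (area_estimate, ci_half_width) in the units of total_area.
    """
    p_hat = 0.0
    variance = 0.0
    for weight, n_sampled, n_in_class in strata:
        p_h = n_in_class / n_sampled              # class proportion in stratum h
        p_hat += weight * p_h                     # weighted sum across strata
        variance += weight ** 2 * p_h * (1.0 - p_h) / (n_sampled - 1)
    std_err = math.sqrt(variance)
    return total_area * p_hat, total_area * z * std_err

# Hypothetical example: a large stable-forest stratum, a buffer stratum,
# and a small mapped-change stratum (numbers are made up).
strata = [(0.90, 200, 2), (0.07, 100, 10), (0.03, 100, 85)]
area, half_width = stratified_area_estimate(strata, total_area=1_000_000.0)
print(f"estimated area: {area:.0f} +/- {half_width:.0f}")
```

The sketch shows why small mapped classes are hard to estimate: a few sample units of the target class found in a very large stratum contribute a large share of the variance.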
Dumedah, Gift; Walker, Jeffrey P.; Chik, Li
2014-07-01
Soil moisture information is critically important for water management operations including flood forecasting, drought monitoring, and groundwater recharge estimation. While an accurate and continuous record of soil moisture is required for these applications, the available soil moisture data, in practice, are typically fraught with missing values. There is a wide range of methods available for infilling hydrologic variables, but a thorough inter-comparison between statistical methods and artificial neural networks has not been made. This study examines 5 statistical methods including monthly averages, weighted Pearson correlation coefficient, a method based on temporal stability of soil moisture, and a weighted merging of the three methods, together with a method based on the concept of rough sets. Additionally, 9 artificial neural networks are examined, broadly categorized into feedforward, dynamic, and radial basis networks. These 14 infilling methods were used to estimate missing soil moisture records and subsequently validated against known values for 13 soil moisture monitoring stations for three different soil layer depths in the Yanco region in southeast Australia. The evaluation results show that the three highest performing methods are the nonlinear autoregressive neural network, the rough sets method, and monthly replacement. A high estimation accuracy (root mean square error (RMSE) of about 0.03 m/m) was found for the nonlinear autoregressive network, due to its regression-based dynamic network, which allows feedback connections through discrete-time estimation. A similarly high accuracy (0.05 m/m RMSE) for the rough sets procedure illustrates the important role of temporal persistence of soil moisture, with the capability to account for different soil moisture conditions.
Statistical methods in quality assurance
International Nuclear Information System (INIS)
Eckhard, W.
1980-01-01
During the different phases of a production process (planning, development and design, manufacturing, assembling, etc.), most decisions rest on statistics: the collection, analysis and interpretation of data. Statistical methods can be thought of as a kit of tools that help to solve problems in the quality functions of the quality loop, with the aim of producing quality products and reducing quality costs. Various statistical methods are presented, and typical examples of their practical application are demonstrated. (RW)
On the Methods for Estimating the Corneoscleral Limbus.
Jesus, Danilo A; Iskander, D Robert
2017-08-01
The aim of this study was to develop computational methods for estimating limbus position based on measurements of three-dimensional (3-D) corneoscleral topography and to ascertain whether the corneoscleral limbus routinely estimated from the frontal image corresponds to that derived from topographical information. Two new computational methods for estimating the limbus position are proposed: one based on approximating the raw anterior eye height data by a series of Zernike polynomials, and one that combines the 3-D corneoscleral topography with the frontal grayscale image acquired with the digital camera built into the profilometer. The proposed methods are contrasted with a previously described image-only procedure and with a technique of manual image annotation. The estimates of corneoscleral limbus radius were characterized by high precision. The group average (mean ± standard deviation) of the maximum difference between estimates derived from all considered methods was 0.27 ± 0.14 mm and reached up to 0.55 mm. The four estimation methods led to statistically significant differences (nonparametric analysis of variance (ANOVA) test, p < 0.05). Precise topographical limbus demarcation is possible either from frontal digital images of the eye or from 3-D topographical information of the corneoscleral region. However, the results demonstrated that the corneoscleral limbus estimated from the anterior eye topography does not always correspond to that obtained through image-only techniques. The experimental findings have shown that 3-D topography of the anterior eye, in the absence of a gold standard, has the potential to become a new computational methodology for estimating the corneoscleral limbus.
A new method to assess the statistical convergence of Monte Carlo solutions
International Nuclear Information System (INIS)
Forster, R.A.
1991-01-01
Accurate Monte Carlo confidence intervals (CIs), which are formed with an estimated mean and an estimated standard deviation, can only be created when the number of particle histories N becomes large enough so that the central limit theorem can be applied. The Monte Carlo user has a limited number of marginal methods to assess the fulfillment of this condition, such as statistical error reduction proportional to 1/√N with error magnitude guidelines and third and fourth moment estimators. A new method is presented here to assess the statistical convergence of Monte Carlo solutions by analyzing the shape of the empirical probability density function (PDF) of history scores. Related work in this area includes the derivation of analytic score distributions for a two-state Monte Carlo problem. Score distribution histograms have been generated to determine when a small number of histories accounts for a large fraction of the result. This summary describes initial studies of empirical Monte Carlo history score PDFs created from score histograms of particle transport simulations. 7 refs., 1 fig
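The convergence checks mentioned in this abstract can be sketched in a few lines of Python. The example below is only an illustration under assumed conditions: history scores are drawn from an exponential distribution (a stand-in, not a transport simulation), the standard error is tracked as N grows to show the 1/√N trend, and the share of the total contributed by the largest 1% of scores is reported as a crude look at the tail of the empirical score distribution. The function name `mc_estimate` and the chosen N values are hypothetical.

```python
import math
import random
import statistics

def mc_estimate(n_histories, rng):
    """Return (mean, standard error, share of total from the top 1% of scores)."""
    # Stand-in for per-history scores of a Monte Carlo tally (assumption).
    scores = [rng.expovariate(1.0) for _ in range(n_histories)]
    mean = statistics.fmean(scores)
    std_err = statistics.stdev(scores) / math.sqrt(n_histories)
    # Fraction of the tally carried by the largest 1% of histories.
    top = sorted(scores, reverse=True)[: max(1, n_histories // 100)]
    top_share = sum(top) / sum(scores)
    return mean, std_err, top_share

rng = random.Random(0)
results = {n: mc_estimate(n, rng) for n in (100, 1_600, 25_600)}
for n, (mean, std_err, top_share) in results.items():
    print(f"N={n:6d}  mean={mean:.4f}  std_err={std_err:.4f}  top1%={top_share:.3f}")
```

Each 16-fold increase in N should cut the standard error roughly fourfold when the central limit theorem applies; a top-1% share that stays large would suggest that a few histories dominate the result.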
Lu, Qiongshi; Li, Boyang; Ou, Derek; Erlendsdottir, Margret; Powles, Ryan L; Jiang, Tony; Hu, Yiming; Chang, David; Jin, Chentian; Dai, Wei; He, Qidu; Liu, Zefeng; Mukherjee, Shubhabrata; Crane, Paul K; Zhao, Hongyu
2017-12-07
Despite the success of large-scale genome-wide association studies (GWASs) on complex traits, our understanding of their genetic architecture is far from complete. Jointly modeling multiple traits' genetic profiles has provided insights into the shared genetic basis of many complex traits. However, large-scale inference sets a high bar for both statistical power and biological interpretability. Here we introduce a principled framework to estimate annotation-stratified genetic covariance between traits using GWAS summary statistics. Through theoretical and numerical analyses, we demonstrate that our method provides accurate covariance estimates, thereby enabling researchers to dissect both the shared and distinct genetic architecture across traits to better understand their etiologies. Among 50 complex traits with publicly accessible GWAS summary statistics (N total ≈ 4.5 million), we identified more than 170 pairs with statistically significant genetic covariance. In particular, we found strong genetic covariance between late-onset Alzheimer disease (LOAD) and amyotrophic lateral sclerosis (ALS), two major neurodegenerative diseases, in single-nucleotide polymorphisms (SNPs) with high minor allele frequencies and in SNPs located in the predicted functional genome. Joint analysis of LOAD, ALS, and other traits highlights LOAD's correlation with cognitive traits and hints at an autoimmune component for ALS. Copyright © 2017 American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
Estimating the Probability of Traditional Copying, Conditional on Answer-Copying Statistics.
Allen, Jeff; Ghattas, Andrew
2016-06-01
Statistics for detecting copying on multiple-choice tests produce p values measuring the probability of a value at least as large as that observed, under the null hypothesis of no copying. The posterior probability of copying is arguably more relevant than the p value, but cannot be derived from Bayes' theorem unless the population probability of copying and probability distribution of the answer-copying statistic under copying are known. In this article, the authors develop an estimator for the posterior probability of copying that is based on estimable quantities and can be used with any answer-copying statistic. The performance of the estimator is evaluated via simulation, and the authors demonstrate how to apply the formula using actual data. Potential uses, generalizability to other types of cheating, and limitations of the approach are discussed.
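The dependence on unknown quantities noted in this abstract can be made concrete with a small Python sketch of Bayes' theorem. This is not the estimator developed in the article; it simply assumes that the population probability of copying and the densities of the statistic under copying and under no copying are known. The function name `posterior_copying` and all numeric inputs are hypothetical.

```python
def posterior_copying(prior, density_copy, density_null):
    """Posterior probability of copying given an observed statistic value.

    prior:        assumed population probability of copying
    density_copy: density of the observed statistic under copying (assumed known)
    density_null: density of the observed statistic under no copying
    """
    numerator = prior * density_copy
    return numerator / (numerator + (1.0 - prior) * density_null)

# Hypothetical inputs: copying is rare (1%), but the observed value is
# 100 times more likely under copying than under the null.
print(posterior_copying(prior=0.01, density_copy=0.2, density_null=0.002))
```

Even a very small p value can correspond to a moderate posterior probability when copying is rare, which is the motivation for reporting the posterior rather than the p value alone.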
Hu, Juju; Hu, Haijiang; Ji, Yinghua
2010-03-15
Periodic nonlinearity, which ranges from a few nanometers to tens of nanometers in heterodyne interferometers, limits their use in high-accuracy measurement. A novel method is studied to detect the nonlinearity errors, based on electrical subdivision and statistical signal analysis in a heterodyne Michelson interferometer. With the micropositioning platform moving at uniform velocity, the method can detect the nonlinearity errors by using regression analysis and jackknife estimation. Simulations indicate that the method can estimate the influence of nonlinearity errors and other noise on dimensional measurement in a heterodyne Michelson interferometer.
A Statistic-Based Calibration Method for TIADC System
Directory of Open Access Journals (Sweden)
Kuojun Yang
2015-01-01
The time-interleaved technique is widely used to increase the sampling rate of analog-to-digital converters (ADCs). However, channel mismatches degrade the performance of a time-interleaved ADC (TIADC). Therefore, a statistic-based calibration method for TIADCs is proposed in this paper. The average value of the sampling points is used to calculate the offset error, and the summation of the sampling points is used to calculate the gain error. After the offset and gain errors are obtained, they are calibrated by offset and gain adjustment elements in the ADC. Timing skew is calibrated by an iterative method, in which the product of the sampling points of two adjacent subchannels is used as the calibration metric. The proposed method is employed to calibrate mismatches in a four-channel 5 GS/s TIADC system. Simulation results show that the proposed method can estimate mismatches accurately over a wide frequency range. It is also shown that an accurate estimate can be obtained even if the signal-to-noise ratio (SNR) of the input signal is 20 dB. Furthermore, results obtained from a real four-channel 5 GS/s TIADC system demonstrate the effectiveness of the proposed method: the spectral spurs due to mismatches are effectively eliminated after calibration.
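The offset and gain estimation steps in this abstract can be illustrated with a short Python sketch. It is a simplified stand-in rather than the paper's algorithm: offset is taken as the mean of each subchannel's samples, and relative gain is taken as the ratio of summed absolute deviations against channel 0, which is one plausible reading of "summation of sampling points" and should be treated as an assumption. Timing-skew calibration is omitted. The functions `simulate_tiadc` and `estimate_mismatch`, the test tone, and the mismatch values are hypothetical.

```python
import math

def simulate_tiadc(n_samples, offsets, gains, freq=0.01234):
    """Interleave a sine tone across len(offsets) subchannels with mismatches.

    freq is in cycles per sample (hypothetical test tone).
    """
    n_channels = len(offsets)
    channels = [[] for _ in range(n_channels)]
    for k in range(n_samples):
        ch = k % n_channels                      # round-robin interleaving
        x = math.sin(2.0 * math.pi * freq * k)
        channels[ch].append(gains[ch] * x + offsets[ch])
    return channels

def estimate_mismatch(channels):
    """Estimate per-channel offset and gain relative to channel 0."""
    # Offset: average of each subchannel's samples.
    offsets = [sum(c) / len(c) for c in channels]
    # Gain: summed absolute deviation from the offset, relative to channel 0
    # (assumed form of the summation statistic).
    sums = [sum(abs(v - o) for v in c) for c, o in zip(channels, offsets)]
    gains = [s / sums[0] for s in sums]
    return offsets, gains

true_offsets = [0.0, 0.02, -0.01, 0.015]         # made-up mismatch values
true_gains = [1.0, 1.03, 0.98, 1.01]
channels = simulate_tiadc(40_000, true_offsets, true_gains)
est_offsets, est_gains = estimate_mismatch(channels)
print([round(o, 4) for o in est_offsets])
print([round(g, 4) for g in est_gains])
```

The estimates rely on every subchannel seeing the same input statistics over a long record, so the sketch uses a tone that is not harmonically related to the per-channel sampling rate.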
Statistical methods for ranking data
Alvo, Mayer
2014-01-01
This book introduces advanced undergraduate, graduate students and practitioners to statistical methods for ranking data. An important aspect of nonparametric statistics is oriented towards the use of ranking data. Rank correlation is defined through the notion of distance functions and the notion of compatibility is introduced to deal with incomplete data. Ranking data are also modeled using a variety of modern tools such as CART, MCMC, EM algorithm and factor analysis. This book deals with statistical methods used for analyzing such data and provides a novel and unifying approach for hypotheses testing. The techniques described in the book are illustrated with examples and the statistical software is provided on the authors’ website.
Statistical methods in nuclear theory
International Nuclear Information System (INIS)
Shubin, Yu.N.
1974-01-01
The paper outlines statistical methods which are widely used for describing properties of excited states of nuclei and nuclear reactions. It discusses the physical assumptions underlying the known distributions of level spacings (Wigner, Poisson distributions) and of widths of highly excited states (Porter-Thomas distribution), as well as the assumptions used in the statistical theory of nuclear reactions and in fluctuation analysis. The author considers the random matrix method, which consists in replacing the matrix elements of a residual interaction by random variables with a simple statistical distribution. Experimental data are compared with results of calculations using the statistical model. The superfluid nucleus model is considered with regard to superconducting-type pair correlations.
Reverse survival method of fertility estimation: An evaluation
Directory of Open Access Journals (Sweden)
Thomas Spoorenberg
2014-07-01
Background: For the most part, demographers have relied on the ever-growing body of sample surveys collecting full birth history to derive total fertility estimates in less statistically developed countries. Yet alternative methods of fertility estimation can return very consistent total fertility estimates by using only basic demographic information. Objective: This paper evaluates the consistency and sensitivity of the reverse survival method, a fertility estimation method based on population data by age and sex collected in one census or a single-round survey. Methods: A simulated population was first projected over 15 years using a set of fertility and mortality age and sex patterns. The projected population was then reverse survived using the Excel template FE_reverse_4.xlsx, provided with Timæus and Moultrie (2012). Reverse survival fertility estimates were then compared for consistency to the total fertility rates used to project the population. The sensitivity was assessed by introducing a series of distortions in the projection of the population and comparing the difference implied in the resulting fertility estimates. Results: The reverse survival method produces total fertility estimates that are very consistent and hardly affected by erroneous assumptions on the age distribution of fertility or by the use of incorrect mortality levels, trends, and age patterns. The quality of the age and sex population data that is 'reverse survived' determines the consistency of the estimates. The contribution of the method for the estimation of past and present trends in total fertility is illustrated through its application to the population data of five countries characterized by distinct fertility levels and data quality issues. Conclusions: Notwithstanding its simplicity, the reverse survival method of fertility estimation has seldom been applied. The method can be applied to a large body of existing and easily available population data
An estimator for statistical anisotropy from the CMB bispectrum
International Nuclear Information System (INIS)
Bartolo, N.; Dimastrogiovanni, E.; Matarrese, S.; Liguori, M.; Riotto, A.
2012-01-01
Various data analyses of the Cosmic Microwave Background (CMB) provide observational hints of statistical isotropy breaking. Some of these features can be studied within the framework of primordial vector fields in inflationary theories, which generally display some level of statistical anisotropy both in the power spectrum and in higher-order correlation functions. Motivated by these observations and the recent theoretical developments in the study of primordial vector fields, we develop the formalism necessary to extract statistical anisotropy information from the three-point function of the CMB temperature anisotropy. We employ a simplified vector field model and parametrize the bispectrum of curvature fluctuations in such a way that all the information about statistical anisotropy is encoded in some parameters λ_LM (which measure the ratio of the anisotropic to the isotropic bispectrum amplitudes). For such a template bispectrum, we compute an optimal estimator for λ_LM and the expected signal-to-noise ratio. We estimate that, for f_NL ≅ 30, an experiment like Planck can be sensitive to a ratio of the anisotropic to the isotropic amplitudes of the bispectrum as small as 10%. Our results are complementary to the information coming from a power spectrum analysis and particularly relevant for those models where statistical anisotropy turns out to be suppressed in the power spectrum but not negligible in the bispectrum.
Direction-of-Arrival Estimation Based on Sparse Recovery with Second-Order Statistics
Directory of Open Access Journals (Sweden)
H. Chen
2015-04-01
Traditional direction-of-arrival (DOA) estimation techniques perform Nyquist-rate sampling of the received signals and as a result require high storage. To reduce the sampling ratio, we introduce level-crossing (LC) sampling, which captures samples whenever the signal crosses predetermined reference levels; the LC-based analog-to-digital converter (LC ADC) has been shown to efficiently sample certain classes of signals. In this paper, we address the DOA estimation problem using second-order statistics based on the LC samples recorded on one sensor together with the synchronous samples of the other sensors; a sparse representation in angle space can then be found by solving an $\ell_1$ minimization problem, giving the number of sources and their DOAs. The experimental results show that our proposed method, when compared with some existing norm-based constrained optimization compressive sensing (CS) algorithms, as well as the subspace method, improves the DOA estimation performance, while using fewer samples than Nyquist-rate sampling and reducing sensor activity, especially for signals with long silent periods.
Statistical Methods in Integrative Genomics
Richardson, Sylvia; Tseng, George C.; Sun, Wei
2016-01-01
Statistical methods in integrative genomics aim to answer important biology questions by jointly analyzing multiple types of genomic data (vertical integration) or aggregating the same type of data across multiple studies (horizontal integration). In this article, we introduce different types of genomic data and data resources, and then review statistical methods of integrative genomics, with emphasis on the motivation and rationale of these methods. We conclude with some summary points and future research directions. PMID:27482531
International Nuclear Information System (INIS)
Romero, Vicente J.; Burkardt, John V.; Gunzburger, Max D.; Peterson, Janet S.
2006-01-01
A recently developed centroidal Voronoi tessellation (CVT) sampling method is investigated here to assess its suitability for use in statistical sampling applications. CVT efficiently generates a highly uniform distribution of sample points over arbitrarily shaped M-dimensional parameter spaces. On several 2-D test problems CVT has recently been found to provide exceedingly effective and efficient point distributions for response surface generation. Additionally, for statistical function integration and estimation of response statistics associated with uniformly distributed random-variable inputs (uncorrelated), CVT has been found in initial investigations to provide superior point sets when compared against Latin hypercube and simple-random Monte Carlo methods and Halton and Hammersley quasi-random sequence methods. In this paper, the performance of all these sampling methods and a new variant ('Latinized' CVT) are further compared for non-uniform input distributions. Specifically, given uncorrelated normal inputs in a 2-D test problem, statistical sampling efficiencies are compared for resolving various statistics of response: mean, variance, and exceedance probabilities.
Gleason, C. J.; Im, J.
2011-12-01
Airborne LiDAR remote sensing has been used effectively in assessing forest biomass because of its canopy penetrating effects and its ability to accurately describe the canopy surface. Current research in assessing biomass using airborne LiDAR focuses on either the individual tree as a base unit of study or statistical representations of a small aggregation of trees (i.e., plot level), and both methods usually rely on regression against field data to model the relationship between the LiDAR-derived data (e.g., volume) and biomass. This study estimates biomass for mixed forests and coniferous plantations (Picea abies) within Heiberg Memorial Forest, Tully, NY, at both the plot and individual tree level. Plots are regularly spaced with a radius of 13 m, and field data include diameter at breast height (dbh), tree height, and tree species. Field data collection and LiDAR data acquisition were seasonally coincident and both obtained in August of 2010. Resulting point cloud density was >5 pts/m². LiDAR data were processed to provide a canopy height surface, and a combination of watershed segmentation, active contouring, and genetic algorithm optimization was applied to delineate individual trees from the surface. This updated delineation method was shown to be more accurate than traditional watershed segmentation. Once trees had been delineated, four biomass estimation models were applied and compared: support vector regression (SVR), linear mixed effects regression (LME), random forest (RF), and Cubist regression. Candidate variables to be used in modeling were derived from the LiDAR surface, and include metrics of height, width, and volume per delineated tree footprint. Previously published allometric equations provided field estimates of biomass to inform the regressions and calculate their accuracy via leave-one-out cross validation. This study found that for forests such as found in the study area, aggregation of individual trees to form a plot-based estimate of
Monte Carlo based statistical power analysis for mediation models: methods and software.
Zhang, Zhiyong
2014-12-01
The existing literature on statistical power analysis for mediation models often assumes data normality and is based on a less powerful Sobel test instead of the more powerful bootstrap test. This study proposes to estimate statistical power to detect mediation effects on the basis of the bootstrap method through Monte Carlo simulation. Nonnormal data with excessive skewness and kurtosis are allowed in the proposed method. A free R package called bmem is developed to conduct the power analysis discussed in this study. Four examples, including a simple mediation model, a multiple-mediator model with a latent mediator, a multiple-group mediation model, and a longitudinal mediation model, are provided to illustrate the proposed method.
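The idea of estimating power for a bootstrap mediation test by Monte Carlo simulation can be sketched in plain Python. This is not the bmem package and makes several simplifying assumptions: a single-mediator model X → M → Y with no direct effect, normally distributed errors (the study also allows nonnormal data), a percentile bootstrap interval, and small replication counts to keep the run short. The function names and all parameter values are hypothetical.

```python
import random

def indirect_effect(x, m, y):
    """Estimate a*b, with a from M ~ X and b from Y ~ M + X (ordinary least squares)."""
    n = len(x)
    mean_x, mean_m, mean_y = sum(x) / n, sum(m) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    smm = sum((mi - mean_m) ** 2 for mi in m)
    sxm = sum((xi - mean_x) * (mi - mean_m) for xi, mi in zip(x, m))
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    smy = sum((mi - mean_m) * (yi - mean_y) for mi, yi in zip(m, y))
    a = sxm / sxx
    b = (sxx * smy - sxm * sxy) / (sxx * smm - sxm ** 2)
    return a * b

def bootstrap_rejects(x, m, y, n_boot, rng, alpha=0.05):
    """True if the percentile bootstrap interval for a*b excludes zero."""
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]   # resample cases with replacement
        estimates.append(indirect_effect([x[i] for i in idx],
                                         [m[i] for i in idx],
                                         [y[i] for i in idx]))
    estimates.sort()
    lower = estimates[int(alpha / 2 * n_boot)]
    upper = estimates[max(0, int((1 - alpha / 2) * n_boot) - 1)]
    return lower > 0 or upper < 0

def estimate_power(a, b, n, n_rep, n_boot, seed=0):
    """Share of simulated data sets in which the bootstrap test detects mediation."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_rep):
        x = [rng.gauss(0, 1) for _ in range(n)]
        m = [a * xi + rng.gauss(0, 1) for xi in x]
        y = [b * mi + rng.gauss(0, 1) for mi in m]
        hits += bootstrap_rejects(x, m, y, n_boot, rng)
    return hits / n_rep

# Hypothetical design: medium path coefficients, 50 cases per data set.
power = estimate_power(a=0.5, b=0.5, n=50, n_rep=40, n_boot=200)
print(f"estimated power: {power:.2f}")
```

With so few replications the estimate is noisy; a real power analysis would use many more replications and bootstrap draws than this sketch does.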
A method for statistical comparison of data sets and its uses in analysis of nuclear physics data
International Nuclear Information System (INIS)
Bityukov, S.I.; Smirnova, V.V.; Krasnikov, N.V.; Maksimushkina, A.V.; Nikitenko, A.N.
2014-01-01
The authors propose a method for statistical comparison of two data sets. The method is based on the method of statistical comparison of histograms. As an estimator of the quality of the decision made, it is proposed to use a value that may be called the probability that the decision (the data sets are different) is correct.
Methods of statistical physics
Akhiezer, Aleksandr I
1981-01-01
Methods of Statistical Physics is an exposition of the tools of statistical mechanics, which evaluates the kinetic equations of classical and quantized systems. The book also analyzes the equations of macroscopic physics, such as the equations of hydrodynamics for normal and superfluid liquids and macroscopic electrodynamics. The text gives particular attention to the study of quantum systems. This study begins with a discussion of problems of quantum statistics with a detailed description of the basics of quantum mechanics along with the theory of measurement. An analysis of the asymptotic be
Landslide Susceptibility Statistical Methods: A Critical and Systematic Literature Review
Mihir, Monika; Malamud, Bruce; Rossi, Mauro; Reichenbach, Paola; Ardizzone, Francesca
2014-05-01
Landslide susceptibility assessment, the subject of this systematic review, is aimed at understanding the spatial probability of slope failures under a set of geomorphological and environmental conditions. It is estimated that about 375 landslides that occur globally each year are fatal, with around 4600 people killed per year. Past studies have highlighted the increasing cost of landslide damage, which can be attributed primarily to human occupation and increased human activity in vulnerable environments. To evaluate and reduce landslide risk, many scientists have sought to map landslide susceptibility efficiently using different statistical methods. In this paper, we carry out a critical and systematic review of the landslide susceptibility literature in terms of the different statistical methods used. For each of a broad set of studies reviewed we note: (i) geographic region and areal extent of the study, (ii) landslide types, (iii) inventory type and temporal period covered, (iv) mapping technique, (v) thematic variables used, (vi) statistical models, (vii) assessment of model skill, (viii) uncertainty assessment methods, (ix) validation methods. We then identify broad trends within our review of landslide susceptibility, particularly regarding the statistical methods. We found that the most common statistical methods used in the study of landslide susceptibility include logistic regression, artificial neural networks, discriminant analysis and weight of evidence. Although most of the studies we reviewed assessed model skill, very few assessed model uncertainty. In terms of geographic extent, the largest number of landslide susceptibility zonations were in Turkey, Korea, Spain, Italy and Malaysia. However, there are also many landslides and fatalities in other localities, particularly India, China, the Philippines, Nepal, Indonesia, Guatemala, and Pakistan, where far fewer landslide susceptibility studies are available in the peer-reviewed literature. This
DEFF Research Database (Denmark)
Malzahn, Dorthe; Opper, Manfred
2003-01-01
We employ the replica method of statistical physics to study the average case performance of learning systems. The new feature of our theory is that general distributions of data can be treated, which enables applications to real data. For a class of Bayesian prediction models which are based on Gaussian processes, we discuss bootstrap estimates for learning curves.
Advances in Time Estimation Methods for Molecular Data.
Kumar, Sudhir; Hedges, S Blair
2016-04-01
Molecular dating has become central to placing a temporal dimension on the tree of life. Methods for estimating divergence times have been developed for over 50 years, beginning with the proposal of the molecular clock in 1962. We categorize the chronological development of these methods into four generations based on the timing of their origin. In the first generation approaches (1960s-1980s), a strict molecular clock was assumed to date divergences. In the second generation approaches (1990s), the equality of evolutionary rates between species was first tested and then a strict molecular clock applied to estimate divergence times. The third generation approaches (since ∼2000) account for differences in evolutionary rates across the tree by using a statistical model, obviating the need to assume a clock or to test the equality of evolutionary rates among species. Bayesian methods in the third generation require a specific or uniform prior on the speciation process and enable the inclusion of uncertainty in clock calibrations. The fourth generation approaches (since 2012) allow rates to vary from branch to branch, but do not need prior selection of a statistical model to describe the rate variation or the specification of a speciation model. With high accuracy, comparable to Bayesian approaches, and speeds that are orders of magnitude faster, fourth generation methods are able to produce reliable timetrees of thousands of species using genome scale data. We found that early time estimates from second generation studies are similar to those of third and fourth generation studies, indicating that methodological advances have not fundamentally altered the timetree of life, but rather have facilitated time estimation by enabling the inclusion of more species. Nonetheless, we feel an urgent need for testing the accuracy and precision of third and fourth generation methods, including their robustness to misspecification of priors in the analysis of large phylogenies and data
Multivariate statistical methods a first course
Marcoulides, George A
2014-01-01
Multivariate statistics refer to an assortment of statistical methods that have been developed to handle situations in which multiple variables or measures are involved. Any analysis of more than two variables or measures can loosely be considered a multivariate statistical analysis. An introductory text for students learning multivariate statistical methods for the first time, this book keeps mathematical details to a minimum while conveying the basic principles. One of the principal strategies used throughout the book--in addition to the presentation of actual data analyses--is poin
Statistical methods for physical science
Stanford, John L
1994-01-01
This volume of Methods of Experimental Physics provides an extensive introduction to probability and statistics in many areas of the physical sciences, with an emphasis on the emerging area of spatial statistics. The scope of topics covered is wide-ranging: the text discusses a variety of the most commonly used classical methods and addresses newer methods that are applicable or potentially important. The chapter authors motivate readers with their insightful discussions, augmenting their material with Key Features * Examines basic probability, including coverage of standard distributions, time s
Landsman, V; Lou, W Y W; Graubard, B I
2015-05-20
We present a two-step approach for estimating hazard rates and, consequently, survival probabilities, by levels of general categorical exposure. The resulting estimator utilizes three sources of data: vital statistics data and census data are used at the first step to estimate the overall hazard rate for a given combination of gender and age group, and cohort data constructed from a nationally representative complex survey with linked mortality records, are used at the second step to divide the overall hazard rate by exposure levels. We present an explicit expression for the resulting estimator and consider two methods for variance estimation that account for complex multistage sample design: (1) the leaving-one-out jackknife method, and (2) the Taylor linearization method, which provides an analytic formula for the variance estimator. The methods are illustrated with smoking and all-cause mortality data from the US National Health Interview Survey Linked Mortality Files, and the proposed estimator is compared with a previously studied crude hazard rate estimator that uses survey data only. The advantages of a two-step approach and possible extensions of the proposed estimator are discussed. Copyright © 2015 John Wiley & Sons, Ltd.
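The leave-one-out jackknife mentioned above can be sketched in its simplest form, for independent observations and an arbitrary estimator. This is only an illustration of the resampling idea: the survey setting in the abstract would delete whole sampling units within strata and apply design weights, which this sketch does not attempt. All names are hypothetical.

```python
def jackknife_variance(values, estimator):
    """Leave-one-out jackknife variance estimate for estimator(values)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    # Recompute the estimator with each observation removed in turn.
    leave_one_out = [estimator(values[:i] + values[i + 1:]) for i in range(n)]
    center = sum(leave_one_out) / n
    # Standard jackknife scaling factor (n - 1) / n.
    return (n - 1) / n * sum((e - center) ** 2 for e in leave_one_out)


def sample_mean(xs):
    return sum(xs) / len(xs)


var_hat = jackknife_variance([1.0, 2.0, 3.0, 4.0, 5.0], sample_mean)
print(var_hat)
```

For the sample mean, this reproduces the familiar s²/n, which is a convenient sanity check before applying the same routine to a less tractable estimator.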
St-Onge, Christina; Valois, Pierre; Abdous, Belkacem; Germain, Stephane
2009-01-01
To date, there have been no studies comparing parametric and nonparametric Item Characteristic Curve (ICC) estimation methods on the effectiveness of Person-Fit Statistics (PFS). The primary aim of this study was to determine if the use of ICCs estimated by nonparametric methods would increase the accuracy of item response theory-based PFS for…
Complex data modeling and computationally intensive methods for estimation and prediction
Secchi, Piercesare; Advances in Complex Data Modeling and Computational Methods in Statistics
2015-01-01
The book is addressed to statisticians working at the forefront of the statistical analysis of complex and high dimensional data and offers a wide variety of statistical models, computer intensive methods and applications: network inference from the analysis of high dimensional data; new developments for bootstrapping complex data; regression analysis for measuring the downsize reputational risk; statistical methods for research on the human genome dynamics; inference in non-euclidean settings and for shape data; Bayesian methods for reliability and the analysis of complex data; methodological issues in using administrative data for clinical and epidemiological research; regression models with differential regularization; geostatistical methods for mobility analysis through mobile phone data exploration. This volume is the result of a careful selection among the contributions presented at the conference "S.Co.2013: Complex data modeling and computationally intensive methods for estimation and prediction" held...
International Nuclear Information System (INIS)
Vidal-Codina, F.; Nguyen, N.C.; Giles, M.B.; Peraire, J.
2015-01-01
We present a model and variance reduction method for the fast and reliable computation of statistical outputs of stochastic elliptic partial differential equations. Our method consists of three main ingredients: (1) the hybridizable discontinuous Galerkin (HDG) discretization of elliptic partial differential equations (PDEs), which allows us to obtain high-order accurate solutions of the governing PDE; (2) the reduced basis method for a new HDG discretization of the underlying PDE to enable real-time solution of the parameterized PDE in the presence of stochastic parameters; and (3) a multilevel variance reduction method that exploits the statistical correlation among the different reduced basis approximations and the high-fidelity HDG discretization to accelerate the convergence of the Monte Carlo simulations. The multilevel variance reduction method provides efficient computation of the statistical outputs by shifting most of the computational burden from the high-fidelity HDG approximation to the reduced basis approximations. Furthermore, we develop a posteriori error estimates for our approximations of the statistical outputs. Based on these error estimates, we propose an algorithm for optimally choosing both the dimensions of the reduced basis approximations and the sizes of Monte Carlo samples to achieve a given error tolerance. We provide numerical examples to demonstrate the performance of the proposed method
Statistics for experimentalists
Cooper, B E
2014-01-01
Statistics for Experimentalists aims to provide experimental scientists with a working knowledge of statistical methods and search approaches to the analysis of data. The book first elaborates on probability and continuous probability distributions. Discussions focus on properties of continuous random variables and normal variables, independence of two random variables, central moments of a continuous distribution, prediction from a normal distribution, binomial probabilities, and multiplication of probabilities and independence. The text then examines estimation and tests of significance. Topics include estimators and estimates, expected values, minimum variance linear unbiased estimators, sufficient estimators, methods of maximum likelihood and least squares, and the test of significance method. The manuscript ponders on distribution-free tests, Poisson process and counting problems, correlation and function fitting, balanced incomplete randomized block designs and the analysis of covariance, and experiment...
International Nuclear Information System (INIS)
Zhang, Jinzhao; Segurado, Jacobo; Schneidesch, Christophe
2013-01-01
Since the 1980s, Tractebel Engineering (TE) has been developing and applying a multi-physical modelling and safety analysis capability, based on a code package consisting of the best estimate 3D neutronic (PANTHER), system thermal hydraulic (RELAP5), core sub-channel thermal hydraulic (COBRA-3C), and fuel thermal mechanic (FRAPCON/FRAPTRAN) codes. A series of methodologies have been developed to perform and to license the reactor safety analysis and core reload design, based on the deterministic bounding approach. Following the recent trends in research and development as well as in industrial applications, TE has been working since 2010 towards the application of the statistical sensitivity and uncertainty analysis methods to the multi-physical modelling and licensing safety analyses. In this paper, the TE multi-physical modelling and safety analyses capability is first described, followed by the proposed TE best estimate plus statistical uncertainty analysis method (BESUAM). The chosen statistical sensitivity and uncertainty analysis methods (non-parametric order statistic method or bootstrap) and tool (DAKOTA) are then presented, followed by some preliminary results of their applications to FRAPCON/FRAPTRAN simulation of OECD RIA fuel rod codes benchmark and RELAP5/MOD3.3 simulation of THTF tests. (authors)
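The non-parametric order statistic approach named above is commonly associated with Wilks' formula, although the abstract does not say which variant is used. Assuming the first-order, one-sided case, where the largest of n code runs serves as the upper tolerance limit, the required n is the smallest integer with 1 - coverage^n >= confidence. The sketch below computes that number; the function name is hypothetical.

```python
import math


def min_runs_one_sided(coverage, confidence):
    """Smallest n such that the sample maximum bounds the `coverage` quantile
    with probability at least `confidence` (first-order, one-sided case)."""
    if not (0.0 < coverage < 1.0 and 0.0 < confidence < 1.0):
        raise ValueError("coverage and confidence must lie strictly in (0, 1)")
    # Solve 1 - coverage**n >= confidence for n.
    return math.ceil(math.log(1.0 - confidence) / math.log(coverage))


runs_95_95 = min_runs_one_sided(coverage=0.95, confidence=0.95)
print(runs_95_95)
```

The appeal of this rule is that the number of runs depends only on the target coverage and confidence, not on how many uncertain inputs are sampled.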
Simple method for quick estimation of aquifer hydrogeological parameters
Ma, C.; Li, Y. Y.
2017-08-01
Development of simple and accurate methods to determine the aquifer hydrogeological parameters was of importance for groundwater resources assessment and management. Aiming at the present issue of estimating aquifer parameters based on some data of the unsteady pumping test, a fitting function of Theis well function was proposed using fitting optimization method and then a unitary linear regression equation was established. The aquifer parameters could be obtained by solving coefficients of the regression equation. The application of the proposed method was illustrated, using two published data sets. By the error statistics and analysis on the pumping drawdown, it showed that the method proposed in this paper yielded quick and accurate estimates of the aquifer parameters. The proposed method could reliably identify the aquifer parameters from long distance observed drawdowns and early drawdowns. It was hoped that the proposed method in this paper would be helpful for practicing hydrogeologists and hydrologists.
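The abstract does not give the authors' fitting function, so the sketch below swaps in the classical Cooper-Jacob approximation of the Theis solution instead, purely to show how a linear regression on pumping-test data can return aquifer parameters. Under that approximation, drawdown is linear in ln(t): s = Q/(4πT) · ln(2.25·T·t / (r²·S)), so the slope gives transmissivity T and the intercept gives storativity S. All numbers are synthetic and hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def cooper_jacob_fit(times, drawdowns, pumping_rate, distance):
    """Estimate transmissivity T and storativity S from drawdown vs ln(time)."""
    slope, intercept = fit_line([math.log(t) for t in times], drawdowns)
    transmissivity = pumping_rate / (4.0 * math.pi * slope)
    storativity = 2.25 * transmissivity / distance ** 2 * math.exp(-intercept / slope)
    return transmissivity, storativity


# Synthetic drawdowns generated from the same approximation (hypothetical values).
T_true, S_true, Q, r = 500.0, 1e-4, 1000.0, 50.0  # m^2/day, -, m^3/day, m
times = [0.01, 0.02, 0.05, 0.1, 0.2, 0.5, 1.0]    # days
drawdowns = [Q / (4 * math.pi * T_true) * math.log(2.25 * T_true * t / (r ** 2 * S_true))
             for t in times]

T_hat, S_hat = cooper_jacob_fit(times, drawdowns, Q, r)
print(T_hat, S_hat)
```

The approximation is only reasonable when r²S/(4Tt) is small, so early-time or long-distance data, which the abstract says the proposed method handles, are outside what this sketch covers.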
Statistical estimation for truncated exponential families
Akahira, Masafumi
2017-01-01
This book presents new findings on nonregular statistical estimation. Unlike other books on this topic, its major emphasis is on helping readers understand the meaning and implications of both regularity and irregularity through a certain family of distributions. In particular, it focuses on a truncated exponential family of distributions with a natural parameter and truncation parameter as a typical nonregular family. This focus includes the (truncated) Pareto distribution, which is widely used in various fields such as finance, physics, hydrology, geology, astronomy, and other disciplines. The family is essential in that it links both regular and nonregular distributions, as it becomes a regular exponential family if the truncation parameter is known. The emphasis is on presenting new results on the maximum likelihood estimation of a natural parameter or truncation parameter if one of them is a nuisance parameter. In order to obtain more information on the truncation, the Bayesian approach is also considere...
Correction of Misclassifications Using a Proximity-Based Estimation Method
Directory of Open Access Journals (Sweden)
Shmulevich Ilya
2004-01-01
An estimation method for correcting misclassifications in signal and image processing is presented. The method is based on the use of context-based (temporal or spatial) information in a sliding-window fashion. The classes can be purely nominal, that is, an ordering of the classes is not required. The method employs nonlinear operations based on class proximities defined by a proximity matrix. Two case studies are presented. In the first, the proposed method is applied to one-dimensional signals for processing data that are obtained by a musical key-finding algorithm. In the second, the estimation method is applied to two-dimensional signals for correction of misclassifications in images. In the first case study, the proximity matrix employed by the estimation method follows directly from music perception studies, whereas in the second case study, the optimal proximity matrix is obtained with genetic algorithms as the learning rule in a training-based optimization framework. Simulation results are presented in both case studies and the degree of improvement in classification accuracy that is obtained by the proposed method is assessed statistically using Kappa analysis.
Mathematical and statistical methods for actuarial sciences and finance
Sibillo, Marilena
2014-01-01
The interaction between mathematicians and statisticians working in the actuarial and financial fields is producing numerous meaningful scientific results. This volume, comprising a series of four-page papers, gathers new ideas relating to mathematical and statistical methods in the actuarial sciences and finance. The book covers a variety of topics of interest from both theoretical and applied perspectives, including: actuarial models; alternative testing approaches; behavioral finance; clustering techniques; coherent and non-coherent risk measures; credit-scoring approaches; data envelopment analysis; dynamic stochastic programming; financial contagion models; financial ratios; intelligent financial trading systems; mixture normality approaches; Monte Carlo-based methodologies; multicriteria methods; nonlinear parameter estimation techniques; nonlinear threshold models; particle swarm optimization; performance measures; portfolio optimization; pricing methods for structured and non-structured derivatives; r...
A comparison of Probability Of Detection (POD) data determined using different statistical methods
Fahr, A.; Forsyth, D.; Bullock, M.
1993-12-01
Different statistical methods have been suggested for determining probability of detection (POD) data for nondestructive inspection (NDI) techniques. A comparative assessment of various methods of determining POD was conducted using results of three NDI methods obtained by inspecting actual aircraft engine compressor disks which contained service induced cracks. The study found that the POD and 95 percent confidence curves as a function of crack size as well as the 90/95 percent crack length vary depending on the statistical method used and the type of data. The distribution function as well as the parameter estimation procedure used for determining POD and the confidence bound must be included when referencing information such as the 90/95 percent crack length. The POD curves and confidence bounds determined using the range interval method are very dependent on information that is not from the inspection data. The maximum likelihood estimators (MLE) method does not require such information and the POD results are more reasonable. The log-logistic function appears to model POD of hit/miss data relatively well and is easy to implement. The log-normal distribution using MLE provides more realistic POD results and is the preferred method. Although it is more complicated and slower to calculate, it can be implemented on a common spreadsheet program.
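For hit/miss data, the log-logistic (log-odds) model mentioned above is usually written as ln(POD / (1 - POD)) = α + β·ln(a), with a the crack size. The sketch below assumes that parameterization and uses made-up values for α and β; it evaluates the POD curve and inverts it for the crack size detected with 90 percent probability. It does not compute the 95 percent confidence bound that the 90/95 value additionally requires.

```python
import math


def pod_log_logistic(crack_size, alpha, beta):
    """Probability of detection under the log-odds model."""
    return 1.0 / (1.0 + math.exp(-(alpha + beta * math.log(crack_size))))


def crack_size_at_pod(target_pod, alpha, beta):
    """Invert the log-odds model for the crack size giving `target_pod`."""
    log_odds = math.log(target_pod / (1.0 - target_pod))
    return math.exp((log_odds - alpha) / beta)


# Hypothetical parameters; in practice they would come from a maximum likelihood fit.
alpha, beta = -2.0, 3.0
a90 = crack_size_at_pod(0.90, alpha, beta)
print(a90, pod_log_logistic(a90, alpha, beta))
```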
Statistical methods in personality assessment research.
Schinka, J A; LaLone, L; Broeckel, J A
1997-06-01
Emerging models of personality structure and advances in the measurement of personality and psychopathology suggest that research in personality and personality assessment has entered a stage of advanced development. In this article we examine whether researchers in these areas have taken advantage of new and evolving statistical procedures. We conducted a review of articles published in the Journal of Personality Assessment during the past 5 years. Of the 449 articles that included some form of data analysis, 12.7% used only descriptive statistics, most employed only univariate statistics, and fewer than 10% used multivariate methods of data analysis. We discuss the cost of using limited statistical methods, the possible reasons for the apparent reluctance to employ advanced statistical procedures, and potential solutions to this technical shortcoming.
Statistical Methods in Psychology Journals.
Wilkinson, Leland
1999-01-01
Proposes guidelines for revising the American Psychological Association (APA) publication manual or other APA materials to clarify the application of statistics in research reports. The guidelines are intended to induce authors and editors to recognize the thoughtless application of statistical methods. Contains 54 references. (SLD)
Trends in study design and the statistical methods employed in a leading general medicine journal.
Gosho, M; Sato, Y; Nagashima, K; Takahashi, S
2018-02-01
Study design and statistical methods have become core components of medical research, and the methodology has become more multifaceted and complicated over time. The study of the comprehensive details and current trends of study design and statistical methods is required to support the future implementation of well-planned clinical studies providing information about evidence-based medicine. Our purpose was to illustrate study design and statistical methods employed in recent medical literature. This was an extension study of Sato et al. (N Engl J Med 2017; 376: 1086-1087), which reviewed 238 articles published in 2015 in the New England Journal of Medicine (NEJM) and briefly summarized the statistical methods employed in NEJM. Using the same database, we performed a new investigation of the detailed trends in study design and individual statistical methods that were not reported in the Sato study. Due to the CONSORT statement, prespecification and justification of sample size are obligatory in planning intervention studies. Although standard survival methods (e.g., Kaplan-Meier estimator and Cox regression model) were most frequently applied, the Gray test and Fine-Gray proportional hazard model for considering competing risks were sometimes used for a more valid statistical inference. With respect to handling missing data, model-based methods, which are valid for missing-at-random data, were more frequently used than single imputation methods. These methods are not recommended as a primary analysis, but they have been applied in many clinical trials. Group sequential design with interim analyses was one of the standard designs, and novel designs, such as adaptive dose selection and sample size re-estimation, were sometimes employed in NEJM. Model-based approaches for handling missing data should replace single imputation methods for primary analysis in the light of the information found in some publications. Use of adaptive design with interim analyses is increasing
Statistical methods for quality improvement
National Research Council Canada - National Science Library
Ryan, Thomas P
2011-01-01
...."-Technometrics This new edition continues to provide the most current, proven statistical methods for quality control and quality improvement. The use of quantitative methods offers numerous benefits...
Review of methods for level density estimation from resonance parameters
International Nuclear Information System (INIS)
Froehner, F.H.
1983-01-01
A number of methods are available for statistical analysis of resonance parameter sets, i.e. for estimation of level densities and average widths with account of missing levels. The main categories are (i) methods based on theories of level spacings (orthogonal-ensemble theory, Dyson-Mehta statistics), (ii) methods based on comparison with simulated cross section curves (Monte Carlo simulation, Garrison's autocorrelation method), (iii) methods exploiting the observed neutron width distribution by means of Bayesian or more approximate procedures such as maximum-likelihood, least-squares or moment methods, with various recipes for the treatment of detection thresholds and resolution effects. The present review will concentrate on (iii) with the aim of clarifying the basic mathematical concepts and the relationship between the various techniques. Recent theoretical progress in the treatment of resolution effects, detectability thresholds and p-wave admixture is described. (Auth.)
International Nuclear Information System (INIS)
Wu Jingqin.
1989-01-01
The Yang Chizhong filtering and inferential measurement method is a new method used for variable statistics of ore deposits. In order to apply this theory to estimate uranium ore reserves under the circumstances of regular or irregular prospecting grids, small ore bodies, few sampling points, and complex occurrence, the author has used this method to estimate the ore reserves in five ore bodies of two deposits and achieved satisfactory results. It is demonstrated that, compared with the traditional block measurement method, this method is simple and clear in formula, convenient in application, rapid in calculation, accurate in results, less expensive, and high in economic benefit. The procedure and experience in the application of this method and the preliminary evaluation of its results are mainly described.
Statistical learning methods: Basics, control and performance
Energy Technology Data Exchange (ETDEWEB)
Zimmermann, J. [Max-Planck-Institut fuer Physik, Foehringer Ring 6, 80805 Munich (Germany)]. E-mail: zimmerm@mppmu.mpg.de
2006-04-01
The basics of statistical learning are reviewed with a special emphasis on general principles and problems for all different types of learning methods. Different aspects of controlling these methods in a physically adequate way will be discussed. All principles and guidelines will be exercised on examples for statistical learning methods in high energy and astrophysics. These examples prove in addition that statistical learning methods very often lead to a remarkable performance gain compared to the competing classical algorithms.
Directory of Open Access Journals (Sweden)
Abul Kalam Azad
2014-05-01
The best Weibull distribution methods for the assessment of wind energy potential at different altitudes in desired locations are statistically diagnosed in this study. Seven different methods, namely graphical method (GM), method of moments (MOM), standard deviation method (STDM), maximum likelihood method (MLM), power density method (PDM), modified maximum likelihood method (MMLM) and equivalent energy method (EEM), were used to estimate the Weibull parameters, and six statistical tools, namely relative percentage of error, root mean square error (RMSE), mean percentage of error, mean absolute percentage of error, chi-square error and analysis of variance, were used to precisely rank the methods. The statistical fittings of the measured and calculated wind speed data are assessed for justifying the performance of the methods. The capacity factor and total energy generated by a small model wind turbine are calculated by numerical integration using trapezoidal sums and Simpson’s rules. The results show that MOM and MLM are the most efficient methods for determining the values of k and c to fit Weibull distribution curves.
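One of the simpler estimators in this family can be sketched from the sample mean and standard deviation of wind speed alone. The form used below, k = (σ/μ)^(-1.086) and c = μ / Γ(1 + 1/k), is the commonly cited empirical approximation; whether it matches the exact STDM formulation used in the study is an assumption, and the function name and test values are hypothetical.

```python
import math


def weibull_from_mean_std(mean_speed, std_speed):
    """Approximate Weibull shape k and scale c from the mean and standard deviation."""
    if mean_speed <= 0 or std_speed <= 0:
        raise ValueError("mean and standard deviation must be positive")
    k = (std_speed / mean_speed) ** -1.086     # empirical shape approximation
    c = mean_speed / math.gamma(1.0 + 1.0 / k)  # scale from the Weibull mean
    return k, c


# Exact mean and standard deviation of a Weibull(k=2, c=8 m/s) distribution,
# used here only to check how close the approximation gets.
k_true, c_true = 2.0, 8.0
mu = c_true * math.gamma(1.0 + 1.0 / k_true)
sigma = c_true * math.sqrt(math.gamma(1.0 + 2.0 / k_true) - math.gamma(1.0 + 1.0 / k_true) ** 2)

k_hat, c_hat = weibull_from_mean_std(mu, sigma)
print(k_hat, c_hat)
```

The recovered shape is close to, but not exactly, the true value, which is the kind of small discrepancy that the error measures listed in the abstract are meant to rank.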
Statistical methods in nonlinear dynamics
Indian Academy of Sciences (India)
Sensitivity to initial conditions in nonlinear dynamical systems leads to exponential divergence of trajectories that are initially arbitrarily close, and hence to unpredictability. Statistical methods have been found to be helpful in extracting useful information about such systems. In this paper, we review briefly some statistical ...
International Nuclear Information System (INIS)
Frome, E.L.; Khare, M.
1980-01-01
Brodsky's paper 'A Statistical Method for Testing Epidemiological Results, as applied to the Hanford Worker Population', (Health Phys., 36, 611-628, 1979) proposed two test statistics for use in comparing the survival experience of a group of employees and controls. This letter states that both of the test statistics were computed using incorrect formulas and concludes that the results obtained using these statistics may also be incorrect. In his reply Brodsky concurs with the comments on the proper formulation of estimates of pooled standard errors in constructing test statistics but believes that the erroneous formulation does not invalidate the major points, results and discussions of his paper. (author)
Evaluation of non cyanide methods for hemoglobin estimation
Directory of Open Access Journals (Sweden)
Vinaya B Shah
2011-01-01
Background: The hemoglobincyanide (HiCN) method for measuring hemoglobin is used extensively worldwide; its advantages are the ready availability of a stable and internationally accepted reference standard calibrator. However, its use may create a problem, as the waste disposal of large volumes of reagent containing cyanide constitutes a potential toxic hazard. Aims and Objective: As an alternative to Drabkin's method of Hb estimation, we attempted to estimate hemoglobin by other non-cyanide methods: alkaline hematin detergent (AHD-575) using Triton X-100 as lyser and the alkaline-borax method using quaternary ammonium detergents as lyser. Materials and Methods: The hemoglobin (Hb) results on 200 samples of varying Hb concentrations obtained by these two cyanide-free methods were compared with a cyanmethemoglobin method on a light emitting diode (LED) based colorimeter. Hemoglobin was also estimated in one hundred blood donors and 25 blood samples of infants and compared by these methods. Statistical analysis used was Pearson's correlation coefficient. Results: The response of the non-cyanide method is linear for serially diluted blood samples over the Hb concentration range from 3 g/dl to 20 g/dl. The non-cyanide methods have a precision of ± 0.25 g/dl (coefficient of variation = 2.34%) and are suitable for use with fixed wavelength or with colorimeters at wavelengths of 530 nm and 580 nm. Correlation of these two methods was excellent (r = 0.98). The evaluation has shown them to be as reliable and reproducible as HiCN for measuring hemoglobin at all concentrations. The reagents used in non-cyanide methods are non-biohazardous and did not affect the reliability of data determination, and the cost was less than that of the HiCN method. Conclusions: Thus, non-cyanide methods of Hb estimation offer the possibility of safe and quality Hb estimation and should prove useful for routine laboratory use. Non-cyanide methods are easily incorporated in hemoglobinometers
Chen, Xingyuan; Miller, Gretchen R; Rubin, Yoram; Baldocchi, Dennis D
2012-12-01
The heat pulse method is widely used to measure water flux through plants; it works by using the speed at which a heat pulse is propagated through the system to infer the velocity of water through a porous medium. No systematic, non-destructive calibration procedure exists to determine the site-specific parameters necessary for calculating sap velocity, e.g., wood thermal diffusivity and probe spacing. Such parameter calibration is crucial to obtain the correct transpiration flux density from the sap flow measurements at the plant scale and subsequently to upscale tree-level water fluxes to canopy and landscape scales. The purpose of this study is to present a statistical framework for sampling and simultaneously estimating the tree's thermal diffusivity and probe spacing from in situ heat response curves collected by the implanted probes of a heat ratio measurement device. Conditioned on the time traces of wood temperature following a heat pulse, the parameters are inferred using a Bayesian inversion technique, based on the Markov chain Monte Carlo sampling method. The primary advantage of the proposed methodology is that it does not require knowledge of probe spacing or any further intrusive sampling of sapwood. The Bayesian framework also enables direct quantification of uncertainty in estimated sap flow velocity. Experiments using synthetic data show that repeated tests using the same apparatus are essential for obtaining reliable and accurate solutions. When applied to field conditions, these tests can be obtained in different seasons and can be automated using the existing data logging system. Empirical factors are introduced to account for the influence of non-ideal probe geometry on the estimation of heat pulse velocity, and are estimated in this study as well. The proposed methodology may be tested for its applicability to realistic field conditions, with an ultimate goal of calibrating heat ratio sap flow systems in practical applications.
Introduction to applied Bayesian statistics and estimation for social scientists
Lynch, Scott M
2007-01-01
""Introduction to Applied Bayesian Statistics and Estimation for Social Scientists"" covers the complete process of Bayesian statistical analysis in great detail from the development of a model through the process of making statistical inference. The key feature of this book is that it covers models that are most commonly used in social science research - including the linear regression model, generalized linear models, hierarchical models, and multivariate regression models - and it thoroughly develops each real-data example in painstaking detail.The first part of the book provides a detailed
Wiley, Jeffrey B.; Curran, Janet H.
2003-01-01
Methods for estimating daily mean flow-duration statistics for seven regions in Alaska and low-flow frequencies for one region, southeastern Alaska, were developed from daily mean discharges for streamflow-gaging stations in Alaska and conterminous basins in Canada. The 15-, 10-, 9-, 8-, 7-, 6-, 5-, 4-, 3-, 2-, and 1-percent duration flows were computed for the October-through-September water year for 222 stations in Alaska and conterminous basins in Canada. The 98-, 95-, 90-, 85-, 80-, 70-, 60-, and 50-percent duration flows were computed for the individual months of July, August, and September for 226 stations in Alaska and conterminous basins in Canada. The 98-, 95-, 90-, 85-, 80-, 70-, 60-, and 50-percent duration flows were computed for the season July-through-September for 65 stations in southeastern Alaska. The 7-day, 10-year and 7-day, 2-year low-flow frequencies for the season July-through-September were computed for 65 stations for most of southeastern Alaska. Low-flow analyses were limited to particular months or seasons in order to omit winter low flows, when ice effects reduce the quality of the records and validity of statistical assumptions. Regression equations for estimating the selected high-flow and low-flow statistics for the selected months and seasons for ungaged sites were developed from an ordinary-least-squares regression model using basin characteristics as independent variables. Drainage area and precipitation were significant explanatory variables for high flows, and drainage area, precipitation, mean basin elevation, and area of glaciers were significant explanatory variables for low flows. The estimating equations can be used at ungaged sites in Alaska and conterminous basins in Canada where streamflow regulation, streamflow diversion, urbanization, and natural damming and releasing of water do not affect the streamflow data for the given month or season. Standard errors of estimate ranged from 15 to 56 percent for high-duration flow
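A p-percent duration flow is the discharge equaled or exceeded p percent of the time, so it corresponds to the (100 − p)th percentile of the daily mean discharges. The sketch below computes it with linear interpolation between order statistics; that interpolation rule is an assumption made for illustration, and the report may use a different plotting position. The data are synthetic.

```python
import math


def duration_flow(daily_flows, percent_exceeded):
    """Discharge equaled or exceeded `percent_exceeded` percent of the time."""
    if not daily_flows:
        raise ValueError("no flow data")
    if not 0.0 <= percent_exceeded <= 100.0:
        raise ValueError("percent_exceeded must be between 0 and 100")
    ordered = sorted(daily_flows)
    # Position of the (100 - p)th percentile among the ordered flows.
    position = (1.0 - percent_exceeded / 100.0) * (len(ordered) - 1)
    lower = int(math.floor(position))
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


# Synthetic daily mean discharges 1, 2, ..., 101 (arbitrary units).
flows = [float(q) for q in range(1, 102)]
low_flow_q90 = duration_flow(flows, 90.0)   # a low-flow statistic
median_q50 = duration_flow(flows, 50.0)
high_flow_q10 = duration_flow(flows, 10.0)  # a high-flow statistic
print(low_flow_q90, median_q50, high_flow_q10)
```

Restricting `daily_flows` to one month or season before calling the function mirrors the seasonal statistics described in the abstract.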
Training Methods for Image Noise Level Estimation on Wavelet Components
Directory of Open Access Journals (Sweden)
A. De Stefano
2004-12-01
The estimation of the standard deviation of noise contaminating an image is a fundamental step in wavelet-based noise reduction techniques. The method widely used is based on the mean absolute deviation (MAD). This model-based method assumes specific characteristics of the noise-contaminated image component. Three novel and alternative methods for estimating the noise standard deviation are proposed in this work and compared with the MAD method. Two of these methods rely on a preliminary training stage in order to extract parameters which are then used in the application stage. The sets used for training and testing, 13 and 5 images, respectively, are fully disjoint. The third method assumes specific statistical distributions for image and noise components. Results showed the prevalence of the training-based methods for the images and the range of noise levels considered.
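In the wavelet denoising literature, the baseline estimator is usually the median absolute deviation of the finest-scale detail coefficients divided by 0.6745, and the sketch below assumes that this is the MAD variant the abstract refers to. To stay short it uses a one-dimensional signal and a single-level Haar transform rather than an image; the signal, seed and function names are hypothetical.

```python
import math
import random
import statistics


def haar_detail(signal):
    """Finest-scale Haar detail coefficients of a 1-D signal."""
    return [(signal[i] - signal[i + 1]) / math.sqrt(2.0)
            for i in range(0, len(signal) - 1, 2)]


def mad_noise_sigma(detail_coeffs):
    """Noise standard deviation from the median absolute detail coefficient.

    The constant 0.6745 is the 75th percentile of the standard normal,
    so the estimate is calibrated for Gaussian noise.
    """
    return statistics.median(abs(c) for c in detail_coeffs) / 0.6745


# Pure Gaussian noise with known standard deviation, as a sanity check.
rng = random.Random(0)
noise = [rng.gauss(0.0, 2.0) for _ in range(20000)]
sigma_hat = mad_noise_sigma(haar_detail(noise))
print(sigma_hat)
```

The estimator relies on the detail coefficients being dominated by noise, which is the model assumption that the training-based alternatives in the abstract try to avoid.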
Effect of the Target Motion Sampling Temperature Treatment Method on the Statistics and Performance
Viitanen, Tuomas; Leppänen, Jaakko
2014-06-01
Target Motion Sampling (TMS) is a stochastic on-the-fly temperature treatment technique that is being developed as a part of the Monte Carlo reactor physics code Serpent. The method provides for modeling of arbitrary temperatures in continuous-energy Monte Carlo tracking routines with only one set of cross sections stored in the computer memory. Previously, only the performance of the TMS method in terms of CPU time per transported neutron has been discussed. Since the effective cross sections are not calculated at any point of a transport simulation with TMS, reaction rate estimators must be scored using sampled cross sections, which is expected to increase the variances and, consequently, to decrease the figures-of-merit. This paper examines the effects of the TMS on the statistics and performance in practical calculations involving reaction rate estimation with collision estimators. Against all expectations it turned out that the usage of sampled response values has no practical effect on the performance of reaction rate estimators when using TMS with elevated basis cross section temperatures (EBT), i.e. the usual way. With 0 Kelvin cross sections a significant increase in the variances of capture rate estimators was observed right below the energy region of unresolved resonances, but at these energies the figures-of-merit could be increased using a simple resampling technique to decrease the variances of the responses. It was, however, noticed that the usage of the TMS method increases the statistical deviances of all estimators, including the flux estimator, by tens of percent in the vicinity of very strong resonances. This effect is actually not related to the usage of sampled responses, but is instead an inherent property of the TMS tracking method and concerns both EBT and 0 K calculations.
Statistical data analysis using SAS intermediate statistical methods
Marasinghe, Mervyn G
2018-01-01
The aim of this textbook (previously titled SAS for Data Analytics) is to teach the use of SAS for statistical analysis of data for advanced undergraduate and graduate students in statistics, data science, and disciplines involving analyzing data. The book begins with an introduction beyond the basics of SAS, illustrated with non-trivial, real-world, worked examples. It proceeds to SAS programming and applications, SAS graphics, statistical analysis of regression models, analysis of variance models, analysis of variance with random and mixed effects models, and then takes the discussion beyond regression and analysis of variance to conclude. Pedagogically, the authors introduce theory and methodological basis topic by topic, present a problem as an application, followed by a SAS analysis of the data provided and a discussion of results. The text focuses on applied statistical problems and methods. Key features include: end of chapter exercises, downloadable SAS code and data sets, and advanced material suitab...
Fish, Laurel J.; Halcoussis, Dennis; Phillips, G. Michael
2017-01-01
The Monte Carlo method and related multiple imputation methods are traditionally used in math, physics and science to estimate and analyze data and are now becoming standard tools in analyzing business and financial problems. However, few sources explain the application of the Monte Carlo method for individuals and business professionals who are…
Non-parametric order statistics method applied to uncertainty propagation in fuel rod calculations
International Nuclear Information System (INIS)
Arimescu, V.E.; Heins, L.
2001-01-01
Advances in modeling fuel rod behavior and accumulations of adequate experimental data have made possible the introduction of quantitative methods to estimate the uncertainty of predictions made with best-estimate fuel rod codes. The uncertainty range of the input variables is characterized by a truncated distribution which is typically a normal, lognormal, or uniform distribution. While the distribution for fabrication parameters is defined to cover the design or fabrication tolerances, the distribution of modeling parameters is inferred from the experimental database consisting of separate effects tests and global tests. The final step of the methodology uses a Monte Carlo type of random sampling of all relevant input variables and performs best-estimate code calculations to propagate these uncertainties in order to evaluate the uncertainty range of outputs of interest for design analysis, such as internal rod pressure and fuel centerline temperature. The statistical method underlying this Monte Carlo sampling is non-parametric order statistics, which is perfectly suited to evaluate quantiles of populations with unknown distribution. The application of this method is straightforward in the case of one single fuel rod, when a 95/95 statement is applicable: 'with a probability of 95% and confidence level of 95% the values of output of interest are below a certain value'. Therefore, the 0.95-quantile is estimated for the distribution of all possible values of one fuel rod with a statistical confidence of 95%. On the other hand, a more elaborate procedure is required if all the fuel rods in the core are being analyzed. In this case, the aim is to evaluate the following global statement: with 95% confidence level, the expected number of fuel rods which are not exceeding a certain value is all the fuel rods in the core except only a few fuel rods. In both cases, the thresholds determined by the analysis should be below the safety acceptable design limit. An indirect
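The 95/95 statement above can be illustrated with a short calculation. This is a minimal sketch, assuming the simplest one-sided, first-order case in which the largest output of n independent code runs is taken as the upper bound; the function name `wilks_sample_size` is hypothetical and is not taken from the paper. The condition used is that the probability 1 - coverage^n of at least one run exceeding the coverage quantile must reach the required confidence.

```python
import math


def wilks_sample_size(coverage: float = 0.95, confidence: float = 0.95) -> int:
    """Smallest number of runs n such that the sample maximum bounds the
    `coverage` quantile with probability >= `confidence`.

    Assumption: one-sided, first-order order-statistics criterion,
    i.e. 1 - coverage**n >= confidence.
    """
    n = math.ceil(math.log(1.0 - confidence) / math.log(coverage))
    # Guard against floating-point rounding in the logarithms.
    while 1.0 - coverage ** n < confidence:
        n += 1
    return n


if __name__ == "__main__":
    n = wilks_sample_size(0.95, 0.95)
    print("runs needed for a one-sided 95/95 bound:", n)
    print("achieved confidence:", 1.0 - 0.95 ** n)
```

Under these assumptions no distribution has to be fitted to the code outputs, which is why the approach suits populations with unknown distribution.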
DEFF Research Database (Denmark)
Sathe, Ameya
This report is prepared as a written contribution to the Remote Sensing Summer School, which is organized by the Department of Wind Energy, Technical University of Denmark. It provides an overview of the state-of-the-art with regard to estimating turbulence statistics from lidar measurements...... configuration. The so-called Velocity Azimuth Display (VAD) and the Doppler Beam Swinging (DBS) methods of post processing the lidar data are investigated in greater detail, partly due to their wide use in commercial lidars. It is demonstrated that the VAD or DBS techniques result in introducing significant...
Bootstrap-based confidence estimation in PCA and multivariate statistical process control
DEFF Research Database (Denmark)
Babamoradi, Hamid
be used to detect outliers in the data since the outliers can distort the bootstrap estimates. Bootstrap-based confidence limits were suggested as alternative to the asymptotic limits for control charts and contribution plots in MSPC (Paper II). The results showed that in case of the Q-statistic......Traditional/Asymptotic confidence estimation has limited applicability since it needs statistical theories to estimate the confidences, which are not available for all indicators/parameters. Furthermore, in case the theories are available for a specific indicator/parameter, the theories are based....... The goal was to improve process monitoring by improving the quality of MSPC charts and contribution plots. Bootstrapping algorithm to build confidence limits was illustrated in a case study format (Paper I). The main steps in the algorithm were discussed where a set of sensible choices (plus...
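To make the idea of bootstrap-based confidence limits concrete, the following is a minimal sketch of a percentile bootstrap. It is an illustration only: it uses the sample mean as the statistic rather than the PCA or MSPC quantities discussed in the thesis, and the function name, the number of resamples, and the example data are all assumptions.

```python
import random
import statistics


def bootstrap_percentile_ci(data, statistic=statistics.mean,
                            n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for `statistic`.

    Resamples `data` with replacement `n_boot` times and returns the
    empirical alpha/2 and 1 - alpha/2 quantiles of the recomputed statistic.
    """
    rng = random.Random(seed)
    n = len(data)
    estimates = sorted(
        statistic([data[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    lo_idx = int((alpha / 2) * n_boot)
    hi_idx = int((1 - alpha / 2) * n_boot) - 1
    return estimates[lo_idx], estimates[hi_idx]


if __name__ == "__main__":
    # Hypothetical measurements, for illustration only.
    sample = [4.1, 5.0, 5.3, 4.7, 6.2, 5.8, 4.9, 5.5, 5.1, 4.4]
    print(bootstrap_percentile_ci(sample))
```

Because the limits come from the resampled values themselves, a few extreme observations can shift them noticeably, which is consistent with the remark that outliers can distort bootstrap estimates.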
Statistical Methods for Fuzzy Data
Viertl, Reinhard
2011-01-01
Statistical data are not always precise numbers, or vectors, or categories. Real data are frequently what is called fuzzy. Examples where this fuzziness is obvious are quality of life data, environmental, biological, medical, sociological and economics data. Also the results of measurements can be best described by using fuzzy numbers and fuzzy vectors respectively. Statistical analysis methods have to be adapted for the analysis of fuzzy data. In this book, the foundations of the description of fuzzy data are explained, including methods on how to obtain the characterizing function of fuzzy m
A method for statistical steady state thermal analysis of reactor cores
International Nuclear Information System (INIS)
Whetton, P.A.
1981-01-01
In a previous publication the author presented a method for undertaking statistical steady state thermal analyses of reactor cores. The present paper extends the technique to an assessment of confidence limits for the resulting probability functions which define the probability that a given thermal response value will be exceeded in a reactor core. Establishing such confidence limits is considered an integral part of any statistical thermal analysis and essential if such analyses are to be considered in any regulatory process. In certain applications the use of a best estimate probability function may be justifiable but it is recognised that a demonstrably conservative probability function is required for any regulatory considerations. (orig.)
Effect of the Target Motion Sampling temperature treatment method on the statistics and performance
International Nuclear Information System (INIS)
Viitanen, Tuomas; Leppänen, Jaakko
2015-01-01
Highlights: • Use of the Target Motion Sampling (TMS) method with collision estimators is studied. • The expected values of the estimators agree with NJOY-based reference. • In most practical cases also the variances of the estimators are unaffected by TMS. • Transport calculation slow-down due to TMS dominates the impact on figures-of-merit. - Abstract: Target Motion Sampling (TMS) is a stochastic on-the-fly temperature treatment technique that is being developed as a part of the Monte Carlo reactor physics code Serpent. The method provides for modeling of arbitrary temperatures in continuous-energy Monte Carlo tracking routines with only one set of cross sections stored in the computer memory. Previously, only the performance of the TMS method in terms of CPU time per transported neutron has been discussed. Since the effective cross sections are not calculated at any point of a transport simulation with TMS, reaction rate estimators must be scored using sampled cross sections, which is expected to increase the variances and, consequently, to decrease the figures-of-merit. This paper examines the effects of the TMS on the statistics and performance in practical calculations involving reaction rate estimation with collision estimators. Against all expectations it turned out that the usage of sampled response values has no practical effect on the performance of reaction rate estimators when using TMS with elevated basis cross section temperatures (EBT), i.e. the usual way. With 0 Kelvin cross sections a significant increase in the variances of capture rate estimators was observed right below the energy region of unresolved resonances, but at these energies the figures-of-merit could be increased using a simple resampling technique to decrease the variances of the responses. It was, however, noticed that the usage of the TMS method increases the statistical deviances of all estimators, including the flux estimator, by tens of percent in the vicinity of very
Estimation methods for special nuclear materials holdup
International Nuclear Information System (INIS)
Pillay, K.K.S.; Picard, R.R.
1984-01-01
The potential value of statistical models for the estimation of residual inventories of special nuclear materials was examined using holdup data from processing facilities and through controlled experiments. Although the measurement of hidden inventories of special nuclear materials in large facilities is a challenging task, reliable estimates of these inventories can be developed through a combination of good measurements and the use of statistical models. 7 references, 5 figures
Development and testing of improved statistical wind power forecasting methods.
Energy Technology Data Exchange (ETDEWEB)
Mendes, J.; Bessa, R.J.; Keko, H.; Sumaili, J.; Miranda, V.; Ferreira, C.; Gama, J.; Botterud, A.; Zhou, Z.; Wang, J. (Decision and Information Sciences); (INESC Porto)
2011-12-06
(with spatial and/or temporal dependence). Statistical approaches to uncertainty forecasting basically consist of estimating the uncertainty based on observed forecasting errors. Quantile regression (QR) is currently a commonly used approach in uncertainty forecasting. In Chapter 3, we propose new statistical approaches to the uncertainty estimation problem by employing kernel density forecast (KDF) methods. We use two estimators in both offline and time-adaptive modes, namely, the Nadaraya-Watson (NW) and Quantile-copula (QC) estimators. We conduct detailed tests of the new approaches using QR as a benchmark. One of the major issues in wind power generation is sudden and large changes of wind power output over a short period of time, namely ramping events. In Chapter 4, we perform a comparative study of existing definitions and methodologies for ramp forecasting. We also introduce a new probabilistic method for ramp event detection. The method starts with a stochastic algorithm that generates wind power scenarios, which are passed through a high-pass filter for ramp detection and estimation of the likelihood that ramp events occur. The report is organized as follows: Chapter 2 presents the results of the application of ITL training criteria to deterministic WPF; Chapter 3 reports the study on probabilistic WPF, including new contributions to wind power uncertainty forecasting; Chapter 4 presents a new method to predict and visualize ramp events, comparing it with state-of-the-art methodologies; Chapter 5 briefly summarizes the main findings and contributions of this report.
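As a rough illustration of a kernel density forecast, the sketch below implements the textbook Nadaraya-Watson conditional density estimator with Gaussian kernels: the density of observed power y given a point forecast x is a kernel-weighted average over historical (forecast, observation) pairs. This is not the report's offline or time-adaptive implementation; the function names, bandwidths `hx` and `hy`, and the sample data are assumptions made for the example.

```python
import math


def gaussian_kernel(u: float) -> float:
    """Standard normal density, used as the smoothing kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def nw_conditional_density(y, x, xs, ys, hx, hy):
    """Nadaraya-Watson estimate of the density of y given x.

    xs, ys: historical point forecasts and the matching observations.
    hx, hy: kernel bandwidths (assumed values; in practice they are tuned).
    """
    weights = [gaussian_kernel((x - xi) / hx) for xi in xs]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("no historical forecast is close enough to x")
    smoothed = sum(w * gaussian_kernel((y - yi) / hy) / hy
                   for w, yi in zip(weights, ys))
    return smoothed / total


if __name__ == "__main__":
    # Hypothetical normalized forecasts and observations, for illustration.
    forecasts = [0.10, 0.15, 0.30, 0.35, 0.60, 0.65, 0.80, 0.85]
    observed = [0.12, 0.10, 0.33, 0.28, 0.55, 0.70, 0.78, 0.90]
    for y in (0.1, 0.3, 0.6):
        d = nw_conditional_density(y, 0.32, forecasts, observed, 0.05, 0.05)
        print(f"density of y={y} given forecast 0.32: {d:.3f}")
```

Historical cases whose forecast was close to the current one receive the largest weights, so the estimated density concentrates around the observations that followed similar forecasts.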
McDonald, A. David; Sandal, Leif Kristoffer
1998-01-01
Estimation of parameters in the drift and diffusion terms of stochastic differential equations involves simulation and generally requires substantial data sets. We examine a method that can be applied when available time series are limited to fewer than 20 observations per replication. We compare and contrast parameter estimation for linear and nonlinear first-order stochastic differential equations using two criterion functions: one based on a Chi-square statistic, put forward by Hurn and Lin...
Similar estimates of temperature impacts on global wheat yield by three independent methods
DEFF Research Database (Denmark)
Liu, Bing; Asseng, Senthold; Müller, Christoph
2016-01-01
The potential impact of global temperature change on global crop yield has recently been assessed with different methods. Here we show that grid-based and point-based simulations and statistical regressions (from historic records), without deliberate adaptation or CO2 fertilization effects, produce similar estimates of temperature impact on wheat yields at global and national scales. With a 1 °C global temperature increase, global wheat yield is projected to decline between 4.1% and 6.4%. Projected relative temperature impacts from different methods were similar for major wheat-producing countries... By forming a multi-method ensemble, it was possible to quantify ‘method uncertainty’ in addition to model uncertainty. This significantly improves confidence in estimates of climate impacts on global food security.
Lin, Jen-Jen; Cheng, Jung-Yu; Huang, Li-Fei; Lin, Ying-Hsiu; Wan, Yung-Liang; Tsui, Po-Hsiang
2017-05-01
The Nakagami distribution is a useful approximation to the statistics of ultrasound backscattered signals for tissue characterization. The choice of estimator may affect the Nakagami parameter in the detection of changes in backscattered statistics. In particular, the moment-based estimator (MBE) and maximum likelihood estimator (MLE) are two primary methods used to estimate the Nakagami parameters of ultrasound signals. This study explored the effects of the MBE and different MLE approximations on Nakagami parameter estimations. Ultrasound backscattered signals of different scatterer number densities were generated using a simulation model, and phantom experiments and measurements of human liver tissues were also conducted to acquire real backscattered echoes. Envelope signals were employed to estimate the Nakagami parameters by using the MBE, first- and second-order approximations of MLE (MLE1 and MLE2, respectively), and Greenwood approximation (MLEgw) for comparisons. The simulation results demonstrated that, compared with the MBE and MLE1, the MLE2 and MLEgw enabled more stable parameter estimations with small sample sizes. Notably, the required data length of the envelope signal was 3.6 times the pulse length. The phantom and tissue measurement results also showed that the Nakagami parameters estimated using the MLE2 and MLEgw could simultaneously differentiate various scatterer concentrations with lower standard deviations and reliably reflect physical meanings associated with the backscattered statistics. Therefore, the MLE2 and MLEgw are suggested as estimators for the development of Nakagami-based methodologies for ultrasound tissue characterization. Copyright © 2017 Elsevier B.V. All rights reserved.
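For orientation, the moment-based estimator can be written in a few lines. The sketch below uses the standard moment relations m = (E[R^2])^2 / Var(R^2) and Omega = E[R^2] for an envelope R, and checks them on synthetic data generated from the fact that R^2 is gamma distributed with shape m and scale Omega/m. The function names and simulation settings are assumptions for illustration; the MLE approximations compared in the study are not reproduced here.

```python
import random


def nakagami_mbe(envelope):
    """Moment-based estimates (m, omega) of the Nakagami parameters.

    m = (E[R^2])^2 / Var(R^2) and omega = E[R^2], where R is the envelope.
    """
    power = [r * r for r in envelope]
    n = len(power)
    omega = sum(power) / n
    variance = sum((p - omega) ** 2 for p in power) / n
    return omega * omega / variance, omega


def simulate_nakagami(m, omega, n, seed=0):
    """Draw n Nakagami(m, omega) envelope samples.

    Uses the relation R^2 ~ Gamma(shape=m, scale=omega/m).
    """
    rng = random.Random(seed)
    return [rng.gammavariate(m, omega / m) ** 0.5 for _ in range(n)]


if __name__ == "__main__":
    samples = simulate_nakagami(m=2.0, omega=1.0, n=20000)
    m_hat, omega_hat = nakagami_mbe(samples)
    print(f"estimated m = {m_hat:.3f}, estimated omega = {omega_hat:.3f}")
```

Rerunning the simulation with a much smaller n and several seeds gives a feel for the small-sample variability that the study reports for the MBE.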
Tziritis, E.
2016-03-01
The intrinsic vulnerability of a karstic aquifer system in central Greece was jointly assessed with the use of a statistical approach and the PI method, as a function of topography, protective cover effectiveness and the degree to which this cover is bypassed due to flow conditions. The input data for the index-overlay PI method were derived from fieldwork and 71 boreholes in the area; from this information the critical factors were compiled, including lithology, fissuring and karstification of bedrock, soil characteristics, hydrology, hydrogeology, topography and vegetation. These parameters were processed jointly with the aid of a GIS and yielded the final estimation of intrinsic aquifer vulnerability to contamination. Results were compared with an equivalent spatially distributed probability map obtained through a stochastic approach. The calibration and test phase of the latter relied on morphometric conditions derived by terrain analyses of a digital elevation model as well as on geology and land use from thematic maps. This procedure allowed topographic influences to be taken into account for a deep system such as the local karstic aquifer of the eastern Kopaida basin. Finally, results were validated with ground truth nitrate values obtained from 41 groundwater samples; in both cases they highlighted the spatial delineation of areas susceptible to contamination and provided a robust tool for regional planning actions and water resources management schemes.
2011-01-01
Background: As many respiratory viruses are responsible for influenza-like symptoms, accurate measures of the disease burden are not available and estimates are generally based on statistical methods. The objective of this study was to estimate absenteeism rates and hours lost due to seasonal influenza and compare these estimates with estimates of absenteeism attributable to the two H1N1 pandemic waves that occurred in 2009. Methods: Key absenteeism variables were extracted from Statistics Canada's monthly labour force survey (LFS). Absenteeism and the proportion of hours lost due to own illness or disability were modelled as a function of trend, seasonality and proxy variables for influenza activity from 1998 to 2009. Results: Hours lost due to the H1N1/09 pandemic strain were elevated compared to seasonal influenza, accounting for a loss of 0.2% of potential hours worked annually. In comparison, an estimated 0.08% of hours worked annually were lost due to seasonal influenza illnesses. Absenteeism rates due to influenza were estimated at 12% per year for seasonal influenza over the 1997/98 to 2008/09 seasons, and 13% for the two H1N1/09 pandemic waves. Employees who took time off due to a seasonal influenza infection took an average of 14 hours off. For the pandemic strain, the average absence was 25 hours. Conclusions: This study confirms that absenteeism due to seasonal influenza has typically ranged from 5% to 20%, with higher rates associated with multiple circulating strains. Absenteeism rates for the 2009 pandemic were similar to those occurring for seasonal influenza. Employees took more time off due to the pandemic strain than was typical for seasonal influenza. PMID:21486453
International Nuclear Information System (INIS)
Guha, S.; Taylor, J.H.
1996-01-01
It is critical that summary statistics on background data, or background levels, be computed based on standardized and defensible statistical methods because background levels are frequently used in subsequent analyses and comparisons performed by separate analysts over time. The final background for naturally occurring radionuclide concentrations in soil at a RCRA facility, and the associated statistical methods used to estimate these concentrations, are presented. The primary objective is to describe, via a case study, the statistical methods used to estimate 95% upper tolerance limits (UTL) on radionuclide background soil data sets. A 95% UTL on background samples can be used as a screening level concentration in the absence of definitive soil cleanup criteria for naturally occurring radionuclides. The statistical methods are based exclusively on EPA guidance. This paper includes an introduction, a discussion of the analytical results for the radionuclides and a detailed description of the statistical analyses leading to the determination of 95% UTLs. Soil concentrations reported are based on validated data. Data sets are categorized as surficial soil (samples collected at depths from zero to one-half foot) and deep soil (samples collected from 3 to 5 feet). These data sets were tested for statistical outliers, and underlying distributions were determined by using the chi-squared test for goodness-of-fit. UTLs for the data sets were then computed based on the percentage of non-detects and the appropriate best-fit distribution (lognormal, normal, or non-parametric). For data sets containing more than approximately 50% non-detects, non-parametric UTLs were computed.
Review of best estimate plus uncertainty methods of thermal-hydraulic safety analysis
International Nuclear Information System (INIS)
Prosek, A.; Mavko, B.
2003-01-01
In 1988 the United States Nuclear Regulatory Commission approved the revised rule on the acceptance of emergency core cooling system (ECCS) performance. Since then there has been significant interest in the development of codes and methodologies for best-estimate loss-of-coolant accident (LOCA) analyses. Several new best estimate plus uncertainty (BEPU) methods have been developed worldwide. The purpose of the paper is to review the developments in the direction of best estimate approaches with uncertainty quantification and to discuss the problems in practical applications of BEPU methods. In general, the licensee methods follow the original methods. The study indicated that uncertainty analysis with random sampling of input parameters and the use of order statistics for desired tolerance limits of output parameters is today a commonly accepted and mature approach. (author)
The choice of statistical methods for comparisons of dosimetric data in radiotherapy.
Chaikh, Abdulhamid; Giraud, Jean-Yves; Perrin, Emmanuel; Bresciani, Jean-Pierre; Balosso, Jacques
2014-09-18
Novel irradiation techniques are continuously introduced in radiotherapy to optimize the accuracy, the security and the clinical outcome of treatments. These changes could raise the question of discontinuity in dosimetric presentation and the subsequent need for practice adjustments in case of significant modifications. This study proposes a comprehensive approach to compare different techniques and tests whether their respective dose calculation algorithms give rise to statistically significant differences in the treatment doses for the patient. Statistical investigation principles are presented in the framework of a clinical example based on 62 fields of radiotherapy for lung cancer. The delivered doses in monitor units were calculated using three different dose calculation methods: the reference method computes the dose without tissue density corrections using the Pencil Beam Convolution (PBC) algorithm, whereas the new methods calculate the dose with tissue density correction in 1D and 3D using the Modified Batho (MB) method and the Equivalent Tissue Air Ratio (ETAR) method, respectively. The normality of the data and the homogeneity of variance between groups were tested using the Shapiro-Wilk and Levene tests, respectively, then non-parametric statistical tests were performed. Specifically, the dose means estimated by the different calculation methods were compared using Friedman's test and the Wilcoxon signed-rank test. In addition, the correlation between the doses calculated by the three methods was assessed using Spearman's rank and Kendall's rank tests. Friedman's test showed a significant effect of the calculation method on the delivered dose of lung cancer patients (p ...). The Wilcoxon signed-rank test of paired comparisons indicated that the delivered dose was significantly reduced using density-corrected methods as compared to the reference method. Spearman's and Kendall's rank tests indicated a positive correlation between the doses calculated with the different methods.
International Nuclear Information System (INIS)
Kim, Yochan; Park, Jinkyun; Jung, Wondea; Jang, Inseok; Hyun Seong, Poong
2015-01-01
Despite recent efforts toward data collection for supporting human reliability analysis, there remains a lack of empirical basis in determining the effects of performance shaping factors (PSFs) on human error probabilities (HEPs). To enhance the empirical basis regarding the effects of the PSFs, a statistical methodology using logistic regression and stepwise variable selection was proposed, and the effects of the PSFs on HEPs related to soft controls were estimated through the methodology. For this estimation, more than 600 human error opportunities related to soft controls in a computerized control room were obtained through laboratory experiments. From the eight PSF surrogates and combinations of these variables, the procedure quality, practice level, and the operation type were identified as significant factors for screen switch and mode conversion errors. The contributions of these significant factors to HEPs were also estimated in terms of a multiplicative form. The usefulness and limitation of the experimental data and the techniques employed are discussed herein, and we believe that the logistic regression and stepwise variable selection methods will provide a way to estimate the effects of PSFs on HEPs in an objective manner. - Highlights: • It is necessary to develop an empirical basis for the effects of the PSFs on the HEPs. • A statistical method using a logistic regression and variable selection was proposed. • The effects of PSFs on the HEPs of soft controls were empirically investigated. • The significant factors were identified and their effects were estimated.
Estimation of Anonymous Email Network Characteristics through Statistical Disclosure Attacks
Directory of Open Access Journals (Sweden)
Javier Portela
2016-11-01
Social network analysis aims to obtain relational data from social systems to identify leaders, roles, and communities in order to model profiles or predict a specific behavior in a users’ network. Preserving anonymity in social networks is a subject of major concern. Anonymity can be compromised by disclosing senders’ or receivers’ identity, message content, or sender-receiver relationships. Under strongly incomplete information, a statistical disclosure attack is used to estimate the network and node characteristics such as centrality and clustering measures, degree distribution, and small-world-ness. A database of email networks in 29 university faculties is used to study the method. A study of the small-world-ness and power-law characteristics of these email networks is also presented, helping to understand the behavior of small email networks.
Estimation of Anonymous Email Network Characteristics through Statistical Disclosure Attacks
Portela, Javier; García Villalba, Luis Javier; Silva Trujillo, Alejandra Guadalupe; Sandoval Orozco, Ana Lucila; Kim, Tai-Hoon
2016-01-01
Social network analysis aims to obtain relational data from social systems to identify leaders, roles, and communities in order to model profiles or predict a specific behavior in a users’ network. Preserving anonymity in social networks is a subject of major concern. Anonymity can be compromised by disclosing senders’ or receivers’ identity, message content, or sender-receiver relationships. Under strongly incomplete information, a statistical disclosure attack is used to estimate the network and node characteristics such as centrality and clustering measures, degree distribution, and small-world-ness. A database of email networks in 29 university faculties is used to study the method. A study of the small-world-ness and power-law characteristics of these email networks is also presented, helping to understand the behavior of small email networks. PMID:27809275
Seasonal adjustment methods and real time trend-cycle estimation
Bee Dagum, Estela
2016-01-01
This book explores widely used seasonal adjustment methods and recent developments in real time trend-cycle estimation. It discusses in detail the properties and limitations of X12ARIMA, TRAMO-SEATS and STAMP - the main seasonal adjustment methods used by statistical agencies. Several real-world cases illustrate each method and real data examples can be followed throughout the text. The trend-cycle estimation is presented using nonparametric techniques based on moving averages, linear filters and reproducing kernel Hilbert spaces, taking recent advances into account. The book provides a systematic treatment of results that to date have been scattered throughout the literature. Seasonal adjustment and real time trend-cycle prediction play an essential part at all levels of activity in modern economies. They are used by governments to counteract cyclical recessions, by central banks to control inflation, by decision makers for better modeling and planning and by hospitals, manufacturers, builders, transportat...
Simulation methods to estimate design power: an overview for applied research.
Arnold, Benjamin F; Hogan, Daniel R; Colford, John M; Hubbard, Alan E
2011-06-20
Estimating the required sample size and statistical power for a study is an integral part of study design. For standard designs, power equations provide an efficient solution to the problem, but they are unavailable for many complex study designs that arise in practice. For such complex study designs, computer simulation is a useful alternative for estimating study power. Although this approach is well known among statisticians, in our experience many epidemiologists and social scientists are unfamiliar with the technique. This article aims to address this knowledge gap. We review an approach to estimate study power for individual- or cluster-randomized designs using computer simulation. This flexible approach arises naturally from the model used to derive conventional power equations, but extends those methods to accommodate arbitrarily complex designs. The method is universally applicable to a broad range of designs and outcomes, and we present the material in a way that is approachable for quantitative, applied researchers. We illustrate the method using two examples (one simple, one complex) based on sanitation and nutritional interventions to improve child growth. We first show how simulation reproduces conventional power estimates for simple randomized designs over a broad range of sample scenarios to familiarize the reader with the approach. We then demonstrate how to extend the simulation approach to more complex designs. Finally, we discuss extensions to the examples in the article, and provide computer code to efficiently run the example simulations in both R and Stata. Simulation methods offer a flexible option to estimate statistical power for standard and non-traditional study designs and parameters of interest. The approach we have described is universally applicable for evaluating study designs used in epidemiologic and social science research.
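The simulation approach described above can be sketched for the simplest case. The article itself supplies code in R and Stata; the Python version below is an independent illustration under stated assumptions: a two-arm individually randomized design, a normally distributed outcome, and a large-sample z-test on the difference in means. The function name `simulate_power` and all parameter values are hypothetical.

```python
import math
import random
import statistics


def simulate_power(n_per_arm, effect, sd=1.0, alpha=0.05,
                   n_sims=2000, seed=0):
    """Estimate power as the share of simulated trials that reject H0.

    Each simulated trial draws `n_per_arm` outcomes per arm, with the
    treated arm shifted by `effect`, and applies a two-sided z-test to
    the difference in means.
    """
    rng = random.Random(seed)
    z_crit = statistics.NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(n_sims):
        control = [rng.gauss(0.0, sd) for _ in range(n_per_arm)]
        treated = [rng.gauss(effect, sd) for _ in range(n_per_arm)]
        diff = statistics.fmean(treated) - statistics.fmean(control)
        se = math.sqrt(statistics.variance(control) / n_per_arm
                       + statistics.variance(treated) / n_per_arm)
        if abs(diff / se) > z_crit:
            rejections += 1
    return rejections / n_sims


if __name__ == "__main__":
    for n in (20, 50, 100):
        print(n, simulate_power(n_per_arm=n, effect=0.5))
```

To extend the sketch toward a cluster-randomized design, the data-generating step would add a shared random effect per cluster and the test would be replaced by the analysis planned for the real study; the counting of rejections stays the same.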
Similar Estimates of Temperature Impacts on Global Wheat Yield by Three Independent Methods
Liu, Bing; Asseng, Senthold; Müller, Christoph; Ewert, Frank; Elliott, Joshua; Lobell, David B.; Martre, Pierre; Ruane, Alex C.; Wallach, Daniel; Jones, James W.;
2016-01-01
The potential impact of global temperature change on global crop yield has recently been assessed with different methods. Here we show that grid-based and point-based simulations and statistical regressions (from historic records), without deliberate adaptation or CO2 fertilization effects, produce similar estimates of temperature impact on wheat yields at global and national scales. With a 1 °C global temperature increase, global wheat yield is projected to decline between 4.1% and 6.4%. Projected relative temperature impacts from different methods were similar for major wheat-producing countries China, India, USA and France, but less so for Russia. Point-based and grid-based simulations, and to some extent the statistical regressions, were consistent in projecting that warmer regions are likely to suffer more yield loss with increasing temperature than cooler regions. By forming a multi-method ensemble, it was possible to quantify 'method uncertainty' in addition to model uncertainty. This significantly improves confidence in estimates of climate impacts on global food security.
Similar estimates of temperature impacts on global wheat yield by three independent methods
Liu, Bing; Asseng, Senthold; Müller, Christoph; Ewert, Frank; Elliott, Joshua; Lobell, David B.; Martre, Pierre; Ruane, Alex C.; Wallach, Daniel; Jones, James W.; Rosenzweig, Cynthia; Aggarwal, Pramod K.; Alderman, Phillip D.; Anothai, Jakarat; Basso, Bruno; Biernath, Christian; Cammarano, Davide; Challinor, Andy; Deryng, Delphine; Sanctis, Giacomo De; Doltra, Jordi; Fereres, Elias; Folberth, Christian; Garcia-Vila, Margarita; Gayler, Sebastian; Hoogenboom, Gerrit; Hunt, Leslie A.; Izaurralde, Roberto C.; Jabloun, Mohamed; Jones, Curtis D.; Kersebaum, Kurt C.; Kimball, Bruce A.; Koehler, Ann-Kristin; Kumar, Soora Naresh; Nendel, Claas; O'Leary, Garry J.; Olesen, Jørgen E.; Ottman, Michael J.; Palosuo, Taru; Prasad, P. V. Vara; Priesack, Eckart; Pugh, Thomas A. M.; Reynolds, Matthew; Rezaei, Ehsan E.; Rötter, Reimund P.; Schmid, Erwin; Semenov, Mikhail A.; Shcherbak, Iurii; Stehfest, Elke; Stöckle, Claudio O.; Stratonovitch, Pierre; Streck, Thilo; Supit, Iwan; Tao, Fulu; Thorburn, Peter; Waha, Katharina; Wall, Gerard W.; Wang, Enli; White, Jeffrey W.; Wolf, Joost; Zhao, Zhigan; Zhu, Yan
2016-12-01
The potential impact of global temperature change on global crop yield has recently been assessed with different methods. Here we show that grid-based and point-based simulations and statistical regressions (from historic records), without deliberate adaptation or CO2 fertilization effects, produce similar estimates of temperature impact on wheat yields at global and national scales. With a 1 °C global temperature increase, global wheat yield is projected to decline between 4.1% and 6.4%. Projected relative temperature impacts from different methods were similar for major wheat-producing countries China, India, USA and France, but less so for Russia. Point-based and grid-based simulations, and to some extent the statistical regressions, were consistent in projecting that warmer regions are likely to suffer more yield loss with increasing temperature than cooler regions. By forming a multi-method ensemble, it was possible to quantify 'method uncertainty' in addition to model uncertainty. This significantly improves confidence in estimates of climate impacts on global food security.
International Nuclear Information System (INIS)
Ohmori, Naoki; Ashida, Kenji; Fujita, Osamu
2003-01-01
Because the glandular content rate is an important factor in evaluating breast cancer detection and average glandular dose, it is important in mammography research to estimate and analyze this rate. The purpose of this study was to obtain a formula for statistical estimation of the glandular content rate, to clarify statistically the influence of age group and compressed breast thickness (CBT) on estimating the glandular content rate, and to show statistically the general relation between glandular content rate and the factors of age and CBT. The subjects were 740 Japanese women aged 20-91 years (mean±SD: 48.3±12.8 years) who had undergone mammography. In our study, the glandular content rate was statistically estimated from age group, mAs-value, and CBT when subjects underwent mammography, from a phantom simulation, and from MR images of the breast. In addition, multivariate analysis was carried out to examine statistically the influence of age group and CBT on glandular content rate. The mean glandular content rate as estimated by age group was as follows: 35.6% for those in their 20s, 33.4% in the 30s, 27.5% in the 40s, 23.8% in the 50s, and 21.8% in those 60 and over. The rate for the subjects as a whole was 27.1%. This study indicated that overestimation occurred if the estimated value of the glandular content rate was not corrected in the 3D-measurement by MRI. In addition, this study showed that the statistical influence on glandular content rate was significantly larger for CBT than for age. (author)
International Nuclear Information System (INIS)
Martin, Robert P.; Nutt, William T.
2011-01-01
Research highlights: → Historical recitation on application of order-statistics models to nuclear power plant thermal-hydraulics safety analysis. → Interpretation of regulatory language regarding 10 CFR 50.46 reference to a 'high level of probability'. → Derivation and explanation of order-statistics-based evaluation methodologies considering multi-variate acceptance criteria. → Summary of order-statistics models and recommendations to the nuclear power plant thermal-hydraulics safety analysis community. - Abstract: The application of order-statistics in best-estimate plus uncertainty nuclear safety analysis has received a considerable amount of attention from methodology practitioners, regulators, and academia. At the root of the debate are two questions: (1) what is an appropriate quantitative interpretation of 'high level of probability' in regulatory language appearing in the LOCA rule, 10 CFR 50.46 and (2) how best to mathematically characterize the multi-variate case. An original derivation is offered to provide a quantitative basis for 'high level of probability.' At root of the second question is whether one should recognize a probability statement based on the tolerance region method of Wald and Guba, et al., for multi-variate problems, one explicitly based on the regulatory limits, best articulated in the Wallis-Nutt 'Testing Method', or something else entirely. This paper reviews the origins of the different positions, key assumptions, limitations, and relationship to addressing acceptance criteria. It presents a mathematical interpretation of the regulatory language, including a complete derivation of uni-variate order-statistics (as credited in AREVA's Realistic Large Break LOCA methodology) and extension to multi-variate situations. Lastly, it provides recommendations for LOCA applications, endorsing the 'Testing Method' and addressing acceptance methods allowing for limited sample failures.
Directory of Open Access Journals (Sweden)
Zheng Hui
2011-04-01
Background: As many respiratory viruses are responsible for influenza-like symptoms, accurate measures of the disease burden are not available and estimates are generally based on statistical methods. The objective of this study was to estimate absenteeism rates and hours lost due to seasonal influenza and compare these estimates with estimates of absenteeism attributable to the two H1N1 pandemic waves that occurred in 2009. Methods: Key absenteeism variables were extracted from Statistics Canada's monthly labour force survey (LFS). Absenteeism and the proportion of hours lost due to own illness or disability were modelled as a function of trend, seasonality and proxy variables for influenza activity from 1998 to 2009. Results: Hours lost due to the H1N1/09 pandemic strain were elevated compared to seasonal influenza, accounting for a loss of 0.2% of potential hours worked annually. In comparison, an estimated 0.08% of hours worked annually were lost due to seasonal influenza illnesses. Absenteeism rates due to influenza were estimated at 12% per year for seasonal influenza over the 1997/98 to 2008/09 seasons, and 13% for the two H1N1/09 pandemic waves. Employees who took time off due to a seasonal influenza infection took an average of 14 hours off. For the pandemic strain, the average absence was 25 hours. Conclusions: This study confirms that absenteeism due to seasonal influenza has typically ranged from 5% to 20%, with higher rates associated with multiple circulating strains. Absenteeism rates for the 2009 pandemic were similar to those occurring for seasonal influenza. Employees took more time off due to the pandemic strain than was typical for seasonal influenza.
Skinner, Carl G; Patel, Manish M; Thomas, Jerry D; Miller, Michael A
2011-01-01
Statistical methods are pervasive in medical research and general medical literature. Understanding general statistical concepts will enhance our ability to critically appraise the current literature and ultimately improve the delivery of patient care. This article intends to provide an overview of the common statistical methods relevant to medicine.
Kim, Sanghong; Kano, Manabu; Nakagawa, Hiroshi; Hasebe, Shinji
2011-12-15
Development of quality estimation models using near infrared spectroscopy (NIRS) and multivariate analysis has been accelerated as a process analytical technology (PAT) tool in the pharmaceutical industry. Although linear regression methods such as partial least squares (PLS) are widely used, they cannot always achieve high estimation accuracy because physical and chemical properties of a measuring object have a complex effect on NIR spectra. In this research, locally weighted PLS (LW-PLS) which utilizes a newly defined similarity between samples is proposed to estimate active pharmaceutical ingredient (API) content in granules for tableting. In addition, a statistical wavelength selection method which quantifies the effect of API content and other factors on NIR spectra is proposed. LW-PLS and the proposed wavelength selection method were applied to real process data provided by Daiichi Sankyo Co., Ltd., and the estimation accuracy was improved by 38.6% in root mean square error of prediction (RMSEP) compared to the conventional PLS using wavelengths selected on the basis of variable importance on the projection (VIP). The results clearly show that the proposed calibration modeling technique is useful for API content estimation and is superior to the conventional one. Copyright © 2011 Elsevier B.V. All rights reserved.
Statistical models and methods for reliability and survival analysis
Couallier, Vincent; Huber-Carol, Catherine; Mesbah, Mounir; Limnios, Nikolaos; Gerville-Reache, Leo
2013-01-01
Statistical Models and Methods for Reliability and Survival Analysis brings together contributions by specialists in statistical theory as they discuss their applications providing up-to-date developments in methods used in survival analysis, statistical goodness of fit, stochastic processes for system reliability, amongst others. Many of these are related to the work of Professor M. Nikulin in statistics over the past 30 years. The authors gather together various contributions with a broad array of techniques and results, divided into three parts - Statistical Models and Methods, Statistical
A method for statistically comparing spatial distribution maps
Directory of Open Access Journals (Sweden)
Reynolds Mary G
2009-01-01
Background: Ecological niche modeling is a method for estimation of species distributions based on certain ecological parameters. Thus far, empirical determination of significant differences between independently generated distribution maps for a single species (maps which are created through equivalent processes, but with different ecological input parameters) has been challenging. Results: We describe a method for comparing model outcomes, which allows a statistical evaluation of whether the strength of prediction and breadth of predicted areas is measurably different between projected distributions. To create ecological niche models for statistical comparison, we utilized GARP (Genetic Algorithm for Rule-Set Production) software to generate ecological niche models of human monkeypox in Africa. We created several models, keeping constant the case location input records for each model but varying the ecological input data. In order to assess the relative importance of each ecological parameter included in the development of the individual predicted distributions, we performed pixel-to-pixel comparisons between model outcomes and calculated the mean difference in pixel scores. We used a two-sample Student's t-test (assuming as null hypothesis that both maps were identical to each other regardless of which input parameters were used) to examine whether the mean difference in corresponding pixel scores from one map to another was greater than would be expected by chance alone. We also utilized weighted kappa statistics, frequency distributions, and percent difference to look at the disparities in pixel scores. Multiple independent statistical tests indicated precipitation as the single most important independent ecological parameter in the niche model for human monkeypox disease. Conclusion: In addition to improving our understanding of the natural factors influencing the distribution of human monkeypox disease, such pixel-to-pixel comparison
Application of advanced statistical methods in assessment of the late phase of a nuclear accident
International Nuclear Information System (INIS)
Hofman, R.
2008-01-01
The paper presents a new methodology for improving of estimates of radiological situation on terrain in the late phase of a nuclear accident. Methods of Bayesian filtering are applied to the problem. The estimates are based on combination of modeled and measured data provided by responsible authorities. Exploiting information on uncertainty of both the data sources, we are able to produce improved estimate of the true situation on terrain. We also attempt to account for model error, which is unknown and plays crucial role in accuracy of the estimates. The main contribution of this paper is application of an approach based on advanced statistical methods, which allows for estimating of model error covariance structure upon measurements. Model error is estimated on basis of measured-minus-observed residuals evaluated upon measured and modeled values. The methodology is demonstrated on a sample scenario with simulated measurements. (authors)
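The idea of combining a modeled value with a measurement according to the uncertainty of each can be sketched with a single scalar Gaussian update. This is only a minimal illustration of the Bayesian principle. The abstract does not specify the actual filter or the procedure for estimating the model error covariance, and the function name and scalar setting below are assumptions.

```python
def bayes_update(model_value, model_var, measured_value, measured_var):
    """Combine a model prediction and a measurement, both with Gaussian errors.

    Returns the posterior mean and variance. The gain weights the
    measurement more heavily when the model variance is large.
    """
    gain = model_var / (model_var + measured_var)
    posterior_mean = model_value + gain * (measured_value - model_value)
    posterior_var = (1.0 - gain) * model_var
    return posterior_mean, posterior_var
```

With equal variances the estimate falls halfway between the two sources and the variance is halved, so `bayes_update(10.0, 4.0, 14.0, 4.0)` gives a mean of 12.0 and a variance of 2.0.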
Application of advanced statistical methods in assessment of the late phase of a nuclear accident
International Nuclear Information System (INIS)
Hofman, R.
2009-01-01
The paper presents a new methodology for improving of estimates of radiological situation on terrain in the late phase of a nuclear accident. Methods of Bayesian filtering are applied to the problem. The estimates are based on combination of modeled and measured data provided by responsible authorities. Exploiting information on uncertainty of both the data sources, we are able to produce improved estimate of the true situation on terrain. We also attempt to account for model error, which is unknown and plays crucial role in accuracy of the estimates. The main contribution of this paper is application of an approach based on advanced statistical methods, which allows for estimating of model error covariance structure upon measurements. Model error is estimated on basis of measured-minus-observed residuals evaluated upon measured and modeled values. The methodology is demonstrated on a sample scenario with simulated measurements. (authors)
Estimating Selected Streamflow Statistics Representative of 1930-2002 in West Virginia
Wiley, Jeffrey B.
2008-01-01
Regional equations and procedures were developed for estimating 1-, 3-, 7-, 14-, and 30-day 2-year; 1-, 3-, 7-, 14-, and 30-day 5-year; and 1-, 3-, 7-, 14-, and 30-day 10-year hydrologically based low-flow frequency values for unregulated streams in West Virginia. Regional equations and procedures also were developed for estimating the 1-day, 3-year and 4-day, 3-year biologically based low-flow frequency values; the U.S. Environmental Protection Agency harmonic-mean flows; and the 10-, 25-, 50-, 75-, and 90-percent flow-duration values. Regional equations were developed using ordinary least-squares regression using statistics from 117 U.S. Geological Survey continuous streamflow-gaging stations as dependent variables and basin characteristics as independent variables. Equations for three regions in West Virginia - North, South-Central, and Eastern Panhandle - were determined. Drainage area, precipitation, and longitude of the basin centroid are significant independent variables in one or more of the equations. Estimating procedures are presented for determining statistics at a gaging station, a partial-record station, and an ungaged location. Examples of some estimating procedures are presented.
Methods to estimate historical daily streamflow for ungaged stream locations in Minnesota
Lorenz, David L.; Ziegeweid, Jeffrey R.
2016-03-14
Effective and responsible management of water resources relies on a thorough understanding of the quantity and quality of available water; however, streamgages cannot be installed at every location where streamflow information is needed. Therefore, methods for estimating streamflow at ungaged stream locations need to be developed. This report presents a statewide study to develop methods to estimate the structure of historical daily streamflow at ungaged stream locations in Minnesota. Historical daily mean streamflow at ungaged locations in Minnesota can be estimated by transferring streamflow data at streamgages to the ungaged location using the QPPQ method. The QPPQ method uses flow-duration curves at an index streamgage, relying on the assumption that exceedance probabilities are equivalent between the index streamgage and the ungaged location, and estimates the flow at the ungaged location using the estimated flow-duration curve. Flow-duration curves at ungaged locations can be estimated using recently developed regression equations that have been incorporated into StreamStats (http://streamstats.usgs.gov/), which is a U.S. Geological Survey Web-based interactive mapping tool that can be used to obtain streamflow statistics, drainage-basin characteristics, and other information for user-selected locations on streams.
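The QPPQ transfer described in this abstract can be sketched in Python as follows: each daily flow at the index streamgage is converted to an exceedance probability using the index flow-duration curve, and that probability is then converted to a flow using the estimated flow-duration curve at the ungaged location. The function names, the representation of a flow-duration curve as (exceedance probability, flow) pairs, and the plain linear interpolation are assumptions for this sketch and are not taken from the report.

```python
def interp(x, xs, ys):
    """Piecewise-linear interpolation; xs must be ascending. Clamps at the ends."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    for i in range(1, len(xs)):
        if x <= xs[i]:
            t = (x - xs[i - 1]) / (xs[i] - xs[i - 1])
            return ys[i - 1] + t * (ys[i] - ys[i - 1])


def qppq(index_flows, index_fdc, ungaged_fdc):
    """Transfer daily flows from an index streamgage to an ungaged location.

    index_fdc and ungaged_fdc are lists of (exceedance_probability, flow)
    pairs sorted by ascending probability (hence descending flow).
    """
    probs_i, flows_i = zip(*index_fdc)
    probs_u, flows_u = zip(*ungaged_fdc)
    # Q -> P lookup needs ascending flows, so reverse both index arrays
    asc_flows, asc_probs = flows_i[::-1], probs_i[::-1]
    estimates = []
    for q in index_flows:
        p = interp(q, asc_flows, asc_probs)            # Q -> P at the index gage
        estimates.append(interp(p, probs_u, flows_u))  # P -> Q at the ungaged site
    return estimates
```

As a hypothetical example, with an index curve `[(0.1, 100), (0.5, 40), (0.9, 10)]` and an ungaged curve `[(0.1, 50), (0.5, 20), (0.9, 5)]`, an index flow of 70 maps to an exceedance probability of 0.3 and then to an estimated flow of 35 at the ungaged location.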
On the method of logarithmic cumulants for parametric probability density function estimation.
Krylov, Vladimir A; Moser, Gabriele; Serpico, Sebastiano B; Zerubia, Josiane
2013-10-01
Parameter estimation of probability density functions is one of the major steps in the area of statistical image and signal processing. In this paper we explore several properties and limitations of the recently proposed method of logarithmic cumulants (MoLC) parameter estimation approach which is an alternative to the classical maximum likelihood (ML) and method of moments (MoM) approaches. We derive the general sufficient condition for a strong consistency of the MoLC estimates which represents an important asymptotic property of any statistical estimator. This result enables the demonstration of the strong consistency of MoLC estimates for a selection of widely used distribution families originating from (but not restricted to) synthetic aperture radar image processing. We then derive the analytical conditions of applicability of MoLC to samples for the distribution families in our selection. Finally, we conduct various synthetic and real data experiments to assess the comparative properties, applicability and small sample performance of MoLC notably for the generalized gamma and K families of distributions. Supervised image classification experiments are considered for medical ultrasound and remote-sensing SAR imagery. The obtained results suggest that MoLC is a feasible and computationally fast yet not universally applicable alternative to MoM. MoLC becomes especially useful when the direct ML approach turns out to be unfeasible.
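To make the idea of matching log-cumulants concrete, the sketch below applies it to the two-parameter Weibull distribution, a special case of the generalized gamma family for which the equations can be solved in closed form. For a Weibull variable with scale λ and shape k, the first two log-cumulants are κ1 = ln λ − γ/k and κ2 = π²/(6k²), where γ is the Euler-Mascheroni constant. The choice of the Weibull family and the function name are assumptions made for illustration. The paper itself concentrates on the generalized gamma and K families, which require numerical solution.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def molc_weibull(samples):
    """Estimate Weibull (shape, scale) by matching the first two log-cumulants.

    Uses the sample mean and (biased) sample variance of ln(x).
    All samples must be strictly positive.
    """
    logs = [math.log(x) for x in samples]
    n = len(logs)
    k1 = sum(logs) / n                          # first log-cumulant
    k2 = sum((v - k1) ** 2 for v in logs) / n   # second log-cumulant
    shape = math.pi / math.sqrt(6.0 * k2)
    scale = math.exp(k1 + EULER_GAMMA / shape)
    return shape, scale
```

Because only logarithms, a mean, and a variance are needed, the estimate is cheap to compute, which is consistent with the abstract's description of MoLC as computationally fast.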
Sikora, Grzegorz; Teuerle, Marek; Wyłomańska, Agnieszka; Grebenkov, Denis
2017-08-01
The most common way of estimating the anomalous scaling exponent from single-particle trajectories consists of a linear fit of the dependence of the time-averaged mean-square displacement on the lag time at the log-log scale. We investigate the statistical properties of this estimator in the case of fractional Brownian motion (FBM). We determine the mean value, the variance, and the distribution of the estimator. Our theoretical results are confirmed by Monte Carlo simulations. In the limit of long trajectories, the estimator is shown to be asymptotically unbiased, consistent, and with vanishing variance. These properties ensure an accurate estimation of the scaling exponent even from a single (long enough) trajectory. As a consequence, we prove that the usual way to estimate the diffusion exponent of FBM is correct from the statistical point of view. Moreover, the knowledge of the estimator distribution is the first step toward new statistical tests of FBM and toward a more reliable interpretation of the experimental histograms of scaling exponents in microbiology.
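The estimator discussed in this abstract, a linear fit of the time-averaged mean-square displacement (TAMSD) against lag time on a log-log scale, can be sketched as follows for a one-dimensional trajectory. The function names and the choice of lags 1 to 10 are assumptions for this sketch, and the statistical properties derived in the paper are not reproduced here.

```python
import math


def tamsd(traj, lag):
    """Time-averaged mean-square displacement of a 1D trajectory at one lag."""
    n = len(traj) - lag
    return sum((traj[i + lag] - traj[i]) ** 2 for i in range(n)) / n


def fit_exponent(traj, max_lag=10):
    """Least-squares slope of log(TAMSD) versus log(lag) for lags 1..max_lag."""
    xs = [math.log(lag) for lag in range(1, max_lag + 1)]
    ys = [math.log(tamsd(traj, lag)) for lag in range(1, max_lag + 1)]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx
```

For ordinary Brownian motion the fitted exponent should be close to 1, and for a ballistic trajectory such as `x[t] = t` the TAMSD equals the squared lag, so the fitted exponent is 2.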
Qu, Long; Nettleton, Dan; Dekkers, Jack C M
2012-12-01
Given a large number of t-statistics, we consider the problem of approximating the distribution of noncentrality parameters (NCPs) by a continuous density. This problem is closely related to the control of false discovery rates (FDR) in massive hypothesis testing applications, e.g., microarray gene expression analysis. Our methodology is similar to, but improves upon, the existing approach by Ruppert, Nettleton, and Hwang (2007, Biometrics, 63, 483-495). We provide parametric, nonparametric, and semiparametric estimators for the distribution of NCPs, as well as estimates of the FDR and local FDR. In the parametric situation, we assume that the NCPs follow a distribution that leads to an analytically available marginal distribution for the test statistics. In the nonparametric situation, we use convex combinations of basis density functions to estimate the density of the NCPs. A sequential quadratic programming procedure is developed to maximize the penalized likelihood. The smoothing parameter is selected with the approximate network information criterion. A semiparametric estimator is also developed to combine both parametric and nonparametric fits. Simulations show that, under a variety of situations, our density estimates are closer to the underlying truth and our FDR estimates are improved compared with alternative methods. Data-based simulations and the analyses of two microarray datasets are used to evaluate the performance in realistic situations. © 2012, The International Biometric Society.
International Nuclear Information System (INIS)
Azarm, M.A.; Hsu, F.; Martinez-Guridi, G.; Vesely, W.E.
1993-07-01
This report introduces a new perspective on the basic concept of dependent failures where the definition of dependency is based on clustering in failure times of similar components. This perspective has two significant implications: first, it relaxes the conventional assumption that dependent failures must be simultaneous and result from a severe shock; second, it allows the analyst to use all the failures in a time continuum to estimate the potential for multiple failures in a window of time (e.g., a test interval), therefore arriving at a more accurate value for system unavailability. In addition, the models developed here provide a method for plant-specific analysis of dependency, reflecting the plant-specific maintenance practices that reduce or increase the contribution of dependent failures to system unavailability. The proposed methodology can be used for screening analysis of failure data to estimate the fraction of dependent failures among the failures. In addition, the proposed method can evaluate the impact of the observed dependency on system unavailability and plant risk. The formulations derived in this report have undergone various levels of validation through computer simulation studies and pilot applications. The pilot applications of these methodologies showed that the contribution of dependent failures of diesel generators in one plant was negligible, while in another plant it was quite significant. They also showed that in the plant with significant contribution of dependency to Emergency Power System (EPS) unavailability, the contribution changed with time. Similar findings were reported for the Containment Fan Cooler breakers. Drawing such conclusions about system performance would not have been possible with any other reported dependency methodologies.
Statistical approach for selection of regression model during validation of bioanalytical method
Directory of Open Access Journals (Sweden)
Natalija Nakov
2014-06-01
The selection of an adequate regression model is the basis for obtaining accurate and reproducible results during bioanalytical method validation. Given the wide concentration range frequently present in bioanalytical assays, heteroscedasticity of the data may be expected. Several weighted linear and quadratic regression models were evaluated during the selection of the adequate curve fit using nonparametric statistical tests: the one-sample rank test and the Wilcoxon signed rank test for two independent groups of samples. The results obtained with the one-sample rank test could not give statistical justification for the selection of linear vs. quadratic regression models because only slight differences between the errors (presented through the relative residuals, RR) were obtained. Estimation of the significance of the differences in the RR was achieved using the Wilcoxon signed rank test, where linear and quadratic regression models were treated as two independent groups. The application of this simple non-parametric statistical test provides statistical confirmation of the choice of an adequate regression model.
Study on Comparison of Bidding and Pricing Behavior Distinction between Estimate Methods
Morimoto, Emi; Namerikawa, Susumu
The most characteristic recent trend in bidding and pricing behavior is the increasing number of bidders just above the criteria for low-price bidding investigations. The contractor's markup is the difference between the bidding price and the execution price. Therefore, in public works bids in Japan, the contractor's markup is the difference between the price set by the criteria for low-price bidding investigations and the execution price. In effect, bidders' strategies and behavior have been controlled by public engineers' budgets. Estimation and bidding are inseparably linked in the Japanese public works procurement system. The trial of the unit price-type estimation method began in 2004. The accumulated estimation method, meanwhile, is one of the general methods in public works. There are thus two types of standard estimation methods in Japan. In this study, we performed a statistical analysis of the bid information for civil engineering works of the Ministry of Land, Infrastructure, and Transportation in 2008. The analysis indicates that bidding and pricing behavior is related to the estimation method used for public works bids in Japan. The two standard estimation methods produce different results in the number of bidders (the bid/no-bid decision) and in the distribution of bid prices (the markup decision). The comparison of bid-price distributions showed that, in large public works, the percentage of bids concentrated at the criteria for low-price bidding investigations tended to be higher under the unit price-type estimation method than under the accumulated estimation method. In addition, the number of bidders for public works estimated by unit price tends to increase significantly. Unit-price estimation is likely to have been one of the factors in construction companies' decisions on whether to participate in the bidding.
Mann, Michael E.; Steinman, Byron A.; Miller, Sonya K.; Frankcombe, Leela M.; England, Matthew H.; Cheung, Anson H.
2016-04-01
The temporary slowdown in large-scale surface warming during the early 2000s has been attributed to both external and internal sources of climate variability. Using semiempirical estimates of the internal low-frequency variability component in Northern Hemisphere, Atlantic, and Pacific surface temperatures in concert with statistical hindcast experiments, we investigate whether the slowdown and its recent recovery were predictable. We conclude that the internal variability of the North Pacific, which played a critical role in the slowdown, does not appear to have been predictable using statistical forecast methods. An additional minor contribution from the North Atlantic, by contrast, appears to exhibit some predictability. While our analyses focus on combining semiempirical estimates of internal climatic variability with statistical hindcast experiments, possible implications for initialized model predictions are also discussed.
Statistical methods for the analysis of a screening test for chronic beryllium disease
Energy Technology Data Exchange (ETDEWEB)
Frome, E.L.; Neubert, R.L. [Oak Ridge National Lab., TN (United States). Mathematical Sciences Section; Smith, M.H.; Littlefield, L.G.; Colyer, S.P. [Oak Ridge Inst. for Science and Education, TN (United States). Medical Sciences Div.
1994-10-01
The lymphocyte proliferation test (LPT) is a noninvasive screening procedure used to identify persons who may have chronic beryllium disease. A practical problem in the analysis of LPT well counts is the occurrence of outlying data values (approximately 7% of the time). A log-linear regression model is used to describe the expected well counts for each set of test conditions. The variance of the well counts is proportional to the square of the expected counts, and two resistant regression methods are used to estimate the parameters of interest. The first approach uses least absolute values (LAV) on the log of the well counts to estimate beryllium stimulation indices (SIs) and the coefficient of variation. The second approach uses a resistant regression version of maximum quasi-likelihood estimation. A major advantage of the resistant regression methods is that it is not necessary to identify and delete outliers. These two new methods for the statistical analysis of the LPT data and the outlier rejection method that is currently being used are applied to 173 LPT assays. The authors strongly recommend the LAV method for routine analysis of the LPT.
The choice of statistical methods for comparisons of dosimetric data in radiotherapy
International Nuclear Information System (INIS)
Chaikh, Abdulhamid; Giraud, Jean-Yves; Perrin, Emmanuel; Bresciani, Jean-Pierre; Balosso, Jacques
2014-01-01
Novel irradiation techniques are continuously introduced in radiotherapy to optimize the accuracy, the security and the clinical outcome of treatments. These changes could raise the question of discontinuity in dosimetric presentation and the subsequent need for practice adjustments in case of significant modifications. This study proposes a comprehensive approach to compare different techniques and tests whether their respective dose calculation algorithms give rise to statistically significant differences in the treatment doses for the patient. Statistical investigation principles are presented in the framework of a clinical example based on 62 fields of radiotherapy for lung cancer. The delivered doses in monitor units were calculated using three different dose calculation methods: the reference method calculates the dose without tissue density corrections using the Pencil Beam Convolution (PBC) algorithm, whereas the new methods calculate the dose with tissue density correction in 1D and 3D using the Modified Batho (MB) method and the Equivalent Tissue Air Ratio (ETAR) method, respectively. The normality of the data and the homogeneity of variance between groups were tested using the Shapiro-Wilk and Levene tests, respectively; then non-parametric statistical tests were performed. Specifically, the dose means estimated by the different calculation methods were compared using Friedman's test and the Wilcoxon signed-rank test. In addition, the correlation between the doses calculated by the three methods was assessed using Spearman's rank and Kendall's rank tests. Friedman's test showed a significant effect of the calculation method on the delivered dose of lung cancer patients (p < 0.001). The density correction methods yielded lower doses as compared to PBC by on average (−5 ± 4.4 SD) for MB and (−4.7 ± 5 SD) for ETAR. Post-hoc Wilcoxon signed-rank test of paired comparisons indicated that the delivered dose was significantly reduced using density
Estimation of measurement variance in the context of environment statistics
Maiti, Pulakesh
2015-02-01
The objective of environment statistics is to provide information on the environment, on its most important changes over time and across locations, and to identify the main factors that influence them. Ultimately, environment statistics are required to produce higher-quality statistical information, for which timely, reliable and comparable data are needed. The lack of proper, uniform definitions and unambiguous classifications poses serious problems for procuring quality data and causes measurement errors. We consider the problem of estimating measurement variance so that measures may be adopted to improve the quality of data on environmental goods and services and on value statements in economic terms. The measurement technique considered here is that of employing personal interviewers, and the sampling design considered is two-stage sampling.
Statistical Methods for Stochastic Differential Equations
Kessler, Mathieu; Sorensen, Michael
2012-01-01
The seventh volume in the SemStat series, Statistical Methods for Stochastic Differential Equations presents current research trends and recent developments in statistical methods for stochastic differential equations. Written to be accessible to both new students and seasoned researchers, each self-contained chapter starts with introductions to the topic at hand and builds gradually towards discussing recent research. The book covers Wiener-driven equations as well as stochastic differential equations with jumps, including continuous-time ARMA processes and COGARCH processes. It presents a sp
Statistics as Unbiased Estimators: Exploring the Teaching of Standard Deviation
Wasserman, Nicholas H.; Casey, Stephanie; Champion, Joe; Huey, Maryann
2017-01-01
This manuscript presents findings from a study about the knowledge for and planned teaching of standard deviation. We investigate how understanding variance as an unbiased (inferential) estimator--not just a descriptive statistic for the variation (spread) in data--is related to teachers' instruction regarding standard deviation, particularly…
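The distinction drawn in this abstract between variance as a descriptive statistic and as an unbiased estimator comes down to dividing by n or by n − 1. A small sketch, with made-up data, using Python's standard `statistics` module for comparison:

```python
import statistics

# Made-up sample; any list of numbers would do.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n = len(data)
mean = sum(data) / n
sum_sq_dev = sum((x - mean) ** 2 for x in data)

# Descriptive (population) variance: divide by n.
descriptive_var = sum_sq_dev / n

# Unbiased (inferential) estimator of the population variance: divide by n - 1.
unbiased_var = sum_sq_dev / (n - 1)

# The standard library exposes both conventions.
same_descriptive = statistics.pvariance(data)
same_unbiased = statistics.variance(data)
```

Note that the square root of the unbiased variance estimator is not itself an unbiased estimator of the standard deviation, which is part of why the teaching of this topic is subtle.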
Nakae, Ken; Ikegaya, Yuji; Ishikawa, Tomoe; Oba, Shigeyuki; Urakubo, Hidetoshi; Koyama, Masanori; Ishii, Shin
2014-01-01
Crosstalk between neurons and glia may constitute a significant part of information processing in the brain. We present a novel method of statistically identifying interactions in a neuron–glia network. We attempted to identify neuron–glia interactions from neuronal and glial activities via maximum-a-posteriori (MAP)-based parameter estimation by developing a generalized linear model (GLM) of a neuron–glia network. The interactions in our interest included functional connectivity and response functions. We evaluated the cross-validated likelihood of GLMs that resulted from the addition or removal of connections to confirm the existence of specific neuron-to-glia or glia-to-neuron connections. We only accepted addition or removal when the modification improved the cross-validated likelihood. We applied the method to a high-throughput, multicellular in vitro Ca2+ imaging dataset obtained from the CA3 region of a rat hippocampus, and then evaluated the reliability of connectivity estimates using a statistical test based on a surrogate method. Our findings based on the estimated connectivity were in good agreement with currently available physiological knowledge, suggesting our method can elucidate undiscovered functions of neuron–glia systems. PMID:25393874
Rate of formation of neutron stars in the galaxy estimated from stellar statistics
International Nuclear Information System (INIS)
Endal, A.S.
1979-01-01
Stellar statistics and stellar evolution models can be used to estimate the rate of formation of neutron stars in the Galaxy. A recent analysis by Hills suggests that the mean interval between neutron-star births is greater than 27 years. This is incompatible with estimates based on pulsar statistics. However, a closer examination of the stellar data shows that Hills's result is incorrect. A mean interval between neutron-star births as short as 4 years is consistent with (though certainly not required by) stellar evolution theory.
Simple statistical methods for software engineering data and patterns
Pandian, C Ravindranath
2015-01-01
Although there are countless books on statistics, few are dedicated to the application of statistical methods to software engineering. Simple Statistical Methods for Software Engineering: Data and Patterns fills that void. Instead of delving into overly complex statistics, the book details simpler solutions that are just as effective and connect with the intuition of problem solvers. Sharing valuable insights into software engineering problems and solutions, the book not only explains the required statistical methods, but also provides many examples, review questions, and case studies that prov
Application of blended learning in teaching statistical methods
Directory of Open Access Journals (Sweden)
Barbara Dębska
2012-12-01
The paper presents the application of a hybrid method (blended learning), linking traditional education with on-line education, to teach selected problems of mathematical statistics. This includes teaching the application of mathematical statistics to evaluate laboratory experimental results. An on-line statistics course was developed to form an integral part of the module ‘methods of statistical evaluation of experimental results’. The course complies with the principles outlined in the Polish National Framework of Qualifications with respect to the scope of knowledge, skills and competencies that students should have acquired at course completion. The paper presents the structure of the course and the educational content provided through multimedia lessons made accessible on the Moodle platform. Students' test results following courses taught by the traditional method and courses taught by the hybrid method were compared and discussed to evaluate the effectiveness of the hybrid method relative to that of the traditional method.
Dynamic whole-body PET parametric imaging: II. Task-oriented statistical estimation.
Karakatsanis, Nicolas A; Lodge, Martin A; Zhou, Y; Wahl, Richard L; Rahmim, Arman
2013-10-21
In the context of oncology, dynamic PET imaging coupled with standard graphical linear analysis has been previously employed to enable quantitative estimation of tracer kinetic parameters of physiological interest at the voxel level, thus, enabling quantitative PET parametric imaging. However, dynamic PET acquisition protocols have been confined to the limited axial field-of-view (~15-20 cm) of a single-bed position and have not been translated to the whole-body clinical imaging domain. On the contrary, standardized uptake value (SUV) PET imaging, considered as the routine approach in clinical oncology, commonly involves multi-bed acquisitions, but is performed statically, thus not allowing for dynamic tracking of the tracer distribution. Here, we pursue a transition to dynamic whole-body PET parametric imaging, by presenting, within a unified framework, clinically feasible multi-bed dynamic PET acquisition protocols and parametric imaging methods. In a companion study, we presented a novel clinically feasible dynamic (4D) multi-bed PET acquisition protocol as well as the concept of whole-body PET parametric imaging employing Patlak ordinary least squares (OLS) regression to estimate the quantitative parameters of tracer uptake rate Ki and total blood distribution volume V. In the present study, we propose an advanced hybrid linear regression framework, driven by Patlak kinetic voxel correlations, to achieve superior trade-off between contrast-to-noise ratio (CNR) and mean squared error (MSE) than provided by OLS for the final Ki parametric images, enabling task-based performance optimization. Overall, whether the observer's task is to detect a tumor or quantitatively assess treatment response, the proposed statistical estimation framework can be adapted to satisfy the specific task performance criteria, by adjusting the Patlak correlation-coefficient (WR) reference value. The multi-bed dynamic acquisition protocol, as optimized in the preceding companion study
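For readers unfamiliar with the Patlak OLS step mentioned in this abstract, here is a minimal sketch. It assumes the standard Patlak linearization, in which y = C_tissue(t)/C_plasma(t) is regressed on x = (integral of C_plasma up to t)/C_plasma(t), so that the slope is Ki and the intercept is V. The function name and the numbers are invented for illustration; the hybrid regression framework proposed by the authors is not reproduced here.

```python
def patlak_ols(x, y):
    """Ordinary least-squares fit of y = Ki * x + V; returns (Ki, V)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    ki = sxy / sxx            # slope: tracer uptake rate
    v = mean_y - ki * mean_x  # intercept: blood distribution volume
    return ki, v


# Synthetic, noise-free Patlak-transformed data for one voxel (hypothetical values).
x = [10.0, 20.0, 30.0, 40.0]
y = [0.05 * xi + 0.3 for xi in x]
ki, v = patlak_ols(x, y)  # recovers Ki close to 0.05 and V close to 0.3
```

With noisy voxel data the OLS slope becomes noisy as well, which is the contrast-to-noise versus mean-squared-error trade-off the abstract refers to.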
Dynamic whole-body PET parametric imaging: II. Task-oriented statistical estimation
International Nuclear Information System (INIS)
Karakatsanis, Nicolas A; Lodge, Martin A; Zhou, Y; Wahl, Richard L; Rahmim, Arman
2013-01-01
In the context of oncology, dynamic PET imaging coupled with standard graphical linear analysis has been previously employed to enable quantitative estimation of tracer kinetic parameters of physiological interest at the voxel level, thus, enabling quantitative PET parametric imaging. However, dynamic PET acquisition protocols have been confined to the limited axial field-of-view (∼15–20 cm) of a single-bed position and have not been translated to the whole-body clinical imaging domain. On the contrary, standardized uptake value (SUV) PET imaging, considered as the routine approach in clinical oncology, commonly involves multi-bed acquisitions, but is performed statically, thus not allowing for dynamic tracking of the tracer distribution. Here, we pursue a transition to dynamic whole-body PET parametric imaging, by presenting, within a unified framework, clinically feasible multi-bed dynamic PET acquisition protocols and parametric imaging methods. In a companion study, we presented a novel clinically feasible dynamic (4D) multi-bed PET acquisition protocol as well as the concept of whole-body PET parametric imaging employing Patlak ordinary least squares (OLS) regression to estimate the quantitative parameters of tracer uptake rate Ki and total blood distribution volume V. In the present study, we propose an advanced hybrid linear regression framework, driven by Patlak kinetic voxel correlations, to achieve superior trade-off between contrast-to-noise ratio (CNR) and mean squared error (MSE) than provided by OLS for the final Ki parametric images, enabling task-based performance optimization. Overall, whether the observer's task is to detect a tumor or quantitatively assess treatment response, the proposed statistical estimation framework can be adapted to satisfy the specific task performance criteria, by adjusting the Patlak correlation-coefficient (WR) reference value. The multi-bed dynamic acquisition protocol, as optimized in the preceding companion
Cumulant-Based Coherent Signal Subspace Method for Bearing and Range Estimation
Directory of Open Access Journals (Sweden)
Bourennane Salah
2007-01-01
A new method for simultaneous range and bearing estimation for buried objects in the presence of an unknown Gaussian noise is proposed. This method uses the MUSIC algorithm with noise subspace estimated by using the slice fourth-order cumulant matrix of the received data. The higher-order statistics aim at the removal of the additive unknown Gaussian noise. The bilinear focusing operator is used to decorrelate the received signals and to estimate the coherent signal subspace. A new source steering vector is proposed including the acoustic scattering model at each sensor. Range and bearing of the objects at each sensor are expressed as a function of those at the first sensor. This leads to the improvement of object localization anywhere, in the near-field or in the far-field zone of the sensor array. Finally, the performances of the proposed method are validated on data recorded during experiments in a water tank.
Development of a Research Methods and Statistics Concept Inventory
Veilleux, Jennifer C.; Chapman, Kate M.
2017-01-01
Research methods and statistics are core courses in the undergraduate psychology major. To assess learning outcomes, it would be useful to have a measure that assesses research methods and statistical literacy beyond course grades. In two studies, we developed and provided initial validation results for a research methods and statistical knowledge…
Method of statistical estimation of temperature minimums in binary systems
International Nuclear Information System (INIS)
Mireev, V.A.; Safonov, V.V.
1985-01-01
On the basis of statistical processing of literature data, a technique is developed for evaluating temperature minima on liquidus curves in binary systems, taking common-ion chloride systems as an example. The systems are formed by 48 chlorides of 45 chemical elements, including alkali, alkaline earth, rare earth and transition metals as well as Cd, In and Th. It is shown that the calculation error in determining minimum melting points depends on the topology of the phase diagram. Calculated and experimental data are compared for several previously unstudied systems.
Statistical characterization of roughness uncertainty and impact on wind resource estimation
Directory of Open Access Journals (Sweden)
M. Kelly
2017-04-01
In this work we relate uncertainty in background roughness length (z0) to uncertainty in wind speeds, where the latter are predicted at a wind farm location based on wind statistics observed at a different site. Sensitivity of predicted winds to roughness is derived analytically for the industry-standard European Wind Atlas method, which is based on the geostrophic drag law. We statistically consider roughness and its corresponding uncertainty, in terms of both z0 derived from measured wind speeds as well as that chosen in practice by wind engineers. We show the combined effect of roughness uncertainty arising from differing wind-observation and turbine-prediction sites; this is done for the case of roughness bias as well as for the general case. For estimation of uncertainty in annual energy production (AEP), we also develop a generalized analytical turbine power curve, from which we derive a relation between mean wind speed and AEP. Following our developments, we provide guidance on approximate roughness uncertainty magnitudes to be expected in industry practice, and we also find that sites with larger background roughness incur relatively larger uncertainties.
Robust Control Methods for On-Line Statistical Learning
Directory of Open Access Journals (Sweden)
Capobianco Enrico
2001-01-01
Ensuring that data processing in an experiment is not affected by the presence of outliers is relevant for statistical control and learning studies. Learning schemes should thus be tested for their capacity to handle outliers in the observed training set so as to achieve reliable estimates with respect to the crucial bias and variance aspects. We describe possible ways of endowing neural networks with statistically robust properties by defining feasible error criteria. It is convenient to cast neural nets in state space representations and apply both Kalman filter and stochastic approximation procedures in order to suggest statistically robustified solutions for on-line learning.
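To make the idea of a robust error criterion concrete, the sketch below contrasts the ordinary squared error with the Huber loss. The Huber loss is chosen here only as a well-known example of such a criterion; the abstract does not say which criteria the author uses, and the function names and threshold `delta` are illustrative assumptions.

```python
def squared_loss(residual):
    """Classical criterion: grows quadratically, so outliers dominate the fit."""
    return 0.5 * residual ** 2


def huber_loss(residual, delta=1.0):
    """Huber criterion: quadratic near zero, linear beyond |residual| > delta."""
    magnitude = abs(residual)
    if magnitude <= delta:
        return 0.5 * residual ** 2
    return delta * (magnitude - 0.5 * delta)


# A small residual is penalized identically; a gross outlier is not.
small = (squared_loss(0.5), huber_loss(0.5))      # (0.125, 0.125)
outlier = (squared_loss(10.0), huber_loss(10.0))  # (50.0, 9.5)
```

Because the penalty on large residuals grows only linearly, a single outlying training point pulls the estimate far less than it would under the squared error.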
Statistical Models and Methods for Lifetime Data
Lawless, Jerald F
2011-01-01
Praise for the First Edition: "An indispensable addition to any serious collection on lifetime data analysis and . . . a valuable contribution to the statistical literature. Highly recommended . . ." -Choice. "This is an important book, which will appeal to statisticians working on survival analysis problems." -Biometrics. "A thorough, unified treatment of statistical models and methods used in the analysis of lifetime data . . . this is a highly competent and agreeable statistical textbook." -Statistics in Medicine. The statistical analysis of lifetime or response time data is a key tool in engineering,
Del Pico, Wayne J
2014-01-01
Simplify the estimating process with the latest data, materials, and practices Electrical Estimating Methods, Fourth Edition is a comprehensive guide to estimating electrical costs, with data provided by leading construction database RS Means. The book covers the materials and processes encountered by the modern contractor, and provides all the information professionals need to make the most precise estimate. The fourth edition has been updated to reflect the changing materials, techniques, and practices in the field, and provides the most recent Means cost data available. The complexity of el
Statistical methods in spatial genetics
DEFF Research Database (Denmark)
Guillot, Gilles; Leblois, Raphael; Coulon, Aurelie
2009-01-01
The joint analysis of spatial and genetic data is rapidly becoming the norm in population genetics. More and more studies explicitly describe and quantify the spatial organization of genetic variation and try to relate it to underlying ecological processes. As it has become increasingly difficult...... to keep abreast with the latest methodological developments, we review the statistical toolbox available to analyse population genetic data in a spatially explicit framework. We mostly focus on statistical concepts but also discuss practical aspects of the analytical methods, highlighting not only...
Directory of Open Access Journals (Sweden)
Shkvarko Yuriy
2006-01-01
We address a new approach to solve the ill-posed nonlinear inverse problem of high-resolution numerical reconstruction of the spatial spectrum pattern (SSP) of the backscattered wavefield sources distributed over the remotely sensed scene. An array or synthesized array radar (SAR) that employs digital data signal processing is considered. By exploiting the idea of combining the statistical minimum risk estimation paradigm with numerical descriptive regularization techniques, we propose a new fused statistical descriptive regularization (SDR) strategy for enhanced radar imaging. Pursuing such an approach, we establish a family of SDR-related SSP estimators that encompass a manifold of existing beamforming techniques, ranging from the traditional matched filter to robust and adaptive spatial filtering and minimum variance methods.
Statistical learning methods in high-energy and astrophysics analysis
Energy Technology Data Exchange (ETDEWEB)
Zimmermann, J. [Forschungszentrum Juelich GmbH, Zentrallabor fuer Elektronik, 52425 Juelich (Germany) and Max-Planck-Institut fuer Physik, Foehringer Ring 6, 80805 Munich (Germany)]. E-mail: zimmerm@mppmu.mpg.de; Kiesling, C. [Max-Planck-Institut fuer Physik, Foehringer Ring 6, 80805 Munich (Germany)
2004-11-21
We discuss several popular statistical learning methods used in high-energy- and astro-physics analysis. After a short motivation for statistical learning we present the most popular algorithms and discuss several examples from current research in particle- and astro-physics. The statistical learning methods are compared with each other and with standard methods for the respective application.
Statistical learning methods in high-energy and astrophysics analysis
International Nuclear Information System (INIS)
Zimmermann, J.; Kiesling, C.
2004-01-01
We discuss several popular statistical learning methods used in high-energy- and astro-physics analysis. After a short motivation for statistical learning we present the most popular algorithms and discuss several examples from current research in particle- and astro-physics. The statistical learning methods are compared with each other and with standard methods for the respective application
Glushak, P. A.; Markiv, B. B.; Tokarchuk, M. V.
2018-01-01
We present a generalization of Zubarev's nonequilibrium statistical operator method based on the principle of maximum Renyi entropy. In the framework of this approach, we obtain transport equations for the basic set of parameters of the reduced description of nonequilibrium processes in a classical system of interacting particles using Liouville equations with fractional derivatives. For a classical system of particles in a medium with a fractal structure, we obtain a non-Markovian diffusion equation with fractional spatial derivatives. For a concrete model of the frequency dependence of a memory function, we obtain a generalized Cattaneo-type diffusion equation with the spatial and temporal fractality taken into account. We present a generalization of nonequilibrium thermofield dynamics in Zubarev's nonequilibrium statistical operator method in the framework of Renyi statistics.
Statistical methods and their applications in constructional engineering
International Nuclear Information System (INIS)
1977-01-01
An introduction into the basic terms of statistics is followed by a discussion of elements of the probability theory, customary discrete and continuous distributions, simulation methods, statistical supporting framework dynamics, and a cost-benefit analysis of the methods introduced. (RW) [de
Statistical Analysis of Big Data on Pharmacogenomics
Fan, Jianqing; Liu, Han
2013-01-01
This paper discusses statistical methods for estimating complex correlation structure from large pharmacogenomic datasets. We selectively review several prominent statistical methods for estimating large covariance matrix for understanding correlation structure, inverse covariance matrix for network modeling, large-scale simultaneous tests for selecting significantly differently expressed genes and proteins and genetic markers for complex diseases, and high dimensional variable selection for identifying important molecules for understanding molecule mechanisms in pharmacogenomics. Their applications to gene network estimation and biomarker selection are used to illustrate the methodological power. Several new challenges of Big data analysis, including complex data distribution, missing data, measurement error, spurious correlation, endogeneity, and the need for robust statistical methods, are also discussed. PMID:23602905
Statistical model for forecasting uranium prices to estimate the nuclear fuel cycle cost
International Nuclear Information System (INIS)
Kim, Sung Ki; Ko, Won Il; Nam, Hyoon; Kim, Chul Min; Chung, Yang Hon; Bang, Sung Sig
2017-01-01
This paper presents a method for forecasting future uranium prices that is used as input data to calculate the uranium cost, which is a rational key cost driver of the nuclear fuel cycle cost. In other words, the statistical autoregressive integrated moving average (ARIMA) model and existing engineering cost estimation method, the so-called escalation rate model, were subjected to a comparative analysis. When the uranium price was forecasted in 2015, the margin of error of the ARIMA model forecasting was calculated and found to be 5.4%, whereas the escalation rate model was found to have a margin of error of 7.32%. Thus, it was verified that the ARIMA model is more suitable than the escalation rate model at decreasing uncertainty in nuclear fuel cycle cost calculation
Statistical model for forecasting uranium prices to estimate the nuclear fuel cycle cost
Energy Technology Data Exchange (ETDEWEB)
Kim, Sung Ki; Ko, Won Il; Nam, Hyoon [Nuclear Fuel Cycle Analysis, Korea Atomic Energy Research Institute, Daejeon (Korea, Republic of); Kim, Chul Min; Chung, Yang Hon; Bang, Sung Sig [Korea Advanced Institute of Science and Technology, Daejeon (Korea, Republic of)
2017-08-15
This paper presents a method for forecasting future uranium prices that is used as input data to calculate the uranium cost, which is a rational key cost driver of the nuclear fuel cycle cost. In other words, the statistical autoregressive integrated moving average (ARIMA) model and existing engineering cost estimation method, the so-called escalation rate model, were subjected to a comparative analysis. When the uranium price was forecasted in 2015, the margin of error of the ARIMA model forecasting was calculated and found to be 5.4%, whereas the escalation rate model was found to have a margin of error of 7.32%. Thus, it was verified that the ARIMA model is more suitable than the escalation rate model at decreasing uncertainty in nuclear fuel cycle cost calculation.
Pilot points method for conditioning multiple-point statistical facies simulation on flow data
Ma, Wei; Jafarpour, Behnam
2018-05-01
We propose a new pilot points method for conditioning discrete multiple-point statistical (MPS) facies simulation on dynamic flow data. While conditioning MPS simulation on static hard data is straightforward, their calibration against nonlinear flow data is nontrivial. The proposed method generates conditional models from a conceptual model of geologic connectivity, known as a training image (TI), by strategically placing and estimating pilot points. To place pilot points, a score map is generated based on three sources of information: (i) the uncertainty in facies distribution, (ii) the model response sensitivity information, and (iii) the observed flow data. Once the pilot points are placed, the facies values at these points are inferred from production data and then are used, along with available hard data at well locations, to simulate a new set of conditional facies realizations. While facies estimation at the pilot points can be performed using different inversion algorithms, in this study the ensemble smoother (ES) is adopted to update permeability maps from production data, which are then used to statistically infer facies types at the pilot point locations. The developed method combines the information in the flow data and the TI by using the former to infer facies values at selected locations away from the wells and the latter to ensure consistent facies structure and connectivity where away from measurement locations. Several numerical experiments are used to evaluate the performance of the developed method and to discuss its important properties.
Directory of Open Access Journals (Sweden)
A. E. Pismak
2016-03-01
Subject of Research. The paper focuses on the structural organization of Wiktionary articles with a view to using them as the basis for a semantic network. Wiktionary community references, article templates and article markup features are analyzed. The problem of numerically estimating the semantic similarity of structural elements in Wiktionary articles is considered. Existing software for semantic similarity estimation of such elements is analyzed; the algorithms behind it are studied; their advantages and disadvantages are shown. Methods. Mathematical statistics methods were used to analyze Wiktionary article markup features. A method of computing semantic similarity based on statistical data for the compared structural elements was proposed. Main Results. We have concluded that Wiktionary articles cannot be used directly as the source for a semantic network. We have proposed to find hidden similarity between article elements, and for that purpose we have developed an algorithm that calculates confidence coefficients indicating that each pair of sentences is semantically close. A study of the quantitative and qualitative characteristics of the developed algorithm has shown a major performance advantage over other existing solutions, at the cost of an insignificantly higher error rate. Practical Relevance. The resulting algorithm may be useful in developing tools for automatic parsing of Wiktionary articles. The developed method could be used to compute semantic similarity for short natural-language text fragments when performance requirements outweigh accuracy requirements.
An age estimation method using brain local features for T1-weighted images.
Kondo, Chihiro; Ito, Koichi; Kai Wu; Sato, Kazunori; Taki, Yasuyuki; Fukuda, Hiroshi; Aoki, Takafumi
2015-08-01
Previous statistical analysis studies using large-scale brain magnetic resonance (MR) image databases have shown that brain tissues undergo age-related morphological changes. This fact indicates that one can estimate the age of a subject from his/her brain MR image by evaluating morphological changes with healthy aging. This paper proposes an age estimation method using local features extracted from T1-weighted MR images. The brain local features are defined by volumes of brain tissues parcellated into local regions defined by the automated anatomical labeling atlas. The proposed method selects optimal local regions to improve the performance of age estimation. We evaluate the performance of the proposed method using 1,146 T1-weighted images from a Japanese MR image database. We also discuss the medical implications of the selected optimal local regions.
Online Statistics Labs in MSW Research Methods Courses: Reducing Reluctance toward Statistics
Elliott, William; Choi, Eunhee; Friedline, Terri
2013-01-01
This article presents results from an evaluation of an online statistics lab as part of a foundations research methods course for master's-level social work students. The article discusses factors that contribute to an environment in social work that fosters attitudes of reluctance toward learning and teaching statistics in research methods…
Spatial analysis statistics, visualization, and computational methods
Oyana, Tonny J
2015-01-01
An introductory text for the next generation of geospatial analysts and data scientists, Spatial Analysis: Statistics, Visualization, and Computational Methods focuses on the fundamentals of spatial analysis using traditional, contemporary, and computational methods. Outlining both non-spatial and spatial statistical concepts, the authors present practical applications of geospatial data tools, techniques, and strategies in geographic studies. They offer a problem-based learning (PBL) approach to spatial analysis-containing hands-on problem-sets that can be worked out in MS Excel or ArcGIS-as well as detailed illustrations and numerous case studies. The book enables readers to: Identify types and characterize non-spatial and spatial data Demonstrate their competence to explore, visualize, summarize, analyze, optimize, and clearly present statistical data and results Construct testable hypotheses that require inferential statistical analysis Process spatial data, extract explanatory variables, conduct statisti...
International Nuclear Information System (INIS)
Harlim, John; Mahdi, Adam; Majda, Andrew J.
2014-01-01
A central issue in contemporary science is the development of nonlinear data driven statistical–dynamical models for time series of noisy partial observations from nature or a complex model. It has been established recently that ad-hoc quadratic multi-level regression models can have finite-time blow-up of statistical solutions and/or pathological behavior of their invariant measure. Recently, a new class of physics constrained nonlinear regression models were developed to ameliorate this pathological behavior. Here a new finite ensemble Kalman filtering algorithm is developed for estimating the state, the linear and nonlinear model coefficients, the model and the observation noise covariances from available partial noisy observations of the state. Several stringent tests and applications of the method are developed here. In the most complex application, the perfect model has 57 degrees of freedom involving a zonal (east–west) jet, two topographic Rossby waves, and 54 nonlinearly interacting Rossby waves; the perfect model has significant non-Gaussian statistics in the zonal jet with blocked and unblocked regimes and a non-Gaussian skewed distribution due to interaction with the other 56 modes. We only observe the zonal jet contaminated by noise and apply the ensemble filter algorithm for estimation. Numerically, we find that a three dimensional nonlinear stochastic model with one level of memory mimics the statistical effect of the other 56 modes on the zonal jet in an accurate fashion, including the skew non-Gaussian distribution and autocorrelation decay. On the other hand, a similar stochastic model with zero memory levels fails to capture the crucial non-Gaussian behavior of the zonal jet from the perfect 57-mode model
Babamoradi, Hamid; van den Berg, Frans; Rinnan, Åsmund
2016-02-18
In Multivariate Statistical Process Control, when a fault is expected or detected in the process, contribution plots are essential for operators and optimization engineers in identifying those process variables that were affected by or might be the cause of the fault. The traditional way of interpreting a contribution plot is to examine the largest contributing process variables as the most probable faulty ones. This might result in false readings purely due to the differences in natural variation, measurement uncertainties, etc. It is more reasonable to compare variable contributions for new process runs with historical results achieved under Normal Operating Conditions, where confidence limits for contribution plots estimated from training data are used to judge new production runs. Asymptotic methods cannot provide confidence limits for contribution plots, leaving re-sampling methods as the only option. We suggest bootstrap re-sampling to build confidence limits for all contribution plots in online PCA-based MSPC. The new strategy to estimate CLs is compared to the previously reported CLs for contribution plots. An industrial batch process dataset was used to illustrate the concepts. Copyright © 2016 Elsevier B.V. All rights reserved.
Energy Technology Data Exchange (ETDEWEB)
Suh, M. Y.; Jee, K. Y.; Park, K. K.; Park, Y. J.; Kim, W. H
1999-08-01
This report is intended to describe the statistical methods necessary to design and conduct radiation counting experiments and evaluate the data from the experiment. The methods are described for the evaluation of the stability of a counting system and the estimation of the precision of counting data by application of probability distribution models. The methods for the determination of the uncertainty of the results calculated from the number of counts, as well as various statistical methods for the reduction of counting error are also described. (Author). 11 refs., 8 tabs., 8 figs.
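The precision estimate mentioned in this abstract can be illustrated with the textbook Poisson model for counting data. The sketch below is not taken from the report: the function name and the example numbers are hypothetical, and it assumes that gross and background counts are independent Poisson variables, so that the variance of a count equals the count itself.

```python
import math

def net_rate_with_uncertainty(gross_counts, gross_time, bkg_counts, bkg_time):
    """Net count rate and its 1-sigma uncertainty under Poisson statistics.

    Hypothetical helper for illustration; times are in seconds.
    """
    gross_rate = gross_counts / gross_time
    bkg_rate = bkg_counts / bkg_time
    net_rate = gross_rate - bkg_rate
    # Var(N) = N for a Poisson count, hence Var(N / t) = N / t**2.
    # Independent variances add when the rates are subtracted.
    sigma = math.sqrt(gross_counts / gross_time**2 + bkg_counts / bkg_time**2)
    return net_rate, sigma

# Example with made-up numbers: 1600 gross counts and 400 background
# counts, each collected over 100 s.
rate, sigma = net_rate_with_uncertainty(1600, 100.0, 400, 100.0)
print(f"net rate = {rate:.2f} +/- {sigma:.2f} counts/s")
```

Counting longer reduces the relative uncertainty roughly as one over the square root of the counting time, which is one of the simplest ways to reduce counting error.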
Energy Technology Data Exchange (ETDEWEB)
Suh, M. Y.; Jee, K. Y.; Park, K. K. [Korea Atomic Energy Research Institute, Taejon (Korea)
1999-08-01
This report is intended to describe the statistical methods necessary to design and conduct radiation counting experiments and evaluate the data from the experiments. The methods are described for the evaluation of the stability of a counting system and the estimation of the precision of counting data by application of probability distribution models. The methods for the determination of the uncertainty of the results calculated from the number of counts, as well as various statistical methods for the reduction of counting error are also described. 11 refs., 6 figs., 8 tabs. (Author)
Likelihood devices in spatial statistics
Zwet, E.W. van
1999-01-01
One of the main themes of this thesis is the application to spatial data of modern semi- and nonparametric methods. Another, closely related theme is maximum likelihood estimation from spatial data. Maximum likelihood estimation is not common practice in spatial statistics. The method of moments
Improved Statistical Method For Hydrographic Climatic Records Quality Control
Gourrion, J.; Szekely, T.
2016-02-01
Climate research has benefited from the continuous development of global in-situ hydrographic networks over the last decades. Apart from the increasing volume of observations available on a large range of temporal and spatial scales, a critical aspect concerns the ability to constantly improve the quality of the datasets. In the context of the Coriolis Dataset for ReAnalysis (CORA) version 4.2, a new quality control method based on a local comparison to the historical extreme values observed is developed, implemented and validated. Temperature, salinity and potential density validity intervals are directly estimated from minimum and maximum values from a historical reference dataset, rather than from traditional mean and standard deviation estimates. Such an approach avoids strong statistical assumptions on the data distributions such as unimodality, absence of skewness and spatially homogeneous kurtosis. As a new feature, it also allows the two main objectives of a quality control strategy to be addressed simultaneously, i.e. maximizing the number of good detections while minimizing the number of false alarms. The reference dataset is presently built from the fusion of 1) all ARGO profiles up to early 2014, 2) 3 historical CTD datasets and 3) the Sea Mammals CTD profiles from the MEOP database. All datasets are extensively and manually quality controlled. In this communication, the latest method validation results are also presented. The method has been implemented in the latest version of the CORA dataset and will benefit the next version of the Copernicus CMEMS dataset.
Combining Neural Networks with Existing Methods to Estimate 1 in 100-Year Flood Event Magnitudes
Newson, A.; See, L.
2005-12-01
Over the last fifteen years artificial neural networks (ANN) have been shown to be advantageous for the solution of many hydrological modelling problems. The use of ANNs for flood magnitude estimation in ungauged catchments, however, is a relatively new and under-researched area. In this paper ANNs are used to make estimates of the magnitude of the 100-year flood event (Q100) for a number of ungauged catchments. The data used in this study were provided by the Centre for Ecology and Hydrology's Flood Estimation Handbook (FEH), which contains information on catchments across the UK. Sixteen catchment descriptors for 719 catchments, split into training, validation and test data sets, were used to train an ANN. The goodness-of-fit statistics on the test data set indicated good model performance, with an r-squared value of 0.8 and a coefficient of efficiency of 79 percent. Data for twelve ungauged catchments were then put through the trained ANN to produce estimates of Q100. Two other accepted methodologies were also employed: the FEH statistical method and the FSR (Flood Studies Report) design storm technique, both of which are used to produce flood frequency estimates. The advantage of developing an ANN model is that it provides a third figure to aid a hydrologist in making an accurate estimate. For six of the twelve catchments, there was a relatively low spread between estimates. In these instances, an estimate of Q100 could be made with a fair degree of certainty. Of the remaining six catchments, three had areas greater than 1000 km², which means the FSR design storm estimate cannot be used. Armed with the ANN model and the FEH statistical method the hydrologist still has two possible estimates to consider. For these three catchments, the estimates were also fairly similar, providing additional confidence to the estimation. In summary, the findings of this study have shown that an accurate estimation of Q100 can be made using the catchment descriptors of
Energy Technology Data Exchange (ETDEWEB)
Frome, EL
2005-09-20
Environmental exposure measurements are, in general, positive and may be subject to left censoring; i.e., the measured value is less than a "detection limit". In occupational monitoring, strategies for assessing workplace exposures typically focus on the mean exposure level or the probability that any measurement exceeds a limit. Parametric methods used to determine acceptable levels of exposure are often based on a two-parameter lognormal distribution. The mean exposure level, an upper percentile, and the exceedance fraction are used to characterize exposure levels, and confidence limits are used to describe the uncertainty in these estimates. Statistical methods for random samples (without non-detects) from the lognormal distribution are well known for each of these situations. In this report, methods for estimating these quantities based on the maximum likelihood method for randomly left censored lognormal data are described, and graphical methods are used to evaluate the lognormal assumption. If the lognormal model is in doubt and an alternative distribution for the exposure profile of a similar exposure group is not available, then nonparametric methods for left censored data are used. The mean exposure level, along with the upper confidence limit, is obtained using the product limit estimate, and the upper confidence limit on an upper percentile (i.e., the upper tolerance limit) is obtained using a nonparametric approach. All of these methods are well known, but computational complexity has limited their use in routine data analysis with left censored data. The recent development of the R environment for statistical data analysis and graphics has greatly enhanced the availability of high-quality nonproprietary (open source) software that serves as the basis for implementing the methods in this paper.
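The three summary quantities named in this abstract, and the likelihood that is maximized when some values are non-detects, have standard closed forms for the two-parameter lognormal model. The sketch below is an illustration under that model only; the function names are hypothetical, the report's own implementation is in R, and no optimizer or confidence limit is included here.

```python
import math
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

def lognormal_summary(mu, sigma, limit, p=0.95):
    """Mean, p-th percentile and exceedance fraction of a lognormal
    distribution whose logarithm has mean mu and standard deviation sigma."""
    mean = math.exp(mu + 0.5 * sigma**2)
    percentile = math.exp(mu + Z.inv_cdf(p) * sigma)
    exceedance = 1.0 - Z.cdf((math.log(limit) - mu) / sigma)
    return mean, percentile, exceedance

def censored_loglik(mu, sigma, detects, detection_limits):
    """Log-likelihood for left censored lognormal data.

    detects: measured values above their detection limit.
    detection_limits: one detection limit per non-detect.
    """
    ll = 0.0
    for x in detects:
        # Lognormal density: normal density of ln(x) divided by sigma * x.
        ll += math.log(Z.pdf((math.log(x) - mu) / sigma)) - math.log(sigma * x)
    for d in detection_limits:
        # A non-detect contributes the probability of falling below its limit.
        ll += math.log(Z.cdf((math.log(d) - mu) / sigma))
    return ll
```

Maximizing `censored_loglik` over `mu` and `sigma` with any numerical optimizer gives the maximum likelihood estimates, which can then be passed to `lognormal_summary`.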
Statistical-mechanical entropy by the thin-layer method
International Nuclear Information System (INIS)
Feng, He; Kim, Sung Won
2003-01-01
G. 't Hooft first studied the statistical-mechanical entropy of a scalar field in a Schwarzschild black hole background by the brick-wall method and hinted that the statistical-mechanical entropy is the statistical origin of the Bekenstein-Hawking entropy of the black hole. However, according to our viewpoint, the statistical-mechanical entropy is only a quantum correction to the Bekenstein-Hawking entropy of the black hole. The brick-wall method, based on thermal equilibrium at a large scale, cannot be applied to cases out of equilibrium such as a nonstationary black hole. The statistical-mechanical entropy of a scalar field in a nonstationary black hole background is calculated by the thin-layer method. The condition of local equilibrium near the horizon of the black hole is used as a working postulate and is maintained for a black hole which evaporates slowly enough and whose mass is far greater than the Planck mass. The statistical-mechanical entropy is also proportional to the area of the black hole horizon. The difference from the stationary black hole is that the result relies on a time-dependent cutoff.
Statistical Methods and Sampling Design for Estimating Step Trends in Surface-Water Quality
Hirsch, Robert M.
1988-01-01
This paper addresses two components of the problem of estimating the magnitude of step trends in surface water quality. The first is finding a robust estimator appropriate to the data characteristics expected in water-quality time series. The J. L. Hodges-E. L. Lehmann class of estimators is found to be robust in comparison to other nonparametric and moment-based estimators. A seasonal Hodges-Lehmann estimator is developed and shown to have desirable properties. Second, the effectiveness of various sampling strategies is examined using Monte Carlo simulation coupled with application of this estimator. The simulation is based on a large set of total phosphorus data from the Potomac River. To assure that the simulated records have realistic properties, the data are modeled in a multiplicative fashion incorporating flow, hysteresis, seasonal, and noise components. The results demonstrate the importance of balancing the length of the two sampling periods and balancing the number of data values between the two periods.
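The Hodges-Lehmann estimator of a step change is the median of all pairwise differences between observations from the two periods. The sketch below illustrates that idea; the seasonal variant, which only pairs observations from the same season, is an assumption about how the seasonal estimator is formed and is not copied from the paper. Function names and data are hypothetical.

```python
from statistics import median

def hodges_lehmann_step(before, after):
    """Median of all (after - before) pairwise differences."""
    return median(a - b for b in before for a in after)

def seasonal_hodges_lehmann_step(before, after):
    """Same idea, but only pairs values observed in the same season.

    before, after: lists of (season_label, value) tuples.
    """
    diffs = [a - b for sb, b in before for sa, a in after if sa == sb]
    return median(diffs)

# The outlier 100 barely moves the estimate, unlike a difference of means.
print(hodges_lehmann_step([1, 2, 3], [4, 5, 100]))
```

Because the estimate is a median of differences, a single extreme value in either period has little influence, which is the robustness property the abstract refers to.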
Methods for statistical data analysis of multivariate observations
Gnanadesikan, R
1997-01-01
A practical guide for multivariate statistical techniques, now updated and revised. In recent years, innovations in computer technology and statistical methodologies have dramatically altered the landscape of multivariate data analysis. This new edition of Methods for Statistical Data Analysis of Multivariate Observations explores current multivariate concepts and techniques while retaining the same practical focus of its predecessor. It integrates methods and data-based interpretations relevant to multivariate analysis in a way that addresses real-world problems arising in many areas of inte
Statistical errors in Monte Carlo estimates of systematic errors
Energy Technology Data Exchange (ETDEWEB)
Roe, Byron P. [Department of Physics, University of Michigan, Ann Arbor, MI 48109 (United States)]. E-mail: byronroe@umich.edu
2007-01-01
For estimating the effects of a number of systematic errors on a data sample, one can generate Monte Carlo (MC) runs with systematic parameters varied and examine the change in the desired observed result. Two methods are often used. In the unisim method, the systematic parameters are varied one at a time by one standard deviation, each parameter corresponding to a MC run. In the multisim method, each MC run has all of the parameters varied; the amount of variation is chosen from the expected distribution of each systematic parameter, usually assumed to be a normal distribution. The variance of the overall systematic error determination is derived for each of the two methods and comparisons are made between them. If one focuses not on the error in the prediction of an individual systematic error, but on the overall error due to all systematic errors in the error matrix element in data bin m, the number of events needed is strongly reduced because of the averaging effect over all of the errors. For simple models presented here the multisim model was far better if the statistical error in the MC samples was larger than an individual systematic error, while for the reverse case, the unisim model was better. Exact formulas and formulas for the simple toy models are presented so that realistic calculations can be made. The calculations in the present note are valid if the errors are in a linear region. If that region extends sufficiently far, one can have the unisims or multisims correspond to k standard deviations instead of one. This reduces the number of events required by a factor of k².
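The difference between the two procedures can be seen in a deliberately simple toy, which is not the model used in the note: the observed result is assumed to be a linear function of the systematic parameters, and the MC runs are assumed to have no statistical noise of their own. The coefficients, standard deviations and function names below are hypothetical.

```python
import random

def unisim_variance(coeffs, sigmas):
    """Shift one parameter at a time by one sigma and add the resulting
    changes in the result in quadrature."""
    nominal = 0.0  # result with every parameter at its central value (zero)
    total = 0.0
    for c, s in zip(coeffs, sigmas):
        shifted = c * s  # result with only this parameter moved by one sigma
        total += (shifted - nominal) ** 2
    return total

def multisim_variance(coeffs, sigmas, n_runs, seed=0):
    """Vary all parameters at once, drawing each from a normal distribution,
    and take the sample variance of the result over the runs."""
    rng = random.Random(seed)
    results = [
        sum(c * rng.gauss(0.0, s) for c, s in zip(coeffs, sigmas))
        for _ in range(n_runs)
    ]
    mean = sum(results) / n_runs
    return sum((r - mean) ** 2 for r in results) / (n_runs - 1)

coeffs = [1.0, 2.0, 0.5]   # hypothetical sensitivities of the result
sigmas = [0.3, 0.1, 0.4]   # hypothetical one-sigma systematic uncertainties
print(unisim_variance(coeffs, sigmas))
print(multisim_variance(coeffs, sigmas, n_runs=20000))
```

In this noise-free linear toy the unisim sum is exact, while the multisim value only converges to it as the number of runs grows. The trade-off discussed in the abstract appears once each MC run also carries its own statistical error.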
Analysis of Statistical Methods Currently used in Toxicology Journals.
Na, Jihye; Yang, Hyeri; Bae, SeungJin; Lim, Kyung-Min
2014-09-01
Statistical methods are frequently used in toxicology, yet it is not clear whether the methods employed by the studies are used consistently and conducted based on sound statistical grounds. The purpose of this paper is to describe statistical methods used in top toxicology journals. More specifically, we sampled 30 papers published in 2014 from Toxicology and Applied Pharmacology, Archives of Toxicology, and Toxicological Science and described methodologies used to provide descriptive and inferential statistics. One hundred thirteen endpoints were observed in those 30 papers, and most studies had a sample size of less than 10, with the median being 6 and the modes being 3 and 6. The mean (105/113, 93%) was predominantly used to measure central tendency, and the standard error of the mean (64/113, 57%) and standard deviation (39/113, 34%) were used to measure dispersion, while few studies provided justification for why the methods were selected. Inferential statistics were frequently conducted (93/113, 82%), with one-way ANOVA being most popular (52/93, 56%), yet few studies conducted either a normality or an equal variance test. These results suggest that more consistent and appropriate use of statistical methods is necessary, which may enhance the role of toxicology in public health.
Liou, Jyun-you; Smith, Elliot H.; Bateman, Lisa M.; McKhann, Guy M., II; Goodman, Robert R.; Greger, Bradley; Davis, Tyler S.; Kellis, Spencer S.; House, Paul A.; Schevon, Catherine A.
2017-08-01
Objective. Epileptiform discharges, an electrophysiological hallmark of seizures, can propagate across cortical tissue in a manner similar to traveling waves. Recent work has focused attention on the origination and propagation patterns of these discharges, yielding important clues to their source location and mechanism of travel. However, systematic studies of methods for measuring propagation are lacking. Approach. We analyzed epileptiform discharges in microelectrode array recordings of human seizures. The array records multiunit activity and local field potentials at 400 micron spatial resolution, from a small cortical site free of obstructions. We evaluated several computationally efficient statistical methods for calculating traveling wave velocity, benchmarking them to analyses of associated neuronal burst firing. Main results. Over 90% of discharges met statistical criteria for propagation across the sampled cortical territory. Detection rate, direction and speed estimates derived from a multiunit estimator were compared to four field potential-based estimators: negative peak, maximum descent, high gamma power, and cross-correlation. Interestingly, the methods that were computationally simplest and most efficient (negative peak and maximal descent) offer non-inferior results in predicting neuronal traveling wave velocities compared to the other two, more complex methods. Moreover, the negative peak and maximal descent methods proved to be more robust against reduced spatial sampling challenges. Using least absolute deviation in place of least squares error minimized the impact of outliers, and reduced the discrepancies between local field potential-based and multiunit estimators. Significance. Our findings suggest that ictal epileptiform discharges typically take the form of exceptionally strong, rapidly traveling waves, with propagation detectable across millimeter distances. The sequential activation of neurons in space can be inferred from clinically
Application of nonparametric statistic method for DNBR limit calculation
International Nuclear Information System (INIS)
Dong Bo; Kuang Bo; Zhu Xuenong
2013-01-01
Background: The nonparametric statistical method is a statistical inference method that does not depend on any particular distribution; it calculates tolerance limits at a given probability level and confidence level through sampling. The DNBR margin is an important parameter of NPP (nuclear power plant) design and represents the safety level of the plant. Purpose and Methods: This paper uses the nonparametric statistical method based on the Wilks formula and the VIPER-01 subchannel analysis code to calculate the DNBR design limit (DL) of a 300 MW NPP during a complete loss of flow accident, and compares it with the DNBR DL obtained by ITDP to determine the DNBR margin. Results: The results indicate that this method gains 2.96% more DNBR margin than the ITDP methodology. Conclusions: Because it reduces the conservatism in the analysis process, the nonparametric statistical method can provide a greater DNBR margin, and the increased DNBR margin is beneficial for upgrading the core refueling scheme. (authors)
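The first-order, one-sided Wilks formula gives the smallest number of code runs n for which the most extreme of the n results bounds a chosen fraction of the population with a chosen confidence: 1 − coverageⁿ ≥ confidence. The sketch below only evaluates that inequality; the function name is hypothetical and nothing here reproduces the paper's VIPER-01 calculation.

```python
def wilks_sample_size(coverage=0.95, confidence=0.95):
    """Smallest n such that 1 - coverage**n >= confidence
    (first-order, one-sided tolerance limit)."""
    n = 1
    while 1.0 - coverage**n < confidence:
        n += 1
    return n

print(wilks_sample_size(0.95, 0.95))  # the familiar 95%/95% case
```

For a lower limit such as a DNBR design limit, the relevant extreme would be the smallest DNBR among the n sampled runs.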
International Nuclear Information System (INIS)
Simakov, V.A.; Kordyukov, S.V.; Petrov, E.N.
1988-01-01
A method of background estimation in the short-wave spectral region for determining total sample composition by the X-ray fluorescence method is described. Thirteen types of rock with considerable variations in base composition and Zr, Nb, Th, U content below 7×10⁻³ % are investigated. The suggested method of background accounting provides a smaller statistical error of the background estimate than direct isolated measurement, and its determination in the short-wave region is reliable independent of the sample base. The possibilities of the suggested method are estimated for artificial mixtures whose main-component content corresponds to technological concentrates of niobium, zirconium and tantalum.
Schanzer, Dena L; Zheng, Hui; Gilmore, Jason
2011-04-12
As many respiratory viruses are responsible for influenza like symptoms, accurate measures of the disease burden are not available and estimates are generally based on statistical methods. The objective of this study was to estimate absenteeism rates and hours lost due to seasonal influenza and compare these estimates with estimates of absenteeism attributable to the two H1N1 pandemic waves that occurred in 2009. Key absenteeism variables were extracted from Statistics Canada's monthly labour force survey (LFS). Absenteeism and the proportion of hours lost due to own illness or disability were modelled as a function of trend, seasonality and proxy variables for influenza activity from 1998 to 2009. Hours lost due to the H1N1/09 pandemic strain were elevated compared to seasonal influenza, accounting for a loss of 0.2% of potential hours worked annually. In comparison, an estimated 0.08% of hours worked annually were lost due to seasonal influenza illnesses. Absenteeism rates due to influenza were estimated at 12% per year for seasonal influenza over the 1997/98 to 2008/09 seasons, and 13% for the two H1N1/09 pandemic waves. Employees who took time off due to a seasonal influenza infection took an average of 14 hours off. For the pandemic strain, the average absence was 25 hours. This study confirms that absenteeism due to seasonal influenza has typically ranged from 5% to 20%, with higher rates associated with multiple circulating strains. Absenteeism rates for the 2009 pandemic were similar to those occurring for seasonal influenza. Employees took more time off due to the pandemic strain than was typical for seasonal influenza.
A study on the estimation method of nuclear accident risk cost
International Nuclear Information System (INIS)
Matsuo, Yuji
2016-01-01
The methodology of estimating nuclear accident risk cost, as a part of nuclear power generation cost, has hardly been established due mainly to the extremely wide range of the estimation of the accident frequency. This study estimates the expected nuclear accident frequency for Japan, making use of the method of Bayesian statistics, which exploits both the information obtained by Probabilistic Risk Assessment (PRA) and the observed historical accident frequencies. Using the PRA estimation of the Containment Failure Frequency (CFF) for Tomari nuclear power plant unit 3 of Hokkaido Electric Power Company (average: 2.1 × 10⁻⁴, 95th percentile: 7.7 × 10⁻⁴) and the actual large-scale accident frequency (once in 1,460 reactor-years), the posterior CFF was estimated at 3.8 × 10⁻⁴. This study also took into account the 'external' factor causing unexpected nuclear accidents, concluding that such factor could result in higher CFF estimations, especially with larger observed accident numbers. (author)
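One common way to combine a PRA-based prior with an observed accident count is a conjugate gamma-Poisson update. The abstract does not state which prior family the study used, so the sketch below is purely an assumption: the gamma prior, the shape value 0.5 and the function name are hypothetical, and only the prior mean and the observation (one accident in 1,460 reactor-years) come from the abstract. It is not expected to reproduce the paper's posterior exactly.

```python
def gamma_poisson_posterior_mean(prior_mean, prior_shape, events, exposure):
    """Posterior mean frequency for a Gamma(shape, rate) prior and a
    Poisson count observed over a given exposure (reactor-years)."""
    prior_rate = prior_shape / prior_mean        # gamma mean = shape / rate
    post_shape = prior_shape + events            # conjugate update
    post_rate = prior_rate + exposure
    return post_shape / post_rate

posterior = gamma_poisson_posterior_mean(
    prior_mean=2.1e-4,   # PRA average CFF from the abstract
    prior_shape=0.5,     # hypothetical choice, not from the paper
    events=1,            # one large-scale accident
    exposure=1460.0,     # reactor-years
)
print(f"{posterior:.2e} per reactor-year")
```

The posterior mean always lies between the prior mean and the raw observed frequency, with the prior shape controlling how strongly the PRA information is weighted.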
Gene flow analysis method, the D-statistic, is robust in a wide parameter space.
Zheng, Yichen; Janke, Axel
2018-01-08
We evaluated the sensitivity of the D-statistic, a parsimony-like method widely used to detect gene flow between closely related species. This method has been applied to a variety of taxa with a wide range of divergence times. However, its parameter space and thus its applicability to a wide taxonomic range has not been systematically studied. Divergence time, population size, time of gene flow, distance of outgroup and number of loci were examined in a sensitivity analysis. The sensitivity study shows that the primary determinant of the D-statistic is the relative population size, i.e. the population size scaled by the number of generations since divergence. This is consistent with the fact that the main confounding factor in gene flow detection is incomplete lineage sorting by diluting the signal. The sensitivity of the D-statistic is also affected by the direction of gene flow, size and number of loci. In addition, we examined the ability of the f-statistics, [Formula: see text] and [Formula: see text], to estimate the fraction of a genome affected by gene flow; while these statistics are difficult to implement to practical questions in biology due to lack of knowledge of when the gene flow happened, they can be used to compare datasets with identical or similar demographic background. The D-statistic, as a method to detect gene flow, is robust against a wide range of genetic distances (divergence times) but it is sensitive to population size. The D-statistic should only be applied with critical reservation to taxa where population sizes are large relative to branch lengths in generations.
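The D-statistic is computed from counts of two site patterns in a four-taxon alignment (P1, P2, P3, outgroup): ABBA sites, where P2 and P3 share the derived allele, and BABA sites, where P1 and P3 share it. D is (ABBA − BABA) / (ABBA + BABA). The sketch below counts patterns from aligned strings; the function name and toy sequences are hypothetical, the outgroup allele is treated as ancestral, and no significance test (for example a block jackknife) is included.

```python
def d_statistic(p1, p2, p3, outgroup):
    """Patterson's D from four aligned sequences of equal length.

    Returns None when no ABBA or BABA site is present.
    """
    abba = baba = 0
    for a, b, c, o in zip(p1, p2, p3, outgroup):
        if a == o and b == c and b != o:
            abba += 1      # P1 ancestral, P2 and P3 share the derived allele
        elif b == o and a == c and a != o:
            baba += 1      # P2 ancestral, P1 and P3 share the derived allele
    if abba + baba == 0:
        return None
    return (abba - baba) / (abba + baba)

# Toy alignment: two ABBA sites, one BABA site, one uninformative site.
print(d_statistic("AAGA", "GGAA", "GGGA", "AAAA"))
```

Under incomplete lineage sorting alone the two patterns are expected in equal numbers, so D near zero is the null expectation and a significant excess of one pattern suggests gene flow.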
Askerov, Bahram M
2010-01-01
This book deals with theoretical thermodynamics and the statistical physics of electron and particle gases. While treating the laws of thermodynamics from both classical and quantum theoretical viewpoints, it posits that the basis of the statistical theory of macroscopic properties of a system is the microcanonical distribution of isolated systems, from which all canonical distributions stem. To calculate the free energy, the Gibbs method is applied to ideal and non-ideal gases, and also to a crystalline solid. Considerable attention is paid to the Fermi-Dirac and Bose-Einstein quantum statistics and its application to different quantum gases, and electron gas in both metals and semiconductors is considered in a nonequilibrium state. A separate chapter treats the statistical theory of thermodynamic properties of an electron gas in a quantizing magnetic field.
Consistency of extreme flood estimation approaches
Felder, Guido; Paquet, Emmanuel; Penot, David; Zischg, Andreas; Weingartner, Rolf
2017-04-01
Estimations of low-probability flood events are frequently used for the planning of infrastructure as well as for determining the dimensions of flood protection measures. There are several well-established methodical procedures to estimate low-probability floods. However, a global assessment of the consistency of these methods is difficult to achieve because the "true value" of an extreme flood is not observable. Nevertheless, a detailed comparison performed on a given case study provides useful information about the statistical and hydrological processes involved in different methods. In this study, the following three approaches for estimating low-probability floods are compared: a purely statistical approach (ordinary extreme value statistics), a statistical approach based on stochastic rainfall-runoff simulation (SCHADEX method), and a deterministic approach (physically based PMF estimation). These methods are tested for two different Swiss catchments. The results and some intermediate variables are used for assessing potential strengths and weaknesses of each method, as well as for evaluating the consistency of these methods.
Methods library of embedded R functions at Statistics Norway
Directory of Open Access Journals (Sweden)
Øyvind Langsrud
2017-11-01
Statistics Norway is modernising the production processes. An important element in this work is a library of functions for statistical computations. In principle, the functions in such a methods library can be programmed in several languages. A modernised production environment demands that these functions can be reused for different statistics products, and that they are embedded within a common IT system. The embedding should be done in such a way that the users of the methods do not need to know the underlying programming language. As a proof of concept, Statistics Norway soon has established a methods library offering a limited number of methods for macro-editing, imputation and confidentiality. This is done within an area of municipal statistics with R as the only programming language. This paper presents the details and experiences from this work. The problem of fitting real-world applications to simple and strict standards is discussed and exemplified by the development of solutions to regression imputation and table suppression.
Complex Data Modeling and Computationally Intensive Statistical Methods
Mantovan, Pietro
2010-01-01
The last years have seen the advent and development of many devices able to record and store an always increasing amount of complex and high dimensional data; 3D images generated by medical scanners or satellite remote sensing, DNA microarrays, real time financial data, system control datasets. The analysis of this data poses new challenging problems and requires the development of novel statistical models and computational methods, fueling many fascinating and fast growing research areas of modern statistics. The book offers a wide variety of statistical methods and is addressed to statistici
Flexible and efficient estimating equations for variogram estimation
Sun, Ying; Chang, Xiaohui; Guan, Yongtao
2018-01-01
Variogram estimation plays a vastly important role in spatial modeling. Different methods for variogram estimation can be largely classified into least squares methods and likelihood based methods. A general framework to estimate the variogram through a set of estimating equations is proposed. This approach serves as an alternative approach to likelihood based methods and includes commonly used least squares approaches as its special cases. The proposed method is highly efficient as a low dimensional representation of the weight matrix is employed. The statistical efficiency of various estimators is explored and the lag effect is examined. An application to a hydrology dataset is also presented.
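The least squares approaches mentioned in this abstract are usually fitted to the classical method-of-moments (Matheron) empirical variogram, which averages half the squared differences between observations separated by a given lag. The sketch below computes that empirical variogram for the simplest possible case, a regularly spaced one-dimensional transect. It does not implement the estimating-equation framework the paper proposes, and the function name and data are hypothetical.

```python
def empirical_variogram(values, max_lag):
    """Matheron estimator on a regularly spaced 1-D transect.

    Returns {lag: gamma(lag)} with lags measured in grid steps.
    """
    gamma = {}
    for h in range(1, max_lag + 1):
        # All pairs of observations exactly h steps apart.
        squared = [(values[i + h] - values[i]) ** 2
                   for i in range(len(values) - h)]
        gamma[h] = sum(squared) / (2 * len(squared))
    return gamma

print(empirical_variogram([0.0, 1.0, 2.0, 3.0], max_lag=2))
```

A parametric variogram model would then be fitted to these lag-wise values, with the choice of weights across lags distinguishing ordinary, weighted and generalized least squares.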
Energy Technology Data Exchange (ETDEWEB)
Almeida, Arthur C.; Barros, Paulo L.C.; Monteiro, Jose H.A.; Rocha, Brigida R.P. [Universidade Federal do Para (DEEC/UFPA), Belem, PA (Brazil). Dept. de Engenharia Eletrica e Computacao. Grupo de Pesquisa ENERBIO], e-mails: arthur@ufpa.br, jhumberto01@yahoo.com.br, brigida@ufpa.br, paulo.contente@ufra.edu.br
2006-07-01
The current methodologies for calculating the volume of biomass and the consequent potential energy, widely used in forest inventories, are based primarily on statistical methodology. However, more recent techniques, based on the nonlinear mapping ability offered by artificial neural networks, have been used successfully in several areas of technology, with superior performance. This work shows a comparison between the statistical model to estimate the volume of trees and a model based on neural networks, which can be used with advantage for this activity related to biomass energy planning.
Dental age estimation using Willems method: A digital orthopantomographic study
Directory of Open Access Journals (Sweden)
Rezwana Begum Mohammed
2014-01-01
In recent years, age estimation has become increasingly important in living people for a variety of reasons, including identifying criminal and legal responsibility, and for many other social events such as a birth certificate, marriage, beginning a job, joining the army, and retirement. Objectives: The aim of this study was to assess the developmental stages of the seven left mandibular teeth for estimation of dental age (DA) in different age groups and to evaluate the possible correlation between DA and chronological age (CA) in a South Indian population using the Willems method. Materials and Methods: Digital orthopantomograms of 332 subjects (166 males, 166 females) who fit the study criteria were obtained. Assessment of the development of the mandibular teeth (from central incisor to the second molar in the left quadrant) was undertaken and DA was assessed using the Willems method. Results and Discussion: The present study showed a significant correlation between DA and CA in both males (r = 0.71) and females (r = 0.88). The overall mean difference between the estimated DA and CA for males was 0.69 ± 2.14 years (P 0.05. Willems method underestimated the mean age of males by 0.69 years and females by 0.08 years and showed that females mature earlier than males in the selected population. The mean difference between DA and CA according to the Willems method was 0.39 years and is statistically significant (P < 0.05). Conclusion: This study showed a significant relation between DA and CA. Thus, digital radiographic assessment of mandibular teeth development can be used to generate mean DA using the Willems method and also the estimated age range for an individual of unknown CA.
Shin, S M; Kim, Y-I; Choi, Y-S; Yamaguchi, T; Maki, K; Cho, B-H; Park, S-B
2015-01-01
To evaluate axial cervical vertebral (ACV) shape quantitatively and to build a prediction model for skeletal maturation level using statistical shape analysis for Japanese individuals. The sample included 24 female and 19 male patients with hand-wrist radiographs and CBCT images. Through generalized Procrustes analysis and principal components (PCs) analysis, the meaningful PCs were extracted from each ACV shape and analysed for the estimation regression model. Each ACV shape had meaningful PCs, except for the second axial cervical vertebra. Based on these models, the smallest prediction intervals (PIs) were from the combination of the shape space PCs, age and gender. Overall, the PIs of the male group were smaller than those of the female group. There was no significant correlation between centroid size as a size factor and skeletal maturation level. Our findings suggest that the ACV maturation method, which was applied by statistical shape analysis, could confirm information about skeletal maturation in Japanese individuals as an available quantifier of skeletal maturation and could be as useful a quantitative method as the skeletal maturation index.
DEFF Research Database (Denmark)
Jensen, Jesper; Tan, Zheng-Hua
2014-01-01
We propose a method for minimum mean-square error (MMSE) estimation of mel-frequency cepstral features for noise robust automatic speech recognition (ASR). The method is based on a minimum number of well-established statistical assumptions; no assumptions are made which are inconsistent with others....... The strength of the proposed method is that it allows MMSE estimation of mel-frequency cepstral coefficients (MFCC's), cepstral mean-subtracted MFCC's (CMS-MFCC's), velocity, and acceleration coefficients. Furthermore, the method is easily modified to take into account other compressive non-linearities than...... the logarithmic which is usually used for MFCC computation. The proposed method shows estimation performance which is identical to or better than state-of-the-art methods. It further shows comparable ASR performance, where the advantage of being able to use mel-frequency speech features based on a power non...
Estimation of the POD function and the LOD of a qualitative microbiological measurement method.
Wilrich, Cordula; Wilrich, Peter-Theodor
2009-01-01
Qualitative microbiological measurement methods in which the measurement results are either 0 (microorganism not detected) or 1 (microorganism detected) are discussed. The performance of such a measurement method is described by its probability of detection as a function of the contamination (CFU/g or CFU/mL) of the test material, or by the LOD(p), i.e., the contamination that is detected (measurement result 1) with a specified probability p. A complementary log-log model was used to statistically estimate these performance characteristics. An intralaboratory experiment for the detection of Listeria monocytogenes in various food matrixes illustrates the method. The estimate of LOD50% is compared with the Spearman-Kaerber method.
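The relation between the POD function and LOD(p) can be sketched numerically. The example below assumes a one-parameter form POD(x) = 1 - exp(-a·x), which is a complementary log-log model in log contamination with unit slope; the abstract does not state the exact parametrization used by the authors, so this form, the function names and the fitting routine are all illustrative assumptions.

```python
import math


def pod(x, a):
    """Probability of detection at contamination x, assuming POD(x) = 1 - exp(-a*x)."""
    return -math.expm1(-a * x)


def lod(p, a):
    """Contamination detected with probability p, i.e. the x solving pod(x, a) = p."""
    return -math.log(1.0 - p) / a


def fit_a(levels, n_tests, n_detected, lo=1e-9, hi=1e3, iters=200):
    """Maximum-likelihood estimate of a by bisection on the score function.

    The binomial log-likelihood is concave in a, so its derivative (the score)
    changes sign exactly once when the data contain both detections and
    non-detections.
    """
    def score(a):
        s = 0.0
        for x, n, k in zip(levels, n_tests, n_detected):
            if x <= 0:
                continue  # blank samples carry no information about a
            q = math.exp(-a * x)  # probability of non-detection
            s += k * x * q / (-math.expm1(-a * x)) - (n - k) * x
        return s

    for _ in range(iters):
        mid = math.sqrt(lo * hi)  # bisect on a logarithmic scale
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)
```

With a fitted value of a, `lod(0.5, a)` gives the LOD50% under this assumed model. The fit needs at least one detection and one non-detection in the data; otherwise the estimate runs to a search bound.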
Descriptive and inferential statistical methods used in burns research.
Al-Benna, Sammy; Al-Ajam, Yazan; Way, Benjamin; Steinstraesser, Lars
2010-05-01
Burns research articles utilise a variety of descriptive and inferential methods to present and analyse data. The aim of this study was to determine the descriptive methods (e.g. mean, median, SD, range, etc.) and survey the use of inferential methods (statistical tests) used in articles in the journal Burns. This study defined its population as all original articles published in the journal Burns in 2007. Letters to the editor, brief reports, reviews, and case reports were excluded. Study characteristics, use of descriptive statistics and the number and types of statistical methods employed were evaluated. Of the 51 articles analysed, 11 (22%) were randomised controlled trials, 18 (35%) were cohort studies, 11 (22%) were case control studies and 11 (22%) were case series. The study design and objectives were defined in all articles. All articles made use of continuous and descriptive data. Inferential statistics were used in 49 (96%) articles. Data dispersion was calculated by standard deviation in 30 (59%). Standard error of the mean was quoted in 19 (37%). The statistical software product was named in 33 (65%). Of the 49 articles that used inferential statistics, the tests were named in 47 (96%). The 6 most common tests used (Student's t-test (53%), analysis of variance/co-variance (33%), chi-squared test (27%), Wilcoxon & Mann-Whitney tests (22%), Fisher's exact test (12%)) accounted for the majority (72%) of statistical methods employed. A specified significance level was named in 43 (88%) and the exact significance levels were reported in 28 (57%). Descriptive analysis and basic statistical techniques account for most of the statistical tests reported. This information should prove useful in deciding which tests should be emphasised in educating burn care professionals. These results highlight the need for burn care professionals to have a sound understanding of basic statistics, which is crucial in interpreting and reporting data. Advice should be sought from professionals
DEFF Research Database (Denmark)
Sharifzadeh, Sara; Skytte, Jacob Lercke; Nielsen, Otto Højager Attermann
2012-01-01
Statistical solutions find widespread use in food and medicine quality control. We investigate the effect of different regression and sparse regression methods for a viscosity estimation problem using the spectro-temporal features from a new Sub-Surface Laser Scattering (SLS) vision system. From...... with sparse LAR, lasso and Elastic Net (EN) sparse regression methods. Due to the inconsistent measurement conditions, Locally Weighted Scatter plot Smoothing (Loess) has been employed to alleviate the undesired variation in the estimated viscosity. The experimental results of applying different methods show...
On the efficiency of high-energy particle identification statistical methods
International Nuclear Information System (INIS)
Chilingaryan, A.A.
1982-01-01
An attempt is made to analyze the statistical methods of making decisions on high-energy particle identification. The Bayesian approach is shown to provide the most complete account of the primary discriminative information between particles of various types. It does not impose rigid requirements on the form of the probability density function and takes a priori information into account, in contrast to the Neyman-Pearson approach, the minimax technique and heuristic rules for constructing decision limits in the variation region of a specially chosen parameter. The methods based on the concept of the nearest neighbourhood are shown to be the most effective among the local methods of probability density estimation. The probability distances between the training sample classes are suggested as a basis for selecting the optimal parameters of a high-energy particle detector. The method proposed and the software constructed are tested on the problem of cosmic radiation hadron identification by means of transition radiation detectors (the ''PION'' experiment)
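The combination of a nearest-neighbour density estimate with a Bayesian decision rule can be illustrated with a minimal sketch. It assumes a single scalar detector feature and the standard k-nearest-neighbour density estimate k / (n · 2·r_k); the class labels, training values and priors are hypothetical and are not taken from the ''PION'' experiment.

```python
def knn_density(x, sample, k):
    """1-D k-nearest-neighbour density estimate: k / (n * 2 * r_k),
    where r_k is the distance from x to its k-th nearest training point."""
    dists = sorted(abs(x - s) for s in sample)
    r_k = dists[k - 1]
    return k / (len(sample) * 2.0 * r_k)


def posterior(x, classes, priors, k=3):
    """Posterior class probabilities from Bayes' rule, using a k-NN
    density estimate for each class-conditional density."""
    weighted = {c: priors[c] * knn_density(x, pts, k) for c, pts in classes.items()}
    total = sum(weighted.values())
    return {c: w / total for c, w in weighted.items()}


# Hypothetical training samples of one detector feature for two particle types.
training = {
    "pion": [1.0, 1.2, 0.9, 1.1, 1.3],
    "proton": [3.0, 3.1, 2.8, 3.3, 2.9],
}
priors = {"pion": 0.5, "proton": 0.5}
post = posterior(1.05, training, priors, k=3)
```

The a priori information enters only through `priors`, and no parametric form is imposed on the densities. The estimate is undefined when the query point coincides with k or more training points (r_k = 0), which a production version would need to guard against.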
DEFF Research Database (Denmark)
Carstensen, Jakob; Madsen, Henrik; Poulsen, Niels Kjølstad
1994-01-01
of the processes, i.e. including prior knowledge, with the significant effects found in data by using statistical identification methods. Rates of the biochemical and hydraulic processes are identified by statistical methods and the related constants for the biochemical processes are estimated assuming Monod...... kinetics. The models only include those hydraulic and kinetic parameters, which have shown to be significant in a statistical sense, and hence they can be quantified. The application potential of these models is on-line control, because the present state of the plant is given by the variables of the models......The introduction of on-line sensors of nutrient salt concentrations on wastewater treatment plants opens a wide new area of modelling wastewater processes. Time series models of these processes are very useful for gaining insight in real time operation of wastewater treatment systems which deal...
Improved statistical method for temperature and salinity quality control
Gourrion, Jérôme; Szekely, Tanguy
2017-04-01
Climate research and Ocean monitoring benefit from the continuous development of global in-situ hydrographic networks in the last decades. Apart from the increasing volume of observations available on a large range of temporal and spatial scales, a critical aspect concerns the ability to constantly improve the quality of the datasets. In the context of the Coriolis Dataset for ReAnalysis (CORA) version 4.2, a new quality control method based on a local comparison to historical extreme values ever observed is developed, implemented and validated. Temperature, salinity and potential density validity intervals are directly estimated from minimum and maximum values from an historical reference dataset, rather than from traditional mean and standard deviation estimates. Such an approach avoids strong statistical assumptions on the data distributions such as unimodality, absence of skewness and spatially homogeneous kurtosis. As a new feature, it also allows addressing simultaneously the two main objectives of an automatic quality control strategy, i.e. maximizing the number of good detections while minimizing the number of false alarms. The reference dataset is presently built from the fusion of 1) all ARGO profiles up to late 2015, 2) 3 historical CTD datasets and 3) the Sea Mammals CTD profiles from the MEOP database. All datasets are extensively and manually quality controlled. In this communication, the latest method validation results are also presented. The method has already been implemented in the latest version of the delayed-time CMEMS in-situ dataset and will be deployed soon in the equivalent near-real time products.
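The core idea of the check, validity intervals taken from historical extremes rather than from a mean and standard deviation, can be sketched as follows. The sketch assumes that the reference values have already been selected for one location and depth bin; the function names, the optional margin and the sample temperatures are illustrative assumptions, not part of the CORA implementation.

```python
def validity_interval(reference_values, margin=0.0):
    """Validity interval from the historical minimum and maximum of a local
    reference sample, optionally widened by a tolerance margin."""
    return min(reference_values) - margin, max(reference_values) + margin


def flag_observations(observations, interval):
    """Return True for each observation that falls outside the interval."""
    lo, hi = interval
    return [not (lo <= v <= hi) for v in observations]


# Hypothetical, strongly skewed local temperature reference (degrees Celsius).
reference = [12.1, 12.8, 13.4, 14.0, 19.5]
interval = validity_interval(reference)
flags = flag_observations([13.0, 19.0, 25.0], interval)
```

Because the interval follows the observed extremes, a skewed or multimodal local distribution does not by itself cause valid values near the tail to be rejected.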
Directory of Open Access Journals (Sweden)
Haruki Nakamura
2012-09-01
We have developed a method for estimating protein-ligand binding free energy (ΔG) based on the direct protein-ligand interaction obtained by a molecular dynamics simulation. Using this method, we estimated the ΔG value statistically by the average values of the van der Waals and electrostatic interactions between each amino acid of the target protein and the ligand molecule. In addition, we introduced fluctuations in the accessible surface area (ASA) and dihedral angles of the protein-ligand complex system as the entropy terms of the ΔG estimation. The present method included the fluctuation term of structural change of the protein and the effective dielectric constant. We applied this method to 34 protein-ligand complex structures. As a result, the correlation coefficient between the experimental and calculated ΔG values was 0.81, and the average error of ΔG was 1.2 kcal/mol with the use of the fixed parameters. These results were obtained from a 2 ns molecular dynamics simulation.
A NEW METHOD FOR PREDICTING SURVIVAL AND ESTIMATING UNCERTAINTY IN TRAUMA PATIENTS
Directory of Open Access Journals (Sweden)
V. G. Schetinin
2017-01-01
The Trauma and Injury Severity Score (TRISS) is the current “gold” standard of screening a patient’s condition for purposes of predicting survival probability. More than 40 years of TRISS practice have revealed a number of problems, particularly (1) unexplained fluctuation of predicted values caused by aggregation of screening tests, and (2) low accuracy of uncertainty interval estimates. We developed a new method and made it available for practitioners as a web calculator to reduce the negative effect of the factors given above. The method involves the Bayesian methodology of statistical inference which, being computationally expensive, in theory provides the most accurate predictions. We implemented and tested this approach on a data set including 571,148 patients registered in the US National Trauma Data Bank (NTDB) with 1–20 injuries. These patients were distributed over the following categories: (1) 174,647 with 1 injury, (2) 381,137 with 2–10 injuries, and (3) 15,364 with 11–20 injuries. Survival rates in each category were 0.977, 0.953, and 0.831, respectively. The proposed method has improved prediction accuracy by 0.04%, 0.36%, and 3.64% (p-value <0.05) in categories 1, 2, and 3, respectively. Hosmer-Lemeshow statistics showed a significant improvement of the new model calibration. The uncertainty 2σ intervals were reduced from 0.628 to 0.569 for patients of the second category and from 1.227 to 0.930 for patients of the third category, both with p-value <0.005. The new method shows a statistically significant improvement (p-value <0.05) in accuracy of predicting survival and estimating the uncertainty intervals. The largest improvement has been achieved for patients with 11–20 injuries. The method is available for practitioners as a web calculator http://www.traumacalc.org.
Performance Evaluation of the Spectral Centroid Downshift Method for Attenuation Estimation
Samimi, Kayvan; Varghese, Tomy
2015-01-01
Estimation of frequency-dependent ultrasonic attenuation is an important aspect of tissue characterization. Along with other acoustic parameters studied in quantitative ultrasound, the attenuation coefficient can be used to differentiate normal and pathological tissue. The spectral centroid downshift (CDS) method is one of the most common frequency-domain approaches applied to this problem. In this study, a statistical analysis of this method’s performance was carried out based on a parametric m...
Energy Technology Data Exchange (ETDEWEB)
Telfeyan, Katherine Christina [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Ware, Stuart Douglas [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Reimus, Paul William [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Birdsell, Kay Hanson [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
2017-11-06
Diffusion cell and diffusion wafer experiments were conducted to compare methods for estimating matrix diffusion coefficients in rock core samples from Pahute Mesa at the Nevada Nuclear Security Site (NNSS). A diffusion wafer method, in which a solute diffuses out of a rock matrix that is pre-saturated with water containing the solute, is presented as a simpler alternative to the traditional through-diffusion (diffusion cell) method. Both methods yielded estimates of matrix diffusion coefficients that were within the range of values previously reported for NNSS volcanic rocks. The difference between the estimates of the two methods ranged from 14 to 30%, and there was no systematic high or low bias of one method relative to the other. From a transport modeling perspective, these differences are relatively minor when one considers that other variables (e.g., fracture apertures, fracture spacings) influence matrix diffusion to a greater degree and tend to have greater uncertainty than diffusion coefficients. For the same relative random errors in concentration measurements, the diffusion cell method yields diffusion coefficient estimates that have less uncertainty than the wafer method. However, the wafer method is easier and less costly to implement and yields estimates more quickly, thus allowing a greater number of samples to be analyzed for the same cost and time. Given the relatively good agreement between the methods, and the lack of any apparent bias between the methods, the diffusion wafer method appears to offer advantages over the diffusion cell method if better statistical representation of a given set of rock samples is desired.
Telfeyan, Katherine; Ware, S. Doug; Reimus, Paul W.; Birdsell, Kay H.
2018-02-01
Diffusion cell and diffusion wafer experiments were conducted to compare methods for estimating effective matrix diffusion coefficients in rock core samples from Pahute Mesa at the Nevada Nuclear Security Site (NNSS). A diffusion wafer method, in which a solute diffuses out of a rock matrix that is pre-saturated with water containing the solute, is presented as a simpler alternative to the traditional through-diffusion (diffusion cell) method. Both methods yielded estimates of effective matrix diffusion coefficients that were within the range of values previously reported for NNSS volcanic rocks. The difference between the estimates of the two methods ranged from 14 to 30%, and there was no systematic high or low bias of one method relative to the other. From a transport modeling perspective, these differences are relatively minor when one considers that other variables (e.g., fracture apertures, fracture spacings) influence matrix diffusion to a greater degree and tend to have greater uncertainty than effective matrix diffusion coefficients. For the same relative random errors in concentration measurements, the diffusion cell method yields effective matrix diffusion coefficient estimates that have less uncertainty than the wafer method. However, the wafer method is easier and less costly to implement and yields estimates more quickly, thus allowing a greater number of samples to be analyzed for the same cost and time. Given the relatively good agreement between the methods, and the lack of any apparent bias between the methods, the diffusion wafer method appears to offer advantages over the diffusion cell method if better statistical representation of a given set of rock samples is desired.
Application of Statistical Methods to Activation Analytical Results near the Limit of Detection
DEFF Research Database (Denmark)
Heydorn, Kaj; Wanscher, B.
1978-01-01
Reporting actual numbers instead of upper limits for analytical results at or below the detection limit may produce reliable data when these numbers are subjected to appropriate statistical processing. Particularly in radiometric methods, such as activation analysis, where individual standard...... deviations of analytical results may be estimated, improved discrimination may be based on the Analysis of Precision. Actual experimental results from a study of the concentrations of arsenic in human skin demonstrate the power of this principle....
Unrecorded Alcohol Consumption: Quantitative Methods of Estimation
Razvodovsky, Y. E.
2010-01-01
Keywords: unrecorded alcohol; methods of estimation. In this paper we focus on methods of estimating the level of unrecorded alcohol consumption. Present methods allow only an approximate estimation of the unrecorded alcohol consumption level. Taking into consideration the extreme importance of such data, further investigation is necessary to improve the reliability of methods of estimating unrecorded alcohol consumption.
Statistical methods for spatio-temporal systems
Finkenstadt, Barbel
2006-01-01
Statistical Methods for Spatio-Temporal Systems presents current statistical research issues on spatio-temporal data modeling and will promote advances in research and a greater understanding between the mechanistic and the statistical modeling communities. Contributed by leading researchers in the field, each self-contained chapter starts with an introduction of the topic and progresses to recent research results. Presenting specific examples of epidemic data of bovine tuberculosis, gastroenteric disease, and the U.K. foot-and-mouth outbreak, the first chapter uses stochastic models, such as point process models, to provide the probabilistic backbone that facilitates statistical inference from data. The next chapter discusses the critical issue of modeling random growth objects in diverse biological systems, such as bacteria colonies, tumors, and plant populations. The subsequent chapter examines data transformation tools using examples from ecology and air quality data, followed by a chapter on space-time co...
Directory of Open Access Journals (Sweden)
Gia Huy Dinh
2016-06-01
The paper suggests a new method of collision avoidance stemming from the concept of the polygonal target ship domain. Since the last century, typical ship domains have been classified and described. In this proposal, firstly, the domain is defined geometrically, so that it can be used in both analytical and statistical methods, which makes it significant for practical application and simulation. Secondly, the domain is applied to the target ship as a combination of two separate parts, a “Blocking area” and an “Action area”, in order to define the area that the ship must stay outside of and how actions to avoid collision can be generated. Thirdly, the concept suggests a number of mathematical models for different approaching encounters, including head-on, overtaking and crossing situations. Finally, the parameters of the ship's turning circle can be used in determining the size of the domain. Statistical evidence indicates that this method reflects a crew's real habits and psychology in maneuvering. As a result, the simple domain is shaped as sailors imagine it, but its boundary is calculated more accurately. It promises an effective solution for automatic collision avoidance. Subsequent research has achieved positive results in finding the shortest route for avoiding collision. Moreover, whereas classical research using statistical methods faces a serious problem in wide application across different areas, this concept can provide a beneficial solution for general application. The numerous ship domains from previous research will be compared to point out the simplicity and effectiveness of the new method in practice.
Statistical methods for forecasting
Abraham, Bovas
2009-01-01
The Wiley-Interscience Paperback Series consists of selected books that have been made more accessible to consumers in an effort to increase global appeal and general circulation. With these new unabridged softcover volumes, Wiley hopes to extend the lives of these works by making them available to future generations of statisticians, mathematicians, and scientists. "This book, it must be said, lives up to the words on its advertising cover: 'Bridging the gap between introductory, descriptive approaches and highly advanced theoretical treatises, it provides a practical, intermediate level discussion of a variety of forecasting tools, and explains how they relate to one another, both in theory and practice.' It does just that!" (Journal of the Royal Statistical Society) "A well-written work that deals with statistical methods and models that can be used to produce short-term forecasts, this book has wide-ranging applications. It could be used in the context of a study of regression, forecasting, and time series ...
Advances in Statistical Methods for Substance Abuse Prevention Research
MacKinnon, David P.; Lockwood, Chondra M.
2010-01-01
The paper describes advances in statistical methods for prevention research with a particular focus on substance abuse prevention. Standard analysis methods are extended to the typical research designs and characteristics of the data collected in prevention research. Prevention research often includes longitudinal measurement, clustering of data in units such as schools or clinics, missing data, and categorical as well as continuous outcome variables. Statistical methods to handle these features of prevention data are outlined. Developments in mediation, moderation, and implementation analysis allow for the extraction of more detailed information from a prevention study. Advancements in the interpretation of prevention research results include more widespread calculation of effect size and statistical power, the use of confidence intervals as well as hypothesis testing, detailed causal analysis of research findings, and meta-analysis. The increased availability of statistical software has contributed greatly to the use of new methods in prevention research. It is likely that the Internet will continue to stimulate the development and application of new methods. PMID:12940467
Directory of Open Access Journals (Sweden)
A. Casanueva
2013-08-01
The study of extreme events has become of great interest in recent years due to their direct impact on society. Extremes are usually evaluated by using extreme indicators, based on order statistics on the tail of the probability distribution function (typically percentiles). In this study, we focus on the tail of the distribution of daily maximum and minimum temperatures. For this purpose, we analyse high (95th) and low (5th) percentiles in daily maximum and minimum temperatures on the Iberian Peninsula, respectively, derived from different downscaling methods (statistical and dynamical). First, we analyse the performance of reanalysis-driven downscaling methods in present climate conditions. The comparison among the different methods is performed in terms of the bias of seasonal percentiles, considering as observations the public gridded data sets E-OBS and Spain02, and obtaining an estimation of both the mean and spatial percentile errors. Secondly, we analyse the increments of future percentile projections under the SRES A1B scenario and compare them with those corresponding to the mean temperature, showing that their relative importance depends on the method, and stressing the need to consider an ensemble of methodologies.
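The percentile-bias comparison described above can be sketched for a single grid point. The example assumes linear interpolation between order statistics as the percentile definition, which the abstract does not specify, and uses synthetic series in place of downscaled output and gridded observations; all names are hypothetical.

```python
def percentile(values, q):
    """q-th percentile (0 to 100) with linear interpolation between order statistics."""
    xs = sorted(values)
    pos = (len(xs) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(xs) - 1)
    return xs[lower] + (xs[upper] - xs[lower]) * (pos - lower)


def percentile_bias(model, observed, q):
    """Bias of a downscaled series at percentile q: model minus observation."""
    return percentile(model, q) - percentile(observed, q)


# Synthetic daily maximum temperatures for one season at one grid point.
observed = [float(t) for t in range(101)]
downscaled = [t + 1.5 for t in observed]   # a uniformly warm-biased method
bias_p95 = percentile_bias(downscaled, observed, 95)
```

Repeating the calculation per season and per grid point would give the fields from which mean and spatial percentile errors are summarised.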
The Monte Carlo method the method of statistical trials
Shreider, YuA
1966-01-01
The Monte Carlo Method: The Method of Statistical Trials is a systematic account of the fundamental concepts and techniques of the Monte Carlo method, together with its range of applications. Some of these applications include the computation of definite integrals, neutron physics, and in the investigation of servicing processes. This volume is comprised of seven chapters and begins with an overview of the basic features of the Monte Carlo method and typical examples of its application to simple problems in computational mathematics. The next chapter examines the computation of multi-dimensio
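As an illustration of the first application listed, the computation of definite integrals, the sketch below estimates a one-dimensional integral by averaging the integrand at uniformly drawn points and reports the usual standard error. It is a generic textbook construction, not code from the book; the function name and the test integrand are assumptions.

```python
import math
import random


def mc_integrate(f, a, b, n, rng):
    """Estimate the integral of f over [a, b] from n uniform random trials.

    Returns the estimate and its standard error, which shrinks as 1/sqrt(n).
    """
    total = 0.0
    total_sq = 0.0
    for _ in range(n):
        y = f(rng.uniform(a, b))
        total += y
        total_sq += y * y
    mean = total / n
    variance = max(total_sq / n - mean * mean, 0.0)
    return (b - a) * mean, (b - a) * math.sqrt(variance / n)


# The integral of sin(x) over [0, pi] is exactly 2.
estimate, std_err = mc_integrate(math.sin, 0.0, math.pi, 20000, random.Random(0))
```

The same averaging idea carries over to multi-dimensional integrals, where the 1/sqrt(n) error rate does not depend on the dimension.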
Regularization parameter selection methods for ill-posed Poisson maximum likelihood estimation
International Nuclear Information System (INIS)
Bardsley, Johnathan M; Goldes, John
2009-01-01
In image processing applications, image intensity is often measured via the counting of incident photons emitted by the object of interest. In such cases, image data noise is accurately modeled by a Poisson distribution. This motivates the use of Poisson maximum likelihood estimation for image reconstruction. However, when the underlying model equation is ill-posed, regularization is needed. Regularized Poisson likelihood estimation has been studied extensively by the authors, though a problem of high importance remains: the choice of the regularization parameter. We will present three statistically motivated methods for choosing the regularization parameter, and numerical examples will be presented to illustrate their effectiveness
Estimating Predictive Variance for Statistical Gas Distribution Modelling
International Nuclear Information System (INIS)
Lilienthal, Achim J.; Asadi, Sahar; Reggente, Matteo
2009-01-01
Recent publications in statistical gas distribution modelling have proposed algorithms that model mean and variance of a distribution. This paper argues that estimating the predictive concentration variance entails not only a gradual improvement but is rather a significant step to advance the field. This is, first, because the models fit much better the particular structure of gas distributions, which exhibit strong fluctuations with considerable spatial variations as a result of the intermittent character of gas dispersal. Second, estimating the predictive variance allows one to evaluate the model quality in terms of the data likelihood. This offers a solution to the problem of ground truth evaluation, which has always been a critical issue for gas distribution modelling. It also enables solid comparisons of different modelling approaches, and provides the means to learn meta parameters of the model, to determine when the model should be updated or re-initialised, or to suggest new measurement locations based on the current model. We also point out directions of related ongoing or potential future research work.
Comparison of different methods for estimation of potential evapotranspiration
International Nuclear Information System (INIS)
Nazeer, M.
2010-01-01
Evapotranspiration can be estimated with different available methods. The aim of this research study was to compare and evaluate the originally measured potential evapotranspiration from a Class A pan with the Hargreaves equation, the Penman equation, the Penman-Monteith equation, and the FAO56 Penman-Monteith equation. The evaporation rate recorded from the pan was greater than that given by the stated methods. For each evapotranspiration method, results were compared against mean monthly potential evapotranspiration (PET) from pan data according to FAO (ET_o = K_pan × E_pan), from daily measured data recorded over twenty-five years (1984-2008). On the basis of statistical analysis, the differences between the pan data and the FAO56 Penman-Monteith method are not considered to be very significant (=0.98) at 95% confidence and prediction intervals. All methods required accurate weather data for precise results; for the purpose of this study, the past twenty-five years of data were analyzed and used, including maximum and minimum air temperature, relative humidity, wind speed, sunshine duration and rainfall. Based on linear regression analysis results, the FAO56 Penman-Monteith method ranked first (R² = 0.98), followed by the Hargreaves method (R² = 0.96), the Penman-Monteith method (R² = 0.94) and the Penman method (R² = 0.93). Using the FAO56 Penman-Monteith method with precise climatic variables for ET_o estimation is more reliable than the other alternative methods; the Hargreaves method is simpler, relies only on air temperature data, and can be used as an alternative to the FAO56 Penman-Monteith method if other climatic data are missing or unreliable. (author)
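Two of the estimates compared above are simple enough to sketch directly: the pan relation ET_o = K_pan × E_pan quoted in the abstract, and the Hargreaves equation in the form given in FAO Irrigation and Drainage Paper 56, ET_o = 0.0023 (T_mean + 17.8) (T_max − T_min)^0.5 R_a, with R_a expressed as equivalent evaporation in mm/day. The abstract does not say which variant of the Hargreaves equation the authors used, and the input values below are hypothetical.

```python
import math


def et0_pan(k_pan, e_pan):
    """Reference evapotranspiration (mm/day) from Class A pan evaporation."""
    return k_pan * e_pan


def et0_hargreaves(t_max, t_min, ra_mm_day):
    """Hargreaves estimate of reference evapotranspiration (mm/day).

    t_max, t_min: daily maximum and minimum air temperature (degrees Celsius)
    ra_mm_day:    extraterrestrial radiation as equivalent evaporation (mm/day)
    """
    t_mean = (t_max + t_min) / 2.0
    return 0.0023 * (t_mean + 17.8) * math.sqrt(t_max - t_min) * ra_mm_day


# Hypothetical summer day.
pan_estimate = et0_pan(k_pan=0.7, e_pan=8.0)
hargreaves_estimate = et0_hargreaves(t_max=30.0, t_min=14.0, ra_mm_day=15.0)
```

The Hargreaves form needs only air temperatures plus R_a, which depends on latitude and day of year, which is why it serves as a fallback when humidity, wind and sunshine data are missing.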
Three methods for estimating a range of vehicular interactions
Krbálek, Milan; Apeltauer, Jiří; Apeltauer, Tomáš; Szabová, Zuzana
2018-02-01
We present three different approaches to estimating the number of preceding cars that influence the decision-making of a given driver moving in saturated traffic flows. The first method is based on correlation analysis, the second one evaluates (quantitatively) deviations from the main assumption in the convolution theorem for probability, and the third one operates with advanced instruments of the theory of counting processes (statistical rigidity). We demonstrate that the universally accepted premise of short-ranged traffic interactions may not be correct. All methods introduced have revealed that the minimum number of actively followed vehicles is two. This supports the idea that vehicular interactions are, in fact, middle-ranged. Furthermore, the consistency between the estimations used is surprisingly credible. In all cases we have found that the interaction range (the number of actively followed vehicles) drops with traffic density. Whereas drivers moving in congested regimes with lower density (around 30 vehicles per kilometer) react to four or five neighbors, drivers moving in high-density flows respond to two predecessors only.
Mukhopadhyay, Nitai D; Sampson, Andrew J; Deniz, Daniel; Alm Carlsson, Gudrun; Williamson, Jeffrey; Malusek, Alexandr
2012-01-01
Correlated sampling Monte Carlo methods can shorten computing times in brachytherapy treatment planning. Monte Carlo efficiency is typically estimated via efficiency gain, defined as the reduction in computing time by correlated sampling relative to conventional Monte Carlo methods when equal statistical uncertainties have been achieved. The determination of the efficiency gain uncertainty arising from random effects, however, is not a straightforward task, especially when the error distribution is non-normal. The purpose of this study is to evaluate the applicability of the F distribution and standardized uncertainty propagation methods (widely used in metrology to estimate uncertainty of physical measurements) for predicting confidence intervals about efficiency gain estimates derived from single Monte Carlo runs using fixed-collision correlated sampling in a simplified brachytherapy geometry. A bootstrap-based algorithm was used to simulate the probability distribution of the efficiency gain estimates, and the shortest 95% confidence interval was estimated from this distribution. It was found that the corresponding relative uncertainty was as large as 37% for this particular problem. The uncertainty propagation framework predicted confidence intervals reasonably well; however, its main disadvantage was that uncertainties of input quantities had to be calculated in a separate run via a Monte Carlo method. The F distribution noticeably underestimated the confidence interval. These discrepancies were influenced by several photons with large statistical weights, which made extremely large contributions to the scored absorbed dose difference. The mechanism of acquiring high statistical weights in the fixed-collision correlated sampling method was explained and a mitigation strategy was proposed. Copyright © 2011 Elsevier Ltd. All rights reserved.
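The bootstrap step and the "shortest 95% confidence interval" mentioned in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the per-history scores are resampled in pairs, and a plain variance ratio stands in for the efficiency gain (the study's actual estimator also accounts for computing time). All names are hypothetical.

```python
import math
import random
import statistics


def variance_ratio(conventional, correlated):
    """Stand-in statistic (assumption): ratio of sample variances."""
    return statistics.variance(conventional) / statistics.variance(correlated)


def bootstrap_distribution(conventional, correlated, stat, n_boot=2000, seed=0):
    """Resample paired histories with replacement and recompute the statistic."""
    rng = random.Random(seed)
    n = len(conventional)
    results = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        results.append(stat([conventional[i] for i in idx],
                            [correlated[i] for i in idx]))
    return results


def shortest_interval(samples, level=0.95):
    """Narrowest window that contains a fraction `level` of the samples."""
    ordered = sorted(samples)
    n = len(ordered)
    # Number of points the window must cover (small epsilon guards rounding).
    k = max(1, min(n, math.ceil(level * n - 1e-9)))
    start = min(range(n - k + 1), key=lambda i: ordered[i + k - 1] - ordered[i])
    return ordered[start], ordered[start + k - 1]
```

For a skewed bootstrap distribution the shortest interval is generally not the equal-tailed percentile interval, which is why the window is searched rather than read off fixed quantiles.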
Gao, Yongnian; Gao, Junfeng; Yin, Hongbin; Liu, Chuansheng; Xia, Ting; Wang, Jing; Huang, Qi
2015-03-15
Remote sensing has been widely used for water quality monitoring, but most of these monitoring studies have only focused on a few water quality variables, such as chlorophyll-a, turbidity, and total suspended solids, which have typically been considered optically active variables. Remote sensing presents a challenge in estimating the phosphorus concentration in water. The total phosphorus (TP) in lakes has been estimated from remotely sensed observations, primarily using the simple individual band ratio or their natural logarithm and the statistical regression method based on the field TP data and the spectral reflectance. In this study, we investigated the possibility of establishing a spatial modeling scheme to estimate the TP concentration of a large lake from multi-spectral satellite imagery using band combinations and regional multivariate statistical modeling techniques, and we tested the applicability of the spatial modeling scheme. The results showed that HJ-1A CCD multi-spectral satellite imagery can be used to estimate the TP concentration in a lake. The correlation and regression analysis showed a highly significant positive relationship between the TP concentration and certain remotely sensed combination variables. The proposed modeling scheme had a higher accuracy for the TP concentration estimation in the large lake compared with the traditional individual band ratio method and the whole-lake scale regression-modeling scheme. The TP concentration values showed a clear spatial variability and were high in western Lake Chaohu and relatively low in eastern Lake Chaohu. The northernmost portion, the northeastern coastal zone and the southeastern portion of western Lake Chaohu had the highest TP concentrations, and the other regions had the lowest TP concentration values, except for the coastal zone of eastern Lake Chaohu. These results strongly suggested that the proposed modeling scheme, i.e., the band combinations and the regional multivariate
DEFF Research Database (Denmark)
Thorson, James T.; Kristensen, Kasper
2016-01-01
Statistical models play an important role in fisheries science when reconciling ecological theory with available data for wild populations or experimental studies. Ecological models increasingly include both fixed and random effects, and are often estimated using maximum likelihood techniques...... configurations of an age-structured population dynamics model. This simulation experiment shows that the epsilon-method and the existing bias-correction method perform equally well in data-rich contexts, but the epsilon-method is slightly less biased in data-poor contexts. We then apply the epsilon......-method to a spatial regression model when estimating an index of population abundance, and compare results with an alternative bias-correction algorithm that involves Markov-chain Monte Carlo sampling. This example shows that the epsilon-method leads to a biologically significant difference in estimates of average...
Directory of Open Access Journals (Sweden)
Sobri Harun
2012-04-01
Evapotranspiration (ET) is a complex process in the hydrological cycle that influences the quantity of runoff and thus the irrigation water requirements. Numerous methods have been developed to estimate potential evapotranspiration (PET). Unfortunately, most of the reliable PET methods are parameter-rich models and therefore not feasible for application in data-scarce regions. On the other hand, the accuracy and reliability of simple PET models vary widely according to regional climate conditions. The objective of the present study was to evaluate the performance of three temperature-based and three radiation-based simple ET methods in estimating historical ET and projecting future ET at the Muda Irrigation Scheme in Kedah, Malaysia. The performance was measured by comparing those methods with the parameter-intensive Penman-Monteith method. It was found that radiation-based methods performed better than temperature-based methods in estimating ET in the study area. Future ET simulated from projected climate data obtained through a statistical downscaling technique also showed that radiation-based methods project ET values closer to those projected by the Penman-Monteith method. It is expected that the study will guide the selection of suitable methods for estimating and projecting ET in accordance with the availability of meteorological data.
Statistical inference an integrated approach
Migon, Helio S; Louzada, Francisco
2014-01-01
Introduction Information The concept of probability Assessing subjective probabilities An example Linear algebra and probability Notation Outline of the book Elements of Inference Common statistical models Likelihood-based functions Bayes theorem Exchangeability Sufficiency and exponential family Parameter elimination Prior Distribution Entirely subjective specification Specification through functional forms Conjugacy with the exponential family Non-informative priors Hierarchical priors Estimation Introduction to decision theory Bayesian point estimation Classical point estimation Empirical Bayes estimation Comparison of estimators Interval estimation Estimation in the Normal model Approximating Methods The general problem of inference Optimization techniques Asymptotic theory Other analytical approximations Numerical integration methods Simulation methods Hypothesis Testing Introduction Classical hypothesis testing Bayesian hypothesis testing Hypothesis testing and confidence intervals Asymptotic tests Prediction...
A probabilistic method for testing and estimating selection differences between populations.
He, Yungang; Wang, Minxian; Huang, Xin; Li, Ran; Xu, Hongyang; Xu, Shuhua; Jin, Li
2015-12-01
Human populations around the world encounter various environmental challenges and, consequently, develop genetic adaptations to different selection forces. Identifying the differences in natural selection between populations is critical for understanding the roles of specific genetic variants in evolutionary adaptation. Although numerous methods have been developed to detect genetic loci under recent directional selection, a probabilistic solution for testing and quantifying selection differences between populations is lacking. Here we report the development of a probabilistic method for testing and estimating selection differences between populations. By use of a probabilistic model of genetic drift and selection, we showed that logarithm odds ratios of allele frequencies provide estimates of the differences in selection coefficients between populations. The estimates approximate a normal distribution, and variance can be estimated using genome-wide variants. This allows us to quantify differences in selection coefficients and to determine the confidence intervals of the estimate. Our work also revealed the link between genetic association testing and hypothesis testing of selection differences. It therefore supplies a solution for hypothesis testing of selection differences. This method was applied to a genome-wide data analysis of Han and Tibetan populations. The results confirmed that both the EPAS1 and EGLN1 genes are under statistically different selection in Han and Tibetan populations. We further estimated differences in the selection coefficients for genetic variants involved in melanin formation and determined their confidence intervals between continental population groups. Application of the method to empirical data demonstrated the outstanding capability of this novel approach for testing and quantifying differences in natural selection. © 2015 He et al.; Published by Cold Spring Harbor Laboratory Press.
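The core quantity in this abstract, the log odds ratio of allele frequencies between two populations with a normal approximation whose variance comes from genome-wide variants, can be sketched in a few lines of Python. This is a simplified illustration rather than the authors' implementation: the centring on the genome-wide mean, the function names, and the omission of any scaling to a per-generation selection coefficient are all assumptions.

```python
import math
import statistics


def log_odds_ratio(p1, p2):
    """Log odds ratio of an allele's frequency in population 1 vs population 2."""
    return math.log(p1 / (1.0 - p1)) - math.log(p2 / (1.0 - p2))


def selection_difference_test(p1, p2, genome_wide_pairs):
    """Return (estimate, 95% CI, two-sided p-value) for one variant.

    genome_wide_pairs: iterable of (freq_pop1, freq_pop2) for background
    variants, used here to estimate the null mean and standard deviation.
    """
    background = [log_odds_ratio(a, b) for a, b in genome_wide_pairs]
    mu = statistics.mean(background)
    sd = statistics.stdev(background)
    estimate = log_odds_ratio(p1, p2)
    z = (estimate - mu) / sd
    ci = (estimate - 1.96 * sd, estimate + 1.96 * sd)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return estimate, ci, p_value
```

Frequencies of exactly 0 or 1 would need special handling (for example a small pseudo-count), which this sketch leaves out.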
Statistical Methods for Unusual Count Data
DEFF Research Database (Denmark)
Guthrie, Katherine A.; Gammill, Hilary S.; Kamper-Jørgensen, Mads
2016-01-01
microchimerism data present challenges for statistical analysis, including a skewed distribution, excess zero values, and occasional large values. Methods for comparing microchimerism levels across groups while controlling for covariates are not well established. We compared statistical models for quantitative...... microchimerism values, applied to simulated data sets and 2 observed data sets, to make recommendations for analytic practice. Modeling the level of quantitative microchimerism as a rate via Poisson or negative binomial model with the rate of detection defined as a count of microchimerism genome equivalents per...
Nonequilibrium statistical mechanics ensemble method
Eu, Byung Chan
1998-01-01
In this monograph, nonequilibrium statistical mechanics is developed by means of ensemble methods on the basis of the Boltzmann equation, the generic Boltzmann equations for classical and quantum dilute gases, and a generalised Boltzmann equation for dense simple fluids. The theories are developed in forms parallel with the equilibrium Gibbs ensemble theory in a way fully consistent with the laws of thermodynamics. The generalised hydrodynamics equations are the integral part of the theory and describe the evolution of macroscopic processes in accordance with the laws of thermodynamics of systems far removed from equilibrium. Audience: This book will be of interest to researchers in the fields of statistical mechanics, condensed matter physics, gas dynamics, fluid dynamics, rheology, irreversible thermodynamics and nonequilibrium phenomena.
International Nuclear Information System (INIS)
Sabaton, M.; Viollet, P.L.; Darles, A.; Gland, H.
1980-07-01
The PANACH three-dimensional calculation code, developed from tests on a small-scale model and validated against full-scale measurement campaigns, was used to estimate three-dimensional statistics of plumes. As the calculation times make it impossible to run a calculation for each radiosonde sounding, a classification method was adopted. This method, developed by the French National Meteorological Office, is based on a double classification comprising basic classes in which the plumes are assumed to be dynamically similar and a sub-classification to take better account of the true moisture profiles. This statistical method was then applied to the case of 2 or 4 1300 MWe units fitted with natural draught cooling towers of the wet, dry or wet-dry types. [fr]
Lange, J.; O'Shaughnessy, R.; Boyle, M.; Calderón Bustillo, J.; Campanelli, M.; Chu, T.; Clark, J. A.; Demos, N.; Fong, H.; Healy, J.; Hemberger, D. A.; Hinder, I.; Jani, K.; Khamesra, B.; Kidder, L. E.; Kumar, P.; Laguna, P.; Lousto, C. O.; Lovelace, G.; Ossokine, S.; Pfeiffer, H.; Scheel, M. A.; Shoemaker, D. M.; Szilagyi, B.; Teukolsky, S.; Zlochower, Y.
2017-11-01
We present and assess a Bayesian method to interpret gravitational wave signals from binary black holes. Our method directly compares gravitational wave data to numerical relativity (NR) simulations. In this study, we present a detailed investigation of the systematic and statistical parameter estimation errors of this method. This procedure bypasses approximations used in semianalytical models for compact binary coalescence. In this work, we use the full posterior parameter distribution for only generic nonprecessing binaries, drawing inferences away from the set of NR simulations used, via interpolation of a single scalar quantity (the marginalized log likelihood, ln L ) evaluated by comparing data to nonprecessing binary black hole simulations. We also compare the data to generic simulations, and discuss the effectiveness of this procedure for generic sources. We specifically assess the impact of higher order modes, repeating our interpretation with both l ≤2 as well as l ≤3 harmonic modes. Using the l ≤3 higher modes, we gain more information from the signal and can better constrain the parameters of the gravitational wave signal. We assess and quantify several sources of systematic error that our procedure could introduce, including simulation resolution and duration; most are negligible. We show through examples that our method can recover the parameters for equal mass, zero spin, GW150914-like, and unequal mass, precessing spin sources. Our study of this new parameter estimation method demonstrates that we can quantify and understand the systematic and statistical error. This method allows us to use higher order modes from numerical relativity simulations to better constrain the black hole binary parameters.
Energy Technology Data Exchange (ETDEWEB)
Takamizawa, Hisashi, E-mail: takamizawa.hisashi@jaea.go.jp; Itoh, Hiroto, E-mail: ito.hiroto@jaea.go.jp; Nishiyama, Yutaka, E-mail: nishiyama.yutaka93@jaea.go.jp
2016-10-15
In order to understand neutron irradiation embrittlement in high fluence regions, statistical analysis using the Bayesian nonparametric (BNP) method was performed for the Japanese surveillance and material test reactor irradiation database. The BNP method is essentially expressed as an infinite summation of normal distributions, with input data being subdivided into clusters with identical statistical parameters, such as mean and standard deviation, for each cluster to estimate shifts in ductile-to-brittle transition temperature (DBTT). The clusters typically depend on chemical compositions, irradiation conditions, and the irradiation embrittlement. Specific variables contributing to the irradiation embrittlement include the content of Cu, Ni, P, Si, and Mn in the pressure vessel steels, neutron flux, neutron fluence, and irradiation temperatures. It was found that the measured shifts of DBTT correlated well with the calculated ones. Data associated with the same materials were subdivided into the same clusters even if neutron fluences were increased.
PAFit: A Statistical Method for Measuring Preferential Attachment in Temporal Complex Networks.
Directory of Open Access Journals (Sweden)
Thong Pham
Preferential attachment is a stochastic process that has been proposed to explain certain topological features characteristic of complex networks from diverse domains. The systematic investigation of preferential attachment is an important area of research in network science, not only for the theoretical matter of verifying whether this hypothesized process is operative in real-world networks, but also for the practical insights that follow from knowledge of its functional form. Here we describe a maximum likelihood based estimation method for the measurement of preferential attachment in temporal complex networks. We call the method PAFit, and implement it in an R package of the same name. PAFit constitutes an advance over previous methods primarily because we based it on a nonparametric statistical framework that enables attachment kernel estimation free of any assumptions about its functional form. We show that this results in PAFit outperforming the popular methods of Jeong and Newman in Monte Carlo simulations. What is more, we found that the application of PAFit to a publicly available Flickr social network dataset yielded clear evidence for a deviation of the attachment kernel from the popularly assumed log-linear form. Independent of our main work, we provide a correction to a consequential error in Newman's original method which had evidently gone unnoticed since its publication over a decade ago.
Using statistical sensitivities for adaptation of a best-estimate thermo-hydraulic simulation model
International Nuclear Information System (INIS)
Liu, X.J.; Kerner, A.; Schaefer, A.
2010-01-01
On-line adaptation of best-estimate simulations of NPP behaviour to time-dependent measurement data can be used to ensure that simulations performed in parallel to plant operation develop synchronously with the real plant behaviour even over extended periods of time. This opens a range of applications including operator support in non-standard situations and improved diagnostics and validation of measurements in real plants or experimental facilities. A number of adaptation methods have been proposed and successfully applied to control problems. However, these methods are difficult to apply to best-estimate thermal-hydraulic codes, such as TRACE and ATHLET, with their large nonlinear differential equation systems and sophisticated time integration techniques. This paper presents techniques that use statistical sensitivity measures to overcome those problems by reducing the number of parameters subject to adaptation. It describes how to identify the most significant parameters for adaptation and how this information can be used by combining: (1) decomposition techniques splitting the system into a small set of component parts with clearly defined interfaces where boundary conditions can be derived from the measurement data; (2) filtering techniques to ensure that the time frame for adaptation is meaningful; and (3) numerical sensitivities to find minimal error conditions. The suitability of combining those techniques is shown by application to an adaptive simulation of the PKL experiment.
Statistical significance of cis-regulatory modules
Directory of Open Access Journals (Sweden)
Smith Andrew D
2007-01-01
Background: It is becoming increasingly important for researchers to be able to scan through large genomic regions for transcription factor binding sites or clusters of binding sites forming cis-regulatory modules. Correspondingly, there has been a push to develop algorithms for the rapid detection and assessment of cis-regulatory modules. While various algorithms for this purpose have been introduced, most are not well suited for rapid, genome-scale scanning. Results: We introduce methods designed for the detection and statistical evaluation of cis-regulatory modules, modeled as either clusters of individual binding sites or as combinations of sites with constrained organization. In order to determine the statistical significance of module sites, we first need a method to determine the statistical significance of single transcription factor binding site matches. We introduce a straightforward method of estimating the statistical significance of single site matches using a database of known promoters to produce data structures that can be used to estimate p-values for binding site matches. We next introduce a technique to calculate the statistical significance of the arrangement of binding sites within a module using a max-gap model. If the module scanned for has defined organizational parameters, the probability of the module is corrected to account for organizational constraints. The statistical significance of single site matches and the architecture of sites within the module can be combined to provide an overall estimation of statistical significance of cis-regulatory module sites. Conclusion: The methods introduced in this paper allow for the detection and statistical evaluation of single transcription factor binding sites and cis-regulatory modules. The features described are implemented in the Search Tool for Occurrences of Regulatory Motifs (STORM) and MODSTORM software.
Asiri, Sharefa M.
2017-10-08
Partial Differential Equations (PDEs) are commonly used to model complex systems that arise, for example, in biology, engineering, chemistry, and elsewhere. The parameters (or coefficients) and the source of PDE models are often unknown and are estimated from available measurements. Despite its importance, solving the estimation problem is mathematically and numerically challenging, especially when the measurements are corrupted by noise, which is often the case. Various methods have been proposed to solve estimation problems in PDEs, which can be classified into optimization methods and recursive methods. The optimization methods are usually computationally heavy, especially when the number of unknowns is large. In addition, they are sensitive to the initial guess and stopping condition, and they suffer from a lack of robustness to noise. Recursive methods, such as observer-based approaches, are limited by their dependence on some structural properties such as observability and identifiability, which might be lost when approximating the PDE numerically. Moreover, most of these methods provide asymptotic estimates, which might not be useful for control applications, for example. An alternative non-asymptotic approach with less computational burden has been proposed in engineering fields based on the so-called modulating functions. In this dissertation, we propose to mathematically and numerically analyze the modulating functions based approaches. We also propose to extend these approaches to different situations. The contributions of this thesis are as follows. (i) Provide a mathematical analysis of the modulating function-based method (MFBM) which includes: its well-posedness, statistical properties, and estimation errors. (ii) Provide a numerical analysis of the MFBM through some estimation problems, and study the sensitivity of the method to the modulating functions' parameters. (iii) Propose an effective algorithm for selecting the method's design parameters
Comparison of four statistical and machine learning methods for crash severity prediction.
Iranitalab, Amirfarrokh; Khattak, Aemal
2017-11-01
Crash severity prediction models enable different agencies to predict the severity of a reported crash with unknown severity or the severity of crashes that may be expected to occur sometime in the future. This paper had three main objectives: comparison of the performance of four statistical and machine learning methods including Multinomial Logit (MNL), Nearest Neighbor Classification (NNC), Support Vector Machines (SVM) and Random Forests (RF), in predicting traffic crash severity; developing a crash costs-based approach for comparison of crash severity prediction methods; and investigating the effects of data clustering methods comprising K-means Clustering (KC) and Latent Class Clustering (LCC), on the performance of crash severity prediction models. The 2012-2015 reported crash data from Nebraska, United States was obtained and two-vehicle crashes were extracted as the analysis data. The dataset was split into training/estimation (2012-2014) and validation (2015) subsets. The four prediction methods were trained/estimated using the training/estimation dataset and the correct prediction rates for each crash severity level, overall correct prediction rate and a proposed crash costs-based accuracy measure were obtained for the validation dataset. The correct prediction rates and the proposed approach showed NNC had the best prediction performance in overall and in more severe crashes. RF and SVM had the next two sufficient performances and MNL was the weakest method. Data clustering did not affect the prediction results of SVM, but KC improved the prediction performance of MNL, NNC and RF, while LCC caused improvement in MNL and RF but weakened the performance of NNC. Overall correct prediction rate had almost the exact opposite results compared to the proposed approach, showing that neglecting the crash costs can lead to misjudgment in choosing the right prediction method. Copyright © 2017 Elsevier Ltd. All rights reserved.
Analysis of methods to estimate spring flows in a karst aquifer.
Sepúlveda, Nicasio
2009-01-01
Hydraulically and statistically based methods were analyzed to identify the most reliable method to predict spring flows in a karst aquifer. Measured water levels at nearby observation wells, measured spring pool altitudes, and the distance between observation wells and the spring pool were the parameters used to match measured spring flows. Measured spring flows at six Upper Floridan aquifer springs in central Florida were used to assess the reliability of these methods to predict spring flows. Hydraulically based methods involved the application of the Theis, Hantush-Jacob, and Darcy-Weisbach equations, whereas the statistically based methods were the multiple linear regressions and the technology of artificial neural networks (ANNs). Root mean square errors between measured and predicted spring flows using the Darcy-Weisbach method ranged between 5% and 15% of the measured flows, lower than the 7% to 27% range for the Theis or Hantush-Jacob methods. Flows at all springs were estimated to be turbulent based on the Reynolds number derived from the Darcy-Weisbach equation for conduit flow. The multiple linear regression and the Darcy-Weisbach methods had similar spring flow prediction capabilities. The ANNs provided the lowest residuals between measured and predicted spring flows, ranging from 1.6% to 5.3% of the measured flows. The model prediction efficiency criteria also indicated that the ANNs were the most accurate method predicting spring flows in a karst aquifer.
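As a rough illustration of the Darcy-Weisbach approach described in this abstract, the Python sketch below turns a head difference (well water level minus spring pool altitude) and a well-to-spring distance into a conduit velocity, a flow, and a Reynolds number. The single circular conduit, its diameter, the friction factor, and the turbulence threshold of 4000 are illustrative assumptions, not values from the study.

```python
import math

G = 9.81               # gravitational acceleration, m/s^2
NU_WATER = 1.0e-6      # kinematic viscosity of water, m^2/s (approximate)


def conduit_velocity(head_loss, length, diameter, friction_factor):
    """Solve h_f = f * (L / D) * v^2 / (2 g) for the mean velocity v (m/s)."""
    return math.sqrt(2.0 * G * head_loss * diameter / (friction_factor * length))


def spring_flow(head_loss, length, diameter, friction_factor):
    """Volumetric flow Q = v * A through a circular conduit (m^3/s)."""
    v = conduit_velocity(head_loss, length, diameter, friction_factor)
    return v * math.pi * diameter ** 2 / 4.0


def reynolds_number(velocity, diameter, nu=NU_WATER):
    """Re = v * D / nu; values well above about 4000 indicate turbulent flow."""
    return velocity * diameter / nu


# Hypothetical example: 1 m head difference over 100 m, 1 m conduit, f = 0.02.
v = conduit_velocity(head_loss=1.0, length=100.0, diameter=1.0, friction_factor=0.02)
is_turbulent = reynolds_number(v, 1.0) > 4000.0
```

In practice the conduit diameter and friction factor are unknown for a karst aquifer and would be treated as calibration parameters fitted to measured spring flows.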
A fast pulse phase estimation method for X-ray pulsar signals based on epoch folding
Directory of Open Access Journals (Sweden)
Xue Mengfan
2016-06-01
X-ray pulsar-based navigation (XPNAV) is an attractive method for autonomous deep-space navigation in the future. Pulse phase estimation is a key task in XPNAV, and its accuracy directly determines the navigation accuracy. State-of-the-art pulse phase estimation techniques either suffer from poor estimation accuracy or involve the maximization of a generally non-convex objective function, thus resulting in a large computational cost. In this paper, a fast pulse phase estimation method based on epoch folding is presented. The statistical properties of the observed profile obtained through epoch folding are developed. Based on this, we recognize the joint probability distribution of the observed profile as the likelihood function and utilize a fast Fourier transform-based procedure to estimate the pulse phase. The computational complexity of the proposed estimator is analyzed as well. Experimental results show that the proposed estimator significantly outperforms the currently used cross-correlation (CC) and nonlinear least squares (NLS) estimators, while significantly reducing the computational complexity compared with the NLS and maximum likelihood (ML) estimators.
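Epoch folding itself is simple to sketch in Python: photon arrival times are reduced modulo the pulse period and binned into an observed profile. The phase estimator shown afterwards is the plain cross-correlation (CC) baseline that the abstract compares against, written as a brute-force circular search; it is not the FFT-based likelihood estimator proposed in the paper. Function names and the bin count are hypothetical.

```python
def epoch_fold(arrival_times, period, n_bins):
    """Fold photon arrival times at a known period into a binned pulse profile."""
    profile = [0] * n_bins
    for t in arrival_times:
        phase = (t % period) / period              # pulse phase in [0, 1)
        index = min(int(phase * n_bins), n_bins - 1)
        profile[index] += 1
    return profile


def cc_phase_estimate(profile, template):
    """Circular cross-correlation estimate of the phase shift of the profile
    relative to the template, as a fraction of the period (bin resolution)."""
    n = len(profile)

    def score(shift):
        return sum(profile[(i + shift) % n] * template[i] for i in range(n))

    best_shift = max(range(n), key=score)
    return best_shift / n
```

The resolution of this baseline is limited to one bin width; finer estimates require interpolation or a likelihood-based approach such as the one the abstract describes.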
Boundary methods for mode estimation
Pierson, William E., Jr.; Ulug, Batuhan; Ahalt, Stanley C.
1999-08-01
This paper investigates the use of Boundary Methods (BMs), a collection of tools used for distribution analysis, as a method for estimating the number of modes associated with a given data set. Model order information of this type is required by several pattern recognition applications. The BM technique provides a novel approach to this parameter estimation problem and is comparable in terms of both accuracy and computations to other popular mode estimation techniques currently found in the literature and in automatic target recognition applications. This paper explains the methodology used in the BM approach to mode estimation. It also briefly reviews other common mode estimation techniques and describes the empirical investigation used to explore the relationship of the BM technique to other mode estimation techniques. Specifically, the accuracy and computational efficiency of the BM technique are compared quantitatively to a mixture of Gaussians (MOG) approach and a k-means approach to model order estimation. The stopping criterion of the MOG and k-means techniques is the Akaike Information Criterion (AIC).
Ha, Min Jin; Sun, Wei
2014-09-01
Motivated by the problem of construction of gene co-expression network, we propose a statistical framework for estimating high-dimensional partial correlation matrix by a three-step approach. We first obtain a penalized estimate of a partial correlation matrix using ridge penalty. Next we select the non-zero entries of the partial correlation matrix by hypothesis testing. Finally we re-estimate the partial correlation coefficients at these non-zero entries. In the second step, the null distribution of the test statistics derived from penalized partial correlation estimates has not been established. We address this challenge by estimating the null distribution from the empirical distribution of the test statistics of all the penalized partial correlation estimates. Extensive simulation studies demonstrate the good performance of our method. Application on a yeast cell cycle gene expression data shows that our method delivers better predictions of the protein-protein interactions than the Graphic Lasso. © 2014, The International Biometric Society.
Nateghi, Roshanak; Guikema, Seth D; Quiring, Steven M
2011-12-01
This article compares statistical methods for modeling power outage durations during hurricanes and examines the predictive accuracy of these methods. Being able to make accurate predictions of power outage durations is valuable because the information can be used by utility companies to plan their restoration efforts more efficiently. This information can also help inform customers and public agencies of the expected outage times, enabling better collective response planning, and coordination of restoration efforts for other critical infrastructures that depend on electricity. In the long run, outage duration estimates for future storm scenarios may help utilities and public agencies better allocate risk management resources to balance the disruption from hurricanes with the cost of hardening power systems. We compare the out-of-sample predictive accuracy of five distinct statistical models for estimating power outage duration times caused by Hurricane Ivan in 2004. The methods compared include both regression models (accelerated failure time (AFT) and Cox proportional hazard models (Cox PH)) and data mining techniques (regression trees, Bayesian additive regression trees (BART), and multivariate additive regression splines). We then validate our models against two other hurricanes. Our results indicate that BART yields the best prediction accuracy and that it is possible to predict outage durations with reasonable accuracy. © 2011 Society for Risk Analysis.
Sutton, Virginia Kay
This paper examines statistical issues associated with estimating paths of juvenile salmon through the intakes of Kaplan turbines. Passive sensors (hydrophones) detecting signals from ultrasonic transmitters implanted in individual fish released into the preturbine region were used to obtain the information needed to estimate fish paths through the intake. The aim and location of the sensors affect the spatial region in which the transmitters can be detected, and formulas relating this region to sensor aiming directions are derived. Cramer-Rao lower bounds for the variance of estimators of fish location are used to optimize placement of each sensor. Finally, a statistical methodology is developed for analyzing angular data collected from optimally placed sensors.
Brief guidelines for methods and statistics in medical research
Ab Rahman, Jamalludin
2015-01-01
This book serves as a practical guide to methods and statistics in medical research. It includes step-by-step instructions on using SPSS software for statistical analysis, as well as relevant examples to help those readers who are new to research in health and medical fields. Simple texts and diagrams are provided to help explain the concepts covered, and print screens for the statistical steps and the SPSS outputs are provided, together with interpretations and examples of how to report on findings. Brief Guidelines for Methods and Statistics in Medical Research offers a valuable quick reference guide for healthcare students and practitioners conducting research in health related fields, written in an accessible style.
Comparing two survey methods for estimating maternal and perinatal mortality in rural Cambodia.
Chandy, Hoeuy; Heng, Yang Van; Samol, Ha; Husum, Hans
2008-03-01
We need solid estimates of maternal mortality rates (MMR) to monitor the impact of maternal care programs. Cambodian health authorities and WHO report the MMR in Cambodia at 450 per 100,000 live births. The figure is drawn from surveys where information is obtained by interviewing respondents about the survival of all their adult sisters (sisterhood method). The estimate is statistically imprecise, with 95% confidence intervals ranging from 260 to 620/100,000. The MMR estimate is also uncertain due to under-reporting; where 80-90% of women deliver at home, maternal fatalities may go undetected, especially where mortality is highest, in remote rural areas. The aim of this study was to attain more reliable MMR estimates by using survey methods other than the sisterhood method prior to an intervention targeting obstetric rural emergencies. The study was carried out in rural Northwestern Cambodia where access to health services is poor and poverty, endemic diseases, and land mines are endemic. Two survey methods were applied in two separate sectors: a community-based survey gathering data from public sources and a household survey gathering data directly from primary sources. There was no statistically significant difference between the two survey results for maternal deaths; both types of survey reported mortality rates around the public figure. The household survey reported a significantly higher perinatal mortality rate than the community-based survey, 8.6% versus 5.0%. The household survey also gave qualitative data important for a better understanding of the many problems faced by mothers giving birth in the remote villages. There are detection failures in both surveys; the failure rate may be as high as 30-40%. PRINCIPAL CONCLUSION: Both survey methods are inaccurate and therefore inappropriate for evaluation of short-term changes of mortality rates. Surveys based on primary informants yield qualitative information about mothers' hardships important for the design
Directory of Open Access Journals (Sweden)
Seyedtabaee Saeed
2010-01-01
This paper deals with the configuration of an algorithm for a speech-passing, angle grinder noise-canceling headset. Angle grinder noise is annoying and interrupts ordinary oral communication, so the headset must operate under low-SNR conditions. Because variation in the angle grinder's working condition changes the noise statistics, the noise is nonstationary, with possible jumps in its power. Several candidate algorithms are studied. A modified version of the well-known spectral subtraction method shows superior performance against alternative methods. The noise estimate is computed through a multi-band, fast-adapting scheme. The algorithm adapts very quickly to the nonstationary noise environment while inflicting minimal musical noise and speech distortion on the processed signal. Objective and subjective measures illustrating the performance of the proposed method are introduced.
Statistical modeling and MAP estimation for body fat quantification with MRI ratio imaging
Wong, Wilbur C. K.; Johnson, David H.; Wilson, David L.
2008-03-01
We are developing small animal imaging techniques to characterize the kinetics of lipid accumulation/reduction of fat depots in response to genetic/dietary factors associated with obesity and metabolic syndromes. Recently, we developed an MR ratio imaging technique that approximately yields lipid/{lipid + water}. In this work, we develop a statistical model for the ratio distribution that explicitly includes a partial volume (PV) fraction of fat and a mixture of a Rician and multiple Gaussians. Monte Carlo hypothesis testing showed that our model was valid over a wide range of coefficient of variation of the denominator distribution (c.v.: 0-0.20) and correlation coefficient among the numerator and denominator (ρ: 0-0.95), which cover the typical values that we found in MRI data sets (c.v.: 0.027-0.063, ρ: 0.50-0.75). Then a maximum a posteriori (MAP) estimate for the fat percentage per voxel is proposed. Using a digital phantom with many PV voxels, we found that ratio values were not linearly related to PV fat content and that our method accurately described the histogram. In addition, the new method estimated the ground truth within +1.6% vs. +43% for an approach using an uncorrected ratio image, when we simply threshold the ratio image. On the six genetically obese rat data sets, the MAP estimate gave total fat volumes of 279 +/- 45 mL, values 21% smaller than those from the uncorrected ratio images, principally due to the non-linear PV effect. We conclude that our algorithm can increase the accuracy of fat volume quantification even in regions having many PV voxels, e.g. ectopic fat depots.
Amalia, Junita; Purhadi; Otok, Bambang Widjanarko
2017-11-01
The Poisson distribution is a discrete distribution for count data with a single parameter that defines both the mean and the variance. Poisson regression therefore assumes that the mean and variance are equal (equidispersion). Nonetheless, some count data violate this assumption because the variance exceeds the mean (over-dispersion). Ignoring over-dispersion leads to underestimated standard errors and, in turn, to incorrect decisions in statistical tests. Paired count data are correlated and follow a bivariate Poisson distribution. In the presence of over-dispersion, simple bivariate Poisson regression is not sufficient for modeling paired count data. The Bivariate Poisson Inverse Gaussian Regression (BPIGR) model is a mixed Poisson regression for modeling over-dispersed paired count data. The BPIGR model produces a single global model for all locations. However, each location has different geographic, social, cultural and economic conditions, so Geographically Weighted Regression (GWR) is needed. The weighting function of each location in GWR generates a different local model. The Geographically Weighted Bivariate Poisson Inverse Gaussian Regression (GWBPIGR) model is used to handle over-dispersion and to generate local models. The parameters of the GWBPIGR model are estimated by the Maximum Likelihood Estimation (MLE) method, and hypothesis testing for the model is carried out with the Maximum Likelihood Ratio Test (MLRT) method.
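The equidispersion assumption described in this abstract can be checked informally with a variance-to-mean ratio. The following is a minimal sketch, not the authors' procedure: the function name `dispersion_index` and both sample lists are hypothetical and invented purely for illustration.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Return the sample variance divided by the sample mean.

    Values near 1 are consistent with a Poisson model (equidispersion);
    values well above 1 suggest over-dispersion.
    """
    m = mean(counts)
    if m == 0:
        raise ValueError("mean is zero; dispersion index is undefined")
    return variance(counts) / m


# Hypothetical count data, invented for illustration only.
low_spread_counts = [2, 3, 1, 4, 2, 3, 2, 3]
bursty_counts = [0, 0, 1, 0, 12, 0, 9, 2]

print("low-spread index:", dispersion_index(low_spread_counts))
print("bursty index:", dispersion_index(bursty_counts))
```

A ratio well above 1, as for the second list, is the situation in which a mixed Poisson model such as the Poisson inverse Gaussian becomes relevant. This ratio is only a descriptive diagnostic and does not replace a formal test.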
Rodriguez, G.; Scheid, R. E., Jr.
1986-01-01
This paper outlines methods for modeling, identification and estimation for static determination of flexible structures. The shape estimation schemes are based on structural models specified by (possibly interconnected) elliptic partial differential equations. The identification techniques provide approximate knowledge of parameters in elliptic systems. The techniques are based on the method of maximum-likelihood that finds parameter values such that the likelihood functional associated with the system model is maximized. The estimation methods are obtained by means of a function-space approach that seeks to obtain the conditional mean of the state given the data and a white noise characterization of model errors. The solutions are obtained in a batch-processing mode in which all the data is processed simultaneously. After methods for computing the optimal estimates are developed, an analysis of the second-order statistics of the estimates and of the related estimation error is conducted. In addition to outlining the above theoretical results, the paper presents typical flexible structure simulations illustrating performance of the shape determination methods.
Directory of Open Access Journals (Sweden)
Brayan Alexander Fonseca Martinez
2017-11-01
One of the most commonly employed observational study designs in veterinary science is the cross-sectional study with binary outcomes. To measure an association with exposure, either prevalence ratios (PR) or odds ratios (OR) can be used. In human epidemiology, much has been discussed about the use of the OR exclusively for case–control studies, and some authors have reported that there is no good justification for fitting logistic regression when the prevalence of the disease is high, in which case the OR overestimates the PR. Nonetheless, interpretation of the OR is difficult, since confusion between risk and odds can lead to incorrect quantitative interpretation of data such as “the risk is X times greater,” commonly reported in studies that use the OR. The aims of this study were (1) to review articles with cross-sectional designs to assess the statistical method used and the appropriateness of the interpretation of the estimated measure of association and (2) to illustrate the use of alternative statistical methods that estimate the PR directly. An overview of statistical methods and their interpretation using the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines was conducted and included a diverse set of peer-reviewed journals in the veterinary science field, using PubMed as the search engine. From each article, the statistical method used and the appropriateness of the interpretation of the estimated measure of association were recorded. Additionally, four alternative models to logistic regression that estimate the PR directly were tested using our own dataset from a cross-sectional study on bovine viral diarrhea virus. The initial search strategy found 62 articles, of which 6 were excluded, and therefore 56 studies were used for the overall analysis. The review showed that, independent of the level of prevalence reported, 96% of articles employed logistic regression, thus estimating the OR. Results of the multivariate models
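The claim that the OR overestimates the PR when prevalence is high can be illustrated with a 2x2 table. This is a minimal sketch under stated assumptions: the two tables below are hypothetical numbers invented for illustration, not data from the study, and the function names are not taken from any library.

```python
def prevalence_ratio(a, b, c, d):
    """Prevalence ratio from a 2x2 table.

    a: exposed with outcome,   b: exposed without outcome,
    c: unexposed with outcome, d: unexposed without outcome.
    """
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a, b, c, d):
    """Odds ratio from the same 2x2 table layout."""
    return (a * d) / (b * c)


# Hypothetical tables, invented for illustration only.
common_outcome = (60, 40, 30, 70)  # prevalence 60% vs 30%
rare_outcome = (6, 94, 3, 97)      # prevalence 6% vs 3%

for label, table in [("common", common_outcome), ("rare", rare_outcome)]:
    print(label, "PR =", prevalence_ratio(*table), "OR =", odds_ratio(*table))
```

Both tables have the same prevalence ratio, but the odds ratio is close to it only when the outcome is rare. Reading the odds ratio of the common-outcome table as "the risk is X times greater" would overstate the association.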
Development of an unbiased statistical method for the analysis of unigenic evolution
Directory of Open Access Journals (Sweden)
Shilton Brian H
2006-03-01
Background: Unigenic evolution is a powerful genetic strategy involving random mutagenesis of a single gene product to delineate functionally important domains of a protein. This method involves selection of variants of the protein which retain function, followed by statistical analysis comparing expected and observed mutation frequencies of each residue. Resultant mutability indices for each residue are averaged across a specified window of codons to identify hypomutable regions of the protein. As originally described, the effect of changes to the length of this averaging window was not fully elucidated. In addition, it was unclear when sufficient functional variants had been examined to conclude that residues conserved in all variants have important functional roles. Results: We demonstrate that the length of the averaging window dramatically affects identification of individual hypomutable regions and delineation of region boundaries. Accordingly, we devised a region-independent chi-square analysis that eliminates the loss of information incurred during window averaging and removes the arbitrary assignment of window length. We also present a method to estimate the probability that conserved residues have not been mutated simply by chance. In addition, we describe an improved estimation of the expected mutation frequency. Conclusion: Overall, these methods significantly extend the analysis of unigenic evolution data over existing methods to allow comprehensive, unbiased identification of domains and possibly even individual residues that are essential for protein function.
Markov Chain Monte Carlo (MCMC) methods for parameter estimation of a novel hybrid redundant robot
International Nuclear Information System (INIS)
Wang Yongbo; Wu Huapeng; Handroos, Heikki
2011-01-01
This paper presents a statistical method for the calibration of a redundantly actuated hybrid serial-parallel robot, the IWR (Intersector Welding Robot). The robot under study will be used to carry out welding, machining, and remote handling for the assembly of the vacuum vessel of the International Thermonuclear Experimental Reactor (ITER). The robot has ten degrees of freedom (DOF), of which six are contributed by the parallel mechanism and the rest by the serial mechanism. In this paper, a kinematic error model involving 54 unknown geometrical error parameters is developed for the proposed robot. Based on this error model, the mean values of the unknown parameters are statistically analyzed and estimated by means of a Markov Chain Monte Carlo (MCMC) approach. The computer simulation is conducted by introducing random geometric errors and measurement poses which represent the corresponding real physical behaviors. The simulation results for the marginal posterior distributions of the estimated model parameters indicate that our method is reliable and robust.
Narayanan, Roshni; Nugent, Rebecca; Nugent, Kenneth
2015-10-01
Accreditation Council for Graduate Medical Education guidelines require internal medicine residents to develop skills in the interpretation of medical literature and to understand the principles of research. A necessary component is the ability to understand the statistical methods used and their results, material that is not an in-depth focus of most medical school curricula and residency programs. Given the breadth and depth of the current medical literature and an increasing emphasis on complex, sophisticated statistical analyses, the statistical foundation and education necessary for residents are uncertain. We reviewed the statistical methods and terms used in 49 articles discussed at the journal club in the Department of Internal Medicine residency program at Texas Tech University between January 1, 2013 and June 30, 2013. We collected information on the study type and on the statistical methods used for summarizing and comparing samples, determining the relations between independent variables and dependent variables, and estimating models. We then identified the typical statistics education level at which each term or method is learned. A total of 14 articles came from the Journal of the American Medical Association Internal Medicine, 11 from the New England Journal of Medicine, 6 from the Annals of Internal Medicine, 5 from the Journal of the American Medical Association, and 13 from other journals. Twenty reported randomized controlled trials. Summary statistics included mean values (39 articles), category counts (38), and medians (28). Group comparisons were based on t tests (14 articles), χ2 tests (21), and nonparametric ranking tests (10). The relations between dependent and independent variables were analyzed with simple regression (6 articles), multivariate regression (11), and logistic regression (8). Nine studies reported odds ratios with 95% confidence intervals, and seven analyzed test performance using sensitivity and specificity calculations
Statistical errors in Monte Carlo estimates of systematic errors
Roe, Byron P.
2007-01-01
For estimating the effects of a number of systematic errors on a data sample, one can generate Monte Carlo (MC) runs with systematic parameters varied and examine the change in the desired observed result. Two methods are often used. In the unisim method, the systematic parameters are varied one at a time by one standard deviation, each parameter corresponding to an MC run. In the multisim method (see ), each MC run has all of the parameters varied; the amount of variation is chosen from the expected distribution of each systematic parameter, usually assumed to be a normal distribution. The variance of the overall systematic error determination is derived for each of the two methods and comparisons are made between them. If one focuses not on the error in the prediction of an individual systematic error, but on the overall error due to all systematic errors in the error matrix element in data bin m, the number of events needed is strongly reduced because of the averaging effect over all of the errors. For the simple models presented here, the multisim model was far better if the statistical error in the MC samples was larger than an individual systematic error, while for the reverse case, the unisim model was better. Exact formulas and formulas for the simple toy models are presented so that realistic calculations can be made. The calculations in the present note are valid if the errors are in a linear region. If that region extends sufficiently far, one can have the unisims or multisims correspond to k standard deviations instead of one. This reduces the number of events required by a factor of $k^2$. The specific terms unisim and multisim were coined by Peter Meyers and Steve Brice, respectively, for the MiniBooNE experiment. However, the concepts have been developed over time and have been in general use for some time.
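The two procedures described in this abstract can be contrasted on a toy linear model. The sketch below is an illustration under strong assumptions, not the note's derivation: the response is taken to be exactly linear in the systematic parameters, the coefficients in `coeffs` are hypothetical, parameters are expressed in units of their standard deviation, and the MC statistical error that the note analyzes is deliberately left out.

```python
import random


def observable_shift(coeffs, params):
    """Shift of the observed result for a linear response model.

    params are systematic parameters in units of their standard deviation.
    """
    return sum(c * x for c, x in zip(coeffs, params))


def unisim_variance(coeffs):
    """Vary one parameter at a time by +1 sigma and sum the squared shifts."""
    total = 0.0
    for i in range(len(coeffs)):
        params = [0.0] * len(coeffs)
        params[i] = 1.0  # one run per systematic parameter
        total += observable_shift(coeffs, params) ** 2
    return total


def multisim_variance(coeffs, n_runs, rng):
    """Vary all parameters at once, drawn from unit normals, in each run."""
    shifts = []
    for _ in range(n_runs):
        params = [rng.gauss(0.0, 1.0) for _ in coeffs]
        shifts.append(observable_shift(coeffs, params))
    mean_shift = sum(shifts) / n_runs
    return sum((s - mean_shift) ** 2 for s in shifts) / (n_runs - 1)


# Hypothetical sensitivity coefficients, invented for illustration only.
coeffs = [0.5, -1.0, 2.0]
rng = random.Random(12345)

print("unisim variance:", unisim_variance(coeffs))
print("multisim variance:", multisim_variance(coeffs, 20000, rng))
```

In this noise-free linear setting both estimates target the same quantity, the sum of squared coefficients. The trade-off discussed in the note appears only once each run also carries its own statistical error, which this sketch omits.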
Directory of Open Access Journals (Sweden)
Zaira M Alieva
2016-01-01
The article analyzes the application of mathematical and statistical methods to the analysis of socio-humanistic texts. It describes the essence of mathematical and statistical methods and presents examples of their use in the study of the humanities and social phenomena. It considers the key issues faced by an expert applying mathematical and statistical methods in the socio-humanitarian sphere, including the persistent opposition between the socio-humanitarian sciences and mathematics, the difficulty of isolating the object that carries the problem, and the need to use a probabilistic approach. Conclusions are drawn from the results of the study.
Cutting-edge statistical methods for a life-course approach.
Bub, Kristen L; Ferretti, Larissa K
2014-01-01
Advances in research methods, data collection and record keeping, and statistical software have substantially increased our ability to conduct rigorous research across the lifespan. In this article, we review a set of cutting-edge statistical methods that life-course researchers can use to rigorously address their research questions. For each technique, we describe the method, highlight the benefits and unique attributes of the strategy, offer a step-by-step guide on how to conduct the analysis, and illustrate the technique using data from the National Institute of Child Health and Human Development Study of Early Child Care and Youth Development. In addition, we recommend a set of technical and empirical readings for each technique. Our goal was not to address a substantive question of interest but instead to provide life-course researchers with a useful reference guide to cutting-edge statistical methods.
Estimation of selected seasonal streamflow statistics representative of 1930-2002 in West Virginia
Wiley, Jeffrey B.; Atkins, John T.
2010-01-01
Regional equations and procedures were developed for estimating seasonal 1-day 10-year, 7-day 10-year, and 30-day 5-year hydrologically based low-flow frequency values for unregulated streams in West Virginia. Regional equations and procedures also were developed for estimating the seasonal U.S. Environmental Protection Agency harmonic-mean flows and the 50-percent flow-duration values. The seasons were defined as winter (January 1-March 31), spring (April 1-June 30), summer (July 1-September 30), and fall (October 1-December 31). Regional equations were developed using ordinary least squares regression using statistics from 117 U.S. Geological Survey continuous streamgage stations as dependent variables and basin characteristics as independent variables. Equations for three regions in West Virginia-North, South-Central, and Eastern Panhandle Regions-were determined. Drainage area, average annual precipitation, and longitude of the basin centroid are significant independent variables in one or more of the equations. The average standard error of estimates for the equations ranged from 12.6 to 299 percent. Procedures developed to estimate the selected seasonal streamflow statistics in this study are applicable only to rural, unregulated streams within the boundaries of West Virginia that have independent variables within the limits of the stations used to develop the regional equations: drainage area from 16.3 to 1,516 square miles in the North Region, from 2.78 to 1,619 square miles in the South-Central Region, and from 8.83 to 3,041 square miles in the Eastern Panhandle Region; average annual precipitation from 42.3 to 61.4 inches in the South-Central Region and from 39.8 to 52.9 inches in the Eastern Panhandle Region; and longitude of the basin centroid from 79.618 to 82.023 decimal degrees in the North Region. All estimates of seasonal streamflow statistics are representative of the period from the 1930 to the 2002 climatic year.
International Nuclear Information System (INIS)
Ishikawa, Nao; Tagami, Keiko; Uchida, Shigeo
2009-01-01
Soil-to-plant transfer factor (TF) is one of the important parameters in radiation dose assessment models for the environmental transfer of radionuclides. Since TFs are affected by several factors, including radionuclides, plant species and soil properties, development of a method for estimation of TF using some soil and plant properties would be useful. In this study, we took a statistical approach to estimating the TF of stable strontium ($\mathrm{TF}_{\mathrm{Sr}}$), which was used as an analogue of $^{90}\mathrm{Sr}$, from selected soil properties and element concentrations in plants. We collected the plant and soil samples used for the study from 142 agricultural fields throughout Japan. We applied a multiple linear regression analysis in order to get an empirical equation to estimate $\mathrm{TF}_{\mathrm{Sr}}$. $\mathrm{TF}_{\mathrm{Sr}}$ could be estimated from the Sr concentration in soil ($C_{\mathrm{Sr}}^{\mathrm{soil}}$) and the Ca concentration in crop ($C_{\mathrm{Ca}}^{\mathrm{crop}}$) using the following equation: $\log \mathrm{TF}_{\mathrm{Sr}} = -0.88 \cdot \log C_{\mathrm{Sr}}^{\mathrm{soil}} + 0.93 \cdot \log C_{\mathrm{Ca}}^{\mathrm{crop}} - 2.53$. Then, we replaced our data with Ca concentrations in crops from a food composition database compiled by the Japanese government. Finally, we predicted $\mathrm{TF}_{\mathrm{Sr}}$ using the Sr concentration in soil from our data and the Ca concentration in crops from the database of food composition. (author)
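The empirical equation quoted in this abstract can be evaluated directly. The following is a minimal sketch: the function name `estimate_tf_sr` is hypothetical, the logarithm is assumed to be base 10 (the abstract does not state the base), and the inputs must be in the concentration units of the original study, which the abstract does not give.

```python
import math


def estimate_tf_sr(c_sr_soil, c_ca_crop):
    """Estimate the soil-to-plant transfer factor of stable Sr.

    Implements log TF_Sr = -0.88 log C_Sr_soil + 0.93 log C_Ca_crop - 2.53.
    Assumption: log is base 10; units must match the original study.
    """
    if c_sr_soil <= 0 or c_ca_crop <= 0:
        raise ValueError("concentrations must be positive")
    log_tf = -0.88 * math.log10(c_sr_soil) + 0.93 * math.log10(c_ca_crop) - 2.53
    return 10.0 ** log_tf


# Placeholder inputs, invented for illustration only (not measured values).
print(estimate_tf_sr(100.0, 1000.0))
print(estimate_tf_sr(200.0, 1000.0))
```

The signs of the coefficients mean that the estimated transfer factor falls as the soil Sr concentration rises and increases with the Ca concentration in the crop.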
Estimation Methods for Non-Homogeneous Regression - Minimum CRPS vs Maximum Likelihood
Gebetsberger, Manuel; Messner, Jakob W.; Mayr, Georg J.; Zeileis, Achim
2017-04-01
Non-homogeneous regression models are widely used to statistically post-process numerical weather prediction models. Such regression models correct for errors in mean and variance and are capable of forecasting a full probability distribution. To estimate the corresponding regression coefficients, CRPS minimization has been used in many meteorological post-processing studies over the last decade. In contrast to maximum likelihood estimation, CRPS minimization is claimed to yield better-calibrated forecasts. Theoretically, both scoring rules, when used as an optimization score, should be able to locate a similar and unknown optimum. Discrepancies might result from a wrong distributional assumption for the observed quantity. To address this theoretical concept, this study compares maximum likelihood and minimum CRPS estimation for different distributional assumptions. First, a synthetic case study shows that, for an appropriate distributional assumption, both estimation methods yield similar regression coefficients. The log-likelihood estimator is slightly more efficient. A real-world case study for surface temperature forecasts at different sites in Europe confirms these results but shows that surface temperature does not always follow the classical assumption of a Gaussian distribution. KEYWORDS: ensemble post-processing, maximum likelihood estimation, CRPS minimization, probabilistic temperature forecasting, distributional regression models
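The two optimization scores compared in this abstract can be written down for a single Gaussian forecast. This is a minimal sketch, not the authors' implementation: it uses the standard closed-form CRPS for a normal distribution, which is a known result but is not stated in the abstract, and all function names are hypothetical.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def crps_normal(y, mu, sigma):
    """Closed-form CRPS of a N(mu, sigma^2) forecast for observation y."""
    z = (y - mu) / sigma
    return sigma * (
        z * (2.0 * normal_cdf(z) - 1.0)
        + 2.0 * normal_pdf(z)
        - 1.0 / math.sqrt(math.pi)
    )


def neg_log_lik_normal(y, mu, sigma):
    """Negative log-likelihood of y under N(mu, sigma^2)."""
    z = (y - mu) / sigma
    return 0.5 * math.log(2.0 * math.pi) + math.log(sigma) + 0.5 * z * z


# Both scores are lower for a forecast centered on the observation.
print(crps_normal(0.0, 0.0, 1.0), crps_normal(3.0, 0.0, 1.0))
print(neg_log_lik_normal(0.0, 0.0, 1.0), neg_log_lik_normal(3.0, 0.0, 1.0))
```

Averaging either score over a training set and minimizing it with respect to the regression coefficients that determine `mu` and `sigma` gives the two estimators the study compares. The CRPS grows roughly linearly in the forecast error while the negative log-likelihood grows quadratically, which is one reason the two can disagree when the distributional assumption is wrong.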
Jerez, José M; Molina, Ignacio; García-Laencina, Pedro J; Alba, Emilio; Ribelles, Nuria; Martín, Miguel; Franco, Leonardo
2010-10-01
Missing data imputation is an important task in cases where it is crucial to use all available data and not discard records with missing values. This work evaluates the performance of several statistical and machine learning imputation methods that were used to predict recurrence in patients in an extensive real breast cancer data set. Imputation methods based on statistical techniques, e.g., mean, hot-deck and multiple imputation, and machine learning techniques, e.g., multi-layer perceptron (MLP), self-organisation maps (SOM) and k-nearest neighbour (KNN), were applied to data collected through the "El Álamo-I" project, and the results were then compared to those obtained from the listwise deletion (LD) imputation method. The database includes demographic, therapeutic and recurrence-survival information from 3679 women with operable invasive breast cancer diagnosed in 32 different hospitals belonging to the Spanish Breast Cancer Research Group (GEICAM). The accuracies of predictions on early cancer relapse were measured using artificial neural networks (ANNs), in which different ANNs were estimated using the data sets with imputed missing values. The imputation methods based on machine learning algorithms outperformed imputation statistical methods in the prediction of patient outcome. Friedman's test revealed a significant difference (p=0.0091) in the observed area under the ROC curve (AUC) values, and the pairwise comparison test showed that the AUCs for MLP, KNN and SOM were significantly higher (p=0.0053, p=0.0048 and p=0.0071, respectively) than the AUC from the LD-based prognosis model. The methods based on machine learning techniques were the most suited for the imputation of missing values and led to a significant enhancement of prognosis accuracy compared to imputation methods based on statistical procedures. Copyright © 2010 Elsevier B.V. All rights reserved.
A robust statistical method for association-based eQTL analysis.
Directory of Open Access Journals (Sweden)
Ning Jiang
It is well established that the theoretical kernel of the recently surging genome-wide association study (GWAS) is statistical inference of linkage disequilibrium (LD) between a tested genetic marker and a putative locus affecting a disease trait. However, LD analysis is vulnerable to several confounding factors, of which population stratification is the most prominent. Whilst many methods have been proposed to correct for this influence, either by predicting the structure parameters or by correcting inflation in the test statistic due to the stratification, these may not be feasible or may impose further statistical problems in practical implementation. We propose here a novel statistical method to control spurious LD in GWAS arising from population structure by incorporating a control marker into testing for significance of genetic association of a polymorphic marker with phenotypic variation of a complex trait. The method avoids the need for structure prediction, which may be infeasible or inadequate in practice, and accounts properly for a varying effect of population stratification on different regions of the genome under study. Utility and statistical properties of the new method were tested through an intensive computer simulation study and an association-based genome-wide mapping of expression quantitative trait loci in genetically divergent human populations. The analyses show that the new method confers improved statistical power for detecting genuine genetic association in subpopulations and effective control of spurious associations stemming from population structure when compared with two other widely implemented methods in the GWAS literature.
Induction of micronuclei in hemocytes of Mytilus edulis and statistical analysis
DEFF Research Database (Denmark)
Wrisberg, M. N.; Bilbo, Carl M.; Spliid, Henrik
1992-01-01
biological variation, emphasizing the importance of application of a correct statistical method. A systematic approach to the statistical evaluation of the mussel MN test is outlined. The statistical model includes three different situations: (a) estimation of parameters of a single sample, (b) estimation...
Bayesian approach to inverse statistical mechanics
Habeck, Michael
2014-05-01
Inverse statistical mechanics aims to determine particle interactions from ensemble properties. This article looks at this inverse problem from a Bayesian perspective and discusses several statistical estimators to solve it. In addition, a sequential Monte Carlo algorithm is proposed that draws the interaction parameters from their posterior probability distribution. The posterior probability involves an intractable partition function that is estimated along with the interactions. The method is illustrated for inverse problems of varying complexity, including the estimation of a temperature, the inverse Ising problem, maximum entropy fitting, and the reconstruction of molecular interaction potentials.
Directory of Open Access Journals (Sweden)
Christopher J Paciorek
We present a gridded 8 km-resolution data product of the estimated composition of tree taxa at the time of Euro-American settlement of the northeastern United States and the statistical methodology used to produce the product from trees recorded by land surveyors. Composition is defined as the proportion of stems larger than approximately 20 cm diameter at breast height for 22 tree taxa, generally at the genus level. The data come from settlement-era public survey records that are transcribed and then aggregated spatially, giving count data. The domain is divided into two regions, eastern (Maine to Ohio) and midwestern (Indiana to Minnesota). Public Land Survey point data in the midwestern region (ca. 0.8-km resolution) are aggregated to a regular 8 km grid, while data in the eastern region, from Town Proprietor Surveys, are aggregated at the township level in irregularly-shaped local administrative units. The product is based on a Bayesian statistical model fit to the count data that estimates composition on the 8 km grid across the entire domain. The statistical model is designed to handle data from both the regular grid and the irregularly-shaped townships and allows us to estimate composition at locations with no data and to smooth over noise caused by limited counts in locations with data. Critically, the model also allows us to quantify uncertainty in our composition estimates, making the product suitable for applications employing data assimilation. We expect this data product to be useful for understanding the state of vegetation in the northeastern United States prior to large-scale Euro-American settlement. In addition to specific regional questions, the data product can also serve as a baseline against which to investigate how forests and ecosystems change after intensive settlement. The data product is being made available at the NIS data portal as version 1.0.
Estimation of subcriticality of TCA using 'indirect estimation method for calculation error'
International Nuclear Information System (INIS)
Naito, Yoshitaka; Yamamoto, Toshihiro; Arakawa, Takuya; Sakurai, Kiyoshi
1996-01-01
To estimate the subcriticality of neutron multiplication factor in a fissile system, 'Indirect Estimation Method for Calculation Error' is proposed. This method obtains the calculational error of neutron multiplication factor by correlating measured values with the corresponding calculated ones. This method was applied to the source multiplication and to the pulse neutron experiments conducted at TCA, and the calculation error of MCNP 4A was estimated. In the source multiplication method, the deviation of measured neutron count rate distributions from the calculated ones estimates the accuracy of calculated k_eff. In the pulse neutron method, the calculation errors of prompt neutron decay constants give the accuracy of the calculated k_eff. (author)
On systematic and statistic errors in radionuclide mass activity estimation procedure
International Nuclear Information System (INIS)
Smelcerovic, M.; Djuric, G.; Popovic, D.
1989-01-01
One of the most important requirements during nuclear accidents is the fast estimation of the mass activity of the radionuclides that suddenly and without control reach the environment. The paper points to systematic errors in the procedures of sampling, sample preparation and measurement itself, which contribute to a high degree to the total mass activity evaluation error. Statistical errors in gamma spectrometry as well as in total mass alpha and beta activity evaluation are also discussed. In addition, some of the possible sources of errors in the partial mass activity evaluation for some of the radionuclides are presented. The contribution of the errors to the total mass activity evaluation error is estimated and procedures that could possibly reduce it are discussed. (author)
Statistical analysis of maximum likelihood estimator images of human brain FDG PET studies
International Nuclear Information System (INIS)
Llacer, J.; Veklerov, E.; Hoffman, E.J.; Nunez, J.; Coakley, K.J.
1993-01-01
The work presented in this paper evaluates the statistical characteristics of regional bias and expected error in reconstructions of real PET data of human brain fluorodeoxyglucose (FDG) studies carried out by the maximum likelihood estimator (MLE) method with a robust stopping rule, and compares them with the results of filtered backprojection (FBP) reconstructions and with the method of sieves. The task that the authors have investigated is that of quantifying radioisotope uptake in regions-of-interest (ROIs). They first describe a robust methodology for the use of the MLE method with clinical data which contains only one adjustable parameter: the kernel size for a Gaussian filtering operation that determines final resolution and expected regional error. Simulation results are used to establish the fundamental characteristics of the reconstructions obtained by our methodology, corresponding to the case in which the transition matrix is perfectly known. Then, data from 72 independent human brain FDG scans from four patients are used to show that the results obtained from real data are consistent with the simulation, although the quality of the data and of the transition matrix have an effect on the final outcome.
Computational statistics handbook with Matlab
Martinez, Wendy L
2007-01-01
Prefaces Introduction What Is Computational Statistics? An Overview of the Book Probability Concepts Introduction Probability Conditional Probability and Independence Expectation Common Distributions Sampling Concepts Introduction Sampling Terminology and Concepts Sampling Distributions Parameter Estimation Empirical Distribution Function Generating Random Variables Introduction General Techniques for Generating Random Variables Generating Continuous Random Variables Generating Discrete Random Variables Exploratory Data Analysis Introduction Exploring Univariate Data Exploring Bivariate and Trivariate Data Exploring Multidimensional Data Finding Structure Introduction Projecting Data Principal Component Analysis Projection Pursuit EDA Independent Component Analysis Grand Tour Nonlinear Dimensionality Reduction Monte Carlo Methods for Inferential Statistics Introduction Classical Inferential Statistics Monte Carlo Methods for Inferential Statist...
van de Glind, Esther M. M.; Willems, Hanna C.; Eslami, Saeid; Abu-Hanna, Ameen; Lems, Willem F.; Hooft, Lotty; de Rooij, Sophia E.; Black, Dennis M.; van Munster, Barbara C.
2016-01-01
Background For physicians dealing with patients with a limited life expectancy, knowing the time to benefit (TTB) of preventive medication is essential to support treatment decisions. Objective The aim of this study was to investigate the usefulness of statistical process control (SPC) for determining the TTB in relation to fracture risk with alendronate versus placebo in postmenopausal women. Methods We performed a post-hoc analysis of the Fracture Intervention Trial (FIT), a randomized, con...
Sampling designs and methods for estimating fish-impingement losses at cooling-water intakes
International Nuclear Information System (INIS)
Murarka, I.P.; Bodeau, D.J.
1977-01-01
Several systems for estimating fish impingement at power plant cooling-water intakes are compared to determine the most statistically efficient sampling designs and methods. Compared to a simple random sampling scheme the stratified systematic random sampling scheme, the systematic random sampling scheme, and the stratified random sampling scheme yield higher efficiencies and better estimators for the parameters in two models of fish impingement as a time-series process. Mathematical results and illustrative examples of the applications of the sampling schemes to simulated and real data are given. Some sampling designs applicable to fish-impingement studies are presented in appendixes
Application of statistical method for FBR plant transient computation
International Nuclear Information System (INIS)
Kikuchi, Norihiro; Mochizuki, Hiroyasu
2014-01-01
Highlights: • A statistical method with a large trial number up to 10,000 is applied to the plant system analysis. • A turbine trip test conducted at the “Monju” reactor is selected as a plant transient. • A reduction method of trial numbers is discussed. • The result with reduced trial number can express the base regions of the computed distribution. -- Abstract: It is obvious that design tolerances, errors included in operation, and statistical errors in empirical correlations affect the transient behavior. The purpose of the present study is to apply the above-mentioned statistical errors to a plant system computation in order to evaluate the statistical distribution contained in the transient evolution. A selected computation case is the turbine trip test conducted at 40% electric power of the prototype fast reactor “Monju”. All of the heat transport systems of “Monju” are modeled with the NETFLOW++ system code, which has been validated using the plant transient tests of the experimental fast reactor Joyo and of “Monju”. The effects of parameters on upper plenum temperature are confirmed by sensitivity analyses, and dominant parameters are chosen. The statistical errors are applied to each computation deck by using a pseudorandom number and the Monte-Carlo method. The dSFMT (Double precision SIMD-oriented Fast Mersenne Twister), which is a developed version of the Mersenne Twister (MT), is adopted as the pseudorandom number generator. In the present study, uniform random numbers are generated by dSFMT, and these random numbers are transformed to the normal distribution by the Box–Muller method. Ten thousand different computations are performed at once. In every computation case, the steady calculation is performed for 12,000 s, and the transient calculation is performed for 4000 s. For the purpose of the present statistical computation, it is important that the base regions of distribution functions be calculated precisely. A large number of
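The sampling step described in this abstract (uniform pseudorandom numbers transformed to normal deviates by the Box–Muller method) can be sketched as follows. This is only an illustration under stated assumptions: Python's standard `random` module uses the Mersenne Twister (MT19937) rather than dSFMT, and the function names, the nominal value and the standard deviation are hypothetical, not taken from the study.

```python
import math
import random


def box_muller(u1, u2):
    """Map two uniform variates (u1 in (0, 1], u2 in [0, 1)) to two
    independent standard normal variates."""
    r = math.sqrt(-2.0 * math.log(u1))
    theta = 2.0 * math.pi * u2
    return r * math.cos(theta), r * math.sin(theta)


def perturbed_inputs(nominal, sigma, n_trials, seed=0):
    """Draw n_trials normally distributed values around a nominal input.

    Hypothetical helper: one such list per uncertain parameter would feed
    the individual computation decks of a Monte Carlo study.
    """
    rng = random.Random(seed)  # MT19937, used here in place of dSFMT
    values = []
    while len(values) < n_trials:
        u1 = 1.0 - rng.random()  # shift [0, 1) to (0, 1] so log(u1) is finite
        u2 = rng.random()
        z0, z1 = box_muller(u1, u2)
        values.extend((nominal + sigma * z0, nominal + sigma * z1))
    return values[:n_trials]
```

Each call to `box_muller` consumes two uniform numbers and yields two normal deviates, so a run of 10,000 trials needs 10,000 uniform draws per perturbed parameter.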
Directory of Open Access Journals (Sweden)
K. R. Gupta
2010-01-01
Three simple, precise and economical UV methods have been developed for the estimation of itopride hydrochloride in pharmaceutical formulations. Itopride hydrochloride in distilled water shows the maximum absorbance at 258.0 nm (Method A), and the first-order derivative spectrum of the same shows a sharp peak at 247.0 nm, when n = 1 (Method B). Method C utilises the area under the curve (AUC) in the wavelength range 262.0-254.0 nm for analysis of itopride hydrochloride. The drug was found to obey Beer-Lambert’s law in the concentration range of 5-50 μg/mL for all three proposed methods. Results of the analysis were validated statistically and recovery studies were found to be satisfactory.
Heuristic introduction to estimation methods
International Nuclear Information System (INIS)
Feeley, J.J.; Griffith, J.M.
1982-08-01
The methods and concepts of optimal estimation and control have been very successfully applied in the aerospace industry during the past 20 years. Although similarities exist between the problems (control, modeling, measurements) in the aerospace and nuclear power industries, the methods and concepts have found only scant acceptance in the nuclear industry. Differences in technical language seem to be a major reason for the slow transfer of estimation and control methods to the nuclear industry. Therefore, this report was written to present certain important and useful concepts with a minimum of specialized language. By employing a simple example throughout the report, the importance of several information and uncertainty sources is stressed and optimal ways of using or allowing for these sources are presented. This report discusses optimal estimation problems. A future report will discuss optimal control problems
Statistical theory and inference
Olive, David J
2014-01-01
This text is for a one-semester graduate course in statistical theory and covers minimal and complete sufficient statistics, maximum likelihood estimators, method of moments, bias and mean square error, uniform minimum variance estimators and the Cramer-Rao lower bound, an introduction to large sample theory, likelihood ratio tests and uniformly most powerful tests and the Neyman-Pearson lemma. A major goal of this text is to make these topics much more accessible to students by using the theory of exponential families. Exponential families, indicator functions and the support of the distribution are used throughout the text to simplify the theory. More than 50 "brand name" distributions are used to illustrate the theory with many examples of exponential families, maximum likelihood estimators and uniformly minimum variance unbiased estimators. There are many homework problems with over 30 pages of solutions.
Essaky, El; Vives, Josep
2016-01-01
This book is the outcome of the CIMPA School on Statistical Methods and Applications in Insurance and Finance, held in Marrakech and Kelaat M'gouna (Morocco) in April 2013. It presents two lectures and seven refereed papers from the school, offering the reader important insights into key topics. The first of the lectures, by Frederic Viens, addresses risk management via hedging in discrete and continuous time, while the second, by Boualem Djehiche, reviews statistical estimation methods applied to life and disability insurance. The refereed papers offer diverse perspectives and extensive discussions on subjects including optimal control, financial modeling using stochastic differential equations, pricing and hedging of financial derivatives, and sensitivity analysis. Each chapter of the volume includes a comprehensive bibliography to promote further research.
Statistical methods for evaluating the attainment of cleanup standards
Energy Technology Data Exchange (ETDEWEB)
Gilbert, R.O.; Simpson, J.C.
1992-12-01
This document is the third volume in a series of volumes sponsored by the US Environmental Protection Agency (EPA), Statistical Policy Branch, that provide statistical methods for evaluating the attainment of cleanup standards at Superfund sites. Volume 1 (USEPA 1989a) provides sampling designs and tests for evaluating attainment of risk-based standards for soils and solid media. Volume 2 (USEPA 1992) provides designs and tests for evaluating attainment of risk-based standards for groundwater. The purpose of this third volume is to provide statistical procedures for designing sampling programs and conducting statistical tests to determine whether pollution parameters in remediated soils and solid media at Superfund sites attain site-specific reference-based standards. This document is written for individuals who may not have extensive training or experience with statistical methods. The intended audience includes EPA regional remedial project managers, Superfund-site potentially responsible parties, state environmental protection agencies, and contractors for these groups.
The Kernel Estimation in Biosystems Engineering
Directory of Open Access Journals (Sweden)
Esperanza Ayuga Téllez
2008-04-01
In many fields of biosystems engineering, it is common to find works in which the statistical information analysed violates the basic hypotheses required by conventional forecasting methods. For those situations, it is necessary to find alternative methods that allow statistical analysis despite those violations. Non-parametric function estimation includes methods that fit a target function locally, using data from a small neighbourhood of the point. Weak assumptions, such as continuity and differentiability of the target function, are used rather than an "a priori" assumption about the global shape of the target function (e.g., linear or quadratic). In this paper a few basic decision rules are stated for the application of the non-parametric estimation method. These statistical rules are the first step towards building a user-method interface for the consistent application of kernel estimation by non-expert users. To reach this aim, univariate and multivariate estimation methods and density functions were analysed, as well as regression estimators. In some cases the models to be applied in different situations, based on simulations, were defined. Different biosystems engineering applications of kernel estimation are also analysed in this review.
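As an illustration of the local fitting idea in this abstract, the following sketch shows a univariate kernel density estimate and a Nadaraya-Watson kernel regression estimate. It assumes a Gaussian kernel and a fixed bandwidth `h`; the function names are hypothetical and the code is not taken from the reviewed paper.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used as the smoothing kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kde(x, sample, h):
    """Kernel density estimate at x from `sample` with bandwidth h."""
    return sum(gaussian_kernel((x - xi) / h) for xi in sample) / (len(sample) * h)


def nadaraya_watson(x, xs, ys, h):
    """Kernel regression estimate of E[y | x]: a weighted mean of ys in
    which observations near x receive the largest weights."""
    weights = [gaussian_kernel((x - xi) / h) for xi in xs]
    total = sum(weights)  # may underflow to 0.0 if x is far from every xi
    return sum(w * y for w, y in zip(weights, ys)) / total
```

The bandwidth `h` controls how small the "neighbourhood of the point" is: small values follow the data closely, large values smooth more strongly.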
Modulating functions method for parameters estimation in the fifth order KdV equation
Asiri, Sharefa M.
2017-07-25
In this work, the modulating functions method is proposed for estimating coefficients in a higher-order nonlinear partial differential equation, the fifth-order Korteweg-de Vries (KdV) equation. The proposed method transforms the problem into a system of linear algebraic equations in the unknowns. The statistical properties of the modulating functions solution are described in this paper. In addition, guidelines for choosing the number of modulating functions, which is an important design parameter, are provided. The effectiveness and robustness of the proposed method are shown through numerical simulations in both noise-free and noisy cases.
Statistical inference for financial engineering
Taniguchi, Masanobu; Ogata, Hiroaki; Taniai, Hiroyuki
2014-01-01
This monograph provides the fundamentals of statistical inference for financial engineering and covers some selected methods suitable for analyzing financial time series data. In order to describe the actual financial data, various stochastic processes, e.g. non-Gaussian linear processes, non-linear processes, long-memory processes, locally stationary processes etc. are introduced and their optimal estimation is considered as well. This book also includes several statistical approaches, e.g., discriminant analysis, the empirical likelihood method, control variate method, quantile regression, realized volatility etc., which have been recently developed and are considered to be powerful tools for analyzing the financial data, establishing a new bridge between time series and financial engineering. This book is well suited as a professional reference book on finance, statistics and statistical financial engineering. Readers are expected to have an undergraduate-level knowledge of statistics.
Statistical methods for accurately determining criticality code bias
International Nuclear Information System (INIS)
Trumble, E.F.; Kimball, K.D.
1997-01-01
A system of statistically treating validation calculations for the purpose of determining computer code bias is provided in this paper. The following statistical treatments are described: weighted regression analysis, lower tolerance limit, lower tolerance band, and lower confidence band. These methods meet the criticality code validation requirements of ANS 8.1. 8 refs., 5 figs., 4 tabs
Statistical inferences for bearings life using sudden death test
Directory of Open Access Journals (Sweden)
Morariu Cristin-Olimpiu
2017-01-01
In this paper we propose a calculation method for estimating reliability indicators and a complete statistical inference for the three-parameter Weibull distribution of bearing life. Using experimental values of the durability of bearings tested on stands by sudden death tests involves a series of particularities in maximum likelihood estimation and in carrying out the statistical inference. The paper details these features and also provides an example calculation.
Advanced data analysis in neuroscience integrating statistical and computational models
Durstewitz, Daniel
2017-01-01
This book is intended for use in advanced graduate courses in statistics / machine learning, as well as for all experimental neuroscientists seeking to understand statistical methods at a deeper level, and theoretical neuroscientists with a limited background in statistics. It reviews almost all areas of applied statistics, from basic statistical estimation and test theory, linear and nonlinear approaches for regression and classification, to model selection and methods for dimensionality reduction, density estimation and unsupervised clustering. Its focus, however, is linear and nonlinear time series analysis from a dynamical systems perspective, based on which it aims to convey an understanding also of the dynamical mechanisms that could have generated observed time series. Further, it integrates computational modeling of behavioral and neural dynamics with statistical estimation and hypothesis testing. This way computational models in neuroscience are not only explanatory frameworks, but become powerfu...
[Estimations of maternal mortality using the sisterhood survival method: Latin American experience].
Wong, L R; Simons, H; Graham, W; Schkolnik, S
1990-08-01
The method of surviving sisters for indirectly estimating maternal mortality is still under development but shows promise for countries lacking alternative sources of data and good statistics. This work uses census or survey data to apply the method to rural villages in Gambia; Mapuche settlements in Cautin, Chile; marginal populations on the outskirts of Lima, Peru; and rural villages of Avaroa, Bolivia. The method is explained in detail following presentation of the results. The necessary basic information is outlined, and the particularities of its application to each Latin American case are discussed. The surviving sisters method was developed by Graham and Brass to derive indicators of maternal mortality based on the proportion of sisters who arrive at fertile age and die during pregnancy, delivery, or the postpartum period. The method transforms the proportions of sisters who died of maternal causes obtained from a census or survey into conventional probabilities of death. The basic information required concerns the number of sisters entering the reproductive period (excluding the respondent if she is a woman), the numbers surviving and deceased at the survey date, and the number who died during pregnancy, delivery, or the postpartum period. The probabilities of dying from a maternal cause were estimated on the basis of the sister survival method at 1/98 in Lima, 1/53 in Cautin, 1/17 in Gambia, and 1/10 in Bolivia. These probabilities correspond to ratios of maternal mortality per 100,000 live births of 286 in Lima, 414 in Cautin, 1005 in Gambia, and 1379 in Bolivia. The results demonstrate great variability in maternal mortality rates. In the cases of Lima and Cautin there were significant differences between estimates derived from the sister survival method and those derived from vital statistics. The 4 cases demonstrated the familiar association between maternal and infant mortality, fertility, and overall female mortality expressed in life expectancy at
Statistical methods for quality assurance
International Nuclear Information System (INIS)
Rinne, H.; Mittag, H.J.
1989-01-01
This is the first German-language textbook on quality assurance and the fundamental statistical methods that is suitable for private study. The material for this book has been developed from a course of Hagen Open University and is characterized by a particularly careful didactical design which is achieved and supported by numerous illustrations and photographs, more than 100 exercises with complete problem solutions, many fully displayed calculation examples, surveys fostering a comprehensive approach, bibliography with comments. The textbook has an eye to practice and applications, and great care has been taken by the authors to avoid abstraction wherever appropriate, to explain the proper conditions of application of the testing methods described, and to give guidance for suitable interpretation of results. The testing methods explained also include latest developments and research results in order to foster their adoption in practice. (orig.) [de
Directory of Open Access Journals (Sweden)
Zhang Zhang
2012-03-01
Background: Genetic mutation, selective pressure for translational efficiency and accuracy, level of gene expression, and protein function through natural selection are all believed to lead to codon usage bias (CUB). Therefore, informative measurement of CUB is of fundamental importance to making inferences regarding gene function and genome evolution. However, extant measures of CUB have not fully accounted for the quantitative effect of background nucleotide composition and have not statistically evaluated the significance of CUB in sequence analysis. Results: Here we propose a novel measure, the Codon Deviation Coefficient (CDC), that provides an informative measurement of CUB and its statistical significance without requiring any prior knowledge. Unlike previous measures, CDC estimates CUB by accounting for background nucleotide compositions tailored to codon positions and adopts bootstrapping to assess the statistical significance of CUB for any given sequence. We evaluate CDC by examining its effectiveness on simulated sequences and empirical data and show that CDC outperforms extant measures by achieving a more informative estimation of CUB and its statistical significance. Conclusions: As validated by both simulated and empirical data, CDC provides a highly informative quantification of CUB and its statistical significance, useful for determining comparative magnitudes and patterns of biased codon usage for genes or genomes with diverse sequence compositions.
International Nuclear Information System (INIS)
Khayat, Omid; Afarideh, Hossein; Mohammadnia, Meisam
2015-01-01
In solid state nuclear track detectors of the chemically etched type, nuclear tracks whose center-to-center distance is shorter than twice the track radius emerge as overlapping tracks. Track overlapping in this type of detector causes track count losses, which become rather severe at high track densities. Therefore, track counting in this condition should include a correction factor for the count losses of the different track overlapping orders, since a number of overlapping tracks may be counted as one track. Another aspect of the problem is the case where imaging the whole area of the detector and counting all tracks is not possible. In these conditions a statistical generalization method is desired that can be applied to counting a segmented area of the detector, so that the results can be generalized to the whole surface of the detector. There is also a challenge in counting densely overlapped tracks because insufficient geometrical or contextual information is available. In this paper we present a statistical counting method which gives the user a relation between the track overlapping probabilities on a segmented area of the detector surface and the total number of tracks. To apply the proposed method one can estimate the total number of tracks on a solid state detector of arbitrary shape and dimensions by approximating the average track area, the whole detector surface area and some orders of track overlapping probabilities. It will be shown that this method is applicable to high and ultra-high density track images and that the count loss error can be reduced using a statistical generalization approach. - Highlights: • A correction factor for count losses of different tracks overlapping orders. • For the cases imaging the whole area of the detector is not possible. • Presenting a statistical generalization method for segmented areas. • Giving a relation between the tracks overlapping probabilities and the total tracks
Estimating Effect Sizes and Expected Replication Probabilities from GWAS Summary Statistics
DEFF Research Database (Denmark)
Holland, Dominic; Wang, Yunpeng; Thompson, Wesley K
2016-01-01
Genome-wide Association Studies (GWAS) result in millions of summary statistics ("z-scores") for single nucleotide polymorphism (SNP) associations with phenotypes. These rich datasets afford deep insights into the nature and extent of genetic contributions to complex phenotypes such as psychiatric......-scores, as such knowledge would enhance causal SNP and gene discovery, help elucidate mechanistic pathways, and inform future study design. Here we present a parsimonious methodology for modeling effect sizes and replication probabilities, relying only on summary statistics from GWAS substudies, and a scheme allowing...... for estimating the degree of polygenicity of the phenotype and predicting the proportion of chip heritability explainable by genome-wide significant SNPs in future studies with larger sample sizes. We apply the model to recent GWAS of schizophrenia (N = 82,315) and putamen volume (N = 12,596), with approximately...
DEFF Research Database (Denmark)
Korneliussen, Thorfinn Sand
Due to the recent advances in DNA sequencing technology genomic data are being generated at an unprecedented rate and we are gaining access to entire genomes at population level. The technology does, however, not give direct access to the genetic variation and the many levels of preprocessing...... that is required before being able to make inferences from the data introduces multiple levels of uncertainty, especially for low-depth data. Therefore methods that take into account the inherent uncertainty are needed for being able to make robust inferences in the downstream analysis of such data. This poses...... a problem for a range of key summary statistics within population genetics where existing methods are based on the assumption that the true genotypes are known. Motivated by this I present: 1) a new method for the estimation of relatedness between pairs of individuals, 2) a new method for estimating...
DEFF Research Database (Denmark)
Sommer, Helle Mølgaard; Holst, Helle; Spliid, Henrik
1995-01-01
Three identical microbiological experiments were carried out and analysed in order to examine the variability of the parameter estimates. The microbiological system consisted of a substrate (toluene) and a biomass (pure culture) mixed together in an aquifer medium. The degradation of the substrate...... and the growth of the biomass are described by the Monod model consisting of two nonlinear coupled first-order differential equations. The objective of this study was to estimate the kinetic parameters in the Monod model and to test whether the parameters from the three identical experiments have the same values....... Estimation of the parameters was obtained using an iterative maximum likelihood method and the test used was an approximative likelihood ratio test. The test showed that the three sets of parameters were identical only on a 4% alpha level....
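The abstract does not reproduce the two coupled equations, so the sketch below assumes the textbook Monod form, dX/dt = mu_max * S / (K_s + S) * X for the biomass X and dS/dt = -(1/Y) * dX/dt for the substrate S, integrated with a simple explicit Euler step. The parameter names (`mu_max`, `k_s`, `yield_coeff`) and all numerical values are hypothetical and are not the estimates from the study.

```python
def simulate_monod(x0, s0, mu_max, k_s, yield_coeff, dt, n_steps):
    """Integrate the assumed Monod growth/degradation model with explicit Euler.

    Returns a list of (time, biomass, substrate) tuples.
    """
    x, s = x0, s0
    trajectory = [(0.0, x, s)]
    for step in range(1, n_steps + 1):
        growth = mu_max * s / (k_s + s) * x  # biomass growth rate dX/dt
        x += dt * growth
        s -= dt * growth / yield_coeff       # substrate consumed per unit growth
        trajectory.append((step * dt, x, s))
    return trajectory


# Hypothetical values chosen only to produce a typical growth curve.
trajectory = simulate_monod(x0=0.1, s0=10.0, mu_max=0.5, k_s=1.0,
                            yield_coeff=0.5, dt=0.01, n_steps=4000)
```

A likelihood-based estimation such as the one described would repeatedly run a simulation of this kind for trial parameter values and compare the simulated curves with the measured substrate and biomass concentrations.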
Structure Learning and Statistical Estimation in Distribution Networks - Part I
Energy Technology Data Exchange (ETDEWEB)
Deka, Deepjyoti [Univ. of Texas, Austin, TX (United States); Backhaus, Scott N. [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Chertkov, Michael [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
2015-02-13
Traditionally power distribution networks are either not observable or only partially observable. This complicates development and implementation of new smart grid technologies, such as those related to demand response, outage detection and management, and improved load-monitoring. In this two-part paper, inspired by proliferation of the metering technology, we discuss estimation problems in structurally loopy but operationally radial distribution grids from measurements, e.g. voltage data, which are either already available or can be made available with a relatively minor investment. In Part I, the objective is to learn the operational layout of the grid. Part II of this paper presents algorithms that estimate load statistics or line parameters in addition to learning the grid structure. Further, Part II discusses the problem of structure estimation for systems with incomplete measurement sets. Our newly suggested algorithms apply to a wide range of realistic scenarios. The algorithms are also computationally efficient (polynomial in time), which is proven theoretically and illustrated computationally on a number of test cases. The technique developed can be applied to detect line failures in real time as well as to understand the scope of possible adversarial attacks on the grid.
Wind Turbine Gearbox Condition Monitoring with AAKR and Moving Window Statistic Methods
Directory of Open Access Journals (Sweden)
Peng Guo
2011-11-01
Condition Monitoring (CM) of wind turbines can greatly reduce the maintenance costs for wind farms, especially for offshore wind farms. A new condition monitoring method for a wind turbine gearbox using temperature trend analysis is proposed. Autoassociative Kernel Regression (AAKR) is used to construct the normal behavior model of the gearbox temperature. With a proper construction of the memory matrix, the AAKR model can cover the normal working space for the gearbox. When the gearbox has an incipient failure, the residuals between the AAKR model estimates and the measured temperature will become significant. A moving window statistical method is used to detect the changes of the residual mean value and standard deviation in a timely manner. When one of these parameters exceeds predefined thresholds, an incipient failure is flagged. In order to simulate the gearbox fault, manual temperature drift is added to the initial Supervisory Control and Data Acquisition (SCADA) data. Analysis of simulated gearbox failures shows that the new condition monitoring method is effective.
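The two steps named in this abstract can be sketched as follows: an AAKR estimate formed as a kernel-weighted average of the rows of a memory matrix, and a moving window check on the residual mean and standard deviation. This is a minimal sketch assuming a Gaussian kernel on Euclidean distance; the function names, the bandwidth `h`, the window length and the thresholds are hypothetical and are not the settings used in the paper.

```python
import math
from collections import deque


def aakr_estimate(query, memory, h):
    """Reconstruct `query` as a kernel-weighted average of the
    normal-behaviour observation vectors stored in `memory`."""
    weights = []
    for row in memory:
        d2 = sum((q - m) ** 2 for q, m in zip(query, row))
        weights.append(math.exp(-d2 / (2.0 * h * h)))
    total = sum(weights)
    return [sum(w * row[j] for w, row in zip(weights, memory)) / total
            for j in range(len(query))]


def moving_window_alarms(residuals, window, mean_limit, std_limit):
    """Return the indices at which the windowed residual mean or standard
    deviation exceeds its threshold (window must be at least 2)."""
    buf = deque(maxlen=window)
    alarms = []
    for i, r in enumerate(residuals):
        buf.append(r)
        if len(buf) < window:
            continue  # wait until the first full window is available
        mean = sum(buf) / window
        std = math.sqrt(sum((v - mean) ** 2 for v in buf) / (window - 1))
        if abs(mean) > mean_limit or std > std_limit:
            alarms.append(i)
    return alarms
```

In use, the residual for each new observation would be the measured temperature minus the corresponding component of `aakr_estimate`, and the residual series would then be passed to `moving_window_alarms`.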
Statistical trend analysis methods for temporal phenomena
Energy Technology Data Exchange (ETDEWEB)
Lehtinen, E.; Pulkkinen, U. [VTT Automation, (Finland); Poern, K. [Poern Consulting, Nykoeping (Sweden)
1997-04-01
We consider point events occurring in a random way in time. In many applications the pattern of occurrence is of intrinsic interest as indicating a trend or some other systematic feature in the rate of occurrence. The purpose of this report is to survey briefly different statistical trend analysis methods and illustrate their applicability to temporal phenomena in particular. The trend testing of point events is usually seen as the testing of the hypotheses concerning the intensity of the occurrence of events. When the intensity function is parametrized, the testing of trend is a typical parametric testing problem. In industrial applications the operational experience generally does not suggest any specified model and method in advance. Therefore, and particularly, if the Poisson process assumption is very questionable, it is desirable to apply tests that are valid for a wide variety of possible processes. The alternative approach for trend testing is to use some non-parametric procedure. In this report we have presented four non-parametric tests: The Cox-Stuart test, the Wilcoxon signed ranks test, the Mann test, and the exponential ordered scores test. In addition to the classical parametric and non-parametric approaches we have also considered the Bayesian trend analysis. First we discuss a Bayesian model, which is based on a power law intensity model. The Bayesian statistical inferences are based on the analysis of the posterior distribution of the trend parameters, and the probability of trend is immediately seen from these distributions. We applied some of the methods discussed in an example case. It should be noted, that this report is a feasibility study rather than a scientific evaluation of statistical methods, and the examples can only be seen as demonstrations of the methods. 14 refs, 10 figs.
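Of the four non-parametric tests listed in this abstract, the Cox-Stuart test is the simplest to sketch: the series is split into two halves, each early observation is paired with its later counterpart, and the signs of the differences are compared against a binomial distribution with p = 1/2. The implementation below is an illustrative two-sided version that discards ties; it is not taken from the report, and applying it to, for example, successive times between events is an assumption.

```python
import math


def cox_stuart(series):
    """Two-sided Cox-Stuart sign test for trend.

    Returns (n_plus, n_minus, p_value), where n_plus counts pairs in which
    the later observation is larger than the earlier one.
    """
    n = len(series)
    half = n // 2
    offset = n - half  # skips the middle observation when n is odd
    n_plus = n_minus = 0
    for i in range(half):
        diff = series[i + offset] - series[i]
        if diff > 0:
            n_plus += 1
        elif diff < 0:
            n_minus += 1
        # tied pairs are discarded
    m = n_plus + n_minus
    if m == 0:
        return 0, 0, 1.0
    k = min(n_plus, n_minus)
    tail = sum(math.comb(m, j) for j in range(k + 1)) / 2.0 ** m
    return n_plus, n_minus, min(1.0, 2.0 * tail)
```

A small p-value together with a dominant sign indicates an upward (mostly plus) or downward (mostly minus) trend; the test makes no assumption about the form of the underlying process.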
Applied parameter estimation for chemical engineers
Englezos, Peter
2000-01-01
Formulation of the parameter estimation problem; computation of parameters in linear models-linear regression; Gauss-Newton method for algebraic models; other nonlinear regression methods for algebraic models; Gauss-Newton method for ordinary differential equation (ODE) models; shortcut estimation methods for ODE models; practical guidelines for algorithm implementation; constrained parameter estimation; Gauss-Newton method for partial differential equation (PDE) models; statistical inferences; design of experiments; recursive parameter estimation; parameter estimation in nonlinear thermodynam
Wind gust estimation by combining numerical weather prediction model and statistical post-processing
Patlakas, Platon; Drakaki, Eleni; Galanis, George; Spyrou, Christos; Kallos, George
2017-04-01
The continuous rise of off-shore and near-shore activities as well as the development of structures, such as wind farms and various offshore platforms, requires the employment of state-of-the-art risk assessment techniques. Such analysis is used to set the safety standards and can be characterized as a climatologically oriented approach. Nevertheless, reliable operational support is also needed in order to minimize cost drawbacks and human danger during the construction and the functioning stage as well as during maintenance activities. One of the most important parameters for this kind of analysis is the wind speed intensity and variability. A critical measure associated with this variability is the presence and magnitude of wind gusts as estimated at the reference level of 10 m. The latter can be attributed to various processes, including boundary-layer turbulence, convective activity, mountain waves and wake phenomena. The purpose of this work is the development of a wind gust forecasting methodology combining a Numerical Weather Prediction model and a dynamical statistical tool based on Kalman filtering. To this end, the Wind Gust Estimate parameterization method was implemented to function within the framework of the atmospheric model SKIRON/Dust. The new modeling tool combines the atmospheric model with a statistical local adaptation methodology based on Kalman filters. This has been tested over the offshore west coastline of the United States. The main purpose is to provide a useful tool for wind analysis and prediction and applications related to offshore wind energy (power prediction, operation and maintenance). The results have been evaluated using observational data from NOAA's buoy network. The predicted output was found to perform well, and it improved further after the local adjustment post-processing step.
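The abstract does not give the details of its Kalman filter, so the following is only a generic sketch of how a scalar Kalman filter can adapt a model forecast to local observations. It assumes the systematic forecast error (bias) follows a random walk; the function name, the noise variances `q` and `r`, and the initial values are all hypothetical.

```python
def kalman_bias_correction(forecasts, observations, q=0.01, r=1.0):
    """Sequentially estimate the forecast bias and return bias-corrected forecasts."""
    bias, p = 0.0, 1.0  # assumed initial bias estimate and its variance
    corrected = []
    for f, o in zip(forecasts, observations):
        # Correct the forecast with the bias known before the observation arrives.
        corrected.append(f + bias)
        p += q                           # predict step: random-walk bias
        gain = p / (p + r)               # Kalman gain
        bias += gain * ((o - f) - bias)  # update with the newly observed error
        p *= 1.0 - gain
    return corrected, bias

# Hypothetical gust forecasts that are consistently 2 m/s too low.
model = [10.0] * 50
buoy = [12.0] * 50
corrected, bias = kalman_bias_correction(model, buoy)
print(round(bias, 2), round(corrected[-1], 2))
```

The ratio of `q` to `r` controls how quickly the correction follows recent errors. A larger `q` tracks changes faster but passes more observation noise into the corrected forecast.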
Method-related estimates of sperm vitality.
Cooper, Trevor G; Hellenkemper, Barbara
2009-01-01
Comparison of methods that estimate viability of human spermatozoa by monitoring head membrane permeability revealed that wet preparations (whether using positive or negative phase-contrast microscopy) generated significantly higher percentages of nonviable cells than did air-dried eosin-nigrosin smears. Only with the latter method did the sum of motile (presumed live) and stained (presumed dead) preparations never exceed 100%, making this the method of choice for sperm viability estimates.
Statistical methods and challenges in connectome genetics
Pluta, Dustin; Yu, Zhaoxia; Shen, Tong; Chen, Chuansheng; Xue, Gui; Ombao, Hernando
2018-01-01
The study of genetic influences on brain connectivity, known as connectome genetics, is an exciting new direction of research in imaging genetics. We here review recent results and current statistical methods in this area, and discuss some
A survey of available margin in a PWR RIA with statistical methods and 3D kinetics
International Nuclear Information System (INIS)
Riverola Gurruchaga, J.; Nunez Rodriguez, T.
2010-01-01
This paper investigates the recovery of margin in a PWR RIA simulation with 3D kinetics through the use of statistical techniques. The chosen reference core is a typical 12-foot, 17×17 PWR, with a very low leakage loading pattern strategy and gadolinium oxide as burnable poison. The PARCS-calculated average nuclear power and nodal power are transferred to a hot spot model for a sequential calculation of fuel temperature and enthalpy responses, allowing for independent hypotheses in both calculations. The hot spot analysis is done with a pellet-type model in RELAP. The analysis is done at HZP and EOC, since this state is the most limiting one with respect to the enthalpy rise criterion, compared to other burn-up conditions or initial power cases. In this work, the enthalpy increase is estimated with several statistical methods of propagation of uncertainties: order statistics, parametric statistics, response surface and sensitivities. A discussion of the advantages and disadvantages of each method is also presented. This statistical analysis is also useful to confirm a previous classification of parameters and assumptions according to their importance for the simulation, which was found to be consistent with the state of the art in the published literature. These parameters include ejected rod worth and ejection time, delayed neutron fraction and yields, nuclear power peaking factor, and Doppler. (authors)
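The abstract does not say which order-statistics formula was used. A common choice in uncertainty propagation is the first-order, one-sided Wilks criterion, so the sketch below assumes it: it finds the smallest number of code runs n for which the largest computed value bounds the chosen quantile with the chosen confidence, i.e. 1 - coverage^n >= confidence. The function name is hypothetical.

```python
def wilks_runs(coverage=0.95, confidence=0.95):
    """Smallest n such that the sample maximum bounds the `coverage` quantile
    with probability at least `confidence` (first-order, one-sided)."""
    n = 1
    while 1.0 - coverage ** n < confidence:
        n += 1
    return n

print(wilks_runs(0.95, 0.95))  # 59 runs for the usual 95/95 statement
```

The result does not depend on the number of uncertain input parameters. That is the usual argument for order statistics over response-surface methods when many inputs are sampled at once.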
International Nuclear Information System (INIS)
Lee, Dong Soo; Lee, Jae Sung; Kim, Kyeong Min; Chung, June Key; Lee, Myung Chul
1998-01-01
We investigated the statistical methods to compose the functional brain map of human working memory and the principal factors that have an effect on the methods for localization. Repeated PET scans with four successive tasks, consisting of one control and three different activation tasks, were performed on six right-handed normal volunteers for 2 minutes after bolus injections of 925 MBq H₂¹⁵O at intervals of 30 minutes. Image data were analyzed using SPM96 (Statistical Parametric Mapping) implemented with Matlab (Mathworks Inc., U.S.A.). Images from the same subject were spatially registered and were normalized using linear and nonlinear transformation methods. The significance of the difference between control and each activation state was estimated at every voxel based on the general linear model. Differences of global counts were removed using analysis of covariance (ANCOVA) with global activity as covariate. Using the mean and variance for each condition, adjusted using ANCOVA, t-statistics were computed at every voxel. To interpret the results more easily, t-values were transformed to the standard Gaussian distribution (Z-score). All the subjects carried out the activation and control tests successfully. The average rate of correct answers was 95%. The numbers of activated blobs were 4 for verbal memory I, 9 for verbal memory II, 9 for visual memory, and 6 for conjunctive activation of these three tasks. Verbal working memory activates predominantly left-sided structures, and visual memory activates the right hemisphere. We conclude that rCBF PET imaging and the statistical parametric mapping method were useful in the localization of the brain regions for verbal and visual working memory.
Report on some methods of determining the state of convergence of Monte Carlo risk estimates
International Nuclear Information System (INIS)
Orford, J.L.; Hufton, D.; Johnson, K.
1991-05-01
The Department of the Environment is developing a methodology for assessing potential sites for the disposal of low and intermediate level radioactive wastes. Computer models are used to simulate the groundwater transport of radioactive materials from a disposal facility back to man. Monte Carlo methods are being employed to conduct a probabilistic risk assessment (PRA) of potential sites. The models calculate time histories of annual radiation dose to the critical group population. The annual radiation dose to the critical group in turn specifies the annual individual risk. The distribution of dose is generally highly skewed and many simulation runs are required to predict the level of confidence in the risk estimate, i.e. to determine whether the risk estimate has converged. This report describes some statistical methods for determining the state of convergence of the risk estimate. The methods described include the Shapiro-Wilk test, calculation of skewness and kurtosis and normal probability plots. A method for forecasting the number of samples needed before the risk estimate has converged is presented. Three case studies were conducted to examine the performance of some of these techniques. (author)
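As a rough illustration of the skewness and kurtosis diagnostics mentioned above, the following sketch computes the sample mean, its standard error, the moment-based skewness and the excess kurtosis of a set of simulated doses. The function name and the data are hypothetical, and the report's own procedure for applying these quantities may differ.

```python
from math import sqrt

def moments_summary(doses):
    """Return (mean, standard error of the mean, skewness, excess kurtosis)."""
    n = len(doses)
    mean = sum(doses) / n
    m2 = sum((x - mean) ** 2 for x in doses) / n  # central moments
    m3 = sum((x - mean) ** 3 for x in doses) / n
    m4 = sum((x - mean) ** 4 for x in doses) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    std_err = sqrt(m2 * n / (n - 1)) / sqrt(n)    # uses the unbiased variance
    return mean, std_err, skewness, excess_kurtosis

# Hypothetical, strongly right-skewed dose sample (arbitrary units).
sample = [0.1, 0.2, 0.2, 0.3, 0.3, 0.4, 0.5, 0.8, 1.5, 9.0]
print(moments_summary(sample))
```

Large skewness and kurtosis indicate that a few extreme runs dominate the mean. In that situation a normal-theory confidence interval built from the standard error is unreliable and more runs are typically needed.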
Quantitative EEG Applying the Statistical Recognition Pattern Method
DEFF Research Database (Denmark)
Engedal, Knut; Snaedal, Jon; Hoegh, Peter
2015-01-01
BACKGROUND/AIM: The aim of this study was to examine the discriminatory power of quantitative EEG (qEEG) applying the statistical pattern recognition (SPR) method to separate Alzheimer's disease (AD) patients from elderly individuals without dementia and from other dementia patients. METHODS...
A simple method for estimating the length density of convoluted tubular systems.
Ferraz de Carvalho, Cláudio A; de Campos Boldrini, Silvia; Nishimaru, Flávio; Liberti, Edson A
2008-10-01
We present a new method for estimating the length density (Lv) of convoluted tubular structures exhibiting an isotropic distribution. Although the traditional equation Lv = 2Q/A is used, the parameter Q is obtained by considering the collective perimeters of tubular sections. This measurement is converted to a standard model of the structure, assuming that all cross-sections are approximately circular and have an average perimeter similar to that of actual circular cross-sections observed in the same material. The accuracy of this method was tested in eight experiments using hollow macaroni bent into helical shapes. After measuring the length of the macaroni segments, they were boiled and randomly packed into cylindrical volumes along with an aqueous suspension of gelatin and India ink. The solidified blocks were cut into slices 1.0 cm thick and 33.2 cm² in area (A). The total perimeter of the macaroni cross-sections so revealed was stereologically estimated using a test system of straight parallel lines. Given Lv and the reference volume, the total length of macaroni in each section could be estimated. Additional corrections were made for the changes induced by boiling, and the off-axis position of the thread used to measure length. No statistical difference was observed between the corrected estimated values and the actual lengths. This technique is useful for estimating the length of capillaries, renal tubules, and seminiferous tubules.
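The calculation described above can be sketched as follows. This is one reading of the abstract, not the authors' exact procedure: it assumes Q is the total measured perimeter divided by the mean perimeter of a circular cross-section, which gives an equivalent number of circular profiles. Function names and input values are hypothetical; only the 33.2 cm² area and 1.0 cm thickness come from the abstract.

```python
def length_density(total_perimeter_cm, mean_circular_perimeter_cm, section_area_cm2):
    """Lv = 2Q/A, with Q taken as the equivalent number of circular profiles."""
    q = total_perimeter_cm / mean_circular_perimeter_cm
    return 2.0 * q / section_area_cm2

def total_length(lv_per_cm2, reference_volume_cm3):
    """Total tubule length in the reference volume."""
    return lv_per_cm2 * reference_volume_cm3

# Hypothetical perimeters; slice of 33.2 cm^2 area and 1.0 cm thickness.
lv = length_density(49.8, 1.5, 33.2)
print(lv, total_length(lv, 33.2 * 1.0))
```

The boiling and thread-position corrections mentioned in the abstract are not included in this sketch.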
International Nuclear Information System (INIS)
Land, C.E.; Pierce, D.A.
1983-01-01
Statistical theory and methodology provide the logical structure for scientific inference about the cancer risk associated with exposure to ionizing radiation. Although much is known about radiation carcinogenesis, the risk associated with low-level exposures is difficult to assess because it is too small to measure directly. Estimation must therefore depend upon mathematical models which relate observed risks at high exposure levels to risks at lower exposure levels. Extrapolated risk estimates obtained using such models are heavily dependent upon assumptions about the shape of the dose-response relationship, the temporal distribution of risk following exposure, and variation of risk according to variables such as age at exposure, sex, and underlying population cancer rates. Expanded statistical models, which make explicit certain assumed relationships between different data sets, can be used to strengthen inferences by incorporating relevant information from diverse sources. They also allow the uncertainties inherent in information from related data sets to be expressed in estimates which partially depend upon that information. To the extent that informed opinion is based upon a valid assessment of scientific data, the larger context of decision theory, which includes statistical theory, provides a logical framework for the incorporation into public policy decisions of the informational content of expert opinion
International Nuclear Information System (INIS)
Hong, Kee Jeung; Kim, Jee Sang
2009-01-01
As concrete ages, the surrounding environment is expected to have a growing influence on the concrete. As not all impacts of the environment can be considered in the strength-estimating model of a nondestructive concrete test, increasing concrete age leads to growing uncertainty in the strength-estimating model. Therefore, the variation of the model error increases. It is necessary to include those impacts in the probability model of concrete strength obtained from nondestructive tests so as to build a more accurate reliability model for structural performance evaluation. This paper reviews and categorizes the existing strength-estimating statistical models of nondestructive concrete tests, and suggests a new form of strength-estimating statistical model that properly reflects the model uncertainty due to aging of the concrete. This new form of statistical model will lay a foundation for more accurate structural performance evaluation.
Longitudinal data analysis a handbook of modern statistical methods
Fitzmaurice, Garrett; Verbeke, Geert; Molenberghs, Geert
2008-01-01
Although many books currently available describe statistical models and methods for analyzing longitudinal data, they do not highlight connections between various research threads in the statistical literature. Responding to this void, Longitudinal Data Analysis provides a clear, comprehensive, and unified overview of state-of-the-art theory and applications. It also focuses on the assorted challenges that arise in analyzing longitudinal data. After discussing historical aspects, leading researchers explore four broad themes: parametric modeling, nonparametric and semiparametric methods, joint
Zhang, Yun; Baheti, Saurabh; Sun, Zhifu
2018-05-01
High-throughput bisulfite methylation sequencing such as reduced representation bisulfite sequencing (RRBS), Agilent SureSelect Human Methyl-Seq (Methyl-seq) or whole-genome bisulfite sequencing is commonly used for base resolution methylome research. These data are represented either by the ratio of methylated cytosines to total coverage at a CpG site or by the numbers of methylated and unmethylated cytosines. Multiple statistical methods can be used to detect differentially methylated CpGs (DMCs) between conditions, and these methods are often the basis for the next step of differentially methylated region identification. The ratio data can be fitted flexibly with many linear models, whereas the raw count data take coverage information into account. There is an array of options in each data type for DMC detection; however, it is not clear which is the optimal statistical method. In this study, we systematically evaluated four statistical methods on methylation ratio data and four methods on count-based data and compared their performance with regard to type I error control, sensitivity and specificity of DMC detection and computational resource demands using real RRBS data along with simulation. Our results show that the ratio-based tests are generally more conservative (less sensitive) than the count-based tests. However, some count-based methods have high false-positive rates and should be avoided. The beta-binomial model gives a good balance between sensitivity and specificity and is the preferred method. Selection of methods in different settings, signal versus noise and sample size estimation are also discussed.
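To make the count-based view concrete, here is a minimal sketch of the beta-binomial likelihood for methylated and unmethylated read counts at one CpG site. It is not the study's testing procedure: a full DMC test would fit and compare models between conditions, which is omitted. The shape parameters `a` and `b`, the function names and the counts are hypothetical.

```python
from math import lgamma, exp

def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def beta_binomial_logpmf(k, n, a, b):
    """Log-probability of k methylated reads out of n under Beta(a, b) mixing."""
    log_choose = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return log_choose + log_beta(k + a, n - k + b) - log_beta(a, b)

def group_loglik(sites, a, b):
    """Sum the log-likelihood over (methylated, unmethylated) count pairs."""
    return sum(beta_binomial_logpmf(m, m + u, a, b) for m, u in sites)

# Hypothetical replicate counts at one CpG: (methylated, unmethylated).
replicates = [(18, 2), (9, 1), (30, 10)]
print(group_loglik(replicates, a=8.0, b=2.0))
```

Unlike a ratio such as 18/20, the count pair keeps the coverage, so a site with 30 + 10 reads contributes more information than one with 9 + 1 reads. The beta mixing allows for variation between replicates beyond binomial sampling noise.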
Statistical methods for assessing agreement between continuous measurements
DEFF Research Database (Denmark)
Sokolowski, Ineta; Hansen, Rikke Pilegaard; Vedsted, Peter
Background: Clinical research often involves study of agreement amongst observers. Agreement can be measured in different ways, and one can obtain quite different values depending on which method one uses. Objective: We review the approaches that have been discussed to assess the agreement between continuous measures and discuss their strengths and weaknesses. Different methods are illustrated using actual data from the 'Delay in diagnosis of cancer in general practice' project in Aarhus, Denmark. Subjects and Methods: We use weighted kappa-statistic, intraclass correlation coefficient (ICC), concordance coefficient, Bland-Altman limits of agreement and percentage of agreement to assess the agreement between patient reported delay and doctor reported delay in diagnosis of cancer in general practice. Key messages: The correct statistical approach is not obvious. Many studies give the product...
Directory of Open Access Journals (Sweden)
Bonnie LaFleur
2011-01-01
In analytic chemistry a detection limit (DL) is the lowest measurable amount of an analyte that can be distinguished from a blank; many biomedical measurement technologies exhibit this property. From a statistical perspective, these data present inferential challenges because instead of precise measures, one only has information that the value is somewhere between 0 and the DL (below detection limit, BDL). Substitution of BDL values with 0 or the DL can lead to biased parameter estimates and a loss of statistical power. Statistical methods that make adjustments when dealing with these types of data, often called left-censored data, are available in many commercial statistical packages. Despite this availability, the use of these methods is still not widespread in biomedical literature. We have reviewed the statistical approaches of dealing with BDL values, and used simulations to examine the performance of the commonly used substitution methods and the most widely available statistical methods. We have illustrated these methods using a study undertaken at the Vanderbilt-Ingram Cancer Center, to examine the serum bile acid levels in patients with colorectal cancer and adenoma. We have found that the modern methods for BDL values identify disease-related differences that are often missed with statistically naive approaches.
Whole vertebral bone segmentation method with a statistical intensity-shape model based approach
Hanaoka, Shouhei; Fritscher, Karl; Schuler, Benedikt; Masutani, Yoshitaka; Hayashi, Naoto; Ohtomo, Kuni; Schubert, Rainer
2011-03-01
An automatic segmentation algorithm for the vertebrae in human body CT images is presented. In particular, we focused on constructing and utilizing 4 different statistical intensity-shape combined models for the cervical, upper / lower thoracic and lumbar vertebrae, respectively. For this purpose, two previously reported methods were combined: a deformable model-based initial segmentation method and a statistical shape-intensity model-based precise segmentation method. The former is used as a pre-processing step to detect the position and orientation of each vertebra, which determines the initial condition for the latter precise segmentation method. The precise segmentation method needs prior knowledge of both the intensities and the shapes of the objects. After principal component analysis (PCA) of such shape-intensity expressions obtained from training image sets, vertebrae were parametrically modeled as a linear combination of the principal component vectors. The segmentation of each target vertebra was performed by fitting this parametric model to the target image by maximum a posteriori estimation, combined with the geodesic active contour method. In experiments on 10 cases, the initial segmentation was successful in 6 cases and only partially failed in 4 cases (2 in the cervical area and 2 in the lumbo-sacral area). In the precise segmentation, the mean error distances were 2.078, 1.416, 0.777 and 0.939 mm for the cervical, upper thoracic, lower thoracic and lumbar spines, respectively. In conclusion, our automatic segmentation algorithm for the vertebrae in human body CT images showed fair performance for cervical, thoracic and lumbar vertebrae.
Methods for estimating drought streamflow probabilities for Virginia streams
Austin, Samuel H.
2014-01-01
Maximum likelihood logistic regression model equations used to estimate drought flow probabilities for Virginia streams are presented for 259 hydrologic basins in Virginia. Winter streamflows were used to estimate the likelihood of streamflows during the subsequent drought-prone summer months. The maximum likelihood logistic regression models identify probable streamflows from 5 to 8 months in advance. More than 5 million streamflow daily values collected over the period of record (January 1, 1900 through May 16, 2012) were compiled and analyzed over a minimum 10-year (maximum 112-year) period of record. The analysis yielded 46,704 equations with statistically significant fit statistics and parameter ranges, which are published in two tables in this report. These model equations produce summer month (July, August, and September) drought flow threshold probabilities as a function of streamflows during the previous winter months (November, December, January, and February). Example calculations are provided, demonstrating how to use the equations to estimate probable streamflows as much as 8 months in advance.
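A logistic regression equation of the kind described above can be evaluated as in the following sketch. It assumes the standard logistic form with a single winter streamflow predictor; the coefficient values, the function name and the flow values are hypothetical and are not taken from the report's tables.

```python
from math import exp

def drought_probability(winter_flow, b0, b1):
    """Probability that summer flow falls below a drought threshold,
    given a winter streamflow value and fitted coefficients b0, b1."""
    return 1.0 / (1.0 + exp(-(b0 + b1 * winter_flow)))

# Hypothetical coefficients: higher winter flow lowers the drought probability.
b0, b1 = 2.0, -0.05
for flow in (10.0, 40.0, 100.0):  # hypothetical winter mean flows
    print(flow, round(drought_probability(flow, b0, b1), 3))
```

With a negative slope, the probability of a summer drought flow falls as the preceding winter flow rises. This is the direction one would expect for a basin whose summer baseflow depends on winter recharge.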
Kappa statistic for clustered matched-pair data.
Yang, Zhao; Zhou, Ming
2014-07-10
The kappa statistic is widely used to assess the agreement between two procedures in independent matched-pair data. For matched-pair data collected in clusters, on the basis of the delta method and sampling techniques, we propose a nonparametric variance estimator for the kappa statistic without within-cluster correlation structure or distributional assumptions. The results of an extensive Monte Carlo simulation study demonstrate that the proposed kappa statistic provides consistent estimation and the proposed variance estimator behaves reasonably well for at least a moderately large number of clusters (e.g., K ≥ 50). Compared with the variance estimator ignoring dependence within a cluster, the proposed variance estimator performs better in maintaining the nominal coverage probability when the intra-cluster correlation is fair (ρ ≥ 0.3), with more pronounced improvement when ρ is further increased. To illustrate the practical application of the proposed estimator, we analyze two real data examples of clustered matched-pair data. Copyright © 2014 John Wiley & Sons, Ltd.
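For orientation, the kappa point estimate for binary matched-pair data can be computed as in the sketch below. It assumes the 2×2 tables are pooled across clusters. The paper's cluster-robust variance estimator is not reproduced here, and the function name and counts are hypothetical.

```python
def pooled_kappa(cluster_tables):
    """Cohen's kappa from per-cluster 2x2 tables (a, b, c, d), where
    a = both positive, b = only procedure 1 positive,
    c = only procedure 2 positive, d = both negative."""
    a = sum(t[0] for t in cluster_tables)
    b = sum(t[1] for t in cluster_tables)
    c = sum(t[2] for t in cluster_tables)
    d = sum(t[3] for t in cluster_tables)
    n = a + b + c + d
    observed = (a + d) / n
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2
    return (observed - expected) / (1.0 - expected)

# Two hypothetical clusters of matched pairs.
clusters = [(20, 5, 2, 23), (20, 5, 3, 22)]
print(pooled_kappa(clusters))
```

The point estimate is the same whether or not clustering is taken into account. Clustering affects the variance, which is why the abstract concentrates on the variance estimator.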
Statistical and Machine Learning forecasting methods: Concerns and ways forward
Makridakis, Spyros; Spiliotis, Evangelos; Assimakopoulos, Vassilios
2018-01-01
Machine Learning (ML) methods have been proposed in the academic literature as alternatives to statistical ones for time series forecasting. Yet, scant evidence is available about their relative performance in terms of accuracy and computational requirements. The purpose of this paper is to evaluate such performance across multiple forecasting horizons using a large subset of 1045 monthly time series used in the M3 Competition. After comparing the post-sample accuracy of popular ML methods with that of eight traditional statistical ones, we found that the former are dominated across both accuracy measures used and for all forecasting horizons examined. Moreover, we observed that their computational requirements are considerably greater than those of statistical methods. The paper discusses the results, explains why the accuracy of ML models is below that of statistical ones and proposes some possible ways forward. The empirical results found in our research stress the need for objective and unbiased ways to test the performance of forecasting methods that can be achieved through sizable and open competitions allowing meaningful comparisons and definite conclusions. PMID:29584784
Structure Learning and Statistical Estimation in Distribution Networks - Part II
Energy Technology Data Exchange (ETDEWEB)
Deka, Deepjyoti [Univ. of Texas, Austin, TX (United States); Backhaus, Scott N. [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Chertkov, Michael [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
2015-02-13
Limited placement of real-time monitoring devices in the distribution grid, recent trends notwithstanding, has prevented the easy implementation of demand-response and other smart grid applications. Part I of this paper discusses the problem of learning the operational structure of the grid from nodal voltage measurements. In this work (Part II), the learning of the operational radial structure is coupled with the problem of estimating nodal consumption statistics and inferring the line parameters in the grid. Based on a Linear-Coupled (LC) approximation of the AC power flow equations, polynomial time algorithms are designed to identify the structure and estimate nodal load characteristics and/or line parameters in the grid using the available nodal voltage measurements. Then the structure learning algorithm is extended to cases with missing data, where available observations are limited to a fraction of the grid nodes. The efficacy of the presented algorithms is demonstrated through simulations on several distribution test cases.
Kittisuwan, Pichid
2015-03-01
The application of image processing in industry has shown remarkable success over the last decade, for example, in security and telecommunication systems. The denoising of natural images corrupted by Gaussian noise is a classical problem in image processing, and image denoising is an indispensable step during image processing. This paper is concerned with dual-tree complex wavelet-based image denoising using Bayesian techniques. One of the cruxes of Bayesian image denoising algorithms is to estimate the statistical parameters of the image. Here, we employ maximum a posteriori (MAP) estimation to calculate the local observed variance, with a generalized Gamma density prior for the local observed variance and a Laplacian or Gaussian distribution for the noisy wavelet coefficients. Our selection of the prior distribution is motivated by the efficient and flexible properties of the generalized Gamma density. The experimental results show that the proposed method yields good denoising results.
Computerized statistical analysis with bootstrap method in nuclear medicine
International Nuclear Information System (INIS)
Zoccarato, O.; Sardina, M.; Zatta, G.; De Agostini, A.; Barbesti, S.; Mana, O.; Tarolo, G.L.
1988-01-01
Statistical analysis of data samples involves some hypotheses about the features of the data themselves. The accuracy of these hypotheses can influence the results of statistical inference. Among the new methods of computer-aided statistical analysis, the bootstrap method appears to be one of the most powerful, thanks to its ability to generate many artificial samples from a single original sample and because it works without hypotheses about the data distribution. The authors applied the bootstrap method to two typical situations in a nuclear medicine department. The determination of the normal range of serum ferritin, as assessed by radioimmunoassay and defined by the mean value ±2 standard deviations, starting from a small experimental sample, shows an unacceptable lower limit (plasma ferritin levels below zero). In contrast, the results obtained by processing 5000 bootstrap samples give an interval of values (10.95 ng/ml - 72.87 ng/ml) corresponding to the normal ranges commonly reported. Moreover, the authors applied the bootstrap method in evaluating the possible error associated with the correlation coefficient determined between left ventricular ejection fraction (LVEF) values obtained by first-pass radionuclide angiocardiography with 99mTc and 195mAu. The results obtained indicate a high degree of statistical correlation and give the range of r² values to be considered acceptable for this type of study.
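The resampling idea can be sketched with a percentile bootstrap interval, shown below for the sample mean. The abstract does not specify how its ferritin interval was derived from the 5000 bootstrap samples, so this is a generic illustration under that caveat. The function name, the seed and the data are hypothetical.

```python
import random
from statistics import mean

def bootstrap_ci(sample, stat=mean, n_boot=5000, level=0.95, seed=0):
    """Percentile bootstrap interval for `stat`, resampling with replacement."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    stats = sorted(stat(rng.choices(sample, k=n)) for _ in range(n_boot))
    lo_idx = int((1 - level) / 2 * n_boot)
    hi_idx = int((1 + level) / 2 * n_boot) - 1
    return stats[lo_idx], stats[hi_idx]

# Hypothetical small, right-skewed sample of serum ferritin values (ng/ml).
ferritin = [12.0, 15.0, 18.0, 22.0, 25.0, 31.0, 38.0, 47.0, 60.0, 95.0]
print(bootstrap_ci(ferritin))
```

Every resampled value comes from the observed data, so the interval cannot extend below the smallest observation. This avoids the negative lower limit that a mean ±2 standard deviation rule can produce for skewed samples.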
Zhang, Zhang
2012-03-22
Background: Genetic mutation, selective pressure for translational efficiency and accuracy, level of gene expression, and protein function through natural selection are all believed to lead to codon usage bias (CUB). Therefore, informative measurement of CUB is of fundamental importance to making inferences regarding gene function and genome evolution. However, extant measures of CUB have not fully accounted for the quantitative effect of background nucleotide composition and have not statistically evaluated the significance of CUB in sequence analysis. Results: Here we propose a novel measure--Codon Deviation Coefficient (CDC)--that provides an informative measurement of CUB and its statistical significance without requiring any prior knowledge. Unlike previous measures, CDC estimates CUB by accounting for background nucleotide compositions tailored to codon positions and adopts bootstrapping to assess the statistical significance of CUB for any given sequence. We evaluate CDC by examining its effectiveness on simulated sequences and empirical data and show that CDC outperforms extant measures by achieving a more informative estimation of CUB and its statistical significance. Conclusions: As validated by both simulated and empirical data, CDC provides a highly informative quantification of CUB and its statistical significance, useful for determining comparative magnitudes and patterns of biased codon usage for genes or genomes with diverse sequence compositions. © 2012 Zhang et al; licensee BioMed Central Ltd.
Directory of Open Access Journals (Sweden)
Jacobo Pardo-Seco
BACKGROUND: Mitochondrial DNA (mtDNA) variation (i.e. haplogroups) has been analyzed in regard to a number of multifactorial diseases. The statistical power of a case-control study determines the a priori probability to reject the null hypothesis of homogeneity between cases and controls. METHODS/PRINCIPAL FINDINGS: We critically review previous approaches to the estimation of the statistical power based on the restricted scenario where the number of cases equals the number of controls, and propose a methodology that broadens procedures to more general situations. We developed statistical procedures that consider different disease scenarios, variable sample sizes in cases and controls, and variable number of haplogroups and effect sizes. The results indicate that the statistical power of a particular study can improve substantially by increasing the number of controls with respect to cases. In the opposite direction, the power decreases substantially when testing a growing number of haplogroups. We developed mitPower (http://bioinformatics.cesga.es/mitpower/), a web-based interface that implements the new statistical procedures and allows for the computation of the a priori statistical power in variable scenarios of case-control study designs, or e.g. the number of controls needed to reach fixed effect sizes. CONCLUSIONS/SIGNIFICANCE: The present study provides statistical procedures for the computation of statistical power in common as well as complex case-control study designs involving 2×k tables, with special application (but not exclusive) to mtDNA studies. In order to reach a wide range of researchers, we also provide a friendly web-based tool--mitPower--that can be used in both retrospective and prospective case-control disease studies.
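The effect of adding controls on power can be explored with a small Monte Carlo sketch. This is not the mitPower procedure: it is a generic simulation of the Pearson chi-square test on a 2×3 table, with invented haplogroup frequencies and hypothetical function names. It is restricted to three categories because the chi-square distribution with 2 degrees of freedom has the closed-form critical value -2 ln(alpha), which keeps the example free of external libraries.

```python
import math
import random


def chi_square_2xk(cases, controls):
    """Pearson chi-square statistic for a 2 x k table of counts."""
    n_cases, n_controls = sum(cases), sum(controls)
    total = n_cases + n_controls
    stat = 0.0
    for c, d in zip(cases, controls):
        column_total = c + d
        if column_total == 0:
            continue  # an empty category contributes nothing
        for observed, row_total in ((c, n_cases), (d, n_controls)):
            expected = row_total * column_total / total
            stat += (observed - expected) ** 2 / expected
    return stat


def draw_counts(rng, probs, n):
    """Multinomial counts for n individuals with category probabilities probs."""
    counts = [0] * len(probs)
    for j in rng.choices(range(len(probs)), weights=probs, k=n):
        counts[j] += 1
    return counts


def simulated_power(p_cases, p_controls, n_cases, n_controls,
                    alpha=0.05, n_sim=2000, seed=0):
    """Fraction of simulated studies that reject homogeneity (k = 3 only)."""
    if len(p_cases) != 3 or len(p_controls) != 3:
        raise ValueError("this sketch only handles k = 3 categories (df = 2)")
    critical = -2.0 * math.log(alpha)  # exact chi-square quantile for df = 2
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sim):
        cases = draw_counts(rng, p_cases, n_cases)
        controls = draw_counts(rng, p_controls, n_controls)
        if chi_square_2xk(cases, controls) > critical:
            rejections += 1
    return rejections / n_sim


# Hypothetical haplogroup frequencies (illustrative only).
p_controls = [0.5, 0.3, 0.2]
p_cases = [0.4, 0.3, 0.3]
for n_controls in (200, 800):
    power = simulated_power(p_cases, p_controls, 200, n_controls)
    print(f"200 cases, {n_controls} controls: estimated power {power:.2f}")
```

With the number of cases held fixed, the estimated power should rise as controls are added, in line with the abstract's finding; the size of the gain depends entirely on the assumed frequencies.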
Statistical methods to monitor the West Valley off-gas system
International Nuclear Information System (INIS)
Eggett, D.L.
1990-01-01
This paper reports on the off-gas system for the ceramic melter operated at the West Valley Demonstration Project at West Valley, NY, monitored during melter operation. A one-at-a-time method of monitoring the parameters of the off-gas system is not statistically sound. Therefore, multivariate statistical methods appropriate for the monitoring of many correlated parameters will be used. Monitoring a large number of parameters increases the probability of a false out-of-control signal. If the parameters being monitored are statistically independent, the control limits can be easily adjusted to obtain the desired probability of a false out-of-control signal. The principal component (PC) scores have desirable statistical properties when the original variables are distributed as multivariate normals. Two statistics derived from the PC scores and used to form multivariate control charts are outlined and their distributional properties reviewed.
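For the independent-parameter case mentioned above, the adjustment of control limits can be sketched as follows. The example assumes normally distributed, statistically independent parameters, each monitored with a two-sided chart; the function names and the choice of 20 charts are hypothetical and are not taken from the paper.

```python
from statistics import NormalDist


def overall_false_alarm(alpha_each, n_charts):
    """Probability that at least one of n independent charts signals falsely."""
    return 1.0 - (1.0 - alpha_each) ** n_charts


def per_chart_alpha(alpha_overall, n_charts):
    """Per-chart false-alarm rate that yields the desired overall rate."""
    return 1.0 - (1.0 - alpha_overall) ** (1.0 / n_charts)


def two_sided_limit(alpha_each):
    """Control limit in standard-deviation units for a two-sided normal chart."""
    return NormalDist().inv_cdf(1.0 - alpha_each / 2.0)


n_charts = 20  # hypothetical number of monitored parameters
three_sigma_alpha = 2.0 * (1.0 - NormalDist().cdf(3.0))
print("overall false-alarm rate with 3-sigma limits:",
      round(overall_false_alarm(three_sigma_alpha, n_charts), 4))

adjusted = per_chart_alpha(0.05, n_charts)
print("limit (in sigma) for a 5% overall rate:",
      round(two_sided_limit(adjusted), 3))
```

This adjustment is valid only under independence; for correlated parameters, which is the situation the paper addresses, charts based on principal component scores are used instead.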
Comparative study of age estimation using dentinal translucency by digital and conventional methods
Bommannavar, Sushma; Kulkarni, Meena
2015-01-01
Introduction: Estimating age using the dentition plays a significant role in identification of the individual in forensic cases. Teeth are one of the most durable and strongest structures in the human body. The morphology and arrangement of teeth vary from person to person and are unique to an individual, as are fingerprints. Therefore, the use of dentition is the method of choice in the identification of the unknown. Root dentin translucency is considered to be one of the best parameters for dental age estimation. Traditionally, root dentin translucency was measured using calipers. Recently, the use of custom built software programs has been proposed for the same. Objectives: The present study describes a method to measure root dentin translucency on sectioned teeth using a custom built software program Adobe Photoshop 7.0 version (Adobe system Inc, Mountain View California). Materials and Methods: A total of 50 single rooted teeth were sectioned longitudinally to derive a 0.25 mm uniform thickness and the root dentin translucency was measured using digital and caliper methods and compared. Gustafson's morphohistologic approach is used in this study. Results: Correlation coefficients of translucency measurements to age were statistically significant for both the methods (P < 0.125) and linear regression equations derived from both methods revealed better ability of the digital method to assess age. Conclusion: The custom built software program used in the present study is commercially available and widely used image editing software. Furthermore, this method is easy to use and less time consuming. The measurements obtained using this method are more precise and thus help in more accurate age estimation. Considering these benefits, the present study recommends the use of the digital method to assess translucency for age estimation. PMID:25709325
An Overview of Short-term Statistical Forecasting Methods
DEFF Research Database (Denmark)
Elias, Russell J.; Montgomery, Douglas C.; Kulahci, Murat
2006-01-01
An overview of statistical forecasting methodology is given, focusing on techniques appropriate to short- and medium-term forecasts. Topics include basic definitions and terminology, smoothing methods, ARIMA models, regression methods, dynamic regression models, and transfer functions. Techniques for evaluating and monitoring forecast performance are also summarized.
Statistical Process Control in a Modern Production Environment
DEFF Research Database (Denmark)
Windfeldt, Gitte Bjørg
Paper 1 is aimed at practitioners to help them test the assumption that the observations in a sample are independent and identically distributed, an assumption that is essential when using classical Shewhart charts. The test can easily be performed in the control chart setup using the samples gathered here and standard statistical software. In Paper 2 a new method for process monitoring is introduced. The method uses a statistical model of the quality characteristic and a sliding window of observations to estimate the probability that the next item will not respect the specifications. If the estimated probability exceeds a pre-determined threshold the process will be stopped. The method is flexible, allowing a complexity in modeling that remains invisible to the end user. Furthermore, the method allows diagnostic plots to be built based on the parameter estimates that can provide valuable insight...
DEFF Research Database (Denmark)
Jensen, Jørgen Juncher
2007-01-01
In on-board decision support systems efficient procedures are needed for real-time estimation of the maximum ship responses to be expected within the next few hours, given on-line information on the sea state and user defined ranges of possible headings and speeds. For linear responses standard frequency domain methods can be applied. For non-linear responses like the roll motion, standard methods like direct time domain simulations are not feasible due to the required computational time. However, the statistical distribution of non-linear ship responses can be estimated very accurately using the first-order reliability method (FORM), well-known from structural reliability problems. To illustrate the proposed procedure, the roll motion is modelled by a simplified non-linear procedure taking into account non-linear hydrodynamic damping, time-varying restoring and wave excitation moments...
Directory of Open Access Journals (Sweden)
Krishna R. Gupta
2010-12-01
Three simple, accurate and economical methods for simultaneous estimation of pantoprazole and itopride hydrochloride in two-component solid dosage forms have been developed. The proposed methods employ the simultaneous equation method (Method A), the absorbance ratio method (Method B) and the multicomponent mode of analysis method (Method C). All these methods utilize distilled water as a solvent. In distilled water pantoprazole shows maximum absorbance at a wavelength of 289.0 nm while itopride hydrochloride shows maximum absorbance at a wavelength of 258.0 nm; the drugs also show an isoabsorptive point at a wavelength of 270.0 nm. For the multicomponent method, sampling wavelengths of 289.0 nm, 270.0 nm and 239.5 nm were selected. All these methods showed linearity in the range of 4-20 µg/mL for pantoprazole and 15-75 µg/mL for itopride hydrochloride. The results of analysis have been validated statistically and by recovery studies.
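The simultaneous equation method (Method A) rests on the additivity of absorbances: at each of two wavelengths the measured absorbance is the sum of the contributions of both drugs, which gives two linear equations in the two unknown concentrations. A minimal sketch follows; the absorptivity values are invented for illustration (the abstract reports none), a 1 cm path length is assumed, and the function name is hypothetical.

```python
def solve_two_component(a_1, a_2, ax_1, ay_1, ax_2, ay_2):
    """Solve  a_1 = ax_1*cx + ay_1*cy  and  a_2 = ax_2*cx + ay_2*cy  for (cx, cy).

    a_1, a_2   : mixture absorbances at wavelengths 1 and 2
    ax_*, ay_* : absorptivities of drugs X and Y at those wavelengths
                 (absorbance per unit concentration, 1 cm path assumed)
    """
    det = ax_1 * ay_2 - ax_2 * ay_1
    if abs(det) < 1e-12:
        raise ValueError("absorptivities are too similar to separate the two drugs")
    cx = (a_1 * ay_2 - a_2 * ay_1) / det  # Cramer's rule
    cy = (ax_1 * a_2 - ax_2 * a_1) / det
    return cx, cy


# Hypothetical absorptivities per (ug/mL); wavelength 1 = 289.0 nm, 2 = 258.0 nm.
AX_1, AY_1 = 0.040, 0.008   # drug X and drug Y at wavelength 1
AX_2, AY_2 = 0.015, 0.030   # drug X and drug Y at wavelength 2

# Absorbances that a 10 ug/mL + 40 ug/mL mixture would give under these assumptions.
a_1 = AX_1 * 10 + AY_1 * 40
a_2 = AX_2 * 10 + AY_2 * 40
cx, cy = solve_two_component(a_1, a_2, AX_1, AY_1, AX_2, AY_2)
print(f"estimated concentrations: X = {cx:.2f} ug/mL, Y = {cy:.2f} ug/mL")
```

The determinant check reflects a practical requirement of the method: the two wavelengths must be chosen so that the drugs' absorptivity ratios differ clearly, otherwise the system is ill-conditioned.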
Yu, Binbing; Yang, Harry
2017-01-01
Biological assays (bioassays) are procedures to estimate the potency of a substance by studying its effects on living organisms, tissues, and cells. Bioassays are essential tools for gaining insight into biologic systems and processes including, for example, the development of new drugs and monitoring environmental pollutants. Two of the most important parameters of bioassay performance are relative accuracy (bias) and precision. Although general strategies and formulas are provided in USP, a comprehensive understanding of the definitions of bias and precision remains elusive. Additionally, whether there is a beneficial use of data transformation in estimating intermediate precision remains unclear. Finally, there are various statistical estimation methods available that often pose a dilemma for the analyst who must choose the most appropriate method. To address these issues, we provide both a rigorous definition of bias and precision as well as three alternative methods for calculating relative standard deviation (RSD). All methods perform similarly when the RSD ≤10%. However, the USP estimates result in larger bias and root-mean-square error (RMSE) compared to the three proposed methods when the actual variation was large. Therefore, the USP method should not be used for routine analysis. For data with moderate skewness and deviation from normality, the estimates based on the original scale perform well. The original scale method is preferred, and the method based on log-transformation may be used for noticeably skewed data. LAY ABSTRACT: Biological assays, or bioassays, are essential in the development and manufacture of biopharmaceutical products for potency testing and quality monitoring. Two important parameters of assay performance are relative accuracy (bias) and precision. The definitions of bias and precision in USP 〈1033〉 are elusive and confusing. Another complicating issue is whether log-transformation should be used for calculating the
Goyal, Anju; Singhvi, I
2008-01-01
Two simple, accurate, economical and reproducible spectrophotometric methods for simultaneous estimation of a two-component drug mixture of ethamsylate and mefenamic acid in combined tablet dosage form have been developed. The first method involves forming and solving simultaneous equations using 287.6 nm and 313.2 nm as the two wavelengths. The second method is based on two-wavelength calculation. The two wavelengths selected for estimation of ethamsylate were 274.4 nm and 301.2 nm, while those for mefenamic acid were 304.8 nm and 320.4 nm. Both methods obey Beer's law in the concentration ranges employed for the respective methods. The results of analysis were validated statistically and by recovery studies.
Demirjian's method in the estimation of age: A study on human third molars.
Lewis, Amitha J; Boaz, Karen; Nagesh, K R; Srikant, N; Gupta, Neha; Nandita, K P; Manaktala, Nidhi
2015-01-01
The primary aim of the following study is to estimate the chronological age based on the stages of third molar development following the eight stages (A to H) method of Demirjian et al. (along with two modifications-Orhan) and the secondary aim is to compare third molar development with sex and age. The sample consisted of 115 orthopantomograms from South Indian subjects with known chronological age and gender. Multiple regression analysis was performed with chronological age as the dependent variable and third molar root development as the independent variable. All the statistical analysis was performed using the SPSS 11.0 package (IBM® Corporation). No statistically significant differences were found in third molar development between males and females. Depending on the available number of wisdom teeth in an individual, R² varied for males from 0.21 to 0.48 and for females from 0.16 to 0.38. New equations were derived for estimating the chronological age. The chronological age of a South Indian individual between 14 and 22 years may be estimated based on the regression formulae. However, additional studies with a larger study population must be conducted to meet the need for population-based information on third molar development.
Spectrum estimation method based on marginal spectrum
International Nuclear Information System (INIS)
Cai Jianhua; Hu Weiwen; Wang Xianchun
2011-01-01
The FFT method cannot meet the basic requirements of power spectrum estimation for non-stationary and short signals. A new spectrum estimation method based on the marginal spectrum from the Hilbert-Huang transform (HHT) was proposed. The procedure for obtaining the marginal spectrum in the HHT method was given and the linear property of the marginal spectrum was demonstrated. Compared with the FFT method, the physical meaning and the frequency resolution of the marginal spectrum were further analyzed. Then the Hilbert spectrum estimation algorithm was discussed in detail, and the simulation results were given at last. The theory and simulation show that, for short data signals and non-stationary signals, the frequency resolution and estimation precision of the HHT method are better than those of the FFT method. (authors)
Statistical inference a short course
Panik, Michael J
2012-01-01
A concise, easily accessible introduction to descriptive and inferential techniques. Statistical Inference: A Short Course offers a concise presentation of the essentials of basic statistics for readers seeking to acquire a working knowledge of statistical concepts, measures, and procedures. The author conducts tests on the assumption of randomness and normality, and provides nonparametric methods when parametric approaches might not work. The book also explores how to determine a confidence interval for a population median while also providing coverage of ratio estimation, randomness, and causal
Sangnawakij, Patarawan; Böhning, Dankmar; Adams, Stephen; Stanton, Michael; Holling, Heinz
2017-04-30
Statistical inference for analyzing the results from several independent studies on the same quantity of interest has been investigated frequently in recent decades. Typically, any meta-analytic inference requires that the quantity of interest is available from each study together with an estimate of its variability. The current work is motivated by a meta-analysis on comparing two treatments (thoracoscopic and open) of congenital lung malformations in young children. Quantities of interest include continuous end-points such as length of operation or number of chest tube days. As studies only report mean values (and no standard errors or confidence intervals), the question arises how meta-analytic inference can be developed. We suggest two methods to estimate study-specific variances in such a meta-analysis, where only sample means and sample sizes are available in the treatment arms. A general likelihood ratio test is derived for testing equality of variances in two groups. By means of simulation studies, the bias and estimated standard error of the overall mean difference from both methodologies are evaluated and compared with two existing approaches: complete study analysis only and partial variance information. The performance of the test is evaluated in terms of type I error. Additionally, we illustrate these methods in the meta-analysis on comparing thoracoscopic and open surgery for congenital lung malformations and in a meta-analysis on the change in renal function after kidney donation. Copyright © 2017 John Wiley & Sons, Ltd.
Kilborn, Joshua P; Jones, David L; Peebles, Ernst B; Naar, David F
2017-04-01
Clustering data continues to be a highly active area of data analysis, and resemblance profiles are being incorporated into ecological methodologies as a hypothesis testing-based approach to clustering multivariate data. However, these new clustering techniques have not been rigorously tested to determine the performance variability based on the algorithm's assumptions or any underlying data structures. Here, we use simulation studies to estimate the statistical error rates for the hypothesis test for multivariate structure based on dissimilarity profiles (DISPROF). We concurrently tested a widely used algorithm that employs the unweighted pair group method with arithmetic mean (UPGMA) to estimate the proficiency of clustering with DISPROF as a decision criterion. We simulated unstructured multivariate data from different probability distributions with increasing numbers of objects and descriptors, and grouped data with increasing overlap, overdispersion for ecological data, and correlation among descriptors within groups. Using simulated data, we measured the resolution and correspondence of clustering solutions achieved by DISPROF with UPGMA against the reference grouping partitions used to simulate the structured test datasets. Our results highlight the dynamic interactions between dataset dimensionality, group overlap, and the properties of the descriptors within a group (i.e., overdispersion or correlation structure) that are relevant to resemblance profiles as a clustering criterion for multivariate data. These methods are particularly useful for multivariate ecological datasets that benefit from distance-based statistical analyses. We propose guidelines for using DISPROF as a clustering decision tool that will help future users avoid potential pitfalls during the application of methods and the interpretation of results.
Akkermans, Simen; Logist, Filip; Van Impe, Jan F
2018-04-01
When building models to describe the effect of environmental conditions on the microbial growth rate, parameter estimations can be performed either with a one-step method, i.e., directly on the cell density measurements, or in a two-step method, i.e., via the estimated growth rates. The two-step method is often preferred due to its simplicity. The current research demonstrates that the two-step method is, however, only valid if the correct data transformation is applied and a strict experimental protocol is followed for all experiments. Based on a simulation study and a mathematical derivation, it was demonstrated that the logarithm of the growth rate should be used as a variance stabilizing transformation. Moreover, the one-step method leads to a more accurate estimation of the model parameters and a better approximation of the confidence intervals on the estimated parameters. Therefore, the one-step method is preferred and the two-step method should be avoided. Copyright © 2017. Published by Elsevier Ltd.
Estimation of the Effects of Statistical Discrimination on the Gender Wage Gap
Atsuko Tanaka
2015-01-01
How much of the gender wage gap can be attributed to statistical discrimination? Applying an employer learning model and Instrumental Variable (IV) estimation strategy to Japanese panel data, I examine how women's generally weak labor force attachment affects wages when employers cannot easily observe an individual's labor force intentions. To overcome endogeneity issues, I use survey information on individual workers' intentions to continue working after having children and Japanese panel da...
Fundamentals of modern statistical methods substantially improving power and accuracy
Wilcox, Rand R
2001-01-01
Conventional statistical methods have a very serious flaw. They routinely miss differences among groups or associations among variables that are detected by more modern techniques - even under very small departures from normality. Hundreds of journal articles have described the reasons standard techniques can be unsatisfactory, but simple, intuitive explanations are generally unavailable. Improved methods have been derived, but they are far from obvious or intuitive based on the training most researchers receive. Situations arise where even highly nonsignificant results become significant when analyzed with more modern methods. Without assuming any prior training in statistics, Part I of this book describes basic statistical principles from a point of view that makes their shortcomings intuitive and easy to understand. The emphasis is on verbal and graphical descriptions of concepts. Part II describes modern methods that address the problems covered in Part I. Using data from actual studies, many examples are include...
International Nuclear Information System (INIS)
Gillet, M.
1986-07-01
This thesis presents a study for the surveillance of the primary circuit water inventory of a pressurized water reactor. A reference model is developed for the development of an automatic system ensuring detection and real-time diagnosis. The methods adapted to our application are statistical tests and a pattern recognition method. The estimation of the detected anomalies is treated by the least-squares fit method and by filtering. A new projected optimization method with superlinear convergence is developed in this framework, and a segmented linearization of the model is introduced, in view of a multiple filtering. 46 refs. [fr]
Statistical methods for biodosimetry in the presence of both Berkson and classical measurement error
Miller, Austin
In radiation epidemiology, the true dose received by those exposed cannot be assessed directly. Physical dosimetry uses a deterministic function of the source term, distance and shielding to estimate dose. For the atomic bomb survivors, the physical dosimetry system is well established. The classical measurement errors plaguing the location and shielding inputs to the physical dosimetry system are well known. Adjusting for the associated biases requires an estimate for the classical measurement error variance, for which no data-driven estimate exists. In this case, an instrumental variable solution is the most viable option to overcome the classical measurement error indeterminacy. Biological indicators of dose may serve as instrumental variables. Specification of the biodosimeter dose-response model requires identification of the radiosensitivity variables, for which we develop statistical definitions and variables. More recently, researchers have recognized Berkson error in the dose estimates, introduced by averaging assumptions for many components in the physical dosimetry system. We show that Berkson error induces a bias in the instrumental variable estimate of the dose-response coefficient, and then address the estimation problem. This model is specified by developing an instrumental variable mixed measurement error likelihood function, which is then maximized using a Monte Carlo EM Algorithm. These methods produce dose estimates that incorporate information from both physical and biological indicators of dose, as well as the first instrumental variable based data-driven estimate for the classical measurement error variance.
International Nuclear Information System (INIS)
Corana, A.; Bortolan, G.; Casaleggio, A.
2004-01-01
We present and compare two automatic methods for dimension estimation from time series. Both methods, based on conceptually different approaches, work on the derivative of the bi-logarithmic plot of the correlation integral versus the correlation length (log-log plot). The first method searches for the most probable dimension values (MPDV) and associates to each of them a possible scaling region. The second one searches for the most flat intervals (MFI) in the derivative of the log-log plot. The automatic procedures include the evaluation of the candidate scaling regions using two reliability indices. The data set used to test the methods consists of time series from known model attractors with and without the addition of noise, structured time series, and electrocardiographic signals from the MIT-BIH ECG database. Statistical analysis of results was carried out by means of paired t-test, and no statistically significant differences were found in the large majority of the trials. Consistent results are also obtained dealing with 'difficult' time series. In general for a more robust and reliable estimate, the use of both methods may represent a good solution when time series from complex systems are analyzed. Although we present results for the correlation dimension only, the procedures can also be used for the automatic estimation of generalized q-order dimensions and pointwise dimension. We think that the proposed methods, eliminating the need of operator intervention, allow a faster and more objective analysis, thus improving the usefulness of dimension analysis for the characterization of time series obtained from complex dynamical systems
Statistical methods and challenges in connectome genetics
Pluta, Dustin
2018-03-12
The study of genetic influences on brain connectivity, known as connectome genetics, is an exciting new direction of research in imaging genetics. We here review recent results and current statistical methods in this area, and discuss some of the persistent challenges and possible directions for future work.
Literature in Focus: Statistical Methods in Experimental Physics
2007-01-01
Frederick James was a high-energy physicist who became the CERN "expert" on statistics and is now well-known around the world, in part for this famous text. The first edition of Statistical Methods in Experimental Physics was originally co-written with four other authors and was published in 1971 by North Holland (now an imprint of Elsevier). It became such an important text that demand for it has continued for more than 30 years. Fred has updated it and it was released in a second edition by World Scientific in 2006. It is still a top seller and there is no exaggeration in calling it «the» reference on the subject. A full review of the title appeared in the October CERN Courier. Come and meet the author to hear more about how this book has flourished during its 35-year lifetime. Frederick James, Statistical Methods in Experimental Physics: Monday, 26th of November, 4 p.m., Council Chamber (Bldg. 503-1-001). The author will be introduced...
Heterogeneous Rock Simulation Using DIP-Micromechanics-Statistical Methods
Directory of Open Access Journals (Sweden)
H. Molladavoodi
2018-01-01
Rock as a natural material is heterogeneous. Rock material consists of minerals, crystals, cement, grains, and microcracks. Each component of rock has a different mechanical behavior under applied loading condition. Therefore, rock component distribution has an important effect on rock mechanical behavior, especially in the postpeak region. In this paper, the rock sample was studied by digital image processing (DIP), micromechanics, and statistical methods. Using image processing, volume fractions of the rock minerals composing the rock sample were evaluated precisely. The mechanical properties of the rock matrix were determined based on upscaling micromechanics. In order to consider the rock heterogeneities effect on mechanical behavior, the heterogeneity index was calculated in a framework of statistical method. A Weibull distribution function was fitted to the Young modulus distribution of minerals. Finally, statistical and Mohr–Coulomb strain-softening models were used simultaneously as a constitutive model in DEM code. The acoustic emission, strain energy release, and the effect of rock heterogeneities on the postpeak behavior process were investigated. The numerical results are in good agreement with experimental data.
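How a Weibull shape parameter can act as a heterogeneity index may be illustrated with a short sketch. It assumes that element-wise Young's moduli are drawn from a two-parameter Weibull distribution; the scale of 50 (read as GPa) and the shape values are invented, and the function names are hypothetical rather than taken from the paper.

```python
import math
import random
import statistics


def weibull_mean(scale, shape):
    """Mean of a two-parameter Weibull distribution."""
    return scale * math.gamma(1.0 + 1.0 / shape)


def weibull_cv(shape):
    """Coefficient of variation; it depends on the shape parameter only."""
    g1 = math.gamma(1.0 + 1.0 / shape)
    g2 = math.gamma(1.0 + 2.0 / shape)
    return math.sqrt(g2 / g1 ** 2 - 1.0)


def sample_moduli(scale, shape, n, seed=0):
    """Draw n element-wise moduli; random.weibullvariate takes (scale, shape)."""
    rng = random.Random(seed)
    return [rng.weibullvariate(scale, shape) for _ in range(n)]


SCALE_GPA = 50.0  # hypothetical characteristic Young's modulus
for shape in (2.0, 5.0, 10.0):
    moduli = sample_moduli(SCALE_GPA, shape, 20000)
    print(f"shape {shape:>4}: sample mean {statistics.mean(moduli):6.2f} GPa, "
          f"theoretical CV {weibull_cv(shape):.3f}")
```

A larger shape parameter gives a narrower spread of moduli, so the material behaves more homogeneously; a small shape parameter corresponds to a strongly heterogeneous sample.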
The Playground Game: Inquiry‐Based Learning About Research Methods and Statistics
Westera, Wim; Slootmaker, Aad; Kurvers, Hub
2014-01-01
The Playground Game is a web-based game that was developed for teaching research methods and statistics to nursing and social sciences students in higher education and vocational training. The complexity and abstract nature of research methods and statistics pose many challenges for students. The
McAlinden, Colm; Khadka, Jyoti; Pesudovs, Konrad
2011-07-01
The ever-expanding choice of ocular metrology and imaging equipment has driven research into the validity of their measurements. Consequently, studies of the agreement between two instruments or clinical tests have proliferated in the ophthalmic literature. It is important that researchers apply the appropriate statistical tests in agreement studies. Correlation coefficients are hazardous and should be avoided. The 'limits of agreement' method originally proposed by Altman and Bland in 1983 is the statistical procedure of choice. Its step-by-step use and practical considerations in relation to optometry and ophthalmology are detailed in addition to sample size considerations and statistical approaches to precision (repeatability or reproducibility) estimates. Ophthalmic & Physiological Optics © 2011 The College of Optometrists.
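The limits of agreement calculation recommended above is simple enough to sketch directly. The paired readings below are invented, the function name is hypothetical, and the multiplier 1.96 assumes approximately normally distributed differences, as in the usual presentation of the method.

```python
import statistics


def limits_of_agreement(method_a, method_b, z=1.96):
    """Return (lower limit, mean difference, upper limit) for paired readings."""
    if len(method_a) != len(method_b):
        raise ValueError("the two methods must be measured on the same subjects")
    differences = [a - b for a, b in zip(method_a, method_b)]
    bias = statistics.mean(differences)          # systematic difference
    spread = statistics.stdev(differences)       # sample SD of the differences
    return bias - z * spread, bias, bias + z * spread


# Hypothetical paired measurements from two instruments (same eyes, same units).
instrument_a = [43.1, 44.0, 42.6, 45.2, 43.8, 44.5, 42.9, 43.6]
instrument_b = [42.8, 44.3, 42.1, 45.0, 43.9, 44.0, 42.7, 43.1]
lower, bias, upper = limits_of_agreement(instrument_a, instrument_b)
print(f"bias {bias:.2f}, 95% limits of agreement {lower:.2f} to {upper:.2f}")
```

Unlike a correlation coefficient, these limits are expressed in the measurement units, so they can be compared directly with a clinically acceptable difference.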
Bayes linear statistics, theory & methods
Goldstein, Michael
2007-01-01
Bayesian methods combine information available from data with any prior information available from expert knowledge. The Bayes linear approach follows this path, offering a quantitative structure for expressing beliefs, and systematic methods for adjusting these beliefs, given observational data. The methodology differs from the full Bayesian methodology in that it establishes simpler approaches to belief specification and analysis based around expectation judgements. Bayes Linear Statistics presents an authoritative account of this approach, explaining the foundations, theory, methodology, and practicalities of this important field. The text provides a thorough coverage of Bayes linear analysis, from the development of the basic language to the collection of algebraic results needed for efficient implementation, with detailed practical examples. The book covers: The importance of partial prior specifications for complex problems where it is difficult to supply a meaningful full prior probability specification...
A Simple Sampling Method for Estimating the Accuracy of Large Scale Record Linkage Projects.
Boyd, James H; Guiver, Tenniel; Randall, Sean M; Ferrante, Anna M; Semmens, James B; Anderson, Phil; Dickinson, Teresa
2016-05-17
Record linkage techniques allow different data collections to be brought together to provide a wider picture of the health status of individuals. Ensuring high linkage quality is important to guarantee the quality and integrity of research. Current methods for measuring linkage quality typically focus on precision (the proportion of incorrect links), given the difficulty of measuring the proportion of false negatives. The aim of this work is to introduce and evaluate a sampling based method to estimate both precision and recall following record linkage. In the sampling based method, record-pairs from each threshold (including those below the identified cut-off for acceptance) are sampled and clerically reviewed. These results are then applied to the entire set of record-pairs, providing estimates of false positives and false negatives. This method was evaluated on a synthetically generated dataset, where the true match status (which records belonged to the same person) was known. The sampled estimates of linkage quality were relatively close to actual linkage quality metrics calculated for the whole synthetic dataset. The precision and recall measures for seven reviewers were very consistent with little variation in the clerical assessment results (overall agreement using the Fleiss Kappa statistics was 0.601). This method presents as a possible means of accurately estimating matching quality and refining linkages in population level linkage studies. The sampling approach is especially important for large project linkages where the number of record pairs produced may be very large often running into millions.
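The sampling-based estimate described above can be sketched as a stratified extrapolation: the match rate found by clerical review in each score band is scaled up to all record-pairs in that band. The band sizes and review counts below are invented, the class and function names are hypothetical, and the sketch uses the conventional definition of precision (the proportion of accepted links that are true matches).

```python
from dataclasses import dataclass


@dataclass
class Stratum:
    total_pairs: int       # all record-pairs in this score band
    sampled: int           # pairs sent for clerical review
    sampled_matches: int   # reviewed pairs judged to be true matches
    accepted: bool         # True if the band lies above the acceptance cut-off

    def estimated_matches(self):
        return self.total_pairs * self.sampled_matches / self.sampled


def estimate_precision_recall(strata):
    true_pos = sum(s.estimated_matches() for s in strata if s.accepted)
    false_neg = sum(s.estimated_matches() for s in strata if not s.accepted)
    accepted_pairs = sum(s.total_pairs for s in strata if s.accepted)
    precision = true_pos / accepted_pairs
    recall = true_pos / (true_pos + false_neg)
    return precision, recall


# Hypothetical score bands, from highest to lowest match score.
strata = [
    Stratum(total_pairs=1000, sampled=100, sampled_matches=99, accepted=True),
    Stratum(total_pairs=200, sampled=100, sampled_matches=80, accepted=True),
    Stratum(total_pairs=300, sampled=100, sampled_matches=10, accepted=False),
]
precision, recall = estimate_precision_recall(strata)
print(f"estimated precision {precision:.3f}, estimated recall {recall:.3f}")
```

Sampling below the cut-off is what makes a recall estimate possible, since missed matches can only be found among the rejected pairs.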
Designs and Methods for Association Studies and Population Size Inference in Statistical Genetics
DEFF Research Database (Denmark)
Waltoft, Berit Lindum
method provides a simple goodness of fit test by comparing the observed SFS with the expected SFS under a given model of population size changes. By the use of Monte Carlo estimation the expected time between coalescent events can be estimated and the expected SFS can thereby be evaluated. Using......). The OR is interpreted as the effect of an exposure on the probability of being diseased at the end of follow-up, while the interpretation of the IRR is the effect of an exposure on the probability of becoming diseased. Through a simulation study, the OR from a classical case-control study is shown to be an inconsistent...... the classical chi-square statistics we are able to infer single parameter models. Multiple parameter models, e.g. multiple epochs, are harder to identify. By introducing the inference of population size back in time as an inverse problem, the second procedure applies the theory of smoothing splines to infer...
Comparison of two perturbation methods to estimate the land surface modeling uncertainty
Su, H.; Houser, P.; Tian, Y.; Kumar, S.; Geiger, J.; Belvedere, D.
2007-12-01
In land surface modeling, it is almost impossible to simulate land surface processes without error, because the earth system is highly complex and the physics of land processes is not yet sufficiently understood. In most cases, people want to know not only the model output but also the uncertainty in the modeling, to estimate how reliable the modeling is. Ensemble perturbation is an effective way to estimate the uncertainty in land surface modeling, since land surface models are highly nonlinear, which makes an analytical approach inapplicable. The ideal perturbation noise follows a zero-mean Gaussian distribution; however, this requirement cannot be satisfied if the perturbed variables in the land surface model have physical bounds, because part of the perturbation noise has to be removed to feed the land surface model properly. Two different perturbation methods are employed in our study to investigate their impact on quantifying land surface modeling uncertainty, based on the Land Information System (LIS) framework developed by the NASA/GSFC land team. One perturbation method is the built-in algorithm named "STATIC" in LIS version 5; the other is a new perturbation algorithm which was recently developed to minimize the overall bias in the perturbation by incorporating additional information from the whole time series for the perturbed variable. The statistical properties of the perturbation noise generated by the two algorithms are investigated thoroughly using a large ensemble size on a NASA supercomputer, and the corresponding uncertainty estimates based on the two perturbation methods are then compared. Their further impacts on data assimilation are also discussed. Finally, an optimal perturbation method is suggested.
A robust method for estimating motorbike count based on visual information learning
Huynh, Kien C.; Thai, Dung N.; Le, Sach T.; Thoai, Nam; Hamamoto, Kazuhiko
2015-03-01
Estimating the number of vehicles in traffic videos is an important and challenging task in traffic surveillance, especially with a high level of occlusion between vehicles, e.g., in crowded urban areas with people and/or motorbikes. In such conditions, separating individual vehicles from foreground silhouettes often requires complicated computation [1][2][3]. Thus, the counting problem has gradually shifted towards drawing statistical inferences about target object density from their shape [4], local features [5], etc. These studies indicate a correlation between local features and the number of target objects. However, they are inadequate for constructing an accurate model for vehicle density estimation. In this paper, we present a reliable method that is robust to illumination changes and partial affine transformations. It can achieve high accuracy in the presence of occlusions. Firstly, local features are extracted from images of the scene using the Speeded-Up Robust Features (SURF) method. For each image, a global feature vector is computed using a Bag-of-Words model which is constructed from the local features above. Finally, a mapping between the extracted global feature vectors and their labels (the number of motorbikes) is learned. That mapping provides a strong prediction model for estimating the number of motorbikes in new images. The experimental results show that our proposed method achieves better accuracy than other methods.
The statistical process control methods - SPC
Directory of Open Access Journals (Sweden)
Floreková Ľubica
1998-03-01
Methods of statistical evaluation of quality, SPC (item 20 of the documentation system of quality control of the ISO norm, series 9000), of various processes, products and services belong among the basic qualitative methods that enable us to analyse and compare data pertaining to various quantitative parameters. They also enable us, on that basis, to propose suitable interventions with the aim of improving these processes, products and services. The theoretical basis and applicability of the principles of cause-and-effect diagnostics, Pareto analysis and the Lorenz curve, number distributions and frequency curves of random variable distributions, and Shewhart control charts are presented in the contribution.
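As an illustration of the Shewhart charts mentioned above, the following is a minimal sketch, assuming control limits set at the baseline mean plus or minus three sample standard deviations. The function names are hypothetical; production charts usually estimate sigma from subgroup ranges or moving ranges rather than from the overall standard deviation used here.

```python
import statistics


def control_limits(baseline, k=3.0):
    """Return (lower limit, centre line, upper limit) from in-control baseline data.

    Assumption: sigma is the plain sample standard deviation of the baseline.
    """
    center = statistics.mean(baseline)
    sigma = statistics.stdev(baseline)
    return center - k * sigma, center, center + k * sigma


def out_of_control(values, lcl, ucl):
    """Indices of observations falling outside the control limits."""
    return [i for i, v in enumerate(values) if v < lcl or v > ucl]
```

A point outside the limits signals variation that is unlikely to come from the process's ordinary noise and is a candidate for the "suitable interventions" the abstract refers to.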
The reduction method of statistic scale applied to study of climatic change
International Nuclear Information System (INIS)
Bernal Suarez, Nestor Ricardo; Molina Lizcano, Alicia; Martinez Collantes, Jorge; Pabon Jose Daniel
2000-01-01
In climate change studies, the global circulation models of the atmosphere (GCMAs) enable one to simulate the global climate, with the field variables being represented on grid points 300 km apart. One particular interest concerns the simulation of possible changes in rainfall and surface air temperature due to an assumed increase of greenhouse gases. However, the models yield the climatic projections on grid points that in most cases do not correspond to the sites of major interest. To achieve local estimates of the climatological variables, methods such as the one known as statistical downscaling are applied. In this article we show a case in point by applying canonical correlation analysis (CCA) to the Guajira Region in the northeast of Colombia.
Robustness of S1 statistic with Hodges-Lehmann for skewed distributions
Ahad, Nor Aishah; Yahaya, Sharipah Soaad Syed; Yin, Lee Ping
2016-10-01
Analysis of variance (ANOVA) is a commonly used parametric method for testing differences in means across more than two groups when the populations are normally distributed. ANOVA is highly inefficient under non-normal and heteroscedastic settings. When its assumptions are violated, researchers look for alternatives such as the nonparametric Kruskal-Wallis test or robust methods. This study focused on a flexible method, the S1 statistic, for comparing groups using the median as the location estimator. The S1 statistic was modified by substituting the median with the Hodges-Lehmann estimator and the default scale estimator with the variance of Hodges-Lehmann and MADn to produce two different test statistics for comparing groups. The bootstrap method was used for testing the hypotheses since the sampling distributions of these modified S1 statistics are unknown. The performance of the proposed statistics in terms of Type I error was measured and compared against the original S1 statistic, ANOVA and Kruskal-Wallis. The proposed procedures show improvement compared to the original statistic, especially under extremely skewed distributions.
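The two robust estimators named in this abstract can be sketched as follows. The abstract does not say which variant of the Hodges-Lehmann estimator was used, so this example assumes the one-sample form (the median of all pairwise Walsh averages, including each value paired with itself). MADn is taken as the median absolute deviation scaled by 1.4826, the usual constant for consistency with the normal standard deviation. Function names are illustrative.

```python
import statistics
from itertools import combinations_with_replacement


def hodges_lehmann(x):
    """One-sample Hodges-Lehmann estimator: median of the Walsh averages."""
    walsh = [(a + b) / 2 for a, b in combinations_with_replacement(x, 2)]
    return statistics.median(walsh)


def madn(x):
    """Normalised median absolute deviation about the median."""
    med = statistics.median(x)
    return 1.4826 * statistics.median([abs(v - med) for v in x])
```

For the sample `[1, 2, 3, 4, 100]` the Hodges-Lehmann estimate is 3 and MADn is 1.4826, whereas the mean (22) is dominated by the single outlier, which is the motivation for using such estimators with skewed data.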
System and method for traffic signal timing estimation
Dumazert, Julien; Claudel, Christian G.
2015-01-01
A method and system for estimating traffic signals. The method and system can include constructing trajectories of probe vehicles from GPS data emitted by the probe vehicles, estimating traffic signal cycles, combining the estimates, and computing the traffic signal timing by maximizing a scoring function based on the estimates. Estimating traffic signal cycles can be based on transition times of the probe vehicles starting after a traffic signal turns green.
System and method for traffic signal timing estimation
Dumazert, Julien
2015-12-30
A method and system for estimating traffic signals. The method and system can include constructing trajectories of probe vehicles from GPS data emitted by the probe vehicles, estimating traffic signal cycles, combining the estimates, and computing the traffic signal timing by maximizing a scoring function based on the estimates. Estimating traffic signal cycles can be based on transition times of the probe vehicles starting after a traffic signal turns green.
Statistical methods with applications to demography and life insurance
Khmaladze, Estáte V
2013-01-01
Suitable for statisticians, mathematicians, actuaries, and students interested in the problems of insurance and analysis of lifetimes, Statistical Methods with Applications to Demography and Life Insurance presents contemporary statistical techniques for analyzing life distributions and life insurance problems. It not only contains traditional material but also incorporates new problems and techniques not discussed in existing actuarial literature. The book mainly focuses on the analysis of an individual life and describes statistical methods based on empirical and related processes. Coverage ranges from analyzing the tails of distributions of lifetimes to modeling population dynamics with migrations. To help readers understand the technical points, the text covers topics such as the Stieltjes, Wiener, and Itô integrals. It also introduces other themes of interest in demography, including mixtures of distributions, analysis of longevity and extreme value theory, and the age structure of a population. In addi...
Estimating HIES Data through Ratio and Regression Methods for Different Sampling Designs
Directory of Open Access Journals (Sweden)
Faqir Muhammad
2007-01-01
In this study, different sampling designs are compared using the HIES data of North West Frontier Province (NWFP) for 2001-02 and 1998-99, collected from the Federal Bureau of Statistics, Statistical Division, Government of Pakistan, Islamabad. The performance of the estimators has also been considered using the bootstrap and jackknife. A two-stage stratified random sample design is adopted by HIES. In the first stage, enumeration blocks and villages are treated as the Primary Sampling Units (PSUs). The sample PSUs are selected with probability proportional to size. Secondary Sampling Units (SSUs), i.e., households, are selected by systematic sampling with a random start. HIES used a single study variable. We have compared the HIES technique with some other designs: Stratified Simple Random Sampling, Stratified Systematic Sampling, Stratified Ranked Set Sampling, and Stratified Two Phase Sampling. Ratio and regression methods were applied with two study variables: income (y) and household size (x). The jackknife and bootstrap are used for variance replication. Simple Random Sampling with sample sizes of 462 to 561 gave moderate variances by both jackknife and bootstrap. Systematic Sampling gave moderate variance with a sample size of 467. With the jackknife and Systematic Sampling, the variance of the regression estimator was greater than that of the ratio estimator for sample sizes of 467 to 631. At a sample size of 952, the variance of the ratio estimator becomes greater than that of the regression estimator. The most efficient design turns out to be Ranked Set Sampling. Ranked Set Sampling with jackknife and bootstrap gives minimum variance even with the smallest sample size (467). Two Phase Sampling performed poorly. Multi-stage sampling as applied by HIES gave large variances, especially if used with a single study variable.
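A ratio estimator with a delete-one jackknife variance, of the kind compared in this abstract, can be sketched as below. The sketch assumes simple random sampling and a known population mean of the auxiliary variable (`x_pop_mean`), not the stratified designs of the study. The names are illustrative, with y standing for income and x for household size.

```python
def ratio_estimate(y, x, x_pop_mean):
    """Ratio estimator of the population mean of y using auxiliary variable x."""
    return sum(y) / sum(x) * x_pop_mean


def jackknife_variance(y, x, x_pop_mean):
    """Delete-one jackknife variance of the ratio estimator."""
    n = len(y)
    leave_one_out = [
        ratio_estimate(y[:i] + y[i + 1:], x[:i] + x[i + 1:], x_pop_mean)
        for i in range(n)
    ]
    mean_loo = sum(leave_one_out) / n
    return (n - 1) / n * sum((t - mean_loo) ** 2 for t in leave_one_out)
```

When y is exactly proportional to x, every leave-one-out estimate is identical and the jackknife variance is zero; the less proportional the relationship, the larger the variance, which is one reason the regression estimator can overtake the ratio estimator in some designs.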
Watson, Kara M.; McHugh, Amy R.
2014-01-01
representative of the increased development of the last 20 years (1989–2008). The two different land- and water-use conditions were used as surrogates for development to determine whether there have been changes in low-flow statistics as a result of changes in development over time. The State was divided into two low-flow regression regions, the Coastal Plain and the non-coastal region, in order to improve the accuracy of the regression equations. The left-censored parametric survival regression method was used for the analyses to account for streamgages and partial-record stations that had zero flow values for some of the statistics. The average standard error of estimate for the 348 regression equations ranged from 16 to 340 percent. These regression equations and basin characteristics are presented in the U.S. Geological Survey (USGS) StreamStats Web-based geographic information system application. This tool allows users to click on an ungaged site on a stream in New Jersey and get the estimated flow-duration and low-flow frequency statistics. Additionally, the user can click on a streamgage or partial-record station and get the “at-site” streamflow statistics. The low-flow characteristics of a stream ultimately affect the use of the stream by humans. Specific information on the low-flow characteristics of streams is essential to water managers who deal with problems related to municipal and industrial water supply, fish and wildlife conservation, and dilution of wastewater.
Directory of Open Access Journals (Sweden)
Sadreyev Ruslan I
2004-08-01
Background: Profile-based analysis of multiple sequence alignments (MSA) allows for accurate comparison of protein families. Here, we address the problems of detecting statistically confident dissimilarities between (1) an MSA position and a set of predicted residue frequencies, and (2) two MSA positions. These problems are important for (i) evaluation and optimization of methods predicting residue occurrence at protein positions; (ii) detection of potentially misaligned regions in automatically produced alignments and their further refinement; and (iii) detection of sites that determine functional or structural specificity in two related families. Results: For problems (1) and (2), we propose analytical estimates of P-value and apply them to the detection of significant positional dissimilarities in various experimental situations. (a) We compare structure-based predictions of residue propensities at a protein position to the actual residue frequencies in the MSA of homologs. (b) We evaluate our method by the ability to detect erroneous position matches produced by an automatic sequence aligner. (c) We compare MSA positions that correspond to residues aligned by automatic structure aligners. (d) We compare MSA positions that are aligned by high-quality manual superposition of structures. Detected dissimilarities reveal shortcomings of the automatic methods for residue frequency prediction and alignment construction. For the high-quality structural alignments, the dissimilarities suggest sites of potential functional or structural importance. Conclusion: The proposed computational method is of significant potential value for the analysis of protein families.
CERN. Geneva
2005-01-01
The three lectures will present an introduction to statistical methods as used in High Energy Physics. As the time will be very limited, the course will seek mainly to define the important issues and to introduce the most widely used tools. Topics will include the interpretation and use of probability, estimation of parameters and testing of hypotheses.
CERN. Geneva
2004-01-01
The three lectures will present an introduction to statistical methods as used in High Energy Physics. As the time will be very limited, the course will seek mainly to define the important issues and to introduce the most widely used tools. Topics will include the interpretation and use of probability, estimation of parameters and testing of hypotheses.
Assaraf, Roland
2014-12-01
We show that the recently proposed correlated sampling without reweighting procedure extends the locality (asymptotic independence of the system size) of a physical property to the statistical fluctuations of its estimator. This makes the approach potentially vastly more efficient for computing space-localized properties in large systems compared with standard correlated methods. A proof is given for a large collection of noninteracting fragments. Calculations on hydrogen chains suggest that this behavior holds not only for systems displaying short-range correlations, but also for systems with long-range correlations.
Statistical Bayesian method for reliability evaluation based on ADT data
Lu, Dawei; Wang, Lizhi; Sun, Yusheng; Wang, Xiaohong
2018-05-01
Accelerated degradation testing (ADT) is frequently conducted in the laboratory to predict a product's reliability under normal operating conditions. Two kinds of methods, degradation path models and stochastic process models, are used to analyze degradation data, and the latter is the more popular. However, some limitations remain, such as an imprecise solution process and imprecise estimates of the degradation ratio, which may affect the accuracy of the acceleration model and the extrapolated value. Moreover, the conventional solution to this problem, the Bayesian method, loses key information when unifying the degradation data. In this paper, a new data processing and parameter inference method based on the Bayesian method is proposed to handle degradation data and solve the problems above. First, a Wiener process and an acceleration model are chosen. Second, the initial values of the degradation model and the parameters of the prior and posterior distributions under each stress level are calculated by updating and iterating the estimated values. Third, the lifetime and reliability values are estimated on the basis of the estimated parameters. Finally, a case study is provided to demonstrate the validity of the proposed method. The results illustrate that the proposed method is effective and accurate in estimating the lifetime and reliability of a product.
Evaluation of Statistical Methods for Modeling Historical Resource Production and Forecasting
Nanzad, Bolorchimeg
This master's thesis project consists of two parts. Part I of the project compares modeling of historical resource production and forecasting of future production trends using the logit/probit transform advocated by Rutledge (2011) with conventional Hubbert curve fitting, using global coal production as a case study. The conventional Hubbert/Gaussian method fits a curve to historical production data whereas a logit/probit transform uses a linear fit to a subset of transformed production data. Within the errors and limitations inherent in this type of statistical modeling, these methods provide comparable results. That is, despite the apparent goodness-of-fit achievable using the logit/probit methodology, neither approach provides a significant advantage over the other in either explaining the observed data or in making future projections. For mature production regions, those that have already substantially passed peak production, results obtained by either method are closely comparable and reasonable, and estimates of ultimately recoverable resources obtained by either method are consistent with geologically estimated reserves. In contrast, for immature regions, estimates of ultimately recoverable resources generated by either of these alternative methods are unstable and thus need to be used with caution. Although the logit/probit transform generates high quality-of-fit correspondence with historical production data, this approach provides no new information compared to conventional Gaussian or Hubbert-type models and may have the effect of masking the noise and/or instability in the data and the derived fits. In particular, production forecasts for immature or marginally mature production systems based on either method need to be regarded with considerable caution. Part II of the project investigates the utility of a novel alternative method for multicyclic Hubbert modeling tentatively termed "cycle-jumping" wherein overlap of multiple cycles is limited. The
Statistical Methods for Unusual Count Data: Examples From Studies of Microchimerism
Guthrie, Katherine A.; Gammill, Hilary S.; Kamper-Jørgensen, Mads; Tjønneland, Anne; Gadi, Vijayakrishna K.; Nelson, J. Lee; Leisenring, Wendy
2016-01-01
Natural acquisition of small amounts of foreign cells or DNA, referred to as microchimerism, occurs primarily through maternal-fetal exchange during pregnancy. Microchimerism can persist long-term and has been associated with both beneficial and adverse human health outcomes. Quantitative microchimerism data present challenges for statistical analysis, including a skewed distribution, excess zero values, and occasional large values. Methods for comparing microchimerism levels across groups while controlling for covariates are not well established. We compared statistical models for quantitative microchimerism values, applied to simulated data sets and 2 observed data sets, to make recommendations for analytic practice. Modeling the level of quantitative microchimerism as a rate via Poisson or negative binomial model with the rate of detection defined as a count of microchimerism genome equivalents per total cell equivalents tested utilizes all available data and facilitates a comparison of rates between groups. We found that both the marginalized zero-inflated Poisson model and the negative binomial model can provide unbiased and consistent estimates of the overall association of exposure or study group with microchimerism detection rates. The negative binomial model remains the more accessible of these 2 approaches; thus, we conclude that the negative binomial model may be most appropriate for analyzing quantitative microchimerism data. PMID:27769989
P.B., Mohite; R.B., Pandhare; S.G., Khanage
2012-01-01
Purpose: Lamivudine (LAM) and zidovudine (ZID) are nucleoside analogues used as antiretroviral agents. Both drugs are available in tablet dosage forms, with doses of 150 mg for LAM and 300 mg for ZID. Method: The method employed is based on first-order derivative spectroscopy. Wavelengths of 279 nm and 300 nm were selected for the estimation of lamivudine and zidovudine, respectively, from the first-order derivative spectra. The concentration of both drugs was determined by the proposed method. The results of the analysis have been validated statistically and by recovery studies as per ICH guidelines. Result: Both drugs obey Beer's law in the concentration range 10-50 μg mL⁻¹, with regression coefficients of 0.9998 and 0.9999, intercepts of -0.0677 and -0.0043, and slopes of 0.0457 and 0.0391 for LAM and ZID, respectively. The accuracy and reproducibility results are close to 100% with 2% RSD. Conclusion: A simple, accurate, precise, sensitive and economical procedure for the simultaneous estimation of lamivudine and zidovudine in tablet dosage form has been developed. PMID:24312779
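The calibration figures reported above can be turned into a concentration estimate by inverting the regression line. The sketch below assumes the regression has the form response = slope × concentration + intercept, with concentration in μg/mL and the response being the first-derivative amplitude at the selected wavelength; the abstract does not state the equation explicitly, so that form, the dictionary layout and the function name are assumptions.

```python
# (slope, intercept) per drug, taken from the abstract's reported values.
CALIBRATION = {"LAM": (0.0457, -0.0677), "ZID": (0.0391, -0.0043)}
LINEAR_RANGE = (10.0, 50.0)  # ug/mL range over which Beer's law was reported to hold


def concentration(drug, response):
    """Invert the calibration line; also report whether the result is in range."""
    slope, intercept = CALIBRATION[drug]
    conc = (response - intercept) / slope
    in_range = LINEAR_RANGE[0] <= conc <= LINEAR_RANGE[1]
    return conc, in_range
```

Under that assumed form, a LAM response of 0.8463 maps back to about 20 μg/mL. The range flag matters because a result outside 10-50 μg/mL is an extrapolation beyond the interval in which linearity was checked.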
Evaluation of Oceanic Transport Statistics By Use of Transient Tracers and Bayesian Methods
Trossman, D. S.; Thompson, L.; Mecking, S.; Bryan, F.; Peacock, S.
2013-12-01
Key variables that quantify the time scales over which atmospheric signals penetrate into the oceanic interior and their uncertainties are computed using Bayesian methods and transient tracers from both models and observations. First, the mean residence times, subduction rates, and formation rates of Subtropical Mode Water (STMW) and Subpolar Mode Water (SPMW) in the North Atlantic and Subantarctic Mode Water (SAMW) in the Southern Ocean are estimated by combining a model and observations of chlorofluorocarbon-11 (CFC-11) via Bayesian Model Averaging (BMA), a statistical technique that weights model estimates according to how closely they agree with observations. Second, a Bayesian method is presented to find two oceanic transport parameters associated with the age distribution of ocean waters, the transit-time distribution (TTD), by combining an eddying global ocean model's estimate of the TTD with hydrographic observations of CFC-11, temperature, and salinity. Uncertainties associated with objectively mapping irregularly spaced bottle data are quantified by making use of a thin-plate spline and then propagated via the two Bayesian techniques. It is found that the subduction of STMW, SPMW, and SAMW is mostly an advective process, but up to about one-third of STMW subduction is likely due to non-advective processes. Also, while the formation of STMW is mostly due to subduction, the formation of SPMW is mostly due to other processes. About half of the formation of SAMW is due to subduction and half is due to other processes. A combination of air-sea flux, acting on relatively short time scales, and turbulent mixing, acting on a wide range of time scales, is likely the dominant SPMW erosion mechanism. Air-sea flux is likely responsible for most STMW erosion, and turbulent mixing is likely responsible for most SAMW erosion. Two oceanic transport parameters, the mean age of a water parcel and the half-variance associated with the TTD, estimated using the model's tracers as
Bayat, Bardia; Zahraie, Banafsheh; Taghavi, Farahnaz; Nasseri, Mohsen
2013-08-01
Identification of spatial and spatiotemporal precipitation variations plays an important role in different hydrological applications such as missing data estimation. In this paper, the results of Bayesian maximum entropy (BME) and ordinary kriging (OK) are compared for modeling spatial and spatiotemporal variations of annual precipitation with and without incorporating elevation variations. The study area of this research is the Namak Lake watershed, located in the central part of Iran, with an area of approximately 90,000 km². The BME and OK methods have been used to model the spatial and spatiotemporal variations of precipitation in this watershed, and their performances have been evaluated using cross-validation statistics. The results of the case study show the superiority of BME over OK in both spatial and spatiotemporal modes: BME estimates are less biased and more accurate than OK estimates. The improvements in the BME estimates are mostly related to incorporating hard and soft data in the estimation process, which resulted in more detailed and reliable results. The estimation error variance of the BME results is lower than that of the OK estimates in the study area in both spatial and spatiotemporal modes.
Identification of mine waters by statistical multivariate methods
Energy Technology Data Exchange (ETDEWEB)
Mali, N [IGGG, Ljubljana (Slovenia)
1992-01-01
Three water-bearing aquifers are present in the Velenje lignite mine. The aquifer waters have differing chemical composition; a geochemical water analysis can therefore determine the source of mine water influx. Mine water samples from different locations in the mine were analyzed; the results for chemical content and electric conductivity of the mine water were statistically processed by means of the MICROGAS, SPSS-X and IN STATPAC computer programs, which apply three multivariate statistical methods (discriminant, cluster and factor analysis). Reliability of calculated values was determined with the Kolmogorov and Smirnov tests. It is concluded that laboratory analysis of single water samples can produce measurement errors, but statistical processing of water sample data can identify the origin and movement of mine water. 15 refs.
Max-Moerbeck, W.; Richards, J. L.; Hovatta, T.; Pavlidou, V.; Pearson, T. J.; Readhead, A. C. S.
2014-11-01
We present a practical implementation of a Monte Carlo method to estimate the significance of cross-correlations in unevenly sampled time series of data, whose statistical properties are modelled with a simple power-law power spectral density. This implementation builds on published methods; we introduce a number of improvements in the normalization of the cross-correlation function estimate and a bootstrap method for estimating the significance of the cross-correlations. A closely related matter is the estimation of a model for the light curves, which is critical for the significance estimates. We present a graphical and quantitative demonstration that uses simulations to show how common it is to get high cross-correlations for unrelated light curves with steep power spectral densities. This demonstration highlights the dangers of interpreting them as signs of a physical connection. We show that by using interpolation and the Hanning sampling window function we are able to reduce the effects of red-noise leakage and to recover steep simple power-law power spectral densities. We also introduce the use of a Neyman construction for the estimation of the errors in the power-law index of the power spectral density. This method provides a consistent way to estimate the significance of cross-correlations in unevenly sampled time series of data.
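The pitfall this abstract warns about, high cross-correlations between unrelated red-noise light curves, can be illustrated with a small Monte Carlo sketch. It is a simplification, not the authors' procedure: it uses evenly sampled random walks (whose power spectral density falls roughly as frequency to the power -2) as the red-noise model, and only the zero-lag Pearson correlation. All function names are hypothetical.

```python
import random
import statistics


def random_walk(n, rng):
    """Evenly sampled random walk, a simple stand-in for steep red noise."""
    x, out = 0.0, []
    for _ in range(n):
        x += rng.gauss(0.0, 1.0)
        out.append(x)
    return out


def pearson(a, b):
    """Zero-lag Pearson correlation coefficient of two equal-length series."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    num = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    den = (sum((u - ma) ** 2 for u in a) * sum((v - mb) ** 2 for v in b)) ** 0.5
    return num / den


def mc_significance(r_obs, n_points, n_sim=2000, seed=0):
    """Fraction of independent simulated pairs with |r| at least |r_obs|."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sim):
        r = pearson(random_walk(n_points, rng), random_walk(n_points, rng))
        if abs(r) >= abs(r_obs):
            hits += 1
    return hits / n_sim
```

Calling `mc_significance(0.5, 100)` typically returns a sizeable fraction rather than a value near zero, which shows why a correlation coefficient of 0.5 between two steep-spectrum light curves is weak evidence of a physical connection on its own.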
van de Glind, Esther M M; Willems, Hanna C; Eslami, Saeid; Abu-Hanna, Ameen; Lems, Willem F; Hooft, Lotty; de Rooij, Sophia E; Black, Dennis M; van Munster, Barbara C
2016-05-01
For physicians dealing with patients with a limited life expectancy, knowing the time to benefit (TTB) of preventive medication is essential to support treatment decisions. The aim of this study was to investigate the usefulness of statistical process control (SPC) for determining the TTB in relation to fracture risk with alendronate versus placebo in postmenopausal women. We performed a post hoc analysis of the Fracture Intervention Trial (FIT), a randomized, controlled trial that investigated the effect of alendronate versus placebo on fracture risk in postmenopausal women. We used SPC, a statistical method used for monitoring processes for quality control, to determine if and when the intervention group benefited significantly more than the control group. SPC discriminated between the normal variations over time in the numbers of fractures in both groups and the variations that were attributable to alendronate. The TTB was defined as the time point from which the cumulative difference in the number of clinical fractures remained greater than the upper control limit on the SPC chart. For the total group, the TTB was defined as 11 months. For patients aged ≥70 years, the TTB was 8 months [absolute risk reduction (ARR) = 1.4%]; for patients aged <70 years, it was 19 months (ARR = 0.7%). SPC is a clear and understandable graphical method to determine the TTB. Its main advantage is that there is no need to define a prespecified time point, as is the case in traditional survival analyses. Prescribing alendronate to patients who are aged ≥70 years is useful because the TTB shows that they will benefit after 8 months. Investigators should report the TTB to simplify clinical decision making.
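The time-to-benefit rule stated in this abstract (the time point from which the cumulative difference in fractures remains above the upper control limit) can be sketched as follows. The abstract does not say how the upper control limit was derived, so the sketch takes it as a given input; the function name and the monthly indexing are assumptions.

```python
def time_to_benefit(cumulative_diff, ucl):
    """Return the first month (1-based) from which the series stays above ucl.

    cumulative_diff[i] is the cumulative difference in events (control minus
    treatment) at month i + 1. Returns None if the final value is not above ucl.
    """
    ttb = None
    # Walk backwards from the last month while the series is above the limit.
    for month in range(len(cumulative_diff), 0, -1):
        if cumulative_diff[month - 1] > ucl:
            ttb = month
        else:
            break
    return ttb
```

Scanning from the end backwards makes the "remained greater" condition explicit: an early excursion above the limit that later falls back (month 3 in a series such as `[0, 1, 3, 2, 4, 5, 6]` with a limit of 2.5) does not count, and the sketch returns month 5.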
Wang, Hongkai; Stout, David B; Chatziioannou, Arion F
2012-01-01
Micro-CT is widely used in preclinical studies of small animals. Due to the low soft-tissue contrast in typical studies, segmentation of soft tissue organs from noncontrast enhanced micro-CT images is a challenging problem. Here, we propose an atlas-based approach for estimating the major organs in mouse micro-CT images. A statistical atlas of major trunk organs was constructed based on 45 training subjects. The statistical shape model technique was used to include inter-subject anatomical variations. The shape correlations between different organs were described using a conditional Gaussian model. For registration, first the high-contrast organs in micro-CT images were registered by fitting the statistical shape model, while the low-contrast organs were subsequently estimated from the high-contrast organs using the conditional Gaussian model. The registration accuracy was validated based on 23 noncontrast-enhanced and 45 contrast-enhanced micro-CT images. Three different accuracy metrics (Dice coefficient, organ volume recovery coefficient, and surface distance) were used for evaluation. The Dice coefficients vary from 0.45 ± 0.18 for the spleen to 0.90 ± 0.02 for the lungs, the volume recovery coefficients vary from 0.96 ± 0.10 for the liver to 1.30 ± 0.75 for the spleen, the surface distances vary from 0.18 ± 0.01 mm for the lungs to 0.72 ± 0.42 mm for the spleen. The registration accuracy of the statistical atlas was compared with two publicly available single-subject mouse atlases, i.e., the MOBY phantom and the DIGIMOUSE atlas, and the results proved that the statistical atlas is more accurate than the single atlases. To evaluate the influence of the training subject size, different numbers of training subjects were used for atlas construction and registration. The results showed an improvement of the registration accuracy when more training subjects were used for the atlas construction. The statistical atlas-based registration was also compared with
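Of the three accuracy metrics named in this abstract, the Dice coefficient is the simplest to state. The sketch below represents each segmented organ as a set of voxel indices, which is an assumption made for illustration; the function name is hypothetical.

```python
def dice(a, b):
    """Dice coefficient of two voxel sets: 2|A ∩ B| / (|A| + |B|)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty segmentations agree trivially
    return 2 * len(a & b) / (len(a) + len(b))
```

A value of 1 means perfect overlap and 0 means none, so the reported 0.90 for the lungs indicates close agreement while 0.45 for the spleen indicates that less than half of the combined volume overlaps.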
Uncertainty analysis with statistically correlated failure data
International Nuclear Information System (INIS)
Modarres, M.; Dezfuli, H.; Roush, M.L.
1987-01-01
The likelihood of occurrence of the top event of a fault tree, or of the sequences of an event tree, is estimated from the failure probabilities of the components that constitute the events of the fault/event tree. Component failure probabilities are subject to statistical uncertainties. In addition, there are cases where the failure data are statistically correlated. At present most fault tree calculations are based on uncorrelated component failure data. This chapter describes a methodology for assessing the probability intervals for the top event failure probability of fault trees or frequency of occurrence of event tree sequences when event failure data are statistically correlated. To estimate the mean and variance of the top event, a second-order system moment method is presented through Taylor series expansion, which provides an alternative to the normally used Monte Carlo method. For cases where component failure probabilities are statistically correlated, the Taylor expansion terms are treated properly. A moment-matching technique is used to obtain the probability distribution function of the top event by fitting the Johnson S_B distribution. The computer program CORRELATE was developed to perform the calculations necessary for the implementation of the method developed. (author)
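The moment-propagation idea can be illustrated on the smallest possible case. The sketch below is not the CORRELATE program: it assumes a hypothetical two-component AND gate, T = p1 * p2, and shows how a covariance term enters the Taylor-series estimates of the top-event mean and variance. The function name and arguments are illustrative only.

```python
def and_gate_moments(mu1, mu2, var1, var2, cov12=0.0):
    """Approximate mean and variance of T = p1 * p2 for correlated inputs.

    Assumption: a two-input AND gate, chosen only for illustration.
    """
    # Second-order Taylor expansion of the mean; for a pure product the
    # only second-order term is the covariance, so this mean is exact.
    mean = mu1 * mu2 + cov12
    # First-order Taylor expansion of the variance, including the
    # cross term that is dropped when data are assumed uncorrelated.
    variance = (mu2 ** 2) * var1 + (mu1 ** 2) * var2 + 2.0 * mu1 * mu2 * cov12
    return mean, variance
```

Setting cov12 to zero recovers the uncorrelated calculation, which makes the contribution of the correlation term easy to see by comparison.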
Development of a biometric method to estimate age on hand radiographs.
Remy, Floriane; Hossu, Gabriela; Cendre, Romain; Micard, Emilien; Mainard-Simard, Laurence; Felblinger, Jacques; Martrille, Laurent; Lalys, Loïc
2017-02-01
Estimating whether a living individual is younger than 13, 18 or 21 years, which are relevant legal ages in most European countries, is currently problematic in the forensic context. Numerous methods are available to legal authorities, although their efficiency is debatable. For those reasons, we aimed to propose a new method based on the biometric analysis of hand bones. In total, 451 hand radiographs of French individuals under the age of 21 were retrospectively analyzed. This total sample was divided into three subgroups bounded by the relevant legal ages previously mentioned: 0-13, 13-18 and 18-21 years. On these radiographs, we numerically applied the osteometric board method used in anthropology by enclosing each metacarpal and proximal phalanx of the five hand rays in the smallest possible rectangle. This gives access to their length and width through a measurement protocol developed specifically for this purpose with the ORS Visual ® software. A statistical analysis was then performed on these biometric data: a Linear Discriminant Analysis (LDA) evaluated the probability that an individual belongs to one of the age groups (0-13, 13-18 or 18-21), and several multivariate regression models were tested to establish age estimation formulas for each of these age groups. The mean correlation coefficient between chronological age and both lengths and widths of hand bones is 0.90 for the total sample. Repeatability and reproducibility were assessed. The LDA could more easily predict membership of the 0-13 age group. Age can be estimated with a mean standard error that never exceeds 1 year for the 95% confidence interval. Finally, compared with the literature, we conclude that estimating age from the biometric information of metacarpals and proximal phalanges is promising. Copyright © 2016. Published by Elsevier B.V.
REANALYSIS OF F-STATISTIC GRAVITATIONAL-WAVE SEARCHES WITH THE HIGHER CRITICISM STATISTIC
International Nuclear Information System (INIS)
Bennett, M. F.; Melatos, A.; Delaigle, A.; Hall, P.
2013-01-01
We propose a new method of gravitational-wave detection using a modified form of higher criticism, a statistical technique introduced by Donoho and Jin. Higher criticism is designed to detect a group of sparse, weak sources, none of which are strong enough to be reliably estimated or detected individually. We apply higher criticism as a second-pass method to synthetic F-statistic and C-statistic data for a monochromatic periodic source in a binary system and quantify the improvement relative to the first-pass methods. We find that higher criticism on C-statistic data is more sensitive by ∼6% than the C-statistic alone under optimal conditions (i.e., binary orbit known exactly) and the relative advantage increases as the error in the orbital parameters increases. Higher criticism is robust even when the source is not monochromatic (e.g., phase-wandering in an accreting system). Applying higher criticism to a phase-wandering source over multiple time intervals gives a ≳30% increase in detectability with few assumptions about the frequency evolution. By contrast, in all-sky searches for unknown periodic sources, which are dominated by the brightest source, second-pass higher criticism does not provide any benefits over a first-pass search.
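For orientation, the standard Donoho and Jin higher criticism statistic can be sketched as follows. This is the textbook form, not the modified form used in the paper; the function name, the alpha0 cutoff default, and the assumption that the first-pass statistics have already been converted to p-values are all illustrative choices.

```python
import math


def higher_criticism(p_values, alpha0=0.5):
    """Standard higher criticism statistic computed from a list of p-values.

    Only the smallest alpha0 fraction of the sorted p-values is scanned.
    """
    p_sorted = sorted(p_values)
    n = len(p_sorted)
    best = -math.inf
    for i, p in enumerate(p_sorted, start=1):
        if i > alpha0 * n:
            break
        if 0.0 < p < 1.0:  # skip degenerate p-values to avoid dividing by zero
            # Standardized gap between the empirical fraction i/n and the
            # i-th smallest p-value, which is large when a few p-values
            # are much smaller than a uniform null would produce.
            z = math.sqrt(n) * (i / n - p) / math.sqrt(p * (1.0 - p))
            best = max(best, z)
    return best
```

A large value indicates an excess of small p-values relative to the null, even when no single p-value would pass an individual detection threshold.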
METHODOLOGICAL PRINCIPLES AND METHODS OF TERMS OF TRADE STATISTICAL EVALUATION
Directory of Open Access Journals (Sweden)
N. Kovtun
2014-09-01
The paper studies the methodological principles and guidance of the statistical evaluation of terms of trade for the United Nations classification model – Harmonized Commodity Description and Coding System (HS). The proposed three-stage model of index analysis and estimation of terms of trade is implemented in practice for Ukraine's commodity-members for the period 2011-2012.
Method for Estimating Water Withdrawals for Livestock in the United States, 2005
Lovelace, John K.
2009-01-01
Livestock water use includes ground water and surface water associated with livestock watering, feedlots, dairy operations, and other on-farm needs. The water may be used for drinking, cooling, sanitation, waste disposal, and other needs related to the animals. Estimates of water withdrawals for livestock are needed for water planning and management. This report documents a method used to estimate withdrawals of fresh ground water and surface water for livestock in 2005 for each county and county equivalent in the United States, Puerto Rico, and the U.S. Virgin Islands. Categories of livestock included dairy cattle, beef and other cattle, hogs and pigs, laying hens, broilers and other chickens, turkeys, sheep and lambs, all goats, and horses (including ponies, mules, burros, and donkeys). Use of the method described in this report could result in more consistent water-withdrawal estimates for livestock that can be used by water managers and planners to determine water needs and trends across the United States. Water withdrawals for livestock in 2005 were estimated by using water-use coefficients, in gallons per head per day for each animal type, and livestock-population data. Coefficients for various livestock for most States were obtained from U.S. Geological Survey water-use program personnel or U.S. Geological Survey water-use publications. When no coefficient was available for an animal type in a State, the median value of reported coefficients for that animal was used. Livestock-population data were provided by the National Agricultural Statistics Service. County estimates were further divided into ground-water and surface-water withdrawals for each county and county equivalent. County totals from 2005 were compared to county totals from 1995 and 2000. Large deviations from 1995 or 2000 livestock withdrawal estimates were investigated and generally were due to comparison with reported withdrawals, differences in estimation techniques, differences in livestock
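The coefficient-based calculation described in this report can be sketched in a few lines. The function names, the data layout, and all numbers in the usage note are hypothetical, chosen only to show the arithmetic: withdrawals are head counts multiplied by per-head coefficients, and a missing state coefficient is replaced by the median of the reported ones.

```python
from statistics import median


def fill_missing_coefficients(coeffs_by_state):
    """Replace missing (None) coefficients for one animal type by the median.

    coeffs_by_state maps a state name to gallons per head per day, or None.
    """
    reported = [c for c in coeffs_by_state.values() if c is not None]
    fallback = median(reported)
    return {state: (c if c is not None else fallback)
            for state, c in coeffs_by_state.items()}


def county_withdrawal_mgd(head_counts, coefficients):
    """Total livestock withdrawal for one county in million gallons per day.

    head_counts maps animal type to number of head; coefficients maps
    animal type to gallons per head per day.
    """
    gallons_per_day = sum(head_counts[animal] * coefficients[animal]
                          for animal in head_counts)
    return gallons_per_day / 1e6
```

For example, with made-up inputs of 1000 dairy cattle at 35 gallons per head per day and 5000 hogs at 4 gallons per head per day, the county total is 55,000 gallons per day, or 0.055 million gallons per day.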
Students' Attitudes toward Statistics across the Disciplines: A Mixed-Methods Approach
Griffith, James D.; Adams, Lea T.; Gu, Lucy L.; Hart, Christian L.; Nichols-Whitehead, Penney
2012-01-01
Students' attitudes toward statistics were investigated using a mixed-methods approach including a discovery-oriented qualitative methodology among 684 undergraduate students across business, criminal justice, and psychology majors where at least one course in statistics was required. Students were asked about their attitudes toward statistics and…
A Fast Soft Bit Error Rate Estimation Method
Directory of Open Access Journals (Sweden)
Ait-Idir Tarik
2010-01-01
In a previous publication, we suggested a method to estimate the Bit Error Rate (BER) of a digital communications system instead of using the well-known Monte Carlo (MC) simulation. This method was based on the estimation of the probability density function (pdf) of soft observed samples. The kernel method was used for the pdf estimation. In this paper, we suggest using a Gaussian Mixture (GM) model. The Expectation Maximisation algorithm is used to estimate the parameters of this mixture. The optimal number of Gaussians is computed by using Mutual Information Theory. The analytical expression of the BER is therefore simply given by using the different estimated parameters of the Gaussian Mixture. Simulation results are presented to compare the three mentioned methods: Monte Carlo, Kernel and Gaussian Mixture. We analyze the performance of the proposed BER estimator in the framework of a multiuser code division multiple access system and show that attractive performance is achieved compared with conventional MC or Kernel aided techniques. The results show that the GM method can drastically reduce the number of samples needed to estimate the BER, and hence the required simulation run-time, even at very low BER.
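Once the mixture parameters are fitted, the BER follows in closed form. The sketch below assumes a setting the abstract does not spell out: antipodal signalling with a hard decision threshold at zero, and a mixture fitted to the soft samples observed when a +1 bit was sent. Under those assumptions the error probability is the mixture mass below zero. The function names are illustrative.

```python
import math


def q_function(x):
    """Gaussian tail probability Q(x) = P(Z > x) for a standard normal Z."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def ber_from_mixture(weights, means, stds):
    """Probability that a soft sample falls below the zero threshold.

    Assumes the mixture models soft outputs conditioned on a +1 bit, so
    each component contributes weight * Q(mean / std).
    """
    return sum(w * q_function(m / s) for w, m, s in zip(weights, means, stds))
```

Because the result is an analytical expression in the fitted parameters, very low error rates can be evaluated without observing any actual errors in the sample, which is where the saving over Monte Carlo counting comes from.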
Fabbri, A; Sinding-Larsen, R
1988-01-01
This volume contains the edited papers prepared by lecturers and participants of the NATO Advanced Study Institute on "Statistical Treatments for Estimation of Mineral and Energy Resources" held at Il Ciocco (Lucca), Italy, June 22 - July 4, 1986. During the past twenty years, tremendous efforts have been made to acquire quantitative geoscience information from ore deposits, geochemical, geophysical and remotely-sensed measurements. In October 1981, a two-day symposium on "Quantitative Resource Evaluation" and a three-day workshop on "Interactive Systems for Multivariate Analysis and Image Processing for Resource Evaluation" were held in Ottawa, jointly sponsored by the Geological Survey of Canada, the International Association for Mathematical Geology, and the International Geological Correlation Programme. Thirty scientists from different countries in Europe and North America were invited to form a forum for the discussion of quantitative methods for mineral and energy resource assessment. Since then, not ...
Identifying User Profiles from Statistical Grouping Methods
Directory of Open Access Journals (Sweden)
Francisco Kelsen de Oliveira
2018-02-01
This research aimed to group users into subgroups according to their levels of knowledge about technology. Hierarchical and non-hierarchical statistical clustering methods were studied, compared and used to create the subgroups from similarities in the users' technology skill levels. The research sample consisted of teachers who answered online questionnaires about their skills in using educationally oriented software and hardware. The statistical grouping methods were applied and showed the possible groupings of the users. Analysis of these groups made it possible to identify the common characteristics among the individuals of each subgroup. It was therefore possible to define two subgroups of users, one with skill in technology and another with little skill. The partial results of the research showed that the two main grouping algorithms had 92% similarity in forming the groups of users with skill in technology and with little skill, confirming the accuracy of the techniques for discriminating between individuals.
Statistics for Locally Scaled Point Patterns
DEFF Research Database (Denmark)
Prokesová, Michaela; Hahn, Ute; Vedel Jensen, Eva B.
2006-01-01
scale factor. The main emphasis of the present paper is on analysis of such models. Statistical methods are developed for estimation of scaling function and template parameters as well as for model validation. The proposed methods are assessed by simulation and used in the analysis of a vegetation...
Han, Fang; Liu, Han
2016-01-01
Correlation matrices play a key role in many multivariate methods (e.g., graphical model estimation and factor analysis). The current state-of-the-art in estimating large correlation matrices focuses on the use of Pearson's sample correlation matrix. Although Pearson's sample correlation matrix enjoys various good properties under Gaussian models, it is not an effective estimator when facing heavy-tailed distributions. As a robust alternative, Han and Liu [J. Am. Stat. Assoc. 109 (2015) 275-2...
Estimation methods for statistical process control
Schoonhoven, M.
2011-01-01
Marit Schoonhoven investigated estimation methods used to determine the control limits in a control chart. In practice, a control chart is used to establish whether a process is functioning normally. It is a graph in which measurements on the vertical axis are plotted against time on the horizontal axis, supplemented with an upper and a lower control limit. When a measurement falls outside the control limits, the control chart gives a signal. To determine the control limits, the ...
Austin, Peter C
2008-09-01
Propensity-score matching is frequently used in the cardiology literature. Recent systematic reviews have found that this method is, in general, poorly implemented in the medical literature. The study objective was to examine the quality of the implementation of propensity-score matching in the general cardiology literature. A total of 44 articles published in the American Heart Journal, the American Journal of Cardiology, Circulation, the European Heart Journal, Heart, the International Journal of Cardiology, and the Journal of the American College of Cardiology between January 1, 2004, and December 31, 2006, were examined. Twenty of the 44 studies did not provide adequate information on how the propensity-score-matched pairs were formed. Fourteen studies did not report whether matching on the propensity score balanced baseline characteristics between treated and untreated subjects in the matched sample. Only 4 studies explicitly used statistical methods appropriate for matched studies to compare baseline characteristics between treated and untreated subjects. Only 11 (25%) of the 44 studies explicitly used statistical methods appropriate for the analysis of matched data when estimating the effect of treatment on the outcomes. Only 2 studies described the matching method used, assessed balance in baseline covariates by appropriate methods, and used appropriate statistical methods to estimate the treatment effect and its significance. Application of propensity-score matching was poor in the cardiology literature. Suggestions for improving the reporting and analysis of studies that use propensity-score matching are provided.
Vortex methods and vortex statistics
International Nuclear Information System (INIS)
Chorin, A.J.
1993-05-01
Vortex methods originated from the observation that in incompressible, inviscid, isentropic flow vorticity (or, more accurately, circulation) is a conserved quantity, as can be readily deduced from the absence of tangential stresses. Thus if the vorticity is known at time t = 0, one can deduce the flow at a later time by simply following it around. In this narrow context, a vortex method is a numerical method that makes use of this observation. Even more generally, the analysis of vortex methods leads to problems that are closely related to problems in quantum physics and field theory, as well as in harmonic analysis. A broad enough definition of vortex methods ends up encompassing much of science. Even the purely computational aspects of vortex methods encompass a range of ideas for which vorticity may not be the best unifying theme. The author restricts himself in these lectures to a special class of numerical vortex methods, those that are based on a Lagrangian transport of vorticity in hydrodynamics by smoothed particles ("blobs") and those whose understanding contributes to the understanding of blob methods. Vortex methods for inviscid flow lead to systems of ordinary differential equations that can be readily clothed in Hamiltonian form, both in three and two space dimensions, and they can preserve exactly a number of invariants of the Euler equations, including topological invariants. Their viscous versions resemble Langevin equations. As a result, they provide a very useful cartoon of statistical hydrodynamics, i.e., of turbulence, one that can to some extent be analyzed analytically and, more importantly, explored numerically, with important implications also for superfluids, superconductors, and even polymers. In the author's view, vortex "blob" methods provide the most promising path to the understanding of these phenomena.
Shirley, Natalie R; Ramirez Montes, Paula Andrea
2015-01-01
The purpose of this study was to assess observer error in phase versus component-based scoring systems used to develop age estimation methods in forensic anthropology. A method preferred by forensic anthropologists in the AAFS was selected for this evaluation (the Suchey-Brooks method for the pubic symphysis). The Suchey-Brooks descriptions were used to develop a corresponding component-based scoring system for comparison. Several commonly used reliability statistics (kappa, weighted kappa, and the intraclass correlation coefficient) were calculated to assess observer agreement between two observers and to evaluate the efficacy of each of these statistics for this study. The linear weighted kappa was determined to be the most suitable measure of observer agreement. The results show that a component-based system offers the possibility for more objective scoring than a phase system as long as the coding possibilities for each trait do not exceed three states of expression, each with as little overlap as possible. © 2014 American Academy of Forensic Sciences.
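The linear weighted kappa favoured in this study can be computed directly from two observers' scores. The sketch below uses the usual definition with disagreement weights |i - j| / (k - 1); the function name and the encoding of phases or component states as integers 0..k-1 are assumptions made for illustration, and the code does not reproduce the study's data.

```python
def linear_weighted_kappa(ratings_a, ratings_b, n_categories):
    """Cohen's kappa with linear disagreement weights for two observers.

    Ratings are integers in 0..n_categories-1, one pair per specimen.
    Undefined (division by zero) if both observers use a single category.
    """
    n = len(ratings_a)
    k = n_categories
    # Observed joint proportions of (observer A score, observer B score).
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        observed[a][b] += 1.0 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            w = abs(i - j) / (k - 1)  # larger penalty for scores further apart
            weighted_observed += w * observed[i][j]
            weighted_expected += w * row[i] * col[j]  # chance agreement
    return 1.0 - weighted_observed / weighted_expected
```

Unlike the unweighted kappa, a disagreement between adjacent states is penalized less than one between distant states, which suits ordered scoring systems such as phases.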
Ridge Distance Estimation in Fingerprint Images: Algorithm and Performance Evaluation
Directory of Open Access Journals (Sweden)
Tian Jie
2004-01-01
It is important to accurately estimate the ridge distance, an intrinsic texture property of a fingerprint image. Up to now, only a few articles have directly addressed ridge distance estimation. Little has been published providing detailed evaluation of methods for ridge distance estimation, in particular the traditional spectral analysis method applied in the frequency domain. In this paper, a novel method on nonoverlapping blocks, called the statistical method, is presented to estimate the ridge distance. Direct estimation ratio (DER) and estimation accuracy (EA) are defined and used as parameters along with time consumption (TC) to evaluate performance of these two methods for ridge distance estimation. Based on comparison of performances of these two methods, a third hybrid method is developed to combine the merits of both methods. Experimental results indicate that DER is 44.7%, 63.8%, and 80.6%; EA is 84%, 93%, and 91%; and TC is , , and seconds, with the spectral analysis method, statistical method, and hybrid method, respectively.
Mathematical and Statistical Methods for Actuarial Sciences and Finance
Legros, Florence; Perna, Cira; Sibillo, Marilena
2017-01-01
This volume gathers selected peer-reviewed papers presented at the international conference "MAF 2016 – Mathematical and Statistical Methods for Actuarial Sciences and Finance", held in Paris (France) at the Université Paris-Dauphine from March 30 to April 1, 2016. The contributions highlight new ideas on mathematical and statistical methods in actuarial sciences and finance. The cooperation between mathematicians and statisticians working in insurance and finance is a very fruitful field, one that yields unique theoretical models and practical applications, as well as new insights in the discussion of problems of national and international interest. This volume is addressed to academicians, researchers, Ph.D. students and professionals.
A method for statistical steady state thermal analysis of reactor cores
International Nuclear Information System (INIS)
Whetton, P.A.
1980-01-01
This paper presents a method for performing a statistical steady state thermal analysis of a reactor core. The technique is only outlined here since detailed thermal equations are dependent on the core geometry. The method has been applied to a pressurised water reactor core and the results are presented for illustration purposes. Random hypothetical cores are generated using the Monte-Carlo method. The technique shows that by splitting the parameters into two types, denoted core-wise and in-core, the Monte Carlo method may be used inexpensively. The idea of using extremal statistics to characterise the low probability events (i.e. the tails of a distribution) is introduced together with a method of forming the final probability distribution. After establishing an acceptable probability of exceeding a thermal design criterion, the final probability distribution may be used to determine the corresponding thermal response value. If statistical and deterministic (i.e. conservative) thermal response values are compared, information on the degree of pessimism in the deterministic method of analysis may be inferred and the restrictive performance limitations imposed by this method relieved. (orig.)
A MONTE-CARLO METHOD FOR ESTIMATING THE CORRELATION EXPONENT
MIKOSCH, T; WANG, QA
We propose a Monte Carlo method for estimating the correlation exponent of a stationary ergodic sequence. The estimator can be considered as a bootstrap version of the classical Hill estimator. A simulation study shows that the method yields reasonable estimates.
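For reference, the classical Hill estimator that the proposed method builds on can be sketched as below. This is only the classical estimator applied to a sample of positive values; it is not the authors' bootstrap estimator of the correlation exponent, and the function name and the choice of k are illustrative.

```python
import math


def hill_estimator(samples, k):
    """Classical Hill estimate from the k largest order statistics.

    Returns the mean log-excess over the (k+1)-th largest value, which
    estimates the reciprocal of the tail index for heavy-tailed data.
    """
    ordered = sorted(samples, reverse=True)
    if not 1 <= k < len(ordered):
        raise ValueError("k must satisfy 1 <= k < number of samples")
    threshold = ordered[k]  # the (k+1)-th largest observation
    if threshold <= 0.0:
        raise ValueError("the k+1 largest samples must be positive")
    return sum(math.log(ordered[i] / threshold) for i in range(k)) / k
```

The estimate depends on k, so in practice it is usually examined over a range of k values rather than read off at a single one.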
Applied systems ecology: models, data, and statistical methods
Energy Technology Data Exchange (ETDEWEB)
Eberhardt, L L
1976-01-01
In this report, systems ecology is largely equated to mathematical or computer simulation modelling. The need for models in ecology stems from the necessity to have an integrative device for the diversity of ecological data, much of which is observational, rather than experimental, as well as from the present lack of a theoretical structure for ecology. Different objectives in applied studies require specialized methods. The best predictive devices may be regression equations, often non-linear in form, extracted from much more detailed models. A variety of statistical aspects of modelling, including sampling, are discussed. Several aspects of population dynamics and food-chain kinetics are described, and it is suggested that the two presently separated approaches should be combined into a single theoretical framework. It is concluded that future efforts in systems ecology should emphasize actual data and statistical methods, as well as modelling.
Estimation of potential evapotranspiration of a coastal savannah environment; comparison of methods
International Nuclear Information System (INIS)
Asare, D.K.; Ayeh, E.O.; Amenorpe, G.; Banini, G.K.
2011-01-01
Six potential evapotranspiration (PET) models, namely Penman-Monteith, Hargreaves-Samani, Priestley-Taylor, IRMAK1, IRMAK2 and TURC, were used to estimate daily PET values at Atomic-Kwabenya in the coastal savannah environment of Ghana for the year 2005. The study compared PET values generated by the six models and identified which ones compared favourably with the Penman-Monteith model, which is the recommended standard method for estimating PET. Cross comparison analysis showed that only the daily estimates of PET of the Hargreaves-Samani model correlated reasonably (r = 0.82) with estimates by the Penman-Monteith model. Additionally, PET values by the Priestley-Taylor and TURC models were highly correlated (r = 0.99), as were those generated by the IRMAK2 and TURC models (r = 0.96). Statistical analysis, based on pair comparison of means, showed that daily PET estimates of the Penman-Monteith model were not different from those of the Priestley-Taylor model for the Atomic-Kwabenya area located in the coastal savannah environment of Ghana. The Priestley-Taylor model can be used, in place of the Penman-Monteith model, to estimate daily PET for the Atomic-Kwabenya area of the coastal savannah environment of Ghana. The Hargreaves-Samani model can also be used to estimate PET for the study area because its PET estimates correlated reasonably with those of the Penman-Monteith model (r = 0.82) and it requires only air temperature measurements as inputs. (au)
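The temperature-only character of the Hargreaves-Samani model is easy to see from its usual published form, sketched below. The function name and the input values in the usage note are hypothetical and are not data from the study. The extraterrestrial radiation term is not a measurement: it is computed from latitude and day of year, and is passed in here as an evaporation equivalent.

```python
import math


def hargreaves_samani_pet(t_max_c, t_min_c, ra_mm_per_day):
    """Daily PET in mm/day from the standard Hargreaves-Samani equation.

    t_max_c, t_min_c: daily maximum and minimum air temperature (deg C).
    ra_mm_per_day: extraterrestrial radiation as mm/day of evaporation.
    """
    t_mean = 0.5 * (t_max_c + t_min_c)
    return 0.0023 * ra_mm_per_day * (t_mean + 17.8) * math.sqrt(t_max_c - t_min_c)
```

For instance, with illustrative inputs of 32 and 24 degrees Celsius and a radiation term of 15 mm/day, the function returns roughly 4.47 mm/day.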
A note on the kappa statistic for clustered dichotomous data.
Zhou, Ming; Yang, Zhao
2014-06-30
The kappa statistic is widely used to assess the agreement between two raters. Motivated by a simulation-based cluster bootstrap method to calculate the variance of the kappa statistic for clustered physician-patients dichotomous data, we investigate its special correlation structure and develop a new simple and efficient data generation algorithm. For the clustered physician-patients dichotomous data, based on the delta method and its special covariance structure, we propose a semi-parametric variance estimator for the kappa statistic. An extensive Monte Carlo simulation study is performed to evaluate the performance of the new proposal and five existing methods with respect to the empirical coverage probability, root-mean-square error, and average width of the 95% confidence interval for the kappa statistic. The variance estimator ignoring the dependence within a cluster is generally inappropriate, and the variance estimators from the new proposal, bootstrap-based methods, and the sampling-based delta method perform reasonably well for at least a moderately large number of clusters (e.g., the number of clusters K ⩾ 50). The new proposal and sampling-based delta method provide convenient tools for efficient computations and non-simulation-based alternatives to the existing bootstrap-based methods. Moreover, the new proposal has acceptable performance even when the number of clusters is as small as K = 25. To illustrate the practical application of all the methods, one psychiatric research data set and two simulated clustered physician-patients dichotomous data sets are analyzed. Copyright © 2014 John Wiley & Sons, Ltd.
Approximate maximum likelihood estimation for population genetic inference.
Bertl, Johanna; Ewing, Gregory; Kosiol, Carolin; Futschik, Andreas
2017-11-27
In many population genetic problems, parameter estimation is obstructed by an intractable likelihood function. Therefore, approximate estimation methods have been developed, and with growing computational power, sampling-based methods became popular. However, these methods such as Approximate Bayesian Computation (ABC) can be inefficient in high-dimensional problems. This led to the development of more sophisticated iterative estimation methods like particle filters. Here, we propose an alternative approach that is based on stochastic approximation. By moving along a simulated gradient or ascent direction, the algorithm produces a sequence of estimates that eventually converges to the maximum likelihood estimate, given a set of observed summary statistics. This strategy does not sample much from low-likelihood regions of the parameter space, and is fast, even when many summary statistics are involved. We put considerable efforts into providing tuning gui