Fearon, Elizabeth; Chabata, Sungai T; Thompson, Jennifer A; Cowan, Frances M; Hargreaves, James R
2017-09-14
While guidance exists for obtaining population size estimates using multiplier methods with respondent-driven sampling surveys, we lack specific guidance for making sample size decisions. To guide the design of multiplier method population size estimation studies using respondent-driven sampling surveys to reduce the random error around the estimate obtained. The population size estimate is obtained by dividing the number of individuals receiving a service or the number of unique objects distributed (M) by the proportion of individuals in a representative survey who report receipt of the service or object (P). We have developed an approach to sample size calculation, interpreting methods to estimate the variance around estimates obtained using multiplier methods in conjunction with research into design effects and respondent-driven sampling. We describe an application to estimate the number of female sex workers in Harare, Zimbabwe. There is high variance in estimates. Random error around the size estimate reflects uncertainty from M and P, particularly when the estimate of P in the respondent-driven sampling survey is low. As expected, sample size requirements are higher when the design effect of the survey is assumed to be greater. We suggest a method for investigating the effects of sample size on the precision of a population size estimate obtained using multipler methods and respondent-driven sampling. Uncertainty in the size estimate is high, particularly when P is small, so balancing against other potential sources of bias, we advise researchers to consider longer service attendance reference periods and to distribute more unique objects, which is likely to result in a higher estimate of P in the respondent-driven sampling survey. ©Elizabeth Fearon, Sungai T Chabata, Jennifer A Thompson, Frances M Cowan, James R Hargreaves. Originally published in JMIR Public Health and Surveillance (http://publichealth.jmir.org), 14.09.2017.
Sample size estimation and sampling techniques for selecting a representative sample
Directory of Open Access Journals (Sweden)
Aamir Omair
2014-01-01
Full Text Available Introduction: The purpose of this article is to provide a general understanding of the concepts of sampling as applied to health-related research. Sample Size Estimation: It is important to select a representative sample in quantitative research in order to be able to generalize the results to the target population. The sample should be of the required sample size and must be selected using an appropriate probability sampling technique. There are many hidden biases which can adversely affect the outcome of the study. Important factors to consider for estimating the sample size include the size of the study population, confidence level, expected proportion of the outcome variable (for categorical variables/standard deviation of the outcome variable (for numerical variables, and the required precision (margin of accuracy from the study. The more the precision required, the greater is the required sample size. Sampling Techniques: The probability sampling techniques applied for health related research include simple random sampling, systematic random sampling, stratified random sampling, cluster sampling, and multistage sampling. These are more recommended than the nonprobability sampling techniques, because the results of the study can be generalized to the target population.
Estimation of sample size and testing power (Part 4).
Hu, Liang-ping; Bao, Xiao-lei; Guan, Xue; Zhou, Shi-guo
2012-01-01
Sample size estimation is necessary for any experimental or survey research. An appropriate estimation of sample size based on known information and statistical knowledge is of great significance. This article introduces methods of sample size estimation of difference test for data with the design of one factor with two levels, including sample size estimation formulas and realization based on the formulas and the POWER procedure of SAS software for quantitative data and qualitative data with the design of one factor with two levels. In addition, this article presents examples for analysis, which will play a leading role for researchers to implement the repetition principle during the research design phase.
Estimation of sample size and testing power (part 5).
Hu, Liang-ping; Bao, Xiao-lei; Guan, Xue; Zhou, Shi-guo
2012-02-01
Estimation of sample size and testing power is an important component of research design. This article introduced methods for sample size and testing power estimation of difference test for quantitative and qualitative data with the single-group design, the paired design or the crossover design. To be specific, this article introduced formulas for sample size and testing power estimation of difference test for quantitative and qualitative data with the above three designs, the realization based on the formulas and the POWER procedure of SAS software and elaborated it with examples, which will benefit researchers for implementing the repetition principle.
Estimating Sample Size for Usability Testing
Directory of Open Access Journals (Sweden)
Alex Cazañas
2017-02-01
Full Text Available One strategy used to assure that an interface meets user requirements is to conduct usability testing. When conducting such testing one of the unknowns is sample size. Since extensive testing is costly, minimizing the number of participants can contribute greatly to successful resource management of a project. Even though a significant number of models have been proposed to estimate sample size in usability testing, there is still not consensus on the optimal size. Several studies claim that 3 to 5 users suffice to uncover 80% of problems in a software interface. However, many other studies challenge this assertion. This study analyzed data collected from the user testing of a web application to verify the rule of thumb, commonly known as the “magic number 5”. The outcomes of the analysis showed that the 5-user rule significantly underestimates the required sample size to achieve reasonable levels of problem detection.
Determining Sample Size for Accurate Estimation of the Squared Multiple Correlation Coefficient.
Algina, James; Olejnik, Stephen
2000-01-01
Discusses determining sample size for estimation of the squared multiple correlation coefficient and presents regression equations that permit determination of the sample size for estimating this parameter for up to 20 predictor variables. (SLD)
Optimum sample size to estimate mean parasite abundance in fish parasite surveys
Directory of Open Access Journals (Sweden)
Shvydka S.
2018-03-01
Full Text Available To reach ethically and scientifically valid mean abundance values in parasitological and epidemiological studies this paper considers analytic and simulation approaches for sample size determination. The sample size estimation was carried out by applying mathematical formula with predetermined precision level and parameter of the negative binomial distribution estimated from the empirical data. A simulation approach to optimum sample size determination aimed at the estimation of true value of the mean abundance and its confidence interval (CI was based on the Bag of Little Bootstraps (BLB. The abundance of two species of monogenean parasites Ligophorus cephali and L. mediterraneus from Mugil cephalus across the Azov-Black Seas localities were subjected to the analysis. The dispersion pattern of both helminth species could be characterized as a highly aggregated distribution with the variance being substantially larger than the mean abundance. The holistic approach applied here offers a wide range of appropriate methods in searching for the optimum sample size and the understanding about the expected precision level of the mean. Given the superior performance of the BLB relative to formulae with its few assumptions, the bootstrap procedure is the preferred method. Two important assessments were performed in the present study: i based on CIs width a reasonable precision level for the mean abundance in parasitological surveys of Ligophorus spp. could be chosen between 0.8 and 0.5 with 1.6 and 1x mean of the CIs width, and ii the sample size equal 80 or more host individuals allows accurate and precise estimation of mean abundance. Meanwhile for the host sample size in range between 25 and 40 individuals, the median estimates showed minimal bias but the sampling distribution skewed to the low values; a sample size of 10 host individuals yielded to unreliable estimates.
Sampling strategies for estimating brook trout effective population size
Andrew R. Whiteley; Jason A. Coombs; Mark Hudy; Zachary Robinson; Keith H. Nislow; Benjamin H. Letcher
2012-01-01
The influence of sampling strategy on estimates of effective population size (Ne) from single-sample genetic methods has not been rigorously examined, though these methods are increasingly used. For headwater salmonids, spatially close kin association among age-0 individuals suggests that sampling strategy (number of individuals and location from...
Voss, Sebastian; Zimmermann, Beate; Zimmermann, Alexander
2016-09-01
In the last decades, an increasing number of studies analyzed spatial patterns in throughfall by means of variograms. The estimation of the variogram from sample data requires an appropriate sampling scheme: most importantly, a large sample and a layout of sampling locations that often has to serve both variogram estimation and geostatistical prediction. While some recommendations on these aspects exist, they focus on Gaussian data and high ratios of the variogram range to the extent of the study area. However, many hydrological data, and throughfall data in particular, do not follow a Gaussian distribution. In this study, we examined the effect of extent, sample size, sampling design, and calculation method on variogram estimation of throughfall data. For our investigation, we first generated non-Gaussian random fields based on throughfall data with large outliers. Subsequently, we sampled the fields with three extents (plots with edge lengths of 25 m, 50 m, and 100 m), four common sampling designs (two grid-based layouts, transect and random sampling) and five sample sizes (50, 100, 150, 200, 400). We then estimated the variogram parameters by method-of-moments (non-robust and robust estimators) and residual maximum likelihood. Our key findings are threefold. First, the choice of the extent has a substantial influence on the estimation of the variogram. A comparatively small ratio of the extent to the correlation length is beneficial for variogram estimation. Second, a combination of a minimum sample size of 150, a design that ensures the sampling of small distances and variogram estimation by residual maximum likelihood offers a good compromise between accuracy and efficiency. Third, studies relying on method-of-moments based variogram estimation may have to employ at least 200 sampling points for reliable variogram estimates. These suggested sample sizes exceed the number recommended by studies dealing with Gaussian data by up to 100 %. Given that most previous
Sample size for estimation of the Pearson correlation coefficient in cherry tomato tests
Directory of Open Access Journals (Sweden)
Bruno Giacomini Sari
2017-09-01
Full Text Available ABSTRACT: The aim of this study was to determine the required sample size for estimation of the Pearson coefficient of correlation between cherry tomato variables. Two uniformity tests were set up in a protected environment in the spring/summer of 2014. The observed variables in each plant were mean fruit length, mean fruit width, mean fruit weight, number of bunches, number of fruits per bunch, number of fruits, and total weight of fruits, with calculation of the Pearson correlation matrix between them. Sixty eight sample sizes were planned for one greenhouse and 48 for another, with the initial sample size of 10 plants, and the others were obtained by adding five plants. For each planned sample size, 3000 estimates of the Pearson correlation coefficient were obtained through bootstrap re-samplings with replacement. The sample size for each correlation coefficient was determined when the 95% confidence interval amplitude value was less than or equal to 0.4. Obtaining estimates of the Pearson correlation coefficient with high precision is difficult for parameters with a weak linear relation. Accordingly, a larger sample size is necessary to estimate them. Linear relations involving variables dealing with size and number of fruits per plant have less precision. To estimate the coefficient of correlation between productivity variables of cherry tomato, with a confidence interval of 95% equal to 0.4, it is necessary to sample 275 plants in a 250m² greenhouse, and 200 plants in a 200m² greenhouse.
Effects of sample size on estimates of population growth rates calculated with matrix models.
Directory of Open Access Journals (Sweden)
Ian J Fiske
Full Text Available BACKGROUND: Matrix models are widely used to study the dynamics and demography of populations. An important but overlooked issue is how the number of individuals sampled influences estimates of the population growth rate (lambda calculated with matrix models. Even unbiased estimates of vital rates do not ensure unbiased estimates of lambda-Jensen's Inequality implies that even when the estimates of the vital rates are accurate, small sample sizes lead to biased estimates of lambda due to increased sampling variance. We investigated if sampling variability and the distribution of sampling effort among size classes lead to biases in estimates of lambda. METHODOLOGY/PRINCIPAL FINDINGS: Using data from a long-term field study of plant demography, we simulated the effects of sampling variance by drawing vital rates and calculating lambda for increasingly larger populations drawn from a total population of 3842 plants. We then compared these estimates of lambda with those based on the entire population and calculated the resulting bias. Finally, we conducted a review of the literature to determine the sample sizes typically used when parameterizing matrix models used to study plant demography. CONCLUSIONS/SIGNIFICANCE: We found significant bias at small sample sizes when survival was low (survival = 0.5, and that sampling with a more-realistic inverse J-shaped population structure exacerbated this bias. However our simulations also demonstrate that these biases rapidly become negligible with increasing sample sizes or as survival increases. For many of the sample sizes used in demographic studies, matrix models are probably robust to the biases resulting from sampling variance of vital rates. However, this conclusion may depend on the structure of populations or the distribution of sampling effort in ways that are unexplored. We suggest more intensive sampling of populations when individual survival is low and greater sampling of stages with high
Effects of sample size on estimates of population growth rates calculated with matrix models.
Fiske, Ian J; Bruna, Emilio M; Bolker, Benjamin M
2008-08-28
Matrix models are widely used to study the dynamics and demography of populations. An important but overlooked issue is how the number of individuals sampled influences estimates of the population growth rate (lambda) calculated with matrix models. Even unbiased estimates of vital rates do not ensure unbiased estimates of lambda-Jensen's Inequality implies that even when the estimates of the vital rates are accurate, small sample sizes lead to biased estimates of lambda due to increased sampling variance. We investigated if sampling variability and the distribution of sampling effort among size classes lead to biases in estimates of lambda. Using data from a long-term field study of plant demography, we simulated the effects of sampling variance by drawing vital rates and calculating lambda for increasingly larger populations drawn from a total population of 3842 plants. We then compared these estimates of lambda with those based on the entire population and calculated the resulting bias. Finally, we conducted a review of the literature to determine the sample sizes typically used when parameterizing matrix models used to study plant demography. We found significant bias at small sample sizes when survival was low (survival = 0.5), and that sampling with a more-realistic inverse J-shaped population structure exacerbated this bias. However our simulations also demonstrate that these biases rapidly become negligible with increasing sample sizes or as survival increases. For many of the sample sizes used in demographic studies, matrix models are probably robust to the biases resulting from sampling variance of vital rates. However, this conclusion may depend on the structure of populations or the distribution of sampling effort in ways that are unexplored. We suggest more intensive sampling of populations when individual survival is low and greater sampling of stages with high elasticities.
Estimation of sample size and testing power (Part 3).
Hu, Liang-ping; Bao, Xiao-lei; Guan, Xue; Zhou, Shi-guo
2011-12-01
This article introduces the definition and sample size estimation of three special tests (namely, non-inferiority test, equivalence test and superiority test) for qualitative data with the design of one factor with two levels having a binary response variable. Non-inferiority test refers to the research design of which the objective is to verify that the efficacy of the experimental drug is not clinically inferior to that of the positive control drug. Equivalence test refers to the research design of which the objective is to verify that the experimental drug and the control drug have clinically equivalent efficacy. Superiority test refers to the research design of which the objective is to verify that the efficacy of the experimental drug is clinically superior to that of the control drug. By specific examples, this article introduces formulas of sample size estimation for the three special tests, and their SAS realization in detail.
Estimation of sample size and testing power (part 6).
Hu, Liang-ping; Bao, Xiao-lei; Guan, Xue; Zhou, Shi-guo
2012-03-01
The design of one factor with k levels (k ≥ 3) refers to the research that only involves one experimental factor with k levels (k ≥ 3), and there is no arrangement for other important non-experimental factors. This paper introduces the estimation of sample size and testing power for quantitative data and qualitative data having a binary response variable with the design of one factor with k levels (k ≥ 3).
Sample size methods for estimating HIV incidence from cross-sectional surveys.
Konikoff, Jacob; Brookmeyer, Ron
2015-12-01
Understanding HIV incidence, the rate at which new infections occur in populations, is critical for tracking and surveillance of the epidemic. In this article, we derive methods for determining sample sizes for cross-sectional surveys to estimate incidence with sufficient precision. We further show how to specify sample sizes for two successive cross-sectional surveys to detect changes in incidence with adequate power. In these surveys biomarkers such as CD4 cell count, viral load, and recently developed serological assays are used to determine which individuals are in an early disease stage of infection. The total number of individuals in this stage, divided by the number of people who are uninfected, is used to approximate the incidence rate. Our methods account for uncertainty in the durations of time spent in the biomarker defined early disease stage. We find that failure to account for this uncertainty when designing surveys can lead to imprecise estimates of incidence and underpowered studies. We evaluated our sample size methods in simulations and found that they performed well in a variety of underlying epidemics. Code for implementing our methods in R is available with this article at the Biometrics website on Wiley Online Library. © 2015, The International Biometric Society.
Desu, M M
2012-01-01
One of the most important problems in designing an experiment or a survey is sample size determination and this book presents the currently available methodology. It includes both random sampling from standard probability distributions and from finite populations. Also discussed is sample size determination for estimating parameters in a Bayesian setting by considering the posterior distribution of the parameter and specifying the necessary requirements. The determination of the sample size is considered for ranking and selection problems as well as for the design of clinical trials. Appropria
B-graph sampling to estimate the size of a hidden population
Spreen, M.; Bogaerts, S.
2015-01-01
Link-tracing designs are often used to estimate the size of hidden populations by utilizing the relational links between their members. A major problem in studies of hidden populations is the lack of a convenient sampling frame. The most frequently applied design in studies of hidden populations is
Evaluating the performance of species richness estimators: sensitivity to sample grain size
DEFF Research Database (Denmark)
Hortal, Joaquín; Borges, Paulo A. V.; Gaspar, Clara
2006-01-01
and several recent estimators [proposed by Rosenzweig et al. (Conservation Biology, 2003, 17, 864-874), and Ugland et al. (Journal of Animal Ecology, 2003, 72, 888-897)] performed poorly. 3. Estimations developed using the smaller grain sizes (pair of traps, traps, records and individuals) presented similar....... Data obtained with standardized sampling of 78 transects in natural forest remnants of five islands were aggregated in seven different grains (i.e. ways of defining a single sample): islands, natural areas, transects, pairs of traps, traps, database records and individuals to assess the effect of using...
Ellison, Laura E.; Lukacs, Paul M.
2014-01-01
Concern for migratory tree-roosting bats in North America has grown because of possible population declines from wind energy development. This concern has driven interest in estimating population-level changes. Mark-recapture methodology is one possible analytical framework for assessing bat population changes, but sample size requirements to produce reliable estimates have not been estimated. To illustrate the sample sizes necessary for a mark-recapture-based monitoring program we conducted power analyses using a statistical model that allows reencounters of live and dead marked individuals. We ran 1,000 simulations for each of five broad sample size categories in a Burnham joint model, and then compared the proportion of simulations in which 95% confidence intervals overlapped between and among years for a 4-year study. Additionally, we conducted sensitivity analyses of sample size to various capture probabilities and recovery probabilities. More than 50,000 individuals per year would need to be captured and released to accurately determine 10% and 15% declines in annual survival. To detect more dramatic declines of 33% or 50% survival over four years, then sample sizes of 25,000 or 10,000 per year, respectively, would be sufficient. Sensitivity analyses reveal that increasing recovery of dead marked individuals may be more valuable than increasing capture probability of marked individuals. Because of the extraordinary effort that would be required, we advise caution should such a mark-recapture effort be initiated because of the difficulty in attaining reliable estimates. We make recommendations for what techniques show the most promise for mark-recapture studies of bats because some techniques violate the assumptions of mark-recapture methodology when used to mark bats.
Power and sample-size estimation for microbiome studies using pairwise distances and PERMANOVA.
Kelly, Brendan J; Gross, Robert; Bittinger, Kyle; Sherrill-Mix, Scott; Lewis, James D; Collman, Ronald G; Bushman, Frederic D; Li, Hongzhe
2015-08-01
The variation in community composition between microbiome samples, termed beta diversity, can be measured by pairwise distance based on either presence-absence or quantitative species abundance data. PERMANOVA, a permutation-based extension of multivariate analysis of variance to a matrix of pairwise distances, partitions within-group and between-group distances to permit assessment of the effect of an exposure or intervention (grouping factor) upon the sampled microbiome. Within-group distance and exposure/intervention effect size must be accurately modeled to estimate statistical power for a microbiome study that will be analyzed with pairwise distances and PERMANOVA. We present a framework for PERMANOVA power estimation tailored to marker-gene microbiome studies that will be analyzed by pairwise distances, which includes: (i) a novel method for distance matrix simulation that permits modeling of within-group pairwise distances according to pre-specified population parameters; (ii) a method to incorporate effects of different sizes within the simulated distance matrix; (iii) a simulation-based method for estimating PERMANOVA power from simulated distance matrices; and (iv) an R statistical software package that implements the above. Matrices of pairwise distances can be efficiently simulated to satisfy the triangle inequality and incorporate group-level effects, which are quantified by the adjusted coefficient of determination, omega-squared (ω2). From simulated distance matrices, available PERMANOVA power or necessary sample size can be estimated for a planned microbiome study. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Wan, Xiang; Wang, Wenqian; Liu, Jiming; Tong, Tiejun
2014-12-19
In systematic reviews and meta-analysis, researchers often pool the results of the sample mean and standard deviation from a set of similar clinical trials. A number of the trials, however, reported the study using the median, the minimum and maximum values, and/or the first and third quartiles. Hence, in order to combine results, one may have to estimate the sample mean and standard deviation for such trials. In this paper, we propose to improve the existing literature in several directions. First, we show that the sample standard deviation estimation in Hozo et al.'s method (BMC Med Res Methodol 5:13, 2005) has some serious limitations and is always less satisfactory in practice. Inspired by this, we propose a new estimation method by incorporating the sample size. Second, we systematically study the sample mean and standard deviation estimation problem under several other interesting settings where the interquartile range is also available for the trials. We demonstrate the performance of the proposed methods through simulation studies for the three frequently encountered scenarios, respectively. For the first two scenarios, our method greatly improves existing methods and provides a nearly unbiased estimate of the true sample standard deviation for normal data and a slightly biased estimate for skewed data. For the third scenario, our method still performs very well for both normal data and skewed data. Furthermore, we compare the estimators of the sample mean and standard deviation under all three scenarios and present some suggestions on which scenario is preferred in real-world applications. In this paper, we discuss different approximation methods in the estimation of the sample mean and standard deviation and propose some new estimation methods to improve the existing literature. We conclude our work with a summary table (an Excel spread sheet including all formulas) that serves as a comprehensive guidance for performing meta-analysis in different
A simple nomogram for sample size for estimating sensitivity and specificity of medical tests
Directory of Open Access Journals (Sweden)
Malhotra Rajeev
2010-01-01
Full Text Available Sensitivity and specificity measure inherent validity of a diagnostic test against a gold standard. Researchers develop new diagnostic methods to reduce the cost, risk, invasiveness, and time. Adequate sample size is a must to precisely estimate the validity of a diagnostic test. In practice, researchers generally decide about the sample size arbitrarily either at their convenience, or from the previous literature. We have devised a simple nomogram that yields statistically valid sample size for anticipated sensitivity or anticipated specificity. MS Excel version 2007 was used to derive the values required to plot the nomogram using varying absolute precision, known prevalence of disease, and 95% confidence level using the formula already available in the literature. The nomogram plot was obtained by suitably arranging the lines and distances to conform to this formula. This nomogram could be easily used to determine the sample size for estimating the sensitivity or specificity of a diagnostic test with required precision and 95% confidence level. Sample size at 90% and 99% confidence level, respectively, can also be obtained by just multiplying 0.70 and 1.75 with the number obtained for the 95% confidence level. A nomogram instantly provides the required number of subjects by just moving the ruler and can be repeatedly used without redoing the calculations. This can also be applied for reverse calculations. This nomogram is not applicable for testing of the hypothesis set-up and is applicable only when both diagnostic test and gold standard results have a dichotomous category.
Estimating the Effective Sample Size of Tree Topologies from Bayesian Phylogenetic Analyses
Lanfear, Robert; Hua, Xia; Warren, Dan L.
2016-01-01
Bayesian phylogenetic analyses estimate posterior distributions of phylogenetic tree topologies and other parameters using Markov chain Monte Carlo (MCMC) methods. Before making inferences from these distributions, it is important to assess their adequacy. To this end, the effective sample size (ESS) estimates how many truly independent samples of a given parameter the output of the MCMC represents. The ESS of a parameter is frequently much lower than the number of samples taken from the MCMC because sequential samples from the chain can be non-independent due to autocorrelation. Typically, phylogeneticists use a rule of thumb that the ESS of all parameters should be greater than 200. However, we have no method to calculate an ESS of tree topology samples, despite the fact that the tree topology is often the parameter of primary interest and is almost always central to the estimation of other parameters. That is, we lack a method to determine whether we have adequately sampled one of the most important parameters in our analyses. In this study, we address this problem by developing methods to estimate the ESS for tree topologies. We combine these methods with two new diagnostic plots for assessing posterior samples of tree topologies, and compare their performance on simulated and empirical data sets. Combined, the methods we present provide new ways to assess the mixing and convergence of phylogenetic tree topologies in Bayesian MCMC analyses. PMID:27435794
Luo, Shezhou; Chen, Jing M; Wang, Cheng; Xi, Xiaohuan; Zeng, Hongcheng; Peng, Dailiang; Li, Dong
2016-05-30
Vegetation leaf area index (LAI), height, and aboveground biomass are key biophysical parameters. Corn is an important and globally distributed crop, and reliable estimations of these parameters are essential for corn yield forecasting, health monitoring and ecosystem modeling. Light Detection and Ranging (LiDAR) is considered an effective technology for estimating vegetation biophysical parameters. However, the estimation accuracies of these parameters are affected by multiple factors. In this study, we first estimated corn LAI, height and biomass (R2 = 0.80, 0.874 and 0.838, respectively) using the original LiDAR data (7.32 points/m2), and the results showed that LiDAR data could accurately estimate these biophysical parameters. Second, comprehensive research was conducted on the effects of LiDAR point density, sampling size and height threshold on the estimation accuracy of LAI, height and biomass. Our findings indicated that LiDAR point density had an important effect on the estimation accuracy for vegetation biophysical parameters, however, high point density did not always produce highly accurate estimates, and reduced point density could deliver reasonable estimation results. Furthermore, the results showed that sampling size and height threshold were additional key factors that affect the estimation accuracy of biophysical parameters. Therefore, the optimal sampling size and the height threshold should be determined to improve the estimation accuracy of biophysical parameters. Our results also implied that a higher LiDAR point density, larger sampling size and height threshold were required to obtain accurate corn LAI estimation when compared with height and biomass estimations. In general, our results provide valuable guidance for LiDAR data acquisition and estimation of vegetation biophysical parameters using LiDAR data.
Lusiana, Evellin Dewi
2017-12-01
The parameters of binary probit regression model are commonly estimated by using Maximum Likelihood Estimation (MLE) method. However, MLE method has limitation if the binary data contains separation. Separation is the condition where there are one or several independent variables that exactly grouped the categories in binary response. It will result the estimators of MLE method become non-convergent, so that they cannot be used in modeling. One of the effort to resolve the separation is using Firths approach instead. This research has two aims. First, to identify the chance of separation occurrence in binary probit regression model between MLE method and Firths approach. Second, to compare the performance of binary probit regression model estimator that obtained by MLE method and Firths approach using RMSE criteria. Those are performed using simulation method and under different sample size. The results showed that the chance of separation occurrence in MLE method for small sample size is higher than Firths approach. On the other hand, for larger sample size, the probability decreased and relatively identic between MLE method and Firths approach. Meanwhile, Firths estimators have smaller RMSE than MLEs especially for smaller sample sizes. But for larger sample sizes, the RMSEs are not much different. It means that Firths estimators outperformed MLE estimator.
Willie, Jacob; Petre, Charles-Albert; Tagg, Nikki; Lens, Luc
2012-11-01
Data from forest herbaceous plants in a site of known species richness in Cameroon were used to test the performance of rarefaction and eight species richness estimators (ACE, ICE, Chao1, Chao2, Jack1, Jack2, Bootstrap and MM). Bias, accuracy, precision and sensitivity to patchiness and sample grain size were the evaluation criteria. An evaluation of the effects of sampling effort and patchiness on diversity estimation is also provided. Stems were identified and counted in linear series of 1-m2 contiguous square plots distributed in six habitat types. Initially, 500 plots were sampled in each habitat type. The sampling process was monitored using rarefaction and a set of richness estimator curves. Curves from the first dataset suggested adequate sampling in riparian forest only. Additional plots ranging from 523 to 2143 were subsequently added in the undersampled habitats until most of the curves stabilized. Jack1 and ICE, the non-parametric richness estimators, performed better, being more accurate and less sensitive to patchiness and sample grain size, and significantly reducing biases that could not be detected by rarefaction and other estimators. This study confirms the usefulness of non-parametric incidence-based estimators, and recommends Jack1 or ICE alongside rarefaction while describing taxon richness and comparing results across areas sampled using similar or different grain sizes. As patchiness varied across habitat types, accurate estimations of diversity did not require the same number of plots. The number of samples needed to fully capture diversity is not necessarily the same across habitats, and can only be known when taxon sampling curves have indicated adequate sampling. Differences in observed species richness between habitats were generally due to differences in patchiness, except between two habitats where they resulted from differences in abundance. We suggest that communities should first be sampled thoroughly using appropriate taxon sampling
Novikov, I; Fund, N; Freedman, L S
2010-01-15
Different methods for the calculation of sample size for simple logistic regression (LR) with one normally distributed continuous covariate give different results. Sometimes the difference can be large. Furthermore, some methods require the user to specify the prevalence of cases when the covariate equals its population mean, rather than the more natural population prevalence. We focus on two commonly used methods and show through simulations that the power for a given sample size may differ substantially from the nominal value for one method, especially when the covariate effect is large, while the other method performs poorly if the user provides the population prevalence instead of the required parameter. We propose a modification of the method of Hsieh et al. that requires specification of the population prevalence and that employs Schouten's sample size formula for a t-test with unequal variances and group sizes. This approach appears to increase the accuracy of the sample size estimates for LR with one continuous covariate.
Terry, Leann; Kelley, Ken
2012-11-01
Composite measures play an important role in psychology and related disciplines. Composite measures almost always have error. Correspondingly, it is important to understand the reliability of the scores from any particular composite measure. However, the point estimates of the reliability of composite measures are fallible and thus all such point estimates should be accompanied by a confidence interval. When confidence intervals are wide, there is much uncertainty in the population value of the reliability coefficient. Given the importance of reporting confidence intervals for estimates of reliability, coupled with the undesirability of wide confidence intervals, we develop methods that allow researchers to plan sample size in order to obtain narrow confidence intervals for population reliability coefficients. We first discuss composite reliability coefficients and then provide a discussion on confidence interval formation for the corresponding population value. Using the accuracy in parameter estimation approach, we develop two methods to obtain accurate estimates of reliability by planning sample size. The first method provides a way to plan sample size so that the expected confidence interval width for the population reliability coefficient is sufficiently narrow. The second method ensures that the confidence interval width will be sufficiently narrow with some desired degree of assurance (e.g., 99% assurance that the 95% confidence interval for the population reliability coefficient will be less than W units wide). The effectiveness of our methods was verified with Monte Carlo simulation studies. We demonstrate how to easily implement the methods with easy-to-use and freely available software. ©2011 The British Psychological Society.
Sample Size Calculation for Estimating or Testing a Nonzero Squared Multiple Correlation Coefficient
Krishnamoorthy, K.; Xia, Yanping
2008-01-01
The problems of hypothesis testing and interval estimation of the squared multiple correlation coefficient of a multivariate normal distribution are considered. It is shown that available one-sided tests are uniformly most powerful, and the one-sided confidence intervals are uniformly most accurate. An exact method of calculating sample size to…
Jun, Jae Kwan; Kim, Mi Jin; Choi, Kui Son; Suh, Mina; Jung, Kyu-Won
2012-01-01
Mammographic breast density is a known risk factor for breast cancer. To conduct a survey to estimate the distribution of mammographic breast density in Korean women, appropriate sampling strategies for representative and efficient sampling design were evaluated through simulation. Using the target population from the National Cancer Screening Programme (NCSP) for breast cancer in 2009, we verified the distribution estimate by repeating the simulation 1,000 times using stratified random sampling to investigate the distribution of breast density of 1,340,362 women. According to the simulation results, using a sampling design stratifying the nation into three groups (metropolitan, urban, and rural), with a total sample size of 4,000, we estimated the distribution of breast density in Korean women at a level of 0.01% tolerance. Based on the results of our study, a nationwide survey for estimating the distribution of mammographic breast density among Korean women can be conducted efficiently.
A Web-based Simulator for Sample Size and Power Estimation in Animal Carcinogenicity Studies
Directory of Open Access Journals (Sweden)
Hojin Moon
2002-12-01
Full Text Available A Web-based statistical tool for sample size and power estimation in animal carcinogenicity studies is presented in this paper. It can be used to provide a design with sufficient power for detecting a dose-related trend in the occurrence of a tumor of interest when competing risks are present. The tumors of interest typically are occult tumors for which the time to tumor onset is not directly observable. It is applicable to rodent tumorigenicity assays that have either a single terminal sacrifice or multiple (interval sacrifices. The design is achieved by varying sample size per group, number of sacrifices, number of sacrificed animals at each interval, if any, and scheduled time points for sacrifice. Monte Carlo simulation is carried out in this tool to simulate experiments of rodent bioassays because no closed-form solution is available. It takes design parameters for sample size and power estimation as inputs through the World Wide Web. The core program is written in C and executed in the background. It communicates with the Web front end via a Component Object Model interface passing an Extensible Markup Language string. The proposed statistical tool is illustrated with an animal study in lung cancer prevention research.
DEFF Research Database (Denmark)
Kostoulas, P.; Nielsen, Søren Saxmose; Browne, W. J.
2013-01-01
and power when applied to these groups. We propose the use of the variance partition coefficient (VPC), which measures the clustering of infection/disease for individuals with a common risk profile. Sample size estimates are obtained separately for those groups that exhibit markedly different heterogeneity......, thus, optimizing resource allocation. A VPC-based predictive simulation method for sample size estimation to substantiate freedom from disease is presented. To illustrate the benefits of the proposed approach we give two examples with the analysis of data from a risk factor study on Mycobacterium avium...
Effects of sample size on estimation of rainfall extremes at high temperatures
Boessenkool, Berry; Bürger, Gerd; Heistermann, Maik
2017-09-01
High precipitation quantiles tend to rise with temperature, following the so-called Clausius-Clapeyron (CC) scaling. It is often reported that the CC-scaling relation breaks down and even reverts for very high temperatures. In our study, we investigate this reversal using observational climate data from 142 stations across Germany. One of the suggested meteorological explanations for the breakdown is limited moisture supply. Here we argue that, instead, it could simply originate from undersampling. As rainfall frequency generally decreases with higher temperatures, rainfall intensities as dictated by CC scaling are less likely to be recorded than for moderate temperatures. Empirical quantiles are conventionally estimated from order statistics via various forms of plotting position formulas. They have in common that their largest representable return period is given by the sample size. In small samples, high quantiles are underestimated accordingly. The small-sample effect is weaker, or disappears completely, when using parametric quantile estimates from a generalized Pareto distribution (GPD) fitted with L moments. For those, we obtain quantiles of rainfall intensities that continue to rise with temperature.
Effects of sample size on estimation of rainfall extremes at high temperatures
Directory of Open Access Journals (Sweden)
B. Boessenkool
2017-09-01
Full Text Available High precipitation quantiles tend to rise with temperature, following the so-called Clausius–Clapeyron (CC scaling. It is often reported that the CC-scaling relation breaks down and even reverts for very high temperatures. In our study, we investigate this reversal using observational climate data from 142 stations across Germany. One of the suggested meteorological explanations for the breakdown is limited moisture supply. Here we argue that, instead, it could simply originate from undersampling. As rainfall frequency generally decreases with higher temperatures, rainfall intensities as dictated by CC scaling are less likely to be recorded than for moderate temperatures. Empirical quantiles are conventionally estimated from order statistics via various forms of plotting position formulas. They have in common that their largest representable return period is given by the sample size. In small samples, high quantiles are underestimated accordingly. The small-sample effect is weaker, or disappears completely, when using parametric quantile estimates from a generalized Pareto distribution (GPD fitted with L moments. For those, we obtain quantiles of rainfall intensities that continue to rise with temperature.
Choosing a suitable sample size in descriptive sampling
International Nuclear Information System (INIS)
Lee, Yong Kyun; Choi, Dong Hoon; Cha, Kyung Joon
2010-01-01
Descriptive sampling (DS) is an alternative to crude Monte Carlo sampling (CMCS) in finding solutions to structural reliability problems. It is known to be an effective sampling method in approximating the distribution of a random variable because it uses the deterministic selection of sample values and their random permutation,. However, because this method is difficult to apply to complex simulations, the sample size is occasionally determined without thorough consideration. Input sample variability may cause the sample size to change between runs, leading to poor simulation results. This paper proposes a numerical method for choosing a suitable sample size for use in DS. Using this method, one can estimate a more accurate probability of failure in a reliability problem while running a minimal number of simulations. The method is then applied to several examples and compared with CMCS and conventional DS to validate its usefulness and efficiency
Sample size calculation in metabolic phenotyping studies.
Billoir, Elise; Navratil, Vincent; Blaise, Benjamin J
2015-09-01
The number of samples needed to identify significant effects is a key question in biomedical studies, with consequences on experimental designs, costs and potential discoveries. In metabolic phenotyping studies, sample size determination remains a complex step. This is due particularly to the multiple hypothesis-testing framework and the top-down hypothesis-free approach, with no a priori known metabolic target. Until now, there was no standard procedure available to address this purpose. In this review, we discuss sample size estimation procedures for metabolic phenotyping studies. We release an automated implementation of the Data-driven Sample size Determination (DSD) algorithm for MATLAB and GNU Octave. Original research concerning DSD was published elsewhere. DSD allows the determination of an optimized sample size in metabolic phenotyping studies. The procedure uses analytical data only from a small pilot cohort to generate an expanded data set. The statistical recoupling of variables procedure is used to identify metabolic variables, and their intensity distributions are estimated by Kernel smoothing or log-normal density fitting. Statistically significant metabolic variations are evaluated using the Benjamini-Yekutieli correction and processed for data sets of various sizes. Optimal sample size determination is achieved in a context of biomarker discovery (at least one statistically significant variation) or metabolic exploration (a maximum of statistically significant variations). DSD toolbox is encoded in MATLAB R2008A (Mathworks, Natick, MA) for Kernel and log-normal estimates, and in GNU Octave for log-normal estimates (Kernel density estimates are not robust enough in GNU octave). It is available at http://www.prabi.fr/redmine/projects/dsd/repository, with a tutorial at http://www.prabi.fr/redmine/projects/dsd/wiki. © The Author 2015. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.
Concepts in sample size determination
Directory of Open Access Journals (Sweden)
Umadevi K Rao
2012-01-01
Full Text Available Investigators involved in clinical, epidemiological or translational research, have the drive to publish their results so that they can extrapolate their findings to the population. This begins with the preliminary step of deciding the topic to be studied, the subjects and the type of study design. In this context, the researcher must determine how many subjects would be required for the proposed study. Thus, the number of individuals to be included in the study, i.e., the sample size is an important consideration in the design of many clinical studies. The sample size determination should be based on the difference in the outcome between the two groups studied as in an analytical study, as well as on the accepted p value for statistical significance and the required statistical power to test a hypothesis. The accepted risk of type I error or alpha value, which by convention is set at the 0.05 level in biomedical research defines the cutoff point at which the p value obtained in the study is judged as significant or not. The power in clinical research is the likelihood of finding a statistically significant result when it exists and is typically set to >80%. This is necessary since the most rigorously executed studies may fail to answer the research question if the sample size is too small. Alternatively, a study with too large a sample size will be difficult and will result in waste of time and resources. Thus, the goal of sample size planning is to estimate an appropriate number of subjects for a given study design. This article describes the concepts in estimating the sample size.
Sample size for morphological traits of pigeonpea
Directory of Open Access Journals (Sweden)
Giovani Facco
2015-12-01
Full Text Available The objectives of this study were to determine the sample size (i.e., number of plants required to accurately estimate the average of morphological traits of pigeonpea (Cajanus cajan L. and to check for variability in sample size between evaluation periods and seasons. Two uniformity trials (i.e., experiments without treatment were conducted for two growing seasons. In the first season (2011/2012, the seeds were sown by broadcast seeding, and in the second season (2012/2013, the seeds were sown in rows spaced 0.50 m apart. The ground area in each experiment was 1,848 m2, and 360 plants were marked in the central area, in a 2 m × 2 m grid. Three morphological traits (e.g., number of nodes, plant height and stem diameter were evaluated 13 times during the first season and 22 times in the second season. Measurements for all three morphological traits were normally distributed and confirmed through the Kolmogorov-Smirnov test. Randomness was confirmed using the Run Test, and the descriptive statistics were calculated. For each trait, the sample size (n was calculated for the semiamplitudes of the confidence interval (i.e., estimation error equal to 2, 4, 6, ..., 20% of the estimated mean with a confidence coefficient (1-? of 95%. Subsequently, n was fixed at 360 plants, and the estimation error of the estimated percentage of the average for each trait was calculated. Variability of the sample size for the pigeonpea culture was observed between the morphological traits evaluated, among the evaluation periods and between seasons. Therefore, to assess with an accuracy of 6% of the estimated average, at least 136 plants must be evaluated throughout the pigeonpea crop cycle to determine the sample size for the traits (e.g., number of nodes, plant height and stem diameter in the different evaluation periods and between seasons.
[Practical aspects regarding sample size in clinical research].
Vega Ramos, B; Peraza Yanes, O; Herrera Correa, G; Saldívar Toraya, S
1996-01-01
The knowledge of the right sample size let us to be sure if the published results in medical papers had a suitable design and a proper conclusion according to the statistics analysis. To estimate the sample size we must consider the type I error, type II error, variance, the size of the effect, significance and power of the test. To decide what kind of mathematics formula will be used, we must define what kind of study we have, it means if its a prevalence study, a means values one or a comparative one. In this paper we explain some basic topics of statistics and we describe four simple samples of estimation of sample size.
Directory of Open Access Journals (Sweden)
Manan Gupta
Full Text Available Mark-recapture estimators are commonly used for population size estimation, and typically yield unbiased estimates for most solitary species with low to moderate home range sizes. However, these methods assume independence of captures among individuals, an assumption that is clearly violated in social species that show fission-fusion dynamics, such as the Asian elephant. In the specific case of Asian elephants, doubts have been raised about the accuracy of population size estimates. More importantly, the potential problem for the use of mark-recapture methods posed by social organization in general has not been systematically addressed. We developed an individual-based simulation framework to systematically examine the potential effects of type of social organization, as well as other factors such as trap density and arrangement, spatial scale of sampling, and population density, on bias in population sizes estimated by POPAN, Robust Design, and Robust Design with detection heterogeneity. In the present study, we ran simulations with biological, demographic and ecological parameters relevant to Asian elephant populations, but the simulation framework is easily extended to address questions relevant to other social species. We collected capture history data from the simulations, and used those data to test for bias in population size estimation. Social organization significantly affected bias in most analyses, but the effect sizes were variable, depending on other factors. Social organization tended to introduce large bias when trap arrangement was uniform and sampling effort was low. POPAN clearly outperformed the two Robust Design models we tested, yielding close to zero bias if traps were arranged at random in the study area, and when population density and trap density were not too low. Social organization did not have a major effect on bias for these parameter combinations at which POPAN gave more or less unbiased population size estimates
Multiple sensitive estimation and optimal sample size allocation in the item sum technique.
Perri, Pier Francesco; Rueda García, María Del Mar; Cobo Rodríguez, Beatriz
2018-01-01
For surveys of sensitive issues in life sciences, statistical procedures can be used to reduce nonresponse and social desirability response bias. Both of these phenomena provoke nonsampling errors that are difficult to deal with and can seriously flaw the validity of the analyses. The item sum technique (IST) is a very recent indirect questioning method derived from the item count technique that seeks to procure more reliable responses on quantitative items than direct questioning while preserving respondents' anonymity. This article addresses two important questions concerning the IST: (i) its implementation when two or more sensitive variables are investigated and efficient estimates of their unknown population means are required; (ii) the determination of the optimal sample size to achieve minimum variance estimates. These aspects are of great relevance for survey practitioners engaged in sensitive research and, to the best of our knowledge, were not studied so far. In this article, theoretical results for multiple estimation and optimal allocation are obtained under a generic sampling design and then particularized to simple random sampling and stratified sampling designs. Theoretical considerations are integrated with a number of simulation studies based on data from two real surveys and conducted to ascertain the efficiency gain derived from optimal allocation in different situations. One of the surveys concerns cannabis consumption among university students. Our findings highlight some methodological advances that can be obtained in life sciences IST surveys when optimal allocation is achieved. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Directory of Open Access Journals (Sweden)
Michael B.C. Khoo
2013-11-01
Full Text Available The double sampling (DS X bar chart, one of the most widely-used charting methods, is superior for detecting small and moderate shifts in the process mean. In a right skewed run length distribution, the median run length (MRL provides a more credible representation of the central tendency than the average run length (ARL, as the mean is greater than the median. In this paper, therefore, MRL is used as the performance criterion instead of the traditional ARL. Generally, the performance of the DS X bar chart is investigated under the assumption of known process parameters. In practice, these parameters are usually estimated from an in-control reference Phase-I dataset. Since the performance of the DS X bar chart is significantly affected by estimation errors, we study the effects of parameter estimation on the MRL-based DS X bar chart when the in-control average sample size is minimised. This study reveals that more than 80 samples are required for the MRL-based DS X bar chart with estimated parameters to perform more favourably than the corresponding chart with known parameters.
Estimating spatio-temporal dynamics of size-structured populations
DEFF Research Database (Denmark)
Kristensen, Kasper; Thygesen, Uffe Høgsbro; Andersen, Ken Haste
2014-01-01
with simple stock dynamics, to estimate simultaneously how size distributions and spatial distributions develop in time. We demonstrate the method for a cod population sampled by trawl surveys. Particular attention is paid to correlation between size classes within each trawl haul due to clustering...... of individuals with similar size. The model estimates growth, mortality and reproduction, after which any aspect of size-structure, spatio-temporal population dynamics, as well as the sampling process can be probed. This is illustrated by two applications: 1) tracking the spatial movements of a single cohort...
A Model Based Approach to Sample Size Estimation in Recent Onset Type 1 Diabetes
Bundy, Brian; Krischer, Jeffrey P.
2016-01-01
The area under the curve C-peptide following a 2-hour mixed meal tolerance test from 481 individuals enrolled on 5 prior TrialNet studies of recent onset type 1 diabetes from baseline to 12 months after enrollment were modelled to produce estimates of its rate of loss and variance. Age at diagnosis and baseline C-peptide were found to be significant predictors and adjusting for these in an ANCOVA resulted in estimates with lower variance. Using these results as planning parameters for new studies results in a nearly 50% reduction in the target sample size. The modelling also produces an expected C-peptide that can be used in Observed vs. Expected calculations to estimate the presumption of benefit in ongoing trials. PMID:26991448
Directory of Open Access Journals (Sweden)
Rocío Joo
2017-04-01
Full Text Available The length distribution of catches represents a fundamental source of information for estimating growth and spatio-temporal dynamics of cohorts. The length distribution of caught is estimated based on samples of catched individuals. This work studies the optimum sample size of individuals at each fishing set in order to obtain a representative sample of the length and the proportion of juveniles in the fishing set. For that matter, we use anchovy (Engraulis ringens length data from different fishing sets recorded by observers at-sea from the On-board Observers Program from the Peruvian Marine Research Institute. Finally, we propose an optimum sample size for obtaining robust size and juvenile estimations. Though the application of this work corresponds to the anchovy fishery, the procedure can be applied to any fishery, either for on board or inland biometric measurements.
Sample size calculation for comparing two negative binomial rates.
Zhu, Haiyuan; Lakkis, Hassan
2014-02-10
Negative binomial model has been increasingly used to model the count data in recent clinical trials. It is frequently chosen over Poisson model in cases of overdispersed count data that are commonly seen in clinical trials. One of the challenges of applying negative binomial model in clinical trial design is the sample size estimation. In practice, simulation methods have been frequently used for sample size estimation. In this paper, an explicit formula is developed to calculate sample size based on the negative binomial model. Depending on different approaches to estimate the variance under null hypothesis, three variations of the sample size formula are proposed and discussed. Important characteristics of the formula include its accuracy and its ability to explicitly incorporate dispersion parameter and exposure time. The performance of the formula with each variation is assessed using simulations. Copyright © 2013 John Wiley & Sons, Ltd.
A model-based approach to sample size estimation in recent onset type 1 diabetes.
Bundy, Brian N; Krischer, Jeffrey P
2016-11-01
The area under the curve C-peptide following a 2-h mixed meal tolerance test from 498 individuals enrolled on five prior TrialNet studies of recent onset type 1 diabetes from baseline to 12 months after enrolment were modelled to produce estimates of its rate of loss and variance. Age at diagnosis and baseline C-peptide were found to be significant predictors, and adjusting for these in an ANCOVA resulted in estimates with lower variance. Using these results as planning parameters for new studies results in a nearly 50% reduction in the target sample size. The modelling also produces an expected C-peptide that can be used in observed versus expected calculations to estimate the presumption of benefit in ongoing trials. Copyright © 2016 John Wiley & Sons, Ltd. Copyright © 2016 John Wiley & Sons, Ltd.
Revisiting sample size: are big trials the answer?
Lurati Buse, Giovanna A L; Botto, Fernando; Devereaux, P J
2012-07-18
The superiority of the evidence generated in randomized controlled trials over observational data is not only conditional to randomization. Randomized controlled trials require proper design and implementation to provide a reliable effect estimate. Adequate random sequence generation, allocation implementation, analyses based on the intention-to-treat principle, and sufficient power are crucial to the quality of a randomized controlled trial. Power, or the probability of the trial to detect a difference when a real difference between treatments exists, strongly depends on sample size. The quality of orthopaedic randomized controlled trials is frequently threatened by a limited sample size. This paper reviews basic concepts and pitfalls in sample-size estimation and focuses on the importance of large trials in the generation of valid evidence.
International Nuclear Information System (INIS)
Reiser, I; Lu, Z
2014-01-01
Purpose: Recently, task-based assessment of diagnostic CT systems has attracted much attention. Detection task performance can be estimated using human observers, or mathematical observer models. While most models are well established, considerable bias can be introduced when performance is estimated from a limited number of image samples. Thus, the purpose of this work was to assess the effect of sample size on bias and uncertainty of two channelized Hotelling observers and a template-matching observer. Methods: The image data used for this study consisted of 100 signal-present and 100 signal-absent regions-of-interest, which were extracted from CT slices. The experimental conditions included two signal sizes and five different x-ray beam current settings (mAs). Human observer performance for these images was determined in 2-alternative forced choice experiments. These data were provided by the Mayo clinic in Rochester, MN. Detection performance was estimated from three observer models, including channelized Hotelling observers (CHO) with Gabor or Laguerre-Gauss (LG) channels, and a template-matching observer (TM). Different sample sizes were generated by randomly selecting a subset of image pairs, (N=20,40,60,80). Observer performance was quantified as proportion of correct responses (PC). Bias was quantified as the relative difference of PC for 20 and 80 image pairs. Results: For n=100, all observer models predicted human performance across mAs and signal sizes. Bias was 23% for CHO (Gabor), 7% for CHO (LG), and 3% for TM. The relative standard deviation, σ(PC)/PC at N=20 was highest for the TM observer (11%) and lowest for the CHO (Gabor) observer (5%). Conclusion: In order to make image quality assessment feasible in the clinical practice, a statistically efficient observer model, that can predict performance from few samples, is needed. Our results identified two observer models that may be suited for this task
Uijlenhoet, R.; Porrà, J.M.; Sempere Torres, D.; Creutin, J.D.
2006-01-01
A stochastic model of the microstructure of rainfall is used to derive explicit expressions for the magnitude of the sampling fluctuations in rainfall properties estimated from raindrop size measurements in stationary rainfall. The model is a marked point process, in which the points represent the
DEFF Research Database (Denmark)
Gardi, Jonathan Eyal; Nyengaard, Jens Randel; Gundersen, Hans Jørgen Gottlieb
2008-01-01
examined, which in turn leads to any of the known stereological estimates, including size distributions and spatial distributions. The unbiasedness is not a function of the assumed relation between the weight and the structure, which is in practice always a biased relation from a stereological (integral......, the desired number of fields are sampled automatically with probability proportional to the weight and presented to the expert observer. Using any known stereological probe and estimator, the correct count in these fields leads to a simple, unbiased estimate of the total amount of structure in the sections...... geometric) point of view. The efficiency of the proportionator depends, however, directly on this relation to be positive. The sampling and estimation procedure is simulated in sections with characteristics and various kinds of noises in possibly realistic ranges. In all cases examined, the proportionator...
International Nuclear Information System (INIS)
Haugboel, Steven; Pinborg, Lars H.; Arfan, Haroon M.; Froekjaer, Vibe M.; Svarer, Claus; Knudsen, Gitte M.; Madsen, Jacob; Dyrby, Tim B.
2007-01-01
To determine the reproducibility of measurements of brain 5-HT 2A receptors with an [ 18 F]altanserin PET bolus/infusion approach. Further, to estimate the sample size needed to detect regional differences between two groups and, finally, to evaluate how partial volume correction affects reproducibility and the required sample size. For assessment of the variability, six subjects were investigated with [ 18 F]altanserin PET twice, at an interval of less than 2 weeks. The sample size required to detect a 20% difference was estimated from [ 18 F]altanserin PET studies in 84 healthy subjects. Regions of interest were automatically delineated on co-registered MR and PET images. In cortical brain regions with a high density of 5-HT 2A receptors, the outcome parameter (binding potential, BP 1 ) showed high reproducibility, with a median difference between the two group measurements of 6% (range 5-12%), whereas in regions with a low receptor density, BP 1 reproducibility was lower, with a median difference of 17% (range 11-39%). Partial volume correction reduced the variability in the sample considerably. The sample size required to detect a 20% difference in brain regions with high receptor density is approximately 27, whereas for low receptor binding regions the required sample size is substantially higher. This study demonstrates that [ 18 F]altanserin PET with a bolus/infusion design has very low variability, particularly in larger brain regions with high 5-HT 2A receptor density. Moreover, partial volume correction considerably reduces the sample size required to detect regional changes between groups. (orig.)
Species richness in soil bacterial communities: a proposed approach to overcome sample size bias.
Youssef, Noha H; Elshahed, Mostafa S
2008-09-01
Estimates of species richness based on 16S rRNA gene clone libraries are increasingly utilized to gauge the level of bacterial diversity within various ecosystems. However, previous studies have indicated that regardless of the utilized approach, species richness estimates obtained are dependent on the size of the analyzed clone libraries. We here propose an approach to overcome sample size bias in species richness estimates in complex microbial communities. Parametric (Maximum likelihood-based and rarefaction curve-based) and non-parametric approaches were used to estimate species richness in a library of 13,001 near full-length 16S rRNA clones derived from soil, as well as in multiple subsets of the original library. Species richness estimates obtained increased with the increase in library size. To obtain a sample size-unbiased estimate of species richness, we calculated the theoretical clone library sizes required to encounter the estimated species richness at various clone library sizes, used curve fitting to determine the theoretical clone library size required to encounter the "true" species richness, and subsequently determined the corresponding sample size-unbiased species richness value. Using this approach, sample size-unbiased estimates of 17,230, 15,571, and 33,912 were obtained for the ML-based, rarefaction curve-based, and ACE-1 estimators, respectively, compared to bias-uncorrected values of 15,009, 11,913, and 20,909.
DEFF Research Database (Denmark)
Haugbøl, Steven; Pinborg, Lars H; Arfan, Haroon M
2006-01-01
PURPOSE: To determine the reproducibility of measurements of brain 5-HT2A receptors with an [18F]altanserin PET bolus/infusion approach. Further, to estimate the sample size needed to detect regional differences between two groups and, finally, to evaluate how partial volume correction affects...... reproducibility and the required sample size. METHODS: For assessment of the variability, six subjects were investigated with [18F]altanserin PET twice, at an interval of less than 2 weeks. The sample size required to detect a 20% difference was estimated from [18F]altanserin PET studies in 84 healthy subjects....... Regions of interest were automatically delineated on co-registered MR and PET images. RESULTS: In cortical brain regions with a high density of 5-HT2A receptors, the outcome parameter (binding potential, BP1) showed high reproducibility, with a median difference between the two group measurements of 6...
Soetaert, K.; Heip, C.H.R.
1990-01-01
Diversity indices, although designed for comparative purposes, often cannot be used as such, due to their sample-size dependence. It is argued here that this dependence is more pronounced in high diversity than in low diversity assemblages and that indices more sensitive to rarer species require larger sample sizes to estimate diversity with reasonable precision than indices which put more weight on commoner species. This was tested for Hill's diversity number N sub(0) to N sub( proportional ...
Estimating HIES Data through Ratio and Regression Methods for Different Sampling Designs
Directory of Open Access Journals (Sweden)
Faqir Muhammad
2007-01-01
Full Text Available In this study, comparison has been made for different sampling designs, using the HIES data of North West Frontier Province (NWFP for 2001-02 and 1998-99 collected from the Federal Bureau of Statistics, Statistical Division, Government of Pakistan, Islamabad. The performance of the estimators has also been considered using bootstrap and Jacknife. A two-stage stratified random sample design is adopted by HIES. In the first stage, enumeration blocks and villages are treated as the first stage Primary Sampling Units (PSU. The sample PSU’s are selected with probability proportional to size. Secondary Sampling Units (SSU i.e., households are selected by systematic sampling with a random start. They have used a single study variable. We have compared the HIES technique with some other designs, which are: Stratified Simple Random Sampling. Stratified Systematic Sampling. Stratified Ranked Set Sampling. Stratified Two Phase Sampling. Ratio and Regression methods were applied with two study variables, which are: Income (y and Household sizes (x. Jacknife and Bootstrap are used for variance replication. Simple Random Sampling with sample size (462 to 561 gave moderate variances both by Jacknife and Bootstrap. By applying Systematic Sampling, we received moderate variance with sample size (467. In Jacknife with Systematic Sampling, we obtained variance of regression estimator greater than that of ratio estimator for a sample size (467 to 631. At a sample size (952 variance of ratio estimator gets greater than that of regression estimator. The most efficient design comes out to be Ranked set sampling compared with other designs. The Ranked set sampling with jackknife and bootstrap, gives minimum variance even with the smallest sample size (467. Two Phase sampling gave poor performance. Multi-stage sampling applied by HIES gave large variances especially if used with a single study variable.
The PowerAtlas: a power and sample size atlas for microarray experimental design and research
Directory of Open Access Journals (Sweden)
Wang Jelai
2006-02-01
Full Text Available Abstract Background Microarrays permit biologists to simultaneously measure the mRNA abundance of thousands of genes. An important issue facing investigators planning microarray experiments is how to estimate the sample size required for good statistical power. What is the projected sample size or number of replicate chips needed to address the multiple hypotheses with acceptable accuracy? Statistical methods exist for calculating power based upon a single hypothesis, using estimates of the variability in data from pilot studies. There is, however, a need for methods to estimate power and/or required sample sizes in situations where multiple hypotheses are being tested, such as in microarray experiments. In addition, investigators frequently do not have pilot data to estimate the sample sizes required for microarray studies. Results To address this challenge, we have developed a Microrarray PowerAtlas 1. The atlas enables estimation of statistical power by allowing investigators to appropriately plan studies by building upon previous studies that have similar experimental characteristics. Currently, there are sample sizes and power estimates based on 632 experiments from Gene Expression Omnibus (GEO. The PowerAtlas also permits investigators to upload their own pilot data and derive power and sample size estimates from these data. This resource will be updated regularly with new datasets from GEO and other databases such as The Nottingham Arabidopsis Stock Center (NASC. Conclusion This resource provides a valuable tool for investigators who are planning efficient microarray studies and estimating required sample sizes.
Estimated ventricle size using Evans index: reference values from a population-based sample.
Jaraj, D; Rabiei, K; Marlow, T; Jensen, C; Skoog, I; Wikkelsø, C
2017-03-01
Evans index is an estimate of ventricular size used in the diagnosis of idiopathic normal-pressure hydrocephalus (iNPH). Values >0.3 are considered pathological and are required by guidelines for the diagnosis of iNPH. However, there are no previous epidemiological studies on Evans index, and normal values in adults are thus not precisely known. We examined a representative sample to obtain reference values and descriptive data on Evans index. A population-based sample (n = 1235) of men and women aged ≥70 years was examined. The sample comprised people living in private households and residential care, systematically selected from the Swedish population register. Neuropsychiatric examinations, including head computed tomography, were performed between 1986 and 2000. Evans index ranged from 0.11 to 0.46. The mean value in the total sample was 0.28 (SD, 0.04) and 20.6% (n = 255) had values >0.3. Among men aged ≥80 years, the mean value of Evans index was 0.3 (SD, 0.03). Individuals with dementia had a mean value of Evans index of 0.31 (SD, 0.05) and those with radiological signs of iNPH had a mean value of 0.36 (SD, 0.04). A substantial number of subjects had ventricular enlargement according to current criteria. Clinicians and researchers need to be aware of the range of values among older individuals. © 2017 EAN.
Estimating sample size for a small-quadrat method of botanical ...
African Journals Online (AJOL)
Reports the results of a study conducted to determine an appropriate sample size for a small-quadrat method of botanical survey for application in the Mixed Bushveld of South Africa. Species density and grass density were measured using a small-quadrat method in eight plant communities in the Nylsvley Nature Reserve.
Comparison of distance sampling estimates to a known population ...
African Journals Online (AJOL)
Line-transect sampling was used to obtain abundance estimates of an Ant-eating Chat Myrmecocichla formicivora population to compare these with the true size of the population. The population size was determined by a long-term banding study, and abundance estimates were obtained by surveying line transects.
Candel, Math J J M; Van Breukelen, Gerard J P
2010-06-30
Adjustments of sample size formulas are given for varying cluster sizes in cluster randomized trials with a binary outcome when testing the treatment effect with mixed effects logistic regression using second-order penalized quasi-likelihood estimation (PQL). Starting from first-order marginal quasi-likelihood (MQL) estimation of the treatment effect, the asymptotic relative efficiency of unequal versus equal cluster sizes is derived. A Monte Carlo simulation study shows this asymptotic relative efficiency to be rather accurate for realistic sample sizes, when employing second-order PQL. An approximate, simpler formula is presented to estimate the efficiency loss due to varying cluster sizes when planning a trial. In many cases sampling 14 per cent more clusters is sufficient to repair the efficiency loss due to varying cluster sizes. Since current closed-form formulas for sample size calculation are based on first-order MQL, planning a trial also requires a conversion factor to obtain the variance of the second-order PQL estimator. In a second Monte Carlo study, this conversion factor turned out to be 1.25 at most. (c) 2010 John Wiley & Sons, Ltd.
Olives, Casey; Valadez, Joseph J; Pagano, Marcello
2014-03-01
To assess the bias incurred when curtailment of Lot Quality Assurance Sampling (LQAS) is ignored, to present unbiased estimators, to consider the impact of cluster sampling by simulation and to apply our method to published polio immunization data from Nigeria. We present estimators of coverage when using two kinds of curtailed LQAS strategies: semicurtailed and curtailed. We study the proposed estimators with independent and clustered data using three field-tested LQAS designs for assessing polio vaccination coverage, with samples of size 60 and decision rules of 9, 21 and 33, and compare them to biased maximum likelihood estimators. Lastly, we present estimates of polio vaccination coverage from previously published data in 20 local government authorities (LGAs) from five Nigerian states. Simulations illustrate substantial bias if one ignores the curtailed sampling design. Proposed estimators show no bias. Clustering does not affect the bias of these estimators. Across simulations, standard errors show signs of inflation as clustering increases. Neither sampling strategy nor LQAS design influences estimates of polio vaccination coverage in 20 Nigerian LGAs. When coverage is low, semicurtailed LQAS strategies considerably reduces the sample size required to make a decision. Curtailed LQAS designs further reduce the sample size when coverage is high. Results presented dispel the misconception that curtailed LQAS data are unsuitable for estimation. These findings augment the utility of LQAS as a tool for monitoring vaccination efforts by demonstrating that unbiased estimation using curtailed designs is not only possible but these designs also reduce the sample size. © 2014 John Wiley & Sons Ltd.
Evaluation of sampling strategies to estimate crown biomass
Directory of Open Access Journals (Sweden)
Krishna P Poudel
2015-01-01
Full Text Available Background Depending on tree and site characteristics crown biomass accounts for a significant portion of the total aboveground biomass in the tree. Crown biomass estimation is useful for different purposes including evaluating the economic feasibility of crown utilization for energy production or forest products, fuel load assessments and fire management strategies, and wildfire modeling. However, crown biomass is difficult to predict because of the variability within and among species and sites. Thus the allometric equations used for predicting crown biomass should be based on data collected with precise and unbiased sampling strategies. In this study, we evaluate the performance different sampling strategies to estimate crown biomass and to evaluate the effect of sample size in estimating crown biomass. Methods Using data collected from 20 destructively sampled trees, we evaluated 11 different sampling strategies using six evaluation statistics: bias, relative bias, root mean square error (RMSE, relative RMSE, amount of biomass sampled, and relative biomass sampled. We also evaluated the performance of the selected sampling strategies when different numbers of branches (3, 6, 9, and 12 are selected from each tree. Tree specific log linear model with branch diameter and branch length as covariates was used to obtain individual branch biomass. Results Compared to all other methods stratified sampling with probability proportional to size estimation technique produced better results when three or six branches per tree were sampled. However, the systematic sampling with ratio estimation technique was the best when at least nine branches per tree were sampled. Under the stratified sampling strategy, selecting unequal number of branches per stratum produced approximately similar results to simple random sampling, but it further decreased RMSE when information on branch diameter is used in the design and estimation phases. Conclusions Use of
Conservative Sample Size Determination for Repeated Measures Analysis of Covariance.
Morgan, Timothy M; Case, L Douglas
2013-07-05
In the design of a randomized clinical trial with one pre and multiple post randomized assessments of the outcome variable, one needs to account for the repeated measures in determining the appropriate sample size. Unfortunately, one seldom has a good estimate of the variance of the outcome measure, let alone the correlations among the measurements over time. We show how sample sizes can be calculated by making conservative assumptions regarding the correlations for a variety of covariance structures. The most conservative choice for the correlation depends on the covariance structure and the number of repeated measures. In the absence of good estimates of the correlations, the sample size is often based on a two-sample t-test, making the 'ultra' conservative and unrealistic assumption that there are zero correlations between the baseline and follow-up measures while at the same time assuming there are perfect correlations between the follow-up measures. Compared to the case of taking a single measurement, substantial savings in sample size can be realized by accounting for the repeated measures, even with very conservative assumptions regarding the parameters of the assumed correlation matrix. Assuming compound symmetry, the sample size from the two-sample t-test calculation can be reduced at least 44%, 56%, and 61% for repeated measures analysis of covariance by taking 2, 3, and 4 follow-up measures, respectively. The results offer a rational basis for determining a fairly conservative, yet efficient, sample size for clinical trials with repeated measures and a baseline value.
Sample size reassessment for a two-stage design controlling the false discovery rate.
Zehetmayer, Sonja; Graf, Alexandra C; Posch, Martin
2015-11-01
Sample size calculations for gene expression microarray and NGS-RNA-Seq experiments are challenging because the overall power depends on unknown quantities as the proportion of true null hypotheses and the distribution of the effect sizes under the alternative. We propose a two-stage design with an adaptive interim analysis where these quantities are estimated from the interim data. The second stage sample size is chosen based on these estimates to achieve a specific overall power. The proposed procedure controls the power in all considered scenarios except for very low first stage sample sizes. The false discovery rate (FDR) is controlled despite of the data dependent choice of sample size. The two-stage design can be a useful tool to determine the sample size of high-dimensional studies if in the planning phase there is high uncertainty regarding the expected effect sizes and variability.
Influence of Sample Size on Automatic Positional Accuracy Assessment Methods for Urban Areas
Directory of Open Access Journals (Sweden)
Francisco J. Ariza-López
2018-05-01
Full Text Available In recent years, new approaches aimed to increase the automation level of positional accuracy assessment processes for spatial data have been developed. However, in such cases, an aspect as significant as sample size has not yet been addressed. In this paper, we study the influence of sample size when estimating the planimetric positional accuracy of urban databases by means of an automatic assessment using polygon-based methodology. Our study is based on a simulation process, which extracts pairs of homologous polygons from the assessed and reference data sources and applies two buffer-based methods. The parameter used for determining the different sizes (which range from 5 km up to 100 km has been the length of the polygons’ perimeter, and for each sample size 1000 simulations were run. After completing the simulation process, the comparisons between the estimated distribution functions for each sample and population distribution function were carried out by means of the Kolmogorov–Smirnov test. Results show a significant reduction in the variability of estimations when sample size increased from 5 km to 100 km.
Optimizing Sampling Efficiency for Biomass Estimation Across NEON Domains
Abercrombie, H. H.; Meier, C. L.; Spencer, J. J.
2013-12-01
Over the course of 30 years, the National Ecological Observatory Network (NEON) will measure plant biomass and productivity across the U.S. to enable an understanding of terrestrial carbon cycle responses to ecosystem change drivers. Over the next several years, prior to operational sampling at a site, NEON will complete construction and characterization phases during which a limited amount of sampling will be done at each site to inform sampling designs, and guide standardization of data collection across all sites. Sampling biomass in 60+ sites distributed among 20 different eco-climatic domains poses major logistical and budgetary challenges. Traditional biomass sampling methods such as clip harvesting and direct measurements of Leaf Area Index (LAI) involve collecting and processing plant samples, and are time and labor intensive. Possible alternatives include using indirect sampling methods for estimating LAI such as digital hemispherical photography (DHP) or using a LI-COR 2200 Plant Canopy Analyzer. These LAI estimations can then be used as a proxy for biomass. The biomass estimates calculated can then inform the clip harvest sampling design during NEON operations, optimizing both sample size and number so that standardized uncertainty limits can be achieved with a minimum amount of sampling effort. In 2011, LAI and clip harvest data were collected from co-located sampling points at the Central Plains Experimental Range located in northern Colorado, a short grass steppe ecosystem that is the NEON Domain 10 core site. LAI was measured with a LI-COR 2200 Plant Canopy Analyzer. The layout of the sampling design included four, 300 meter transects, with clip harvests plots spaced every 50m, and LAI sub-transects spaced every 10m. LAI was measured at four points along 6m sub-transects running perpendicular to the 300m transect. Clip harvest plots were co-located 4m from corresponding LAI transects, and had dimensions of 0.1m by 2m. We conducted regression analyses
Effects of systematic sampling on satellite estimates of deforestation rates
International Nuclear Information System (INIS)
Steininger, M K; Godoy, F; Harper, G
2009-01-01
Options for satellite monitoring of deforestation rates over large areas include the use of sampling. Sampling may reduce the cost of monitoring but is also a source of error in estimates of areas and rates. A common sampling approach is systematic sampling, in which sample units of a constant size are distributed in some regular manner, such as a grid. The proposed approach for the 2010 Forest Resources Assessment (FRA) of the UN Food and Agriculture Organization (FAO) is a systematic sample of 10 km wide squares at every 1 deg. intersection of latitude and longitude. We assessed the outcome of this and other systematic samples for estimating deforestation at national, sub-national and continental levels. The study is based on digital data on deforestation patterns for the five Amazonian countries outside Brazil plus the Brazilian Amazon. We tested these schemes by varying sample-unit size and frequency. We calculated two estimates of sampling error. First we calculated the standard errors, based on the size, variance and covariance of the samples, and from this calculated the 95% confidence intervals (CI). Second, we calculated the actual errors, based on the difference between the sample-based estimates and the estimates from the full-coverage maps. At the continental level, the 1 deg., 10 km scheme had a CI of 21% and an actual error of 8%. At the national level, this scheme had CIs of 126% for Ecuador and up to 67% for other countries. At this level, increasing sampling density to every 0.25 deg. produced a CI of 32% for Ecuador and CIs of up to 25% for other countries, with only Brazil having a CI of less than 10%. Actual errors were within the limits of the CIs in all but two of the 56 cases. Actual errors were half or less of the CIs in all but eight of these cases. These results indicate that the FRA 2010 should have CIs of smaller than or close to 10% at the continental level. However, systematic sampling at the national level yields large CIs unless the
Evaluation of design flood estimates with respect to sample size
Kobierska, Florian; Engeland, Kolbjorn
2016-04-01
Estimation of design floods forms the basis for hazard management related to flood risk and is a legal obligation when building infrastructure such as dams, bridges and roads close to water bodies. Flood inundation maps used for land use planning are also produced based on design flood estimates. In Norway, the current guidelines for design flood estimates give recommendations on which data, probability distribution, and method to use dependent on length of the local record. If less than 30 years of local data is available, an index flood approach is recommended where the local observations are used for estimating the index flood and regional data are used for estimating the growth curve. For 30-50 years of data, a 2 parameter distribution is recommended, and for more than 50 years of data, a 3 parameter distribution should be used. Many countries have national guidelines for flood frequency estimation, and recommended distributions include the log Pearson II, generalized logistic and generalized extreme value distributions. For estimating distribution parameters, ordinary and linear moments, maximum likelihood and Bayesian methods are used. The aim of this study is to r-evaluate the guidelines for local flood frequency estimation. In particular, we wanted to answer the following questions: (i) Which distribution gives the best fit to the data? (ii) Which estimation method provides the best fit to the data? (iii) Does the answer to (i) and (ii) depend on local data availability? To answer these questions we set up a test bench for local flood frequency analysis using data based cross-validation methods. The criteria were based on indices describing stability and reliability of design flood estimates. Stability is used as a criterion since design flood estimates should not excessively depend on the data sample. The reliability indices describe to which degree design flood predictions can be trusted.
Computing Confidence Bounds for Power and Sample Size of the General Linear Univariate Model
Taylor, Douglas J.; Muller, Keith E.
1995-01-01
The power of a test, the probability of rejecting the null hypothesis in favor of an alternative, may be computed using estimates of one or more distributional parameters. Statisticians frequently fix mean values and calculate power or sample size using a variance estimate from an existing study. Hence computed power becomes a random variable for a fixed sample size. Likewise, the sample size necessary to achieve a fixed power varies randomly. Standard statistical practice requires reporting ...
Wills, Johnny
2008-01-01
The planned widening of U.S. Highway 17 along the east boundary of Great Dismal Swamp National Wildlife Refuge (GDSNWR) and a lack of knowledge about the refugeâ s bear population created the need to identify potential sites for wildlife crossings and estimate the size of the refugeâ s bear population. I collected black bear hair in order to collect DNA samples to estimate population size, density, and sex ratio, and determine road crossing locations for black bears (Ursus americanus) in G...
Small sample GEE estimation of regression parameters for longitudinal data.
Paul, Sudhir; Zhang, Xuemao
2014-09-28
Longitudinal (clustered) response data arise in many bio-statistical applications which, in general, cannot be assumed to be independent. Generalized estimating equation (GEE) is a widely used method to estimate marginal regression parameters for correlated responses. The advantage of the GEE is that the estimates of the regression parameters are asymptotically unbiased even if the correlation structure is misspecified, although their small sample properties are not known. In this paper, two bias adjusted GEE estimators of the regression parameters in longitudinal data are obtained when the number of subjects is small. One is based on a bias correction, and the other is based on a bias reduction. Simulations show that the performances of both the bias-corrected methods are similar in terms of bias, efficiency, coverage probability, average coverage length, impact of misspecification of correlation structure, and impact of cluster size on bias correction. Both these methods show superior properties over the GEE estimates for small samples. Further, analysis of data involving a small number of subjects also shows improvement in bias, MSE, standard error, and length of the confidence interval of the estimates by the two bias adjusted methods over the GEE estimates. For small to moderate sample sizes (N ≤50), either of the bias-corrected methods GEEBc and GEEBr can be used. However, the method GEEBc should be preferred over GEEBr, as the former is computationally easier. For large sample sizes, the GEE method can be used. Copyright © 2014 John Wiley & Sons, Ltd.
Predicting sample size required for classification performance
Directory of Open Access Journals (Sweden)
Figueroa Rosa L
2012-02-01
Full Text Available Abstract Background Supervised learning methods need annotated data in order to generate efficient models. Annotated data, however, is a relatively scarce resource and can be expensive to obtain. For both passive and active learning methods, there is a need to estimate the size of the annotated sample required to reach a performance target. Methods We designed and implemented a method that fits an inverse power law model to points of a given learning curve created using a small annotated training set. Fitting is carried out using nonlinear weighted least squares optimization. The fitted model is then used to predict the classifier's performance and confidence interval for larger sample sizes. For evaluation, the nonlinear weighted curve fitting method was applied to a set of learning curves generated using clinical text and waveform classification tasks with active and passive sampling methods, and predictions were validated using standard goodness of fit measures. As control we used an un-weighted fitting method. Results A total of 568 models were fitted and the model predictions were compared with the observed performances. Depending on the data set and sampling method, it took between 80 to 560 annotated samples to achieve mean average and root mean squared error below 0.01. Results also show that our weighted fitting method outperformed the baseline un-weighted method (p Conclusions This paper describes a simple and effective sample size prediction algorithm that conducts weighted fitting of learning curves. The algorithm outperformed an un-weighted algorithm described in previous literature. It can help researchers determine annotation sample size for supervised machine learning.
The impact of sample size on the reproducibility of voxel-based lesion-deficit mappings.
Lorca-Puls, Diego L; Gajardo-Vidal, Andrea; White, Jitrachote; Seghier, Mohamed L; Leff, Alexander P; Green, David W; Crinion, Jenny T; Ludersdorfer, Philipp; Hope, Thomas M H; Bowman, Howard; Price, Cathy J
2018-07-01
This study investigated how sample size affects the reproducibility of findings from univariate voxel-based lesion-deficit analyses (e.g., voxel-based lesion-symptom mapping and voxel-based morphometry). Our effect of interest was the strength of the mapping between brain damage and speech articulation difficulties, as measured in terms of the proportion of variance explained. First, we identified a region of interest by searching on a voxel-by-voxel basis for brain areas where greater lesion load was associated with poorer speech articulation using a large sample of 360 right-handed English-speaking stroke survivors. We then randomly drew thousands of bootstrap samples from this data set that included either 30, 60, 90, 120, 180, or 360 patients. For each resample, we recorded effect size estimates and p values after conducting exactly the same lesion-deficit analysis within the previously identified region of interest and holding all procedures constant. The results show (1) how often small effect sizes in a heterogeneous population fail to be detected; (2) how effect size and its statistical significance varies with sample size; (3) how low-powered studies (due to small sample sizes) can greatly over-estimate as well as under-estimate effect sizes; and (4) how large sample sizes (N ≥ 90) can yield highly significant p values even when effect sizes are so small that they become trivial in practical terms. The implications of these findings for interpreting the results from univariate voxel-based lesion-deficit analyses are discussed. Copyright © 2018 The Author(s). Published by Elsevier Ltd.. All rights reserved.
Sample Size for Tablet Compression and Capsule Filling Events During Process Validation.
Charoo, Naseem Ahmad; Durivage, Mark; Rahman, Ziyaur; Ayad, Mohamad Haitham
2017-12-01
During solid dosage form manufacturing, the uniformity of dosage units (UDU) is ensured by testing samples at 2 stages, that is, blend stage and tablet compression or capsule/powder filling stage. The aim of this work is to propose a sample size selection approach based on quality risk management principles for process performance qualification (PPQ) and continued process verification (CPV) stages by linking UDU to potential formulation and process risk factors. Bayes success run theorem appeared to be the most appropriate approach among various methods considered in this work for computing sample size for PPQ. The sample sizes for high-risk (reliability level of 99%), medium-risk (reliability level of 95%), and low-risk factors (reliability level of 90%) were estimated to be 299, 59, and 29, respectively. Risk-based assignment of reliability levels was supported by the fact that at low defect rate, the confidence to detect out-of-specification units would decrease which must be supplemented with an increase in sample size to enhance the confidence in estimation. Based on level of knowledge acquired during PPQ and the level of knowledge further required to comprehend process, sample size for CPV was calculated using Bayesian statistics to accomplish reduced sampling design for CPV. Copyright © 2017 American Pharmacists Association®. Published by Elsevier Inc. All rights reserved.
Directory of Open Access Journals (Sweden)
Stefanović Milena
2013-01-01
Full Text Available In studies of population variability, particular attention has to be paid to the selection of a representative sample. The aim of this study was to assess the size of the new representative sample on the basis of the variability of chemical content of the initial sample on the example of a whitebark pine population. Statistical analysis included the content of 19 characteristics (terpene hydrocarbons and their derivates of the initial sample of 10 elements (trees. It was determined that the new sample should contain 20 trees so that the mean value calculated from it represents a basic set with a probability higher than 95 %. Determination of the lower limit of the representative sample size that guarantees a satisfactory reliability of generalization proved to be very important in order to achieve cost efficiency of the research. [Projekat Ministarstva nauke Republike Srbije, br. OI-173011, br. TR-37002 i br. III-43007
The Sample Size Influence in the Accuracy of the Image Classification of the Remote Sensing
Directory of Open Access Journals (Sweden)
Thomaz C. e C. da Costa
2004-12-01
Full Text Available Landuse/landcover maps produced by classification of remote sensing images incorporate uncertainty. This uncertainty is measured by accuracy indices using reference samples. The size of the reference sample is defined by approximation by a binomial function without the use of a pilot sample. This way the accuracy are not estimated, but fixed a priori. In case of divergency between the estimated and a priori accuracy the error of the sampling will deviate from the expected error. The size using pilot sample (theorically correct procedure justify when haven´t estimate of accuracy for work area, referent the product remote sensing utility.
Simple and multiple linear regression: sample size considerations.
Hanley, James A
2016-11-01
The suggested "two subjects per variable" (2SPV) rule of thumb in the Austin and Steyerberg article is a chance to bring out some long-established and quite intuitive sample size considerations for both simple and multiple linear regression. This article distinguishes two of the major uses of regression models that imply very different sample size considerations, neither served well by the 2SPV rule. The first is etiological research, which contrasts mean Y levels at differing "exposure" (X) values and thus tends to focus on a single regression coefficient, possibly adjusted for confounders. The second research genre guides clinical practice. It addresses Y levels for individuals with different covariate patterns or "profiles." It focuses on the profile-specific (mean) Y levels themselves, estimating them via linear compounds of regression coefficients and covariates. By drawing on long-established closed-form variance formulae that lie beneath the standard errors in multiple regression, and by rearranging them for heuristic purposes, one arrives at quite intuitive sample size considerations for both research genres. Copyright Â© 2016 Elsevier Inc. All rights reserved.
Estimating population sizes for elusive animals: the forest elephants of Kakum National Park, Ghana.
Eggert, L S; Eggert, J A; Woodruff, D S
2003-06-01
African forest elephants are difficult to observe in the dense vegetation, and previous studies have relied upon indirect methods to estimate population sizes. Using multilocus genotyping of noninvasively collected samples, we performed a genetic survey of the forest elephant population at Kakum National Park, Ghana. We estimated population size, sex ratio and genetic variability from our data, then combined this information with field observations to divide the population into age groups. Our population size estimate was very close to that obtained using dung counts, the most commonly used indirect method of estimating the population sizes of forest elephant populations. As their habitat is fragmented by expanding human populations, management will be increasingly important to the persistence of forest elephant populations. The data that can be obtained from noninvasively collected samples will help managers plan for the conservation of this keystone species.
Particle size distributions (PSD) have long been used to more accurately estimate the PM10 fraction of total particulate matter (PM) stack samples taken from agricultural sources. These PSD analyses were typically conducted using a Coulter Counter with 50 micrometer aperture tube. With recent increa...
Estimating the size of the homeless population in Budapest, Hungary
David, B; Snijders, TAB
In this study we try to estimate the size of the homeless population in Budapest by using two - non-standard - sampling methods: snowball sampling and capture-recapture method. Using two methods and three different data sets we are able to compare the methods as well as the results, and we also
Effects of sample size and sampling frequency on studies of brown bear home ranges and habitat use
Arthur, Steve M.; Schwartz, Charles C.
1999-01-01
We equipped 9 brown bears (Ursus arctos) on the Kenai Peninsula, Alaska, with collars containing both conventional very-high-frequency (VHF) transmitters and global positioning system (GPS) receivers programmed to determine an animal's position at 5.75-hr intervals. We calculated minimum convex polygon (MCP) and fixed and adaptive kernel home ranges for randomly-selected subsets of the GPS data to examine the effects of sample size on accuracy and precision of home range estimates. We also compared results obtained by weekly aerial radiotracking versus more frequent GPS locations to test for biases in conventional radiotracking data. Home ranges based on the MCP were 20-606 km2 (x = 201) for aerial radiotracking data (n = 12-16 locations/bear) and 116-1,505 km2 (x = 522) for the complete GPS data sets (n = 245-466 locations/bear). Fixed kernel home ranges were 34-955 km2 (x = 224) for radiotracking data and 16-130 km2 (x = 60) for the GPS data. Differences between means for radiotracking and GPS data were due primarily to the larger samples provided by the GPS data. Means did not differ between radiotracking data and equivalent-sized subsets of GPS data (P > 0.10). For the MCP, home range area increased and variability decreased asymptotically with number of locations. For the kernel models, both area and variability decreased with increasing sample size. Simulations suggested that the MCP and kernel models required >60 and >80 locations, respectively, for estimates to be both accurate (change in area bears. Our results suggest that the usefulness of conventional radiotracking data may be limited by potential biases and variability due to small samples. Investigators that use home range estimates in statistical tests should consider the effects of variability of those estimates. Use of GPS-equipped collars can facilitate obtaining larger samples of unbiased data and improve accuracy and precision of home range estimates.
Hua, Xue; Hibar, Derrek P; Ching, Christopher R K; Boyle, Christina P; Rajagopalan, Priya; Gutman, Boris A; Leow, Alex D; Toga, Arthur W; Jack, Clifford R; Harvey, Danielle; Weiner, Michael W; Thompson, Paul M
2013-02-01
Various neuroimaging measures are being evaluated for tracking Alzheimer's disease (AD) progression in therapeutic trials, including measures of structural brain change based on repeated scanning of patients with magnetic resonance imaging (MRI). Methods to compute brain change must be robust to scan quality. Biases may arise if any scans are thrown out, as this can lead to the true changes being overestimated or underestimated. Here we analyzed the full MRI dataset from the first phase of Alzheimer's Disease Neuroimaging Initiative (ADNI-1) from the first phase of Alzheimer's Disease Neuroimaging Initiative (ADNI-1) and assessed several sources of bias that can arise when tracking brain changes with structural brain imaging methods, as part of a pipeline for tensor-based morphometry (TBM). In all healthy subjects who completed MRI scanning at screening, 6, 12, and 24months, brain atrophy was essentially linear with no detectable bias in longitudinal measures. In power analyses for clinical trials based on these change measures, only 39AD patients and 95 mild cognitive impairment (MCI) subjects were needed for a 24-month trial to detect a 25% reduction in the average rate of change using a two-sided test (α=0.05, power=80%). Further sample size reductions were achieved by stratifying the data into Apolipoprotein E (ApoE) ε4 carriers versus non-carriers. We show how selective data exclusion affects sample size estimates, motivating an objective comparison of different analysis techniques based on statistical power and robustness. TBM is an unbiased, robust, high-throughput imaging surrogate marker for large, multi-site neuroimaging studies and clinical trials of AD and MCI. Copyright © 2012 Elsevier Inc. All rights reserved.
Estimation for small domains in double sampling for stratification ...
African Journals Online (AJOL)
In this article, we investigate the effect of randomness of the size of a small domain on the precision of an estimator of mean for the domain under double sampling for stratification. The result shows that for a small domain that cuts across various strata with unknown weights, the sampling variance depends on the within ...
Sample Size Calculation for Controlling False Discovery Proportion
Directory of Open Access Journals (Sweden)
Shulian Shang
2012-01-01
Full Text Available The false discovery proportion (FDP, the proportion of incorrect rejections among all rejections, is a direct measure of abundance of false positive findings in multiple testing. Many methods have been proposed to control FDP, but they are too conservative to be useful for power analysis. Study designs for controlling the mean of FDP, which is false discovery rate, have been commonly used. However, there has been little attempt to design study with direct FDP control to achieve certain level of efficiency. We provide a sample size calculation method using the variance formula of the FDP under weak-dependence assumptions to achieve the desired overall power. The relationship between design parameters and sample size is explored. The adequacy of the procedure is assessed by simulation. We illustrate the method using estimated correlations from a prostate cancer dataset.
The Influence of Mark-Recapture Sampling Effort on Estimates of Rock Lobster Survival.
Directory of Open Access Journals (Sweden)
Ziya Kordjazi
Full Text Available Five annual capture-mark-recapture surveys on Jasus edwardsii were used to evaluate the effect of sample size and fishing effort on the precision of estimated survival probability. Datasets of different numbers of individual lobsters (ranging from 200 to 1,000 lobsters were created by random subsampling from each annual survey. This process of random subsampling was also used to create 12 datasets of different levels of effort based on three levels of the number of traps (15, 30 and 50 traps per day and four levels of the number of sampling-days (2, 4, 6 and 7 days. The most parsimonious Cormack-Jolly-Seber (CJS model for estimating survival probability shifted from a constant model towards sex-dependent models with increasing sample size and effort. A sample of 500 lobsters or 50 traps used on four consecutive sampling-days was required for obtaining precise survival estimations for males and females, separately. Reduced sampling effort of 30 traps over four sampling days was sufficient if a survival estimate for both sexes combined was sufficient for management of the fishery.
The Influence of Mark-Recapture Sampling Effort on Estimates of Rock Lobster Survival
Kordjazi, Ziya; Frusher, Stewart; Buxton, Colin; Gardner, Caleb; Bird, Tomas
2016-01-01
Five annual capture-mark-recapture surveys on Jasus edwardsii were used to evaluate the effect of sample size and fishing effort on the precision of estimated survival probability. Datasets of different numbers of individual lobsters (ranging from 200 to 1,000 lobsters) were created by random subsampling from each annual survey. This process of random subsampling was also used to create 12 datasets of different levels of effort based on three levels of the number of traps (15, 30 and 50 traps per day) and four levels of the number of sampling-days (2, 4, 6 and 7 days). The most parsimonious Cormack-Jolly-Seber (CJS) model for estimating survival probability shifted from a constant model towards sex-dependent models with increasing sample size and effort. A sample of 500 lobsters or 50 traps used on four consecutive sampling-days was required for obtaining precise survival estimations for males and females, separately. Reduced sampling effort of 30 traps over four sampling days was sufficient if a survival estimate for both sexes combined was sufficient for management of the fishery. PMID:26990561
Re-estimating sample size in cluster randomized trials with active recruitment within clusters
van Schie, Sander; Moerbeek, Mirjam
2014-01-01
Often only a limited number of clusters can be obtained in cluster randomised trials, although many potential participants can be recruited within each cluster. Thus, active recruitment is feasible within the clusters. To obtain an efficient sample size in a cluster randomised trial, the cluster
The large sample size fallacy.
Lantz, Björn
2013-06-01
Significance in the statistical sense has little to do with significance in the common practical sense. Statistical significance is a necessary but not a sufficient condition for practical significance. Hence, results that are extremely statistically significant may be highly nonsignificant in practice. The degree of practical significance is generally determined by the size of the observed effect, not the p-value. The results of studies based on large samples are often characterized by extreme statistical significance despite small or even trivial effect sizes. Interpreting such results as significant in practice without further analysis is referred to as the large sample size fallacy in this article. The aim of this article is to explore the relevance of the large sample size fallacy in contemporary nursing research. Relatively few nursing articles display explicit measures of observed effect sizes or include a qualitative discussion of observed effect sizes. Statistical significance is often treated as an end in itself. Effect sizes should generally be calculated and presented along with p-values for statistically significant results, and observed effect sizes should be discussed qualitatively through direct and explicit comparisons with the effects in related literature. © 2012 Nordic College of Caring Science.
Sample size in qualitative interview studies
DEFF Research Database (Denmark)
Malterud, Kirsti; Siersma, Volkert Dirk; Guassora, Ann Dorrit Kristiane
2016-01-01
Sample sizes must be ascertained in qualitative studies like in quantitative studies but not by the same means. The prevailing concept for sample size in qualitative studies is “saturation.” Saturation is closely tied to a specific methodology, and the term is inconsistently applied. We propose...... the concept “information power” to guide adequate sample size for qualitative studies. Information power indicates that the more information the sample holds, relevant for the actual study, the lower amount of participants is needed. We suggest that the size of a sample with sufficient information power...... and during data collection of a qualitative study is discussed....
Estimating Search Engine Index Size Variability
DEFF Research Database (Denmark)
Van den Bosch, Antal; Bogers, Toine; De Kunder, Maurice
2016-01-01
One of the determining factors of the quality of Web search engines is the size of their index. In addition to its influence on search result quality, the size of the indexed Web can also tell us something about which parts of the WWW are directly accessible to the everyday user. We propose a novel...... method of estimating the size of a Web search engine’s index by extrapolating from document frequencies of words observed in a large static corpus of Web pages. In addition, we provide a unique longitudinal perspective on the size of Google and Bing’s indices over a nine-year period, from March 2006...... until January 2015. We find that index size estimates of these two search engines tend to vary dramatically over time, with Google generally possessing a larger index than Bing. This result raises doubts about the reliability of previous one-off estimates of the size of the indexed Web. We find...
Wan, Xiang; Wang, Wenqian; Liu, Jiming; Tong, Tiejun
2014-01-01
Background In systematic reviews and meta-analysis, researchers often pool the results of the sample mean and standard deviation from a set of similar clinical trials. A number of the trials, however, reported the study using the median, the minimum and maximum values, and/or the first and third quartiles. Hence, in order to combine results, one may have to estimate the sample mean and standard deviation for such trials. Methods In this paper, we propose to improve the existing literature in ...
Creel, Scott; Spong, Goran; Sands, Jennifer L; Rotella, Jay; Zeigle, Janet; Joe, Lawrence; Murphy, Kerry M; Smith, Douglas
2003-07-01
Determining population sizes can be difficult, but is essential for conservation. By counting distinct microsatellite genotypes, DNA from noninvasive samples (hair, faeces) allows estimation of population size. Problems arise because genotypes from noninvasive samples are error-prone, but genotyping errors can be reduced by multiple polymerase chain reaction (PCR). For faecal genotypes from wolves in Yellowstone National Park, error rates varied substantially among samples, often above the 'worst-case threshold' suggested by simulation. Consequently, a substantial proportion of multilocus genotypes held one or more errors, despite multiple PCR. These genotyping errors created several genotypes per individual and caused overestimation (up to 5.5-fold) of population size. We propose a 'matching approach' to eliminate this overestimation bias.
Hua, Xue; Hibar, Derrek P.; Ching, Christopher R.K.; Boyle, Christina P.; Rajagopalan, Priya; Gutman, Boris A.; Leow, Alex D.; Toga, Arthur W.; Jack, Clifford R.; Harvey, Danielle; Weiner, Michael W.; Thompson, Paul M.
2013-01-01
Various neuroimaging measures are being evaluated for tracking Alzheimer’s disease (AD) progression in therapeutic trials, including measures of structural brain change based on repeated scanning of patients with magnetic resonance imaging (MRI). Methods to compute brain change must be robust to scan quality. Biases may arise if any scans are thrown out, as this can lead to the true changes being overestimated or underestimated. Here we analyzed the full MRI dataset from the first phase of Alzheimer’s Disease Neuroimaging Initiative (ADNI-1) from the first phase of Alzheimer’s Disease Neuroimaging Initiative (ADNI-1) and assessed several sources of bias that can arise when tracking brain changes with structural brain imaging methods, as part of a pipeline for tensor-based morphometry (TBM). In all healthy subjects who completed MRI scanning at screening, 6, 12, and 24 months, brain atrophy was essentially linear with no detectable bias in longitudinal measures. In power analyses for clinical trials based on these change measures, only 39 AD patients and 95 mild cognitive impairment (MCI) subjects were needed for a 24-month trial to detect a 25% reduction in the average rate of change using a two-sided test (α=0.05, power=80%). Further sample size reductions were achieved by stratifying the data into Apolipoprotein E (ApoE) ε4 carriers versus non-carriers. We show how selective data exclusion affects sample size estimates, motivating an objective comparison of different analysis techniques based on statistical power and robustness. TBM is an unbiased, robust, high-throughput imaging surrogate marker for large, multi-site neuroimaging studies and clinical trials of AD and MCI. PMID:23153970
Baranowski, Tom; Baranowski, Janice C; Watson, Kathleen B; Martin, Shelby; Beltran, Alicia; Islam, Noemi; Dadabhoy, Hafza; Adame, Su-heyla; Cullen, Karen; Thompson, Debbe; Buday, Richard; Subar, Amy
2011-03-01
To test the effect of image size and presence of size cues on the accuracy of portion size estimation by children. Children were randomly assigned to seeing images with or without food size cues (utensils and checked tablecloth) and were presented with sixteen food models (foods commonly eaten by children) in varying portion sizes, one at a time. They estimated each food model's portion size by selecting a digital food image. The same food images were presented in two ways: (i) as small, graduated portion size images all on one screen or (ii) by scrolling across large, graduated portion size images, one per sequential screen. Laboratory-based with computer and food models. Volunteer multi-ethnic sample of 120 children, equally distributed by gender and ages (8 to 13 years) in 2008-2009. Average percentage of correctly classified foods was 60·3 %. There were no differences in accuracy by any design factor or demographic characteristic. Multiple small pictures on the screen at once took half the time to estimate portion size compared with scrolling through large pictures. Larger pictures had more overestimation of size. Multiple images of successively larger portion sizes of a food on one computer screen facilitated quicker portion size responses with no decrease in accuracy. This is the method of choice for portion size estimation on a computer.
Sample sizes to control error estimates in determining soil bulk density in California forest soils
Youzhi Han; Jianwei Zhang; Kim G. Mattson; Weidong Zhang; Thomas A. Weber
2016-01-01
Characterizing forest soil properties with high variability is challenging, sometimes requiring large numbers of soil samples. Soil bulk density is a standard variable needed along with element concentrations to calculate nutrient pools. This study aimed to determine the optimal sample size, the number of observation (n), for predicting the soil bulk density with a...
Improved sample size determination for attributes and variables sampling
International Nuclear Information System (INIS)
Stirpe, D.; Picard, R.R.
1985-01-01
Earlier INMM papers have addressed the attributes/variables problem and, under conservative/limiting approximations, have reported analytical solutions for the attributes and variables sample sizes. Through computer simulation of this problem, we have calculated attributes and variables sample sizes as a function of falsification, measurement uncertainties, and required detection probability without using approximations. Using realistic assumptions for uncertainty parameters of measurement, the simulation results support the conclusions: (1) previously used conservative approximations can be expensive because they lead to larger sample sizes than needed; and (2) the optimal verification strategy, as well as the falsification strategy, are highly dependent on the underlying uncertainty parameters of the measurement instruments. 1 ref., 3 figs
Tang, Yongqiang
2015-01-01
A sample size formula is derived for negative binomial regression for the analysis of recurrent events, in which subjects can have unequal follow-up time. We obtain sharp lower and upper bounds on the required size, which is easy to compute. The upper bound is generally only slightly larger than the required size, and hence can be used to approximate the sample size. The lower and upper size bounds can be decomposed into two terms. The first term relies on the mean number of events in each group, and the second term depends on two factors that measure, respectively, the extent of between-subject variability in event rates, and follow-up time. Simulation studies are conducted to assess the performance of the proposed method. An application of our formulae to a multiple sclerosis trial is provided.
Bergtold, Jason S.; Yeager, Elizabeth A.; Featherstone, Allen M.
2011-01-01
The logistic regression models has been widely used in the social and natural sciences and results from studies using this model can have significant impact. Thus, confidence in the reliability of inferences drawn from these models is essential. The robustness of such inferences is dependent on sample size. The purpose of this study is to examine the impact of sample size on the mean estimated bias and efficiency of parameter estimation and inference for the logistic regression model. A numbe...
Allen, John C; Thumboo, Julian; Lye, Weng Kit; Conaghan, Philip G; Chew, Li-Ching; Tan, York Kiat
2018-03-01
To determine whether novel methods of selecting joints through (i) ultrasonography (individualized-ultrasound [IUS] method), or (ii) ultrasonography and clinical examination (individualized-composite-ultrasound [ICUS] method) translate into smaller rheumatoid arthritis (RA) clinical trial sample sizes when compared to existing methods utilizing predetermined joint sites for ultrasonography. Cohen's effect size (ES) was estimated (ES^) and a 95% CI (ES^L, ES^U) calculated on a mean change in 3-month total inflammatory score for each method. Corresponding 95% CIs [nL(ES^U), nU(ES^L)] were obtained on a post hoc sample size reflecting the uncertainty in ES^. Sample size calculations were based on a one-sample t-test as the patient numbers needed to provide 80% power at α = 0.05 to reject a null hypothesis H 0 : ES = 0 versus alternative hypotheses H 1 : ES = ES^, ES = ES^L and ES = ES^U. We aimed to provide point and interval estimates on projected sample sizes for future studies reflecting the uncertainty in our study ES^S. Twenty-four treated RA patients were followed up for 3 months. Utilizing the 12-joint approach and existing methods, the post hoc sample size (95% CI) was 22 (10-245). Corresponding sample sizes using ICUS and IUS were 11 (7-40) and 11 (6-38), respectively. Utilizing a seven-joint approach, the corresponding sample sizes using ICUS and IUS methods were nine (6-24) and 11 (6-35), respectively. Our pilot study suggests that sample size for RA clinical trials with ultrasound endpoints may be reduced using the novel methods, providing justification for larger studies to confirm these observations. © 2017 Asia Pacific League of Associations for Rheumatology and John Wiley & Sons Australia, Ltd.
Experimental determination of size distributions: analyzing proper sample sizes
International Nuclear Information System (INIS)
Buffo, A; Alopaeus, V
2016-01-01
The measurement of various particle size distributions is a crucial aspect for many applications in the process industry. Size distribution is often related to the final product quality, as in crystallization or polymerization. In other cases it is related to the correct evaluation of heat and mass transfer, as well as reaction rates, depending on the interfacial area between the different phases or to the assessment of yield stresses of polycrystalline metals/alloys samples. The experimental determination of such distributions often involves laborious sampling procedures and the statistical significance of the outcome is rarely investigated. In this work, we propose a novel rigorous tool, based on inferential statistics, to determine the number of samples needed to obtain reliable measurements of size distribution, according to specific requirements defined a priori. Such methodology can be adopted regardless of the measurement technique used. (paper)
Effect size estimates: current use, calculations, and interpretation.
Fritz, Catherine O; Morris, Peter E; Richler, Jennifer J
2012-02-01
The Publication Manual of the American Psychological Association (American Psychological Association, 2001, American Psychological Association, 2010) calls for the reporting of effect sizes and their confidence intervals. Estimates of effect size are useful for determining the practical or theoretical importance of an effect, the relative contributions of factors, and the power of an analysis. We surveyed articles published in 2009 and 2010 in the Journal of Experimental Psychology: General, noting the statistical analyses reported and the associated reporting of effect size estimates. Effect sizes were reported for fewer than half of the analyses; no article reported a confidence interval for an effect size. The most often reported analysis was analysis of variance, and almost half of these reports were not accompanied by effect sizes. Partial η2 was the most commonly reported effect size estimate for analysis of variance. For t tests, 2/3 of the articles did not report an associated effect size estimate; Cohen's d was the most often reported. We provide a straightforward guide to understanding, selecting, calculating, and interpreting effect sizes for many types of data and to methods for calculating effect size confidence intervals and power analysis.
[Effect sizes, statistical power and sample sizes in "the Japanese Journal of Psychology"].
Suzukawa, Yumi; Toyoda, Hideki
2012-04-01
This study analyzed the statistical power of research studies published in the "Japanese Journal of Psychology" in 2008 and 2009. Sample effect sizes and sample statistical powers were calculated for each statistical test and analyzed with respect to the analytical methods and the fields of the studies. The results show that in the fields like perception, cognition or learning, the effect sizes were relatively large, although the sample sizes were small. At the same time, because of the small sample sizes, some meaningful effects could not be detected. In the other fields, because of the large sample sizes, meaningless effects could be detected. This implies that researchers who could not get large enough effect sizes would use larger samples to obtain significant results.
Directory of Open Access Journals (Sweden)
Patrick Habecker
Full Text Available Researchers interested in studying populations that are difficult to reach through traditional survey methods can now draw on a range of methods to access these populations. Yet many of these methods are more expensive and difficult to implement than studies using conventional sampling frames and trusted sampling methods. The network scale-up method (NSUM provides a middle ground for researchers who wish to estimate the size of a hidden population, but lack the resources to conduct a more specialized hidden population study. Through this method it is possible to generate population estimates for a wide variety of groups that are perhaps unwilling to self-identify as such (for example, users of illegal drugs or other stigmatized populations via traditional survey tools such as telephone or mail surveys--by asking a representative sample to estimate the number of people they know who are members of such a "hidden" subpopulation. The original estimator is formulated to minimize the weight a single scaling variable can exert upon the estimates. We argue that this introduces hidden and difficult to predict biases, and instead propose a series of methodological advances on the traditional scale-up estimation procedure, including a new estimator. Additionally, we formalize the incorporation of sample weights into the network scale-up estimation process, and propose a recursive process of back estimation "trimming" to identify and remove poorly performing predictors from the estimation process. To demonstrate these suggestions we use data from a network scale-up mail survey conducted in Nebraska during 2014. We find that using the new estimator and recursive trimming process provides more accurate estimates, especially when used in conjunction with sampling weights.
Directory of Open Access Journals (Sweden)
Aidan G. O’Keeffe
2017-12-01
Full Text Available Abstract Background In healthcare research, outcomes with skewed probability distributions are common. Sample size calculations for such outcomes are typically based on estimates on a transformed scale (e.g. log which may sometimes be difficult to obtain. In contrast, estimates of median and variance on the untransformed scale are generally easier to pre-specify. The aim of this paper is to describe how to calculate a sample size for a two group comparison of interest based on median and untransformed variance estimates for log-normal outcome data. Methods A log-normal distribution for outcome data is assumed and a sample size calculation approach for a two-sample t-test that compares log-transformed outcome data is demonstrated where the change of interest is specified as difference in median values on the untransformed scale. A simulation study is used to compare the method with a non-parametric alternative (Mann-Whitney U test in a variety of scenarios and the method is applied to a real example in neurosurgery. Results The method attained a nominal power value in simulation studies and was favourable in comparison to a Mann-Whitney U test and a two-sample t-test of untransformed outcomes. In addition, the method can be adjusted and used in some situations where the outcome distribution is not strictly log-normal. Conclusions We recommend the use of this sample size calculation approach for outcome data that are expected to be positively skewed and where a two group comparison on a log-transformed scale is planned. An advantage of this method over usual calculations based on estimates on the log-transformed scale is that it allows clinical efficacy to be specified as a difference in medians and requires a variance estimate on the untransformed scale. Such estimates are often easier to obtain and more interpretable than those for log-transformed outcomes.
Directory of Open Access Journals (Sweden)
Annegret Grimm
Full Text Available Reliable estimates of population size are fundamental in many ecological studies and biodiversity conservation. Selecting appropriate methods to estimate abundance is often very difficult, especially if data are scarce. Most studies concerning the reliability of different estimators used simulation data based on assumptions about capture variability that do not necessarily reflect conditions in natural populations. Here, we used data from an intensively studied closed population of the arboreal gecko Gehyra variegata to construct reference population sizes for assessing twelve different population size estimators in terms of bias, precision, accuracy, and their 95%-confidence intervals. Two of the reference populations reflect natural biological entities, whereas the other reference populations reflect artificial subsets of the population. Since individual heterogeneity was assumed, we tested modifications of the Lincoln-Petersen estimator, a set of models in programs MARK and CARE-2, and a truncated geometric distribution. Ranking of methods was similar across criteria. Models accounting for individual heterogeneity performed best in all assessment criteria. For populations from heterogeneous habitats without obvious covariates explaining individual heterogeneity, we recommend using the moment estimator or the interpolated jackknife estimator (both implemented in CAPTURE/MARK. If data for capture frequencies are substantial, we recommend the sample coverage or the estimating equation (both models implemented in CARE-2. Depending on the distribution of catchabilities, our proposed multiple Lincoln-Petersen and a truncated geometric distribution obtained comparably good results. The former usually resulted in a minimum population size and the latter can be recommended when there is a long tail of low capture probabilities. Models with covariates and mixture models performed poorly. Our approach identified suitable methods and extended options to
Gluttonous predators: how to estimate prey size when there are too many prey
Directory of Open Access Journals (Sweden)
MS. Araújo
Full Text Available Prey size is an important factor in food consumption. In studies of feeding ecology, prey items are usually measured individually using calipers or ocular micrometers. Among amphibians and reptiles, there are species that feed on large numbers of small prey items (e.g. ants, termites. This high intake makes it difficult to estimate prey size consumed by these animals. We addressed this problem by developing and evaluating a procedure for subsampling the stomach contents of such predators in order to estimate prey size. Specifically, we developed a protocol based on a bootstrap procedure to obtain a subsample with a precision error of at the most 5%, with a confidence level of at least 95%. This guideline should reduce the sampling effort and facilitate future studies on the feeding habits of amphibians and reptiles, and also provide a means of obtaining precise estimates of prey size.
Comparing interval estimates for small sample ordinal CFA models.
Natesan, Prathiba
2015-01-01
Robust maximum likelihood (RML) and asymptotically generalized least squares (AGLS) methods have been recommended for fitting ordinal structural equation models. Studies show that some of these methods underestimate standard errors. However, these studies have not investigated the coverage and bias of interval estimates. An estimate with a reasonable standard error could still be severely biased. This can only be known by systematically investigating the interval estimates. The present study compares Bayesian, RML, and AGLS interval estimates of factor correlations in ordinal confirmatory factor analysis models (CFA) for small sample data. Six sample sizes, 3 factor correlations, and 2 factor score distributions (multivariate normal and multivariate mildly skewed) were studied. Two Bayesian prior specifications, informative and relatively less informative were studied. Undercoverage of confidence intervals and underestimation of standard errors was common in non-Bayesian methods. Underestimated standard errors may lead to inflated Type-I error rates. Non-Bayesian intervals were more positive biased than negatively biased, that is, most intervals that did not contain the true value were greater than the true value. Some non-Bayesian methods had non-converging and inadmissible solutions for small samples and non-normal data. Bayesian empirical standard error estimates for informative and relatively less informative priors were closer to the average standard errors of the estimates. The coverage of Bayesian credibility intervals was closer to what was expected with overcoverage in a few cases. Although some Bayesian credibility intervals were wider, they reflected the nature of statistical uncertainty that comes with the data (e.g., small sample). Bayesian point estimates were also more accurate than non-Bayesian estimates. The results illustrate the importance of analyzing coverage and bias of interval estimates, and how ignoring interval estimates can be misleading
Sample size calculations for case-control studies
This R package can be used to calculate the required samples size for unconditional multivariate analyses of unmatched case-control studies. The sample sizes are for a scalar exposure effect, such as binary, ordinal or continuous exposures. The sample sizes can also be computed for scalar interaction effects. The analyses account for the effects of potential confounder variables that are also included in the multivariate logistic model.
Umesh P. Agarwal; Sally A. Ralph; Carlos Baez; Richard S. Reiner; Steve P. Verrill
2017-01-01
Although X-ray diffraction (XRD) has been the most widely used technique to investigate crystallinity index (CrI) and crystallite size (L200) of cellulose materials, there are not many studies that have taken into account the role of sample moisture on these measurements. The present investigation focuses on a variety of celluloses and cellulose...
Photometric estimation of defect size in radiation direction
International Nuclear Information System (INIS)
Zuev, V.M.
1993-01-01
Factors, affecting accuracy of photometric estimation of defect size in radiation transmission direction, are analyzed. Experimentally obtained dependences of contrast of defect image on its size in radiation transmission direction are presented. Practical recommendations on improving accuracy of photometric estimation of defect size in radiation transmission direction, are developed
On the Structure of Cortical Microcircuits Inferred from Small Sample Sizes.
Vegué, Marina; Perin, Rodrigo; Roxin, Alex
2017-08-30
The structure in cortical microcircuits deviates from what would be expected in a purely random network, which has been seen as evidence of clustering. To address this issue, we sought to reproduce the nonrandom features of cortical circuits by considering several distinct classes of network topology, including clustered networks, networks with distance-dependent connectivity, and those with broad degree distributions. To our surprise, we found that all of these qualitatively distinct topologies could account equally well for all reported nonrandom features despite being easily distinguishable from one another at the network level. This apparent paradox was a consequence of estimating network properties given only small sample sizes. In other words, networks that differ markedly in their global structure can look quite similar locally. This makes inferring network structure from small sample sizes, a necessity given the technical difficulty inherent in simultaneous intracellular recordings, problematic. We found that a network statistic called the sample degree correlation (SDC) overcomes this difficulty. The SDC depends only on parameters that can be estimated reliably given small sample sizes and is an accurate fingerprint of every topological family. We applied the SDC criterion to data from rat visual and somatosensory cortex and discovered that the connectivity was not consistent with any of these main topological classes. However, we were able to fit the experimental data with a more general network class, of which all previous topologies were special cases. The resulting network topology could be interpreted as a combination of physical spatial dependence and nonspatial, hierarchical clustering. SIGNIFICANCE STATEMENT The connectivity of cortical microcircuits exhibits features that are inconsistent with a simple random network. Here, we show that several classes of network models can account for this nonrandom structure despite qualitative differences in
Directory of Open Access Journals (Sweden)
Simon Boitard
2016-03-01
Full Text Available Inferring the ancestral dynamics of effective population size is a long-standing question in population genetics, which can now be tackled much more accurately thanks to the massive genomic data available in many species. Several promising methods that take advantage of whole-genome sequences have been recently developed in this context. However, they can only be applied to rather small samples, which limits their ability to estimate recent population size history. Besides, they can be very sensitive to sequencing or phasing errors. Here we introduce a new approximate Bayesian computation approach named PopSizeABC that allows estimating the evolution of the effective population size through time, using a large sample of complete genomes. This sample is summarized using the folded allele frequency spectrum and the average zygotic linkage disequilibrium at different bins of physical distance, two classes of statistics that are widely used in population genetics and can be easily computed from unphased and unpolarized SNP data. Our approach provides accurate estimations of past population sizes, from the very first generations before present back to the expected time to the most recent common ancestor of the sample, as shown by simulations under a wide range of demographic scenarios. When applied to samples of 15 or 25 complete genomes in four cattle breeds (Angus, Fleckvieh, Holstein and Jersey, PopSizeABC revealed a series of population declines, related to historical events such as domestication or modern breed creation. We further highlight that our approach is robust to sequencing errors, provided summary statistics are computed from SNPs with common alleles.
Liu, Jingxia; Colditz, Graham A
2018-05-01
There is growing interest in conducting cluster randomized trials (CRTs). For simplicity in sample size calculation, the cluster sizes are assumed to be identical across all clusters. However, equal cluster sizes are not guaranteed in practice. Therefore, the relative efficiency (RE) of unequal versus equal cluster sizes has been investigated when testing the treatment effect. One of the most important approaches to analyze a set of correlated data is the generalized estimating equation (GEE) proposed by Liang and Zeger, in which the "working correlation structure" is introduced and the association pattern depends on a vector of association parameters denoted by ρ. In this paper, we utilize GEE models to test the treatment effect in a two-group comparison for continuous, binary, or count data in CRTs. The variances of the estimator of the treatment effect are derived for the different types of outcome. RE is defined as the ratio of variance of the estimator of the treatment effect for equal to unequal cluster sizes. We discuss a commonly used structure in CRTs-exchangeable, and derive the simpler formula of RE with continuous, binary, and count outcomes. Finally, REs are investigated for several scenarios of cluster size distributions through simulation studies. We propose an adjusted sample size due to efficiency loss. Additionally, we also propose an optimal sample size estimation based on the GEE models under a fixed budget for known and unknown association parameter (ρ) in the working correlation structure within the cluster. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Relative efficiency and sample size for cluster randomized trials with variable cluster sizes.
You, Zhiying; Williams, O Dale; Aban, Inmaculada; Kabagambe, Edmond Kato; Tiwari, Hemant K; Cutter, Gary
2011-02-01
The statistical power of cluster randomized trials depends on two sample size components, the number of clusters per group and the numbers of individuals within clusters (cluster size). Variable cluster sizes are common and this variation alone may have significant impact on study power. Previous approaches have taken this into account by either adjusting total sample size using a designated design effect or adjusting the number of clusters according to an assessment of the relative efficiency of unequal versus equal cluster sizes. This article defines a relative efficiency of unequal versus equal cluster sizes using noncentrality parameters, investigates properties of this measure, and proposes an approach for adjusting the required sample size accordingly. We focus on comparing two groups with normally distributed outcomes using t-test, and use the noncentrality parameter to define the relative efficiency of unequal versus equal cluster sizes and show that statistical power depends only on this parameter for a given number of clusters. We calculate the sample size required for an unequal cluster sizes trial to have the same power as one with equal cluster sizes. Relative efficiency based on the noncentrality parameter is straightforward to calculate and easy to interpret. It connects the required mean cluster size directly to the required sample size with equal cluster sizes. Consequently, our approach first determines the sample size requirements with equal cluster sizes for a pre-specified study power and then calculates the required mean cluster size while keeping the number of clusters unchanged. Our approach allows adjustment in mean cluster size alone or simultaneous adjustment in mean cluster size and number of clusters, and is a flexible alternative to and a useful complement to existing methods. Comparison indicated that we have defined a relative efficiency that is greater than the relative efficiency in the literature under some conditions. Our measure
On population size estimators in the Poisson mixture model.
Mao, Chang Xuan; Yang, Nan; Zhong, Jinhua
2013-09-01
Estimating population sizes via capture-recapture experiments has enormous applications. The Poisson mixture model can be adopted for those applications with a single list in which individuals appear one or more times. We compare several nonparametric estimators, including the Chao estimator, the Zelterman estimator, two jackknife estimators and the bootstrap estimator. The target parameter of the Chao estimator is a lower bound of the population size. Those of the other four estimators are not lower bounds, and they may produce lower confidence limits for the population size with poor coverage probabilities. A simulation study is reported and two examples are investigated. © 2013, The International Biometric Society.
Estimating the size of juvenile fish populations in southeastern coastal-plain estuaries
International Nuclear Information System (INIS)
Kjelson, M.A.
1977-01-01
Understanding the ecological significance of man's activities upon fishery resources requires information on the size of affected fish stocks. The objective of this paper is to provide information to evaluate and plan sampling programs designed to obtain accurate and precise estimates of fish abundance. Nursery habitats, as marsh--tidal creeks and submerged grass beds, offer the optimal conditions for estimating natural mortality rates for young-of-the-year fish in Atlantic and Gulf of Mexico coast estuaries. The area-density method of abundance estimation using quantitative gears is more feasible than either mark-recapture or direct-count techniques. The blockage method provides the most accurate estimates, while encircling devices enable highly mobile species found in open water to be captured. Drop nets and lift nets allow samples to be taken in obstructed sites, but trawls and seines are the most economical gears. Replicate samples are necessary to improve the precision of density estimates, while evaluation and use of gear-catch efficiencies is feasible and required to improve the accuracy of density estimates. Coefficients of variation for replicate trawl samples range from 50 to 150 percent, while catch efficiencies for both trawls and seines for many juvenile fishes range from approximately 30 to 70 percent
Neuromuscular dose-response studies: determining sample size.
Kopman, A F; Lien, C A; Naguib, M
2011-02-01
Investigators planning dose-response studies of neuromuscular blockers have rarely used a priori power analysis to determine the minimal sample size their protocols require. Institutional Review Boards and peer-reviewed journals now generally ask for this information. This study outlines a proposed method for meeting these requirements. The slopes of the dose-response relationships of eight neuromuscular blocking agents were determined using regression analysis. These values were substituted for γ in the Hill equation. When this is done, the coefficient of variation (COV) around the mean value of the ED₅₀ for each drug is easily calculated. Using these values, we performed an a priori one-sample two-tailed t-test of the means to determine the required sample size when the allowable error in the ED₅₀ was varied from ±10-20%. The COV averaged 22% (range 15-27%). We used a COV value of 25% in determining the sample size. If the allowable error in finding the mean ED₅₀ is ±15%, a sample size of 24 is needed to achieve a power of 80%. Increasing 'accuracy' beyond this point requires increasing greater sample sizes (e.g. an 'n' of 37 for a ±12% error). On the basis of the results of this retrospective analysis, a total sample size of not less than 24 subjects should be adequate for determining a neuromuscular blocking drug's clinical potency with a reasonable degree of assurance.
Sample size calculation to externally validate scoring systems based on logistic regression models.
Directory of Open Access Journals (Sweden)
Antonio Palazón-Bru
Full Text Available A sample size containing at least 100 events and 100 non-events has been suggested to validate a predictive model, regardless of the model being validated and that certain factors can influence calibration of the predictive model (discrimination, parameterization and incidence. Scoring systems based on binary logistic regression models are a specific type of predictive model.The aim of this study was to develop an algorithm to determine the sample size for validating a scoring system based on a binary logistic regression model and to apply it to a case study.The algorithm was based on bootstrap samples in which the area under the ROC curve, the observed event probabilities through smooth curves, and a measure to determine the lack of calibration (estimated calibration index were calculated. To illustrate its use for interested researchers, the algorithm was applied to a scoring system, based on a binary logistic regression model, to determine mortality in intensive care units.In the case study provided, the algorithm obtained a sample size with 69 events, which is lower than the value suggested in the literature.An algorithm is provided for finding the appropriate sample size to validate scoring systems based on binary logistic regression models. This could be applied to determine the sample size in other similar cases.
Joint inversion of NMR and SIP data to estimate pore size distribution of geomaterials
Niu, Qifei; Zhang, Chi
2018-03-01
There are growing interests in using geophysical tools to characterize the microstructure of geomaterials because of the non-invasive nature and the applicability in field. In these applications, multiple types of geophysical data sets are usually processed separately, which may be inadequate to constrain the key feature of target variables. Therefore, simultaneous processing of multiple data sets could potentially improve the resolution. In this study, we propose a method to estimate pore size distribution by joint inversion of nuclear magnetic resonance (NMR) T2 relaxation and spectral induced polarization (SIP) spectra. The petrophysical relation between NMR T2 relaxation time and SIP relaxation time is incorporated in a nonlinear least squares problem formulation, which is solved using Gauss-Newton method. The joint inversion scheme is applied to a synthetic sample and a Berea sandstone sample. The jointly estimated pore size distributions are very close to the true model and results from other experimental method. Even when the knowledge of the petrophysical models of the sample is incomplete, the joint inversion can still capture the main features of the pore size distribution of the samples, including the general shape and relative peak positions of the distribution curves. It is also found from the numerical example that the surface relaxivity of the sample could be extracted with the joint inversion of NMR and SIP data if the diffusion coefficient of the ions in the electrical double layer is known. Comparing to individual inversions, the joint inversion could improve the resolution of the estimated pore size distribution because of the addition of extra data sets. The proposed approach might constitute a first step towards a comprehensive joint inversion that can extract the full pore geometry information of a geomaterial from NMR and SIP data.
Creel survey sampling designs for estimating effort in short-duration Chinook salmon fisheries
McCormick, Joshua L.; Quist, Michael C.; Schill, Daniel J.
2013-01-01
Chinook Salmon Oncorhynchus tshawytscha sport fisheries in the Columbia River basin are commonly monitored using roving creel survey designs and require precise, unbiased catch estimates. The objective of this study was to examine the relative bias and precision of total catch estimates using various sampling designs to estimate angling effort under the assumption that mean catch rate was known. We obtained information on angling populations based on direct visual observations of portions of Chinook Salmon fisheries in three Idaho river systems over a 23-d period. Based on the angling population, Monte Carlo simulations were used to evaluate the properties of effort and catch estimates for each sampling design. All sampling designs evaluated were relatively unbiased. Systematic random sampling (SYS) resulted in the most precise estimates. The SYS and simple random sampling designs had mean square error (MSE) estimates that were generally half of those observed with cluster sampling designs. The SYS design was more efficient (i.e., higher accuracy per unit cost) than a two-cluster design. Increasing the number of clusters available for sampling within a day decreased the MSE of estimates of daily angling effort, but the MSE of total catch estimates was variable depending on the fishery. The results of our simulations provide guidelines on the relative influence of sample sizes and sampling designs on parameters of interest in short-duration Chinook Salmon fisheries.
Directory of Open Access Journals (Sweden)
Esther Wong
Full Text Available We have developed a modified FlowCAM procedure for efficiently quantifying the size distribution of zooplankton. The modified method offers the following new features: 1 prevents animals from settling and clogging with constant bubbling in the sample container; 2 prevents damage to sample animals and facilitates recycling by replacing the built-in peristaltic pump with an external syringe pump, in order to generate negative pressure, creates a steady flow by drawing air from the receiving conical flask (i.e. vacuum pump, and transfers plankton from the sample container toward the main flowcell of the imaging system and finally into the receiving flask; 3 aligns samples in advance of imaging and prevents clogging with an additional flowcell placed ahead of the main flowcell. These modifications were designed to overcome the difficulties applying the standard FlowCAM procedure to studies where the number of individuals per sample is small, and since the FlowCAM can only image a subset of a sample. Our effective recycling procedure allows users to pass the same sample through the FlowCAM many times (i.e. bootstrapping the sample in order to generate a good size distribution. Although more advanced FlowCAM models are equipped with syringe pump and Field of View (FOV flowcells which can image all particles passing through the flow field; we note that these advanced setups are very expensive, offer limited syringe and flowcell sizes, and do not guarantee recycling. In contrast, our modifications are inexpensive and flexible. Finally, we compared the biovolumes estimated by automated FlowCAM image analysis versus conventional manual measurements, and found that the size of an individual zooplankter can be estimated by the FlowCAM image system after ground truthing.
Threshold-dependent sample sizes for selenium assessment with stream fish tissue
Hitt, Nathaniel P.; Smith, David R.
2015-01-01
precision of composites for estimating mean conditions. However, low sample sizes (<5 fish) did not achieve 80% power to detect near-threshold values (i.e., <1 mg Se/kg) under any scenario we evaluated. This analysis can assist the sampling design and interpretation of Se assessments from fish tissue by accounting for natural variation in stream fish populations.
DEFF Research Database (Denmark)
Gardi, Jonathan Eyal; Nyengaard, Jens Randel; Gundersen, Hans Jørgen Gottlieb
2008-01-01
Quantification of tissue properties is improved using the general proportionator sampling and estimation procedure: automatic image analysis and non-uniform sampling with probability proportional to size (PPS). The complete region of interest is partitioned into fields of view, and every field...... of view is given a weight (the size) proportional to the total amount of requested image analysis features in it. The fields of view sampled with known probabilities proportional to individual weight are the only ones seen by the observer who provides the correct count. Even though the image analysis...... cerebellum, total number of orexin positive neurons in transgenic mice brain, and estimating the absolute area and the areal fraction of β islet cells in dog pancreas. The proportionator was at least eight times more efficient (precision and time combined) than traditional computer controlled sampling....
Sample Size Determination for One- and Two-Sample Trimmed Mean Tests
Luh, Wei-Ming; Olejnik, Stephen; Guo, Jiin-Huarng
2008-01-01
Formulas to determine the necessary sample sizes for parametric tests of group comparisons are available from several sources and appropriate when population distributions are normal. However, in the context of nonnormal population distributions, researchers recommend Yuen's trimmed mean test, but formulas to determine sample sizes have not been…
Sampling and estimating recreational use.
Timothy G. Gregoire; Gregory J. Buhyoff
1999-01-01
Probability sampling methods applicable to estimate recreational use are presented. Both single- and multiple-access recreation sites are considered. One- and two-stage sampling methods are presented. Estimation of recreational use is presented in a series of examples.
Estimating software development project size, using probabilistic ...
African Journals Online (AJOL)
Estimating software development project size, using probabilistic techniques. ... of managing the size of software development projects by Purchasers (Clients) and Vendors (Development ... EMAIL FREE FULL TEXT EMAIL FREE FULL TEXT
Mclean, Elizabeth L; Forrester, Graham E
2018-04-01
We tested whether fishers' local ecological knowledge (LEK) of two fish life-history parameters, size at maturity (SAM) at maximum body size (MS), was comparable to scientific estimates (SEK) of the same parameters, and whether LEK influenced fishers' perceptions of sustainability. Local ecological knowledge was documented for 82 fishers from a small-scale fishery in Samaná Bay, Dominican Republic, whereas SEK was compiled from the scientific literature. Size at maturity estimates derived from LEK and SEK overlapped for most of the 15 commonly harvested species (10 of 15). In contrast, fishers' maximum size estimates were usually lower than (eight species), or overlapped with (five species) scientific estimates. Fishers' size-based estimates of catch composition indicate greater potential for overfishing than estimates based on SEK. Fishers' estimates of size at capture relative to size at maturity suggest routine inclusion of juveniles in the catch (9 of 15 species), and fishers' estimates suggest that harvested fish are substantially smaller than maximum body size for most species (11 of 15 species). Scientific estimates also suggest that harvested fish are generally smaller than maximum body size (13 of 15), but suggest that the catch is dominated by adults for most species (9 of 15 species), and that juveniles are present in the catch for fewer species (6 of 15). Most Samaná fishers characterized the current state of their fishery as poor (73%) and as having changed for the worse over the past 20 yr (60%). Fishers stated that concern about overfishing, catching small fish, and catching immature fish contributed to these perceptions, indicating a possible influence of catch-size composition on their perceptions. Future work should test this link more explicitly because we found no evidence that the minority of fishers with more positive perceptions of their fishery reported systematically different estimates of catch-size composition than those with the more
Low-sampling-rate ultra-wideband channel estimation using equivalent-time sampling
Ballal, Tarig
2014-09-01
In this paper, a low-sampling-rate scheme for ultra-wideband channel estimation is proposed. The scheme exploits multiple observations generated by transmitting multiple pulses. In the proposed scheme, P pulses are transmitted to produce channel impulse response estimates at a desired sampling rate, while the ADC samples at a rate that is P times slower. To avoid loss of fidelity, the number of sampling periods (based on the desired rate) in the inter-pulse interval is restricted to be co-prime with P. This condition is affected when clock drift is present and the transmitted pulse locations change. To handle this case, and to achieve an overall good channel estimation performance, without using prior information, we derive an improved estimator based on the bounded data uncertainty (BDU) model. It is shown that this estimator is related to the Bayesian linear minimum mean squared error (LMMSE) estimator. Channel estimation performance of the proposed sub-sampling scheme combined with the new estimator is assessed in simulation. The results show that high reduction in sampling rate can be achieved. The proposed estimator outperforms the least squares estimator in almost all cases, while in the high SNR regime it also outperforms the LMMSE estimator. In addition to channel estimation, a synchronization method is also proposed that utilizes the same pulse sequence used for channel estimation. © 2014 IEEE.
Sample size determination for mediation analysis of longitudinal data.
Pan, Haitao; Liu, Suyu; Miao, Danmin; Yuan, Ying
2018-03-27
Sample size planning for longitudinal data is crucial when designing mediation studies because sufficient statistical power is not only required in grant applications and peer-reviewed publications, but is essential to reliable research results. However, sample size determination is not straightforward for mediation analysis of longitudinal design. To facilitate planning the sample size for longitudinal mediation studies with a multilevel mediation model, this article provides the sample size required to achieve 80% power by simulations under various sizes of the mediation effect, within-subject correlations and numbers of repeated measures. The sample size calculation is based on three commonly used mediation tests: Sobel's method, distribution of product method and the bootstrap method. Among the three methods of testing the mediation effects, Sobel's method required the largest sample size to achieve 80% power. Bootstrapping and the distribution of the product method performed similarly and were more powerful than Sobel's method, as reflected by the relatively smaller sample sizes. For all three methods, the sample size required to achieve 80% power depended on the value of the ICC (i.e., within-subject correlation). A larger value of ICC typically required a larger sample size to achieve 80% power. Simulation results also illustrated the advantage of the longitudinal study design. The sample size tables for most encountered scenarios in practice have also been published for convenient use. Extensive simulations study showed that the distribution of the product method and bootstrapping method have superior performance to the Sobel's method, but the product method was recommended to use in practice in terms of less computation time load compared to the bootstrapping method. A R package has been developed for the product method of sample size determination in mediation longitudinal study design.
Sample size of the reference sample in a case-augmented study.
Ghosh, Palash; Dewanji, Anup
2017-05-01
The case-augmented study, in which a case sample is augmented with a reference (random) sample from the source population with only covariates information known, is becoming popular in different areas of applied science such as pharmacovigilance, ecology, and econometrics. In general, the case sample is available from some source (for example, hospital database, case registry, etc.); however, the reference sample is required to be drawn from the corresponding source population. The required minimum size of the reference sample is an important issue in this regard. In this work, we address the minimum sample size calculation and discuss related issues. Copyright © 2017 John Wiley & Sons, Ltd. Copyright © 2017 John Wiley & Sons, Ltd.
40 CFR 80.127 - Sample size guidelines.
2010-07-01
... 40 Protection of Environment 16 2010-07-01 2010-07-01 false Sample size guidelines. 80.127 Section 80.127 Protection of Environment ENVIRONMENTAL PROTECTION AGENCY (CONTINUED) AIR PROGRAMS (CONTINUED) REGULATION OF FUELS AND FUEL ADDITIVES Attest Engagements § 80.127 Sample size guidelines. In performing the...
Galbraith, Niall D; Manktelow, Ken I; Morris, Neil G
2010-11-01
Previous studies demonstrate that people high in delusional ideation exhibit a data-gathering bias on inductive reasoning tasks. The current study set out to investigate the factors that may underpin such a bias by examining healthy individuals, classified as either high or low scorers on the Peters et al. Delusions Inventory (PDI). More specifically, whether high PDI scorers have a relatively poor appreciation of sample size and heterogeneity when making statistical judgments. In Expt 1, high PDI scorers made higher probability estimates when generalizing from a sample of 1 with regard to the heterogeneous human property of obesity. In Expt 2, this effect was replicated and was also observed in relation to the heterogeneous property of aggression. The findings suggest that delusion-prone individuals are less appreciative of the importance of sample size when making statistical judgments about heterogeneous properties; this may underpin the data gathering bias observed in previous studies. There was some support for the hypothesis that threatening material would exacerbate high PDI scorers' indifference to sample size.
Determination of the optimal sample size for a clinical trial accounting for the population size.
Stallard, Nigel; Miller, Frank; Day, Simon; Hee, Siew Wan; Madan, Jason; Zohar, Sarah; Posch, Martin
2017-07-01
The problem of choosing a sample size for a clinical trial is a very common one. In some settings, such as rare diseases or other small populations, the large sample sizes usually associated with the standard frequentist approach may be infeasible, suggesting that the sample size chosen should reflect the size of the population under consideration. Incorporation of the population size is possible in a decision-theoretic approach either explicitly by assuming that the population size is fixed and known, or implicitly through geometric discounting of the gain from future patients reflecting the expected population size. This paper develops such approaches. Building on previous work, an asymptotic expression is derived for the sample size for single and two-arm clinical trials in the general case of a clinical trial with a primary endpoint with a distribution of one parameter exponential family form that optimizes a utility function that quantifies the cost and gain per patient as a continuous function of this parameter. It is shown that as the size of the population, N, or expected size, N∗ in the case of geometric discounting, becomes large, the optimal trial size is O(N1/2) or O(N∗1/2). The sample size obtained from the asymptotic expression is also compared with the exact optimal sample size in examples with responses with Bernoulli and Poisson distributions, showing that the asymptotic approximations can also be reasonable in relatively small sample sizes. © 2016 The Author. Biometrical Journal published by WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Schillaci, Michael A; Schillaci, Mario E
2009-02-01
The use of small sample sizes in human and primate evolutionary research is commonplace. Estimating how well small samples represent the underlying population, however, is not commonplace. Because the accuracy of determinations of taxonomy, phylogeny, and evolutionary process are dependant upon how well the study sample represents the population of interest, characterizing the uncertainty, or potential error, associated with analyses of small sample sizes is essential. We present a method for estimating the probability that the sample mean is within a desired fraction of the standard deviation of the true mean using small (nresearchers to determine post hoc the probability that their sample is a meaningful approximation of the population parameter. We tested the method using a large craniometric data set commonly used by researchers in the field. Given our results, we suggest that sample estimates of the population mean can be reasonable and meaningful even when based on small, and perhaps even very small, sample sizes.
Kühberger, Anton; Fritz, Astrid; Scherndl, Thomas
2014-01-01
Background The p value obtained from a significance test provides no information about the magnitude or importance of the underlying phenomenon. Therefore, additional reporting of effect size is often recommended. Effect sizes are theoretically independent from sample size. Yet this may not hold true empirically: non-independence could indicate publication bias. Methods We investigate whether effect size is independent from sample size in psychological research. We randomly sampled 1,000 psychological articles from all areas of psychological research. We extracted p values, effect sizes, and sample sizes of all empirical papers, and calculated the correlation between effect size and sample size, and investigated the distribution of p values. Results We found a negative correlation of r = −.45 [95% CI: −.53; −.35] between effect size and sample size. In addition, we found an inordinately high number of p values just passing the boundary of significance. Additional data showed that neither implicit nor explicit power analysis could account for this pattern of findings. Conclusion The negative correlation between effect size and samples size, and the biased distribution of p values indicate pervasive publication bias in the entire field of psychology. PMID:25192357
Population estimates of extended family structure and size.
Garceau, Anne; Wideroff, Louise; McNeel, Timothy; Dunn, Marsha; Graubard, Barry I
2008-01-01
Population-based estimates of biological family size can be useful for planning genetic studies, assessing how distributions of relatives affect disease associations with family history and estimating prevalence of potential family support. Mean family size per person is estimated from a population-based telephone survey (n = 1,019). After multivariate adjustment for demographic variables, older and non-White respondents reported greater mean numbers of total, first- and second-degree relatives. Females reported more total and first-degree relatives, while less educated respondents reported more second-degree relatives. Demographic differences in family size have implications for genetic research. Therefore, periodic collection of family structure data in representative populations would be useful. Copyright 2008 S. Karger AG, Basel.
Power and Sample Size Calculations for Logistic Regression Tests for Differential Item Functioning
Li, Zhushan
2014-01-01
Logistic regression is a popular method for detecting uniform and nonuniform differential item functioning (DIF) effects. Theoretical formulas for the power and sample size calculations are derived for likelihood ratio tests and Wald tests based on the asymptotic distribution of the maximum likelihood estimators for the logistic regression model.…
Schrago, Carlos G
2014-08-01
Reliable estimates of ancestral effective population sizes are necessary to unveil the population-level phenomena that shaped the phylogeny and molecular evolution of the African great apes. Although several methods have previously been applied to infer ancestral effective population sizes, an analysis of the influence of the selective regime on the estimates of ancestral demography has not been thoroughly conducted. In this study, three independent data sets under different selective regimes were used were composed to tackle this issue. The results showed that selection had a significant impact on the estimates of ancestral effective population sizes of the African great apes. The inference of the ancestral demography of African great apes was affected by the selection regime. The effects, however, were not homogeneous along the ancestral populations of great apes. The effective population size of the ancestor of humans and chimpanzees was more impacted by the selection regime when compared to the same parameter in the ancestor of humans, chimpanzees and gorillas. Because the selection regime influenced the estimates of ancestral effective population size, it is reasonable to assume that a portion of the discrepancy found in previous studies that inferred the ancestral effective population size may be attributable to the differential action of selection on the genes sampled.
Sample size determination and power
Ryan, Thomas P, Jr
2013-01-01
THOMAS P. RYAN, PhD, teaches online advanced statistics courses for Northwestern University and The Institute for Statistics Education in sample size determination, design of experiments, engineering statistics, and regression analysis.
Association between inaccurate estimation of body size and obesity in schoolchildren
Directory of Open Access Journals (Sweden)
Larissa da Cunha Feio Costa
2015-12-01
Full Text Available Objectives: To investigate the prevalence of inaccurate estimation of own body size among Brazilian schoolchildren of both sexes aged 7-10 years, and to test whether overweight/obesity; excess body fat and central obesity are associated with inaccuracy. Methods: Accuracy of body size estimation was assessed using the Figure Rating Scale for Brazilian Children. Multinomial logistic regression was used to analyze associations. Results: The overall prevalence of inaccurate body size estimation was 76%, with 34% of the children underestimating their body size and 42% overestimating their body size. Obesity measured by body mass index was associated with underestimation of body size in both sexes, while central obesity was only associated with overestimation of body size among girls. Conclusions: The results of this study suggest there is a high prevalence of inaccurate body size estimation and that inaccurate estimation is associated with obesity. Accurate estimation of own body size is important among obese schoolchildren because it may be the first step towards adopting healthy lifestyle behaviors.
Sampling of systematic errors to estimate likelihood weights in nuclear data uncertainty propagation
International Nuclear Information System (INIS)
Helgesson, P.; Sjöstrand, H.; Koning, A.J.; Rydén, J.; Rochman, D.; Alhassan, E.; Pomp, S.
2016-01-01
In methodologies for nuclear data (ND) uncertainty assessment and propagation based on random sampling, likelihood weights can be used to infer experimental information into the distributions for the ND. As the included number of correlated experimental points grows large, the computational time for the matrix inversion involved in obtaining the likelihood can become a practical problem. There are also other problems related to the conventional computation of the likelihood, e.g., the assumption that all experimental uncertainties are Gaussian. In this study, a way to estimate the likelihood which avoids matrix inversion is investigated; instead, the experimental correlations are included by sampling of systematic errors. It is shown that the model underlying the sampling methodology (using univariate normal distributions for random and systematic errors) implies a multivariate Gaussian for the experimental points (i.e., the conventional model). It is also shown that the likelihood estimates obtained through sampling of systematic errors approach the likelihood obtained with matrix inversion as the sample size for the systematic errors grows large. In studied practical cases, it is seen that the estimates for the likelihood weights converge impractically slowly with the sample size, compared to matrix inversion. The computational time is estimated to be greater than for matrix inversion in cases with more experimental points, too. Hence, the sampling of systematic errors has little potential to compete with matrix inversion in cases where the latter is applicable. Nevertheless, the underlying model and the likelihood estimates can be easier to intuitively interpret than the conventional model and the likelihood function involving the inverted covariance matrix. Therefore, this work can both have pedagogical value and be used to help motivating the conventional assumption of a multivariate Gaussian for experimental data. The sampling of systematic errors could also
Sample size determination in clinical trials with multiple endpoints
Sozu, Takashi; Hamasaki, Toshimitsu; Evans, Scott R
2015-01-01
This book integrates recent methodological developments for calculating the sample size and power in trials with more than one endpoint considered as multiple primary or co-primary, offering an important reference work for statisticians working in this area. The determination of sample size and the evaluation of power are fundamental and critical elements in the design of clinical trials. If the sample size is too small, important effects may go unnoticed; if the sample size is too large, it represents a waste of resources and unethically puts more participants at risk than necessary. Recently many clinical trials have been designed with more than one endpoint considered as multiple primary or co-primary, creating a need for new approaches to the design and analysis of these clinical trials. The book focuses on the evaluation of power and sample size determination when comparing the effects of two interventions in superiority clinical trials with multiple endpoints. Methods for sample size calculation in clin...
Fan, Chunpeng; Zhang, Donghui
2012-01-01
Although the Kruskal-Wallis test has been widely used to analyze ordered categorical data, power and sample size methods for this test have been investigated to a much lesser extent when the underlying multinomial distributions are unknown. This article generalizes the power and sample size procedures proposed by Fan et al. ( 2011 ) for continuous data to ordered categorical data, when estimates from a pilot study are used in the place of knowledge of the true underlying distribution. Simulations show that the proposed power and sample size formulas perform well. A myelin oligodendrocyte glycoprotein (MOG) induced experimental autoimmunce encephalomyelitis (EAE) mouse study is used to demonstrate the application of the methods.
Overall, John E; Tonidandel, Scott; Starbuck, Robert R
2006-01-01
Recent contributions to the statistical literature have provided elegant model-based solutions to the problem of estimating sample sizes for testing the significance of differences in mean rates of change across repeated measures in controlled longitudinal studies with differentially correlated error and missing data due to dropouts. However, the mathematical complexity and model specificity of these solutions make them generally inaccessible to most applied researchers who actually design and undertake treatment evaluation research in psychiatry. In contrast, this article relies on a simple two-stage analysis in which dropout-weighted slope coefficients fitted to the available repeated measurements for each subject separately serve as the dependent variable for a familiar ANCOVA test of significance for differences in mean rates of change. This article is about how a sample of size that is estimated or calculated to provide desired power for testing that hypothesis without considering dropouts can be adjusted appropriately to take dropouts into account. Empirical results support the conclusion that, whatever reasonable level of power would be provided by a given sample size in the absence of dropouts, essentially the same power can be realized in the presence of dropouts simply by adding to the original dropout-free sample size the number of subjects who would be expected to drop from a sample of that original size under conditions of the proposed study.
Support vector regression to predict porosity and permeability: Effect of sample size
Al-Anazi, A. F.; Gates, I. D.
2012-02-01
Porosity and permeability are key petrophysical parameters obtained from laboratory core analysis. Cores, obtained from drilled wells, are often few in number for most oil and gas fields. Porosity and permeability correlations based on conventional techniques such as linear regression or neural networks trained with core and geophysical logs suffer poor generalization to wells with only geophysical logs. The generalization problem of correlation models often becomes pronounced when the training sample size is small. This is attributed to the underlying assumption that conventional techniques employing the empirical risk minimization (ERM) inductive principle converge asymptotically to the true risk values as the number of samples increases. In small sample size estimation problems, the available training samples must span the complexity of the parameter space so that the model is able both to match the available training samples reasonably well and to generalize to new data. This is achieved using the structural risk minimization (SRM) inductive principle by matching the capability of the model to the available training data. One method that uses SRM is support vector regression (SVR) network. In this research, the capability of SVR to predict porosity and permeability in a heterogeneous sandstone reservoir under the effect of small sample size is evaluated. Particularly, the impact of Vapnik's ɛ-insensitivity loss function and least-modulus loss function on generalization performance was empirically investigated. The results are compared to the multilayer perception (MLP) neural network, a widely used regression method, which operates under the ERM principle. The mean square error and correlation coefficients were used to measure the quality of predictions. The results demonstrate that SVR yields consistently better predictions of the porosity and permeability with small sample size than the MLP method. Also, the performance of SVR depends on both kernel function
Christensen, Jette; Stryhn, Henrik; Vallières, André; El Allaki, Farouk
2011-05-01
In 2008, Canada designed and implemented the Canadian Notifiable Avian Influenza Surveillance System (CanNAISS) with six surveillance activities in a phased-in approach. CanNAISS was a surveillance system because it had more than one surveillance activity or component in 2008: passive surveillance; pre-slaughter surveillance; and voluntary enhanced notifiable avian influenza surveillance. Our objectives were to give a short overview of two active surveillance components in CanNAISS; describe the CanNAISS scenario tree model and its application to estimation of probability of populations being free of NAI virus infection and sample size determination. Our data from the pre-slaughter surveillance component included diagnostic test results from 6296 serum samples representing 601 commercial chicken and turkey farms collected from 25 August 2008 to 29 January 2009. In addition, we included data from a sub-population of farms with high biosecurity standards: 36,164 samples from 55 farms sampled repeatedly over the 24 months study period from January 2007 to December 2008. All submissions were negative for Notifiable Avian Influenza (NAI) virus infection. We developed the CanNAISS scenario tree model, so that it will estimate the surveillance component sensitivity and the probability of a population being free of NAI at the 0.01 farm-level and 0.3 within-farm-level prevalences. We propose that a general model, such as the CanNAISS scenario tree model, may have a broader application than more detailed models that require disease specific input parameters, such as relative risk estimates. Crown Copyright © 2011. Published by Elsevier B.V. All rights reserved.
Results and evaluation of a survey to estimate Pacific walrus population size, 2006
Speckman, Suzann G.; Chernook, Vladimir I.; Burn, Douglas M.; Udevitz, Mark S.; Kochnev, Anatoly A.; Vasilev, Alexander; Jay, Chadwick V.; Lisovsky, Alexander; Fischbach, Anthony S.; Benter, R. Bradley
2011-01-01
In spring 2006, we conducted a collaborative U.S.-Russia survey to estimate abundance of the Pacific walrus (Odobenus rosmarus divergens). The Bering Sea was partitioned into survey blocks, and a systematic random sample of transects within a subset of the blocks was surveyed with airborne thermal scanners using standard strip-transect methodology. Counts of walruses in photographed groups were used to model the relation between thermal signatures and the number of walruses in groups, which was used to estimate the number of walruses in groups that were detected by the scanner but not photographed. We also modeled the probability of thermally detecting various-sized walrus groups to estimate the number of walruses in groups undetected by the scanner. We used data from radio-tagged walruses to adjust on-ice estimates to account for walruses in the water during the survey. The estimated area of available habitat averaged 668,000 km2 and the area of surveyed blocks was 318,204 km2. The number of Pacific walruses within the surveyed area was estimated at 129,000 with 95% confidence limits of 55,000 to 507,000 individuals. This value can be used by managers as a minimum estimate of the total population size.
Zhong, Wei; Koopmeiners, Joseph S; Carlin, Bradley P
2013-11-01
Frequentist sample size determination for binary outcome data in a two-arm clinical trial requires initial guesses of the event probabilities for the two treatments. Misspecification of these event rates may lead to a poor estimate of the necessary sample size. In contrast, the Bayesian approach that considers the treatment effect to be random variable having some distribution may offer a better, more flexible approach. The Bayesian sample size proposed by (Whitehead et al., 2008) for exploratory studies on efficacy justifies the acceptable minimum sample size by a "conclusiveness" condition. In this work, we introduce a new two-stage Bayesian design with sample size reestimation at the interim stage. Our design inherits the properties of good interpretation and easy implementation from Whitehead et al. (2008), generalizes their method to a two-sample setting, and uses a fully Bayesian predictive approach to reduce an overly large initial sample size when necessary. Moreover, our design can be extended to allow patient level covariates via logistic regression, now adjusting sample size within each subgroup based on interim analyses. We illustrate the benefits of our approach with a design in non-Hodgkin lymphoma with a simple binary covariate (patient gender), offering an initial step toward within-trial personalized medicine. Copyright © 2013 Elsevier Inc. All rights reserved.
Energy Technology Data Exchange (ETDEWEB)
Faye, C.B.; Amodeo, T.; Fréjafon, E. [Institut National de l' Environnement Industriel et des Risques (INERIS/DRC/CARA/NOVA), Parc Technologique Alata, BP 2, 60550 Verneuil-En-Halatte (France); Delepine-Gilon, N. [Institut des Sciences Analytiques, 5 rue de la Doua, 69100 Villeurbanne (France); Dutouquet, C., E-mail: christophe.dutouquet@ineris.fr [Institut National de l' Environnement Industriel et des Risques (INERIS/DRC/CARA/NOVA), Parc Technologique Alata, BP 2, 60550 Verneuil-En-Halatte (France)
2014-01-01
Pollution of water is a matter of concern all over the earth. Particles are known to play an important role in the transportation of pollutants in this medium. In addition, the emergence of new materials such as NOAA (Nano-Objects, their Aggregates and their Agglomerates) emphasizes the need to develop adapted instruments for their detection. Surveillance of pollutants in particulate form in waste waters in industries involved in nanoparticle manufacturing and processing is a telling example of possible applications of such instrumental development. The LIBS (laser-induced breakdown spectroscopy) technique coupled with the liquid jet as sampling mode for suspensions was deemed as a potential candidate for on-line and real time monitoring. With the final aim in view to obtain the best detection limits, the interaction of nanosecond laser pulses with the liquid jet was examined. The evolution of the volume sampled by laser pulses was estimated as a function of the laser energy applying conditional analysis when analyzing a suspension of micrometric-sized particles of borosilicate glass. An estimation of the sampled depth was made. Along with the estimation of the sampled volume, the evolution of the SNR (signal to noise ratio) as a function of the laser energy was investigated as well. Eventually, the laser energy and the corresponding fluence optimizing both the sampling volume and the SNR were determined. The obtained results highlight intrinsic limitations of the liquid jet sampling mode when using 532 nm nanosecond laser pulses with suspensions. - Highlights: • Micrometric-sized particles in suspensions are analyzed using LIBS and a liquid jet. • The evolution of the sampling volume is estimated as a function of laser energy. • The sampling volume happens to saturate beyond a certain laser fluence. • Its value was found much lower than the beam diameter times the jet thickness. • Particles proved not to be entirely vaporized.
International Nuclear Information System (INIS)
Yoo, Seung-Hoon; Lim, Hea-Jin; Kwak, Seung-Jun
2009-01-01
Over the last twenty years, the consumption of natural gas in Korea has increased dramatically. This increase has mainly resulted from the rise of consumption in the residential sector. The main objective of the study is to estimate households' demand function for natural gas by applying a sample selection model using data from a survey of households in Seoul. The results show that there exists a selection bias in the sample and that failure to correct for sample selection bias distorts the mean estimate, of the demand for natural gas, downward by 48.1%. In addition, according to the estimation results, the size of the house, the dummy variable for dwelling in an apartment, the dummy variable for having a bed in an inner room, and the household's income all have positive relationships with the demand for natural gas. On the other hand, the size of the family and the price of gas negatively contribute to the demand for natural gas. (author)
Estimates of laboratory accuracy and precision on Hanford waste tank samples
International Nuclear Information System (INIS)
Dodd, D.A.
1995-01-01
A review was performed on three sets of analyses generated in Battelle, Pacific Northwest Laboratories and three sets generated by Westinghouse Hanford Company, 222-S Analytical Laboratory. Laboratory accuracy and precision was estimated by analyte and is reported in tables. The sources used to generate this estimate is of limited size but does include the physical forms, liquid and solid, which are representative of samples from tanks to be characterized. This estimate was published as an aid to programs developing data quality objectives in which specified limits are established. Data resulting from routine analyses of waste matrices can be expected to be bounded by the precision and accuracy estimates of the tables. These tables do not preclude or discourage direct negotiations between program and laboratory personnel while establishing bounding conditions. Programmatic requirements different than those listed may be reliably met on specific measurements and matrices. It should be recognized, however, that these are specific to waste tank matrices and may not be indicative of performance on samples from other sources
How to Estimate and Interpret Various Effect Sizes
Vacha-Haase, Tammi; Thompson, Bruce
2004-01-01
The present article presents a tutorial on how to estimate and interpret various effect sizes. The 5th edition of the Publication Manual of the American Psychological Association (2001) described the failure to report effect sizes as a "defect" (p. 5), and 23 journals have published author guidelines requiring effect size reporting. Although…
Sample size determination for equivalence assessment with multiple endpoints.
Sun, Anna; Dong, Xiaoyu; Tsong, Yi
2014-01-01
Equivalence assessment between a reference and test treatment is often conducted by two one-sided tests (TOST). The corresponding power function and sample size determination can be derived from a joint distribution of the sample mean and sample variance. When an equivalence trial is designed with multiple endpoints, it often involves several sets of two one-sided tests. A naive approach for sample size determination in this case would select the largest sample size required for each endpoint. However, such a method ignores the correlation among endpoints. With the objective to reject all endpoints and when the endpoints are uncorrelated, the power function is the production of all power functions for individual endpoints. With correlated endpoints, the sample size and power should be adjusted for such a correlation. In this article, we propose the exact power function for the equivalence test with multiple endpoints adjusted for correlation under both crossover and parallel designs. We further discuss the differences in sample size for the naive method without and with correlation adjusted methods and illustrate with an in vivo bioequivalence crossover study with area under the curve (AUC) and maximum concentration (Cmax) as the two endpoints.
Preeminence and prerequisites of sample size calculations in clinical trials
Richa Singhal; Rakesh Rana
2015-01-01
The key components while planning a clinical study are the study design, study duration, and sample size. These features are an integral part of planning a clinical trial efficiently, ethically, and cost-effectively. This article describes some of the prerequisites for sample size calculation. It also explains that sample size calculation is different for different study designs. The article in detail describes the sample size calculation for a randomized controlled trial when the primary out...
Guo, Jiin-Huarng; Luh, Wei-Ming
2009-05-01
When planning a study, sample size determination is one of the most important tasks facing the researcher. The size will depend on the purpose of the study, the cost limitations, and the nature of the data. By specifying the standard deviation ratio and/or the sample size ratio, the present study considers the problem of heterogeneous variances and non-normality for Yuen's two-group test and develops sample size formulas to minimize the total cost or maximize the power of the test. For a given power, the sample size allocation ratio can be manipulated so that the proposed formulas can minimize the total cost, the total sample size, or the sum of total sample size and total cost. On the other hand, for a given total cost, the optimum sample size allocation ratio can maximize the statistical power of the test. After the sample size is determined, the present simulation applies Yuen's test to the sample generated, and then the procedure is validated in terms of Type I errors and power. Simulation results show that the proposed formulas can control Type I errors and achieve the desired power under the various conditions specified. Finally, the implications for determining sample sizes in experimental studies and future research are discussed.
Optimal sample size for probability of detection curves
International Nuclear Information System (INIS)
Annis, Charles; Gandossi, Luca; Martin, Oliver
2013-01-01
Highlights: • We investigate sample size requirement to develop probability of detection curves. • We develop simulations to determine effective inspection target sizes, number and distribution. • We summarize these findings and provide guidelines for the NDE practitioner. -- Abstract: The use of probability of detection curves to quantify the reliability of non-destructive examination (NDE) systems is common in the aeronautical industry, but relatively less so in the nuclear industry, at least in European countries. Due to the nature of the components being inspected, sample sizes tend to be much lower. This makes the manufacturing of test pieces with representative flaws, in sufficient numbers, so to draw statistical conclusions on the reliability of the NDT system under investigation, quite costly. The European Network for Inspection and Qualification (ENIQ) has developed an inspection qualification methodology, referred to as the ENIQ Methodology. It has become widely used in many European countries and provides assurance on the reliability of NDE systems, but only qualitatively. The need to quantify the output of inspection qualification has become more important as structural reliability modelling and quantitative risk-informed in-service inspection methodologies become more widely used. A measure of the NDE reliability is necessary to quantify risk reduction after inspection and probability of detection (POD) curves provide such a metric. The Joint Research Centre, Petten, The Netherlands supported ENIQ by investigating the question of the sample size required to determine a reliable POD curve. As mentioned earlier manufacturing of test pieces with defects that are typically found in nuclear power plants (NPPs) is usually quite expensive. Thus there is a tendency to reduce sample sizes, which in turn increases the uncertainty associated with the resulting POD curve. The main question in conjunction with POS curves is the appropriate sample size. Not
Food photographs in portion size estimation among adolescent Mozambican girls.
Korkalo, Liisa; Erkkola, Maijaliisa; Fidalgo, Lourdes; Nevalainen, Jaakko; Mutanen, Marja
2013-09-01
To assess the validity of food photographs in portion size estimation among adolescent girls in Mozambique. The study was carried out in preparation for the larger ZANE study, which used the 24 h dietary recall method. Life-sized photographs of three portion sizes of two staple foods and three sauces were produced. Participants ate weighed portions of one staple food and one sauce. After the meal, they were asked to estimate the amount of food with the aid of the food photographs. Zambezia Province, Mozambique. Ninety-nine girls aged 13–18 years. The mean differences between estimated and actual portion sizes relative to the actual portion size ranged from 219% to 8% for different foods. The respective mean difference for all foods combined was 25% (95% CI 212, 2 %). Especially larger portions of the staple foods were often underestimated. For the staple foods, between 62% and 64% of the participants were classified into the same thirds of the distribution of estimated and actual food consumption and for sauces, the percentages ranged from 38% to 63%. Bland–Altman plots showed wide limits of agreement. Using life-sized food photographs among adolescent Mozambican girls resulted in a rather large variation in the accuracy of individuals’ estimates. The ability to rank individuals according to their consumption was, however, satisfactory for most foods. There seems to be a need to further develop and test food photographs used in different populations in Sub-Saharan Africa to improve the accuracy of portion size estimates.
Estimation of individual reference intervals in small sample sizes
DEFF Research Database (Denmark)
Hansen, Ase Marie; Garde, Anne Helene; Eller, Nanna Hurwitz
2007-01-01
In occupational health studies, the study groups most often comprise healthy subjects performing their work. Sampling is often planned in the most practical way, e.g., sampling of blood in the morning at the work site just after the work starts. Optimal use of reference intervals requires...... from various variables such as gender, age, BMI, alcohol, smoking, and menopause. The reference intervals were compared to reference intervals calculated using IFCC recommendations. Where comparable, the IFCC calculated reference intervals had a wider range compared to the variance component models...
Vereecken, Carine; Dohogne, Sophie; Covents, Marc; Maes, Lea
2010-01-01
Computer-administered questionnaires have received increased attention for large-scale population research on nutrition. In Belgium-Flanders, Young Adolescents' Nutrition Assessment on Computer (YANA-C) has been developed. In this tool, standardised photographs are available to assist in portion-size estimation. The purpose of the present study is to assess how accurate adolescents are in estimating portion sizes of food using YANA-C. A convenience sample, aged 11-17 years, estimated the amou...
Habermehl, Christina; Benner, Axel; Kopp-Schneider, Annette
2018-03-01
In recent years, numerous approaches for biomarker-based clinical trials have been developed. One of these developments are multiple-biomarker trials, which aim to investigate multiple biomarkers simultaneously in independent subtrials. For low-prevalence biomarkers, small sample sizes within the subtrials have to be expected, as well as many biomarker-negative patients at the screening stage. The small sample sizes may make it unfeasible to analyze the subtrials individually. This imposes the need to develop new approaches for the analysis of such trials. With an expected large group of biomarker-negative patients, it seems reasonable to explore options to benefit from including them in such trials. We consider advantages and disadvantages of the inclusion of biomarker-negative patients in a multiple-biomarker trial with a survival endpoint. We discuss design options that include biomarker-negative patients in the study and address the issue of small sample size bias in such trials. We carry out a simulation study for a design where biomarker-negative patients are kept in the study and are treated with standard of care. We compare three different analysis approaches based on the Cox model to examine if the inclusion of biomarker-negative patients can provide a benefit with respect to bias and variance of the treatment effect estimates. We apply the Firth correction to reduce the small sample size bias. The results of the simulation study suggest that for small sample situations, the Firth correction should be applied to adjust for the small sample size bias. Additional to the Firth penalty, the inclusion of biomarker-negative patients in the analysis can lead to further but small improvements in bias and standard deviation of the estimates. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Engemann, Kristine; Enquist, Brian J; Sandel, Brody; Boyle, Brad; Jørgensen, Peter M; Morueta-Holme, Naia; Peet, Robert K; Violle, Cyrille; Svenning, Jens-Christian
2015-02-01
Macro-scale species richness studies often use museum specimens as their main source of information. However, such datasets are often strongly biased due to variation in sampling effort in space and time. These biases may strongly affect diversity estimates and may, thereby, obstruct solid inference on the underlying diversity drivers, as well as mislead conservation prioritization. In recent years, this has resulted in an increased focus on developing methods to correct for sampling bias. In this study, we use sample-size-correcting methods to examine patterns of tropical plant diversity in Ecuador, one of the most species-rich and climatically heterogeneous biodiversity hotspots. Species richness estimates were calculated based on 205,735 georeferenced specimens of 15,788 species using the Margalef diversity index, the Chao estimator, the second-order Jackknife and Bootstrapping resampling methods, and Hill numbers and rarefaction. Species richness was heavily correlated with sampling effort, and only rarefaction was able to remove this effect, and we recommend this method for estimation of species richness with "big data" collections.
Preeminence and prerequisites of sample size calculations in clinical trials
Directory of Open Access Journals (Sweden)
Richa Singhal
2015-01-01
Full Text Available The key components while planning a clinical study are the study design, study duration, and sample size. These features are an integral part of planning a clinical trial efficiently, ethically, and cost-effectively. This article describes some of the prerequisites for sample size calculation. It also explains that sample size calculation is different for different study designs. The article in detail describes the sample size calculation for a randomized controlled trial when the primary outcome is a continuous variable and when it is a proportion or a qualitative variable.
Giorli, Giacomo; Drazen, Jeffrey C.; Neuheimer, Anna B.; Copeland, Adrienne; Au, Whitlow W. L.
2018-01-01
Pelagic animals that form deep sea scattering layers (DSLs) represent an important link in the food web between zooplankton and top predators. While estimating the composition, density and location of the DSL is important to understand mesopelagic ecosystem dynamics and to predict top predators' distribution, DSL composition and density are often estimated from trawls which may be biased in terms of extrusion, avoidance, and gear-associated biases. Instead, location and biomass of DSLs can be estimated from active acoustic techniques, though estimates are often in aggregate without regard to size or taxon specific information. For the first time in the open ocean, we used a DIDSON sonar to characterize the fauna in DSLs. Estimates of the numerical density and length of animals at different depths and locations along the Kona coast of the Island of Hawaii were determined. Data were collected below and inside the DSLs with the sonar mounted on a profiler. A total of 7068 animals were counted and sized. We estimated numerical densities ranging from 1 to 7 animals/m3 and individuals as long as 3 m were detected. These numerical densities were orders of magnitude higher than those estimated from trawls and average sizes of animals were much larger as well. A mixed model was used to characterize numerical density and length of animals as a function of deep sea layer sampled, location, time of day, and day of the year. Numerical density and length of animals varied by month, with numerical density also a function of depth. The DIDSON proved to be a good tool for open-ocean/deep-sea estimation of the numerical density and size of marine animals, especially larger ones. Further work is needed to understand how this methodology relates to estimates of volume backscatters obtained with standard echosounding techniques, density measures obtained with other sampling methodologies, and to precisely evaluate sampling biases.
Impaired hand size estimation in CRPS.
Peltz, Elena; Seifert, Frank; Lanz, Stefan; Müller, Rüdiger; Maihöfner, Christian
2011-10-01
A triad of clinical symptoms, ie, autonomic, motor and sensory dysfunctions, characterizes complex regional pain syndromes (CRPS). Sensory dysfunction comprises sensory loss or spontaneous and stimulus-evoked pain. Furthermore, a disturbance in the body schema may occur. In the present study, patients with CRPS of the upper extremity and healthy controls estimated their hand sizes on the basis of expanded or compressed schematic drawings of hands. In patients with CRPS we found an impairment in accurate hand size estimation; patients estimated their own CRPS-affected hand to be larger than it actually was when measured objectively. Moreover, overestimation correlated significantly with disease duration, neglect score, and increase of two-point-discrimination-thresholds (TPDT) compared to the unaffected hand and to control subjects' estimations. In line with previous functional imaging studies in CRPS patients demonstrating changes in central somatotopic maps, we suggest an involvement of the central nervous system in this disruption of the body schema. Potential cortical areas may be the primary somatosensory and posterior parietal cortices, which have been proposed to play a critical role in integrating visuospatial information. CRPS patients perceive their affected hand to be bigger than it is. The magnitude of this overestimation correlates with disease duration, decreased tactile thresholds, and neglect-score. Suggesting a disrupted body schema as the source of this impairment, our findings corroborate the current assumption of a CNS involvement in CRPS. Copyright © 2011 American Pain Society. Published by Elsevier Inc. All rights reserved.
Test of a sample container for shipment of small size plutonium samples with PAT-2
International Nuclear Information System (INIS)
Kuhn, E.; Aigner, H.; Deron, S.
1981-11-01
A light-weight container for the air transport of plutonium, to be designated PAT-2, has been developed in the USA and is presently undergoing licensing. The very limited effective space for bearing plutonium required the design of small size sample canisters to meet the needs of international safeguards for the shipment of plutonium samples. The applicability of a small canister for the sampling of small size powder and solution samples has been tested in an intralaboratory experiment. The results of the experiment, based on the concept of pre-weighed samples, show that the tested canister can successfully be used for the sampling of small size PuO 2 -powder samples of homogeneous source material, as well as for dried aliquands of plutonium nitrate solutions. (author)
Directory of Open Access Journals (Sweden)
R. Eric Heidel
2016-01-01
Full Text Available Statistical power is the ability to detect a significant effect, given that the effect actually exists in a population. Like most statistical concepts, statistical power tends to induce cognitive dissonance in hepatology researchers. However, planning for statistical power by an a priori sample size calculation is of paramount importance when designing a research study. There are five specific empirical components that make up an a priori sample size calculation: the scale of measurement of the outcome, the research design, the magnitude of the effect size, the variance of the effect size, and the sample size. A framework grounded in the phenomenon of isomorphism, or interdependencies amongst different constructs with similar forms, will be presented to understand the isomorphic effects of decisions made on each of the five aforementioned components of statistical power.
Body size estimation of self and others in females varying in BMI.
Directory of Open Access Journals (Sweden)
Anne Thaler
Full Text Available Previous literature suggests that a disturbed ability to accurately identify own body size may contribute to overweight. Here, we investigated the influence of personal body size, indexed by body mass index (BMI, on body size estimation in a non-clinical population of females varying in BMI. We attempted to disentangle general biases in body size estimates and attitudinal influences by manipulating whether participants believed the body stimuli (personalized avatars with realistic weight variations represented their own body or that of another person. Our results show that the accuracy of own body size estimation is predicted by personal BMI, such that participants with lower BMI underestimated their body size and participants with higher BMI overestimated their body size. Further, participants with higher BMI were less likely to notice the same percentage of weight gain than participants with lower BMI. Importantly, these results were only apparent when participants were judging a virtual body that was their own identity (Experiment 1, but not when they estimated the size of a body with another identity and the same underlying body shape (Experiment 2a. The different influences of BMI on accuracy of body size estimation and sensitivity to weight change for self and other identity suggests that effects of BMI on visual body size estimation are self-specific and not generalizable to other bodies.
Body size estimation of self and others in females varying in BMI.
Thaler, Anne; Geuss, Michael N; Mölbert, Simone C; Giel, Katrin E; Streuber, Stephan; Romero, Javier; Black, Michael J; Mohler, Betty J
2018-01-01
Previous literature suggests that a disturbed ability to accurately identify own body size may contribute to overweight. Here, we investigated the influence of personal body size, indexed by body mass index (BMI), on body size estimation in a non-clinical population of females varying in BMI. We attempted to disentangle general biases in body size estimates and attitudinal influences by manipulating whether participants believed the body stimuli (personalized avatars with realistic weight variations) represented their own body or that of another person. Our results show that the accuracy of own body size estimation is predicted by personal BMI, such that participants with lower BMI underestimated their body size and participants with higher BMI overestimated their body size. Further, participants with higher BMI were less likely to notice the same percentage of weight gain than participants with lower BMI. Importantly, these results were only apparent when participants were judging a virtual body that was their own identity (Experiment 1), but not when they estimated the size of a body with another identity and the same underlying body shape (Experiment 2a). The different influences of BMI on accuracy of body size estimation and sensitivity to weight change for self and other identity suggests that effects of BMI on visual body size estimation are self-specific and not generalizable to other bodies.
International Nuclear Information System (INIS)
Walsh, Conor; Bows, Alice
2012-01-01
Highlights: ► Ship emission baselines can be used to inform studies but require prior knowledge. ► Region specific conditions alter average shipping emission factors. ► Region specific conditions are clearer when individual callings are examined. ► Relationship between ship size and emissions frustrates estimating mean emissions. -- Abstract: The decarbonisation agenda is placing increasing pressure on retailers to directly and indirectly influence greenhouse gas emissions associated with full supply chains. Transportation by sea is an important and significant element of these supply chains, yet the emissions associated with shipping, particularly international shipping, are often poorly accounted for. The magnitude of emissions embodied in a product is directly related to the distances involved in globalised product chains, where shipping can represent the most emission intensive stage per tonne of goods transported. Specifically, limited choice of ship type and size within assessment tools negates a fair estimate of product chain emissions. To address this, the correlation between ship emissions and size is quantified for a sample of United Kingdom (UK) port callings to estimate typical UK emission factors by ship type and size and to determine how well existing global data and available databases represent UK shipping activity. The results highlight that although ship type is a crucial determinant of emissions, vessel size is also important, particularly for smaller ships where the variance in emission factors is greatest. Existing, globally averaged data correlating ship size with emissions agree well with the UK data. However, the relatively higher proportion of smaller ships satisfying a UK demand for short sea shipping results in a skew towards higher typical emission factors, principally within the general cargo, product and chemical tanker categories. This bias is most visible when emissions per individual ship calling are estimated. Incorporating
Directory of Open Access Journals (Sweden)
Daud Jones Kachamba
2017-06-01
Full Text Available Applications of unmanned aircraft systems (UASs to assist in forest inventories have provided promising results in biomass estimation for different forest types. Recent studies demonstrating use of different types of remotely sensed data to assist in biomass estimation have shown that accuracy and precision of estimates are influenced by the size of field sample plots used to obtain reference values for biomass. The objective of this case study was to assess the influence of sample plot size on efficiency of UAS-assisted biomass estimates in the dry tropical miombo woodlands of Malawi. The results of a design-based field sample inventory assisted by three-dimensional point clouds obtained from aerial imagery acquired with a UAS showed that the root mean square errors as well as the standard error estimates of mean biomass decreased as sample plot sizes increased. Furthermore, relative efficiency values over different sample plot sizes were above 1.0 in a design-based and model-assisted inferential framework, indicating that UAS-assisted inventories were more efficient than purely field-based inventories. The results on relative costs for UAS-assisted and pure field-based sample plot inventories revealed that there is a trade-off between inventory costs and required precision. For example, in our study if a standard error of less than approximately 3 Mg ha−1 was targeted, then a UAS-assisted forest inventory should be applied to ensure more cost effective and precise estimates. Future studies should therefore focus on finding optimum plot sizes for particular applications, like for example in projects under the Reducing Emissions from Deforestation and Forest Degradation, plus forest conservation, sustainable management of forest and enhancement of carbon stocks (REDD+ mechanism with different geographical scales.
Energy Technology Data Exchange (ETDEWEB)
Viskari, T.
2012-07-01
Atmospheric aerosol particles have several important effects on the environment and human society. The exact impact of aerosol particles is largely determined by their particle size distributions. However, no single instrument is able to measure the whole range of the particle size distribution. Estimating a particle size distribution from multiple simultaneous measurements remains a challenge in aerosol physical research. Current methods to combine different measurements require assumptions concerning the overlapping measurement ranges and have difficulties in accounting for measurement uncertainties. In this thesis, Extended Kalman Filter (EKF) is presented as a promising method to estimate particle number size distributions from multiple simultaneous measurements. The particle number size distribution estimated by EKF includes information from prior particle number size distributions as propagated by a dynamical model and is based on the reliabilities of the applied information sources. Known physical processes and dynamically evolving error covariances constrain the estimate both over time and particle size. The method was tested with measurements from Differential Mobility Particle Sizer (DMPS), Aerodynamic Particle Sizer (APS) and nephelometer. The particle number concentration was chosen as the state of interest. The initial EKF implementation presented here includes simplifications, yet the results are positive and the estimate successfully incorporated information from the chosen instruments. For particle sizes smaller than 4 micrometers, the estimate fits the available measurements and smooths the particle number size distribution over both time and particle diameter. The estimate has difficulties with particles larger than 4 micrometers due to issues with both measurements and the dynamical model in that particle size range. The EKF implementation appears to reduce the impact of measurement noise on the estimate, but has a delayed reaction to sudden
Improving accuracy of portion-size estimations through a stimulus equivalence paradigm.
Hausman, Nicole L; Borrero, John C; Fisher, Alyssa; Kahng, SungWoo
2014-01-01
The prevalence of obesity continues to increase in the United States (Gordon-Larsen, The, & Adair, 2010). Obesity can be attributed, in part, to overconsumption of energy-dense foods. Given that overeating plays a role in the development of obesity, interventions that teach individuals to identify and consume appropriate portion sizes are warranted. Specifically, interventions that teach individuals to estimate portion sizes correctly without the use of aids may be critical to the success of nutrition education programs. The current study evaluated the use of a stimulus equivalence paradigm to teach 9 undergraduate students to estimate portion size accurately. Results suggested that the stimulus equivalence paradigm was effective in teaching participants to make accurate portion size estimations without aids, and improved accuracy was observed in maintenance sessions that were conducted 1 week after training. Furthermore, 5 of 7 participants estimated the target portion size of novel foods during extension sessions. These data extend existing research on teaching accurate portion-size estimations and may be applicable to populations who seek treatment (e.g., overweight or obese children and adults) to teach healthier eating habits. © Society for the Experimental Analysis of Behavior.
International Nuclear Information System (INIS)
Sampson, T.E.
1991-01-01
Recent advances in segmented gamma scanning have emphasized software corrections for gamma-ray self-adsorption in particulates or lumps of special nuclear material in the sample. another feature of this software is an attenuation correction factor formalism that explicitly accounts for differences in sample container size and composition between the calibration standards and the individual items being measured. Software without this container-size correction produces biases when the unknowns are not packaged in the same containers as the calibration standards. This new software allows the use of different size and composition containers for standards and unknowns, as enormous savings considering the expense of multiple calibration standard sets otherwise needed. This paper presents calculations of the bias resulting from not using this new formalism. These calculations may be used to estimate bias corrections for segmented gamma scanners that do not incorporate these advanced concepts
CT dose survey in adults: what sample size for what precision?
International Nuclear Information System (INIS)
Taylor, Stephen; Muylem, Alain van; Howarth, Nigel; Gevenois, Pierre Alain; Tack, Denis
2017-01-01
To determine variability of volume computed tomographic dose index (CTDIvol) and dose-length product (DLP) data, and propose a minimum sample size to achieve an expected precision. CTDIvol and DLP values of 19,875 consecutive CT acquisitions of abdomen (7268), thorax (3805), lumbar spine (3161), cervical spine (1515) and head (4106) were collected in two centers. Their variabilities were investigated according to sample size (10 to 1000 acquisitions) and patient body weight categories (no weight selection, 67-73 kg and 60-80 kg). The 95 % confidence interval in percentage of their median (CI95/med) value was calculated for increasing sample sizes. We deduced the sample size that set a 95 % CI lower than 10 % of the median (CI95/med ≤ 10 %). Sample size ensuring CI95/med ≤ 10 %, ranged from 15 to 900 depending on the body region and the dose descriptor considered. In sample sizes recommended by regulatory authorities (i.e., from 10-20 patients), mean CTDIvol and DLP of one sample ranged from 0.50 to 2.00 times its actual value extracted from 2000 samples. The sampling error in CTDIvol and DLP means is high in dose surveys based on small samples of patients. Sample size should be increased at least tenfold to decrease this variability. (orig.)
CT dose survey in adults: what sample size for what precision?
Energy Technology Data Exchange (ETDEWEB)
Taylor, Stephen [Hopital Ambroise Pare, Department of Radiology, Mons (Belgium); Muylem, Alain van [Hopital Erasme, Department of Pneumology, Brussels (Belgium); Howarth, Nigel [Clinique des Grangettes, Department of Radiology, Chene-Bougeries (Switzerland); Gevenois, Pierre Alain [Hopital Erasme, Department of Radiology, Brussels (Belgium); Tack, Denis [EpiCURA, Clinique Louis Caty, Department of Radiology, Baudour (Belgium)
2017-01-15
To determine variability of volume computed tomographic dose index (CTDIvol) and dose-length product (DLP) data, and propose a minimum sample size to achieve an expected precision. CTDIvol and DLP values of 19,875 consecutive CT acquisitions of abdomen (7268), thorax (3805), lumbar spine (3161), cervical spine (1515) and head (4106) were collected in two centers. Their variabilities were investigated according to sample size (10 to 1000 acquisitions) and patient body weight categories (no weight selection, 67-73 kg and 60-80 kg). The 95 % confidence interval in percentage of their median (CI95/med) value was calculated for increasing sample sizes. We deduced the sample size that set a 95 % CI lower than 10 % of the median (CI95/med ≤ 10 %). Sample size ensuring CI95/med ≤ 10 %, ranged from 15 to 900 depending on the body region and the dose descriptor considered. In sample sizes recommended by regulatory authorities (i.e., from 10-20 patients), mean CTDIvol and DLP of one sample ranged from 0.50 to 2.00 times its actual value extracted from 2000 samples. The sampling error in CTDIvol and DLP means is high in dose surveys based on small samples of patients. Sample size should be increased at least tenfold to decrease this variability. (orig.)
Zeng, Chen; Rosengard, Sarah Z.; Burt, William; Peña, M. Angelica; Nemcek, Nina; Zeng, Tao; Arrigo, Kevin R.; Tortell, Philippe D.
2018-06-01
We evaluate several algorithms for the estimation of phytoplankton size class (PSC) and functional type (PFT) biomass from ship-based optical measurements in the Subarctic Northeast Pacific Ocean. Using underway measurements of particulate absorption and backscatter in surface waters, we derived estimates of PSC/PFT based on chlorophyll-a concentrations (Chl-a), particulate absorption spectra and the wavelength dependence of particulate backscatter. Optically-derived [Chl-a] and phytoplankton absorption measurements were validated against discrete calibration samples, while the derived PSC/PFT estimates were validated using size-fractionated Chl-a measurements and HPLC analysis of diagnostic photosynthetic pigments (DPA). Our results showflo that PSC/PFT algorithms based on [Chl-a] and particulate absorption spectra performed significantly better than the backscatter slope approach. These two more successful algorithms yielded estimates of phytoplankton size classes that agreed well with HPLC-derived DPA estimates (RMSE = 12.9%, and 16.6%, respectively) across a range of hydrographic and productivity regimes. Moreover, the [Chl-a] algorithm produced PSC estimates that agreed well with size-fractionated [Chl-a] measurements, and estimates of the biomass of specific phytoplankton groups that were consistent with values derived from HPLC. Based on these results, we suggest that simple [Chl-a] measurements should be more fully exploited to improve the classification of phytoplankton assemblages in the Northeast Pacific Ocean.
Genome size estimation: a new methodology
Álvarez-Borrego, Josué; Gallardo-Escárate, Crisitian; Kober, Vitaly; López-Bonilla, Oscar
2007-03-01
Recently, within the cytogenetic analysis, the evolutionary relations implied in the content of nuclear DNA in plants and animals have received a great attention. The first detailed measurements of the nuclear DNA content were made in the early 40's, several years before Watson and Crick proposed the molecular structure of the DNA. In the following years Hewson Swift developed the concept of "C-value" in reference to the haploid phase of DNA in plants. Later Mirsky and Ris carried out the first systematic study of genomic size in animals, including representatives of the five super classes of vertebrates as well as of some invertebrates. From these preliminary results it became evident that the DNA content varies enormously between the species and that this variation does not bear relation to the intuitive notion from the complexity of the organism. Later, this observation was reaffirmed in the following years as the studies increased on genomic size, thus denominating to this characteristic of the organisms like the "Paradox of the C-value". Few years later along with the no-codification discovery of DNA the paradox was solved, nevertheless, numerous questions remain until nowadays unfinished, taking to denominate this type of studies like the "C-value enigma". In this study, we reported a new method for genome size estimation by quantification of fluorescence fading. We measured the fluorescence intensity each 1600 milliseconds in DAPI-stained nuclei. The estimation of the area under the graph (integral fading) during fading period was related with the genome size.
Dunham, Kylee; Grand, James B.
2016-01-01
We examined the effects of complexity and priors on the accuracy of models used to estimate ecological and observational processes, and to make predictions regarding population size and structure. State-space models are useful for estimating complex, unobservable population processes and making predictions about future populations based on limited data. To better understand the utility of state space models in evaluating population dynamics, we used them in a Bayesian framework and compared the accuracy of models with differing complexity, with and without informative priors using sequential importance sampling/resampling (SISR). Count data were simulated for 25 years using known parameters and observation process for each model. We used kernel smoothing to reduce the effect of particle depletion, which is common when estimating both states and parameters with SISR. Models using informative priors estimated parameter values and population size with greater accuracy than their non-informative counterparts. While the estimates of population size and trend did not suffer greatly in models using non-informative priors, the algorithm was unable to accurately estimate demographic parameters. This model framework provides reasonable estimates of population size when little to no information is available; however, when information on some vital rates is available, SISR can be used to obtain more precise estimates of population size and process. Incorporating model complexity such as that required by structured populations with stage-specific vital rates affects precision and accuracy when estimating latent population variables and predicting population dynamics. These results are important to consider when designing monitoring programs and conservation efforts requiring management of specific population segments.
Sample sizing of biological materials analyzed by energy dispersion X-ray fluorescence
International Nuclear Information System (INIS)
Paiva, Jose D.S.; Franca, Elvis J.; Magalhaes, Marcelo R.L.; Almeida, Marcio E.S.; Hazin, Clovis A.
2013-01-01
Analytical portions used in chemical analyses are usually less than 1g. Errors resulting from the sampling are barely evaluated, since this type of study is a time-consuming procedure, with high costs for the chemical analysis of large number of samples. The energy dispersion X-ray fluorescence - EDXRF is a non-destructive and fast analytical technique with the possibility of determining several chemical elements. Therefore, the aim of this study was to provide information on the minimum analytical portion for quantification of chemical elements in biological matrices using EDXRF. Three species were sampled in mangroves from the Pernambuco, Brazil. Tree leaves were washed with distilled water, oven-dried at 60 deg C and milled until 0.5 mm particle size. Ten test-portions of approximately 500 mg for each species were transferred to vials sealed with polypropylene film. The quality of the analytical procedure was evaluated from the reference materials IAEA V10 Hay Powder, SRM 2976 Apple Leaves. After energy calibration, all samples were analyzed under vacuum for 100 seconds for each group of chemical elements. The voltage used was 15 kV and 50 kV for chemical elements of atomic number lower than 22 and the others, respectively. For the best analytical conditions, EDXRF was capable of estimating the sample size uncertainty for further determination of chemical elements in leaves. (author)
Sample sizing of biological materials analyzed by energy dispersion X-ray fluorescence
Energy Technology Data Exchange (ETDEWEB)
Paiva, Jose D.S.; Franca, Elvis J.; Magalhaes, Marcelo R.L.; Almeida, Marcio E.S.; Hazin, Clovis A., E-mail: dan-paiva@hotmail.com, E-mail: ejfranca@cnen.gov.br, E-mail: marcelo_rlm@hotmail.com, E-mail: maensoal@yahoo.com.br, E-mail: chazin@cnen.gov.b [Centro Regional de Ciencias Nucleares do Nordeste (CRCN-NE/CNEN-PE), Recife, PE (Brazil)
2013-07-01
Analytical portions used in chemical analyses are usually less than 1g. Errors resulting from the sampling are barely evaluated, since this type of study is a time-consuming procedure, with high costs for the chemical analysis of large number of samples. The energy dispersion X-ray fluorescence - EDXRF is a non-destructive and fast analytical technique with the possibility of determining several chemical elements. Therefore, the aim of this study was to provide information on the minimum analytical portion for quantification of chemical elements in biological matrices using EDXRF. Three species were sampled in mangroves from the Pernambuco, Brazil. Tree leaves were washed with distilled water, oven-dried at 60 deg C and milled until 0.5 mm particle size. Ten test-portions of approximately 500 mg for each species were transferred to vials sealed with polypropylene film. The quality of the analytical procedure was evaluated from the reference materials IAEA V10 Hay Powder, SRM 2976 Apple Leaves. After energy calibration, all samples were analyzed under vacuum for 100 seconds for each group of chemical elements. The voltage used was 15 kV and 50 kV for chemical elements of atomic number lower than 22 and the others, respectively. For the best analytical conditions, EDXRF was capable of estimating the sample size uncertainty for further determination of chemical elements in leaves. (author)
Estimating Effect Sizes and Expected Replication Probabilities from GWAS Summary Statistics
DEFF Research Database (Denmark)
Holland, Dominic; Wang, Yunpeng; Thompson, Wesley K
2016-01-01
Genome-wide Association Studies (GWAS) result in millions of summary statistics ("z-scores") for single nucleotide polymorphism (SNP) associations with phenotypes. These rich datasets afford deep insights into the nature and extent of genetic contributions to complex phenotypes such as psychiatric......-scores, as such knowledge would enhance causal SNP and gene discovery, help elucidate mechanistic pathways, and inform future study design. Here we present a parsimonious methodology for modeling effect sizes and replication probabilities, relying only on summary statistics from GWAS substudies, and a scheme allowing...... for estimating the degree of polygenicity of the phenotype and predicting the proportion of chip heritability explainable by genome-wide significant SNPs in future studies with larger sample sizes. We apply the model to recent GWAS of schizophrenia (N = 82,315) and putamen volume (N = 12,596), with approximately...
Estimation of particle size distribution of nanoparticles from electrical ...
Indian Academy of Sciences (India)
2018-02-02
Feb 2, 2018 ... An indirect method of estimation of size distribution of nanoparticles in a nanocomposite is ... The present approach exploits DC electrical current–voltage ... the sizes of nanoparticles (NPs) by electrical characterization.
Frictional behaviour of sandstone: A sample-size dependent triaxial investigation
Roshan, Hamid; Masoumi, Hossein; Regenauer-Lieb, Klaus
2017-01-01
Frictional behaviour of rocks from the initial stage of loading to final shear displacement along the formed shear plane has been widely investigated in the past. However the effect of sample size on such frictional behaviour has not attracted much attention. This is mainly related to the limitations in rock testing facilities as well as the complex mechanisms involved in sample-size dependent frictional behaviour of rocks. In this study, a suite of advanced triaxial experiments was performed on Gosford sandstone samples at different sizes and confining pressures. The post-peak response of the rock along the formed shear plane has been captured for the analysis with particular interest in sample-size dependency. Several important phenomena have been observed from the results of this study: a) the rate of transition from brittleness to ductility in rock is sample-size dependent where the relatively smaller samples showed faster transition toward ductility at any confining pressure; b) the sample size influences the angle of formed shear band and c) the friction coefficient of the formed shear plane is sample-size dependent where the relatively smaller sample exhibits lower friction coefficient compared to larger samples. We interpret our results in terms of a thermodynamics approach in which the frictional properties for finite deformation are viewed as encompassing a multitude of ephemeral slipping surfaces prior to the formation of the through going fracture. The final fracture itself is seen as a result of the self-organisation of a sufficiently large ensemble of micro-slip surfaces and therefore consistent in terms of the theory of thermodynamics. This assumption vindicates the use of classical rock mechanics experiments to constrain failure of pressure sensitive rocks and the future imaging of these micro-slips opens an exciting path for research in rock failure mechanisms.
Evaluation of Approaches to Analyzing Continuous Correlated Eye Data When Sample Size Is Small.
Huang, Jing; Huang, Jiayan; Chen, Yong; Ying, Gui-Shuang
2018-02-01
To evaluate the performance of commonly used statistical methods for analyzing continuous correlated eye data when sample size is small. We simulated correlated continuous data from two designs: (1) two eyes of a subject in two comparison groups; (2) two eyes of a subject in the same comparison group, under various sample size (5-50), inter-eye correlation (0-0.75) and effect size (0-0.8). Simulated data were analyzed using paired t-test, two sample t-test, Wald test and score test using the generalized estimating equations (GEE) and F-test using linear mixed effects model (LMM). We compared type I error rates and statistical powers, and demonstrated analysis approaches through analyzing two real datasets. In design 1, paired t-test and LMM perform better than GEE, with nominal type 1 error rate and higher statistical power. In design 2, no test performs uniformly well: two sample t-test (average of two eyes or a random eye) achieves better control of type I error but yields lower statistical power. In both designs, the GEE Wald test inflates type I error rate and GEE score test has lower power. When sample size is small, some commonly used statistical methods do not perform well. Paired t-test and LMM perform best when two eyes of a subject are in two different comparison groups, and t-test using the average of two eyes performs best when the two eyes are in the same comparison group. When selecting the appropriate analysis approach the study design should be considered.
International Nuclear Information System (INIS)
Bode, P.; Koster-Ammerlaan, M.J.J.
2018-01-01
Pragmatic rather than physical correction factors for neutron and gamma-ray shielding were studied for samples of intermediate size, i.e. up to the 10-100 gram range. It was found that for most biological and geological materials, the neutron self-shielding is less than 5 % and the gamma-ray self-attenuation can easily be estimated. A trueness control material of 1 kg size was made based on use of left-overs of materials, used in laboratory intercomparisons. A design study for a large sample pool-side facility, handling plate-type volumes, had to be stopped because of a reduction in human resources, available for this CRP. The large sample NAA facilities were made available to guest scientists from Greece and Brazil. The laboratory for neutron activation analysis participated in the world’s first laboratory intercomparison utilizing large samples. (author)
Can rarefaction be used to estimate song repertoire size in birds?
Directory of Open Access Journals (Sweden)
Kathleen R. PESHEK, Daniel T. BLUMSTEIN
2011-06-01
Full Text Available Song repertoire size is the number of distinct syllables, phrases, or song types produced by an individual or population. Repertoire size estimation is particularly difficult for species that produce highly variable songs and those that produce many song types. Estimating repertoire size is important for ecological and evolutionary studies of speciation, studies of sexual selection, as well as studies of how species may adapt their songs to various acoustic environments. There are several methods to estimate repertoire size, however prior studies discovered that all but a full numerical count of song types might have substantial inaccuracies associated with them. We evaluated a somewhat novel approach to estimate repertoire size—rarefaction; a technique ecologists use to measure species diversity on individual and population levels. Using the syllables within American robins’ Turdus migratorius repertoire, we compared the most commonly used techniques of estimating repertoires to the results of a rarefaction analysis. American robins have elaborate and unique songs with few syllables shared between individuals, and there is no evidence that robins mimic their neighbors. Thus, they are an ideal system in which to compare techniques. We found that the rarefaction technique results resembled that of the numerical count, and were better than two alternative methods (behavioral accumulation curves, and capture-recapture to estimate syllable repertoire size. Future estimates of repertoire size, particularly in vocally complex species, may benefit from using rarefaction techniques when numerical counts are unable to be performed [Current Zoology 57 (3: 300–306, 2011].
Estimating the average grain size of metals - approved standard 1969
International Nuclear Information System (INIS)
Anon.
1975-01-01
These methods cover procedures for estimating and rules for expressing the average grain size of all metals and consisting entirely, or principally, of a single phase. The methods may also be used for any structures having appearances similar to those of the metallic structures shown in the comparison charts. The three basic procedures for grain size estimation which are discussed are comparison procedure, intercept (or Heyn) procedure, and planimetric (or Jeffries) procedure. For specimens consisting of equiaxed grains, the method of comparing the specimen with a standard chart is most convenient and is sufficiently accurate for most commercial purposes. For high degrees of accuracy in estimating grain size, the intercept or planimetric procedures may be used
Estimation of portion size in children's dietary assessment: lessons learnt.
Foster, E; Adamson, A J; Anderson, A S; Barton, K L; Wrieden, W L
2009-02-01
Assessing the dietary intake of young children is challenging. In any 1 day, children may have several carers responsible for providing them with their dietary requirements, and once children reach school age, traditional methods such as weighing all items consumed become impractical. As an alternative to weighed records, food portion size assessment tools are available to assist subjects in estimating the amounts of foods consumed. Existing food photographs designed for use with adults and based on adult portion sizes have been found to be inappropriate for use with children. This article presents a review and summary of a body of work carried out to improve the estimation of portion sizes consumed by children. Feasibility work was undertaken to determine the accuracy and precision of three portion size assessment tools; food photographs, food models and a computer-based Interactive Portion Size Assessment System (IPSAS). These tools were based on portion sizes served to children during the National Diet and Nutrition Survey. As children often do not consume all of the food served to them, smaller portions were included in each tool for estimation of leftovers. The tools covered 22 foods, which children commonly consume. Children were served known amounts of each food and leftovers were recorded. They were then asked to estimate both the amount of food that they were served and the amount of any food leftover. Children were found to estimate food portion size with an accuracy approaching that of adults using both the food photographs and IPSAS. Further development is underway to increase the number of food photographs and to develop IPSAS to cover a much wider range of foods and to validate the use of these tools in a 'real life' setting.
Effects of sample size on the second magnetization peak in ...
Indian Academy of Sciences (India)
the sample size decreases – a result that could be interpreted as a size effect in the order– disorder vortex matter phase transition. However, local magnetic measurements trace this effect to metastable disordered vortex states, revealing the same order–disorder transition induction in samples of different size. Keywords.
Brownscombe, J W; Lennox, R J; Danylchuk, A J; Cooke, S J
2018-06-21
Accelerometry is growing in popularity for remotely measuring fish swimming metrics, but appropriate sampling frequencies for accurately measuring these metrics are not well studied. This research examined the influence of sampling frequency (1-25 Hz) with tri-axial accelerometer biologgers on estimates of overall dynamic body acceleration (ODBA), tail-beat frequency, swimming speed and metabolic rate of bonefish Albula vulpes in a swim-tunnel respirometer and free-swimming in a wetland mesocosm. In the swim tunnel, sampling frequencies of ≥ 5 Hz were sufficient to establish strong relationships between ODBA, swimming speed and metabolic rate. However, in free-swimming bonefish, estimates of metabolic rate were more variable below 10 Hz. Sampling frequencies should be at least twice the maximum tail-beat frequency to estimate this metric effectively, which is generally higher than those required to estimate ODBA, swimming speed and metabolic rate. While optimal sampling frequency probably varies among species due to tail-beat frequency and swimming style, this study provides a reference point with a medium body-sized sub-carangiform teleost fish, enabling researchers to measure these metrics effectively and maximize study duration. This article is protected by copyright. All rights reserved. This article is protected by copyright. All rights reserved.
Żebrowska, Magdalena; Posch, Martin; Magirr, Dominic
2016-05-30
Consider a parallel group trial for the comparison of an experimental treatment to a control, where the second-stage sample size may depend on the blinded primary endpoint data as well as on additional blinded data from a secondary endpoint. For the setting of normally distributed endpoints, we demonstrate that this may lead to an inflation of the type I error rate if the null hypothesis holds for the primary but not the secondary endpoint. We derive upper bounds for the inflation of the type I error rate, both for trials that employ random allocation and for those that use block randomization. We illustrate the worst-case sample size reassessment rule in a case study. For both randomization strategies, the maximum type I error rate increases with the effect size in the secondary endpoint and the correlation between endpoints. The maximum inflation increases with smaller block sizes if information on the block size is used in the reassessment rule. Based on our findings, we do not question the well-established use of blinded sample size reassessment methods with nuisance parameter estimates computed from the blinded interim data of the primary endpoint. However, we demonstrate that the type I error rate control of these methods relies on the application of specific, binding, pre-planned and fully algorithmic sample size reassessment rules and does not extend to general or unplanned sample size adjustments based on blinded data. © 2015 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd. © 2015 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.
Constrained statistical inference: sample-size tables for ANOVA and regression
Directory of Open Access Journals (Sweden)
Leonard eVanbrabant
2015-01-01
Full Text Available Researchers in the social and behavioral sciences often have clear expectations about the order/direction of the parameters in their statistical model. For example, a researcher might expect that regression coefficient beta1 is larger than beta2 and beta3. The corresponding hypothesis is H: beta1 > {beta2, beta3} and this is known as an (order constrained hypothesis. A major advantage of testing such a hypothesis is that power can be gained and inherently a smaller sample size is needed. This article discusses this gain in sample size reduction, when an increasing number of constraints is included into the hypothesis. The main goal is to present sample-size tables for constrained hypotheses. A sample-size table contains the necessary sample-size at a prespecified power (say, 0.80 for an increasing number of constraints. To obtain sample-size tables, two Monte Carlo simulations were performed, one for ANOVA and one for multiple regression. Three results are salient. First, in an ANOVA the needed sample-size decreases with 30% to 50% when complete ordering of the parameters is taken into account. Second, small deviations from the imposed order have only a minor impact on the power. Third, at the maximum number of constraints, the linear regression results are comparable with the ANOVA results. However, in the case of fewer constraints, ordering the parameters (e.g., beta1 > beta2 results in a higher power than assigning a positive or a negative sign to the parameters (e.g., beta1 > 0.
The Effects of Test Length and Sample Size on Item Parameters in Item Response Theory
Sahin, Alper; Anil, Duygu
2017-01-01
This study investigates the effects of sample size and test length on item-parameter estimation in test development utilizing three unidimensional dichotomous models of item response theory (IRT). For this purpose, a real language test comprised of 50 items was administered to 6,288 students. Data from this test was used to obtain data sets of…
Suarez Diez, M.; Saccenti, E.
2015-01-01
We investigated the effect of sample size and dimensionality on the performance of four algorithms (ARACNE, CLR, CORR, and PCLRC) when they are used for the inference of metabolite association networks. We report that as many as 100-400 samples may be necessary to obtain stable network estimations,
Sample Size in Qualitative Interview Studies: Guided by Information Power.
Malterud, Kirsti; Siersma, Volkert Dirk; Guassora, Ann Dorrit
2015-11-27
Sample sizes must be ascertained in qualitative studies like in quantitative studies but not by the same means. The prevailing concept for sample size in qualitative studies is "saturation." Saturation is closely tied to a specific methodology, and the term is inconsistently applied. We propose the concept "information power" to guide adequate sample size for qualitative studies. Information power indicates that the more information the sample holds, relevant for the actual study, the lower amount of participants is needed. We suggest that the size of a sample with sufficient information power depends on (a) the aim of the study, (b) sample specificity, (c) use of established theory, (d) quality of dialogue, and (e) analysis strategy. We present a model where these elements of information and their relevant dimensions are related to information power. Application of this model in the planning and during data collection of a qualitative study is discussed. © The Author(s) 2015.
Cornelissen, Katri; Bester, Andre; Cairns, Paul; Tovee, Martin; Cornelissen, Piers
2015-01-01
In this cross-sectional study, we investigated the influence of personal BMI on body size estimation in 42 women who have symptoms of anorexia (referred to henceforth as anorexia spectrum disorders, ANSD), and 100 healthy controls. Low BMI control participants over-estimate their size and high BMI controls under-estimate, a pattern which is predicted by a perceptual phenomenon called contraction bias. In addition, control participants' sensitivity to size change declines as their BMI increase...
Graf, Alexandra C; Bauer, Peter
2011-06-30
We calculate the maximum type 1 error rate of the pre-planned conventional fixed sample size test for comparing the means of independent normal distributions (with common known variance) which can be yielded when sample size and allocation rate to the treatment arms can be modified in an interim analysis. Thereby it is assumed that the experimenter fully exploits knowledge of the unblinded interim estimates of the treatment effects in order to maximize the conditional type 1 error rate. The 'worst-case' strategies require knowledge of the unknown common treatment effect under the null hypothesis. Although this is a rather hypothetical scenario it may be approached in practice when using a standard control treatment for which precise estimates are available from historical data. The maximum inflation of the type 1 error rate is substantially larger than derived by Proschan and Hunsberger (Biometrics 1995; 51:1315-1324) for design modifications applying balanced samples before and after the interim analysis. Corresponding upper limits for the maximum type 1 error rate are calculated for a number of situations arising from practical considerations (e.g. restricting the maximum sample size, not allowing sample size to decrease, allowing only increase in the sample size in the experimental treatment). The application is discussed for a motivating example. Copyright © 2011 John Wiley & Sons, Ltd.
Foster, E; Matthews, J N S; Lloyd, J; Marshall, L; Mathers, J C; Nelson, M; Barton, K L; Wrieden, W L; Cornelissen, P; Harris, J; Adamson, A J
2008-01-01
A number of methods have been developed to assist subjects in providing an estimate of portion size but their application in improving portion size estimation by children has not been investigated systematically. The aim was to develop portion size assessment tools for use with children and to assess the accuracy of children's estimates of portion size using the tools. The tools were food photographs, food models and an interactive portion size assessment system (IPSAS). Children (n 201), aged 4-16 years, were supplied with known quantities of food to eat, in school. Food leftovers were weighed. Children estimated the amount of each food using each tool, 24 h after consuming the food. The age-specific portion sizes represented were based on portion sizes consumed by children in a national survey. Significant differences were found between the accuracy of estimates using the three tools. Children of all ages performed well using the IPSAS and food photographs. The accuracy and precision of estimates made using the food models were poor. For all tools, estimates of the amount of food served were more accurate than estimates of the amount consumed. Issues relating to reporting of foods left over which impact on estimates of the amounts of foods actually consumed require further study. The IPSAS has shown potential for assessment of dietary intake with children. Before practical application in assessment of dietary intake of children the tool would need to be expanded to cover a wider range of foods and to be validated in a 'real-life' situation.
Froud, Robert; Rajendran, Dévan; Patel, Shilpa; Bright, Philip; Bjørkli, Tom; Eldridge, Sandra; Buchbinder, Rachelle; Underwood, Martin
2017-06-01
A systematic review of nonspecific low back pain trials published between 1980 and 2012. To explore what proportion of trials have been powered to detect different bands of effect size; whether there is evidence that sample size in low back pain trials has been increasing; what proportion of trial reports include a sample size calculation; and whether likelihood of reporting sample size calculations has increased. Clinical trials should have a sample size sufficient to detect a minimally important difference for a given power and type I error rate. An underpowered trial is one within which probability of type II error is too high. Meta-analyses do not mitigate underpowered trials. Reviewers independently abstracted data on sample size at point of analysis, whether a sample size calculation was reported, and year of publication. Descriptive analyses were used to explore ability to detect effect sizes, and regression analyses to explore the relationship between sample size, or reporting sample size calculations, and time. We included 383 trials. One-third were powered to detect a standardized mean difference of less than 0.5, and 5% were powered to detect less than 0.3. The average sample size was 153 people, which increased only slightly (∼4 people/yr) from 1980 to 2000, and declined slightly (∼4.5 people/yr) from 2005 to 2011 (P pain trials and the reporting of sample size calculations may need to be increased. It may be justifiable to power a trial to detect only large effects in the case of novel interventions. 3.
Sobel Leonard, Ashley; Weissman, Daniel B; Greenbaum, Benjamin; Ghedin, Elodie; Koelle, Katia
2017-07-15
The bottleneck governing infectious disease transmission describes the size of the pathogen population transferred from the donor to the recipient host. Accurate quantification of the bottleneck size is particularly important for rapidly evolving pathogens such as influenza virus, as narrow bottlenecks reduce the amount of transferred viral genetic diversity and, thus, may decrease the rate of viral adaptation. Previous studies have estimated bottleneck sizes governing viral transmission by using statistical analyses of variants identified in pathogen sequencing data. These analyses, however, did not account for variant calling thresholds and stochastic viral replication dynamics within recipient hosts. Because these factors can skew bottleneck size estimates, we introduce a new method for inferring bottleneck sizes that accounts for these factors. Through the use of a simulated data set, we first show that our method, based on beta-binomial sampling, accurately recovers transmission bottleneck sizes, whereas other methods fail to do so. We then apply our method to a data set of influenza A virus (IAV) infections for which viral deep-sequencing data from transmission pairs are available. We find that the IAV transmission bottleneck size estimates in this study are highly variable across transmission pairs, while the mean bottleneck size of 196 virions is consistent with a previous estimate for this data set. Furthermore, regression analysis shows a positive association between estimated bottleneck size and donor infection severity, as measured by temperature. These results support findings from experimental transmission studies showing that bottleneck sizes across transmission events can be variable and influenced in part by epidemiological factors. IMPORTANCE The transmission bottleneck size describes the size of the pathogen population transferred from the donor to the recipient host and may affect the rate of pathogen adaptation within host populations. Recent
Sample size choices for XRCT scanning of highly unsaturated soil mixtures
Directory of Open Access Journals (Sweden)
Smith Jonathan C.
2016-01-01
Full Text Available Highly unsaturated soil mixtures (clay, sand and gravel are used as building materials in many parts of the world, and there is increasing interest in understanding their mechanical and hydraulic behaviour. In the laboratory, x-ray computed tomography (XRCT is becoming more widely used to investigate the microstructures of soils, however a crucial issue for such investigations is the choice of sample size, especially concerning the scanning of soil mixtures where there will be a range of particle and void sizes. In this paper we present a discussion (centred around a new set of XRCT scans on sample sizing for scanning of samples comprising soil mixtures, where a balance has to be made between realistic representation of the soil components and the desire for high resolution scanning, We also comment on the appropriateness of differing sample sizes in comparison to sample sizes used for other geotechnical testing. Void size distributions for the samples are presented and from these some hypotheses are made as to the roles of inter- and intra-aggregate voids in the mechanical behaviour of highly unsaturated soils.
International Nuclear Information System (INIS)
Clementi, Luis A.; Vega, Jorge R.; Gugliotta, Luis M.; Quirantes, Arturo
2012-01-01
A numerical method is proposed for the characterization of core–shell spherical particles from static light scattering (SLS) measurements. The method is able to estimate the core size distribution (CSD) and the particle size distribution (PSD), through the following two-step procedure: (i) the estimation of the bivariate core–particle size distribution (C–PSD), by solving a linear ill-conditioned inverse problem through a generalized Tikhonov regularization strategy, and (ii) the calculation of the CSD and the PSD from the estimated C–PSD. First, the method was evaluated on the basis of several simulated examples, with polystyrene–poly(methyl methacrylate) core–shell particles of different CSDs and PSDs. Then, two samples of hematite–Yttrium basic carbonate core–shell particles were successfully characterized. In all analyzed examples, acceptable estimates of the PSD and the average diameter of the CSD were obtained. Based on the single-scattering Mie theory, the proposed method is an effective tool for characterizing core–shell colloidal particles larger than their Rayleigh limits without requiring any a-priori assumption on the shapes of the size distributions. Under such conditions, the PSDs can always be adequately estimated, while acceptable CSD estimates are obtained when the core/shell particles exhibit either a high optical contrast, or a moderate optical contrast but with a high ‘average core diameter’/‘average particle diameter’ ratio. -- Highlights: ► Particles with core–shell morphology are characterized by static light scattering. ► Core size distribution and particle size distribution are successfully estimated. ► Simulated and experimental examples are used to validate the numerical method. ► The positive effect of a large core/shell optical contrast is investigated. ► No a-priori assumption on the shapes of the size distributions is required.
Directory of Open Access Journals (Sweden)
Christopher Ryan Penton
2016-06-01
Full Text Available We examined the effect of different soil sample sizes obtained from an agricultural field, under a single cropping system uniform in soil properties and aboveground crop responses, on bacterial and fungal community structure and microbial diversity indices. DNA extracted from soil sample sizes of 0.25, 1, 5 and 10 g using MoBIO kits and from 10 and 100 g sizes using a bead-beating method (SARDI were used as templates for high-throughput sequencing of 16S and 28S rRNA gene amplicons for bacteria and fungi, respectively, on the Illumina MiSeq and Roche 454 platforms. Sample size significantly affected overall bacterial and fungal community structure, replicate dispersion and the number of operational taxonomic units (OTUs retrieved. Richness, evenness and diversity were also significantly affected. The largest diversity estimates were always associated with the 10 g MoBIO extractions with a corresponding reduction in replicate dispersion. For the fungal data, smaller MoBIO extractions identified more unclassified Eukaryota incertae sedis and unclassified glomeromycota while the SARDI method retrieved more abundant OTUs containing unclassified Pleosporales and the fungal genera Alternaria and Cercophora. Overall, these findings indicate that a 10 g soil DNA extraction is most suitable for both soil bacterial and fungal communities for retrieving optimal diversity while still capturing rarer taxa in concert with decreasing replicate variation.
DEFF Research Database (Denmark)
Kokkalis, Alexandros; Thygesen, Uffe Høgsbro; Nielsen, Anders
, were investigated and our estimations were compared to the ICES advice. Only size-specific catch data were used, in order to emulate data limited situations. The simulation analysis reveals that the status of the stock, i.e. F/Fmsy, is estimated more accurately than the fishing mortality F itself....... Specific knowledge of the natural mortality improves the estimation more than having information about all other life history parameters. Our approach gives, at least qualitatively, an estimated stock status which is similar to the results of an age-based assessment. Since our approach only uses size...
Size Estimates in Inverse Problems
Di Cristo, Michele
2014-01-06
Detection of inclusions or obstacles inside a body by boundary measurements is an inverse problems very useful in practical applications. When only finite numbers of measurements are available, we try to detect some information on the embedded object such as its size. In this talk we review some recent results on several inverse problems. The idea is to provide constructive upper and lower estimates of the area/volume of the unknown defect in terms of a quantity related to the work that can be expressed with the available boundary data.
Williams, K.A.; Frederick, P.C.; Nichols, J.D.
2011-01-01
Many populations of animals are fluid in both space and time, making estimation of numbers difficult. Much attention has been devoted to estimation of bias in detection of animals that are present at the time of survey. However, an equally important problem is estimation of population size when all animals are not present on all survey occasions. Here, we showcase use of the superpopulation approach to capture-recapture modeling for estimating populations where group membership is asynchronous, and where considerable overlap in group membership among sampling occasions may occur. We estimate total population size of long-legged wading bird (Great Egret and White Ibis) breeding colonies from aerial observations of individually identifiable nests at various times in the nesting season. Initiation and termination of nests were analogous to entry and departure from a population. Estimates using the superpopulation approach were 47-382% larger than peak aerial counts of the same colonies. Our results indicate that the use of the superpopulation approach to model nesting asynchrony provides a considerably less biased and more efficient estimate of nesting activity than traditional methods. We suggest that this approach may also be used to derive population estimates in a variety of situations where group membership is fluid. ?? 2011 by the Ecological Society of America.
Decision Support on Small size Passive Samples
Directory of Open Access Journals (Sweden)
Vladimir Popukaylo
2018-05-01
Full Text Available A construction technique of adequate mathematical models for small size passive samples, in conditions when classical probabilistic-statis\\-tical methods do not allow obtaining valid conclusions was developed.
Estimating the encounter rate variance in distance sampling
Fewster, R.M.; Buckland, S.T.; Burnham, K.P.; Borchers, D.L.; Jupp, P.E.; Laake, J.L.; Thomas, L.
2009-01-01
The dominant source of variance in line transect sampling is usually the encounter rate variance. Systematic survey designs are often used to reduce the true variability among different realizations of the design, but estimating the variance is difficult and estimators typically approximate the variance by treating the design as a simple random sample of lines. We explore the properties of different encounter rate variance estimators under random and systematic designs. We show that a design-based variance estimator improves upon the model-based estimator of Buckland et al. (2001, Introduction to Distance Sampling. Oxford: Oxford University Press, p. 79) when transects are positioned at random. However, if populations exhibit strong spatial trends, both estimators can have substantial positive bias under systematic designs. We show that poststratification is effective in reducing this bias. ?? 2008, The International Biometric Society.
Investigation of Bicycle Travel Time Estimation Using Bluetooth Sensors for Low Sampling Rates
Directory of Open Access Journals (Sweden)
Zhenyu Mei
2014-10-01
Full Text Available Filtering the data for bicycle travel time using Bluetooth sensors is crucial to the estimation of link travel times on a corridor. The current paper describes an adaptive filtering algorithm for estimating bicycle travel times using Bluetooth data, with consideration of low sampling rates. The data for bicycle travel time using Bluetooth sensors has two characteristics. First, the bicycle flow contains stable and unstable conditions. Second, the collected data have low sampling rates (less than 1%. To avoid erroneous inference, filters are introduced to “purify” multiple time series. The valid data are identified within a dynamically varying validity window with the use of a robust data-filtering procedure. The size of the validity window varies based on the number of preceding sampling intervals without a Bluetooth record. Applications of the proposed algorithm to the dataset from Genshan East Road and Moganshan Road in Hangzhou demonstrate its ability to track typical variations in bicycle travel time efficiently, while suppressing high frequency noise signals.
Sample Size for Estimation of G and Phi Coefficients in Generalizability Theory
Atilgan, Hakan
2013-01-01
Problem Statement: Reliability, which refers to the degree to which measurement results are free from measurement errors, as well as its estimation, is an important issue in psychometrics. Several methods for estimating reliability have been suggested by various theories in the field of psychometrics. One of these theories is the generalizability…
The Statistics and Mathematics of High Dimension Low Sample Size Asymptotics.
Shen, Dan; Shen, Haipeng; Zhu, Hongtu; Marron, J S
2016-10-01
The aim of this paper is to establish several deep theoretical properties of principal component analysis for multiple-component spike covariance models. Our new results reveal an asymptotic conical structure in critical sample eigendirections under the spike models with distinguishable (or indistinguishable) eigenvalues, when the sample size and/or the number of variables (or dimension) tend to infinity. The consistency of the sample eigenvectors relative to their population counterparts is determined by the ratio between the dimension and the product of the sample size with the spike size. When this ratio converges to a nonzero constant, the sample eigenvector converges to a cone, with a certain angle to its corresponding population eigenvector. In the High Dimension, Low Sample Size case, the angle between the sample eigenvector and its population counterpart converges to a limiting distribution. Several generalizations of the multi-spike covariance models are also explored, and additional theoretical results are presented.
Mandava, Pitchaiah; Krumpelman, Chase S; Shah, Jharna N; White, Donna L; Kent, Thomas A
2013-01-01
Clinical trial outcomes often involve an ordinal scale of subjective functional assessments but the optimal way to quantify results is not clear. In stroke, the most commonly used scale, the modified Rankin Score (mRS), a range of scores ("Shift") is proposed as superior to dichotomization because of greater information transfer. The influence of known uncertainties in mRS assessment has not been quantified. We hypothesized that errors caused by uncertainties could be quantified by applying information theory. Using Shannon's model, we quantified errors of the "Shift" compared to dichotomized outcomes using published distributions of mRS uncertainties and applied this model to clinical trials. We identified 35 randomized stroke trials that met inclusion criteria. Each trial's mRS distribution was multiplied with the noise distribution from published mRS inter-rater variability to generate an error percentage for "shift" and dichotomized cut-points. For the SAINT I neuroprotectant trial, considered positive by "shift" mRS while the larger follow-up SAINT II trial was negative, we recalculated sample size required if classification uncertainty was taken into account. Considering the full mRS range, error rate was 26.1%±5.31 (Mean±SD). Error rates were lower for all dichotomizations tested using cut-points (e.g. mRS 1; 6.8%±2.89; overall pdecrease in reliability. The resultant errors need to be considered since sample size may otherwise be underestimated. In principle, we have outlined an approach to error estimation for any condition in which there are uncertainties in outcome assessment. We provide the user with programs to calculate and incorporate errors into sample size estimation.
Directory of Open Access Journals (Sweden)
Jiacheng Wu
Full Text Available Estimating the size of key risk populations is essential for determining the resources needed to implement effective public health intervention programs. Several standard methods for population size estimation exist, but the statistical and practical assumptions required for their use may not be met when applied to HIV risk groups. We apply three approaches to estimate the number of people who inject drugs (PWID in the Kohtla-Järve region of Estonia using data from a respondent-driven sampling (RDS study: the standard "multiplier" estimate gives 654 people (95% CI 509-804, the "successive sampling" method gives estimates between 600 and 2500 people, and a network-based estimate that uses the RDS recruitment chain gives between 700 and 2800 people. We critically assess the strengths and weaknesses of these statistical approaches for estimating the size of hidden or hard-to-reach HIV risk groups.
The attention-weighted sample-size model of visual short-term memory
DEFF Research Database (Denmark)
Smith, Philip L.; Lilburn, Simon D.; Corbett, Elaine A.
2016-01-01
exceeded that predicted by the sample-size model for both simultaneously and sequentially presented stimuli. Instead, the set-size effect and the serial position curves with sequential presentation were predicted by an attention-weighted version of the sample-size model, which assumes that one of the items...
Breaking Free of Sample Size Dogma to Perform Innovative Translational Research
Bacchetti, Peter; Deeks, Steven G.; McCune, Joseph M.
2011-01-01
Innovative clinical and translational research is often delayed or prevented by reviewers’ expectations that any study performed in humans must be shown in advance to have high statistical power. This supposed requirement is not justifiable and is contradicted by the reality that increasing sample size produces diminishing marginal returns. Studies of new ideas often must start small (sometimes even with an N of 1) because of cost and feasibility concerns, and recent statistical work shows that small sample sizes for such research can produce more projected scientific value per dollar spent than larger sample sizes. Renouncing false dogma about sample size would remove a serious barrier to innovation and translation. PMID:21677197
Fung, Tak; Keenan, Kevin
2014-01-01
The estimation of population allele frequencies using sample data forms a central component of studies in population genetics. These estimates can be used to test hypotheses on the evolutionary processes governing changes in genetic variation among populations. However, existing studies frequently do not account for sampling uncertainty in these estimates, thus compromising their utility. Incorporation of this uncertainty has been hindered by the lack of a method for constructing confidence intervals containing the population allele frequencies, for the general case of sampling from a finite diploid population of any size. In this study, we address this important knowledge gap by presenting a rigorous mathematical method to construct such confidence intervals. For a range of scenarios, the method is used to demonstrate that for a particular allele, in order to obtain accurate estimates within 0.05 of the population allele frequency with high probability (> or = 95%), a sample size of > 30 is often required. This analysis is augmented by an application of the method to empirical sample allele frequency data for two populations of the checkerspot butterfly (Melitaea cinxia L.), occupying meadows in Finland. For each population, the method is used to derive > or = 98.3% confidence intervals for the population frequencies of three alleles. These intervals are then used to construct two joint > or = 95% confidence regions, one for the set of three frequencies for each population. These regions are then used to derive a > or = 95%% confidence interval for Jost's D, a measure of genetic differentiation between the two populations. Overall, the results demonstrate the practical utility of the method with respect to informing sampling design and accounting for sampling uncertainty in studies of population genetics, important for scientific hypothesis-testing and also for risk-based natural resource management.
Directory of Open Access Journals (Sweden)
Tak Fung
Full Text Available The estimation of population allele frequencies using sample data forms a central component of studies in population genetics. These estimates can be used to test hypotheses on the evolutionary processes governing changes in genetic variation among populations. However, existing studies frequently do not account for sampling uncertainty in these estimates, thus compromising their utility. Incorporation of this uncertainty has been hindered by the lack of a method for constructing confidence intervals containing the population allele frequencies, for the general case of sampling from a finite diploid population of any size. In this study, we address this important knowledge gap by presenting a rigorous mathematical method to construct such confidence intervals. For a range of scenarios, the method is used to demonstrate that for a particular allele, in order to obtain accurate estimates within 0.05 of the population allele frequency with high probability (> or = 95%, a sample size of > 30 is often required. This analysis is augmented by an application of the method to empirical sample allele frequency data for two populations of the checkerspot butterfly (Melitaea cinxia L., occupying meadows in Finland. For each population, the method is used to derive > or = 98.3% confidence intervals for the population frequencies of three alleles. These intervals are then used to construct two joint > or = 95% confidence regions, one for the set of three frequencies for each population. These regions are then used to derive a > or = 95%% confidence interval for Jost's D, a measure of genetic differentiation between the two populations. Overall, the results demonstrate the practical utility of the method with respect to informing sampling design and accounting for sampling uncertainty in studies of population genetics, important for scientific hypothesis-testing and also for risk-based natural resource management.
Directory of Open Access Journals (Sweden)
Carlos Montenegro Silva
2009-01-01
Full Text Available Se analizó el desempeño de distintos tamaños de muestra para estimar la composición de tallas de las capturas del langostino colorado (Pleuroncodes monodon, a partir de un procedimiento de remuestreo computacional. Se seleccionaron datos recolectados en mayo de 2002 entre los 29°10'S y 32°10'S. A partir de éstos, se probaron siete escenarios de muestreo de viajes de pesca (1-7 viajes, 12 escenarios de número de ejemplares muestreados (25, 50,...300, cada 25 ejemplares y dos estrategias de muestreo de lances de pesca al interior de un viaje de pesca (censo de lances y muestreo sistemático. Se probó la combinación de todos estos escenarios, lo que permitió analizar el desempeño de 168 escenarios de tamaño de muestra para estimar la composición de tallas por sexo. Los resultados indicaron una disminución en el índice de error en la estimación de la distribución de frecuencia de tallas, conforme aumentó el número de viajes de pesca, con disminuciones progresivamente menores entre escenarios adyacentes. Del mismo modo, se verificó una disminución en el índice de error al aumentar el número de ejemplares muestreados, con mejoras marginales sobre los 175 ejemplares.The performances of different sample sizes for estimating the size distribution of squat lobster (Pleuroncodes monodon catches were analyzed using a computer resampling procedure. The data selected were gathered in May 2002 between 29°10'S and 32°10'S. These data were used to test seven sampling scenarios for fishing trips (1-7 trips, twelve scenarios of the number of individuals sampled per tow (25, 50,..., 300, and two within-trip sampling strategies (sampling all tows and systematic tow sampling. By testing the combination of all these scenarios, we were able to analyze the performance of 168 scenarios of sample size for estimating the composition of sizes by sex. The results indicate a lower error index for estimates of the size frequency distribution as the
Levin, Gregory P; Emerson, Sarah C; Emerson, Scott S
2013-04-15
Adaptive clinical trial design has been proposed as a promising new approach that may improve the drug discovery process. Proponents of adaptive sample size re-estimation promote its ability to avoid 'up-front' commitment of resources, better address the complicated decisions faced by data monitoring committees, and minimize accrual to studies having delayed ascertainment of outcomes. We investigate aspects of adaptation rules, such as timing of the adaptation analysis and magnitude of sample size adjustment, that lead to greater or lesser statistical efficiency. Owing in part to the recent Food and Drug Administration guidance that promotes the use of pre-specified sampling plans, we evaluate alternative approaches in the context of well-defined, pre-specified adaptation. We quantify the relative costs and benefits of fixed sample, group sequential, and pre-specified adaptive designs with respect to standard operating characteristics such as type I error, maximal sample size, power, and expected sample size under a range of alternatives. Our results build on others' prior research by demonstrating in realistic settings that simple and easily implemented pre-specified adaptive designs provide only very small efficiency gains over group sequential designs with the same number of analyses. In addition, we describe optimal rules for modifying the sample size, providing efficient adaptation boundaries on a variety of scales for the interim test statistic for adaptation analyses occurring at several different stages of the trial. We thus provide insight into what are good and bad choices of adaptive sampling plans when the added flexibility of adaptive designs is desired. Copyright © 2012 John Wiley & Sons, Ltd.
Estimating search engine index size variability: a 9-year longitudinal study.
van den Bosch, Antal; Bogers, Toine; de Kunder, Maurice
One of the determining factors of the quality of Web search engines is the size of their index. In addition to its influence on search result quality, the size of the indexed Web can also tell us something about which parts of the WWW are directly accessible to the everyday user. We propose a novel method of estimating the size of a Web search engine's index by extrapolating from document frequencies of words observed in a large static corpus of Web pages. In addition, we provide a unique longitudinal perspective on the size of Google and Bing's indices over a nine-year period, from March 2006 until January 2015. We find that index size estimates of these two search engines tend to vary dramatically over time, with Google generally possessing a larger index than Bing. This result raises doubts about the reliability of previous one-off estimates of the size of the indexed Web. We find that much, if not all of this variability can be explained by changes in the indexing and ranking infrastructure of Google and Bing. This casts further doubt on whether Web search engines can be used reliably for cross-sectional webometric studies.
Broberg, Per
2013-07-19
One major concern with adaptive designs, such as the sample size adjustable designs, has been the fear of inflating the type I error rate. In (Stat Med 23:1023-1038, 2004) it is however proven that when observations follow a normal distribution and the interim result show promise, meaning that the conditional power exceeds 50%, type I error rate is protected. This bound and the distributional assumptions may seem to impose undesirable restrictions on the use of these designs. In (Stat Med 30:3267-3284, 2011) the possibility of going below 50% is explored and a region that permits an increased sample size without inflation is defined in terms of the conditional power at the interim. A criterion which is implicit in (Stat Med 30:3267-3284, 2011) is derived by elementary methods and expressed in terms of the test statistic at the interim to simplify practical use. Mathematical and computational details concerning this criterion are exhibited. Under very general conditions the type I error rate is preserved under sample size adjustable schemes that permit a raise. The main result states that for normally distributed observations raising the sample size when the result looks promising, where the definition of promising depends on the amount of knowledge gathered so far, guarantees the protection of the type I error rate. Also, in the many situations where the test statistic approximately follows a normal law, the deviation from the main result remains negligible. This article provides details regarding the Weibull and binomial distributions and indicates how one may approach these distributions within the current setting. There is thus reason to consider such designs more often, since they offer a means of adjusting an important design feature at little or no cost in terms of error rate.
International Nuclear Information System (INIS)
Tai, Bee-Choo; Grundy, Richard; Machin, David
2011-01-01
Purpose: To accurately model the cumulative need for radiotherapy in trials designed to delay or avoid irradiation among children with malignant brain tumor, it is crucial to account for competing events and evaluate how each contributes to the timing of irradiation. An appropriate choice of statistical model is also important for adequate determination of sample size. Methods and Materials: We describe the statistical modeling of competing events (A, radiotherapy after progression; B, no radiotherapy after progression; and C, elective radiotherapy) using proportional cause-specific and subdistribution hazard functions. The procedures of sample size estimation based on each method are outlined. These are illustrated by use of data comparing children with ependymoma and other malignant brain tumors. The results from these two approaches are compared. Results: The cause-specific hazard analysis showed a reduction in hazards among infants with ependymoma for all event types, including Event A (adjusted cause-specific hazard ratio, 0.76; 95% confidence interval, 0.45-1.28). Conversely, the subdistribution hazard analysis suggested an increase in hazard for Event A (adjusted subdistribution hazard ratio, 1.35; 95% confidence interval, 0.80-2.30), but the reduction in hazards for Events B and C remained. Analysis based on subdistribution hazard requires a larger sample size than the cause-specific hazard approach. Conclusions: Notable differences in effect estimates and anticipated sample size were observed between methods when the main event showed a beneficial effect whereas the competing events showed an adverse effect on the cumulative incidence. The subdistribution hazard is the most appropriate for modeling treatment when its effects on both the main and competing events are of interest.
Walker, H. F.
1976-01-01
Likelihood equations determined by the two types of samples which are necessary conditions for a maximum-likelihood estimate are considered. These equations, suggest certain successive-approximations iterative procedures for obtaining maximum-likelihood estimates. These are generalized steepest ascent (deflected gradient) procedures. It is shown that, with probability 1 as N sub 0 approaches infinity (regardless of the relative sizes of N sub 0 and N sub 1, i=1,...,m), these procedures converge locally to the strongly consistent maximum-likelihood estimates whenever the step size is between 0 and 2. Furthermore, the value of the step size which yields optimal local convergence rates is bounded from below by a number which always lies between 1 and 2.
On efficiency of some ratio estimators in double sampling design ...
African Journals Online (AJOL)
In this paper, three sampling ratio estimators in double sampling design were proposed with the intention of finding an alternative double sampling design estimator to the conventional ratio estimator in double sampling design discussed by Cochran (1997), Okafor (2002) , Raj (1972) and Raj and Chandhok (1999).
Angly, Florent E.; Willner, Dana; Prieto-Dav?, Alejandra; Edwards, Robert A.; Schmieder, Robert; Vega-Thurber, Rebecca; Antonopoulos, Dionysios A.; Barott, Katie; Cottrell, Matthew T.; Desnues, Christelle; Dinsdale, Elizabeth A.; Furlan, Mike; Haynes, Matthew; Henn, Matthew R.; Hu, Yongfei
2009-01-01
Metagenomic studies characterize both the composition and diversity of uncultured viral and microbial communities. BLAST-based comparisons have typically been used for such analyses; however, sampling biases, high percentages of unknown sequences, and the use of arbitrary thresholds to find significant similarities can decrease the accuracy and validity of estimates. Here, we present Genome relative Abundance and Average Size (GAAS), a complete software package that provides improved estimate...
Lee, Christina D; Chae, Junghoon; Schap, TusaRebecca E; Kerr, Deborah A; Delp, Edward J; Ebert, David S; Boushey, Carol J
2012-03-01
Diet is a critical element of diabetes self-management. An emerging area of research is the use of images for dietary records using mobile telephones with embedded cameras. These tools are being designed to reduce user burden and to improve accuracy of portion-size estimation through automation. The objectives of this study were to (1) assess the error of automatically determined portion weights compared to known portion weights of foods and (2) to compare the error between automation and human. Adolescents (n = 15) captured images of their eating occasions over a 24 h period. All foods and beverages served were weighed. Adolescents self-reported portion sizes for one meal. Image analysis was used to estimate portion weights. Data analysis compared known weights, automated weights, and self-reported portions. For the 19 foods, the mean ratio of automated weight estimate to known weight ranged from 0.89 to 4.61, and 9 foods were within 0.80 to 1.20. The largest error was for lettuce and the most accurate was strawberry jam. The children were fairly accurate with portion estimates for two foods (sausage links, toast) using one type of estimation aid and two foods (sausage links, scrambled eggs) using another aid. The automated method was fairly accurate for two foods (sausage links, jam); however, the 95% confidence intervals for the automated estimates were consistently narrower than human estimates. The ability of humans to estimate portion sizes of foods remains a problem and a perceived burden. Errors in automated portion-size estimation can be systematically addressed while minimizing the burden on people. Future applications that take over the burden of these processes may translate to better diabetes self-management. © 2012 Diabetes Technology Society.
Dependence of fracture mechanical and fluid flow properties on fracture roughness and sample size
International Nuclear Information System (INIS)
Tsang, Y.W.; Witherspoon, P.A.
1983-01-01
A parameter study has been carried out to investigate the interdependence of mechanical and fluid flow properties of fractures with fracture roughness and sample size. A rough fracture can be defined mathematically in terms of its aperture density distribution. Correlations were found between the shapes of the aperture density distribution function and the specific fractures of the stress-strain behavior and fluid flow characteristics. Well-matched fractures had peaked aperture distributions that resulted in very nonlinear stress-strain behavior. With an increasing degree of mismatching between the top and bottom of a fracture, the aperture density distribution broadened and the nonlinearity of the stress-strain behavior became less accentuated. The different aperture density distributions also gave rise to qualitatively different fluid flow behavior. Findings from this investigation make it possible to estimate the stress-strain and fluid flow behavior when the roughness characteristics of the fracture are known and, conversely, to estimate the fracture roughness from an examination of the hydraulic and mechanical data. Results from this study showed that both the mechanical and hydraulic properties of the fracture are controlled by the large-scale roughness of the joint surface. This suggests that when the stress-flow behavior of a fracture is being investigated, the size of the rock sample should be larger than the typical wave length of the roughness undulations
Directory of Open Access Journals (Sweden)
John A Sved
Full Text Available There is a substantial literature on the use of linkage disequilibrium (LD to estimate effective population size using unlinked loci. The Ne estimates are extremely sensitive to the sampling process, and there is currently no theory to cope with the possible biases. We derive formulae for the analysis of idealised populations mating at random with multi-allelic (microsatellite loci. The 'Burrows composite index' is introduced in a novel way with a 'composite haplotype table'. We show that in a sample of diploid size S, the mean value of x2 or r2 from the composite haplotype table is biased by a factor of 1-1/(2S-12, rather than the usual factor 1+1/(2S-1 for a conventional haplotype table. But analysis of population data using these formulae leads to Ne estimates that are unrealistically low. We provide theory and simulation to show that this bias towards low Ne estimates is due to null alleles, and introduce a randomised permutation correction to compensate for the bias. We also consider the effect of introducing a within-locus disequilibrium factor to r2, and find that this factor leads to a bias in the Ne estimate. However this bias can be overcome using the same randomised permutation correction, to yield an altered r2 with lower variance than the original r2, and one that is also insensitive to null alleles. The resulting formulae are used to provide Ne estimates on 40 samples of the Queensland fruit fly, Bactrocera tryoni, from populations with widely divergent Ne expectations. Linkage relationships are known for most of the microsatellite loci in this species. We find that there is little difference in the estimated Ne values from using known unlinked loci as compared to using all loci, which is important for conservation studies where linkage relationships are unknown.
Sved, John A; Cameron, Emilie C; Gilchrist, A Stuart
2013-01-01
There is a substantial literature on the use of linkage disequilibrium (LD) to estimate effective population size using unlinked loci. The Ne estimates are extremely sensitive to the sampling process, and there is currently no theory to cope with the possible biases. We derive formulae for the analysis of idealised populations mating at random with multi-allelic (microsatellite) loci. The 'Burrows composite index' is introduced in a novel way with a 'composite haplotype table'. We show that in a sample of diploid size S, the mean value of x2 or r2 from the composite haplotype table is biased by a factor of 1-1/(2S-1)2, rather than the usual factor 1+1/(2S-1) for a conventional haplotype table. But analysis of population data using these formulae leads to Ne estimates that are unrealistically low. We provide theory and simulation to show that this bias towards low Ne estimates is due to null alleles, and introduce a randomised permutation correction to compensate for the bias. We also consider the effect of introducing a within-locus disequilibrium factor to r2, and find that this factor leads to a bias in the Ne estimate. However this bias can be overcome using the same randomised permutation correction, to yield an altered r2 with lower variance than the original r2, and one that is also insensitive to null alleles. The resulting formulae are used to provide Ne estimates on 40 samples of the Queensland fruit fly, Bactrocera tryoni, from populations with widely divergent Ne expectations. Linkage relationships are known for most of the microsatellite loci in this species. We find that there is little difference in the estimated Ne values from using known unlinked loci as compared to using all loci, which is important for conservation studies where linkage relationships are unknown.
Hare, Matthew P; Nunney, Leonard; Schwartz, Michael K; Ruzzante, Daniel E; Burford, Martha; Waples, Robin S; Ruegg, Kristen; Palstra, Friso
2011-06-01
Effective population size (N(e)) determines the strength of genetic drift in a population and has long been recognized as an important parameter for evaluating conservation status and threats to genetic health of populations. Specifically, an estimate of N(e) is crucial to management because it integrates genetic effects with the life history of the species, allowing for predictions of a population's current and future viability. Nevertheless, compared with ecological and demographic parameters, N(e) has had limited influence on species management, beyond its application in very small populations. Recent developments have substantially improved N(e) estimation; however, some obstacles remain for the practical application of N(e) estimates. For example, the need to define the spatial and temporal scale of measurement makes the concept complex and sometimes difficult to interpret. We reviewed approaches to estimation of N(e) over both long-term and contemporary time frames, clarifying their interpretations with respect to local populations and the global metapopulation. We describe multiple experimental factors affecting robustness of contemporary N(e) estimates and suggest that different sampling designs can be combined to compare largely independent measures of N(e) for improved confidence in the result. Large populations with moderate gene flow pose the greatest challenges to robust estimation of contemporary N(e) and require careful consideration of sampling and analysis to minimize estimator bias. We emphasize the practical utility of estimating N(e) by highlighting its relevance to the adaptive potential of a population and describing applications in management of marine populations, where the focus is not always on critically endangered populations. Two cases discussed include the mechanisms generating N(e) estimates many orders of magnitude lower than census N in harvested marine fishes and the predicted reduction in N(e) from hatchery-based population
Sample Size and Saturation in PhD Studies Using Qualitative Interviews
Directory of Open Access Journals (Sweden)
Mark Mason
2010-08-01
Full Text Available A number of issues can affect sample size in qualitative research; however, the guiding principle should be the concept of saturation. This has been explored in detail by a number of authors but is still hotly debated, and some say little understood. A sample of PhD studies using qualitative approaches, and qualitative interviews as the method of data collection was taken from theses.com and contents analysed for their sample sizes. Five hundred and sixty studies were identified that fitted the inclusion criteria. Results showed that the mean sample size was 31; however, the distribution was non-random, with a statistically significant proportion of studies, presenting sample sizes that were multiples of ten. These results are discussed in relation to saturation. They suggest a pre-meditated approach that is not wholly congruent with the principles of qualitative research. URN: urn:nbn:de:0114-fqs100387
R. L. Czaplewski
2009-01-01
The minimum variance multivariate composite estimator is a relatively simple sequential estimator for complex sampling designs (Czaplewski 2009). Such designs combine a probability sample of expensive field data with multiple censuses and/or samples of relatively inexpensive multi-sensor, multi-resolution remotely sensed data. Unfortunately, the multivariate composite...
International Nuclear Information System (INIS)
Ferretti, M.; Brambilla, E.; Brunialti, G.; Fornasier, F.; Mazzali, C.; Giordani, P.; Nimis, P.L.
2004-01-01
Sampling requirements related to lichen biomonitoring include optimal sampling density for obtaining precise and unbiased estimates of population parameters and maps of known reliability. Two available datasets on a sub-national scale in Italy were used to determine a cost-effective sampling density to be adopted in medium-to-large-scale biomonitoring studies. As expected, the relative error in the mean Lichen Biodiversity (Italian acronym: BL) values and the error associated with the interpolation of BL values for (unmeasured) grid cells increased as the sampling density decreased. However, the increase in size of the error was not linear and even a considerable reduction (up to 50%) in the original sampling effort led to a far smaller increase in errors in the mean estimates (<6%) and in mapping (<18%) as compared with the original sampling densities. A reduction in the sampling effort can result in considerable savings of resources, which can then be used for a more detailed investigation of potentially problematic areas. It is, however, necessary to decide the acceptable level of precision at the design stage of the investigation, so as to select the proper sampling density. - An acceptable level of precision must be decided before determining a sampling design
Estimation of population mean under systematic sampling
Noor-ul-amin, Muhammad; Javaid, Amjad
2017-11-01
In this study we propose a generalized ratio estimator under non-response for systematic random sampling. We also generate a class of estimators through special cases of generalized estimator using different combinations of coefficients of correlation, kurtosis and variation. The mean square errors and mathematical conditions are also derived to prove the efficiency of proposed estimators. Numerical illustration is included using three populations to support the results.
Network Structure and Biased Variance Estimation in Respondent Driven Sampling.
Verdery, Ashton M; Mouw, Ted; Bauldry, Shawn; Mucha, Peter J
2015-01-01
This paper explores bias in the estimation of sampling variance in Respondent Driven Sampling (RDS). Prior methodological work on RDS has focused on its problematic assumptions and the biases and inefficiencies of its estimators of the population mean. Nonetheless, researchers have given only slight attention to the topic of estimating sampling variance in RDS, despite the importance of variance estimation for the construction of confidence intervals and hypothesis tests. In this paper, we show that the estimators of RDS sampling variance rely on a critical assumption that the network is First Order Markov (FOM) with respect to the dependent variable of interest. We demonstrate, through intuitive examples, mathematical generalizations, and computational experiments that current RDS variance estimators will always underestimate the population sampling variance of RDS in empirical networks that do not conform to the FOM assumption. Analysis of 215 observed university and school networks from Facebook and Add Health indicates that the FOM assumption is violated in every empirical network we analyze, and that these violations lead to substantially biased RDS estimators of sampling variance. We propose and test two alternative variance estimators that show some promise for reducing biases, but which also illustrate the limits of estimating sampling variance with only partial information on the underlying population social network.
Prasifka, Jarrad R; Lopez, Miriam D; Hellmich, Richard L; Prasifka, Patricia L
2008-01-01
Estimates of arthropod population size may paradoxically increase following insecticide applications. Research with ground beetles (Coleoptera: Carabidae) suggests that such unusual results reflect increased arthropod movement and capture in traps rather than real changes in population size. However, it is unclear whether direct (hyperactivity) or indirect (prey-mediated) mechanisms produce increased movement. Video tracking of Scarites quadriceps Chaudior indicated that brief exposure to lambda-cyhalothrin or tefluthrin increased total distance moved, maximum velocity and percentage of time moving. Repeated measurements on individual beetles indicated that movement decreased 240 min after initial lambda-cyhalothrin exposure, but increased again following a second exposure, suggesting hyperactivity could lead to increased trap captures in the field. Two field experiments in which ground beetles were collected after lambda-cyhalothrin or permethrin application attempted to detect increases in population size estimates as a result of hyperactivity. Field trials used mark-release-recapture methods in small plots and natural carabid populations in larger plots, but found no significant short-term (<6 day) increases in beetle trap captures. The disagreement between laboratory and field results suggests mechanisms other than hyperactivity may better explain unusual changes in population size estimates. When traps are used as a primary sampling tool, unexpected population-level effects should be interpreted carefully or with additional data less influenced by arthropod activity.
Mean size estimation yields left-side bias: Role of attention on perceptual averaging.
Li, Kuei-An; Yeh, Su-Ling
2017-11-01
The human visual system can estimate mean size of a set of items effectively; however, little is known about whether information on each visual field contributes equally to the mean size estimation. In this study, we examined whether a left-side bias (LSB)-perceptual judgment tends to depend more heavily on left visual field's inputs-affects mean size estimation. Participants were instructed to estimate the mean size of 16 spots. In half of the trials, the mean size of the spots on the left side was larger than that on the right side (the left-larger condition) and vice versa (the right-larger condition). Our results illustrated an LSB: A larger estimated mean size was found in the left-larger condition than in the right-larger condition (Experiment 1), and the LSB vanished when participants' attention was effectively cued to the right side (Experiment 2b). Furthermore, the magnitude of LSB increased with stimulus-onset asynchrony (SOA), when spots on the left side were presented earlier than the right side. In contrast, the LSB vanished and then induced a reversed effect with SOA when spots on the right side were presented earlier (Experiment 3). This study offers the first piece of evidence suggesting that LSB does have a significant influence on mean size estimation of a group of items, which is induced by a leftward attentional bias that enhances the prior entry effect on the left side.
Energy Technology Data Exchange (ETDEWEB)
Larson, David B. [Stanford University School of Medicine, Department of Radiology, Stanford, CA (United States)
2014-10-15
The principle of ALARA (dose as low as reasonably achievable) calls for dose optimization rather than dose reduction, per se. Optimization of CT radiation dose is accomplished by producing images of acceptable diagnostic image quality using the lowest dose method available. Because it is image quality that constrains the dose, CT dose optimization is primarily a problem of image quality rather than radiation dose. Therefore, the primary focus in CT radiation dose optimization should be on image quality. However, no reliable direct measure of image quality has been developed for routine clinical practice. Until such measures become available, size-specific dose estimates (SSDE) can be used as a reasonable image-quality estimate. The SSDE method of radiation dose optimization for CT abdomen and pelvis consists of plotting SSDE for a sample of examinations as a function of patient size, establishing an SSDE threshold curve based on radiologists' assessment of image quality, and modifying protocols to consistently produce doses that are slightly above the threshold SSDE curve. Challenges in operationalizing CT radiation dose optimization include data gathering and monitoring, managing the complexities of the numerous protocols, scanners and operators, and understanding the relationship of the automated tube current modulation (ATCM) parameters to image quality. Because CT manufacturers currently maintain their ATCM algorithms as secret for proprietary reasons, prospective modeling of SSDE for patient populations is not possible without reverse engineering the ATCM algorithm and, hence, optimization by this method requires a trial-and-error approach. (orig.)
Bhaskar, Anand; Wang, Y X Rachel; Song, Yun S
2015-02-01
With the recent increase in study sample sizes in human genetics, there has been growing interest in inferring historical population demography from genomic variation data. Here, we present an efficient inference method that can scale up to very large samples, with tens or hundreds of thousands of individuals. Specifically, by utilizing analytic results on the expected frequency spectrum under the coalescent and by leveraging the technique of automatic differentiation, which allows us to compute gradients exactly, we develop a very efficient algorithm to infer piecewise-exponential models of the historical effective population size from the distribution of sample allele frequencies. Our method is orders of magnitude faster than previous demographic inference methods based on the frequency spectrum. In addition to inferring demography, our method can also accurately estimate locus-specific mutation rates. We perform extensive validation of our method on simulated data and show that it can accurately infer multiple recent epochs of rapid exponential growth, a signal that is difficult to pick up with small sample sizes. Lastly, we use our method to analyze data from recent sequencing studies, including a large-sample exome-sequencing data set of tens of thousands of individuals assayed at a few hundred genic regions. © 2015 Bhaskar et al.; Published by Cold Spring Harbor Laboratory Press.
Directory of Open Access Journals (Sweden)
Femke Broekhuis
Full Text Available Many ecological theories and species conservation programmes rely on accurate estimates of population density. Accurate density estimation, especially for species facing rapid declines, requires the application of rigorous field and analytical methods. However, obtaining accurate density estimates of carnivores can be challenging as carnivores naturally exist at relatively low densities and are often elusive and wide-ranging. In this study, we employ an unstructured spatial sampling field design along with a Bayesian sex-specific spatially explicit capture-recapture (SECR analysis, to provide the first rigorous population density estimates of cheetahs (Acinonyx jubatus in the Maasai Mara, Kenya. We estimate adult cheetah density to be between 1.28 ± 0.315 and 1.34 ± 0.337 individuals/100km2 across four candidate models specified in our analysis. Our spatially explicit approach revealed 'hotspots' of cheetah density, highlighting that cheetah are distributed heterogeneously across the landscape. The SECR models incorporated a movement range parameter which indicated that male cheetah moved four times as much as females, possibly because female movement was restricted by their reproductive status and/or the spatial distribution of prey. We show that SECR can be used for spatially unstructured data to successfully characterise the spatial distribution of a low density species and also estimate population density when sample size is small. Our sampling and modelling framework will help determine spatial and temporal variation in cheetah densities, providing a foundation for their conservation and management. Based on our results we encourage other researchers to adopt a similar approach in estimating densities of individually recognisable species.
Broekhuis, Femke; Gopalaswamy, Arjun M
2016-01-01
Many ecological theories and species conservation programmes rely on accurate estimates of population density. Accurate density estimation, especially for species facing rapid declines, requires the application of rigorous field and analytical methods. However, obtaining accurate density estimates of carnivores can be challenging as carnivores naturally exist at relatively low densities and are often elusive and wide-ranging. In this study, we employ an unstructured spatial sampling field design along with a Bayesian sex-specific spatially explicit capture-recapture (SECR) analysis, to provide the first rigorous population density estimates of cheetahs (Acinonyx jubatus) in the Maasai Mara, Kenya. We estimate adult cheetah density to be between 1.28 ± 0.315 and 1.34 ± 0.337 individuals/100km2 across four candidate models specified in our analysis. Our spatially explicit approach revealed 'hotspots' of cheetah density, highlighting that cheetah are distributed heterogeneously across the landscape. The SECR models incorporated a movement range parameter which indicated that male cheetah moved four times as much as females, possibly because female movement was restricted by their reproductive status and/or the spatial distribution of prey. We show that SECR can be used for spatially unstructured data to successfully characterise the spatial distribution of a low density species and also estimate population density when sample size is small. Our sampling and modelling framework will help determine spatial and temporal variation in cheetah densities, providing a foundation for their conservation and management. Based on our results we encourage other researchers to adopt a similar approach in estimating densities of individually recognisable species.
Directory of Open Access Journals (Sweden)
Ming-Yen Tsai
Full Text Available OBJECTIVES: The Meridian Energy Analysis Device is currently a popular tool in the scientific research of meridian electrophysiology. In this field, it is generally believed that measuring the electrical conductivity of meridians provides information about the balance of bioenergy or Qi-blood in the body. METHODS AND RESULTS: PubMed database based on some original articles from 1956 to 2014 and the authoŕs clinical experience. In this short communication, we provide clinical examples of Meridian Energy Analysis Device application, especially in the field of traditional Chinese medicine, discuss the reliability of the measurements, and put the values obtained into context by considering items of considerable variability and by estimating sample size. CONCLUSION: The Meridian Energy Analysis Device is making a valuable contribution to the diagnosis of Qi-blood dysfunction. It can be assessed from short-term and long-term meridian bioenergy recordings. It is one of the few methods that allow outpatient traditional Chinese medicine diagnosis, monitoring the progress, therapeutic effect and evaluation of patient prognosis. The holistic approaches underlying the practice of traditional Chinese medicine and new trends in modern medicine toward the use of objective instruments require in-depth knowledge of the mechanisms of meridian energy, and the Meridian Energy Analysis Device can feasibly be used for understanding and interpreting traditional Chinese medicine theory, especially in view of its expansion in Western countries.
Sample size allocation in multiregional equivalence studies.
Liao, Jason J Z; Yu, Ziji; Li, Yulan
2018-06-17
With the increasing globalization of drug development, the multiregional clinical trial (MRCT) has gained extensive use. The data from MRCTs could be accepted by regulatory authorities across regions and countries as the primary sources of evidence to support global marketing drug approval simultaneously. The MRCT can speed up patient enrollment and drug approval, and it makes the effective therapies available to patients all over the world simultaneously. However, there are many challenges both operationally and scientifically in conducting a drug development globally. One of many important questions to answer for the design of a multiregional study is how to partition sample size into each individual region. In this paper, two systematic approaches are proposed for the sample size allocation in a multiregional equivalence trial. A numerical evaluation and a biosimilar trial are used to illustrate the characteristics of the proposed approaches. Copyright © 2018 John Wiley & Sons, Ltd.
Moustakas, Aristides; Evans, Matthew R
2015-02-28
Plant survival is a key factor in forest dynamics and survival probabilities often vary across life stages. Studies specifically aimed at assessing tree survival are unusual and so data initially designed for other purposes often need to be used; such data are more likely to contain errors than data collected for this specific purpose. We investigate the survival rates of ten tree species in a dataset designed to monitor growth rates. As some individuals were not included in the census at some time points we use capture-mark-recapture methods both to allow us to account for missing individuals, and to estimate relocation probabilities. Growth rates, size, and light availability were included as covariates in the model predicting survival rates. The study demonstrates that tree mortality is best described as constant between years and size-dependent at early life stages and size independent at later life stages for most species of UK hardwood. We have demonstrated that even with a twenty-year dataset it is possible to discern variability both between individuals and between species. Our work illustrates the potential utility of the method applied here for calculating plant population dynamics parameters in time replicated datasets with small sample sizes and missing individuals without any loss of sample size, and including explanatory covariates.
Estimating minimum polycrystalline aggregate size for macroscopic material homogeneity
International Nuclear Information System (INIS)
Kovac, M.; Simonovski, I.; Cizelj, L.
2002-01-01
During severe accidents the pressure boundary of reactor coolant system can be subjected to extreme loadings, which might cause failure. Reliable estimation of the extreme deformations can be crucial to determine the consequences of severe accidents. Important drawback of classical continuum mechanics is idealization of inhomogenous microstructure of materials. Classical continuum mechanics therefore cannot predict accurately the differences between measured responses of specimens, which are different in size but geometrical similar (size effect). A numerical approach, which models elastic-plastic behavior on mesoscopic level, is proposed to estimate minimum size of polycrystalline aggregate above which it can be considered macroscopically homogeneous. The main idea is to divide continuum into a set of sub-continua. Analysis of macroscopic element is divided into modeling the random grain structure (using Voronoi tessellation and random orientation of crystal lattice) and calculation of strain/stress field. Finite element method is used to obtain numerical solutions of strain and stress fields. The analysis is limited to 2D models.(author)
Webometrics: Some Critical Issues of WWW Size Estimation Methods
Directory of Open Access Journals (Sweden)
Srinivasan Mohana Arunachalam
2018-04-01
Full Text Available The number of webpages in the Internet has increased tremendously over the last two decades however only a part of it is indexed by various search engines. This small portion is the indexable web of the Internet and can be usually reachable from a Search Engine. Search engines play a big role in making the World Wide Web accessible to the end user, and how much of the World Wide Web is accessible on the size of the search engine’s index. Researchers have proposed several ways to estimate this size of the indexable web using search engines with and without privileged access to the search engine’s database. Our report provides a summary of methods used in the last two decades to estimate the size of the World Wide Web, as well as describe how this knowledge can be used in other aspects/tasks concerning the World Wide Web.
Directory of Open Access Journals (Sweden)
Jun Wang
Full Text Available Men who have sex with men (MSM are at high risk of HIV infection. For developing proper interventions, it is important to know the size of MSM population. However, size estimation of MSM populations is still a significant public health challenge due to high cost, hard to reach and stigma associated with the population.We aimed to estimate the social network size (c value in general population and the size of MSM population in Shanghai, China by using the net work scale-up method.A multistage random sampling was used to recruit participants aged from 18 to 60 years who had lived in Shanghai for at least 6 months. The "known population method" with adjustment of backward estimation and regression model was applied to estimate the c value. And the MSM population size was further estimated using an adjusted c value taking into account for the transmission effect through social respect level towards MSM.A total of 4017 participants were contacted for an interview, and 3907 participants met the inclusion criterion. The social network size (c value of participants was 236 after adjustment. The estimated size of MSM was 36354 (95% CI: 28489-44219 for the male Shanghaies aged 18 to 60 years, and the proportion of MSM among the total male population aged 18 to 60 years in Shanghai was 0.28%.We employed the network scale-up method and used a wide range of data sources to estimate the size of MSM population in Shanghai, which is useful for HIV prevention and intervention among the target population.
Estimation of myocardial infarct size by vectocardiography
International Nuclear Information System (INIS)
Takimiya, Akihiko
1987-01-01
Correlations between the vectorcardiogram (VCG) indice and infarct size (% defect) obtained from myocardial emission computed tomography with thallium-201 were studied in 45 patients with old infero-posterior myocardial infarction. The patients were divided into two groups, one consisting of eight patients who showed abnormal superior deviation of the QRS loop in a counterclockwise rotation beyond 30 msec in the frontal plane of VCG (referred to hereafter as CCW group), and another a non-CCW group consisting of 37 patients. The results obtained were as follows. (1) In the non-CCW group, there were significant negative correlations between the elevation and the Y-axial component of each instantaneous vector of the QRS loop at 30 msec, 35 msec, 40 msec, 45 msec, and between the Y-axial component of 50 msec instantaneous vector and the % defect. The correlation for both the elevation and the Y-axial component was closest at 40 msec, and there was most significantly close correlation between the elevation of 40 msec instantaneous vector and the % defect. (2) In the non-CCW group, there was also a significant correlation between the elevation of QRS area vector and the % defect. (3) In the CCW group, the infarct size could be estimated by the elevation of 30 msec instantaneous vector. An association with left anterior fascicular block was also indicated in the CCW group. (4) In infero-posterior myocardial infarction, the infarct size can be estimated using these VCG indices. (author)
A simple method for estimating the size of nuclei on fractal surfaces
Zeng, Qiang
2017-10-01
Determining the size of nuclei on complex surfaces remains a big challenge in aspects of biological, material and chemical engineering. Here the author reported a simple method to estimate the size of the nuclei in contact with complex (fractal) surfaces. The established approach was based on the assumptions of contact area proportionality for determining nucleation density and the scaling congruence between nuclei and surfaces for identifying contact regimes. It showed three different regimes governing the equations for estimating the nucleation site density. Nuclei in the size large enough could eliminate the effect of fractal structure. Nuclei in the size small enough could lead to the independence of nucleation site density on fractal parameters. Only when nuclei match the fractal scales, the nucleation site density is associated with the fractal parameters and the size of the nuclei in a coupling pattern. The method was validated by the experimental data reported in the literature. The method may provide an effective way to estimate the size of nuclei on fractal surfaces, through which a number of promising applications in relative fields can be envisioned.
Variance of discharge estimates sampled using acoustic Doppler current profilers from moving boats
Garcia, Carlos M.; Tarrab, Leticia; Oberg, Kevin; Szupiany, Ricardo; Cantero, Mariano I.
2012-01-01
This paper presents a model for quantifying the random errors (i.e., variance) of acoustic Doppler current profiler (ADCP) discharge measurements from moving boats for different sampling times. The model focuses on the random processes in the sampled flow field and has been developed using statistical methods currently available for uncertainty analysis of velocity time series. Analysis of field data collected using ADCP from moving boats from three natural rivers of varying sizes and flow conditions shows that, even though the estimate of the integral time scale of the actual turbulent flow field is larger than the sampling interval, the integral time scale of the sampled flow field is on the order of the sampling interval. Thus, an equation for computing the variance error in discharge measurements associated with different sampling times, assuming uncorrelated flow fields is appropriate. The approach is used to help define optimal sampling strategies by choosing the exposure time required for ADCPs to accurately measure flow discharge.
Sample Size Induced Brittle-to-Ductile Transition of Single-Crystal Aluminum Nitride
2015-08-01
ARL-RP-0528 ● AUG 2015 US Army Research Laboratory Sample Size Induced Brittle-to- Ductile Transition of Single-Crystal Aluminum...originator. ARL-RP-0528 ● AUG 2015 US Army Research Laboratory Sample Size Induced Brittle-to- Ductile Transition of Single-Crystal...Sample Size Induced Brittle-to- Ductile Transition of Single-Crystal Aluminum Nitride 5a. CONTRACT NUMBER 5b. GRANT NUMBER 5c. PROGRAM ELEMENT
Kamath, Pauline L.; Haroldson, Mark A.; Luikart, Gordon; Paetkau, David; Whitman, Craig L.; van Manen, Frank T.
2015-01-01
Effective population size (Ne) is a key parameter for monitoring the genetic health of threatened populations because it reflects a population's evolutionary potential and risk of extinction due to genetic stochasticity. However, its application to wildlife monitoring has been limited because it is difficult to measure in natural populations. The isolated and well-studied population of grizzly bears (Ursus arctos) in the Greater Yellowstone Ecosystem provides a rare opportunity to examine the usefulness of different Ne estimators for monitoring. We genotyped 729 Yellowstone grizzly bears using 20 microsatellites and applied three single-sample estimators to examine contemporary trends in generation interval (GI), effective number of breeders (Nb) and Ne during 1982–2007. We also used multisample methods to estimate variance (NeV) and inbreeding Ne (NeI). Single-sample estimates revealed positive trajectories, with over a fourfold increase in Ne (≈100 to 450) and near doubling of the GI (≈8 to 14) from the 1980s to 2000s. NeV (240–319) and NeI (256) were comparable with the harmonic mean single-sample Ne (213) over the time period. Reanalysing historical data, we found NeV increased from ≈80 in the 1910s–1960s to ≈280 in the contemporary population. The estimated ratio of effective to total census size (Ne/Nc) was stable and high (0.42–0.66) compared to previous brown bear studies. These results support independent demographic evidence for Yellowstone grizzly bear population growth since the 1980s. They further demonstrate how genetic monitoring of Ne can complement demographic-based monitoring of Nc and vital rates, providing a valuable tool for wildlife managers.
Gregory, T Ryan; Nathwani, Paula; Bonnett, Tiffany R; Huber, Dezene P W
2013-09-01
A study was undertaken to evaluate both a pre-existing method and a newly proposed approach for the estimation of nuclear genome sizes in arthropods. First, concerns regarding the reliability of the well-established method of flow cytometry relating to impacts of rearing conditions on genome size estimates were examined. Contrary to previous reports, a more carefully controlled test found negligible environmental effects on genome size estimates in the fly Drosophila melanogaster. Second, a more recently touted method based on quantitative real-time PCR (qPCR) was examined in terms of ease of use, efficiency, and (most importantly) accuracy using four test species: the flies Drosophila melanogaster and Musca domestica and the beetles Tribolium castaneum and Dendroctonus ponderosa. The results of this analysis demonstrated that qPCR has the tendency to produce substantially different genome size estimates from other established techniques while also being far less efficient than existing methods.
Estimation After a Group Sequential Trial.
Milanzi, Elasma; Molenberghs, Geert; Alonso, Ariel; Kenward, Michael G; Tsiatis, Anastasios A; Davidian, Marie; Verbeke, Geert
2015-10-01
Group sequential trials are one important instance of studies for which the sample size is not fixed a priori but rather takes one of a finite set of pre-specified values, dependent on the observed data. Much work has been devoted to the inferential consequences of this design feature. Molenberghs et al (2012) and Milanzi et al (2012) reviewed and extended the existing literature, focusing on a collection of seemingly disparate, but related, settings, namely completely random sample sizes, group sequential studies with deterministic and random stopping rules, incomplete data, and random cluster sizes. They showed that the ordinary sample average is a viable option for estimation following a group sequential trial, for a wide class of stopping rules and for random outcomes with a distribution in the exponential family. Their results are somewhat surprising in the sense that the sample average is not optimal, and further, there does not exist an optimal, or even, unbiased linear estimator. However, the sample average is asymptotically unbiased, both conditionally upon the observed sample size as well as marginalized over it. By exploiting ignorability they showed that the sample average is the conventional maximum likelihood estimator. They also showed that a conditional maximum likelihood estimator is finite sample unbiased, but is less efficient than the sample average and has the larger mean squared error. Asymptotically, the sample average and the conditional maximum likelihood estimator are equivalent. This previous work is restricted, however, to the situation in which the the random sample size can take only two values, N = n or N = 2 n . In this paper, we consider the more practically useful setting of sample sizes in a the finite set { n 1 , n 2 , …, n L }. It is shown that the sample average is then a justifiable estimator , in the sense that it follows from joint likelihood estimation, and it is consistent and asymptotically unbiased. We also show why
Fischer, Jesse R.; Quist, Michael C.
2014-01-01
All freshwater fish sampling methods are biased toward particular species, sizes, and sexes and are further influenced by season, habitat, and fish behavior changes over time. However, little is known about gear-specific biases for many common fish species because few multiple-gear comparison studies exist that have incorporated seasonal dynamics. We sampled six lakes and impoundments representing a diversity of trophic and physical conditions in Iowa, USA, using multiple gear types (i.e., standard modified fyke net, mini-modified fyke net, sinking experimental gill net, bag seine, benthic trawl, boat-mounted electrofisher used diurnally and nocturnally) to determine the influence of sampling methodology and season on fisheries assessments. Specifically, we describe the influence of season on catch per unit effort, proportional size distribution, and the number of samples required to obtain 125 stock-length individuals for 12 species of recreational and ecological importance. Mean catch per unit effort generally peaked in the spring and fall as a result of increased sampling effectiveness in shallow areas and seasonal changes in habitat use (e.g., movement offshore during summer). Mean proportional size distribution decreased from spring to fall for white bass Morone chrysops, largemouth bass Micropterus salmoides, bluegill Lepomis macrochirus, and black crappie Pomoxis nigromaculatus, suggesting selectivity for large and presumably sexually mature individuals in the spring and summer. Overall, the mean number of samples required to sample 125 stock-length individuals was minimized in the fall with sinking experimental gill nets, a boat-mounted electrofisher used at night, and standard modified nets for 11 of the 12 species evaluated. Our results provide fisheries scientists with relative comparisons between several recommended standard sampling methods and illustrate the effects of seasonal variation on estimates of population indices that will be critical to
Sample size determination for logistic regression on a logit-normal distribution.
Kim, Seongho; Heath, Elisabeth; Heilbrun, Lance
2017-06-01
Although the sample size for simple logistic regression can be readily determined using currently available methods, the sample size calculation for multiple logistic regression requires some additional information, such as the coefficient of determination ([Formula: see text]) of a covariate of interest with other covariates, which is often unavailable in practice. The response variable of logistic regression follows a logit-normal distribution which can be generated from a logistic transformation of a normal distribution. Using this property of logistic regression, we propose new methods of determining the sample size for simple and multiple logistic regressions using a normal transformation of outcome measures. Simulation studies and a motivating example show several advantages of the proposed methods over the existing methods: (i) no need for [Formula: see text] for multiple logistic regression, (ii) available interim or group-sequential designs, and (iii) much smaller required sample size.
Particle size analysis in estimating the significance of airborne contamination
International Nuclear Information System (INIS)
1978-01-01
In this report information on pertinent methods and techniques for analysing particle size distributions is compiled. The principles underlying the measurement methods are described, and the merits of different methods in relation to the information being sought and to their usefulness in the laboratory and in the field are explained. Descriptions on sampling methods, gravitational and inertial particle separation methods, electrostatic sizing devices, diffusion batteries, optical sizing techniques and autoradiography are included. Finally, the report considers sampling for respirable activity and problems related to instrument calibration
Nomogram for sample size calculation on a straightforward basis for the kappa statistic.
Hong, Hyunsook; Choi, Yunhee; Hahn, Seokyung; Park, Sue Kyung; Park, Byung-Joo
2014-09-01
Kappa is a widely used measure of agreement. However, it may not be straightforward in some situation such as sample size calculation due to the kappa paradox: high agreement but low kappa. Hence, it seems reasonable in sample size calculation that the level of agreement under a certain marginal prevalence is considered in terms of a simple proportion of agreement rather than a kappa value. Therefore, sample size formulae and nomograms using a simple proportion of agreement rather than a kappa under certain marginal prevalences are proposed. A sample size formula was derived using the kappa statistic under the common correlation model and goodness-of-fit statistic. The nomogram for the sample size formula was developed using SAS 9.3. The sample size formulae using a simple proportion of agreement instead of a kappa statistic and nomograms to eliminate the inconvenience of using a mathematical formula were produced. A nomogram for sample size calculation with a simple proportion of agreement should be useful in the planning stages when the focus of interest is on testing the hypothesis of interobserver agreement involving two raters and nominal outcome measures. Copyright © 2014 Elsevier Inc. All rights reserved.
Plaisance, L.; Knowlton, N.; Paulay, G.; Meyer, C.
2009-12-01
The cryptofauna associated with coral reefs accounts for a major part of the biodiversity in these ecosystems but has been largely overlooked in biodiversity estimates because the organisms are hard to collect and identify. We combine a semi-quantitative sampling design and a DNA barcoding approach to provide metrics for the diversity of reef-associated crustacean. Twenty-two similar-sized dead heads of Pocillopora were sampled at 10 m depth from five central Pacific Ocean localities (four atolls in the Northern Line Islands and in Moorea, French Polynesia). All crustaceans were removed, and partial cytochrome oxidase subunit I was sequenced from 403 individuals, yielding 135 distinct taxa using a species-level criterion of 5% similarity. Most crustacean species were rare; 44% of the OTUs were represented by a single individual, and an additional 33% were represented by several specimens found only in one of the five localities. The Northern Line Islands and Moorea shared only 11 OTUs. Total numbers estimated by species richness statistics (Chao1 and ACE) suggest at least 90 species of crustaceans in Moorea and 150 in the Northern Line Islands for this habitat type. However, rarefaction curves for each region failed to approach an asymptote, and Chao1 and ACE estimators did not stabilize after sampling eight heads in Moorea, so even these diversity figures are underestimates. Nevertheless, even this modest sampling effort from a very limited habitat resulted in surprisingly high species numbers.
An Improvement to Interval Estimation for Small Samples
Directory of Open Access Journals (Sweden)
SUN Hui-Ling
2017-02-01
Full Text Available Because it is difficult and complex to determine the probability distribution of small samples，it is improper to use traditional probability theory to process parameter estimation for small samples. Bayes Bootstrap method is always used in the project. Although，the Bayes Bootstrap method has its own limitation，In this article an improvement is given to the Bayes Bootstrap method，This method extended the amount of samples by numerical simulation without changing the circumstances in a small sample of the original sample. And the new method can give the accurate interval estimation for the small samples. Finally，by using the Monte Carlo simulation to model simulation to the specific small sample problems. The effectiveness and practicability of the Improved-Bootstrap method was proved.
A geostatistical estimation of zinc grade in bore-core samples
International Nuclear Information System (INIS)
Starzec, A.
1987-01-01
Possibilities and preliminary results of geostatistical interpretation of the XRF determination of zinc in bore-core samples are considered. For the spherical model of the variogram the estimation variance of grade in a disk-shape sample (estimated from the grade on the circumference sample) is calculated. Variograms of zinc grade in core samples are presented and examples of the grade estimation are discussed. 4 refs., 7 figs., 1 tab. (author)
Sample size optimization in nuclear material control. 1
International Nuclear Information System (INIS)
Gladitz, J.
1982-01-01
Equations have been derived and exemplified which allow the determination of the minimum variables sample size for given false alarm and detection probabilities of nuclear material losses and diversions, respectively. (author)
Directory of Open Access Journals (Sweden)
Heinz Gallaun
2015-09-01
Full Text Available Land cover change processes are accelerating at the regional to global level. The remote sensing community has developed reliable and robust methods for wall-to-wall mapping of land cover changes; however, land cover changes often occur at rates below the mapping errors. In the current publication, we propose a cost-effective approach to complement wall-to-wall land cover change maps with a sampling approach, which is used for accuracy assessment and accurate estimation of areas undergoing land cover changes, including provision of confidence intervals. We propose a two-stage sampling approach in order to keep accuracy, efficiency, and effort of the estimations in balance. Stratification is applied in both stages in order to gain control over the sample size allocated to rare land cover change classes on the one hand and the cost constraints for very high resolution reference imagery on the other. Bootstrapping is used to complement the accuracy measures and the area estimates with confidence intervals. The area estimates and verification estimations rely on a high quality visual interpretation of the sampling units based on time series of satellite imagery. To demonstrate the cost-effective operational applicability of the approach we applied it for assessment of deforestation in an area characterized by frequent cloud cover and very low change rate in the Republic of Congo, which makes accurate deforestation monitoring particularly challenging.
The international food unit: a new measurement aid that can improve portion size estimation.
Bucher, T; Weltert, M; Rollo, M E; Smith, S P; Jia, W; Collins, C E; Sun, M
2017-09-12
Portion size education tools, aids and interventions can be effective in helping prevent weight gain. However consumers have difficulties in estimating food portion sizes and are confused by inconsistencies in measurement units and terminologies currently used. Visual cues are an important mediator of portion size estimation, but standardized measurement units are required. In the current study, we present a new food volume estimation tool and test the ability of young adults to accurately quantify food volumes. The International Food Unit™ (IFU™) is a 4x4x4 cm cube (64cm 3 ), subdivided into eight 2 cm sub-cubes for estimating smaller food volumes. Compared with currently used measures such as cups and spoons, the IFU™ standardizes estimation of food volumes with metric measures. The IFU™ design is based on binary dimensional increments and the cubic shape facilitates portion size education and training, memory and recall, and computer processing which is binary in nature. The performance of the IFU™ was tested in a randomized between-subject experiment (n = 128 adults, 66 men) that estimated volumes of 17 foods using four methods; the IFU™ cube, a deformable modelling clay cube, a household measuring cup or no aid (weight estimation). Estimation errors were compared between groups using Kruskall-Wallis tests and post-hoc comparisons. Estimation errors differed significantly between groups (H(3) = 28.48, p studies should investigate whether the IFU™ can facilitate portion size training and whether portion size education using the IFU™ is effective and sustainable without the aid. A 3-dimensional IFU™ could serve as a reference object for estimating food volume.
De Keyzer, Willem; Huybrechts, Inge; De Maeyer, Mieke; Ocké, Marga; Slimani, Nadia; van 't Veer, Pieter; De Henauw, Stefaan
2011-04-01
Food photographs are widely used as instruments to estimate portion sizes of consumed foods. Several food atlases are available, all developed to be used in a specific context and for a given study population. Frequently, food photographs are adopted for use in other studies with a different context or another study population. In the present study, errors in portion size estimation of bread, margarine on bread and beverages by two-dimensional models used in the context of a Belgian food consumption survey are investigated. A sample of 111 men and women (age 45-65 years) were invited for breakfast; two test groups were created. One group was asked to estimate portion sizes of consumed foods using photographs 1-2 d after consumption, and a second group was asked the same after 4 d. Also, real-time assessment of portion sizes using photographs was performed. At the group level, large overestimation of margarine, acceptable underestimation of bread and only small estimation errors for beverages were found. Women tended to have smaller estimation errors for bread and margarine compared with men, while the opposite was found for beverages. Surprisingly, no major difference in estimation error was found after 4 d compared with 1-2 d. Individual estimation errors were large for all foods. The results from the present study suggest that the use of food photographs for portion size estimation of bread and beverages is acceptable for use in nutrition surveys. For photographs of margarine on bread, further validation using smaller amounts corresponding to actual consumption is recommended.
Impact of shoe size in a sample of elderly individuals
Directory of Open Access Journals (Sweden)
Daniel López-López
Full Text Available Summary Introduction: The use of an improper shoe size is common in older people and is believed to have a detrimental effect on the quality of life related to foot health. The objective is to describe and compare, in a sample of participants, the impact of shoes that fit properly or improperly, as well as analyze the scores related to foot health and health overall. Method: A sample of 64 participants, with a mean age of 75.3±7.9 years, attended an outpatient center where self-report data was recorded, the measurements of the size of the feet and footwear were determined and the scores compared between the group that wears the correct size of shoes and another group of individuals who do not wear the correct size of shoes, using the Spanish version of the Foot Health Status Questionnaire. Results: The group wearing an improper shoe size showed poorer quality of life regarding overall health and specifically foot health. Differences between groups were evaluated using a t-test for independent samples resulting statistically significant (p<0.05 for the dimension of pain, function, footwear, overall foot health, and social function. Conclusion: Inadequate shoe size has a significant negative impact on quality of life related to foot health. The degree of negative impact seems to be associated with age, sex, and body mass index (BMI.
International Nuclear Information System (INIS)
Albers, D.J.; Hripcsak, George
2012-01-01
Highlights: ► Time-delayed mutual information for irregularly sampled time-series. ► Estimation bias for the time-delayed mutual information calculation. ► Fast, simple, PDF estimator independent, time-delayed mutual information bias estimate. ► Quantification of data-set-size limits of the time-delayed mutual calculation. - Abstract: A method to estimate the time-dependent correlation via an empirical bias estimate of the time-delayed mutual information for a time-series is proposed. In particular, the bias of the time-delayed mutual information is shown to often be equivalent to the mutual information between two distributions of points from the same system separated by infinite time. Thus intuitively, estimation of the bias is reduced to estimation of the mutual information between distributions of data points separated by large time intervals. The proposed bias estimation techniques are shown to work for Lorenz equations data and glucose time series data of three patients from the Columbia University Medical Center database.
Poisson sampling - The adjusted and unadjusted estimator revisited
Michael S. Williams; Hans T. Schreuder; Gerardo H. Terrazas
1998-01-01
The prevailing assumption, that for Poisson sampling the adjusted estimator "Y-hat a" is always substantially more efficient than the unadjusted estimator "Y-hat u" , is shown to be incorrect. Some well known theoretical results are applicable since "Y-hat a" is a ratio-of-means estimator and "Y-hat u" a simple unbiased estimator...
Effect size measures in a two-independent-samples case with nonnormal and nonhomogeneous data.
Li, Johnson Ching-Hong
2016-12-01
In psychological science, the "new statistics" refer to the new statistical practices that focus on effect size (ES) evaluation instead of conventional null-hypothesis significance testing (Cumming, Psychological Science, 25, 7-29, 2014). In a two-independent-samples scenario, Cohen's (1988) standardized mean difference (d) is the most popular ES, but its accuracy relies on two assumptions: normality and homogeneity of variances. Five other ESs-the unscaled robust d (d r * ; Hogarty & Kromrey, 2001), scaled robust d (d r ; Algina, Keselman, & Penfield, Psychological Methods, 10, 317-328, 2005), point-biserial correlation (r pb ; McGrath & Meyer, Psychological Methods, 11, 386-401, 2006), common-language ES (CL; Cliff, Psychological Bulletin, 114, 494-509, 1993), and nonparametric estimator for CL (A w ; Ruscio, Psychological Methods, 13, 19-30, 2008)-may be robust to violations of these assumptions, but no study has systematically evaluated their performance. Thus, in this simulation study the performance of these six ESs was examined across five factors: data distribution, sample, base rate, variance ratio, and sample size. The results showed that A w and d r were generally robust to these violations, and A w slightly outperformed d r . Implications for the use of A w and d r in real-world research are discussed.
First genome size estimations for some eudicot families and genera
Directory of Open Access Journals (Sweden)
Garcia, S.
2010-12-01
Full Text Available Genome size diversity in angiosperms varies roughly 2400-fold, although approximately 45% of angiosperm families lack a single genome size estimation, and therefore, this range could be enlarged. To contribute completing family and genera representation, DNA C-Values are here provided for 19 species from 16 eudicot families, including first values for 6 families, 14 genera and 17 species. The sample of species studied is very diverse, including herbs, weeds, vines, shrubs and trees. Data are discussed regarding previous genome size estimates of closely related species or genera, if any, their chromosome number, growth form or invasive behaviour. The present research contributes approximately 1.5% new values for previously unreported angiosperm families, being the current coverage around 55% of angiosperm families, according to the Plant DNA C-Values Database.
La diversidad del tamaño del genoma en angiospermas es muy amplia, siendo el valor más elevado aproximadamente unas 2400 veces superior al más pequeño. Sin embargo, cerca del 45% de las familias no presentan ni una sola estimación, por lo que el rango real podría ser ampliado. Para contribuir a completar la representación de familias y géneros de angiospermas, este estudio contribuye con valores C para 19 especies de 16 familias de eudicoticotiledóneas, incluyendo los primeros valores para 6 familias, 14 géneros y 17 especies. La muestra estudiada es muy diversa, e incluye hierbas, malezas, enredaderas, arbustos y árboles. Se discuten los resultados en función de estimaciones previas del tamaño del genoma de especies o géneros estrechamente relacionados, del número de cromosomas, la forma de crecimiento o el comportamiento invasor de las especies analizadas. El presente estudio contribuye aproximadamente en un 1,5% de nuevos valores para familias de angiospermas no estudiadas previamente, de las que actualmente existe información para el 55%, según la base de datos
Estimation of Tooth Size Discrepancies among Different Malocclusion Groups
Hasija, Narender; Bala, Madhu; Goyal, Virender
2014-01-01
ABSTRACT Regards and Tribute: Late Dr Narender Hasija was a mentor and visionary in the light of knowledge and experience. We pay our regards with deepest gratitude to the departed soul to rest in peace. Bolton’s ratios help in estimating overbite, overjet relationships, the effects of contemplated extractions on posterior occlusion, incisor relationships and identification of occlusal misfit produced by tooth size discrepancies. Aim: To determine any difference in tooth size discrepancy in a...
Prognostic value of nucleolar size and size pleomorphism in choroidal melanomas
DEFF Research Database (Denmark)
Sørensen, Flemming Brandt; Gamel, J W; Jensen, O A
1993-01-01
Morphometric estimates of nucleolar size have been shown to possess a high prognostic value in patients with uveal melanomas. The authors investigated various quantitative estimators of the mean size and pleomorphism of nucleoli in choroidal melanomas from a consecutive series of 95 Danish patients...... of melanoma, and largest macroscopic tumor dimension (LTD), the following histomorphometric estimates were obtained: mean diameter of the 10 largest nucleoli (MLN), point-sampled mean nucleolar profile area (nucleolar ao) and the associated standard deviation of nucleolar ao, the volume-weighted mean...
Sample size for post-marketing safety studies based on historical controls.
Wu, Yu-te; Makuch, Robert W
2010-08-01
As part of a drug's entire life cycle, post-marketing studies are an important part in the identification of rare, serious adverse events. Recently, the US Food and Drug Administration (FDA) has begun to implement new post-marketing safety mandates as a consequence of increased emphasis on safety. The purpose of this research is to provide exact sample size formula for the proposed hybrid design, based on a two-group cohort study with incorporation of historical external data. Exact sample size formula based on the Poisson distribution is developed, because the detection of rare events is our outcome of interest. Performance of exact method is compared to its approximate large-sample theory counterpart. The proposed hybrid design requires a smaller sample size compared to the standard, two-group prospective study design. In addition, the exact method reduces the number of subjects required in the treatment group by up to 30% compared to the approximate method for the study scenarios examined. The proposed hybrid design satisfies the advantages and rationale of the two-group design with smaller sample sizes generally required. 2010 John Wiley & Sons, Ltd.
Impact of Base Functional Component Types on Software Functional Size based Effort Estimation
Gencel, Cigdem; Buglione, Luigi
2008-01-01
Software effort estimation is still a significant challenge for software management. Although Functional Size Measurement (FSM) methods have been standardized and have become widely used by the software organizations, the relationship between functional size and development effort still needs further investigation. Most of the studies focus on the project cost drivers and consider total software functional size as the primary input to estimation models. In this study, we investigate whether u...
Sample size computation for association studies using case–parents ...
Indian Academy of Sciences (India)
ple size needed to reach a given power (Knapp 1999; Schaid. 1999; Chen and Deng 2001; Brown 2004). In their seminal paper, Risch and Merikangas (1996) showed that for a mul- tiplicative mode of inheritance (MOI) for the susceptibility gene, sample size depends on two parameters: the frequency of the risk allele at the ...
Kamath, Pauline L; Haroldson, Mark A; Luikart, Gordon; Paetkau, David; Whitman, Craig; van Manen, Frank T
2015-11-01
Effective population size (N(e)) is a key parameter for monitoring the genetic health of threatened populations because it reflects a population's evolutionary potential and risk of extinction due to genetic stochasticity. However, its application to wildlife monitoring has been limited because it is difficult to measure in natural populations. The isolated and well-studied population of grizzly bears (Ursus arctos) in the Greater Yellowstone Ecosystem provides a rare opportunity to examine the usefulness of different N(e) estimators for monitoring. We genotyped 729 Yellowstone grizzly bears using 20 microsatellites and applied three single-sample estimators to examine contemporary trends in generation interval (GI), effective number of breeders (N(b)) and N(e) during 1982-2007. We also used multisample methods to estimate variance (N(eV)) and inbreeding N(e) (N(eI)). Single-sample estimates revealed positive trajectories, with over a fourfold increase in N(e) (≈100 to 450) and near doubling of the GI (≈8 to 14) from the 1980s to 2000s. N(eV) (240-319) and N(eI) (256) were comparable with the harmonic mean single-sample N(e) (213) over the time period. Reanalysing historical data, we found N(eV) increased from ≈80 in the 1910s-1960s to ≈280 in the contemporary population. The estimated ratio of effective to total census size (N(e) /N(c)) was stable and high (0.42-0.66) compared to previous brown bear studies. These results support independent demographic evidence for Yellowstone grizzly bear population growth since the 1980s. They further demonstrate how genetic monitoring of N(e) can complement demographic-based monitoring of N(c) and vital rates, providing a valuable tool for wildlife managers. © 2015 John Wiley & Sons Ltd.
Estimating the size of non-observed economy in Croatia using the MIMIC approach
Directory of Open Access Journals (Sweden)
Vjekoslav Klarić
2011-03-01
Full Text Available This paper gives a quick overview of the approaches that have been used in the research of shadow economy, starting with the definitions of the terms “shadow economy” and “non-observed economy”, with the accent on the ISTAT/Eurostat framework. Several methods for estimating the size of the shadow economy and the non-observed economy are then presented. The emphasis is placed on the MIMIC approach, one of the methods used to estimate the size of the nonobserved economy. After a glance at the theory behind it, the MIMIC model is then applied to the Croatian economy. Considering the described characteristics of different methods, a previous estimate of the size of the non-observed economy in Croatia is chosen to provide benchmark values for the MIMIC model. Using those, the estimates of the size of non-observed economy in Croatia during the period 1998-2009 are obtained.
Directory of Open Access Journals (Sweden)
Silje Steinsbekk
2017-11-01
Full Text Available Individuals who are overweight are more likely to underestimate their body size than those who are normal weight, and overweight underestimators are less likely to engage in weight loss efforts. Underestimation of body size might represent a barrier to prevention and treatment of overweight; thus insight in how underestimation of body size develops and tracks through the childhood years is needed. The aim of the present study was therefore to examine stability in children’s underestimation of body size, exploring predictors of underestimation over time. The prospective path from underestimation to BMI was also tested. In a Norwegian cohort of 6 year olds, followed up at ages 8 and 10 (analysis sample: n = 793 body size estimation was captured by the Children’s Body Image Scale, height and weight were measured and BMI calculated. Overall, children were more likely to underestimate than overestimate their body size. Individual stability in underestimation was modest, but significant. Higher BMI predicted future underestimation, even when previous underestimation was adjusted for, but there was no evidence for the opposite direction of influence. Boys were more likely than girls to underestimate their body size at ages 8 and 10 (age 8: 38.0% vs. 24.1%; Age 10: 57.9% vs. 30.8% and showed a steeper increase in underestimation with age compared to girls. In conclusion, the majority of 6, 8, and 10-year olds correctly estimate their body size (prevalence ranging from 40 to 70% depending on age and gender, although a substantial portion perceived themselves to be thinner than they actually were. Higher BMI forecasted future underestimation, but underestimation did not increase the risk for excessive weight gain in middle childhood.
Enhancing sampling design in mist-net bat surveys by accounting for sample size optimization
Trevelin, Leonardo Carreira; Novaes, Roberto Leonan Morim; Colas-Rosas, Paul François; Benathar, Thayse Cristhina Melo; Peres, Carlos A.
2017-01-01
The advantages of mist-netting, the main technique used in Neotropical bat community studies to date, include logistical implementation, standardization and sampling representativeness. Nonetheless, study designs still have to deal with issues of detectability related to how different species behave and use the environment. Yet there is considerable sampling heterogeneity across available studies in the literature. Here, we approach the problem of sample size optimization. We evaluated the co...
Estimating population size of Saddle-billed Storks Ephippiorhynchus ...
African Journals Online (AJOL)
The aim of this study was to estimate the population size within associated confidence limits using a modified mark–recapture field method. The vehicle survey, conducted shortly after rainfall in the area, did not produce results with known precision under these conditions. A repeat of this census in spring, after the peak ...
Power Spectrum Estimation of Randomly Sampled Signals
DEFF Research Database (Denmark)
Velte, C. M.; Buchhave, P.; K. George, W.
algorithms; sample and-hold and the direct spectral estimator without residence time weighting. The computer generated signal is a Poisson process with a sample rate proportional to velocity magnitude that consist of well-defined frequency content, which makes bias easy to spot. The idea...
International Nuclear Information System (INIS)
Mullen, R.; Thompson, J.M.; Moussa, O.; Vinnicombe, S.; Evans, A.
2014-01-01
Aim: To assess whether the size of peritumoural stiffness (PTS) on shear-wave elastography (SWE) for small primary breast cancers (≤15 mm) was associated with size discrepancies between grey-scale ultrasound (GSUS) and final histological size and whether the addition of PTS size to GSUS size might result in more accurate tumour size estimation when compared to final histological size. Materials and methods: A retrospective analysis of 86 consecutive patients between August 2011 and February 2013 who underwent breast-conserving surgery for tumours of size ≤15 mm at ultrasound was carried out. The size of PTS stiffness was compared to mean GSUS size, mean histological size, and the extent of size discrepancy between GSUS and histology. PTS size and GSUS were combined and compared to the final histological size. Results: PTS of >3 mm was associated with a larger mean final histological size (16 versus 11.3 mm, p < 0.001). PTS size of >3 mm was associated with a higher frequency of underestimation of final histological size by GSUS of >5 mm (63% versus 18%, p < 0.001). The combination of PTS and GSUS size led to accurate estimation of the final histological size (p = 0.03). The size of PTS was not associated with margin involvement (p = 0.27). Conclusion: PTS extending beyond 3 mm from the grey-scale abnormality is significantly associated with underestimation of tumour size of >5 mm for small invasive breast cancers. Taking into account the size of PTS also led to accurate estimation of the final histological size. Further studies are required to assess the relationship of the extent of SWE stiffness and margin status. - Highlights: • Peritumoural stiffness of greater than 3 mm was associated with larger tumour size. • Underestimation of tumour size by ultrasound was associated with peri-tumoural stiffness size. • Combining peri-tumoural stiffness size to ultrasound produced accurate tumour size estimation
Jirapatnakul, Artit C; Fotin, Sergei V; Reeves, Anthony P; Biancardi, Alberto M; Yankelevitz, David F; Henschke, Claudia I
2009-01-01
Estimation of nodule location and size is an important pre-processing step in some nodule segmentation algorithms to determine the size and location of the region of interest. Ideally, such estimation methods will consistently find the same nodule location regardless of where the the seed point (provided either manually or by a nodule detection algorithm) is placed relative to the "true" center of the nodule, and the size should be a reasonable estimate of the true nodule size. We developed a method that estimates nodule location and size using multi-scale Laplacian of Gaussian (LoG) filtering. Nodule candidates near a given seed point are found by searching for blob-like regions with high filter response. The candidates are then pruned according to filter response and location, and the remaining candidates are sorted by size and the largest candidate selected. This method was compared to a previously published template-based method. The methods were evaluated on the basis of stability of the estimated nodule location to changes in the initial seed point and how well the size estimates agreed with volumes determined by a semi-automated nodule segmentation method. The LoG method exhibited better stability to changes in the seed point, with 93% of nodules having the same estimated location even when the seed point was altered, compared to only 52% of nodules for the template-based method. Both methods also showed good agreement with sizes determined by a nodule segmentation method, with an average relative size difference of 5% and -5% for the LoG and template-based methods respectively.
Efficient estimation for ergodic diffusions sampled at high frequency
DEFF Research Database (Denmark)
Sørensen, Michael
A general theory of efficient estimation for ergodic diffusions sampled at high fre- quency is presented. High frequency sampling is now possible in many applications, in particular in finance. The theory is formulated in term of approximate martingale estimating functions and covers a large class...
Sample size in psychological research over the past 30 years.
Marszalek, Jacob M; Barber, Carolyn; Kohlhart, Julie; Holmes, Cooper B
2011-04-01
The American Psychological Association (APA) Task Force on Statistical Inference was formed in 1996 in response to a growing body of research demonstrating methodological issues that threatened the credibility of psychological research, and made recommendations to address them. One issue was the small, even dramatically inadequate, size of samples used in studies published by leading journals. The present study assessed the progress made since the Task Force's final report in 1999. Sample sizes reported in four leading APA journals in 1955, 1977, 1995, and 2006 were compared using nonparametric statistics, while data from the last two waves were fit to a hierarchical generalized linear growth model for more in-depth analysis. Overall, results indicate that the recommendations for increasing sample sizes have not been integrated in core psychological research, although results slightly vary by field. This and other implications are discussed in the context of current methodological critique and practice.
A flexible method for multi-level sample size determination
International Nuclear Information System (INIS)
Lu, Ming-Shih; Sanborn, J.B.; Teichmann, T.
1997-01-01
This paper gives a flexible method to determine sample sizes for both systematic and random error models (this pertains to sampling problems in nuclear safeguard questions). In addition, the method allows different attribute rejection limits. The new method could assist achieving a higher detection probability and enhance inspection effectiveness
A Heuristic Probabilistic Approach to Estimating Size-Dependent Mobility of Nonuniform Sediment
Woldegiorgis, B. T.; Wu, F. C.; van Griensven, A.; Bauwens, W.
2017-12-01
Simulating the mechanism of bed sediment mobility is essential for modelling sediment dynamics. Despite the fact that many studies are carried out on this subject, they use complex mathematical formulations that are computationally expensive, and are often not easy for implementation. In order to present a simple and computationally efficient complement to detailed sediment mobility models, we developed a heuristic probabilistic approach to estimating the size-dependent mobilities of nonuniform sediment based on the pre- and post-entrainment particle size distributions (PSDs), assuming that the PSDs are lognormally distributed. The approach fits a lognormal probability density function (PDF) to the pre-entrainment PSD of bed sediment and uses the threshold particle size of incipient motion and the concept of sediment mixture to estimate the PSDs of the entrained sediment and post-entrainment bed sediment. The new approach is simple in physical sense and significantly reduces the complexity and computation time and resource required by detailed sediment mobility models. It is calibrated and validated with laboratory and field data by comparing to the size-dependent mobilities predicted with the existing empirical lognormal cumulative distribution function (CDF) approach. The novel features of the current approach are: (1) separating the entrained and non-entrained sediments by a threshold particle size, which is a modified critical particle size of incipient motion by accounting for the mixed-size effects, and (2) using the mixture-based pre- and post-entrainment PSDs to provide a continuous estimate of the size-dependent sediment mobility.
International Nuclear Information System (INIS)
Yu, Lingda; Wang, Guangfu; Zhang, Renjiang
2013-01-01
Full text: During 2008-2012, size-segregated aerosol samples were collected using an eight-stage cascade impactor at Beijing Normal University (BNU) Site, China. These samples were analyzed using particle induced X-ray emission (PIXE) analysis for concentrations of 21 elements consisting of Mg, AI, Si, P, S, CI, K, Ca, Ti, V, Cr, Mn, Fe, Ni, Cu, Zn, As, Se, Br, Ba and Pb. The size-resolved data sets were then analyzed using the positive matrix factorization (PMF) technique in order to identify possible sources and estimate their contribution to particulate matter mass. Nine sources were resolved in eight size ranges (025 ∼ 16μm) and included secondary sulphur, motor vehicles, coal combustion; oil combustion, road dust, biomass burning, soil dust, diesel vehicles and metal processing. PMF analysis of size-resolved source contributions showed that natural sources represented by soil dust and road dust contributed about 57% to the predicted primary particulate matter (PM) mass in the coarse size range(>2μm). On the other hand, anthropogenic sources such as secondary sulphur, coal and oil combustion, biomass burning and motor vehicle contributed about 73% in the fine size range <2μm). The diesel vehicles and secondary sulphur source contributed the most in the ultra-fine size range (<0.25μm) and was responsible for about 52% of the primary PM mass. (author)
Energy Technology Data Exchange (ETDEWEB)
Yu, Lingda [Key Laboratory of Beam Technology and Materiais Modification of Ministry of Education, College of Nuclear Science and Technology, Beijing Normal University, Beijing (China); Wang, Guangfu, E-mail: guangfuw@bnu.edu.cn [Beijing Radiation Center, Beijing (China); Zhang, Renjiang [Key Laboratory of Regional Climate-Environment Research for Temperate Eas tAsia (RCE-TEA), Institute of Atmospheric Physics, Chinese Academy of Science, Beijing (China)
2013-07-01
Full text: During 2008-2012, size-segregated aerosol samples were collected using an eight-stage cascade impactor at Beijing Normal University (BNU) Site, China. These samples were analyzed using particle induced X-ray emission (PIXE) analysis for concentrations of 21 elements consisting of Mg, AI, Si, P, S, CI, K, Ca, Ti, V, Cr, Mn, Fe, Ni, Cu, Zn, As, Se, Br, Ba and Pb. The size-resolved data sets were then analyzed using the positive matrix factorization (PMF) technique in order to identify possible sources and estimate their contribution to particulate matter mass. Nine sources were resolved in eight size ranges (025 ∼ 16μm) and included secondary sulphur, motor vehicles, coal combustion; oil combustion, road dust, biomass burning, soil dust, diesel vehicles and metal processing. PMF analysis of size-resolved source contributions showed that natural sources represented by soil dust and road dust contributed about 57% to the predicted primary particulate matter (PM) mass in the coarse size range(>2μm). On the other hand, anthropogenic sources such as secondary sulphur, coal and oil combustion, biomass burning and motor vehicle contributed about 73% in the fine size range <2μm). The diesel vehicles and secondary sulphur source contributed the most in the ultra-fine size range (<0.25μm) and was responsible for about 52% of the primary PM mass. (author)
Determination of subcellular compartment sizes for estimating dose variations in radiotherapy
International Nuclear Information System (INIS)
Poole, Christopher M.; Ahnesjo, Anders; Enger, Shirin A.
2015-01-01
The variation in specific energy absorbed to different cell compartments caused by variations in size and chemical composition is poorly investigated in radiotherapy. The aim of this study was to develop an algorithm to derive cell and cell nuclei size distributions from 2D histology samples, and build 3D cellular geometries to provide Monte Carlo (MC)-based dose calculation engines with a morphologically relevant input geometry. Stained and unstained regions of the histology samples are segmented using a Gaussian mixture model, and individual cell nuclei are identified via thresholding. Delaunay triangulation is applied to determine the distribution of distances between the centroids of nearest neighbour cells. A pouring simulation is used to build a 3D virtual tissue sample, with cell radii randomised according to the cell size distribution determined from the histology samples. A slice with the same thickness as the histology sample is cut through the 3D data and characterised in the same way as the measured histology. The comparison between this virtual slice and the measured histology is used to adjust the initial cell size distribution into the pouring simulation. This iterative approach of a pouring simulation with adjustments guided by comparison is continued until an input cell size distribution is found that yields a distribution in the sliced geometry that agrees with the measured histology samples. The thus obtained morphologically realistic 3D cellular geometry can be used as input to MC-based dose calculation programs for studies of dose response due to variations in morphology and size of tumour/healthy tissue cells/nuclei, and extracellular material. (authors)
Directory of Open Access Journals (Sweden)
Pitchaiah Mandava
Full Text Available OBJECTIVE: Clinical trial outcomes often involve an ordinal scale of subjective functional assessments but the optimal way to quantify results is not clear. In stroke, the most commonly used scale, the modified Rankin Score (mRS, a range of scores ("Shift" is proposed as superior to dichotomization because of greater information transfer. The influence of known uncertainties in mRS assessment has not been quantified. We hypothesized that errors caused by uncertainties could be quantified by applying information theory. Using Shannon's model, we quantified errors of the "Shift" compared to dichotomized outcomes using published distributions of mRS uncertainties and applied this model to clinical trials. METHODS: We identified 35 randomized stroke trials that met inclusion criteria. Each trial's mRS distribution was multiplied with the noise distribution from published mRS inter-rater variability to generate an error percentage for "shift" and dichotomized cut-points. For the SAINT I neuroprotectant trial, considered positive by "shift" mRS while the larger follow-up SAINT II trial was negative, we recalculated sample size required if classification uncertainty was taken into account. RESULTS: Considering the full mRS range, error rate was 26.1%±5.31 (Mean±SD. Error rates were lower for all dichotomizations tested using cut-points (e.g. mRS 1; 6.8%±2.89; overall p<0.001. Taking errors into account, SAINT I would have required 24% more subjects than were randomized. CONCLUSION: We show when uncertainty in assessments is considered, the lowest error rates are with dichotomization. While using the full range of mRS is conceptually appealing, a gain of information is counter-balanced by a decrease in reliability. The resultant errors need to be considered since sample size may otherwise be underestimated. In principle, we have outlined an approach to error estimation for any condition in which there are uncertainties in outcome assessment. We
Thompson, J K; Dolce, J J
1989-05-01
Thirty-two asymptomatic college females were assessed on multiple aspects of body image. Subjects' estimation of the size of three body sites (waist, hips, thighs) was affected by instructional protocol. Emotional ratings, based on how they "felt" about their body, elicited ratings that were larger than actual and ideal size measures. Size ratings based on rational instructions were no different from actual sizes, but were larger than ideal ratings. There were no differences between actual and ideal sizes. The results are discussed with regard to methodological issues involved in body image research. In addition, a working hypothesis that differentiates affective/emotional from cognitive/rational aspects of body size estimation is offered to complement current theories of body image. Implications of the findings for the understanding of body image and its relationship to eating disorders are discussed.
Using the ''Epiquant'' automatic analyzer for quantitative estimation of grain size
Energy Technology Data Exchange (ETDEWEB)
Tsivirko, E I; Ulitenko, A N; Stetsenko, I A; Burova, N M [Zaporozhskij Mashinostroitel' nyj Inst. (Ukrainian SSR)
1979-01-01
Application possibility of the ''Epiquant'' automatic analyzer to estimate qualitatively austenite grain in the 18Kh2N4VA steel has been investigated. Austenite grain has been clarified using the methods of cementation, oxidation and etching of the grain boundaries. Average linear size of grain at the length of 15 mm has been determined according to the total length of grain intersection line and the number of intersections at the boundaries. It is shown that the ''Epiquant'' analyzer ensures quantitative estimation of austenite grain size with relative error of 2-4 %.
A normative inference approach for optimal sample sizes in decisions from experience
Ostwald, Dirk; Starke, Ludger; Hertwig, Ralph
2015-01-01
“Decisions from experience” (DFE) refers to a body of work that emerged in research on behavioral decision making over the last decade. One of the major experimental paradigms employed to study experience-based choice is the “sampling paradigm,” which serves as a model of decision making under limited knowledge about the statistical structure of the world. In this paradigm respondents are presented with two payoff distributions, which, in contrast to standard approaches in behavioral economics, are specified not in terms of explicit outcome-probability information, but by the opportunity to sample outcomes from each distribution without economic consequences. Participants are encouraged to explore the distributions until they feel confident enough to decide from which they would prefer to draw from in a final trial involving real monetary payoffs. One commonly employed measure to characterize the behavior of participants in the sampling paradigm is the sample size, that is, the number of outcome draws which participants choose to obtain from each distribution prior to terminating sampling. A natural question that arises in this context concerns the “optimal” sample size, which could be used as a normative benchmark to evaluate human sampling behavior in DFE. In this theoretical study, we relate the DFE sampling paradigm to the classical statistical decision theoretic literature and, under a probabilistic inference assumption, evaluate optimal sample sizes for DFE. In our treatment we go beyond analytically established results by showing how the classical statistical decision theoretic framework can be used to derive optimal sample sizes under arbitrary, but numerically evaluable, constraints. Finally, we critically evaluate the value of deriving optimal sample sizes under this framework as testable predictions for the experimental study of sampling behavior in DFE. PMID:26441720
Estimating Most Productive Scale Size in Data Envelopment Analysis with Integer Value Data
Dwi Sari, Yunita; Angria S, Layla; Efendi, Syahril; Zarlis, Muhammad
2018-01-01
The most productive scale size (MPSS) is a measurement that states how resources should be organized and utilized to achieve optimal results. The most productive scale size (MPSS) can be used as a benchmark for the success of an industry or company in producing goods or services. To estimate the most productive scale size (MPSS), each decision making unit (DMU) should pay attention the level of input-output efficiency, by data envelopment analysis (DEA) method decision making unit (DMU) can identify units used as references that can help to find the cause and solution from inefficiencies can optimize productivity that main advantage in managerial applications. Therefore, data envelopment analysis (DEA) is chosen to estimating most productive scale size (MPSS) that will focus on the input of integer value data with the CCR model and the BCC model. The purpose of this research is to find the best solution for estimating most productive scale size (MPSS) with input of integer value data in data envelopment analysis (DEA) method.
Estimation of the size of the female sex worker population in Rwanda using three different methods.
Mutagoma, Mwumvaneza; Kayitesi, Catherine; Gwiza, Aimé; Ruton, Hinda; Koleros, Andrew; Gupta, Neil; Balisanga, Helene; Riedel, David J; Nsanzimana, Sabin
2015-10-01
HIV prevalence is disproportionately high among female sex workers compared to the general population. Many African countries lack useful data on the size of female sex worker populations to inform national HIV programmes. A female sex worker size estimation exercise using three different venue-based methodologies was conducted among female sex workers in all provinces of Rwanda in August 2010. The female sex worker national population size was estimated using capture-recapture and enumeration methods, and the multiplier method was used to estimate the size of the female sex worker population in Kigali. A structured questionnaire was also used to supplement the data. The estimated number of female sex workers by the capture-recapture method was 3205 (95% confidence interval: 2998-3412). The female sex worker size was estimated at 3348 using the enumeration method. In Kigali, the female sex worker size was estimated at 2253 (95% confidence interval: 1916-2524) using the multiplier method. Nearly 80% of all female sex workers in Rwanda were found to be based in the capital, Kigali. This study provided a first-time estimate of the female sex worker population size in Rwanda using capture-recapture, enumeration, and multiplier methods. The capture-recapture and enumeration methods provided similar estimates of female sex worker in Rwanda. Combination of such size estimation methods is feasible and productive in low-resource settings and should be considered vital to inform national HIV programmes. © The Author(s) 2015.
Directory of Open Access Journals (Sweden)
Jing eWang
2015-05-01
Full Text Available Flow cytometry (FCM is a commonly used method for estimating genome size in many organisms. The use of flow cytometry in plants is influenced by endogenous fluorescence inhibitors and may cause an inaccurate estimation of genome size; thus, falsifying the relationship between genome size and phenotypic traits/ecological performance. Quantitative optimization of FCM methodology minimizes such errors, yet there are few studies detailing this methodology. We selected the genus Primulina, one of the most representative and diverse genera of the Old World Gesneriaceae, to evaluate the methodology effect on determining genome size. Our results showed that buffer choice significantly affected genome size estimation in six out of the eight species examined and altered the 2C-value (DNA content by as much as 21.4%. The staining duration and propidium iodide (PI concentration slightly affected the 2C-value. Our experiments showed better histogram quality when the samples were stained for 40 minutes at a PI concentration of 100 µg ml-1. The quality of the estimates was not improved by one-day incubation in the dark at 4 °C or by centrifugation. Thus, our study determined an optimum protocol for genome size measurement in Primulina: LB01 buffer supplemented with 100 µg ml-1 PI and stained for 40 minutes. This protocol also demonstrated a high universality in other Gesneriaceae genera. We report the genome size of nine Gesneriaceae species for the first time. The results showed substantial genome size variation both within and among the species, with the 2C-value ranging between 1.62 and 2.71 pg. Our study highlights the necessity of optimizing the FCM methodology prior to obtaining reliable genome size estimates in a given taxon.
Estimation of Tooth Size Discrepancies among Different Malocclusion Groups.
Hasija, Narender; Bala, Madhu; Goyal, Virender
2014-05-01
Regards and Tribute: Late Dr Narender Hasija was a mentor and visionary in the light of knowledge and experience. We pay our regards with deepest gratitude to the departed soul to rest in peace. Bolton's ratios help in estimating overbite, overjet relationships, the effects of contemplated extractions on posterior occlusion, incisor relationships and identification of occlusal misfit produced by tooth size discrepancies. To determine any difference in tooth size discrepancy in anterior as well as overall ratio in different malocclusions and comparison with Bolton's study. After measuring the teeth on all 100 patients, Bolton's analysis was performed. Results were compared with Bolton's means and standard deviations. The results were also subjected to statistical analysis. Results show that the mean and standard deviations of ideal occlusion cases are comparable with those Bolton but, when the mean and standard deviation of malocclusion groups are compared with those of Bolton, the values of standard deviation are higher, though the mean is comparable. How to cite this article: Hasija N, Bala M, Goyal V. Estimation of Tooth Size Discrepancies among Different Malocclusion Groups. Int J Clin Pediatr Dent 2014;7(2):82-85.
Xu, Huijun; Gordon, J James; Siebers, Jeffrey V
2011-02-01
A dosimetric margin (DM) is the margin in a specified direction between a structure and a specified isodose surface, corresponding to a prescription or tolerance dose. The dosimetric margin distribution (DMD) is the distribution of DMs over all directions. Given a geometric uncertainty model, representing inter- or intrafraction setup uncertainties or internal organ motion, the DMD can be used to calculate coverage Q, which is the probability that a realized target or organ-at-risk (OAR) dose metric D, exceeds the corresponding prescription or tolerance dose. Postplanning coverage evaluation quantifies the percentage of uncertainties for which target and OAR structures meet their intended dose constraints. The goal of the present work is to evaluate coverage probabilities for 28 prostate treatment plans to determine DMD sampling parameters that ensure adequate accuracy for postplanning coverage estimates. Normally distributed interfraction setup uncertainties were applied to 28 plans for localized prostate cancer, with prescribed dose of 79.2 Gy and 10 mm clinical target volume to planning target volume (CTV-to-PTV) margins. Using angular or isotropic sampling techniques, dosimetric margins were determined for the CTV, bladder and rectum, assuming shift invariance of the dose distribution. For angular sampling, DMDs were sampled at fixed angular intervals w (e.g., w = 1 degree, 2 degrees, 5 degrees, 10 degrees, 20 degrees). Isotropic samples were uniformly distributed on the unit sphere resulting in variable angular increments, but were calculated for the same number of sampling directions as angular DMDs, and accordingly characterized by the effective angular increment omega eff. In each direction, the DM was calculated by moving the structure in radial steps of size delta (=0.1, 0.2, 0.5, 1 mm) until the specified isodose was crossed. Coverage estimation accuracy deltaQ was quantified as a function of the sampling parameters omega or omega eff and delta. The
Rock sampling. [method for controlling particle size distribution
Blum, P. (Inventor)
1971-01-01
A method for sampling rock and other brittle materials and for controlling resultant particle sizes is described. The method involves cutting grooves in the rock surface to provide a grouping of parallel ridges and subsequently machining the ridges to provide a powder specimen. The machining step may comprise milling, drilling, lathe cutting or the like; but a planing step is advantageous. Control of the particle size distribution is effected primarily by changing the height and width of these ridges. This control exceeds that obtainable by conventional grinding.
Effects of sample size on the second magnetization peak in ...
Indian Academy of Sciences (India)
8+ crystals are observed at low temperatures, above the temperature where the SMP totally disappears. In particular, the onset of the SMP shifts to lower fields as the sample size decreases - a result that could be interpreted as a size effect in ...
Estimation of creatinine in Urine sample by Jaffe's method
International Nuclear Information System (INIS)
Wankhede, Sonal; Arunkumar, Suja; Sawant, Pramilla D.; Rao, B.B.
2012-01-01
In-vitro bioassay monitoring is based on the determination of activity concentrations in biological samples excreted from the body and is most suitable for alpha and beta emitters. A truly representative bioassay sample is the one having all the voids collected during a 24-h period however, this being technically difficult, overnight urine samples collected by the workers are analyzed. These overnight urine samples are collected for 10-16 h, however in the absence of any specific information, 12 h duration is assumed and the observed results are then corrected accordingly obtain the daily excretion rate. To reduce the uncertainty due to unknown duration of sample collection, IAEA has recommended two methods viz., measurement of specific gravity and creatinine excretion rate in urine sample. Creatinine is a final metabolic product creatinine phosphate in the body and is excreted at a steady rate for people with normally functioning kidneys. It is, therefore, often used as a normalization factor for estimation of duration of sample collection. The present study reports the chemical procedure standardized and its application for the estimation of creatinine in urine samples collected from occupational workers. Chemical procedure for estimation of creatinine in bioassay samples was standardized and applied successfully for its estimation in bioassay samples collected from the workers. The creatinine excretion rate observed for these workers is lower than observed in literature. Further, work is in progress to generate a data bank of creatinine excretion rate for most of the workers and also to study the variability in creatinine coefficient for the same individual based on the analysis of samples collected for different duration
Padilla, Alberto
2009-01-01
Systematic sampling is a commonly used technique due to its simplicity and ease of implementation. The drawback of this simplicity is that it is not possible to estimate the design variance without bias. There are several ways to circumvent this problem. One method is to suppose that the variable of interest has a random order in the population, so the sample variance of simple random sampling without replacement is used. By means of a mixed random - systematic sample, an unbiased estimator o...
Effect of sample size on bias correction performance
Reiter, Philipp; Gutjahr, Oliver; Schefczyk, Lukas; Heinemann, Günther; Casper, Markus C.
2014-05-01
The output of climate models often shows a bias when compared to observed data, so that a preprocessing is necessary before using it as climate forcing in impact modeling (e.g. hydrology, species distribution). A common bias correction method is the quantile matching approach, which adapts the cumulative distribution function of the model output to the one of the observed data by means of a transfer function. Especially for precipitation we expect the bias correction performance to strongly depend on sample size, i.e. the length of the period used for calibration of the transfer function. We carry out experiments using the precipitation output of ten regional climate model (RCM) hindcast runs from the EU-ENSEMBLES project and the E-OBS observational dataset for the period 1961 to 2000. The 40 years are split into a 30 year calibration period and a 10 year validation period. In the first step, for each RCM transfer functions are set up cell-by-cell, using the complete 30 year calibration period. The derived transfer functions are applied to the validation period of the respective RCM precipitation output and the mean absolute errors in reference to the observational dataset are calculated. These values are treated as "best fit" for the respective RCM. In the next step, this procedure is redone using subperiods out of the 30 year calibration period. The lengths of these subperiods are reduced from 29 years down to a minimum of 1 year, only considering subperiods of consecutive years. This leads to an increasing number of repetitions for smaller sample sizes (e.g. 2 for a length of 29 years). In the last step, the mean absolute errors are statistically tested against the "best fit" of the respective RCM to compare the performances. In order to analyze if the intensity of the effect of sample size depends on the chosen correction method, four variations of the quantile matching approach (PTF, QUANT/eQM, gQM, GQM) are applied in this study. The experiments are further
Comparison of Four Estimators under sampling without Replacement
African Journals Online (AJOL)
The results were obtained using a program written in Microsoft Visual C++ programming language. It was observed that the two-stage sampling under unequal probabilities without replacement is always better than the other three estimators considered. Keywords: Unequal probability sampling, two-stage sampling, ...
Effect of CT image size and resolution on the accuracy of rock property estimates
Bazaikin, Y.; Gurevich, B.; Iglauer, S.; Khachkova, T.; Kolyukhin, D.; Lebedev, M.; Lisitsa, V.; Reshetova, G.
2017-05-01
In order to study the effect of the micro-CT scan resolution and size on the accuracy of upscaled digital rock property estimation of core samples Bentheimer sandstone images with the resolution varying from 0.9 μm to 24 μm are used. We statistically show that the correlation length of the pore-to-matrix distribution can be reliably determined for the images with the resolution finer than 9 voxels per correlation length and the representative volume for this property is about 153 correlation length. Similar resolution values for the statistically representative volume are also valid for the estimation of the total porosity, specific surface area, mean curvature, and topology of the pore space. Only the total porosity and the number of isolated pores are stably recovered, whereas geometry and the topological measures of the pore space are strongly affected by the resolution change. We also simulate fluid flow in the pore space and estimate permeability and tortuosity of the sample. The results demonstrate that the representative volume for the transport property calculation should be greater than 50 correlation lengths of pore-to-matrix distribution. On the other hand, permeability estimation based on the statistical analysis of equivalent realizations shows some weak influence of the resolution on the transport properties. The reason for this might be that the characteristic scale of the particular physical processes may affect the result stronger than the model (image) scale.
Overestimation of test performance by ROC analysis: Effect of small sample size
International Nuclear Information System (INIS)
Seeley, G.W.; Borgstrom, M.C.; Patton, D.D.; Myers, K.J.; Barrett, H.H.
1984-01-01
New imaging systems are often observer-rated by ROC techniques. For practical reasons the number of different images, or sample size (SS), is kept small. Any systematic bias due to small SS would bias system evaluation. The authors set about to determine whether the area under the ROC curve (AUC) would be systematically biased by small SS. Monte Carlo techniques were used to simulate observer performance in distinguishing signal (SN) from noise (N) on a 6-point scale; P(SN) = P(N) = .5. Four sample sizes (15, 25, 50 and 100 each of SN and N), three ROC slopes (0.8, 1.0 and 1.25), and three intercepts (0.8, 1.0 and 1.25) were considered. In each of the 36 combinations of SS, slope and intercept, 2000 runs were simulated. Results showed a systematic bias: the observed AUC exceeded the expected AUC in every one of the 36 combinations for all sample sizes, with the smallest sample sizes having the largest bias. This suggests that evaluations of imaging systems using ROC curves based on small sample size systematically overestimate system performance. The effect is consistent but subtle (maximum 10% of AUC standard deviation), and is probably masked by the s.d. in most practical settings. Although there is a statistically significant effect (F = 33.34, P<0.0001) due to sample size, none was found for either the ROC curve slope or intercept. Overestimation of test performance by small SS seems to be an inherent characteristic of the ROC technique that has not previously been described
Short Communication Estimation of size at first maturity in two South ...
African Journals Online (AJOL)
Short Communication Estimation of size at first maturity in two South African coral species. ... African Journal of Marine Science ... PH Montoya-Maya, AHH Macdonald, MH Schleyer ... to differentiate juveniles from adult sizes of corals, an important factor for assessing the condition of scleractinian communities in reefs. Here ...
Test of methods for retrospective activity size distribution determination from filter samples
International Nuclear Information System (INIS)
Meisenberg, Oliver; Tschiersch, Jochen
2015-01-01
Determining the activity size distribution of radioactive aerosol particles requires sophisticated and heavy equipment, which makes measurements at large number of sites difficult and expensive. Therefore three methods for a retrospective determination of size distributions from aerosol filter samples in the laboratory were tested for their applicability. Extraction into a carrier liquid with subsequent nebulisation showed size distributions with a slight but correctable bias towards larger diameters compared with the original size distribution. Yields in the order of magnitude of 1% could be achieved. Sonication-assisted extraction into a carrier liquid caused a coagulation mode to appear in the size distribution. Sonication-assisted extraction into the air did not show acceptable results due to small yields. The method of extraction into a carrier liquid without sonication was applied to aerosol samples from Chernobyl in order to calculate inhalation dose coefficients for 137 Cs based on the individual size distribution. The effective dose coefficient is about half of that calculated with a default reference size distribution. - Highlights: • Activity size distributions can be recovered after aerosol sampling on filters. • Extraction into a carrier liquid and subsequent nebulisation is appropriate. • This facilitates the determination of activity size distributions for individuals. • Size distributions from this method can be used for individual dose coefficients. • Dose coefficients were calculated for the workers at the new Chernobyl shelter
Estimated spatial requirements of the medium- to large-sized ...
African Journals Online (AJOL)
Conservation planning in the Cape Floristic Region (CFR) of South Africa, a recognised world plant diversity hotspot, required information on the estimated spatial requirements of selected medium- to large-sized mammals within each of 102 Broad Habitat Units (BHUs) delineated according to key biophysical parameters.
Estimation of particle size distribution of nanoparticles from electrical ...
Indian Academy of Sciences (India)
... blockade (CB) phenomena of electrical conduction through atiny nanoparticle. Considering the ZnO nanocomposites to be spherical, Coulomb-blockade model of quantum dot isapplied here. The size distribution of particle is estimated from that model and compared with the results obtainedfrom AFM and XRD analyses.
Chen, Henian; Zhang, Nanhua; Lu, Xiaosun; Chen, Sophie
2013-08-01
The method used to determine choice of standard deviation (SD) is inadequately reported in clinical trials. Underestimations of the population SD may result in underpowered clinical trials. This study demonstrates how using the wrong method to determine population SD can lead to inaccurate sample sizes and underpowered studies, and offers recommendations to maximize the likelihood of achieving adequate statistical power. We review the practice of reporting sample size and its effect on the power of trials published in major journals. Simulated clinical trials were used to compare the effects of different methods of determining SD on power and sample size calculations. Prior to 1996, sample size calculations were reported in just 1%-42% of clinical trials. This proportion increased from 38% to 54% after the initial Consolidated Standards of Reporting Trials (CONSORT) was published in 1996, and from 64% to 95% after the revised CONSORT was published in 2001. Nevertheless, underpowered clinical trials are still common. Our simulated data showed that all minimal and 25th-percentile SDs fell below 44 (the population SD), regardless of sample size (from 5 to 50). For sample sizes 5 and 50, the minimum sample SDs underestimated the population SD by 90.7% and 29.3%, respectively. If only one sample was available, there was less than 50% chance that the actual power equaled or exceeded the planned power of 80% for detecting a median effect size (Cohen's d = 0.5) when using the sample SD to calculate the sample size. The proportions of studies with actual power of at least 80% were about 95%, 90%, 85%, and 80% when we used the larger SD, 80% upper confidence limit (UCL) of SD, 70% UCL of SD, and 60% UCL of SD to calculate the sample size, respectively. When more than one sample was available, the weighted average SD resulted in about 50% of trials being underpowered; the proportion of trials with power of 80% increased from 90% to 100% when the 75th percentile and the
Improving the accuracy of livestock distribution estimates through spatial interpolation.
Bryssinckx, Ward; Ducheyne, Els; Muhwezi, Bernard; Godfrey, Sunday; Mintiens, Koen; Leirs, Herwig; Hendrickx, Guy
2012-11-01
Animal distribution maps serve many purposes such as estimating transmission risk of zoonotic pathogens to both animals and humans. The reliability and usability of such maps is highly dependent on the quality of the input data. However, decisions on how to perform livestock surveys are often based on previous work without considering possible consequences. A better understanding of the impact of using different sample designs and processing steps on the accuracy of livestock distribution estimates was acquired through iterative experiments using detailed survey. The importance of sample size, sample design and aggregation is demonstrated and spatial interpolation is presented as a potential way to improve cattle number estimates. As expected, results show that an increasing sample size increased the precision of cattle number estimates but these improvements were mainly seen when the initial sample size was relatively low (e.g. a median relative error decrease of 0.04% per sampled parish for sample sizes below 500 parishes). For higher sample sizes, the added value of further increasing the number of samples declined rapidly (e.g. a median relative error decrease of 0.01% per sampled parish for sample sizes above 500 parishes. When a two-stage stratified sample design was applied to yield more evenly distributed samples, accuracy levels were higher for low sample densities and stabilised at lower sample sizes compared to one-stage stratified sampling. Aggregating the resulting cattle number estimates yielded significantly more accurate results because of averaging under- and over-estimates (e.g. when aggregating cattle number estimates from subcounty to district level, P interpolation to fill in missing values in non-sampled areas, accuracy is improved remarkably. This counts especially for low sample sizes and spatially even distributed samples (e.g. P <0.001 for a sample of 170 parishes using one-stage stratified sampling and aggregation on district level
A comparison study of size-specific dose estimate calculation methods
Energy Technology Data Exchange (ETDEWEB)
Parikh, Roshni A. [Rainbow Babies and Children' s Hospital, University Hospitals Cleveland Medical Center, Case Western Reserve University School of Medicine, Department of Radiology, Cleveland, OH (United States); University of Michigan Health System, Department of Radiology, Ann Arbor, MI (United States); Wien, Michael A.; Jordan, David W.; Ciancibello, Leslie; Berlin, Sheila C. [Rainbow Babies and Children' s Hospital, University Hospitals Cleveland Medical Center, Case Western Reserve University School of Medicine, Department of Radiology, Cleveland, OH (United States); Novak, Ronald D. [Rainbow Babies and Children' s Hospital, University Hospitals Cleveland Medical Center, Case Western Reserve University School of Medicine, Department of Radiology, Cleveland, OH (United States); Rebecca D. Considine Research Institute, Children' s Hospital Medical Center of Akron, Center for Mitochondrial Medicine Research, Akron, OH (United States); Klahr, Paul [CT Clinical Science, Philips Healthcare, Highland Heights, OH (United States); Soriano, Stephanie [Rainbow Babies and Children' s Hospital, University Hospitals Cleveland Medical Center, Case Western Reserve University School of Medicine, Department of Radiology, Cleveland, OH (United States); University of Washington, Department of Radiology, Seattle, WA (United States)
2018-01-15
The size-specific dose estimate (SSDE) has emerged as an improved metric for use by medical physicists and radiologists for estimating individual patient dose. Several methods of calculating SSDE have been described, ranging from patient thickness or attenuation-based (automated and manual) measurements to weight-based techniques. To compare the accuracy of thickness vs. weight measurement of body size to allow for the calculation of the size-specific dose estimate (SSDE) in pediatric body CT. We retrospectively identified 109 pediatric body CT examinations for SSDE calculation. We examined two automated methods measuring a series of level-specific diameters of the patient's body: method A used the effective diameter and method B used the water-equivalent diameter. Two manual methods measured patient diameter at two predetermined levels: the superior endplate of L2, where body width is typically most thin, and the superior femoral head or iliac crest (for scans that did not include the pelvis), where body width is typically most thick; method C averaged lateral measurements at these two levels from the CT projection scan, and method D averaged lateral and anteroposterior measurements at the same two levels from the axial CT images. Finally, we used body weight to characterize patient size, method E, and compared this with the various other measurement methods. Methods were compared across the entire population as well as by subgroup based on body width. Concordance correlation (ρ{sub c}) between each of the SSDE calculation methods (methods A-E) was greater than 0.92 across the entire population, although the range was wider when analyzed by subgroup (0.42-0.99). When we compared each SSDE measurement method with CTDI{sub vol,} there was poor correlation, ρ{sub c}<0.77, with percentage differences between 20.8% and 51.0%. Automated computer algorithms are accurate and efficient in the calculation of SSDE. Manual methods based on patient thickness provide
Hierarchical modeling of cluster size in wildlife surveys
Royle, J. Andrew
2008-01-01
Clusters or groups of individuals are the fundamental unit of observation in many wildlife sampling problems, including aerial surveys of waterfowl, marine mammals, and ungulates. Explicit accounting of cluster size in models for estimating abundance is necessary because detection of individuals within clusters is not independent and detectability of clusters is likely to increase with cluster size. This induces a cluster size bias in which the average cluster size in the sample is larger than in the population at large. Thus, failure to account for the relationship between delectability and cluster size will tend to yield a positive bias in estimates of abundance or density. I describe a hierarchical modeling framework for accounting for cluster-size bias in animal sampling. The hierarchical model consists of models for the observation process conditional on the cluster size distribution and the cluster size distribution conditional on the total number of clusters. Optionally, a spatial model can be specified that describes variation in the total number of clusters per sample unit. Parameter estimation, model selection, and criticism may be carried out using conventional likelihood-based methods. An extension of the model is described for the situation where measurable covariates at the level of the sample unit are available. Several candidate models within the proposed class are evaluated for aerial survey data on mallard ducks (Anas platyrhynchos).
Estimating fluvial wood discharge from timelapse photography with varying sampling intervals
Anderson, N. K.
2013-12-01
There is recent focus on calculating wood budgets for streams and rivers to help inform management decisions, ecological studies and carbon/nutrient cycling models. Most work has measured in situ wood in temporary storage along stream banks or estimated wood inputs from banks. Little effort has been employed monitoring and quantifying wood in transport during high flows. This paper outlines a procedure for estimating total seasonal wood loads using non-continuous coarse interval sampling and examines differences in estimation between sampling at 1, 5, 10 and 15 minutes. Analysis is performed on wood transport for the Slave River in Northwest Territories, Canada. Relative to the 1 minute dataset, precision decreased by 23%, 46% and 60% for the 5, 10 and 15 minute datasets, respectively. Five and 10 minute sampling intervals provided unbiased equal variance estimates of 1 minute sampling, whereas 15 minute intervals were biased towards underestimation by 6%. Stratifying estimates by day and by discharge increased precision over non-stratification by 4% and 3%, respectively. Not including wood transported during ice break-up, the total minimum wood load estimated at this site is 3300 × 800$ m3 for the 2012 runoff season. The vast majority of the imprecision in total wood volumes came from variance in estimating average volume per log. Comparison of proportions and variance across sample intervals using bootstrap sampling to achieve equal n. Each trial was sampled for n=100, 10,000 times and averaged. All trials were then averaged to obtain an estimate for each sample interval. Dashed lines represent values from the one minute dataset.
Sample sizes and model comparison metrics for species distribution models
B.B. Hanberry; H.S. He; D.C. Dey
2012-01-01
Species distribution models use small samples to produce continuous distribution maps. The question of how small a sample can be to produce an accurate model generally has been answered based on comparisons to maximum sample sizes of 200 observations or fewer. In addition, model comparisons often are made with the kappa statistic, which has become controversial....
Optimum strata boundaries and sample sizes in health surveys using auxiliary variables.
Reddy, Karuna Garan; Khan, Mohammad G M; Khan, Sabiha
2018-01-01
Using convenient stratification criteria such as geographical regions or other natural conditions like age, gender, etc., is not beneficial in order to maximize the precision of the estimates of variables of interest. Thus, one has to look for an efficient stratification design to divide the whole population into homogeneous strata that achieves higher precision in the estimation. In this paper, a procedure for determining Optimum Stratum Boundaries (OSB) and Optimum Sample Sizes (OSS) for each stratum of a variable of interest in health surveys is developed. The determination of OSB and OSS based on the study variable is not feasible in practice since the study variable is not available prior to the survey. Since many variables in health surveys are generally skewed, the proposed technique considers the readily-available auxiliary variables to determine the OSB and OSS. This stratification problem is formulated into a Mathematical Programming Problem (MPP) that seeks minimization of the variance of the estimated population parameter under Neyman allocation. It is then solved for the OSB by using a dynamic programming (DP) technique. A numerical example with a real data set of a population, aiming to estimate the Haemoglobin content in women in a national Iron Deficiency Anaemia survey, is presented to illustrate the procedure developed in this paper. Upon comparisons with other methods available in literature, results reveal that the proposed approach yields a substantial gain in efficiency over the other methods. A simulation study also reveals similar results.
Fraley, R. Chris; Vazire, Simine
2014-01-01
The authors evaluate the quality of research reported in major journals in social-personality psychology by ranking those journals with respect to their N-pact Factors (NF)—the statistical power of the empirical studies they publish to detect typical effect sizes. Power is a particularly important attribute for evaluating research quality because, relative to studies that have low power, studies that have high power are more likely to (a) to provide accurate estimates of effects, (b) to produce literatures with low false positive rates, and (c) to lead to replicable findings. The authors show that the average sample size in social-personality research is 104 and that the power to detect the typical effect size in the field is approximately 50%. Moreover, they show that there is considerable variation among journals in sample sizes and power of the studies they publish, with some journals consistently publishing higher power studies than others. The authors hope that these rankings will be of use to authors who are choosing where to submit their best work, provide hiring and promotion committees with a superior way of quantifying journal quality, and encourage competition among journals to improve their NF rankings. PMID:25296159
Sample size determination for disease prevalence studies with partially validated data.
Qiu, Shi-Fang; Poon, Wai-Yin; Tang, Man-Lai
2016-02-01
Disease prevalence is an important topic in medical research, and its study is based on data that are obtained by classifying subjects according to whether a disease has been contracted. Classification can be conducted with high-cost gold standard tests or low-cost screening tests, but the latter are subject to the misclassification of subjects. As a compromise between the two, many research studies use partially validated datasets in which all data points are classified by fallible tests, and some of the data points are validated in the sense that they are also classified by the completely accurate gold-standard test. In this article, we investigate the determination of sample sizes for disease prevalence studies with partially validated data. We use two approaches. The first is to find sample sizes that can achieve a pre-specified power of a statistical test at a chosen significance level, and the second is to find sample sizes that can control the width of a confidence interval with a pre-specified confidence level. Empirical studies have been conducted to demonstrate the performance of various testing procedures with the proposed sample sizes. The applicability of the proposed methods are illustrated by a real-data example. © The Author(s) 2012.
Optimal Sample Size for Probability of Detection Curves
International Nuclear Information System (INIS)
Annis, Charles; Gandossi, Luca; Martin, Oliver
2012-01-01
The use of Probability of Detection (POD) curves to quantify NDT reliability is common in the aeronautical industry, but relatively less so in the nuclear industry. The European Network for Inspection Qualification's (ENIQ) Inspection Qualification Methodology is based on the concept of Technical Justification, a document assembling all the evidence to assure that the NDT system in focus is indeed capable of finding the flaws for which it was designed. This methodology has become widely used in many countries, but the assurance it provides is usually of qualitative nature. The need to quantify the output of inspection qualification has become more important, especially as structural reliability modelling and quantitative risk-informed in-service inspection methodologies become more widely used. To credit the inspections in structural reliability evaluations, a measure of the NDT reliability is necessary. A POD curve provides such metric. In 2010 ENIQ developed a technical report on POD curves, reviewing the statistical models used to quantify inspection reliability. Further work was subsequently carried out to investigate the issue of optimal sample size for deriving a POD curve, so that adequate guidance could be given to the practitioners of inspection reliability. Manufacturing of test pieces with cracks that are representative of real defects found in nuclear power plants (NPP) can be very expensive. Thus there is a tendency to reduce sample sizes and in turn reduce the conservatism associated with the POD curve derived. Not much guidance on the correct sample size can be found in the published literature, where often qualitative statements are given with no further justification. The aim of this paper is to summarise the findings of such work. (author)
Shieh, Gwowen
2013-01-01
The a priori determination of a proper sample size necessary to achieve some specified power is an important problem encountered frequently in practical studies. To establish the needed sample size for a two-sample "t" test, researchers may conduct the power analysis by specifying scientifically important values as the underlying population means…
Sampling the Mouse Hippocampal Dentate Gyrus
Directory of Open Access Journals (Sweden)
Lisa Basler
2017-12-01
Full Text Available Sampling is a critical step in procedures that generate quantitative morphological data in the neurosciences. Samples need to be representative to allow statistical evaluations, and samples need to deliver a precision that makes statistical evaluations not only possible but also meaningful. Sampling generated variability should, e.g., not be able to hide significant group differences from statistical detection if they are present. Estimators of the coefficient of error (CE have been developed to provide tentative answers to the question if sampling has been “good enough” to provide meaningful statistical outcomes. We tested the performance of the commonly used Gundersen-Jensen CE estimator, using the layers of the mouse hippocampal dentate gyrus as an example (molecular layer, granule cell layer and hilus. We found that this estimator provided useful estimates of the precision that can be expected from samples of different sizes. For all layers, we found that a smoothness factor (m of 0 generally provided better estimates than an m of 1. Only for the combined layers, i.e., the entire dentate gyrus, better CE estimates could be obtained using an m of 1. The orientation of the sections impacted on CE sizes. Frontal (coronal sections are typically most efficient by providing the smallest CEs for a given amount of work. Applying the estimator to 3D-reconstructed layers and using very intense sampling, we observed CE size plots with m = 0 to m = 1 transitions that should also be expected but are not often observed in real section series. The data we present also allows the reader to approximate the sampling intervals in frontal, horizontal or sagittal sections that provide CEs of specified sizes for the layers of the mouse dentate gyrus.
Sevelius, Jae M.
2017-01-01
Background. Transgender individuals have a gender identity that differs from the sex they were assigned at birth. The population size of transgender individuals in the United States is not well-known, in part because official records, including the US Census, do not include data on gender identity. Population surveys today more often collect transgender-inclusive gender-identity data, and secular trends in culture and the media have created a somewhat more favorable environment for transgender people. Objectives. To estimate the current population size of transgender individuals in the United States and evaluate any trend over time. Search methods. In June and July 2016, we searched PubMed, Cumulative Index to Nursing and Allied Health Literature, and Web of Science for national surveys, as well as “gray” literature, through an Internet search. We limited the search to 2006 through 2016. Selection criteria. We selected population-based surveys that used probability sampling and included self-reported transgender-identity data. Data collection and analysis. We used random-effects meta-analysis to pool eligible surveys and used meta-regression to address our hypothesis that the transgender population size estimate would increase over time. We used subsample and leave-one-out analysis to assess for bias. Main results. Our meta-regression model, based on 12 surveys covering 2007 to 2015, explained 62.5% of model heterogeneity, with a significant effect for each unit increase in survey year (F = 17.122; df = 1,10; b = 0.026%; P = .002). Extrapolating these results to 2016 suggested a current US population size of 390 adults per 100 000, or almost 1 million adults nationally. This estimate may be more indicative for younger adults, who represented more than 50% of the respondents in our analysis. Authors’ conclusions. Future national surveys are likely to observe higher numbers of transgender people. The large variety in questions used to ask
Bound on the estimation grid size for sparse reconstruction in direction of arrival estimation
Coutiño Minguez, M.A.; Pribic, R; Leus, G.J.T.
2016-01-01
A bound for sparse reconstruction involving both the signal-to-noise ratio (SNR) and the estimation grid size is presented. The bound is illustrated for the case of a uniform linear array (ULA). By reducing the number of possible sparse vectors present in the feasible set of a constrained ℓ1-norm
Reliability of fish size estimates obtained from multibeam imaging sonar
Hightower, Joseph E.; Magowan, Kevin J.; Brown, Lori M.; Fox, Dewayne A.
2013-01-01
Multibeam imaging sonars have considerable potential for use in fisheries surveys because the video-like images are easy to interpret, and they contain information about fish size, shape, and swimming behavior, as well as characteristics of occupied habitats. We examined images obtained using a dual-frequency identification sonar (DIDSON) multibeam sonar for Atlantic sturgeon Acipenser oxyrinchus oxyrinchus, striped bass Morone saxatilis, white perch M. americana, and channel catfish Ictalurus punctatus of known size (20–141 cm) to determine the reliability of length estimates. For ranges up to 11 m, percent measurement error (sonar estimate – total length)/total length × 100 varied by species but was not related to the fish's range or aspect angle (orientation relative to the sonar beam). Least-square mean percent error was significantly different from 0.0 for Atlantic sturgeon (x̄ = −8.34, SE = 2.39) and white perch (x̄ = 14.48, SE = 3.99) but not striped bass (x̄ = 3.71, SE = 2.58) or channel catfish (x̄ = 3.97, SE = 5.16). Underestimating lengths of Atlantic sturgeon may be due to difficulty in detecting the snout or the longer dorsal lobe of the heterocercal tail. White perch was the smallest species tested, and it had the largest percent measurement errors (both positive and negative) and the lowest percentage of images classified as good or acceptable. Automated length estimates for the four species using Echoview software varied with position in the view-field. Estimates tended to be low at more extreme azimuthal angles (fish's angle off-axis within the view-field), but mean and maximum estimates were highly correlated with total length. Software estimates also were biased by fish images partially outside the view-field and when acoustic crosstalk occurred (when a fish perpendicular to the sonar and at relatively close range is detected in the side lobes of adjacent beams). These sources of
Shah, R; Worner, S P; Chapman, R B
2012-10-01
Pesticide resistance monitoring includes resistance detection and subsequent documentation/ measurement. Resistance detection would require at least one (≥1) resistant individual(s) to be present in a sample to initiate management strategies. Resistance documentation, on the other hand, would attempt to get an estimate of the entire population (≥90%) of the resistant individuals. A computer simulation model was used to compare the efficiency of simple random and systematic sampling plans to detect resistant individuals and to document their frequencies when the resistant individuals were randomly or patchily distributed. A patchy dispersion pattern of resistant individuals influenced the sampling efficiency of systematic sampling plans while the efficiency of random sampling was independent of such patchiness. When resistant individuals were randomly distributed, sample sizes required to detect at least one resistant individual (resistance detection) with a probability of 0.95 were 300 (1%) and 50 (10% and 20%); whereas, when resistant individuals were patchily distributed, using systematic sampling, sample sizes required for such detection were 6000 (1%), 600 (10%) and 300 (20%). Sample sizes of 900 and 400 would be required to detect ≥90% of resistant individuals (resistance documentation) with a probability of 0.95 when resistant individuals were randomly dispersed and present at a frequency of 10% and 20%, respectively; whereas, when resistant individuals were patchily distributed, using systematic sampling, a sample size of 3000 and 1500, respectively, was necessary. Small sample sizes either underestimated or overestimated the resistance frequency. A simple random sampling plan is, therefore, recommended for insecticide resistance detection and subsequent documentation.
What is the optimum sample size for the study of peatland testate amoeba assemblages?
Mazei, Yuri A; Tsyganov, Andrey N; Esaulov, Anton S; Tychkov, Alexander Yu; Payne, Richard J
2017-10-01
Testate amoebae are widely used in ecological and palaeoecological studies of peatlands, particularly as indicators of surface wetness. To ensure data are robust and comparable it is important to consider methodological factors which may affect results. One significant question which has not been directly addressed in previous studies is how sample size (expressed here as number of Sphagnum stems) affects data quality. In three contrasting locations in a Russian peatland we extracted samples of differing size, analysed testate amoebae and calculated a number of widely-used indices: species richness, Simpson diversity, compositional dissimilarity from the largest sample and transfer function predictions of water table depth. We found that there was a trend for larger samples to contain more species across the range of commonly-used sample sizes in ecological studies. Smaller samples sometimes failed to produce counts of testate amoebae often considered minimally adequate. It seems likely that analyses based on samples of different sizes may not produce consistent data. Decisions about sample size need to reflect trade-offs between logistics, data quality, spatial resolution and the disturbance involved in sample extraction. For most common ecological applications we suggest that samples of more than eight Sphagnum stems are likely to be desirable. Copyright © 2017 Elsevier GmbH. All rights reserved.
[Sample size calculation in clinical post-marketing evaluation of traditional Chinese medicine].
Fu, Yingkun; Xie, Yanming
2011-10-01
In recent years, as the Chinese government and people pay more attention on the post-marketing research of Chinese Medicine, part of traditional Chinese medicine breed has or is about to begin after the listing of post-marketing evaluation study. In the post-marketing evaluation design, sample size calculation plays a decisive role. It not only ensures the accuracy and reliability of post-marketing evaluation. but also assures that the intended trials will have a desired power for correctly detecting a clinically meaningful difference of different medicine under study if such a difference truly exists. Up to now, there is no systemic method of sample size calculation in view of the traditional Chinese medicine. In this paper, according to the basic method of sample size calculation and the characteristic of the traditional Chinese medicine clinical evaluation, the sample size calculation methods of the Chinese medicine efficacy and safety are discussed respectively. We hope the paper would be beneficial to medical researchers, and pharmaceutical scientists who are engaged in the areas of Chinese medicine research.
Estimating abundance of mountain lions from unstructured spatial sampling
Russell, Robin E.; Royle, J. Andrew; Desimone, Richard; Schwartz, Michael K.; Edwards, Victoria L.; Pilgrim, Kristy P.; Mckelvey, Kevin S.
2012-01-01
Mountain lions (Puma concolor) are often difficult to monitor because of their low capture probabilities, extensive movements, and large territories. Methods for estimating the abundance of this species are needed to assess population status, determine harvest levels, evaluate the impacts of management actions on populations, and derive conservation and management strategies. Traditional mark–recapture methods do not explicitly account for differences in individual capture probabilities due to the spatial distribution of individuals in relation to survey effort (or trap locations). However, recent advances in the analysis of capture–recapture data have produced methods estimating abundance and density of animals from spatially explicit capture–recapture data that account for heterogeneity in capture probabilities due to the spatial organization of individuals and traps. We adapt recently developed spatial capture–recapture models to estimate density and abundance of mountain lions in western Montana. Volunteers and state agency personnel collected mountain lion DNA samples in portions of the Blackfoot drainage (7,908 km2) in west-central Montana using 2 methods: snow back-tracking mountain lion tracks to collect hair samples and biopsy darting treed mountain lions to obtain tissue samples. Overall, we recorded 72 individual capture events, including captures both with and without tissue sample collection and hair samples resulting in the identification of 50 individual mountain lions (30 females, 19 males, and 1 unknown sex individual). We estimated lion densities from 8 models containing effects of distance, sex, and survey effort on detection probability. Our population density estimates ranged from a minimum of 3.7 mountain lions/100 km2 (95% Cl 2.3–5.7) under the distance only model (including only an effect of distance on detection probability) to 6.7 (95% Cl 3.1–11.0) under the full model (including effects of distance, sex, survey effort, and
International Nuclear Information System (INIS)
Haven, Kyle; Majda, Andrew; Abramov, Rafail
2005-01-01
Many situations in complex systems require quantitative estimates of the lack of information in one probability distribution relative to another. In short term climate and weather prediction, examples of these issues might involve the lack of information in the historical climate record compared with an ensemble prediction, or the lack of information in a particular Gaussian ensemble prediction strategy involving the first and second moments compared with the non-Gaussian ensemble itself. The relative entropy is a natural way to quantify the predictive utility in this information, and recently a systematic computationally feasible hierarchical framework has been developed. In practical systems with many degrees of freedom, computational overhead limits ensemble predictions to relatively small sample sizes. Here the notion of predictive utility, in a relative entropy framework, is extended to small random samples by the definition of a sample utility, a measure of the unlikeliness that a random sample was produced by a given prediction strategy. The sample utility is the minimum predictability, with a statistical level of confidence, which is implied by the data. Two practical algorithms for measuring such a sample utility are developed here. The first technique is based on the statistical method of null-hypothesis testing, while the second is based upon a central limit theorem for the relative entropy of moment-based probability densities. These techniques are tested on known probability densities with parameterized bimodality and skewness, and then applied to the Lorenz '96 model, a recently developed 'toy' climate model with chaotic dynamics mimicking the atmosphere. The results show a detection of non-Gaussian tendencies of prediction densities at small ensemble sizes with between 50 and 100 members, with a 95% confidence level
Final Report: Sampling-Based Algorithms for Estimating Structure in Big Data.
Energy Technology Data Exchange (ETDEWEB)
Matulef, Kevin Michael [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
2017-02-01
The purpose of this project was to develop sampling-based algorithms to discover hidden struc- ture in massive data sets. Inferring structure in large data sets is an increasingly common task in many critical national security applications. These data sets come from myriad sources, such as network traffic, sensor data, and data generated by large-scale simulations. They are often so large that traditional data mining techniques are time consuming or even infeasible. To address this problem, we focus on a class of algorithms that do not compute an exact answer, but instead use sampling to compute an approximate answer using fewer resources. The particular class of algorithms that we focus on are streaming algorithms , so called because they are designed to handle high-throughput streams of data. Streaming algorithms have only a small amount of working storage - much less than the size of the full data stream - so they must necessarily use sampling to approximate the correct answer. We present two results: * A streaming algorithm called HyperHeadTail , that estimates the degree distribution of a graph (i.e., the distribution of the number of connections for each node in a network). The degree distribution is a fundamental graph property, but prior work on estimating the degree distribution in a streaming setting was impractical for many real-world application. We improve upon prior work by developing an algorithm that can handle streams with repeated edges, and graph structures that evolve over time. * An algorithm for the task of maintaining a weighted subsample of items in a stream, when the items must be sampled according to their weight, and the weights are dynamically changing. To our knowledge, this is the first such algorithm designed for dynamically evolving weights. We expect it may be useful as a building block for other streaming algorithms on dynamic data sets.
Directory of Open Access Journals (Sweden)
Franciane Andrade de PÁDUA
2015-06-01
Full Text Available As diversas formas de se amostrar a madeira para o estudo de suas propriedades levam em consideração a acurácia, o tempo e o custo de processamento e coleta do material. No entanto, a forma e intensidade da amostragem considerada pode não captar corretamente a variabilidade dessas propriedades ou até mesmo negligenciá-la. O objetivo deste trabalho foi estimar o número de árvores necessárias para a estimativa da densidade básica média da árvore em um clone de híbrido de Eucalyptus urophylla x Eucalyptus grandis considerando diferentes formas de amostragem e classes de diâmetro. Foram utilizadas 50 árvores de um clone do hibrido, aos 5,6 anos. As árvores foram distribuídas em três classes de diâmetro e amostradas na forma de discos, a partir de três propostas: tradicional (0%, 25%, 50%,75% e 100% da altura comercial Hc; alternativa (2%, 10%, 30% e 70% Hc e de metro em metro a partir do DAP. Não houve diferença entre o número de árvores requeridas para a estimativa da densidade do clone por forma de amostragem, admitindo-se um erro de 5% e intervalo de confiança de 95%. A amostragem alternativa foi a mais eficiente considerando a intensidade da amostragem no tronco e o coeficiente de variação. A classificação diamétrica resultou em um número maior de árvores para estimar a densidade média, em função da maior variação da propriedade dentro de classes do que dentro do método de amostragem. There are several methods of collecting wood samples for the study of their properties, which consider the accuracy, time and cost of collecting and processing the material. However, often the variation pattern of ownership in the tree is neglected. Depending on the shape and size of the sample in the study the variability of the properties of the wood cannot be properly captured. The aim of this study was to estimate the number of trees needed to estimate the average basic density of the tree in a Eucalyptus urophylla x
Determining sample size for assessing species composition in ...
African Journals Online (AJOL)
Species composition is measured in grasslands for a variety of reasons. Commonly, observations are made using the wheel-point apparatus, but the problem of determining optimum sample size has not yet been satisfactorily resolved. In this study the wheel-point apparatus was used to record 2 000 observations in each of ...
Performances Of Estimators Of Linear Models With Autocorrelated ...
African Journals Online (AJOL)
The performances of five estimators of linear models with Autocorrelated error terms are compared when the independent variable is autoregressive. The results reveal that the properties of the estimators when the sample size is finite is quite similar to the properties of the estimators when the sample size is infinite although ...
Spatial pattern corrections and sample sizes for forest density estimates of historical tree surveys
Brice B. Hanberry; Shawn Fraver; Hong S. He; Jian Yang; Dan C. Dey; Brian J. Palik
2011-01-01
The U.S. General Land Office land surveys document trees present during European settlement. However, use of these surveys for calculating historical forest density and other derived metrics is limited by uncertainty about the performance of plotless density estimators under a range of conditions. Therefore, we tested two plotless density estimators, developed by...
Hedt-Gauthier, Bethany L; Mitsunaga, Tisha; Hund, Lauren; Olives, Casey; Pagano, Marcello
2013-10-26
Traditional Lot Quality Assurance Sampling (LQAS) designs assume observations are collected using simple random sampling. Alternatively, randomly sampling clusters of observations and then individuals within clusters reduces costs but decreases the precision of the classifications. In this paper, we develop a general framework for designing the cluster(C)-LQAS system and illustrate the method with the design of data quality assessments for the community health worker program in Rwanda. To determine sample size and decision rules for C-LQAS, we use the beta-binomial distribution to account for inflated risk of errors introduced by sampling clusters at the first stage. We present general theory and code for sample size calculations.The C-LQAS sample sizes provided in this paper constrain misclassification risks below user-specified limits. Multiple C-LQAS systems meet the specified risk requirements, but numerous considerations, including per-cluster versus per-individual sampling costs, help identify optimal systems for distinct applications. We show the utility of C-LQAS for data quality assessments, but the method generalizes to numerous applications. This paper provides the necessary technical detail and supplemental code to support the design of C-LQAS for specific programs.
An alternative procedure for estimating the population mean in simple random sampling
Directory of Open Access Journals (Sweden)
Housila P. Singh
2012-03-01
Full Text Available This paper deals with the problem of estimating the finite population mean using auxiliary information in simple random sampling. Firstly we have suggested a correction to the mean squared error of the estimator proposed by Gupta and Shabbir [On improvement in estimating the population mean in simple random sampling. Jour. Appl. Statist. 35(5 (2008, pp. 559-566]. Later we have proposed a ratio type estimator and its properties are studied in simple random sampling. Numerically we have shown that the proposed class of estimators is more efficient than different known estimators including Gupta and Shabbir (2008 estimator.
Gould, Matthew J.; Cain, James W.; Roemer, Gary W.; Gould, William R.
2016-01-01
During the 2004–2005 to 2015–2016 hunting seasons, the New Mexico Department of Game and Fish (NMDGF) estimated black bear abundance (Ursus americanus) across the state by coupling density estimates with the distribution of primary habitat generated by Costello et al. (2001). These estimates have been used to set harvest limits. For example, a density of 17 bears/100 km2 for the Sangre de Cristo and Sacramento Mountains and 13.2 bears/100 km2 for the Sandia Mountains were used to set harvest levels. The advancement and widespread acceptance of non-invasive sampling and mark-recapture methods, prompted the NMDGF to collaborate with the New Mexico Cooperative Fish and Wildlife Research Unit and New Mexico State University to update their density estimates for black bear populations in select mountain ranges across the state.We established 5 study areas in 3 mountain ranges: the northern (NSC; sampled in 2012) and southern Sangre de Cristo Mountains (SSC; sampled in 2013), the Sandia Mountains (Sandias; sampled in 2014), and the northern (NSacs) and southern Sacramento Mountains (SSacs; both sampled in 2014). We collected hair samples from black bears using two concurrent non-invasive sampling methods, hair traps and bear rubs. We used a gender marker and a suite of microsatellite loci to determine the individual identification of hair samples that were suitable for genetic analysis. We used these data to generate mark-recapture encounter histories for each bear and estimated density in a spatially explicit capture-recapture framework (SECR). We constructed a suite of SECR candidate models using sex, elevation, land cover type, and time to model heterogeneity in detection probability and the spatial scale over which detection probability declines. We used Akaike’s Information Criterion corrected for small sample size (AICc) to rank and select the most supported model from which we estimated density.We set 554 hair traps, 117 bear rubs and collected 4,083 hair
Chaudhuri, Arijit
2014-01-01
Exposure to SamplingAbstract Introduction Concepts of Population, Sample, and SamplingInitial RamificationsAbstract Introduction Sampling Design, Sampling SchemeRandom Numbers and Their Uses in Simple RandomSampling (SRS)Drawing Simple Random Samples with and withoutReplacementEstimation of Mean, Total, Ratio of Totals/Means:Variance and Variance EstimationDetermination of Sample SizesA.2 Appendix to Chapter 2 A.More on Equal Probability Sampling A.Horvitz-Thompson EstimatorA.SufficiencyA.LikelihoodA.Non-Existence Theorem More Intricacies Abstract Introduction Unequal Probability Sampling StrategiesPPS Sampling Exploring Improved WaysAbstract Introduction Stratified Sampling Cluster SamplingMulti-Stage SamplingMulti-Phase Sampling: Ratio and RegressionEstimationviiviii ContentsControlled SamplingModeling Introduction Super-Population ModelingPrediction Approach Model-Assisted Approach Bayesian Methods Spatial SmoothingSampling on Successive Occasions: Panel Rotation Non-Response and Not-at-Homes Weighting Adj...
Mevik, Kjersti; Griffin, Frances A; Hansen, Tonje E; Deilkås, Ellen T; Vonen, Barthold
2016-04-25
To investigate the impact of increasing sample of records reviewed bi-weekly with the Global Trigger Tool method to identify adverse events in hospitalised patients. Retrospective observational study. A Norwegian 524-bed general hospital trust. 1920 medical records selected from 1 January to 31 December 2010. Rate, type and severity of adverse events identified in two different samples sizes of records selected as 10 and 70 records, bi-weekly. In the large sample, 1.45 (95% CI 1.07 to 1.97) times more adverse events per 1000 patient days (39.3 adverse events/1000 patient days) were identified than in the small sample (27.2 adverse events/1000 patient days). Hospital-acquired infections were the most common category of adverse events in both the samples, and the distributions of the other categories of adverse events did not differ significantly between the samples. The distribution of severity level of adverse events did not differ between the samples. The findings suggest that while the distribution of categories and severity are not dependent on the sample size, the rate of adverse events is. Further studies are needed to conclude if the optimal sample size may need to be adjusted based on the hospital size in order to detect a more accurate rate of adverse events. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://www.bmj.com/company/products-services/rights-and-licensing/
International Nuclear Information System (INIS)
Akram, M.; Aftab, F.
2016-01-01
In the present study, fruits (drupes) were collected from Changa Manga Forest Plus Trees (CMF-PT), Changa Manga Forest Teak Stand (CMF-TS) and Punjab University Botanical Gardens (PUBG) and categorized into very large (= 17 mm dia.), large (12-16 mm dia.), medium (9-11 mm dia.) or small (6-8 mm dia.) fruit size grades. Fresh water as well as mechanical scarification and stratification were tested for breaking seed dormancy. Viability status of seeds was estimated by cutting test, X-rays and In vitro seed germination. Out of 2595 fruits from CMF-PT, 500 fruits were of very large grade. This fruit category also had highest individual fruit weight (0.58 g) with more number of 4-seeded fruits (5.29 percent) and fair germination potential (35.32 percent). Generally, most of the fruits were 1-seeded irrespective of size grades and sampling sites. Fresh water scarification had strong effect on germination (44.30 percent) as compared to mechanical scarification and cold stratification after 40 days of sowing. Similarly, sampling sites and fruit size grades also had significant influence on germination. Highest germination (82.33 percent) was obtained on MS (Murashige and Skoog) agar-solidified medium as compared to Woody Plant Medium (WPM) (69.22 percent). Seedlings from all the media were transferred to ex vitro conditions in the greenhouse and achieved highest survival (28.6 percent) from seedlings previously raised on MS agar-solidified medium after 40 days. There was an association between the studied parameters of teak seeds and the sampling sites and fruit size. (author)
Lachin, John M; McGee, Paula L; Greenbaum, Carla J; Palmer, Jerry; Pescovitz, Mark D; Gottlieb, Peter; Skyler, Jay
2011-01-01
Preservation of β-cell function as measured by stimulated C-peptide has recently been accepted as a therapeutic target for subjects with newly diagnosed type 1 diabetes. In recently completed studies conducted by the Type 1 Diabetes Trial Network (TrialNet), repeated 2-hour Mixed Meal Tolerance Tests (MMTT) were obtained for up to 24 months from 156 subjects with up to 3 months duration of type 1 diabetes at the time of study enrollment. These data provide the information needed to more accurately determine the sample size needed for future studies of the effects of new agents on the 2-hour area under the curve (AUC) of the C-peptide values. The natural log(x), log(x+1) and square-root (√x) transformations of the AUC were assessed. In general, a transformation of the data is needed to better satisfy the normality assumptions for commonly used statistical tests. Statistical analysis of the raw and transformed data are provided to estimate the mean levels over time and the residual variation in untreated subjects that allow sample size calculations for future studies at either 12 or 24 months of follow-up and among children 8-12 years of age, adolescents (13-17 years) and adults (18+ years). The sample size needed to detect a given relative (percentage) difference with treatment versus control is greater at 24 months than at 12 months of follow-up, and differs among age categories. Owing to greater residual variation among those 13-17 years of age, a larger sample size is required for this age group. Methods are also described for assessment of sample size for mixtures of subjects among the age categories. Statistical expressions are presented for the presentation of analyses of log(x+1) and √x transformed values in terms of the original units of measurement (pmol/ml). Analyses using different transformations are described for the TrialNet study of masked anti-CD20 (rituximab) versus masked placebo. These results provide the information needed to accurately
Directory of Open Access Journals (Sweden)
John M Lachin
Full Text Available Preservation of β-cell function as measured by stimulated C-peptide has recently been accepted as a therapeutic target for subjects with newly diagnosed type 1 diabetes. In recently completed studies conducted by the Type 1 Diabetes Trial Network (TrialNet, repeated 2-hour Mixed Meal Tolerance Tests (MMTT were obtained for up to 24 months from 156 subjects with up to 3 months duration of type 1 diabetes at the time of study enrollment. These data provide the information needed to more accurately determine the sample size needed for future studies of the effects of new agents on the 2-hour area under the curve (AUC of the C-peptide values. The natural log(x, log(x+1 and square-root (√x transformations of the AUC were assessed. In general, a transformation of the data is needed to better satisfy the normality assumptions for commonly used statistical tests. Statistical analysis of the raw and transformed data are provided to estimate the mean levels over time and the residual variation in untreated subjects that allow sample size calculations for future studies at either 12 or 24 months of follow-up and among children 8-12 years of age, adolescents (13-17 years and adults (18+ years. The sample size needed to detect a given relative (percentage difference with treatment versus control is greater at 24 months than at 12 months of follow-up, and differs among age categories. Owing to greater residual variation among those 13-17 years of age, a larger sample size is required for this age group. Methods are also described for assessment of sample size for mixtures of subjects among the age categories. Statistical expressions are presented for the presentation of analyses of log(x+1 and √x transformed values in terms of the original units of measurement (pmol/ml. Analyses using different transformations are described for the TrialNet study of masked anti-CD20 (rituximab versus masked placebo. These results provide the information needed to
Chaudry, Beenish Moalla; Connelly, Kay; Siek, Katie A; Welch, Janet L
2013-12-01
Chronically ill people, especially those with low literacy skills, often have difficulty estimating portion sizes of liquids to help them stay within their recommended fluid limits. There is a plethora of mobile applications that can help people monitor their nutritional intake but unfortunately these applications require the user to have high literacy and numeracy skills for portion size recording. In this paper, we present two studies in which the low- and the high-fidelity versions of a portion size estimation interface, designed using the cognitive strategies adults employ for portion size estimation during diet recall studies, was evaluated by a chronically ill population with varying literacy skills. The low fidelity interface was evaluated by ten patients who were all able to accurately estimate portion sizes of various liquids with the interface. Eighteen participants did an in situ evaluation of the high-fidelity version incorporated in a diet and fluid monitoring mobile application for 6 weeks. Although the accuracy of the estimation cannot be confirmed in the second study but the participants who actively interacted with the interface showed better health outcomes by the end of the study. Based on these findings, we provide recommendations for designing the next iteration of an accurate and low literacy-accessible liquid portion size estimation mobile interface.
Predictors of Citation Rate in Psychology: Inconclusive Influence of Effect and Sample Size.
Hanel, Paul H P; Haase, Jennifer
2017-01-01
In the present article, we investigate predictors of how often a scientific article is cited. Specifically, we focus on the influence of two often neglected predictors of citation rate: effect size and sample size, using samples from two psychological topical areas. Both can be considered as indicators of the importance of an article and post hoc (or observed) statistical power, and should, especially in applied fields, predict citation rates. In Study 1, effect size did not have an influence on citation rates across a topical area, both with and without controlling for numerous variables that have been previously linked to citation rates. In contrast, sample size predicted citation rates, but only while controlling for other variables. In Study 2, sample and partly effect sizes predicted citation rates, indicating that the relations vary even between scientific topical areas. Statistically significant results had more citations in Study 2 but not in Study 1. The results indicate that the importance (or power) of scientific findings may not be as strongly related to citation rate as is generally assumed.
Nelson, M; Atkinson, M; Darbyshire, S
1996-07-01
The aim of the present study was to determine the errors in the conceptualization of portion size using photographs. Male and female volunteers aged 18-90 years (n 136) from a wide variety of social and occupational backgrounds completed 602 assessments of portion size in relation to food photographs. Subjects served themselves between four and six foods at one meal (breakfast, lunch or dinner). Portion sizes were weighed by the investigators at the time of serving, and any waste was weighed at the end of the meal. Within 5 min of the end of the meal, subjects were shown photographs depicting each of the foods just consumed. For each food there were eight photographs showing portion sizes in equal increments from the 5th to the 95th centile of the distribution of portion weights observed in The Dietary and Nutritional Survey of British Adults (Gregory et al. 1990). Subjects were asked to indicate on a visual analogue scale the size of the portion consumed in relation to the eight photographs. The nutrient contents of meals were estimated from food composition tables. There were large variations in the estimation of portion sizes from photographs. Butter and margarine portion sizes tended to be substantially overestimated. In general, small portion sizes tended to be overestimated, and large portion sizes underestimated. Older subjects overestimated portion size more often than younger subjects. Excluding butter and margarine, the nutrient content of meals based on estimated portion sizes was on average within +/- 7% of the nutrient content based on the amounts consumed, except for vitamin C (21% overestimate), and for subjects over 65 years (15-20% overestimate for energy and fat). In subjects whose BMI was less than 25 kg/m2, the energy and fat contents of meals calculated from food composition tables and based on estimated portion size (excluding butter and margarine) were 5-10% greater than the nutrient content calculated using actual portion size, but for those
Why liquid displacement methods are sometimes wrong in estimating the pore-size distribution
Gijsbertsen-Abrahamse, A.J.; Boom, R.M.; Padt, van der A.
2004-01-01
The liquid displacement method is a commonly used method to determine the pore size distribution of micro- and ultrafiltration membranes. One of the assumptions for the calculation of the pore sizes is that the pores are parallel and thus are not interconnected. To show that the estimated pore size
Comparison of sampling techniques for Bayesian parameter estimation
Allison, Rupert; Dunkley, Joanna
2014-02-01
The posterior probability distribution for a set of model parameters encodes all that the data have to tell us in the context of a given model; it is the fundamental quantity for Bayesian parameter estimation. In order to infer the posterior probability distribution we have to decide how to explore parameter space. Here we compare three prescriptions for how parameter space is navigated, discussing their relative merits. We consider Metropolis-Hasting sampling, nested sampling and affine-invariant ensemble Markov chain Monte Carlo (MCMC) sampling. We focus on their performance on toy-model Gaussian likelihoods and on a real-world cosmological data set. We outline the sampling algorithms themselves and elaborate on performance diagnostics such as convergence time, scope for parallelization, dimensional scaling, requisite tunings and suitability for non-Gaussian distributions. We find that nested sampling delivers high-fidelity estimates for posterior statistics at low computational cost, and should be adopted in favour of Metropolis-Hastings in many cases. Affine-invariant MCMC is competitive when computing clusters can be utilized for massive parallelization. Affine-invariant MCMC and existing extensions to nested sampling naturally probe multimodal and curving distributions.
Estimation of lattice strain in nanocrystalline RuO2 by Williamson-Hall and size-strain plot methods
Sivakami, R.; Dhanuskodi, S.; Karvembu, R.
2016-01-01
RuO2 nanoparticles (RuO2 NPs) have been successfully synthesized by the hydrothermal method. Structure and the particle size have been determined by X-ray diffraction (XRD), scanning electron microscopy (SEM), atomic force microscopy (AFM) and transmission electron microscopy (TEM). UV-Vis spectra reveal that the optical band gap of RuO2 nanoparticles is red shifted from 3.95 to 3.55 eV. BET measurements show a high specific surface area (SSA) of 118-133 m2/g and pore diameter (10-25 nm) has been estimated by Barret-Joyner-Halenda (BJH) method. The crystallite size and lattice strain in the samples have been investigated by Williamson-Hall (W-H) analysis assuming uniform deformation, deformation stress and deformation energy density, and the size-strain plot method. All other relevant physical parameters including stress, strain and energy density have been calculated. The average crystallite size and the lattice strain evaluated from XRD measurements are in good agreement with the results of TEM.
ON ESTIMATION AND HYPOTHESIS TESTING OF THE GRAIN SIZE DISTRIBUTION BY THE SALTYKOV METHOD
Directory of Open Access Journals (Sweden)
Yuri Gulbin
2011-05-01
Full Text Available The paper considers the problem of validity of unfolding the grain size distribution with the back-substitution method. Due to the ill-conditioned nature of unfolding matrices, it is necessary to evaluate the accuracy and precision of parameter estimation and to verify the possibility of expected grain size distribution testing on the basis of intersection size histogram data. In order to review these questions, the computer modeling was used to compare size distributions obtained stereologically with those possessed by three-dimensional model aggregates of grains with a specified shape and random size. Results of simulations are reported and ways of improving the conventional stereological techniques are suggested. It is shown that new improvements in estimating and testing procedures enable grain size distributions to be unfolded more efficiently.
Directory of Open Access Journals (Sweden)
Smedslund Geir
2013-02-01
Full Text Available Abstract Background Patient reported outcomes are accepted as important outcome measures in rheumatology. The fluctuating symptoms in patients with rheumatic diseases have serious implications for sample size in clinical trials. We estimated the effects of measuring the outcome 1-5 times on the sample size required in a two-armed trial. Findings In a randomized controlled trial that evaluated the effects of a mindfulness-based group intervention for patients with inflammatory arthritis (n=71, the outcome variables Numerical Rating Scales (NRS (pain, fatigue, disease activity, self-care ability, and emotional wellbeing and General Health Questionnaire (GHQ-20 were measured five times before and after the intervention. For each variable we calculated the necessary sample sizes for obtaining 80% power (α=.05 for one up to five measurements. Two, three, and four measures reduced the required sample sizes by 15%, 21%, and 24%, respectively. With three (and five measures, the required sample size per group was reduced from 56 to 39 (32 for the GHQ-20, from 71 to 60 (55 for pain, 96 to 71 (73 for fatigue, 57 to 51 (48 for disease activity, 59 to 44 (45 for self-care, and 47 to 37 (33 for emotional wellbeing. Conclusions Measuring the outcomes five times rather than once reduced the necessary sample size by an average of 27%. When planning a study, researchers should carefully compare the advantages and disadvantages of increasing sample size versus employing three to five repeated measurements in order to obtain the required statistical power.
Bayesian Simultaneous Estimation for Means in k Sample Problems
Imai, Ryo; Kubokawa, Tatsuya; Ghosh, Malay
2017-01-01
This paper is concerned with the simultaneous estimation of k population means when one suspects that the k means are nearly equal. As an alternative to the preliminary test estimator based on the test statistics for testing hypothesis of equal means, we derive Bayesian and minimax estimators which shrink individual sample means toward a pooled mean estimator given under the hypothesis. Interestingly, it is shown that both the preliminary test estimator and the Bayesian minimax shrinkage esti...
Turbidity-controlled sampling for suspended sediment load estimation
Jack Lewis
2003-01-01
Abstract - Automated data collection is essential to effectively measure suspended sediment loads in storm events, particularly in small basins. Continuous turbidity measurements can be used, along with discharge, in an automated system that makes real-time sampling decisions to facilitate sediment load estimation. The Turbidity Threshold Sampling method distributes...
Accounting for One-Group Clustering in Effect-Size Estimation
Citkowicz, Martyna; Hedges, Larry V.
2013-01-01
In some instances, intentionally or not, study designs are such that there is clustering in one group but not in the other. This paper describes methods for computing effect size estimates and their variances when there is clustering in only one group and the analysis has not taken that clustering into account. The authors provide the effect size…
NDE errors and their propagation in sizing and growth estimates
International Nuclear Information System (INIS)
Horn, D.; Obrutsky, L.; Lakhan, R.
2009-01-01
The accuracy attributed to eddy current flaw sizing determines the amount of conservativism required in setting tube-plugging limits. Several sources of error contribute to the uncertainty of the measurements, and the way in which these errors propagate and interact affects the overall accuracy of the flaw size and flaw growth estimates. An example of this calculation is the determination of an upper limit on flaw growth over one operating period, based on the difference between two measurements. Signal-to-signal comparison involves a variety of human, instrumental, and environmental error sources; of these, some propagate additively and some multiplicatively. In a difference calculation, specific errors in the first measurement may be correlated with the corresponding errors in the second; others may be independent. Each of the error sources needs to be identified and quantified individually, as does its distribution in the field data. A mathematical framework for the propagation of the errors can then be used to assess the sensitivity of the overall uncertainty to each individual error component. This paper quantifies error sources affecting eddy current sizing estimates and presents analytical expressions developed for their effect on depth estimates. A simple case study is used to model the analysis process. For each error source, the distribution of the field data was assessed and propagated through the analytical expressions. While the sizing error obtained was consistent with earlier estimates and with deviations from ultrasonic depth measurements, the error on growth was calculated as significantly smaller than that obtained assuming uncorrelated errors. An interesting result of the sensitivity analysis in the present case study is the quantification of the error reduction available from post-measurement compensation of magnetite effects. With the absolute and difference error equations, variance-covariance matrices, and partial derivatives developed in
Size selective isocyanate aerosols personal air sampling using porous plastic foams
International Nuclear Information System (INIS)
Cong Khanh Huynh; Trinh Vu Duc
2009-01-01
As part of a European project (SMT4-CT96-2137), various European institutions specialized in occupational hygiene (BGIA, HSL, IOM, INRS, IST, Ambiente e Lavoro) have established a program of scientific collaboration to develop one or more prototypes of European personal samplers for the collection of simultaneous three dust fractions: inhalable, thoracic and respirable. These samplers based on existing sampling heads (IOM, GSP and cassettes) use Polyurethane Plastic Foam (PUF) according to their porosity to support sampling and separator size of the particles. In this study, the authors present an original application of size selective personal air sampling using chemical impregnated PUF to perform isocyanate aerosols capturing and derivatizing in industrial spray-painting shops.
An integrated approach for multi-level sample size determination
International Nuclear Information System (INIS)
Lu, M.S.; Teichmann, T.; Sanborn, J.B.
1997-01-01
Inspection procedures involving the sampling of items in a population often require steps of increasingly sensitive measurements, with correspondingly smaller sample sizes; these are referred to as multilevel sampling schemes. In the case of nuclear safeguards inspections verifying that there has been no diversion of Special Nuclear Material (SNM), these procedures have been examined often and increasingly complex algorithms have been developed to implement them. The aim in this paper is to provide an integrated approach, and, in so doing, to describe a systematic, consistent method that proceeds logically from level to level with increasing accuracy. The authors emphasize that the methods discussed are generally consistent with those presented in the references mentioned, and yield comparable results when the error models are the same. However, because of its systematic, integrated approach the proposed method elucidates the conceptual understanding of what goes on, and, in many cases, simplifies the calculations. In nuclear safeguards inspections, an important aspect of verifying nuclear items to detect any possible diversion of nuclear fissile materials is the sampling of such items at various levels of sensitivity. The first step usually is sampling by ''attributes'' involving measurements of relatively low accuracy, followed by further levels of sampling involving greater accuracy. This process is discussed in some detail in the references given; also, the nomenclature is described. Here, the authors outline a coordinated step-by-step procedure for achieving such multilevel sampling, and they develop the relationships between the accuracy of measurement and the sample size required at each stage, i.e., at the various levels. The logic of the underlying procedures is carefully elucidated; the calculations involved and their implications, are clearly described, and the process is put in a form that allows systematic generalization
Estimating mean change in population salt intake using spot urine samples.
Petersen, Kristina S; Wu, Jason H Y; Webster, Jacqui; Grimes, Carley; Woodward, Mark; Nowson, Caryl A; Neal, Bruce
2017-10-01
Spot urine samples are easier to collect than 24-h urine samples and have been used with estimating equations to derive the mean daily salt intake of a population. Whether equations using data from spot urine samples can also be used to estimate change in mean daily population salt intake over time is unknown. We compared estimates of change in mean daily population salt intake based upon 24-h urine collections with estimates derived using equations based on spot urine samples. Paired and unpaired 24-h urine samples and spot urine samples were collected from individuals in two Australian populations, in 2011 and 2014. Estimates of change in daily mean population salt intake between 2011 and 2014 were obtained directly from the 24-h urine samples and by applying established estimating equations (Kawasaki, Tanaka, Mage, Toft, INTERSALT) to the data from spot urine samples. Differences between 2011 and 2014 were calculated using mixed models. A total of 1000 participants provided a 24-h urine sample and a spot urine sample in 2011, and 1012 did so in 2014 (paired samples n = 870; unpaired samples n = 1142). The participants were community-dwelling individuals living in the State of Victoria or the town of Lithgow in the State of New South Wales, Australia, with a mean age of 55 years in 2011. The mean (95% confidence interval) difference in population salt intake between 2011 and 2014 determined from the 24-h urine samples was -0.48g/day (-0.74 to -0.21; P spot urine samples was -0.24 g/day (-0.42 to -0.06; P = 0.01) using the Tanaka equation, -0.42 g/day (-0.70 to -0.13; p = 0.004) using the Kawasaki equation, -0.51 g/day (-1.00 to -0.01; P = 0.046) using the Mage equation, -0.26 g/day (-0.42 to -0.10; P = 0.001) using the Toft equation, -0.20 g/day (-0.32 to -0.09; P = 0.001) using the INTERSALT equation and -0.27 g/day (-0.39 to -0.15; P 0.058). Separate analysis of the unpaired and paired data showed that detection of
Directory of Open Access Journals (Sweden)
Elias Chaibub Neto
Full Text Available In this paper we propose a vectorized implementation of the non-parametric bootstrap for statistics based on sample moments. Basically, we adopt the multinomial sampling formulation of the non-parametric bootstrap, and compute bootstrap replications of sample moment statistics by simply weighting the observed data according to multinomial counts instead of evaluating the statistic on a resampled version of the observed data. Using this formulation we can generate a matrix of bootstrap weights and compute the entire vector of bootstrap replications with a few matrix multiplications. Vectorization is particularly important for matrix-oriented programming languages such as R, where matrix/vector calculations tend to be faster than scalar operations implemented in a loop. We illustrate the application of the vectorized implementation in real and simulated data sets, when bootstrapping Pearson's sample correlation coefficient, and compared its performance against two state-of-the-art R implementations of the non-parametric bootstrap, as well as a straightforward one based on a for loop. Our investigations spanned varying sample sizes and number of bootstrap replications. The vectorized bootstrap compared favorably against the state-of-the-art implementations in all cases tested, and was remarkably/considerably faster for small/moderate sample sizes. The same results were observed in the comparison with the straightforward implementation, except for large sample sizes, where the vectorized bootstrap was slightly slower than the straightforward implementation due to increased time expenditures in the generation of weight matrices via multinomial sampling.
Estimating waste disposal quantities from raw waste samples
International Nuclear Information System (INIS)
Negin, C.A.; Urland, C.S.; Hitz, C.G.; GPU Nuclear Corp., Middletown, PA)
1985-01-01
Estimating the disposal quantity of waste resulting from stabilization of radioactive sludge is complex because of the many factors relating to sample analysis results, radioactive decay, allowable disposal concentrations, and options for disposal containers. To facilitate this estimation, a microcomputer spread sheet template was created. The spread sheet has saved considerable engineering hours. 1 fig., 3 tabs
Iterative importance sampling algorithms for parameter estimation
Morzfeld, Matthias; Day, Marcus S.; Grout, Ray W.; Pau, George Shu Heng; Finsterle, Stefan A.; Bell, John B.
2016-01-01
In parameter estimation problems one computes a posterior distribution over uncertain parameters defined jointly by a prior distribution, a model, and noisy data. Markov Chain Monte Carlo (MCMC) is often used for the numerical solution of such problems. An alternative to MCMC is importance sampling, which can exhibit near perfect scaling with the number of cores on high performance computing systems because samples are drawn independently. However, finding a suitable proposal distribution is ...
Reliable calculation in probabilistic logic: Accounting for small sample size and model uncertainty
Energy Technology Data Exchange (ETDEWEB)
Ferson, S. [Applied Biomathematics, Setauket, NY (United States)
1996-12-31
A variety of practical computational problems arise in risk and safety assessments, forensic statistics and decision analyses in which the probability of some event or proposition E is to be estimated from the probabilities of a finite list of related subevents or propositions F,G,H,.... In practice, the analyst`s knowledge may be incomplete in two ways. First, the probabilities of the subevents may be imprecisely known from statistical estimations, perhaps based on very small sample sizes. Second, relationships among the subevents may be known imprecisely. For instance, there may be only limited information about their stochastic dependencies. Representing probability estimates as interval ranges on has been suggested as a way to address the first source of imprecision. A suite of AND, OR and NOT operators defined with reference to the classical Frochet inequalities permit these probability intervals to be used in calculations that address the second source of imprecision, in many cases, in a best possible way. Using statistical confidence intervals as inputs unravels the closure properties of this approach however, requiring that probability estimates be characterized by a nested stack of intervals for all possible levels of statistical confidence, from a point estimate (0% confidence) to the entire unit interval (100% confidence). The corresponding logical operations implied by convolutive application of the logical operators for every possible pair of confidence intervals reduces by symmetry to a manageably simple level-wise iteration. The resulting calculus can be implemented in software that allows users to compute comprehensive and often level-wise best possible bounds on probabilities for logical functions of events.
Asian elephants in China: estimating population size and evaluating habitat suitability.
Directory of Open Access Journals (Sweden)
Li Zhang
Full Text Available We monitored the last remaining Asian elephant populations in China over the past decade. Using DNA tools and repeat genotyping, we estimated the population sizes from 654 dung samples collected from various areas. Combined with morphological individual identifications from over 6,300 elephant photographs taken in the wild, we estimated that the total Asian elephant population size in China is between 221 and 245. Population genetic structure and diversity were examined using a 556-bp fragment of mitochondrial DNA, and 24 unique haplotypes were detected from DNA analysis of 178 individuals. A phylogenetic analysis revealed two highly divergent clades of Asian elephants, α and β, present in Chinese populations. Four populations (Mengla, Shangyong, Mengyang, and Pu'Er carried mtDNA from the α clade, and only one population (Nangunhe carried mtDNA belonging to the β clade. Moreover, high genetic divergence was observed between the Nangunhe population and the other four populations; however, genetic diversity among the five populations was low, possibly due to limited gene flow because of habitat fragmentation. The expansion of rubber plantations, crop cultivation, and villages along rivers and roads had caused extensive degradation of natural forest in these areas. This had resulted in the loss and fragmentation of elephant habitats and had formed artificial barriers that inhibited elephant migration. Using Geographic Information System, Global Positioning System, and Remote Sensing technology, we found that the area occupied by rubber plantations, tea farms, and urban settlements had dramatically increased over the past 40 years, resulting in the loss and fragmentation of elephant habitats and forming artificial barriers that inhibit elephant migration. The restoration of ecological corridors to facilitate gene exchange among isolated elephant populations and the establishment of cross-boundary protected areas between China and Laos to secure
Capello, Katia; Bortolotti, Laura; Lanari, Manuela; Baioni, Elisa; Mutinelli, Franco; Vascellari, Marta
2015-01-01
The knowledge of the size and demographic structure of animal populations is a necessary prerequisite for any population-based epidemiological study, especially to ascertain and interpret prevalence data, to implement surveillance plans in controlling zoonotic diseases and, moreover, to provide accurate estimates of tumours incidence data obtained by population-based registries. The main purpose of this study was to provide an accurate estimate of the size and structure of the canine population in Veneto region (north-eastern Italy), using the Lincoln-Petersen version of the capture-recapture methodology. The Regional Canine Demographic Registry (BAC) and a sample survey of households of Veneto Region were the capture and recapture sources, respectively. The secondary purpose was to estimate the size and structure of the feline population in the same region, using the same survey applied for dog population. A sample of 2465 randomly selected households was drawn and submitted to a questionnaire using the CATI technique, in order to obtain information about the ownership of dogs and cats. If the dog was declared to be identified, owner's information was used to recapture the dog in the BAC. The study was conducted in Veneto Region during 2011, when the dog population recorded in the BAC was 605,537. Overall, 616 households declared to possess at least one dog (25%), with a total of 805 dogs and an average per household of 1.3. The capture-recapture analysis showed that 574 dogs (71.3%, 95% CI: 68.04-74.40%) had been recaptured in both sources, providing a dog population estimate of 849,229 (95% CI: 814,747-889,394), 40% higher than that registered in the BAC. Concerning cats, 455 of 2465 (18%, 95% CI: 17-20%) households declared to possess at least one cat at the time of the telephone interview, with a total of 816 cats. The mean number of cats per household was equal to 1.8, providing an estimate of the cat population in Veneto region equal to 663,433 (95% CI: 626
Size estimates of nobel gas clusters by Rayleigh scattering experiments
Institute of Scientific and Technical Information of China (English)
Pinpin Zhu (朱频频); Guoquan Ni (倪国权); Zhizhan Xu (徐至展)
2003-01-01
Noble gases (argon, krypton, and xenon) are puffed into vacuum through a nozzle to produce clusters for studying laser-cluster interactions. Good estimates of the average size of the argon, krypton and xenon clusters are made by carrying out a series of Rayleigh scattering experiments. In the experiments, we have found that the scattered signal intensity varied greatly with the opening area of the pulsed valve. A new method is put forward to choose the appropriate scattered signal and measure the size of Kr cluster.
Harry T. Valentine; David L. R. Affleck; Timothy G. Gregoire
2009-01-01
Systematic sampling is easy, efficient, and widely used, though it is not generally recognized that a systematic sample may be drawn from the population of interest with or without restrictions on randomization. The restrictions or the lack of them determine which estimators are unbiased, when using the sampling design as the basis for inference. We describe the...
Estimating the size of the solution space of metabolic networks
Directory of Open Access Journals (Sweden)
Mulet Roberto
2008-05-01
novel efficient distributed algorithmic strategy to estimate the size and shape of the affine space of a non full-dimensional convex polytope in high dimensions. The method is shown to obtain, quantitatively and qualitatively compatible results with the ones of standard algorithms (where this comparison is possible being still efficient on the analysis of large biological systems, where exact deterministic methods experience an explosion in algorithmic time. The algorithm we propose can be considered as an alternative to Monte Carlo sampling methods.
Estimation of body-size traits by photogrammetry in large mammals to inform conservation.
Berger, Joel
2012-10-01
Photography, including remote imagery and camera traps, has contributed substantially to conservation. However, the potential to use photography to understand demography and inform policy is limited. To have practical value, remote assessments must be reasonably accurate and widely deployable. Prior efforts to develop noninvasive methods of estimating trait size have been motivated by a desire to answer evolutionary questions, measure physiological growth, or, in the case of illegal trade, assess economics of horn sizes; but rarely have such methods been directed at conservation. Here I demonstrate a simple, noninvasive photographic technique and address how knowledge of values of individual-specific metrics bears on conservation policy. I used 10 years of data on juvenile moose (Alces alces) to examine whether body size and probability of survival are positively correlated in cold climates. I investigated whether the presence of mothers improved juvenile survival. The posited latter relation is relevant to policy because harvest of adult females has been permitted in some Canadian and American jurisdictions under the assumption that probability of survival of young is independent of maternal presence. The accuracy of estimates of head sizes made from photographs exceeded 98%. The estimates revealed that overwinter juvenile survival had no relation to the juvenile's estimated mass (p < 0.64) and was more strongly associated with maternal presence (p < 0.02) than winter snow depth (p < 0.18). These findings highlight the effects on survival of a social dynamic (the mother-young association) rather than body size and suggest a change in harvest policy will increase survival. Furthermore, photographic imaging of growth of individual juvenile muskoxen (Ovibos moschatus) over 3 Arctic winters revealed annual variability in size, which supports the idea that noninvasive monitoring may allow one to detect how some environmental conditions ultimately affect body growth.
Rabideau, Dustin J; Pei, Pamela P; Walensky, Rochelle P; Zheng, Amy; Parker, Robert A
2018-02-01
The expected value of sample information (EVSI) can help prioritize research but its application is hampered by computational infeasibility, especially for complex models. We investigated an approach by Strong and colleagues to estimate EVSI by applying generalized additive models (GAM) to results generated from a probabilistic sensitivity analysis (PSA). For 3 potential HIV prevention and treatment strategies, we estimated life expectancy and lifetime costs using the Cost-effectiveness of Preventing AIDS Complications (CEPAC) model, a complex patient-level microsimulation model of HIV progression. We fitted a GAM-a flexible regression model that estimates the functional form as part of the model fitting process-to the incremental net monetary benefits obtained from the CEPAC PSA. For each case study, we calculated the expected value of partial perfect information (EVPPI) using both the conventional nested Monte Carlo approach and the GAM approach. EVSI was calculated using the GAM approach. For all 3 case studies, the GAM approach consistently gave similar estimates of EVPPI compared with the conventional approach. The EVSI behaved as expected: it increased and converged to EVPPI for larger sample sizes. For each case study, generating the PSA results for the GAM approach required 3 to 4 days on a shared cluster, after which EVPPI and EVSI across a range of sample sizes were evaluated in minutes. The conventional approach required approximately 5 weeks for the EVPPI calculation alone. Estimating EVSI using the GAM approach with results from a PSA dramatically reduced the time required to conduct a computationally intense project, which would otherwise have been impractical. Using the GAM approach, we can efficiently provide policy makers with EVSI estimates, even for complex patient-level microsimulation models.
Mayer, B; Muche, R
2013-01-01
Animal studies are highly relevant for basic medical research, although their usage is discussed controversially in public. Thus, an optimal sample size for these projects should be aimed at from a biometrical point of view. Statistical sample size calculation is usually the appropriate methodology in planning medical research projects. However, required information is often not valid or only available during the course of an animal experiment. This article critically discusses the validity of formal sample size calculation for animal studies. Within the discussion, some requirements are formulated to fundamentally regulate the process of sample size determination for animal experiments.
Weighted piecewise LDA for solving the small sample size problem in face verification.
Kyperountas, Marios; Tefas, Anastasios; Pitas, Ioannis
2007-03-01
A novel algorithm that can be used to boost the performance of face-verification methods that utilize Fisher's criterion is presented and evaluated. The algorithm is applied to similarity, or matching error, data and provides a general solution for overcoming the "small sample size" (SSS) problem, where the lack of sufficient training samples causes improper estimation of a linear separation hyperplane between the classes. Two independent phases constitute the proposed method. Initially, a set of weighted piecewise discriminant hyperplanes are used in order to provide a more accurate discriminant decision than the one produced by the traditional linear discriminant analysis (LDA) methodology. The expected classification ability of this method is investigated throughout a series of simulations. The second phase defines proper combinations for person-specific similarity scores and describes an outlier removal process that further enhances the classification ability. The proposed technique has been tested on the M2VTS and XM2VTS frontal face databases. Experimental results indicate that the proposed framework greatly improves the face-verification performance.
Estimation of AUC or Partial AUC under Test-Result-Dependent Sampling.
Wang, Xiaofei; Ma, Junling; George, Stephen; Zhou, Haibo
2012-01-01
The area under the ROC curve (AUC) and partial area under the ROC curve (pAUC) are summary measures used to assess the accuracy of a biomarker in discriminating true disease status. The standard sampling approach used in biomarker validation studies is often inefficient and costly, especially when ascertaining the true disease status is costly and invasive. To improve efficiency and reduce the cost of biomarker validation studies, we consider a test-result-dependent sampling (TDS) scheme, in which subject selection for determining the disease state is dependent on the result of a biomarker assay. We first estimate the test-result distribution using data arising from the TDS design. With the estimated empirical test-result distribution, we propose consistent nonparametric estimators for AUC and pAUC and establish the asymptotic properties of the proposed estimators. Simulation studies show that the proposed estimators have good finite sample properties and that the TDS design yields more efficient AUC and pAUC estimates than a simple random sampling (SRS) design. A data example based on an ongoing cancer clinical trial is provided to illustrate the TDS design and the proposed estimators. This work can find broad applications in design and analysis of biomarker validation studies.
Ruf, B.; Erdnuess, B.; Weinmann, M.
2017-08-01
With the emergence of small consumer Unmanned Aerial Vehicles (UAVs), the importance and interest of image-based depth estimation and model generation from aerial images has greatly increased in the photogrammetric society. In our work, we focus on algorithms that allow an online image-based dense depth estimation from video sequences, which enables the direct and live structural analysis of the depicted scene. Therefore, we use a multi-view plane-sweep algorithm with a semi-global matching (SGM) optimization which is parallelized for general purpose computation on a GPU (GPGPU), reaching sufficient performance to keep up with the key-frames of input sequences. One important aspect to reach good performance is the way to sample the scene space, creating plane hypotheses. A small step size between consecutive planes, which is needed to reconstruct details in the near vicinity of the camera may lead to ambiguities in distant regions, due to the perspective projection of the camera. Furthermore, an equidistant sampling with a small step size produces a large number of plane hypotheses, leading to high computational effort. To overcome these problems, we present a novel methodology to directly determine the sampling points of plane-sweep algorithms in image space. The use of the perspective invariant cross-ratio allows us to derive the location of the sampling planes directly from the image data. With this, we efficiently sample the scene space, achieving higher sampling density in areas which are close to the camera and a lower density in distant regions. We evaluate our approach on a synthetic benchmark dataset for quantitative evaluation and on a real-image dataset consisting of aerial imagery. The experiments reveal that an inverse sampling achieves equal and better results than a linear sampling, with less sampling points and thus less runtime. Our algorithm allows an online computation of depth maps for subsequences of five frames, provided that the relative
Directory of Open Access Journals (Sweden)
B. Ruf
2017-08-01
Full Text Available With the emergence of small consumer Unmanned Aerial Vehicles (UAVs, the importance and interest of image-based depth estimation and model generation from aerial images has greatly increased in the photogrammetric society. In our work, we focus on algorithms that allow an online image-based dense depth estimation from video sequences, which enables the direct and live structural analysis of the depicted scene. Therefore, we use a multi-view plane-sweep algorithm with a semi-global matching (SGM optimization which is parallelized for general purpose computation on a GPU (GPGPU, reaching sufficient performance to keep up with the key-frames of input sequences. One important aspect to reach good performance is the way to sample the scene space, creating plane hypotheses. A small step size between consecutive planes, which is needed to reconstruct details in the near vicinity of the camera may lead to ambiguities in distant regions, due to the perspective projection of the camera. Furthermore, an equidistant sampling with a small step size produces a large number of plane hypotheses, leading to high computational effort. To overcome these problems, we present a novel methodology to directly determine the sampling points of plane-sweep algorithms in image space. The use of the perspective invariant cross-ratio allows us to derive the location of the sampling planes directly from the image data. With this, we efficiently sample the scene space, achieving higher sampling density in areas which are close to the camera and a lower density in distant regions. We evaluate our approach on a synthetic benchmark dataset for quantitative evaluation and on a real-image dataset consisting of aerial imagery. The experiments reveal that an inverse sampling achieves equal and better results than a linear sampling, with less sampling points and thus less runtime. Our algorithm allows an online computation of depth maps for subsequences of five frames, provided that
Generating Random Samples of a Given Size Using Social Security Numbers.
Erickson, Richard C.; Brauchle, Paul E.
1984-01-01
The purposes of this article are (1) to present a method by which social security numbers may be used to draw cluster samples of a predetermined size and (2) to describe procedures used to validate this method of drawing random samples. (JOW)
Estimating the Grain Size Distribution of Mars based on Fragmentation Theory and Observations
Charalambous, C.; Pike, W. T.; Golombek, M.
2017-12-01
We present here a fundamental extension to the fragmentation theory [1] which yields estimates of the distribution of particle sizes of a planetary surface. The model is valid within the size regimes of surfaces whose genesis is best reflected by the evolution of fragmentation phenomena governed by either the process of meteoritic impacts, or by a mixture with aeolian transportation at the smaller sizes. The key parameter of the model, the regolith maturity index, can be estimated as an average of that observed at a local site using cratering size-frequency measurements, orbital and surface image-detected rock counts and observations of sub-mm particles at landing sites. Through validation of ground truth from previous landed missions, the basis of this approach has been used at the InSight landing ellipse on Mars to extrapolate rock size distributions in HiRISE images down to 5 cm rock size, both to determine the landing safety risk and the subsequent probability of obstruction by a rock of the deployed heat flow mole down to 3-5 m depth [2]. Here we focus on a continuous extrapolation down to 600 µm coarse sand particles, the upper size limit that may be present through aeolian processes [3]. The parameters of the model are first derived for the fragmentation process that has produced the observable rocks via meteorite impacts over time, and therefore extrapolation into a size regime that is affected by aeolian processes has limited justification without further refinement. Incorporating thermal inertia estimates, size distributions observed by the Spirit and Opportunity Microscopic Imager [4] and Atomic Force and Optical Microscopy from the Phoenix Lander [5], the model's parameters in combination with synthesis methods are quantitatively refined further to allow transition within the aeolian transportation size regime. In addition, due to the nature of the model emerging in fractional mass abundance, the percentage of material by volume or mass that resides
Greenbaum, Gili; Renan, Sharon; Templeton, Alan R; Bouskila, Amos; Saltz, David; Rubenstein, Daniel I; Bar-David, Shirli
2017-12-22
Effective population size, a central concept in conservation biology, is now routinely estimated from genetic surveys and can also be theoretically predicted from demographic, life-history, and mating-system data. By evaluating the consistency of theoretical predictions with empirically estimated effective size, insights can be gained regarding life-history characteristics and the relative impact of different life-history traits on genetic drift. These insights can be used to design and inform management strategies aimed at increasing effective population size. We demonstrated this approach by addressing the conservation of a reintroduced population of Asiatic wild ass (Equus hemionus). We estimated the variance effective size (N ev ) from genetic data (N ev =24.3) and formulated predictions for the impacts on N ev of demography, polygyny, female variance in lifetime reproductive success (RS), and heritability of female RS. By contrasting the genetic estimation with theoretical predictions, we found that polygyny was the strongest factor affecting genetic drift because only when accounting for polygyny were predictions consistent with the genetically measured N ev . The comparison of effective-size estimation and predictions indicated that 10.6% of the males mated per generation when heritability of female RS was unaccounted for (polygyny responsible for 81% decrease in N ev ) and 19.5% mated when female RS was accounted for (polygyny responsible for 67% decrease in N ev ). Heritability of female RS also affected N ev ; hf2=0.91 (heritability responsible for 41% decrease in N ev ). The low effective size is of concern, and we suggest that management actions focus on factors identified as strongly affecting Nev, namely, increasing the availability of artificial water sources to increase number of dominant males contributing to the gene pool. This approach, evaluating life-history hypotheses in light of their impact on effective population size, and contrasting
Estimating the size of non-observed economy in Croatia using the MIMIC approach
Vjekoslav Klaric
2011-01-01
This paper gives a quick overview of the approaches that have been used in the research of shadow economy, starting with the defi nitions of the terms “shadow economy” and “non-observed economy”, with the accent on the ISTAT/Eurostat framework. Several methods for estimating the size of the shadow economy and the non-observed economy are then presented. The emphasis is placed on the MIMIC approach, one of the methods used to estimate the size of the nonobserved economy. After a glance at the ...
Jeffrey H. Gove
2003-01-01
Many of the most popular sampling schemes used in forestry are probability proportional to size methods. These methods are also referred to as size biased because sampling is actually from a weighted form of the underlying population distribution. Length- and area-biased sampling are special cases of size-biased sampling where the probability weighting comes from a...
Estimating population salt intake in India using spot urine samples.
Petersen, Kristina S; Johnson, Claire; Mohan, Sailesh; Rogers, Kris; Shivashankar, Roopa; Thout, Sudhir Raj; Gupta, Priti; He, Feng J; MacGregor, Graham A; Webster, Jacqui; Santos, Joseph Alvin; Krishnan, Anand; Maulik, Pallab K; Reddy, K Srinath; Gupta, Ruby; Prabhakaran, Dorairaj; Neal, Bruce
2017-11-01
To compare estimates of mean population salt intake in North and South India derived from spot urine samples versus 24-h urine collections. In a cross-sectional survey, participants were sampled from slum, urban and rural communities in North and in South India. Participants provided 24-h urine collections, and random morning spot urine samples. Salt intake was estimated from the spot urine samples using a series of established estimating equations. Salt intake data from the 24-h urine collections and spot urine equations were weighted to provide estimates of salt intake for Delhi and Haryana, and Andhra Pradesh. A total of 957 individuals provided a complete 24-h urine collection and a spot urine sample. Weighted mean salt intake based on the 24-h urine collection, was 8.59 (95% confidence interval 7.73-9.45) and 9.46 g/day (8.95-9.96) in Delhi and Haryana, and Andhra Pradesh, respectively. Corresponding estimates based on the Tanaka equation [9.04 (8.63-9.45) and 9.79 g/day (9.62-9.96) for Delhi and Haryana, and Andhra Pradesh, respectively], the Mage equation [8.80 (7.67-9.94) and 10.19 g/day (95% CI 9.59-10.79)], the INTERSALT equation [7.99 (7.61-8.37) and 8.64 g/day (8.04-9.23)] and the INTERSALT equation with potassium [8.13 (7.74-8.52) and 8.81 g/day (8.16-9.46)] were all within 1 g/day of the estimate based upon 24-h collections. For the Toft equation, estimates were 1-2 g/day higher [9.94 (9.24-10.64) and 10.69 g/day (9.44-11.93)] and for the Kawasaki equation they were 3-4 g/day higher [12.14 (11.30-12.97) and 13.64 g/day (13.15-14.12)]. In urban and rural areas in North and South India, most spot urine-based equations provided reasonable estimates of mean population salt intake. Equations that did not provide good estimates may have failed because specimen collection was not aligned with the original method.
Directory of Open Access Journals (Sweden)
Ina C Ansmann
Full Text Available Moreton Bay, Queensland, Australia is an area of high biodiversity and conservation value and home to two sympatric sub-populations of Indo-Pacific bottlenose dolphins (Tursiops aduncus. These dolphins live in close proximity to major urban developments. Successful management requires information regarding their abundance. Here, we estimate total and effective population sizes of bottlenose dolphins in Moreton Bay using photo-identification and genetic data collected during boat-based surveys in 2008-2010. Abundance (N was estimated using open population mark-recapture models based on sighting histories of distinctive individuals. Effective population size (Ne was estimated using the linkage disequilibrium method based on nuclear genetic data at 20 microsatellite markers in skin samples, and corrected for bias caused by overlapping generations (Ne c. A total of 174 sightings of dolphin groups were recorded and 365 different individuals identified. Over the whole of Moreton Bay, a population size N of 554 ± 22.2 (SE (95% CI: 510-598 was estimated. The southern bay sub-population was small at an estimated N = 193 ± 6.4 (SE (95% CI: 181-207, while the North sub-population was more numerous, with 446 ± 56 (SE (95% CI: 336-556 individuals. The small estimated effective population size of the southern sub-population (Ne c = 56, 95% CI: 33-128 raises conservation concerns. A power analysis suggested that to reliably detect small (5% declines in size of this population would require substantial survey effort (>4 years of annual mark-recapture surveys at the precision levels achieved here. To ensure that ecological as well as genetic diversity within this population of bottlenose dolphins is preserved, we consider that North and South sub-populations should be treated as separate management units. Systematic surveys over smaller areas holding locally-adapted sub-populations are suggested as an alternative method for increasing ability to detect
Networked Estimation for Event-Based Sampling Systems with Packet Dropouts
Directory of Open Access Journals (Sweden)
Young Soo Suh
2009-04-01
Full Text Available This paper is concerned with a networked estimation problem in which sensor data are transmitted over the network. In the event-based sampling scheme known as level-crossing or send-on-delta (SOD, sensor data are transmitted to the estimator node if the difference between the current sensor value and the last transmitted one is greater than a given threshold. Event-based sampling has been shown to be more efficient than the time-triggered one in some situations, especially in network bandwidth improvement. However, it cannot detect packet dropout situations because data transmission and reception do not use a periodical time-stamp mechanism as found in time-triggered sampling systems. Motivated by this issue, we propose a modified event-based sampling scheme called modified SOD in which sensor data are sent when either the change of sensor output exceeds a given threshold or the time elapses more than a given interval. Through simulation results, we show that the proposed modified SOD sampling significantly improves estimation performance when packet dropouts happen.
Modeling grain-size dependent bias in estimating forest area: a regional application
Daolan Zheng; Linda S. Heath; Mark J. Ducey
2008-01-01
A better understanding of scaling-up effects on estimating important landscape characteristics (e.g. forest percentage) is critical for improving ecological applications over large areas. This study illustrated effects of changing grain sizes on regional forest estimates in Minnesota, Wisconsin, and Michigan of the USA using 30-m land-cover maps (1992 and 2001)...
On sample size and different interpretations of snow stability datasets
Schirmer, M.; Mitterer, C.; Schweizer, J.
2009-04-01
Interpretations of snow stability variations need an assessment of the stability itself, independent of the scale investigated in the study. Studies on stability variations at a regional scale have often chosen stability tests such as the Rutschblock test or combinations of various tests in order to detect differences in aspect and elevation. The question arose: ‘how capable are such stability interpretations in drawing conclusions'. There are at least three possible errors sources: (i) the variance of the stability test itself; (ii) the stability variance at an underlying slope scale, and (iii) that the stability interpretation might not be directly related to the probability of skier triggering. Various stability interpretations have been proposed in the past that provide partly different results. We compared a subjective one based on expert knowledge with a more objective one based on a measure derived from comparing skier-triggered slopes vs. slopes that have been skied but not triggered. In this study, the uncertainties are discussed and their effects on regional scale stability variations will be quantified in a pragmatic way. An existing dataset with very large sample sizes was revisited. This dataset contained the variance of stability at a regional scale for several situations. The stability in this dataset was determined using the subjective interpretation scheme based on expert knowledge. The question to be answered was how many measurements were needed to obtain similar results (mainly stability differences in aspect or elevation) as with the complete dataset. The optimal sample size was obtained in several ways: (i) assuming a nominal data scale the sample size was determined with a given test, significance level and power, and by calculating the mean and standard deviation of the complete dataset. With this method it can also be determined if the complete dataset consists of an appropriate sample size. (ii) Smaller subsets were created with similar
Directory of Open Access Journals (Sweden)
Satoshi Ezoe
Full Text Available BACKGROUND: Men who have sex with men (MSM are one of the groups most at risk for HIV infection in Japan. However, size estimates of MSM populations have not been conducted with sufficient frequency and rigor because of the difficulty, high cost and stigma associated with reaching such populations. This study examined an innovative and simple method for estimating the size of the MSM population in Japan. We combined an internet survey with the network scale-up method, a social network method for estimating the size of hard-to-reach populations, for the first time in Japan. METHODS AND FINDINGS: An internet survey was conducted among 1,500 internet users who registered with a nationwide internet-research agency. The survey participants were asked how many members of particular groups with known population sizes (firepersons, police officers, and military personnel they knew as acquaintances. The participants were also asked to identify the number of their acquaintances whom they understood to be MSM. Using these survey results with the network scale-up method, the personal network size and MSM population size were estimated. The personal network size was estimated to be 363.5 regardless of the sex of the acquaintances and 174.0 for only male acquaintances. The estimated MSM prevalence among the total male population in Japan was 0.0402% without adjustment, and 2.87% after adjusting for the transmission error of MSM. CONCLUSIONS: The estimated personal network size and MSM prevalence seen in this study were comparable to those from previous survey results based on the direct-estimation method. Estimating population sizes through combining an internet survey with the network scale-up method appeared to be an effective method from the perspectives of rapidity, simplicity, and low cost as compared with more-conventional methods.
Estimation of mean grain size of seafloor sediments using neural network
Digital Repository Service at National Institute of Oceanography (India)
De, C.; Chakraborty, B.
The feasibility of an artificial neural network based approach is investigated to estimate the values of mean grain size of seafloor sediments using four dominant echo features, extracted from acoustic backscatter data. The acoustic backscatter data...
International Nuclear Information System (INIS)
Ku, L.; Kolibal, J.G.
1982-06-01
The neutron induced material activation dose rate data are summarized for the TFTR operation. This report marks the completion of the second phase of the systematic study of the activation problem on the TFTR. The estimations of the neutron induced activation dose rates were made for spherical and slab objects, based on a point kernel method, for a wide range of materials. The dose rates as a function of cooling time for standard samples are presented for a number of typical neutron spectrum expected during TFTR DD and DT operations. The factors which account for the variations of the pulsing history, the characteristic size of the object and the distance of observation relative to the standard samples are also presented
Brus, D.J.; Gruijter, de J.J.
2003-01-01
In estimating spatial means of environmental variables of a region from data collected by convenience or purposive sampling, validity of the results can be ensured by collecting additional data through probability sampling. The precision of the pi estimator that uses the probability sample can be
Lee, Paul H; Tse, Andy C Y
2017-05-01
There are limited data on the quality of reporting of information essential for replication of the calculation as well as the accuracy of the sample size calculation. We examine the current quality of reporting of the sample size calculation in randomized controlled trials (RCTs) published in PubMed and to examine the variation in reporting across study design, study characteristics, and journal impact factor. We also reviewed the targeted sample size reported in trial registries. We reviewed and analyzed all RCTs published in December 2014 with journals indexed in PubMed. The 2014 Impact Factors for the journals were used as proxies for their quality. Of the 451 analyzed papers, 58.1% reported an a priori sample size calculation. Nearly all papers provided the level of significance (97.7%) and desired power (96.6%), and most of the papers reported the minimum clinically important effect size (73.3%). The median (inter-quartile range) of the percentage difference of the reported and calculated sample size calculation was 0.0% (IQR -4.6%;3.0%). The accuracy of the reported sample size was better for studies published in journals that endorsed the CONSORT statement and journals with an impact factor. A total of 98 papers had provided targeted sample size on trial registries and about two-third of these papers (n=62) reported sample size calculation, but only 25 (40.3%) had no discrepancy with the reported number in the trial registries. The reporting of the sample size calculation in RCTs published in PubMed-indexed journals and trial registries were poor. The CONSORT statement should be more widely endorsed. Copyright © 2016 European Federation of Internal Medicine. Published by Elsevier B.V. All rights reserved.
Sivakami, R; Dhanuskodi, S; Karvembu, R
2016-01-05
RuO2 nanoparticles (RuO2 NPs) have been successfully synthesized by the hydrothermal method. Structure and the particle size have been determined by X-ray diffraction (XRD), scanning electron microscopy (SEM), atomic force microscopy (AFM) and transmission electron microscopy (TEM). UV-Vis spectra reveal that the optical band gap of RuO2 nanoparticles is red shifted from 3.95 to 3.55eV. BET measurements show a high specific surface area (SSA) of 118-133m(2)/g and pore diameter (10-25nm) has been estimated by Barret-Joyner-Halenda (BJH) method. The crystallite size and lattice strain in the samples have been investigated by Williamson-Hall (W-H) analysis assuming uniform deformation, deformation stress and deformation energy density, and the size-strain plot method. All other relevant physical parameters including stress, strain and energy density have been calculated. The average crystallite size and the lattice strain evaluated from XRD measurements are in good agreement with the results of TEM. Copyright © 2015 Elsevier B.V. All rights reserved.
Energy Technology Data Exchange (ETDEWEB)
Price, Oliver R., E-mail: oliver.price@unilever.co [Warwick-HRI, University of Warwick, Wellesbourne, Warwick, CV32 6EF (United Kingdom); University of Reading, Soil Science Department, Whiteknights, Reading, RG6 6UR (United Kingdom); Oliver, Margaret A. [University of Reading, Soil Science Department, Whiteknights, Reading, RG6 6UR (United Kingdom); Walker, Allan [Warwick-HRI, University of Warwick, Wellesbourne, Warwick, CV32 6EF (United Kingdom); Wood, Martin [University of Reading, Soil Science Department, Whiteknights, Reading, RG6 6UR (United Kingdom)
2009-05-15
An unbalanced nested sampling design was used to investigate the spatial scale of soil and herbicide interactions at the field scale. A hierarchical analysis of variance based on residual maximum likelihood (REML) was used to analyse the data and provide a first estimate of the variogram. Soil samples were taken at 108 locations at a range of separating distances in a 9 ha field to explore small and medium scale spatial variation. Soil organic matter content, pH, particle size distribution, microbial biomass and the degradation and sorption of the herbicide, isoproturon, were determined for each soil sample. A large proportion of the spatial variation in isoproturon degradation and sorption occurred at sampling intervals less than 60 m, however, the sampling design did not resolve the variation present at scales greater than this. A sampling interval of 20-25 m should ensure that the main spatial structures are identified for isoproturon degradation rate and sorption without too great a loss of information in this field. - Estimating the spatial scale of herbicide and soil interactions by nested sampling.
International Nuclear Information System (INIS)
Price, Oliver R.; Oliver, Margaret A.; Walker, Allan; Wood, Martin
2009-01-01
An unbalanced nested sampling design was used to investigate the spatial scale of soil and herbicide interactions at the field scale. A hierarchical analysis of variance based on residual maximum likelihood (REML) was used to analyse the data and provide a first estimate of the variogram. Soil samples were taken at 108 locations at a range of separating distances in a 9 ha field to explore small and medium scale spatial variation. Soil organic matter content, pH, particle size distribution, microbial biomass and the degradation and sorption of the herbicide, isoproturon, were determined for each soil sample. A large proportion of the spatial variation in isoproturon degradation and sorption occurred at sampling intervals less than 60 m, however, the sampling design did not resolve the variation present at scales greater than this. A sampling interval of 20-25 m should ensure that the main spatial structures are identified for isoproturon degradation rate and sorption without too great a loss of information in this field. - Estimating the spatial scale of herbicide and soil interactions by nested sampling.
Power and sample-size estimation for microbiome studies using pairwise distances and PERMANOVA
Kelly, Brendan J.; Gross, Robert; Bittinger, Kyle; Sherrill-Mix, Scott; Lewis, James D.; Collman, Ronald G.; Bushman, Frederic D.; Li, Hongzhe
2015-01-01
Motivation: The variation in community composition between microbiome samples, termed beta diversity, can be measured by pairwise distance based on either presence–absence or quantitative species abundance data. PERMANOVA, a permutation-based extension of multivariate analysis of variance to a matrix of pairwise distances, partitions within-group and between-group distances to permit assessment of the effect of an exposure or intervention (grouping factor) upon the sampled microbiome. Within-...
Spatially explicit population estimates for black bears based on cluster sampling
Humm, J.; McCown, J. Walter; Scheick, B.K.; Clark, Joseph D.
2017-01-01
We estimated abundance and density of the 5 major black bear (Ursus americanus) subpopulations (i.e., Eglin, Apalachicola, Osceola, Ocala-St. Johns, Big Cypress) in Florida, USA with spatially explicit capture-mark-recapture (SCR) by extracting DNA from hair samples collected at barbed-wire hair sampling sites. We employed a clustered sampling configuration with sampling sites arranged in 3 × 3 clusters spaced 2 km apart within each cluster and cluster centers spaced 16 km apart (center to center). We surveyed all 5 subpopulations encompassing 38,960 km2 during 2014 and 2015. Several landscape variables, most associated with forest cover, helped refine density estimates for the 5 subpopulations we sampled. Detection probabilities were affected by site-specific behavioral responses coupled with individual capture heterogeneity associated with sex. Model-averaged bear population estimates ranged from 120 (95% CI = 59–276) bears or a mean 0.025 bears/km2 (95% CI = 0.011–0.44) for the Eglin subpopulation to 1,198 bears (95% CI = 949–1,537) or 0.127 bears/km2 (95% CI = 0.101–0.163) for the Ocala-St. Johns subpopulation. The total population estimate for our 5 study areas was 3,916 bears (95% CI = 2,914–5,451). The clustered sampling method coupled with information on land cover was efficient and allowed us to estimate abundance across extensive areas that would not have been possible otherwise. Clustered sampling combined with spatially explicit capture-recapture methods has the potential to provide rigorous population estimates for a wide array of species that are extensive and heterogeneous in their distribution.
Gim, Jungsoo; Won, Sungho; Park, Taesung
2016-10-01
High throughput sequencing technology in transcriptomics studies contribute to the understanding of gene regulation mechanism and its cellular function, but also increases a need for accurate statistical methods to assess quantitative differences between experiments. Many methods have been developed to account for the specifics of count data: non-normality, a dependence of the variance on the mean, and small sample size. Among them, the small number of samples in typical experiments is still a challenge. Here we present a method for differential analysis of count data, using conditional estimation of local pooled dispersion parameters. A comprehensive evaluation of our proposed method in the aspect of differential gene expression analysis using both simulated and real data sets shows that the proposed method is more powerful than other existing methods while controlling the false discovery rates. By introducing conditional estimation of local pooled dispersion parameters, we successfully overcome the limitation of small power and enable a powerful quantitative analysis focused on differential expression test with the small number of samples.
Grulke, Eric A.; Wu, Xiaochun; Ji, Yinglu; Buhr, Egbert; Yamamoto, Kazuhiro; Song, Nam Woong; Stefaniak, Aleksandr B.; Schwegler-Berry, Diane; Burchett, Woodrow W.; Lambert, Joshua; Stromberg, Arnold J.
2018-04-01
Size and shape distributions of gold nanorod samples are critical to their physico-chemical properties, especially their longitudinal surface plasmon resonance. This interlaboratory comparison study developed methods for measuring and evaluating size and shape distributions for gold nanorod samples using transmission electron microscopy (TEM) images. The objective was to determine whether two different samples, which had different performance attributes in their application, were different with respect to their size and/or shape descriptor distributions. Touching particles in the captured images were identified using a ruggedness shape descriptor. Nanorods could be distinguished from nanocubes using an elongational shape descriptor. A non-parametric statistical test showed that cumulative distributions of an elongational shape descriptor, that is, the aspect ratio, were statistically different between the two samples for all laboratories. While the scale parameters of size and shape distributions were similar for both samples, the width parameters of size and shape distributions were statistically different. This protocol fulfills an important need for a standardized approach to measure gold nanorod size and shape distributions for applications in which quantitative measurements and comparisons are important. Furthermore, the validated protocol workflow can be automated, thus providing consistent and rapid measurements of nanorod size and shape distributions for researchers, regulatory agencies, and industry.
Arnup, Sarah J; McKenzie, Joanne E; Hemming, Karla; Pilcher, David; Forbes, Andrew B
2017-08-15
In a cluster randomised crossover (CRXO) design, a sequence of interventions is assigned to a group, or 'cluster' of individuals. Each cluster receives each intervention in a separate period of time, forming 'cluster-periods'. Sample size calculations for CRXO trials need to account for both the cluster randomisation and crossover aspects of the design. Formulae are available for the two-period, two-intervention, cross-sectional CRXO design, however implementation of these formulae is known to be suboptimal. The aims of this tutorial are to illustrate the intuition behind the design; and provide guidance on performing sample size calculations. Graphical illustrations are used to describe the effect of the cluster randomisation and crossover aspects of the design on the correlation between individual responses in a CRXO trial. Sample size calculations for binary and continuous outcomes are illustrated using parameters estimated from the Australia and New Zealand Intensive Care Society - Adult Patient Database (ANZICS-APD) for patient mortality and length(s) of stay (LOS). The similarity between individual responses in a CRXO trial can be understood in terms of three components of variation: variation in cluster mean response; variation in the cluster-period mean response; and variation between individual responses within a cluster-period; or equivalently in terms of the correlation between individual responses in the same cluster-period (within-cluster within-period correlation, WPC), and between individual responses in the same cluster, but in different periods (within-cluster between-period correlation, BPC). The BPC lies between zero and the WPC. When the WPC and BPC are equal the precision gained by crossover aspect of the CRXO design equals the precision lost by cluster randomisation. When the BPC is zero there is no advantage in a CRXO over a parallel-group cluster randomised trial. Sample size calculations illustrate that small changes in the specification of
Low-sampling-rate ultra-wideband channel estimation using a bounded-data-uncertainty approach
Ballal, Tarig
2014-01-01
This paper proposes a low-sampling-rate scheme for ultra-wideband channel estimation. In the proposed scheme, P pulses are transmitted to produce P observations. These observations are exploited to produce channel impulse response estimates at a desired sampling rate, while the ADC operates at a rate that is P times less. To avoid loss of fidelity, the interpulse interval, given in units of sampling periods of the desired rate, is restricted to be co-prime with P. This condition is affected when clock drift is present and the transmitted pulse locations change. To handle this situation and to achieve good performance without using prior information, we derive an improved estimator based on the bounded data uncertainty (BDU) model. This estimator is shown to be related to the Bayesian linear minimum mean squared error (LMMSE) estimator. The performance of the proposed sub-sampling scheme was tested in conjunction with the new estimator. It is shown that high reduction in sampling rate can be achieved. The proposed estimator outperforms the least squares estimator in most cases; while in the high SNR regime, it also outperforms the LMMSE estimator. © 2014 IEEE.
Bayesian sample size determination for cost-effectiveness studies with censored data.
Directory of Open Access Journals (Sweden)
Daniel P Beavers
Full Text Available Cost-effectiveness models are commonly utilized to determine the combined clinical and economic impact of one treatment compared to another. However, most methods for sample size determination of cost-effectiveness studies assume fully observed costs and effectiveness outcomes, which presents challenges for survival-based studies in which censoring exists. We propose a Bayesian method for the design and analysis of cost-effectiveness data in which costs and effectiveness may be censored, and the sample size is approximated for both power and assurance. We explore two parametric models and demonstrate the flexibility of the approach to accommodate a variety of modifications to study assumptions.
Development of sample size allocation program using hypergeometric distribution
International Nuclear Information System (INIS)
Kim, Hyun Tae; Kwack, Eun Ho; Park, Wan Soo; Min, Kyung Soo; Park, Chan Sik
1996-01-01
The objective of this research is the development of sample allocation program using hypergeometric distribution with objected-oriented method. When IAEA(International Atomic Energy Agency) performs inspection, it simply applies a standard binomial distribution which describes sampling with replacement instead of a hypergeometric distribution which describes sampling without replacement in sample allocation to up to three verification methods. The objective of the IAEA inspection is the timely detection of diversion of significant quantities of nuclear material, therefore game theory is applied to its sampling plan. It is necessary to use hypergeometric distribution directly or approximate distribution to secure statistical accuracy. Improved binomial approximation developed by Mr. J. L. Jaech and correctly applied binomial approximation are more closer to hypergeometric distribution in sample size calculation than the simply applied binomial approximation of the IAEA. Object-oriented programs of 1. sample approximate-allocation with correctly applied standard binomial approximation, 2. sample approximate-allocation with improved binomial approximation, and 3. sample approximate-allocation with hypergeometric distribution were developed with Visual C ++ and corresponding programs were developed with EXCEL(using Visual Basic for Application). 8 tabs., 15 refs. (Author)
Uncertainties in effective dose estimates of adult CT head scans: The effect of head size
International Nuclear Information System (INIS)
Gregory, Kent J.; Bibbo, Giovanni; Pattison, John E.
2009-01-01
Purpose: This study is an extension of a previous study where the uncertainties in effective dose estimates from adult CT head scans were calculated using four CT effective dose estimation methods, three of which were computer programs (CT-EXPO, CTDOSIMETRY, and IMPACTDOSE) and one that involved the dose length product (DLP). However, that study did not include the uncertainty contribution due to variations in head sizes. Methods: The uncertainties due to head size variations were estimated by first using the computer program data to calculate doses to small and large heads. These doses were then compared with doses calculated for the phantom heads used by the computer programs. An uncertainty was then assigned based on the difference between the small and large head doses and the doses of the phantom heads. Results: The uncertainties due to head size variations alone were found to be between 4% and 26% depending on the method used and the patient gender. When these uncertainties were included with the results of the previous study, the overall uncertainties in effective dose estimates (stated at the 95% confidence interval) were 20%-31% (CT-EXPO), 15%-30% (CTDOSIMETRY), 20%-36% (IMPACTDOSE), and 31%-40% (DLP). Conclusions: For the computer programs, the lower overall uncertainties were still achieved when measured values of CT dose index were used rather than tabulated values. For DLP dose estimates, head size variations made the largest (for males) and second largest (for females) contributions to effective dose uncertainty. An improvement in the uncertainty of the DLP method dose estimates will be achieved if head size variation can be taken into account.
Uncertainties in effective dose estimates of adult CT head scans: The effect of head size
Energy Technology Data Exchange (ETDEWEB)
Gregory, Kent J.; Bibbo, Giovanni; Pattison, John E. [Department of Medical Physics, Royal Adelaide Hospital, Adelaide, South Australia 5000 (Australia) and School of Electrical and Information Engineering (Applied Physics), University of South Australia, Mawson Lakes, South Australia 5095 (Australia); Division of Medical Imaging, Women' s and Children' s Hospital, North Adelaide, South Australia 5006 (Australia) and School of Electrical and Information Engineering (Applied Physics), University of South Australia, Mawson Lakes, South Australia 5095 (Australia); School of Electrical and Information Engineering (Applied Physics), University of South Australia, Mawson Lakes, South Australia 5095 (Australia)
2009-09-15
Purpose: This study is an extension of a previous study where the uncertainties in effective dose estimates from adult CT head scans were calculated using four CT effective dose estimation methods, three of which were computer programs (CT-EXPO, CTDOSIMETRY, and IMPACTDOSE) and one that involved the dose length product (DLP). However, that study did not include the uncertainty contribution due to variations in head sizes. Methods: The uncertainties due to head size variations were estimated by first using the computer program data to calculate doses to small and large heads. These doses were then compared with doses calculated for the phantom heads used by the computer programs. An uncertainty was then assigned based on the difference between the small and large head doses and the doses of the phantom heads. Results: The uncertainties due to head size variations alone were found to be between 4% and 26% depending on the method used and the patient gender. When these uncertainties were included with the results of the previous study, the overall uncertainties in effective dose estimates (stated at the 95% confidence interval) were 20%-31% (CT-EXPO), 15%-30% (CTDOSIMETRY), 20%-36% (IMPACTDOSE), and 31%-40% (DLP). Conclusions: For the computer programs, the lower overall uncertainties were still achieved when measured values of CT dose index were used rather than tabulated values. For DLP dose estimates, head size variations made the largest (for males) and second largest (for females) contributions to effective dose uncertainty. An improvement in the uncertainty of the DLP method dose estimates will be achieved if head size variation can be taken into account.
The use of Thompson sampling to increase estimation precision
Kaptein, M.C.
2015-01-01
In this article, we consider a sequential sampling scheme for efficient estimation of the difference between the means of two independent treatments when the population variances are unequal across groups. The sampling scheme proposed is based on a solution to bandit problems called Thompson
Dispersion and sampling of adult Dermacentor andersoni in rangeland in Western North America.
Rochon, K; Scoles, G A; Lysyk, T J
2012-03-01
A fixed precision sampling plan was developed for off-host populations of adult Rocky Mountain wood tick, Dermacentor andersoni (Stiles) based on data collected by dragging at 13 locations in Alberta, Canada; Washington; and Oregon. In total, 222 site-date combinations were sampled. Each site-date combination was considered a sample, and each sample ranged in size from 86 to 250 10 m2 quadrats. Analysis of simulated quadrats ranging in size from 10 to 50 m2 indicated that the most precise sample unit was the 10 m2 quadrat. Samples taken when abundance mean-variance relationships were fit and used to predict sample sizes for a fixed level of precision. Sample sizes predicted using the Taylor model tended to underestimate actual sample sizes, while sample sizes estimated using the Iwao model tended to overestimate actual sample sizes. Using a negative binomial with common k provided estimates of required sample sizes closest to empirically calculated sample sizes.
Estimation of river and stream temperature trends under haphazard sampling
Gray, Brian R.; Lyubchich, Vyacheslav; Gel, Yulia R.; Rogala, James T.; Robertson, Dale M.; Wei, Xiaoqiao
2015-01-01
Long-term temporal trends in water temperature in rivers and streams are typically estimated under the assumption of evenly-spaced space-time measurements. However, sampling times and dates associated with historical water temperature datasets and some sampling designs may be haphazard. As a result, trends in temperature may be confounded with trends in time or space of sampling which, in turn, may yield biased trend estimators and thus unreliable conclusions. We address this concern using multilevel (hierarchical) linear models, where time effects are allowed to vary randomly by day and date effects by year. We evaluate the proposed approach by Monte Carlo simulations with imbalance, sparse data and confounding by trend in time and date of sampling. Simulation results indicate unbiased trend estimators while results from a case study of temperature data from the Illinois River, USA conform to river thermal assumptions. We also propose a new nonparametric bootstrap inference on multilevel models that allows for a relatively flexible and distribution-free quantification of uncertainties. The proposed multilevel modeling approach may be elaborated to accommodate nonlinearities within days and years when sampling times or dates typically span temperature extremes.
Effects of sample size on robustness and prediction accuracy of a prognostic gene signature
Directory of Open Access Journals (Sweden)
Kim Seon-Young
2009-05-01
Full Text Available Abstract Background Few overlap between independently developed gene signatures and poor inter-study applicability of gene signatures are two of major concerns raised in the development of microarray-based prognostic gene signatures. One recent study suggested that thousands of samples are needed to generate a robust prognostic gene signature. Results A data set of 1,372 samples was generated by combining eight breast cancer gene expression data sets produced using the same microarray platform and, using the data set, effects of varying samples sizes on a few performances of a prognostic gene signature were investigated. The overlap between independently developed gene signatures was increased linearly with more samples, attaining an average overlap of 16.56% with 600 samples. The concordance between predicted outcomes by different gene signatures also was increased with more samples up to 94.61% with 300 samples. The accuracy of outcome prediction also increased with more samples. Finally, analysis using only Estrogen Receptor-positive (ER+ patients attained higher prediction accuracy than using both patients, suggesting that sub-type specific analysis can lead to the development of better prognostic gene signatures Conclusion Increasing sample sizes generated a gene signature with better stability, better concordance in outcome prediction, and better prediction accuracy. However, the degree of performance improvement by the increased sample size was different between the degree of overlap and the degree of concordance in outcome prediction, suggesting that the sample size required for a study should be determined according to the specific aims of the study.
International Nuclear Information System (INIS)
Bavio, José; Marrón, Beatriz
2014-01-01
Quality of service (QoS) for internet traffic management requires good traffic models and good estimation of sharing network resource. A link of a network processes all traffic and it is designed with certain capacity C and buffer size B. A Generalized Markov Fluid model (GMFM), introduced by Marrón (2011), is assumed for the sources because describes in a versatile way the traffic, allows estimation based on traffic traces, and also consistent effective bandwidth estimation can be done. QoS, interpreted as buffer overflow probability, can be estimated for GMFM through the effective bandwidth estimation and solving the optimization problem presented in Courcoubetis (2002), the so call inf-sup formulas. In this work we implement a code to solve the inf-sup problem and other optimization related with it, that allow us to do traffic engineering in links of data networks to calculate both, minimum capacity required when QoS and buffer size are given or minimum buffer size required when QoS and capacity are given
Sampling and chemical analysis by TXRF of size-fractionated ambient aerosols and emissions
International Nuclear Information System (INIS)
John, A.C.; Kuhlbusch, T.A.J.; Fissan, H.; Schmidt, K.-G-; Schmidt, F.; Pfeffer, H.-U.; Gladtke, D.
2000-01-01
Results of recent epidemiological studies led to new European air quality standards which require the monitoring of particles with aerodynamic diameters ≤ 10 μm (PM 10) and ≤ 2.5 μm (PM 2.5) instead of TSP (total suspended particulate matter). As these ambient air limit values will be exceeded most likely at several locations in Europe, so-called 'action plans' have to be set up to reduce particle concentrations, which requires information about sources and processes of PMx aerosols. For chemical characterization of the aerosols, different samplers were used and total reflection x-ray fluorescence analysis (TXRF) was applied beside other methods (elemental and organic carbon analysis, ion chromatography, atomic absorption spectrometry). For TXRF analysis, a specially designed sampling unit was built where the particle size classes 10-2.5 μm and 2.5-1.0 μm were directly impacted on TXRF sample carriers. An electrostatic precipitator (ESP) was used as a back-up filter to collect particles <1 μm directly on a TXRF sample carrier. The sampling unit was calibrated in the laboratory and then used for field measurements to determine the elemental composition of the mentioned particle size fractions. One of the field campaigns was carried out at a measurement site in Duesseldorf, Germany, in November 1999. As the composition of the ambient aerosols may have been influenced by a large construction site directly in the vicinity of the station during the field campaign, not only the aerosol particles, but also construction material was sampled and analyzed by TXRF. As air quality is affected by natural and anthropogenic sources, the emissions of particles ≤ 10 μm and ≤ 2.5 μm, respectively, have to be determined to estimate their contributions to the so called coarse and fine particle modes of ambient air. Therefore, an in-stack particle sampling system was developed according to the new ambient air quality standards. This PM 10/PM 2.5 cascade impactor was
Replicon sizes in mammalian cells as estimated by an x-ray plus bromodeoxyuridine photolysis method
International Nuclear Information System (INIS)
Kapp, L.N.; Painter, R.B.
1978-01-01
A new method is described for estimating replicon sizes in mammalian cells. Cultures were pulse labeled with [ 3 H]thymidine ([ 3 H]TdR) and bromodeoxyuridine (BrDUrd) for up to 1 h. The lengths of the resulting labeled regions of DNA, L/sub obs/, were estimated by a technique wherein the change in molecular weight of nascent DNA strands, induced by 313 nm light, is measured by velocity sedimentation in alkaline sucrose gradients. If cells are exposed to 1,000 rads of x rays immediately before pulse labeling, initiation of replicon operation is blocked, although chain elongation proceeds almost normally. Under these conditions L/sub obs/ continues to increase only until operating replicons have completed their replication. This value for L/sub obs/ then remains constant as long as the block to initiation remains and represents an estimate for the average size of replicons operating in the cells before x irradiation. For human diploid fibroblasts and human HeLa cells this estimated average size is approximately 17 μM, whereas for Chinese hamster ovary cells, the average replicon size is about 42 μM
Sayers, Adrian; Crowther, Michael J; Judge, Andrew; Whitehouse, Michael R; Blom, Ashley W
2017-08-28
The use of benchmarks to assess the performance of implants such as those used in arthroplasty surgery is a widespread practice. It provides surgeons, patients and regulatory authorities with the reassurance that implants used are safe and effective. However, it is not currently clear how or how many implants should be statistically compared with a benchmark to assess whether or not that implant is superior, equivalent, non-inferior or inferior to the performance benchmark of interest.We aim to describe the methods and sample size required to conduct a one-sample non-inferiority study of a medical device for the purposes of benchmarking. Simulation study. Simulation study of a national register of medical devices. We simulated data, with and without a non-informative competing risk, to represent an arthroplasty population and describe three methods of analysis (z-test, 1-Kaplan-Meier and competing risks) commonly used in surgical research. We evaluate the performance of each method using power, bias, root-mean-square error, coverage and CI width. 1-Kaplan-Meier provides an unbiased estimate of implant net failure, which can be used to assess if a surgical device is non-inferior to an external benchmark. Small non-inferiority margins require significantly more individuals to be at risk compared with current benchmarking standards. A non-inferiority testing paradigm provides a useful framework for determining if an implant meets the required performance defined by an external benchmark. Current contemporary benchmarking standards have limited power to detect non-inferiority, and substantially larger samples sizes, in excess of 3200 procedures, are required to achieve a power greater than 60%. It is clear when benchmarking implant performance, net failure estimated using 1-KM is preferential to crude failure estimated by competing risk models. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2017. All rights reserved. No
Ndayongeje, Joel; Msami, Amani; Laurent, Yovin Ivo; Mwankemwa, Syangu; Makumbuli, Moza; Ngonyani, Alois M; Tiberio, Jenny; Welty, Susie; Said, Christen; Morris, Meghan D; McFarland, Willi
2018-02-12
We mapped hot spots and estimated the numbers of people who use drugs (PWUD) and who inject drugs (PWID) in 12 regions of Tanzania. Primary (ie, current and past PWUD) and secondary (eg, police, service providers) key informants identified potential hot spots, which we visited to verify and count the number of PWUD and PWID present. Adjustments to counts and extrapolation to regional estimates were done by local experts through iterative rounds of discussion. Drug use, specifically cocaine and heroin, occurred in all regions. Tanga had the largest numbers of PWUD and PWID (5190 and 540, respectively), followed by Mwanza (3300 and 300, respectively). Findings highlight the need to strengthen awareness of drug use and develop prevention and harm reduction programs with broader reach in Tanzania. This exercise provides a foundation for understanding the extent and locations of drug use, a baseline for future size estimations, and a sampling frame for future research.
Density meter algorithm and system for estimating sampling/mixing uncertainty
International Nuclear Information System (INIS)
Shine, E.P.
1986-01-01
The Laboratories Department at the Savannah River Plant (SRP) has installed a six-place density meter with an automatic sampling device. This paper describes the statistical software developed to analyze the density of uranyl nitrate solutions using this automated system. The purpose of this software is twofold: to estimate the sampling/mixing and measurement uncertainties in the process and to provide a measurement control program for the density meter. Non-uniformities in density are analyzed both analytically and graphically. The mean density and its limit of error are estimated. Quality control standards are analyzed concurrently with process samples and used to control the density meter measurement error. The analyses are corrected for concentration due to evaporation of samples waiting to be analyzed. The results of this program have been successful in identifying sampling/mixing problems and controlling the quality of analyses
Density meter algorithm and system for estimating sampling/mixing uncertainty
International Nuclear Information System (INIS)
Shine, E.P.
1986-01-01
The Laboratories Department at the Savannah River Plant (SRP) has installed a six-place density meter with an automatic sampling device. This paper describes the statisical software developed to analyze the density of uranyl nitrate solutions using this automated system. The purpose of this software is twofold: to estimate the sampling/mixing and measurement uncertainties in the process and to provide a measurement control program for the density meter. Non-uniformities in density are analyzed both analytically and graphically. The mean density and its limit of error are estimated. Quality control standards are analyzed concurrently with process samples and used to control the density meter measurement error. The analyses are corrected for concentration due to evaporation of samples waiting to be analyzed. The results of this program have been successful in identifying sampling/mixing problems and controlling the quality of analyses
Smith, D.R.; Rogala, J.T.; Gray, B.R.; Zigler, S.J.; Newton, T.J.
2011-01-01
Reliable estimates of abundance are needed to assess consequences of proposed habitat restoration and enhancement projects on freshwater mussels in the Upper Mississippi River (UMR). Although there is general guidance on sampling techniques for population assessment of freshwater mussels, the actual performance of sampling designs can depend critically on the population density and spatial distribution at the project site. To evaluate various sampling designs, we simulated sampling of populations, which varied in density and degree of spatial clustering. Because of logistics and costs of large river sampling and spatial clustering of freshwater mussels, we focused on adaptive and non-adaptive versions of single and two-stage sampling. The candidate designs performed similarly in terms of precision (CV) and probability of species detection for fixed sample size. Both CV and species detection were determined largely by density, spatial distribution and sample size. However, designs did differ in the rate that occupied quadrats were encountered. Occupied units had a higher probability of selection using adaptive designs than conventional designs. We used two measures of cost: sample size (i.e. number of quadrats) and distance travelled between the quadrats. Adaptive and two-stage designs tended to reduce distance between sampling units, and thus performed better when distance travelled was considered. Based on the comparisons, we provide general recommendations on the sampling designs for the freshwater mussels in the UMR, and presumably other large rivers.
International Nuclear Information System (INIS)
Zhang, Song; Rajamani, Rajesh
2016-01-01
This paper develops analytical sensing principles for estimation of circumferential size of a cylindrical surface using magnetic sensors. An electromagnet and magnetic sensors are used on a wearable band for measurement of leg size. In order to enable robust size estimation during rough real-world use of the wearable band, three estimation algorithms are developed based on models of the magnetic field variation over a cylindrical surface. The magnetic field models developed include those for a dipole and for a uniformly magnetized cylinder. The estimation algorithms used include a linear regression equation, an extended Kalman filter and an unscented Kalman filter. Experimental laboratory tests show that the size sensor in general performs accurately, yielding sub-millimeter estimation errors. The unscented Kalman filter yields the best performance that is robust to bias and misalignment errors. The size sensor developed herein can be used for monitoring swelling due to fluid accumulation in the lower leg and a number of other biomedical applications. (paper)
Volatile and non-volatile elements in grain-size separated samples of Apollo 17 lunar soils
International Nuclear Information System (INIS)
Giovanoli, R.; Gunten, H.R. von; Kraehenbuehl, U.; Meyer, G.; Wegmueller, F.; Gruetter, A.; Wyttenbach, A.
1977-01-01
Three samples of Apollo 17 lunar soils (75081, 72501 and 72461) were separated into 9 grain-size fractions between 540 and 1 μm mean diameter. In order to detect mineral fractionations caused during the separation procedures major elements were determined by instrumental neutron activation analyses performed on small aliquots of the separated samples. Twenty elements were measured in each size fraction using instrumental and radiochemical neutron activation techniques. The concentration of the main elements in sample 75081 does not change with the grain-size. Exceptions are Fe and Ti which decrease slightly and Al which increases slightly with the decrease in the grain-size. These changes in the composition in main elements suggest a decrease in Ilmenite and an increase in Anorthite with decreasing grain-size. However, it can be concluded that the mineral composition of the fractions changes less than a factor of 2. Samples 72501 and 72461 are not yet analyzed for the main elements. (Auth.)
Lawson, Chris A
2014-07-01
Three experiments with 81 3-year-olds (M=3.62years) examined the conditions that enable young children to use the sample size principle (SSP) of induction-the inductive rule that facilitates generalizations from large rather than small samples of evidence. In Experiment 1, children exhibited the SSP when exemplars were presented sequentially but not when exemplars were presented simultaneously. Results from Experiment 3 suggest that the advantage of sequential presentation is not due to the additional time to process the available input from the two samples but instead may be linked to better memory for specific individuals in the large sample. In addition, findings from Experiments 1 and 2 suggest that adherence to the SSP is mediated by the disparity between presented samples. Overall, these results reveal that the SSP appears early in development and is guided by basic cognitive processes triggered during the acquisition of input. Copyright © 2013 Elsevier Inc. All rights reserved.
DEFF Research Database (Denmark)
Nielsen, Morten Ø.; Frederiksen, Per Houmann
2005-01-01
In this paper we compare through Monte Carlo simulations the finite sample properties of estimators of the fractional differencing parameter, d. This involves frequency domain, time domain, and wavelet based approaches, and we consider both parametric and semiparametric estimation methods. The es...... the time domain parametric methods, and (4) without sufficient trimming of scales the wavelet-based estimators are heavily biased.......In this paper we compare through Monte Carlo simulations the finite sample properties of estimators of the fractional differencing parameter, d. This involves frequency domain, time domain, and wavelet based approaches, and we consider both parametric and semiparametric estimation methods....... The estimators are briefly introduced and compared, and the criteria adopted for measuring finite sample performance are bias and root mean squared error. Most importantly, the simulations reveal that (1) the frequency domain maximum likelihood procedure is superior to the time domain parametric methods, (2) all...
International Nuclear Information System (INIS)
Baek, Ji Eun; Kim, Sung Hun; Lee, Ah Won
2014-01-01
Objective: To evaluate whether the degree of background parenchymal enhancement affects the accuracy of tumor size estimation based on breast MRI. Methods: Three hundred and twenty-two patients who had known breast cancer and underwent breast MRIs were recruited in our study. The total number of breast cancer cases was 339. All images were assessed retrospectively for the level of background parenchymal enhancement based on the BI-RADS criteria. Maximal lesion diameters were measured on the MRIs, and tumor types (mass vs. non-mass) were assessed. Tumor size differences between the MRI-based estimates and estimates based on pathological examinations were analyzed. The relationship between accuracy and tumor types and clinicopathologic features were also evaluated. Results: The cases included minimal (47.5%), mild (28.9%), moderate (12.4%) and marked background parenchymal enhancement (11.2%). The tumors of patients with minimal or mild background parenchymal enhancement were more accurately estimated than those of patients with moderate or marked enhancement (72.1% vs. 56.8%; p = 0.003). The tumors of women with mass type lesions were significantly more accurately estimated than those of the women with non-mass type lesions (81.6% vs. 28.6%; p < 0.001). The tumor of women negative for HER2 was more accurately estimated than those of women positive for HER2 (72.2% vs. 51.6%; p = 0.047). Conclusion: Moderate and marked background parenchymal enhancement is related to the inaccurate estimation of tumor size based on MRI. Non-mass type breast cancer and HER2-positive breast cancer are other factors that may cause inaccurate assessment of tumor size
Fang, Guor-Cheng; Chang, Kuan-Foo; Lu, Chungsying; Bai, Hsunling
2004-05-01
The concentrations of polycyclic aromatic hydrocarbons (PAHs) in gas phase and particle bound were measured simultaneously at industrial (INDUSTRY), urban (URBAN), and rural areas (RURAL) in Taichung, Taiwan. And the PAH concentrations, size distributions, estimated PAHs dry deposition fluxes and health risk study of PAHs in the ambient air of central Taiwan were discussed in this study. Total PAH concentrations at INDUSTRY, URBAN, and RURAL sampling sites were found to be 1650 +/- 1240, 1220 +/- 520, and 831 +/- 427 ng/m3, respectively. The results indicated that PAH concentrations were higher at INDUSTRY and URBAN sampling sites than the RURAL sampling sites because of the more industrial processes, traffic exhausts and human activities. The estimation dry deposition and size distribution of PAHs were also studied. The results indicated that the estimated dry deposition fluxes of total PAHs were 58.5, 48.8, and 38.6 microg/m2/day at INDUSTRY, URBAN, and RURAL, respectively. The BaP equivalency results indicated that the health risk of gas phase PAHs were higher than the particle phase at three sampling sites of central Taiwan. However, compared with the BaP equivalency results to other studies conducted in factory, this study indicated the health risk of PAHs was acceptable in the ambient air of central Taiwan.
Inventory implications of using sampling variances in estimation of growth model coefficients
Albert R. Stage; William R. Wykoff
2000-01-01
Variables based on stand densities or stocking have sampling errors that depend on the relation of tree size to plot size and on the spatial structure of the population, ignoring the sampling errors of such variables, which include most measures of competition used in both distance-dependent and distance-independent growth models, can bias the predictions obtained from...
McCarthy, David T; Zhang, Kefeng; Westerlund, Camilla; Viklander, Maria; Bertrand-Krajewski, Jean-Luc; Fletcher, Tim D; Deletic, Ana
2018-02-01
The estimation of stormwater pollutant concentrations is a primary requirement of integrated urban water management. In order to determine effective sampling strategies for estimating pollutant concentrations, data from extensive field measurements at seven different catchments was used. At all sites, 1-min resolution continuous flow measurements, as well as flow-weighted samples, were taken and analysed for total suspend solids (TSS), total nitrogen (TN) and Escherichia coli (E. coli). For each of these parameters, the data was used to calculate the Event Mean Concentrations (EMCs) for each event. The measured Site Mean Concentrations (SMCs) were taken as the volume-weighted average of these EMCs for each parameter, at each site. 17 different sampling strategies, including random and fixed strategies were tested to estimate SMCs, which were compared with the measured SMCs. The ratios of estimated/measured SMCs were further analysed to determine the most effective sampling strategies. Results indicate that the random sampling strategies were the most promising method in reproducing SMCs for TSS and TN, while some fixed sampling strategies were better for estimating the SMC of E. coli. The differences in taking one, two or three random samples were small (up to 20% for TSS, and 10% for TN and E. coli), indicating that there is little benefit in investing in collection of more than one sample per event if attempting to estimate the SMC through monitoring of multiple events. It was estimated that an average of 27 events across the studied catchments are needed for characterising SMCs of TSS with a 90% confidence interval (CI) width of 1.0, followed by E.coli (average 12 events) and TN (average 11 events). The coefficient of variation of pollutant concentrations was linearly and significantly correlated to the 90% confidence interval ratio of the estimated/measured SMCs (R 2 = 0.49; P sampling frequency needed to accurately estimate SMCs of pollutants. Crown
A model to estimate the size of nanoparticle agglomerates in gas−solid fluidized beds
Energy Technology Data Exchange (ETDEWEB)
Martín, Lilian de, E-mail: L.DeMartinMonton@tudelft.nl; Ommen, J. Ruud van [Delft University of Technology, Department of Chemical Engineering (Netherlands)
2013-11-15
The estimation of nanoparticle agglomerates’ size in fluidized beds remains an open challenge, mainly due to the difficulty of characterizing the inter-agglomerate van der Waals force. The current approach is to describe micron-sized nanoparticle agglomerates as micron-sized particles with 0.1–0.2-μm asperities. This simplification does not capture the influence of the particle size on the van der Waals attraction between agglomerates. In this paper, we propose a new description where the agglomerates are micron-sized particles with nanoparticles on the surface, acting as asperities. As opposed to previous models, here the van der Waals force between agglomerates decreases with an increase in the particle size. We have also included an additional force due to the hydrogen bond formation between the surfaces of hydrophilic and dry nanoparticles. The average size of the fluidized agglomerates has been estimated equating the attractive force obtained from this method to the weight of the individual agglomerates. The results have been compared to 54 experimental values, most of them collected from the literature. Our model approximates without a systematic error the size of most of the nanopowders, both in conventional and centrifugal fluidized beds, outperforming current models. Although simple, the model is able to capture the influence of the nanoparticle size, particle density, and Hamaker coefficient on the inter-agglomerate forces.
A model to estimate the size of nanoparticle agglomerates in gas−solid fluidized beds
International Nuclear Information System (INIS)
Martín, Lilian de; Ommen, J. Ruud van
2013-01-01
The estimation of nanoparticle agglomerates’ size in fluidized beds remains an open challenge, mainly due to the difficulty of characterizing the inter-agglomerate van der Waals force. The current approach is to describe micron-sized nanoparticle agglomerates as micron-sized particles with 0.1–0.2-μm asperities. This simplification does not capture the influence of the particle size on the van der Waals attraction between agglomerates. In this paper, we propose a new description where the agglomerates are micron-sized particles with nanoparticles on the surface, acting as asperities. As opposed to previous models, here the van der Waals force between agglomerates decreases with an increase in the particle size. We have also included an additional force due to the hydrogen bond formation between the surfaces of hydrophilic and dry nanoparticles. The average size of the fluidized agglomerates has been estimated equating the attractive force obtained from this method to the weight of the individual agglomerates. The results have been compared to 54 experimental values, most of them collected from the literature. Our model approximates without a systematic error the size of most of the nanopowders, both in conventional and centrifugal fluidized beds, outperforming current models. Although simple, the model is able to capture the influence of the nanoparticle size, particle density, and Hamaker coefficient on the inter-agglomerate forces
Arnup, Sarah J; McKenzie, Joanne E; Pilcher, David; Bellomo, Rinaldo; Forbes, Andrew B
2018-06-01
The cluster randomised crossover (CRXO) design provides an opportunity to conduct randomised controlled trials to evaluate low risk interventions in the intensive care setting. Our aim is to provide a tutorial on how to perform a sample size calculation for a CRXO trial, focusing on the meaning of the elements required for the calculations, with application to intensive care trials. We use all-cause in-hospital mortality from the Australian and New Zealand Intensive Care Society Adult Patient Database clinical registry to illustrate the sample size calculations. We show sample size calculations for a two-intervention, two 12-month period, cross-sectional CRXO trial. We provide the formulae, and examples of their use, to determine the number of intensive care units required to detect a risk ratio (RR) with a designated level of power between two interventions for trials in which the elements required for sample size calculations remain constant across all ICUs (unstratified design); and in which there are distinct groups (strata) of ICUs that differ importantly in the elements required for sample size calculations (stratified design). The CRXO design markedly reduces the sample size requirement compared with the parallel-group, cluster randomised design for the example cases. The stratified design further reduces the sample size requirement compared with the unstratified design. The CRXO design enables the evaluation of routinely used interventions that can bring about small, but important, improvements in patient care in the intensive care setting.
Gray bootstrap method for estimating frequency-varying random vibration signals with small samples
Directory of Open Access Journals (Sweden)
Wang Yanqing
2014-04-01
Full Text Available During environment testing, the estimation of random vibration signals (RVS is an important technique for the airborne platform safety and reliability. However, the available methods including extreme value envelope method (EVEM, statistical tolerances method (STM and improved statistical tolerance method (ISTM require large samples and typical probability distribution. Moreover, the frequency-varying characteristic of RVS is usually not taken into account. Gray bootstrap method (GBM is proposed to solve the problem of estimating frequency-varying RVS with small samples. Firstly, the estimated indexes are obtained including the estimated interval, the estimated uncertainty, the estimated value, the estimated error and estimated reliability. In addition, GBM is applied to estimating the single flight testing of certain aircraft. At last, in order to evaluate the estimated performance, GBM is compared with bootstrap method (BM and gray method (GM in testing analysis. The result shows that GBM has superiority for estimating dynamic signals with small samples and estimated reliability is proved to be 100% at the given confidence level.
Optimizing denominator data estimation through a multimodel approach
Directory of Open Access Journals (Sweden)
Ward Bryssinckx
2014-05-01
Full Text Available To assess the risk of (zoonotic disease transmission in developing countries, decision makers generally rely on distribution estimates of animals from survey records or projections of historical enumeration results. Given the high cost of large-scale surveys, the sample size is often restricted and the accuracy of estimates is therefore low, especially when spatial high-resolution is applied. This study explores possibilities of improving the accuracy of livestock distribution maps without additional samples using spatial modelling based on regression tree forest models, developed using subsets of the Uganda 2008 Livestock Census data, and several covariates. The accuracy of these spatial models as well as the accuracy of an ensemble of a spatial model and direct estimate was compared to direct estimates and “true” livestock figures based on the entire dataset. The new approach is shown to effectively increase the livestock estimate accuracy (median relative error decrease of 0.166-0.037 for total sample sizes of 80-1,600 animals, respectively. This outcome suggests that the accuracy levels obtained with direct estimates can indeed be achieved with lower sample sizes and the multimodel approach presented here, indicating a more efficient use of financial resources.
Vereecken, Carine; Dohogne, Sophie; Covents, Marc; Maes, Lea
2010-06-01
Computer-administered questionnaires have received increased attention for large-scale population research on nutrition. In Belgium-Flanders, Young Adolescents' Nutrition Assessment on Computer (YANA-C) has been developed. In this tool, standardised photographs are available to assist in portion-size estimation. The purpose of the present study is to assess how accurate adolescents are in estimating portion sizes of food using YANA-C. A convenience sample, aged 11-17 years, estimated the amounts of ten commonly consumed foods (breakfast cereals, French fries, pasta, rice, apple sauce, carrots and peas, crisps, creamy velouté, red cabbage, and peas). Two procedures were followed: (1) short-term recall: adolescents (n 73) self-served their usual portions of the ten foods and estimated the amounts later the same day; (2) real-time perception: adolescents (n 128) estimated two sets (different portions) of pre-weighed portions displayed near the computer. Self-served portions were, on average, 8 % underestimated; significant underestimates were found for breakfast cereals, French fries, peas, and carrots and peas. Spearman's correlations between the self-served and estimated weights varied between 0.51 and 0.84, with an average of 0.72. The kappa statistics were moderate (>0.4) for all but one item. Pre-weighed portions were, on average, 15 % underestimated, with significant underestimates for fourteen of the twenty portions. Photographs of food items can serve as a good aid in ranking subjects; however, to assess the actual intake at a group level, underestimation must be considered.
Influence of Sampling Effort on the Estimated Richness of Road-Killed Vertebrate Wildlife
Bager, Alex; da Rosa, Clarissa A.
2011-05-01
Road-killed mammals, birds, and reptiles were collected weekly from highways in southern Brazil in 2002 and 2005. The objective was to assess variation in estimates of road-kill impacts on species richness produced by different sampling efforts, and to provide information to aid in the experimental design of future sampling. Richness observed in weekly samples was compared with sampling for different periods. In each period, the list of road-killed species was evaluated based on estimates the community structure derived from weekly samplings, and by the presence of the ten species most subject to road mortality, and also of threatened species. Weekly samples were sufficient only for reptiles and mammals, considered separately. Richness estimated from the biweekly samples was equal to that found in the weekly samples, and gave satisfactory results for sampling the most abundant and threatened species. The ten most affected species showed constant road-mortality rates, independent of sampling interval, and also maintained their dominance structure. Birds required greater sampling effort. When the composition of road-killed species varies seasonally, it is necessary to take biweekly samples for a minimum of one year. Weekly or more-frequent sampling for periods longer than two years is necessary to provide a reliable estimate of total species richness.
Lee, Eun Gyung; Lee, Taekhee; Kim, Seung Won; Lee, Larry; Flemmer, Michael M; Harper, Martin
2014-01-01
This second, and concluding, part of this study evaluated changes in sampling efficiency of respirable size-selective samplers due to air pulsations generated by the selected personal sampling pumps characterized in Part I (Lee E, Lee L, Möhlmann C et al. Evaluation of pump pulsation in respirable size-selective sampling: Part I. Pulsation measurements. Ann Occup Hyg 2013). Nine particle sizes of monodisperse ammonium fluorescein (from 1 to 9 μm mass median aerodynamic diameter) were generated individually by a vibrating orifice aerosol generator from dilute solutions of fluorescein in aqueous ammonia and then injected into an environmental chamber. To collect these particles, 10-mm nylon cyclones, also known as Dorr-Oliver (DO) cyclones, were used with five medium volumetric flow rate pumps. Those were the Apex IS, HFS513, GilAir5, Elite5, and Basic5 pumps, which were found in Part I to generate pulsations of 5% (the lowest), 25%, 30%, 56%, and 70% (the highest), respectively. GK2.69 cyclones were used with the Legacy [pump pulsation (PP) = 15%] and Elite12 (PP = 41%) pumps for collection at high flows. The DO cyclone was also used to evaluate changes in sampling efficiency due to pulse shape. The HFS513 pump, which generates a more complex pulse shape, was compared to a single sine wave fluctuation generated by a piston. The luminescent intensity of the fluorescein extracted from each sample was measured with a luminescence spectrometer. Sampling efficiencies were obtained by dividing the intensity of the fluorescein extracted from the filter placed in a cyclone with the intensity obtained from the filter used with a sharp-edged reference sampler. Then, sampling efficiency curves were generated using a sigmoid function with three parameters and each sampling efficiency curve was compared to that of the reference cyclone by constructing bias maps. In general, no change in sampling efficiency (bias under ±10%) was observed until pulsations exceeded 25% for the
Sample-size effects in fast-neutron gamma-ray production measurements: solid-cylinder samples
International Nuclear Information System (INIS)
Smith, D.L.
1975-09-01
The effects of geometry, absorption and multiple scattering in (n,Xγ) reaction measurements with solid-cylinder samples are investigated. Both analytical and Monte-Carlo methods are employed in the analysis. Geometric effects are shown to be relatively insignificant except in definition of the scattering angles. However, absorption and multiple-scattering effects are quite important; accurate microscopic differential cross sections can be extracted from experimental data only after a careful determination of corrections for these processes. The results of measurements performed using several natural iron samples (covering a wide range of sizes) confirm validity of the correction procedures described herein. It is concluded that these procedures are reliable whenever sufficiently accurate neutron and photon cross section and angular distribution information is available for the analysis. (13 figures, 5 tables) (auth)
Page sample size in web accessibility testing: how many pages is enough?
Velleman, Eric Martin; van der Geest, Thea
2013-01-01
Various countries and organizations use a different sampling approach and sample size of web pages in accessibility conformance tests. We are conducting a systematic analysis to determine how many pages is enough for testing whether a website is compliant with standard accessibility guidelines. This
Altschuler, Justin; Margolius, David; Bodenheimer, Thomas; Grumbach, Kevin
2012-01-01
PURPOSE Primary care faces the dilemma of excessive patient panel sizes in an environment of a primary care physician shortage. We aimed to estimate primary care panel sizes under different models of task delegation to nonphysician members of the primary care team. METHODS We used published estimates of the time it takes for a primary care physician to provide preventive, chronic, and acute care for a panel of 2,500 patients, and modeled how panel sizes would change if portions of preventive and chronic care services were delegated to nonphysician team members. RESULTS Using 3 assumptions about the degree of task delegation that could be achieved (77%, 60%, and 50% of preventive care, and 47%, 30%, and 25% of chronic care), we estimated that a primary care team could reasonably care for a panel of 1,947, 1,523, or 1,387 patients. CONCLUSIONS If portions of preventive and chronic care services are delegated to nonphysician team members, primary care practices can provide recommended preventive and chronic care with panel sizes that are achievable with the available primary care workforce.
Sensitivity of Mantel Haenszel Model and Rasch Model as Viewed From Sample Size
ALWI, IDRUS
2011-01-01
The aims of this research is to study the sensitivity comparison of Mantel Haenszel and Rasch Model for detection differential item functioning, observed from the sample size. These two differential item functioning (DIF) methods were compared using simulate binary item respon data sets of varying sample size, 200 and 400 examinees were used in the analyses, a detection method of differential item functioning (DIF) based on gender difference. These test conditions were replication 4 tim...
Estimating rare events in biochemical systems using conditional sampling
Sundar, V. S.
2017-01-01
The paper focuses on development of variance reduction strategies to estimate rare events in biochemical systems. Obtaining this probability using brute force Monte Carlo simulations in conjunction with the stochastic simulation algorithm (Gillespie's method) is computationally prohibitive. To circumvent this, important sampling tools such as the weighted stochastic simulation algorithm and the doubly weighted stochastic simulation algorithm have been proposed. However, these strategies require an additional step of determining the important region to sample from, which is not straightforward for most of the problems. In this paper, we apply the subset simulation method, developed as a variance reduction tool in the context of structural engineering, to the problem of rare event estimation in biochemical systems. The main idea is that the rare event probability is expressed as a product of more frequent conditional probabilities. These conditional probabilities are estimated with high accuracy using Monte Carlo simulations, specifically the Markov chain Monte Carlo method with the modified Metropolis-Hastings algorithm. Generating sample realizations of the state vector using the stochastic simulation algorithm is viewed as mapping the discrete-state continuous-time random process to the standard normal random variable vector. This viewpoint opens up the possibility of applying more sophisticated and efficient sampling schemes developed elsewhere to problems in stochastic chemical kinetics. The results obtained using the subset simulation method are compared with existing variance reduction strategies for a few benchmark problems, and a satisfactory improvement in computational time is demonstrated.
Effects of tree-to-tree variations on sap flux-based transpiration estimates in a forested watershed
Kume, Tomonori; Tsuruta, Kenji; Komatsu, Hikaru; Kumagai, Tomo'omi; Higashi, Naoko; Shinohara, Yoshinori; Otsuki, Kyoichi
2010-05-01
To estimate forest stand-scale water use, we assessed how sample sizes affect confidence of stand-scale transpiration (E) estimates calculated from sap flux (Fd) and sapwood area (AS_tree) measurements of individual trees. In a Japanese cypress plantation, we measured Fd and AS_tree in all trees (n = 58) within a 20 × 20 m study plot, which was divided into four 10 × 10 subplots. We calculated E from stand AS_tree (AS_stand) and mean stand Fd (JS) values. Using Monte Carlo analyses, we examined potential errors associated with sample sizes in E, AS_stand, and JS by using the original AS_tree and Fd data sets. Consequently, we defined optimal sample sizes of 10 and 15 for AS_stand and JS estimates, respectively, in the 20 × 20 m plot. Sample sizes greater than the optimal sample sizes did not decrease potential errors. The optimal sample sizes for JS changed according to plot size (e.g., 10 × 10 m and 10 × 20 m), while the optimal sample sizes for AS_stand did not. As well, the optimal sample sizes for JS did not change in different vapor pressure deficit conditions. In terms of E estimates, these results suggest that the tree-to-tree variations in Fd vary among different plots, and that plot size to capture tree-to-tree variations in Fd is an important factor. This study also discusses planning balanced sampling designs to extrapolate stand-scale estimates to catchment-scale estimates.
Sampling strategies for efficient estimation of tree foliage biomass
Hailemariam Temesgen; Vicente Monleon; Aaron Weiskittel; Duncan Wilson
2011-01-01
Conifer crowns can be highly variable both within and between trees, particularly with respect to foliage biomass and leaf area. A variety of sampling schemes have been used to estimate biomass and leaf area at the individual tree and stand scales. Rarely has the effectiveness of these sampling schemes been compared across stands or even across species. In addition,...
A method for estimating radioactive cesium concentrations in cattle blood using urine samples.
Sato, Itaru; Yamagishi, Ryoma; Sasaki, Jun; Satoh, Hiroshi; Miura, Kiyoshi; Kikuchi, Kaoru; Otani, Kumiko; Okada, Keiji
2017-12-01
In the region contaminated by the Fukushima nuclear accident, radioactive contamination of live cattle should be checked before slaughter. In this study, we establish a precise method for estimating radioactive cesium concentrations in cattle blood using urine samples. Blood and urine samples were collected from a total of 71 cattle on two farms in the 'difficult-to-return zone'. Urine 137 Cs, specific gravity, electrical conductivity, pH, sodium, potassium, calcium, and creatinine were measured and various estimation methods for blood 137 Cs were tested. The average error rate of the estimation was 54.2% without correction. Correcting for urine creatinine, specific gravity, electrical conductivity, or potassium improved the precision of the estimation. Correcting for specific gravity using the following formula gave the most precise estimate (average error rate = 16.9%): [blood 137 Cs] = [urinary 137 Cs]/([specific gravity] - 1)/329. Urine samples are faster to measure than blood samples because urine can be obtained in larger quantities and has a higher 137 Cs concentration than blood. These advantages of urine and the estimation precision demonstrated in our study, indicate that estimation of blood 137 Cs using urine samples is a practical means of monitoring radioactive contamination in live cattle. © 2017 Japanese Society of Animal Science.
Estimating drizzle drop size and precipitation rate using two-colour lidar measurements
Directory of Open Access Journals (Sweden)
C. D. Westbrook
2010-06-01
Full Text Available A method to estimate the size and liquid water content of drizzle drops using lidar measurements at two wavelengths is described. The method exploits the differential absorption of infrared light by liquid water at 905 nm and 1.5 μm, which leads to a different backscatter cross section for water drops larger than ≈50 μm. The ratio of backscatter measured from drizzle samples below cloud base at these two wavelengths (the colour ratio provides a measure of the median volume drop diameter D_{0}. This is a strong effect: for D_{0}=200 μm, a colour ratio of ≈6 dB is predicted. Once D_{0} is known, the measured backscatter at 905 nm can be used to calculate the liquid water content (LWC and other moments of the drizzle drop distribution.
The method is applied to observations of drizzle falling from stratocumulus and stratus clouds. High resolution (32 s, 36 m profiles of D_{0}, LWC and precipitation rate R are derived. The main sources of error in the technique are the need to assume a value for the dispersion parameter μ in the drop size spectrum (leading to at most a 35% error in R and the influence of aerosol returns on the retrieval (≈10% error in R for the cases considered here. Radar reflectivities are also computed from the lidar data, and compared to independent measurements from a colocated cloud radar, offering independent validation of the derived drop size distributions.
Directory of Open Access Journals (Sweden)
Martinásková Magdalena
2017-12-01
Full Text Available The article examines the use of Asymptotic Sampling (AS for the estimation of failure probability. The AS algorithm requires samples of multidimensional Gaussian random vectors, which may be obtained by many alternative means that influence the performance of the AS method. Several reliability problems (test functions have been selected in order to test AS with various sampling schemes: (i Monte Carlo designs; (ii LHS designs optimized using the Periodic Audze-Eglājs (PAE criterion; (iii designs prepared using Sobol’ sequences. All results are compared with the exact failure probability value.
Maximum likelihood estimation for Cox's regression model under nested case-control sampling
DEFF Research Database (Denmark)
Scheike, Thomas; Juul, Anders
2004-01-01
Nested case-control sampling is designed to reduce the costs of large cohort studies. It is important to estimate the parameters of interest as efficiently as possible. We present a new maximum likelihood estimator (MLE) for nested case-control sampling in the context of Cox's proportional hazard...
Estimation of LOCA break size using cascaded Fuzzy neural networks
Energy Technology Data Exchange (ETDEWEB)
Choi, Geon Pil; Yoo, Kwae Hwan; Back, Ju Hyun; Na, Man Gyun [Dept. of Nuclear Engineering, Chosun University, Gwangju (Korea, Republic of)
2017-04-15
Operators of nuclear power plants may not be equipped with sufficient information during a loss-of-coolant accident (LOCA), which can be fatal, or they may not have sufficient time to analyze the information they do have, even if this information is adequate. It is not easy to predict the progression of LOCAs in nuclear power plants. Therefore, accurate information on the LOCA break position and size should be provided to efficiently manage the accident. In this paper, the LOCA break size is predicted using a cascaded fuzzy neural network (CFNN) model. The input data of the CFNN model are the time-integrated values of each measurement signal for an initial short-time interval after a reactor scram. The training of the CFNN model is accomplished by a hybrid method combined with a genetic algorithm and a least squares method. As a result, LOCA break size is estimated exactly by the proposed CFNN model.
Research Note Pilot survey to assess sample size for herbaceous ...
African Journals Online (AJOL)
A pilot survey to determine sub-sample size (number of point observations per plot) for herbaceous species composition assessments, using a wheel-point apparatus applying the nearest-plant method, was conducted. Three plots differing in species composition on the Zululand coastal plain were selected, and on each plot ...
Junttila, Virpi; Kauranne, Tuomo; Finley, Andrew O.; Bradford, John B.
2015-01-01
Modern operational forest inventory often uses remotely sensed data that cover the whole inventory area to produce spatially explicit estimates of forest properties through statistical models. The data obtained by airborne light detection and ranging (LiDAR) correlate well with many forest inventory variables, such as the tree height, the timber volume, and the biomass. To construct an accurate model over thousands of hectares, LiDAR data must be supplemented with several hundred field sample measurements of forest inventory variables. This can be costly and time consuming. Different LiDAR-data-based and spatial-data-based sampling designs can reduce the number of field sample plots needed. However, problems arising from the features of the LiDAR data, such as a large number of predictors compared with the sample size (overfitting) or a strong correlation among predictors (multicollinearity), may decrease the accuracy and precision of the estimates and predictions. To overcome these problems, a Bayesian linear model with the singular value decomposition of predictors, combined with regularization, is proposed. The model performance in predicting different forest inventory variables is verified in ten inventory areas from two continents, where the number of field sample plots is reduced using different sampling designs. The results show that, with an appropriate field plot selection strategy and the proposed linear model, the total relative error of the predicted forest inventory variables is only 5%–15% larger using 50 field sample plots than the error of a linear model estimated with several hundred field sample plots when we sum up the error due to both the model noise variance and the model’s lack of fit.
Replication Variance Estimation under Two-phase Sampling in the Presence of Non-response
Directory of Open Access Journals (Sweden)
Muqaddas Javed
2014-09-01
Full Text Available Kim and Yu (2011 discussed replication variance estimator for two-phase stratified sampling. In this paper estimators for mean have been proposed in two-phase stratified sampling for different situation of existence of non-response at first phase and second phase. The expressions of variances of these estimators have been derived. Furthermore, replication-based jackknife variance estimators of these variances have also been derived. Simulation study has been conducted to investigate the performance of the suggested estimators.
Xue, Ying; Ren, Yiping; Meng, Wenrong; Li, Long; Mao, Xia; Han, Dongyan; Ma, Qiuyun
2013-09-01
Cephalopods play key roles in global marine ecosystems as both predators and preys. Regressive estimation of original size and weight of cephalopod from beak measurements is a powerful tool of interrogating the feeding ecology of predators at higher trophic levels. In this study, regressive relationships among beak measurements and body length and weight were determined for an octopus species ( Octopus variabilis), an important endemic cephalopod species in the northwest Pacific Ocean. A total of 193 individuals (63 males and 130 females) were collected at a monthly interval from Jiaozhou Bay, China. Regressive relationships among 6 beak measurements (upper hood length, UHL; upper crest length, UCL; lower hood length, LHL; lower crest length, LCL; and upper and lower beak weights) and mantle length (ML), total length (TL) and body weight (W) were determined. Results showed that the relationships between beak size and TL and beak size and ML were linearly regressive, while those between beak size and W fitted a power function model. LHL and UCL were the most useful measurements for estimating the size and biomass of O. variabilis. The relationships among beak measurements and body length (either ML or TL) were not significantly different between two sexes; while those among several beak measurements (UHL, LHL and LBW) and body weight (W) were sexually different. Since male individuals of this species have a slightly greater body weight distribution than female individuals, the body weight was not an appropriate measurement for estimating size and biomass, especially when the sex of individuals in the stomachs of predators was unknown. These relationships provided essential information for future use in size and biomass estimation of O. variabilis, as well as the estimation of predator/prey size ratios in the diet of top predators.
Estimation of reference intervals from small samples: an example using canine plasma creatinine.
Geffré, A; Braun, J P; Trumel, C; Concordet, D
2009-12-01
According to international recommendations, reference intervals should be determined from at least 120 reference individuals, which often are impossible to achieve in veterinary clinical pathology, especially for wild animals. When only a small number of reference subjects is available, the possible bias cannot be known and the normality of the distribution cannot be evaluated. A comparison of reference intervals estimated by different methods could be helpful. The purpose of this study was to compare reference limits determined from a large set of canine plasma creatinine reference values, and large subsets of this data, with estimates obtained from small samples selected randomly. Twenty sets each of 120 and 27 samples were randomly selected from a set of 1439 plasma creatinine results obtained from healthy dogs in another study. Reference intervals for the whole sample and for the large samples were determined by a nonparametric method. The estimated reference limits for the small samples were minimum and maximum, mean +/- 2 SD of native and Box-Cox-transformed values, 2.5th and 97.5th percentiles by a robust method on native and Box-Cox-transformed values, and estimates from diagrams of cumulative distribution functions. The whole sample had a heavily skewed distribution, which approached Gaussian after Box-Cox transformation. The reference limits estimated from small samples were highly variable. The closest estimates to the 1439-result reference interval for 27-result subsamples were obtained by both parametric and robust methods after Box-Cox transformation but were grossly erroneous in some cases. For small samples, it is recommended that all values be reported graphically in a dot plot or histogram and that estimates of the reference limits be compared using different methods.
Graf, Alexandra C; Bauer, Peter; Glimm, Ekkehard; Koenig, Franz
2014-07-01
Sample size modifications in the interim analyses of an adaptive design can inflate the type 1 error rate, if test statistics and critical boundaries are used in the final analysis as if no modification had been made. While this is already true for designs with an overall change of the sample size in a balanced treatment-control comparison, the inflation can be much larger if in addition a modification of allocation ratios is allowed as well. In this paper, we investigate adaptive designs with several treatment arms compared to a single common control group. Regarding modifications, we consider treatment arm selection as well as modifications of overall sample size and allocation ratios. The inflation is quantified for two approaches: a naive procedure that ignores not only all modifications, but also the multiplicity issue arising from the many-to-one comparison, and a Dunnett procedure that ignores modifications, but adjusts for the initially started multiple treatments. The maximum inflation of the type 1 error rate for such types of design can be calculated by searching for the "worst case" scenarios, that are sample size adaptation rules in the interim analysis that lead to the largest conditional type 1 error rate in any point of the sample space. To show the most extreme inflation, we initially assume unconstrained second stage sample size modifications leading to a large inflation of the type 1 error rate. Furthermore, we investigate the inflation when putting constraints on the second stage sample sizes. It turns out that, for example fixing the sample size of the control group, leads to designs controlling the type 1 error rate. © 2014 The Author. Biometrical Journal published by WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Identifying grain-size dependent errors on global forest area estimates and carbon studies
Daolan Zheng; Linda S. Heath; Mark J. Ducey
2008-01-01
Satellite-derived coarse-resolution data are typically used for conducting global analyses. But the forest areas estimated from coarse-resolution maps (e.g., 1 km) inevitably differ from a corresponding fine-resolution map (such as a 30-m map) that would be closer to ground truth. A better understanding of changes in grain size on area estimation will improve our...
International Nuclear Information System (INIS)
Wright, T.
1982-01-01
A new sampling procedure is introduced for estimating a population proportion. The procedure combines the ideas of inverse binomial sampling and Bernoulli sampling. An unbiased estimator is given with its variance. The procedure can be viewed as a generalization of inverse binomial sampling
Determining an Estimate of an Equivalence Relation for Moderate and Large Sized Sets
Directory of Open Access Journals (Sweden)
Leszek Klukowski
2017-01-01
Full Text Available This paper presents two approaches to determining estimates of an equivalence relation on the basis of pairwise comparisons with random errors. Obtaining such an estimate requires the solution of a discrete programming problem which minimizes the sum of the differences between the form of the relation and the comparisons. The problem is NP hard and can be solved with the use of exact algorithms for sets of moderate size, i.e. about 50 elements. In the case of larger sets, i.e. at least 200 comparisons for each element, it is necessary to apply heuristic algorithms. The paper presents results (a statistical preprocessing, which enable us to determine the optimal or a near-optimal solution with acceptable computational cost. They include: the development of a statistical procedure producing comparisons with low probabilities of errors and a heuristic algorithm based on such comparisons. The proposed approach guarantees the applicability of such estimators for any size of set. (original abstract
Norm Block Sample Sizes: A Review of 17 Individually Administered Intelligence Tests
Norfolk, Philip A.; Farmer, Ryan L.; Floyd, Randy G.; Woods, Isaac L.; Hawkins, Haley K.; Irby, Sarah M.
2015-01-01
The representativeness, recency, and size of norm samples strongly influence the accuracy of inferences drawn from their scores. Inadequate norm samples may lead to inflated or deflated scores for individuals and poorer prediction of developmental and academic outcomes. The purpose of this study was to apply Kranzler and Floyd's method for…
A Methodology to Estimate Ores Work Index Values, Using Miduk Copper Mine Sample
Directory of Open Access Journals (Sweden)
Mohammad Noaparast
2012-12-01
Full Text Available It is always attempted to reduce the costs of comminution in mineral processing plants. One of thedifficulties in size reduction section is not to be designed properly. The key factor to design size reductionunits such as crushers and grinding mills, is ore’s work index. The work index, wi, presents the oregrindability, and is used in Bond formula to calculate the required energy. Bond has defined a specificrelationship between some parameters which is applied to calculate wi, which are control screen, fineparticles produced, feed and product d80.In this research work, a high grade copper sample from Miduk copper concentrator was prepared, and itswork index values were experimentally estimated, using different control screens, 600, 425, 212, 150, 106and 75 microns. The obtained results from the tests showed two different behaviors in fine production.According to these two trends the required models were then defined to present the fine mass calculationusing control screen. In next step, an equation was presented in order to calculate Miduk copper ore workindex for any size. In addition to verify the model creditability, a test using 300 microns control screenwas performed and its result was compared with calculated ones using defined model, which showed agood fit. Finally the experimental and calculated values were compared and their relative error was equalto 4.11% which is an indication of good fit for the results.
Precision of quantization of the hall conductivity in a finite-size sample: Power law
International Nuclear Information System (INIS)
Greshnov, A. A.; Kolesnikova, E. N.; Zegrya, G. G.
2006-01-01
A microscopic calculation of the conductivity in the integer quantum Hall effect (IQHE) mode is carried out. The precision of quantization is analyzed for finite-size samples. The precision of quantization shows a power-law dependence on the sample size. A new scaling parameter describing this dependence is introduced. It is also demonstrated that the precision of quantization linearly depends on the ratio between the amplitude of the disorder potential and the cyclotron energy. The data obtained are compared with the results of magnetotransport measurements in mesoscopic samples
Directory of Open Access Journals (Sweden)
Steffen Oppel
2014-04-01
Full Text Available Population size assessments for nocturnal burrow-nesting seabirds are logistically challenging because these species are active in colonies only during darkness and often nest on remote islands where manual inspections of breeding burrows are not feasible. Many seabird species are highly vocal, and recent technological innovations now make it possible to record and quantify vocal activity in seabird colonies. Here we test the hypothesis that remotely recorded vocal activity in Cory’s shearwater (Calonectris borealis breeding colonies in the North Atlantic increases with nest density, and combined this relationship with cliff habitat mapping to estimate the population size of Cory’s shearwaters on the island of Corvo (Azores. We deployed acoustic recording devices in 9 Cory’s shearwater colonies of known size to establish a relationship between vocal activity and local nest density (slope = 1.07, R2 = 0.86, p < 0.001. We used this relationship to predict the nest density in various cliff habitat types and produced a habitat map of breeding cliffs to extrapolate nest density around the island of Corvo. The mean predicted nest density on Corvo ranged from 6.6 (2.1–16.2 to 27.8 (19.5–36.4 nests/ha. Extrapolation of habitat-specific nest densities across the cliff area of Corvo resulted in an estimate of 6326 Cory’s shearwater nests (95% confidence interval: 3735–10,524. This population size estimate is similar to previous assessments, but is too imprecise to detect moderate changes in population size over time. While estimating absolute population size from acoustic recordings may not be sufficiently precise, the strong positive relationship that we found between local nest density and recorded calling rate indicates that passive acoustic monitoring may be useful to document relative changes in seabird populations over time.
Sample size for monitoring sirex populations and their natural enemies
Directory of Open Access Journals (Sweden)
Susete do Rocio Chiarello Penteado
2016-09-01
Full Text Available The woodwasp Sirex noctilio Fabricius (Hymenoptera: Siricidae was introduced in Brazil in 1988 and became the main pest in pine plantations. It has spread to about 1.000.000 ha, at different population levels, in the states of Rio Grande do Sul, Santa Catarina, Paraná, São Paulo and Minas Gerais. Control is done mainly by using a nematode, Deladenus siricidicola Bedding (Nematoda: Neothylenchidae. The evaluation of the efficiency of natural enemies has been difficult because there are no appropriate sampling systems. This study tested a hierarchical sampling system to define the sample size to monitor the S. noctilio population and the efficiency of their natural enemies, which was found to be perfectly adequate.
Collection of size fractionated particulate matter sample for neutron activation analysis in Japan
International Nuclear Information System (INIS)
Otoshi, Tsunehiko; Nakamatsu, Hiroaki; Oura, Yasuji; Ebihara, Mitsuru
2004-01-01
According to the decision of the 2001 Workshop on Utilization of Research Reactor (Neutron Activation Analysis (NAA) Section), size fractionated particulate matter collection for NAA was started from 2002 at two sites in Japan. The two monitoring sites, ''Tokyo'' and ''Sakata'', were classified into ''urban'' and ''rural''. In each site, two size fractions, namely PM 2-10 '' and PM 2 '' particles (aerodynamic particle size between 2 to 10 micrometer and less than 2 micrometer, respectively) were collected every month on polycarbonate membrane filters. Average concentrations of PM 10 (sum of PM 2-10 and PM 2 samples) during the common sampling period of August to November 2002 in each site were 0.031mg/m 3 in Tokyo, and 0.022mg/m 3 in Sakata. (author)
van Hassel, Daniël; van der Velden, Lud; de Bakker, Dinny; van der Hoek, Lucas; Batenburg, Ronald
2017-12-04
Our research is based on a technique for time sampling, an innovative method for measuring the working hours of Dutch general practitioners (GPs), which was deployed in an earlier study. In this study, 1051 GPs were questioned about their activities in real time by sending them one SMS text message every 3 h during 1 week. The required sample size for this study is important for health workforce planners to know if they want to apply this method to target groups who are hard to reach or if fewer resources are available. In this time-sampling method, however, standard power analyses is not sufficient for calculating the required sample size as this accounts only for sample fluctuation and not for the fluctuation of measurements taken from every participant. We investigated the impact of the number of participants and frequency of measurements per participant upon the confidence intervals (CIs) for the hours worked per week. Statistical analyses of the time-use data we obtained from GPs were performed. Ninety-five percent CIs were calculated, using equations and simulation techniques, for various different numbers of GPs included in the dataset and for various frequencies of measurements per participant. Our res