WorldWideScience

Sample records for modeling sample size

  1. Sample size planning for classification models.

    Science.gov (United States)

    Beleites, Claudia; Neugebauer, Ute; Bocklitz, Thomas; Krafft, Christoph; Popp, Jürgen

    2013-01-14

    In biospectroscopy, suitably annotated and statistically independent samples (e.g. patients, batches, etc.) for classifier training and testing are scarce and costly. Learning curves show the model performance as a function of the training sample size and can help to determine the sample size needed to train good classifiers. However, building a good model is actually not enough: the performance must also be proven. We discuss learning curves for typical small sample size situations with 5-25 independent samples per class. Although the classification models achieve acceptable performance, the learning curve can be completely masked by the random testing uncertainty due to the equally limited test sample size. In consequence, we determine test sample sizes necessary to achieve reasonable precision in the validation and find that 75-100 samples will usually be needed to test a good but not perfect classifier. Such a data set will then allow refined sample size planning on the basis of the achieved performance. We also demonstrate how to calculate necessary sample sizes in order to show the superiority of one classifier over another: this often requires hundreds of statistically independent test samples or is even theoretically impossible. We demonstrate our findings with a data set of ca. 2550 Raman spectra of single cells (five classes: erythrocytes, leukocytes and three tumour cell lines BT-20, MCF-7 and OCI-AML3) as well as by an extensive simulation that allows precise determination of the actual performance of the models in question. Copyright © 2012 Elsevier B.V. All rights reserved.
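
    As a rough illustration of the precision argument above (not taken from the paper), the sketch below computes 95% Wilson score intervals for an observed test accuracy of 0.90 at several test-set sizes; with 75 to 100 independent test samples the interval half-width is still roughly 0.06 to 0.07, which is the order of precision the abstract refers to.

```python
import math

def wilson_interval(p_hat: float, n: int, z: float = 1.96):
    """95% Wilson score interval for an observed proportion p_hat out of n trials."""
    denom = 1.0 + z ** 2 / n
    centre = (p_hat + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half, centre + half

# Illustrative only: a "good but not perfect" classifier with 90% observed accuracy.
for n in (25, 50, 75, 100, 150):
    lo, hi = wilson_interval(0.90, n)
    print(f"n = {n:3d}: 95% CI for accuracy = ({lo:.3f}, {hi:.3f}), width = {hi - lo:.3f}")
```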

  2. Sample size matters: Investigating the optimal sample size for a logistic regression debris flow susceptibility model

    Science.gov (United States)

    Heckmann, Tobias; Gegg, Katharina; Becht, Michael

    2013-04-01

    Statistical approaches to landslide susceptibility modelling on the catchment and regional scale are used very frequently compared to heuristic and physically based approaches. In the present study, we deal with the problem of the optimal sample size for a logistic regression model. More specifically, a stepwise approach has been chosen in order to select those independent variables (from a number of derivatives of a digital elevation model and landcover data) that explain best the spatial distribution of debris flow initiation zones in two neighbouring central alpine catchments in Austria (used mutually for model calculation and validation). In order to minimise problems arising from spatial autocorrelation, we sample a single raster cell from each debris flow initiation zone within an inventory. In addition, as suggested by previous work using the "rare events logistic regression" approach, we take a sample of the remaining "non-event" raster cells. The recommendations given in the literature on the size of this sample appear to be motivated by practical considerations, e.g. the time and cost of acquiring data for non-event cases, which do not apply to the case of spatial data. In our study, we aim at finding empirically an "optimal" sample size in order to avoid two problems: First, a sample too large will violate the independent sample assumption as the independent variables are spatially autocorrelated; hence, a variogram analysis leads to a sample size threshold above which the average distance between sampled cells falls below the autocorrelation range of the independent variables. Second, if the sample is too small, repeated sampling will lead to very different results, i.e. the independent variables and hence the result of a single model calculation will be extremely dependent on the choice of non-event cells. Using a Monte-Carlo analysis with stepwise logistic regression, 1000 models are calculated for a wide range of sample sizes. For each sample size

  3. Sample Size Determination for Rasch Model Tests

    Science.gov (United States)

    Draxler, Clemens

    2010-01-01

    This paper is concerned with supplementing statistical tests for the Rasch model so that, in addition to the probability of the error of the first kind (Type I error probability), the probability of the error of the second kind (Type II error probability) can be controlled at a predetermined level by basing the test on the appropriate number of observations.…

  4. Sample Size Determination for Regression Models Using Monte Carlo Methods in R

    Science.gov (United States)

    Beaujean, A. Alexander

    2014-01-01

    A common question asked by researchers using regression models is, What sample size is needed for my study? While there are formulae to estimate sample sizes, their assumptions are often not met in the collected data. A more realistic approach to sample size determination requires more information such as the model of interest, strength of the…

  5. Effects of Sample Size, Estimation Methods, and Model Specification on Structural Equation Modeling Fit Indexes.

    Science.gov (United States)

    Fan, Xitao; Wang, Lin; Thompson, Bruce

    1999-01-01

    A Monte Carlo simulation study investigated the effects on 10 structural equation modeling fit indexes of sample size, estimation method, and model specification. Some fit indexes did not appear to be comparable, and it was apparent that estimation method strongly influenced almost all fit indexes examined, especially for misspecified models. (SLD)

  6. A Note on Sample Size and Solution Propriety for Confirmatory Factor Analytic Models

    Science.gov (United States)

    Jackson, Dennis L.; Voth, Jennifer; Frey, Marc P.

    2013-01-01

    Determining an appropriate sample size for use in latent variable modeling techniques has presented ongoing challenges to researchers. In particular, small sample sizes are known to present concerns over sampling error for the variances and covariances on which model estimation is based, as well as for fit indexes and convergence failures. The…

  7. Sample Size Considerations in Prevention Research Applications of Multilevel Modeling and Structural Equation Modeling.

    Science.gov (United States)

    Hoyle, Rick H; Gottfredson, Nisha C

    2015-10-01

    When the goal of prevention research is to capture in statistical models some measure of the dynamic complexity in structures and processes implicated in problem behavior and its prevention, approaches such as multilevel modeling (MLM) and structural equation modeling (SEM) are indicated. Yet the assumptions that must be satisfied if these approaches are to be used responsibly raise concerns regarding their use in prevention research involving smaller samples. In this article, we discuss in nontechnical terms the role of sample size in MLM and SEM and present findings from the latest simulation work on the performance of each approach at sample sizes typical of prevention research. For each statistical approach, we draw from extant simulation studies to establish lower bounds for sample size (e.g., MLM can be applied with as few as ten groups comprising ten members with normally distributed data, restricted maximum likelihood estimation, and a focus on fixed effects; sample sizes as small as N = 50 can produce reliable SEM results with normally distributed data and at least three reliable indicators per factor) and suggest strategies for making the best use of the modeling approach when N is near the lower bound.

  8. The attention-weighted sample-size model of visual short-term memory

    DEFF Research Database (Denmark)

    Smith, Philip L.; Lilburn, Simon D.; Corbett, Elaine A.

    2016-01-01

    We investigated the capacity of visual short-term memory (VSTM) in a phase discrimination task that required judgments about the configural relations between pairs of black and white features. Sewell et al. (2014) previously showed that VSTM capacity in an orientation discrimination task was well described by a sample-size model, which views VSTM as a resource comprised of a finite number of noisy stimulus samples. The model predicts the invariance of ∑i(d′i)², the sum of squared sensitivities across items, for displays of different sizes. For phase discrimination, the set-size effect significantly exceeded that predicted by the sample-size model for both simultaneously and sequentially presented stimuli. Instead, the set-size effect and the serial position curves with sequential presentation were predicted by an attention-weighted version of the sample-size model, which assumes that one of the items…

  9. Species sensitivity distribution for chlorpyrifos to aquatic organisms: Model choice and sample size.

    Science.gov (United States)

    Zhao, Jinsong; Chen, Boyu

    2016-03-01

    Species sensitivity distribution (SSD) is a widely used model that extrapolates the ecological risk to ecosystem levels from the ecotoxicity of a chemical to individual organisms. However, model choice and sample size significantly affect the development of the SSD model and the estimation of hazardous concentrations at the 5th centile (HC5). To interpret their effects, the SSD model for chlorpyrifos, a widely used organophosphate pesticide, to aquatic organisms is presented with emphases on model choice and sample size. Three subsets of median effective concentration (EC50) with different sample sizes were obtained from ECOTOX and used to build SSD models based on parametric distribution (normal, logistic, and triangle distribution) and nonparametric bootstrap. The SSD models based on the triangle distribution are superior to the normal and logistic distributions according to several goodness-of-fit techniques. Among all parametric SSD models, the one with the largest sample size based on the triangle distribution gives the strictest HC5, at 0.141 μmol/L. The HC5 derived from the nonparametric bootstrap is 0.159 μmol/L. The minimum sample size required to build a stable SSD model is 11 based on parametric distribution and 23 based on nonparametric bootstrap. The study suggests that model choice and sample size are important sources of uncertainty for application of the SSD model. Copyright © 2015 Elsevier Inc. All rights reserved.
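
    A minimal sketch of how an SSD-based HC5 can be computed, using hypothetical EC50 values rather than the ECOTOX data analysed in the paper. It fits a log-normal SSD (a normal distribution on the log10 scale) and contrasts it with a simple nonparametric bootstrap; the triangle distribution preferred in the paper is omitted for brevity.

```python
import numpy as np
from scipy import stats

# Hypothetical EC50 values (umol/L) for illustration only, not the ECOTOX subsets used in the paper.
ec50 = np.array([0.8, 1.5, 2.3, 4.0, 6.5, 9.1, 12.0, 20.0, 35.0, 60.0, 110.0])
log_ec50 = np.log10(ec50)

# Parametric SSD: normal distribution fitted on the log10 scale (i.e. a log-normal SSD).
mu, sigma = stats.norm.fit(log_ec50)
hc5_parametric = 10 ** stats.norm.ppf(0.05, loc=mu, scale=sigma)

# Nonparametric bootstrap: resample species and take the empirical 5th percentile each time.
rng = np.random.default_rng(1)
boot = [np.percentile(rng.choice(ec50, size=ec50.size, replace=True), 5) for _ in range(2000)]
hc5_bootstrap = float(np.mean(boot))

print(f"HC5 (log-normal fit): {hc5_parametric:.3f} umol/L")
print(f"HC5 (bootstrap mean): {hc5_bootstrap:.3f} umol/L")
```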

  10. Sample size methodology

    CERN Document Server

    Desu, M M

    2012-01-01

    One of the most important problems in designing an experiment or a survey is sample size determination and this book presents the currently available methodology. It includes both random sampling from standard probability distributions and from finite populations. Also discussed is sample size determination for estimating parameters in a Bayesian setting by considering the posterior distribution of the parameter and specifying the necessary requirements. The determination of the sample size is considered for ranking and selection problems as well as for the design of clinical trials. Appropria

  11. Small size sampling

    Directory of Open Access Journals (Sweden)

    Rakesh R. Pathak

    2012-02-01

    Full Text Available Based on the law of large numbers, which is derived from probability theory, we tend to increase the sample size to the maximum. The central limit theorem, another inference from the same probability theory, likewise favours the largest possible number as the sample size for better validity when measuring central tendencies like the mean and median. Sometimes, however, an increase in sample size brings only negligible improvement, or no gain at all in statistical relevance, because of strong dependence or systematic error. If we can afford a somewhat larger sample, a statistical power of 0.90 is taken as acceptable with a medium Cohen's d (0.5); for that we can take a sample size of 175 very safely, and considering the problem of attrition, 200 samples would suffice. [Int J Basic Clin Pharmacol 2012; 1(1): 43-44]
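
    For context, a sketch of the standard normal-approximation sample size formula for comparing two means with a medium standardized effect (Cohen's d = 0.5), two-sided alpha = 0.05 and power 0.90. It returns about 85 subjects per group, roughly 170 in total, which is consistent with the figure of about 175 quoted above.

```python
import math
from scipy.stats import norm

def n_per_group(d: float, alpha: float = 0.05, power: float = 0.90) -> int:
    """Per-group sample size for a two-sample comparison of means (normal approximation),
    with standardized effect size d (Cohen's d)."""
    z_alpha = norm.ppf(1 - alpha / 2)
    z_beta = norm.ppf(power)
    return math.ceil(2 * (z_alpha + z_beta) ** 2 / d ** 2)

n = n_per_group(0.5)
print(f"{n} per group, {2 * n} in total")  # about 85 per group, about 170 in total
```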

  12. Regularization Methods for Fitting Linear Models with Small Sample Sizes: Fitting the Lasso Estimator Using R

    Science.gov (United States)

    Finch, W. Holmes; Finch, Maria E. Hernandez

    2016-01-01

    Researchers and data analysts are sometimes faced with the problem of very small samples, where the number of variables approaches or exceeds the overall sample size; i.e. high dimensional data. In such cases, standard statistical models such as regression or analysis of variance cannot be used, either because the resulting parameter estimates…

  13. A preliminary model to avoid the overestimation of sample size in bioequivalence studies.

    Science.gov (United States)

    Ramírez, E; Abraira, V; Guerra, P; Borobia, A M; Duque, B; López, J L; Mosquera, B; Lubomirov, R; Carcas, A J; Frías, J

    2013-02-01

    Often the only data available in the literature for sample size estimation in bioequivalence studies is the intersubject variability, which tends to result in overestimation of sample size. In this paper, we proposed a preliminary model of intrasubject variability based on intersubject variability for Cmax and AUC data from randomized, crossover bioequivalence (BE) studies. From 93 Cmax and 121 AUC data sets from test-reference comparisons that fulfilled BE criteria, we calculated intersubject variability for the reference formulation and intrasubject variability from ANOVA. Linear and exponential models (y = a(1 − e^(−bx))) were fitted, weighted by the inverse of the variance, to predict the intrasubject variability from the intersubject variability. To validate the model we calculated the coefficient of cross-validation on data from 30 new BE studies. The models fit very well (R² = 0.997 and 0.990 for Cmax and AUC, respectively) and the cross-validation correlations were 0.847 for Cmax and 0.572 for AUC. This preliminary model allows us to estimate the intrasubject variability based on intersubject variability for sample size calculation purposes in BE studies. This approximation provides an opportunity for sample size reduction, avoiding unnecessary exposure of healthy volunteers. Further modelling studies are desirable to confirm these results, especially in the higher intersubject variability range.
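
    A minimal sketch of the kind of weighted exponential fit described above, using scipy's curve_fit; the variability data and starting values below are hypothetical, not the Cmax/AUC data or fitted coefficients from the paper.

```python
import numpy as np
from scipy.optimize import curve_fit

def exp_model(x, a, b):
    """Exponential model y = a * (1 - exp(-b * x)) relating inter- to intrasubject variability."""
    return a * (1.0 - np.exp(-b * x))

# Hypothetical (intersubject CV, intrasubject CV, variance of the estimate) values, illustration only.
inter = np.array([10, 15, 20, 25, 30, 40, 50, 60], dtype=float)
intra = np.array([6, 8, 11, 13, 15, 18, 20, 22], dtype=float)
var = np.array([1.0, 1.2, 1.5, 1.5, 2.0, 2.5, 3.0, 3.5])

# With absolute_sigma=False, passing sigma = sqrt(var) weights each point by 1/variance.
popt, pcov = curve_fit(exp_model, inter, intra, p0=(25.0, 0.05), sigma=np.sqrt(var), absolute_sigma=False)
a, b = popt
print(f"fitted a = {a:.2f}, b = {b:.4f}")
print("predicted intrasubject CV at intersubject CV = 35:", round(exp_model(35.0, a, b), 2))
```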

  14. Practical approach to determine sample size for building logistic prediction models using high-throughput data.

    Science.gov (United States)

    Son, Dae-Soon; Lee, DongHyuk; Lee, Kyusang; Jung, Sin-Ho; Ahn, Taejin; Lee, Eunjin; Sohn, Insuk; Chung, Jongsuk; Park, Woongyang; Huh, Nam; Lee, Jae Won

    2015-02-01

    An empirical method of sample size determination for building prediction models was proposed recently. The permutation method used in this procedure is commonly employed to address the problem of overfitting during cross-validation when evaluating the performance of prediction models constructed from microarray data. A major drawback of such methods, which include bootstrapping and full permutation, is the prohibitively high computational cost of calculating the sample size. In this paper, we propose that a single representative null distribution can be used instead of a full permutation, and we evaluate this using both simulated and real data sets. In the simulation, we used a data set with zero effect size and confirmed that the empirical Type I error approaches 0.05. Hence this method can be confidently applied to reduce the overfitting problem during cross-validation. We observed that a pilot data set generated by random sampling from real data could be successfully used for sample size determination. We present results from an experiment repeated 300 times, producing results comparable to those of the full permutation method. Since we eliminate the full permutation, sample size estimation time is not a function of pilot data size; in our experiment this process takes around 30 min. With the increasing number of clinical studies, developing efficient sample size determination methods for building prediction models is critical, but empirical methods using bootstrap and permutation usually involve high computing costs. In this study, we propose a method that drastically reduces the required computing time by using a representative null distribution of permutations, and we use data from pilot experiments to apply this method to designing clinical studies efficiently for high-throughput data.

  15. Sample size for collecting germplasms – a polyploid model with mixed mating system

    Indian Academy of Sciences (India)

    R L Sapra; Prem Narain; S V S Chauhan; S K Lal; B B Singh

    2003-03-01

    The present paper discusses a general expression for determining the minimum sample size (plants) for a given number of seeds or vice versa for capturing multiple allelic diversity. The model considers sampling from a large 2k-ploid population under a broad range of mating systems. Numerous expressions/results developed for germplasm collection/regeneration for diploid populations by earlier workers can be directly deduced from our general expression by assigning appropriate values of the corresponding parameters. A seed factor which influences the plant sample size has also been isolated to aid the collectors in selecting the appropriate combination of number of plants and seeds per plant. When genotypic multiplicity of seeds is taken into consideration, a sample size of even less than 172 plants can conserve diversity of 20 alleles from 50,000 polymorphic loci with a very large probability of conservation (0.9999) in most of the cases.
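
    The general expression for a 2k-ploid population with a mixed mating system is not reproduced here; the sketch below uses the simplest special case, treating each plant as contributing 2k independently sampled gene copies, just to show how ploidy reduces the number of plants needed to capture an allele of a given frequency with high probability.

```python
import math

def n_plants_needed(p: float, ploidy: int, target: float = 0.9999) -> int:
    """Smallest number of plants needed to capture an allele of frequency p with probability
    >= target, assuming each plant contributes `ploidy` independently sampled gene copies
    (a deliberate simplification of the paper's mixed-mating-system model)."""
    copies_needed = math.log(1.0 - target) / math.log(1.0 - p)  # gene copies required
    return math.ceil(copies_needed / ploidy)

# Example: a rare allele (p = 0.05) in a diploid (2k = 2) versus a tetraploid (2k = 4) population.
for ploidy in (2, 4):
    print(f"ploidy {ploidy}: {n_plants_needed(0.05, ploidy)} plants")
```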

  16. Using Structural Equation Modeling to Assess Functional Connectivity in the Brain: Power and Sample Size Considerations

    Science.gov (United States)

    Sideridis, Georgios; Simos, Panagiotis; Papanicolaou, Andrew; Fletcher, Jack

    2014-01-01

    The present study assessed the impact of sample size on the power and fit of structural equation modeling applied to functional brain connectivity hypotheses. The data consisted of time-constrained minimum norm estimates of regional brain activity during performance of a reading task obtained with magnetoencephalography. Power analysis was first…

  17. A simulation study of sample size for multilevel logistic regression models

    Directory of Open Access Journals (Sweden)

    Moineddin Rahim

    2007-07-01

    Full Text Available Abstract Background Many studies conducted in health and social sciences collect individual level data as outcome measures. Usually, such data have a hierarchical structure, with patients clustered within physicians, and physicians clustered within practices. Large survey data, including national surveys, have a hierarchical or clustered structure; respondents are naturally clustered in geographical units (e.g., health regions) and may be grouped into smaller units. Outcomes of interest in many fields not only reflect continuous measures, but also binary outcomes such as depression, presence or absence of a disease, and self-reported general health. In the framework of multilevel studies an important problem is calculating an adequate sample size that generates unbiased and accurate estimates. Methods In this paper simulation studies are used to assess the effect of varying sample size at both the individual and group level on the accuracy of the estimates of the parameters and variance components of multilevel logistic regression models. In addition, the influence of the prevalence of the outcome and the intra-class correlation coefficient (ICC) is examined. Results The results show that the estimates of the fixed effect parameters are unbiased for 100 groups with group size of 50 or higher. The estimates of the variance covariance components are slightly biased even with 100 groups and group size of 50. The biases for both fixed and random effects are severe for group size of 5. The standard errors for fixed effect parameters are unbiased, while those for variance covariance components are underestimated. Results suggest that low-prevalence events require larger sample sizes with at least a minimum of 100 groups and 50 individuals per group. Conclusion We recommend using a minimum group size of 50 with at least 50 groups to produce valid estimates for multi-level logistic regression models. Group size should be adjusted under conditions where the prevalence

  18. Resampling: An improvement of importance sampling in varying population size models.

    Science.gov (United States)

    Merle, C; Leblois, R; Rousset, F; Pudlo, P

    2017-04-01

    Sequential importance sampling algorithms have been defined to estimate likelihoods in models of ancestral population processes. However, these algorithms are based on features of the models with constant population size, and become inefficient when the population size varies in time, making likelihood-based inferences difficult in many demographic situations. In this work, we modify a previous sequential importance sampling algorithm to improve the efficiency of the likelihood estimation. Our procedure is still based on features of the model with constant size, but uses a resampling technique with a new resampling probability distribution depending on the pairwise composite likelihood. We tested our algorithm, called sequential importance sampling with resampling (SISR) on simulated data sets under different demographic cases. In most cases, we divided the computational cost by two for the same accuracy of inference, in some cases even by one hundred. This study provides the first assessment of the impact of such resampling techniques on parameter inference using sequential importance sampling, and extends the range of situations where likelihood inferences can be easily performed.

  19. Sample Size Requirements for Estimation of Item Parameters in the Multidimensional Graded Response Model

    Directory of Open Access Journals (Sweden)

    Shengyu Jiang

    2016-02-01

    Full Text Available Likert-type rating scales, in which a respondent chooses a response from an ordered set of response options, are used to measure a wide variety of psychological, educational, and medical outcome variables. The most appropriate item response theory model for analyzing and scoring these instruments when they provide scores on multiple scales is the multidimensional graded response model (MGRM). A simulation study was conducted to investigate the variables that might affect item parameter recovery for the MGRM. Data were generated based on different sample sizes, test lengths, and scale intercorrelations. Parameter estimates were obtained through the flexMIRT software. The quality of parameter recovery was assessed by the correlation between true and estimated parameters as well as by bias and root mean square error. Results indicated that for the vast majority of cases studied a sample size of N = 500 provided accurate parameter estimates, except for tests with 240 items, for which 1,000 examinees were necessary to obtain accurate parameter estimates. Increasing sample size beyond N = 1,000 did not increase the accuracy of MGRM parameter estimates.

  20. Percolation segregation in multi-size and multi-component particulate mixtures: Measurement, sampling, and modeling

    Science.gov (United States)

    Jha, Anjani K.

    Particulate materials are routinely handled in large quantities by industries such as agriculture, electronic, ceramic, chemical, cosmetic, fertilizer, food, nutraceutical, pharmaceutical, power, and powder metallurgy. These industries encounter segregation due to the difference in physical and mechanical properties of particulates. The general goal of this research was to study percolation segregation in multi-size and multi-component particulate mixtures, especially measurement, sampling, and modeling. A second generation primary segregation shear cell (PSSC-II), an industrial vibrator, a true cubical triaxial tester, and two samplers (triers) were used as primary test apparatuses for quantifying segregation and flowability; furthermore, to understand and propose strategies to mitigate segregation in particulates. Toward this end, percolation segregation in binary, ternary, and quaternary size mixtures for two particulate types, urea (spherical) and potash (angular), was studied. Three coarse size ranges 3,350-4,000 μm (mean size = 3,675 μm), 2,800-3,350 μm (3,075 μm), and 2,360-2,800 μm (2,580 μm) and three fines size ranges 2,000-2,360 μm (2,180 μm), 1,700-2,000 μm (1,850 μm), and 1,400-1,700 μm (1,550 μm) for angular-shaped and spherical-shaped particles were selected for tests. Since the fines size 1,550 μm of urea was not available in sufficient quantity, it was not included in tests. Percolation segregation in fertilizer bags was also tested at two vibration frequencies of 5 Hz and 7 Hz. The segregation and flowability of binary mixtures of urea under three equilibrium relative humidities (40%, 50%, and 60%) were also tested. Furthermore, solid fertilizer sampling was performed to compare samples obtained from triers of opening widths 12.7 mm and 19.1 mm and to determine size segregation in blend fertilizers. Based on experimental results, the normalized segregation rate (NSR) of binary mixtures was dependent on size ratio, mixing ratio

  1. A Comparative Study of Power and Sample Size Calculations for Multivariate General Linear Models

    Science.gov (United States)

    Shieh, Gwowen

    2003-01-01

    Repeated measures and longitudinal studies arise often in social and behavioral science research. During the planning stage of such studies, the calculations of sample size are of particular interest to the investigators and should be an integral part of the research projects. In this article, we consider the power and sample size calculations for…

  2. Regularization Methods for Fitting Linear Models with Small Sample Sizes: Fitting the Lasso Estimator Using R

    Directory of Open Access Journals (Sweden)

    W. Holmes Finch

    2016-05-01

    Full Text Available Researchers and data analysts are sometimes faced with the problem of very small samples, where the number of variables approaches or exceeds the overall sample size; i.e. high dimensional data. In such cases, standard statistical models such as regression or analysis of variance cannot be used, either because the resulting parameter estimates exhibit very high variance and can therefore not be trusted, or because the statistical algorithm cannot converge on parameter estimates at all. There exist an alternative set of model estimation procedures, known collectively as regularization methods, which can be used in such circumstances, and which have been shown through simulation research to yield accurate parameter estimates. The purpose of this paper is to describe, for those unfamiliar with them, the most popular of these regularization methods, the lasso, and to demonstrate its use on an actual high dimensional dataset involving adults with autism, using the R software language. Results of analyses involving relating measures of executive functioning with a full scale intelligence test score are presented, and implications of using these models are discussed.

  3. Sample size determination and power

    CERN Document Server

    Ryan, Thomas P, Jr

    2013-01-01

    THOMAS P. RYAN, PhD, teaches online advanced statistics courses for Northwestern University and The Institute for Statistics Education in sample size determination, design of experiments, engineering statistics, and regression analysis.

  4. Sample Size Determination Within the Scope of Conditional Maximum Likelihood Estimation with Special Focus on Testing the Rasch Model.

    Science.gov (United States)

    Draxler, Clemens; Alexandrowicz, Rainer W

    2015-12-01

    This paper refers to the exponential family of probability distributions and the conditional maximum likelihood (CML) theory. It is concerned with the determination of the sample size for three groups of tests of linear hypotheses, known as the fundamental trinity of Wald, score, and likelihood ratio tests. The main practical purpose refers to the special case of tests of the class of Rasch models. The theoretical background is discussed and the formal framework for sample size calculations is provided, given a predetermined deviation from the model to be tested and the probabilities of the errors of the first and second kinds.

  5. Sample Size and Statistical Conclusions from Tests of Fit to the Rasch Model According to the Rasch Unidimensional Measurement Model (RUMM) Program in Health Outcome Measurement.

    Science.gov (United States)

    Hagell, Peter; Westergren, Albert

    Sample size is a major factor in statistical null hypothesis testing, which is the basis for many approaches to testing Rasch model fit. Few sample size recommendations for testing fit to the Rasch model concern the Rasch Unidimensional Measurement Models (RUMM) software, which features chi-square and ANOVA/F-ratio based fit statistics, including Bonferroni and algebraic sample size adjustments. This paper explores the occurrence of Type I errors with RUMM fit statistics, and the effects of algebraic sample size adjustments. Data simulated to fit the Rasch model (25-item dichotomous scales, sample sizes ranging from N = 50 to N = 2500) were analysed with and without algebraically adjusted sample sizes. Results suggest the occurrence of Type I errors with N less than or equal to 500, and that Bonferroni correction as well as downward algebraic sample size adjustment are useful to avoid such errors, whereas upward adjustment of smaller samples falsely signals misfit. Our observations suggest that sample sizes around N = 250 to N = 500 may provide a good balance for the statistical interpretation of the RUMM fit statistics studied here with respect to Type I errors and under the assumption of Rasch model fit within the examined frame of reference (i.e., about 25 item parameters well targeted to the sample).

  6. Sample Sizes Required to Detect Interactions between Two Binary Fixed-Effects in a Mixed-Effects Linear Regression Model.

    Science.gov (United States)

    Leon, Andrew C; Heo, Moonseong

    2009-01-15

    Mixed-effects linear regression models have become more widely used for analysis of repeatedly measured outcomes in clinical trials over the past decade. There are formulae and tables for estimating sample sizes required to detect the main effects of treatment and the treatment by time interactions for those models. A formula is proposed to estimate the sample size required to detect an interaction between two binary variables in a factorial design with repeated measures of a continuous outcome. The formula is based, in part, on the fact that the variance of an interaction is fourfold that of the main effect. A simulation study examines the statistical power associated with the resulting sample sizes in a mixed-effects linear regression model with a random intercept. The simulation varies the magnitude (Δ) of the standardized main effects and interactions, the intraclass correlation coefficient (ρ), and the number (k) of repeated measures within-subject. The results of the simulation study verify that the sample size required to detect a 2 × 2 interaction in a mixed-effects linear regression model is fourfold that required to detect a main effect of the same magnitude.
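
    A sketch of the normal-approximation arithmetic behind the fourfold rule, assuming a compound-symmetry (random-intercept) covariance structure with k repeated measures and intraclass correlation rho; the parameterization is a textbook approximation, not the exact formula proposed in the paper.

```python
import math
from scipy.stats import norm

def n_main_effect(delta: float, k: int, rho: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Per-group sample size to detect a standardized main effect `delta` on the mean of k
    repeated measures, under a random-intercept (compound symmetry) model with intraclass
    correlation rho. Normal approximation only."""
    z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
    return math.ceil(2 * (1 + (k - 1) * rho) * z ** 2 / (k * delta ** 2))

def n_interaction(delta: float, k: int, rho: float, **kw) -> int:
    """The variance of a 2 x 2 interaction contrast is fourfold that of a main effect,
    so the required sample size is fourfold for the same standardized magnitude delta."""
    return 4 * n_main_effect(delta, k, rho, **kw)

print(n_main_effect(0.5, k=4, rho=0.4))   # about 35 per group
print(n_interaction(0.5, k=4, rho=0.4))   # fourfold that, about 140 per group
```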

  7. Size definitions for particle sampling

    Energy Technology Data Exchange (ETDEWEB)

    1981-05-01

    The recommendations of an ad hoc working group appointed by Committee TC 146 of the International Standards Organization on size definitions for particle sampling are reported. The task of the group was to collect the various definitions of 'respirable dust' and to propose a practical definition, with recommendations for handling standardization on this matter. One of two proposed cut-sizes, with regard to division at the larynx, will be adopted after a ballot.

  8. The Effect of Small Sample Size on Measurement Equivalence of Psychometric Questionnaires in MIMIC Model: A Simulation Study

    Directory of Open Access Journals (Sweden)

    Jamshid Jamali

    2017-01-01

    Full Text Available Evaluating measurement equivalence (also known as differential item functioning (DIF)) is an important part of the process of validating psychometric questionnaires. This study aimed at evaluating the multiple indicators multiple causes (MIMIC) model for DIF detection when the latent construct distribution is nonnormal and the focal group sample size is small. In this simulation-based study, Type I error rates and power of the MIMIC model for detecting uniform DIF were investigated under different combinations of reference to focal group sample size ratio, magnitude of the uniform DIF effect, scale length, the number of response categories, and latent trait distribution. Moderate and high skewness in the latent trait distribution led to a decrease of 0.33% and 0.47%, respectively, in the power of the MIMIC model for detecting uniform DIF. The findings indicated that increasing the scale length, the number of response categories, and the magnitude of DIF improved the power of the MIMIC model by 3.47%, 4.83%, and 20.35%, respectively; it also decreased the Type I error of the MIMIC approach by 2.81%, 5.66%, and 0.04%, respectively. This study revealed that the power of the MIMIC model was at an acceptable level when latent trait distributions were skewed. However, the empirical Type I error rate was slightly greater than the nominal significance level. Consequently, the MIMIC model is recommended for detection of uniform DIF when the latent construct distribution is nonnormal and the focal group sample size is small.

  9. Reliable calculation in probabilistic logic: Accounting for small sample size and model uncertainty

    Energy Technology Data Exchange (ETDEWEB)

    Ferson, S. [Applied Biomathematics, Setauket, NY (United States)

    1996-12-31

    A variety of practical computational problems arise in risk and safety assessments, forensic statistics and decision analyses in which the probability of some event or proposition E is to be estimated from the probabilities of a finite list of related subevents or propositions F, G, H, .... In practice, the analyst's knowledge may be incomplete in two ways. First, the probabilities of the subevents may be imprecisely known from statistical estimations, perhaps based on very small sample sizes. Second, relationships among the subevents may be known imprecisely. For instance, there may be only limited information about their stochastic dependencies. Representing probability estimates as interval ranges has been suggested as a way to address the first source of imprecision. A suite of AND, OR and NOT operators defined with reference to the classical Fréchet inequalities permits these probability intervals to be used in calculations that address the second source of imprecision, in many cases in a best possible way. Using statistical confidence intervals as inputs unravels the closure properties of this approach, however, requiring that probability estimates be characterized by a nested stack of intervals for all possible levels of statistical confidence, from a point estimate (0% confidence) to the entire unit interval (100% confidence). The corresponding logical operations implied by convolutive application of the logical operators for every possible pair of confidence intervals reduce by symmetry to a manageably simple level-wise iteration. The resulting calculus can be implemented in software that allows users to compute comprehensive and often level-wise best possible bounds on probabilities for logical functions of events.
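
    A minimal sketch of the classical Fréchet bounds applied to interval-valued probabilities with unknown dependence, which is the textbook idea behind the AND/OR/NOT operators mentioned above; the exact operator definitions and confidence-level bookkeeping of the described approach are not reproduced.

```python
Interval = tuple  # (lower, upper) probability bounds

def p_not(a: Interval) -> Interval:
    lo, hi = a
    return (1.0 - hi, 1.0 - lo)

def p_and(a: Interval, b: Interval) -> Interval:
    """Frechet bounds for P(A and B) when nothing is known about the dependence of A and B."""
    return (max(0.0, a[0] + b[0] - 1.0), min(a[1], b[1]))

def p_or(a: Interval, b: Interval) -> Interval:
    """Frechet bounds for P(A or B) when nothing is known about the dependence of A and B."""
    return (max(a[0], b[0]), min(1.0, a[1] + b[1]))

# Example: two subevents whose probabilities are only known as intervals.
F = (0.2, 0.4)
G = (0.5, 0.7)
print("P(F and G) in", p_and(F, G))   # (0.0, 0.4)
print("P(F or G)  in", p_or(F, G))    # (0.5, 1.0)
print("P(not F)   in", p_not(F))      # (0.6, 0.8)
```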

  10. [Clinical research V. Sample size].

    Science.gov (United States)

    Talavera, Juan O; Rivas-Ruiz, Rodolfo; Bernal-Rosales, Laura Paola

    2011-01-01

    In clinical research it is impossible and inefficient to study all patients with a specific pathology, so it is necessary to study a sample of them. The estimation of the sample size before starting a study guarantees the stability of the results and allows us to foresee the feasibility of the study depending on the availability of patients and cost. The basic structure of sample size estimation is based on the premise that seeks to demonstrate, among other cases, that the observed difference between two or more maneuvers in the subsequent state is real. Initially, it requires knowing the value of the expected difference (δ) and its data variation (standard deviation). These data are usually obtained from previous studies. Then, other components must be considered: α (alpha), the accepted probability of error in asserting that the difference between means is real, usually 5%; and β (beta), the accepted probability of error in concluding that no difference between the means exists, usually ranging from 15 to 20%. Finally, these values are substituted into the formula or into an electronic program for estimating sample size. While summary and dispersion measures vary with the type of variable according to the outcome, the basic structure is the same.

  11. A convenient method and numerical tables for sample size determination in longitudinal-experimental research using multilevel models.

    Science.gov (United States)

    Usami, Satoshi

    2014-12-01

    Recent years have shown increased awareness of the importance of sample size determination in experimental research. Yet effective and convenient methods for sample size determination, especially in longitudinal experimental design, are still under development, and application of power analysis in applied research remains limited. This article presents a convenient method for sample size determination in longitudinal experimental research using a multilevel model. A fundamental idea of this method is transformation of model parameters (level 1 error variance [σ²], level 2 error variances [τ00, τ11] and their covariance [τ01, τ10], and a parameter representing the experimental effect [δ]) into indices (reliability of measurement at the first time point [ρ1], effect size at the last time point [ΔT], proportion of variance of outcomes between the first and the last time points [k], and level 2 error correlation [r]) that are intuitively understandable and easily specified. To foster more convenient use of power analysis, numerical tables are constructed that refer to ANOVA results to investigate the influence of the respective indices on statistical power.

  12. On the importance of accounting for competing risks in pediatric brain cancer: II. Regression modeling and sample size.

    Science.gov (United States)

    Tai, Bee-Choo; Grundy, Richard; Machin, David

    2011-03-15

    To accurately model the cumulative need for radiotherapy in trials designed to delay or avoid irradiation among children with malignant brain tumor, it is crucial to account for competing events and evaluate how each contributes to the timing of irradiation. An appropriate choice of statistical model is also important for adequate determination of sample size. We describe the statistical modeling of competing events (A, radiotherapy after progression; B, no radiotherapy after progression; and C, elective radiotherapy) using proportional cause-specific and subdistribution hazard functions. The procedures of sample size estimation based on each method are outlined. These are illustrated by use of data comparing children with ependymoma and other malignant brain tumors. The results from these two approaches are compared. The cause-specific hazard analysis showed a reduction in hazards among infants with ependymoma for all event types, including Event A (adjusted cause-specific hazard ratio, 0.76; 95% confidence interval, 0.45-1.28). Conversely, the subdistribution hazard analysis suggested an increase in hazard for Event A (adjusted subdistribution hazard ratio, 1.35; 95% confidence interval, 0.80-2.30), but the reduction in hazards for Events B and C remained. Analysis based on subdistribution hazard requires a larger sample size than the cause-specific hazard approach. Notable differences in effect estimates and anticipated sample size were observed between methods when the main event showed a beneficial effect whereas the competing events showed an adverse effect on the cumulative incidence. The subdistribution hazard is the most appropriate for modeling treatment when its effects on both the main and competing events are of interest. Copyright © 2011 Elsevier Inc. All rights reserved.

  13. Sample size calculation based on generalized linear models for differential expression analysis in RNA-seq data.

    Science.gov (United States)

    Li, Chung-I; Shyr, Yu

    2016-12-01

    As RNA-seq rapidly develops and costs continually decrease, the quantity and frequency of samples being sequenced will grow exponentially. With proteomic investigations becoming more multivariate and quantitative, determining a study's optimal sample size is now a vital step in experimental design. Current methods for calculating a study's required sample size are mostly based on the hypothesis testing framework, which assumes each gene count can be modeled through Poisson or negative binomial distributions; however, these methods are limited when it comes to accommodating covariates. To address this limitation, we propose an estimating procedure based on the generalized linear model. This easy-to-use method constructs a representative exemplary dataset and estimates the conditional power, all without requiring complicated mathematical approximations or formulas. Even more attractive, the downstream analysis can be performed with current R/Bioconductor packages. To demonstrate the practicability and efficiency of this method, we apply it to three real-world studies, and introduce our on-line calculator developed to determine the optimal sample size for an RNA-seq study.
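
    A hedged sketch of one way to estimate power for a single gene by simulation under a negative binomial GLM with a group covariate, using statsmodels; the fold change, dispersion, significance threshold and simulation design below are illustrative assumptions, not values or procedures from the paper, whose exemplary-dataset approach to conditional power is more general.

```python
import numpy as np
import statsmodels.api as sm

def simulated_power(n_per_group: int, fold_change: float = 2.0, base_mean: float = 100.0,
                    dispersion: float = 0.1, alpha: float = 0.001, n_sim: int = 200) -> float:
    """Monte Carlo power to detect `fold_change` between two groups for one gene,
    using a Wald test on the group coefficient of a negative binomial GLM."""
    rng = np.random.default_rng(0)
    group = np.repeat([0, 1], n_per_group)
    X = sm.add_constant(group.astype(float))
    mu = base_mean * fold_change ** group          # group 1 has `fold_change` times the mean
    size = 1.0 / dispersion                        # NB2 parameterization: var = mu + dispersion * mu^2
    p = size / (size + mu)
    hits = 0
    for _ in range(n_sim):
        y = rng.negative_binomial(size, p)
        res = sm.GLM(y, X, family=sm.families.NegativeBinomial(alpha=dispersion)).fit()
        hits += res.pvalues[1] < alpha
    return hits / n_sim

for n in (3, 5, 10):
    print(f"n = {n} per group: estimated power = {simulated_power(n):.2f}")
```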

  14. Sample Size Limits for Estimating Upper Level Mediation Models Using Multilevel SEM

    Science.gov (United States)

    Li, Xin; Beretvas, S. Natasha

    2013-01-01

    This simulation study investigated use of the multilevel structural equation model (MLSEM) for handling measurement error in both mediator and outcome variables ("M" and "Y") in an upper level multilevel mediation model. Mediation and outcome variable indicators were generated with measurement error. Parameter and standard…

  15. Sufficient Sample Size and Power in Multilevel Ordinal Logistic Regression Models

    Directory of Open Access Journals (Sweden)

    Sabz Ali

    2016-01-01

    Full Text Available For most of the time, biomedical researchers have been dealing with ordinal outcome variable in multilevel models where patients are nested in doctors. We can justifiably apply multilevel cumulative logit model, where the outcome variable represents the mild, severe, and extremely severe intensity of diseases like malaria and typhoid in the form of ordered categories. Based on our simulation conditions, Maximum Likelihood (ML) method is better than Penalized Quasilikelihood (PQL) method in three-category ordinal outcome variable. PQL method, however, performs equally well as ML method where five-category ordinal outcome variable is used. Further, to achieve power more than 0.80, at least 50 groups are required for both ML and PQL methods of estimation. It may be pointed out that, for five-category ordinal response variable model, the power of PQL method is slightly higher than the power of ML method.

  16. Sufficient Sample Size and Power in Multilevel Ordinal Logistic Regression Models

    Science.gov (United States)

    Ali, Amjad; Khan, Sajjad Ahmad; Hussain, Sundas

    2016-01-01

    For most of the time, biomedical researchers have been dealing with ordinal outcome variable in multilevel models where patients are nested in doctors. We can justifiably apply multilevel cumulative logit model, where the outcome variable represents the mild, severe, and extremely severe intensity of diseases like malaria and typhoid in the form of ordered categories. Based on our simulation conditions, Maximum Likelihood (ML) method is better than Penalized Quasilikelihood (PQL) method in three-category ordinal outcome variable. PQL method, however, performs equally well as ML method where five-category ordinal outcome variable is used. Further, to achieve power more than 0.80, at least 50 groups are required for both ML and PQL methods of estimation. It may be pointed out that, for five-category ordinal response variable model, the power of PQL method is slightly higher than the power of ML method.

  17. Reduction of sample size requirements by bilateral versus unilateral research designs in animal models for cartilage tissue engineering.

    Science.gov (United States)

    Orth, Patrick; Zurakowski, David; Alini, Mauro; Cucchiarini, Magali; Madry, Henning

    2013-11-01

    Advanced tissue engineering approaches for articular cartilage repair in the knee joint rely on translational animal models. In these investigations, cartilage defects may be established either in one joint (unilateral design) or in both joints of the same animal (bilateral design). We hypothesized that a lower intraindividual variability following the bilateral strategy would reduce the number of required joints. Standardized osteochondral defects were created in the trochlear groove of 18 rabbits. In 12 animals, defects were produced unilaterally (unilateral design; n=12 defects), while defects were created bilaterally in 6 animals (bilateral design; n=12 defects). After 3 weeks, osteochondral repair was evaluated histologically applying an established grading system. Based on intra- and interindividual variabilities, required sample sizes for the detection of discrete differences in the histological score were determined for both study designs (α=0.05, β=0.20). Coefficients of variation (%CV) of the total histological score values were 1.9-fold increased following the unilateral design when compared with the bilateral approach (26 versus 14%CV). The resulting numbers of joints needed to treat were always higher for the unilateral design, resulting in an up to 3.9-fold increase in the required number of experimental animals. This effect was most pronounced for the detection of small-effect sizes and estimating large standard deviations. The data underline the possible benefit of bilateral study designs for the decrease of sample size requirements for certain investigations in articular cartilage research. These findings might also be transferred to other scoring systems, defect types, or translational animal models in the field of cartilage tissue engineering.

  18. Predicting sample size required for classification performance

    Directory of Open Access Journals (Sweden)

    Figueroa Rosa L

    2012-02-01

    Full Text Available Abstract Background Supervised learning methods need annotated data in order to generate efficient models. Annotated data, however, is a relatively scarce resource and can be expensive to obtain. For both passive and active learning methods, there is a need to estimate the size of the annotated sample required to reach a performance target. Methods We designed and implemented a method that fits an inverse power law model to points of a given learning curve created using a small annotated training set. Fitting is carried out using nonlinear weighted least squares optimization. The fitted model is then used to predict the classifier's performance and confidence interval for larger sample sizes. For evaluation, the nonlinear weighted curve fitting method was applied to a set of learning curves generated using clinical text and waveform classification tasks with active and passive sampling methods, and predictions were validated using standard goodness-of-fit measures. As a control we used an un-weighted fitting method. Results A total of 568 models were fitted and the model predictions were compared with the observed performances. Depending on the data set and sampling method, it took between 80 and 560 annotated samples to achieve mean average and root mean squared error below 0.01. Results also show that our weighted fitting method outperformed the baseline un-weighted method (p < 0.05). Conclusions This paper describes a simple and effective sample size prediction algorithm that conducts weighted fitting of learning curves. The algorithm outperformed an un-weighted algorithm described in previous literature. It can help researchers determine annotation sample size for supervised machine learning.
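
    A minimal sketch of weighted fitting of an inverse power law learning curve with scipy; the parameterization y = (1 - a) - b * x^c and the data points are illustrative assumptions, not the exact model form or datasets used in the paper.

```python
import numpy as np
from scipy.optimize import curve_fit

def inverse_power_law(x, a, b, c):
    """One common inverse power law learning curve: accuracy approaches (1 - a) as x grows."""
    return (1.0 - a) - b * np.power(x, c)

# Hypothetical learning-curve points: (training set size, mean accuracy, std of accuracy).
sizes = np.array([50, 100, 200, 400, 800], dtype=float)
acc = np.array([0.71, 0.78, 0.83, 0.86, 0.88])
std = np.array([0.05, 0.04, 0.03, 0.02, 0.015])

# Weighted nonlinear least squares: points with smaller variance get larger weight.
popt, _ = curve_fit(inverse_power_law, sizes, acc, p0=(0.1, 1.0, -0.5), sigma=std)
a, b, c = popt
print(f"estimated asymptote = {1 - a:.3f}, decay exponent c = {c:.2f}")
print("predicted accuracy at 2000 annotated samples:", round(inverse_power_law(2000.0, a, b, c), 3))
```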

  19. The attention-weighted sample-size model of visual short-term memory: Attention capture predicts resource allocation and memory load.

    Science.gov (United States)

    Smith, Philip L; Lilburn, Simon D; Corbett, Elaine A; Sewell, David K; Kyllingsbæk, Søren

    2016-09-01

    We investigated the capacity of visual short-term memory (VSTM) in a phase discrimination task that required judgments about the configural relations between pairs of black and white features. Sewell et al. (2014) previously showed that VSTM capacity in an orientation discrimination task was well described by a sample-size model, which views VSTM as a resource comprised of a finite number of noisy stimulus samples. The model predicts the invariance of ∑i(d′i)², the sum of squared sensitivities across items, for displays of different sizes. For phase discrimination, the set-size effect significantly exceeded that predicted by the sample-size model for both simultaneously and sequentially presented stimuli. Instead, the set-size effect and the serial position curves with sequential presentation were predicted by an attention-weighted version of the sample-size model, which assumes that one of the items in the display captures attention and receives a disproportionate share of resources. The choice probabilities and response time distributions from the task were well described by a diffusion decision model in which the drift rates embodied the assumptions of the attention-weighted sample-size model.
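
    A tiny numerical illustration of the invariance prediction stated above: if a fixed pool of N noisy samples is shared evenly among the m display items, per-item sensitivity falls as the square root of N/m while the sum of squared sensitivities stays constant. The constants are arbitrary, and the attention-weighted variant (uneven sharing) is not modeled here.

```python
import math

# Sample-size model: a fixed pool of N noisy samples is shared evenly among m display items,
# so per-item sensitivity is d' proportional to sqrt(N / m) and the sum of squared d' is constant.
N, d_unit = 100, 0.5
for m in (1, 2, 4, 8):
    d_prime = d_unit * math.sqrt(N / m)      # sensitivity of each of the m items
    total = m * d_prime ** 2                 # sum over items of (d'_i)^2
    print(f"set size {m}: per-item d' = {d_prime:.2f}, sum of squared d' = {total:.1f}")
```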

  20. Sample size in qualitative interview studies

    DEFF Research Database (Denmark)

    Malterud, Kirsti; Siersma, Volkert Dirk; Guassora, Ann Dorrit Kristiane

    2016-01-01

    Sample sizes must be ascertained in qualitative studies like in quantitative studies but not by the same means. The prevailing concept for sample size in qualitative studies is "saturation." Saturation is closely tied to a specific methodology, and the term is inconsistently applied. We propose the concept "information power" to guide adequate sample size for qualitative studies. Information power indicates that the more information the sample holds, relevant for the actual study, the fewer participants are needed. We suggest that the size of a sample with sufficient information power… and during data collection of a qualitative study is discussed…

  1. How sample size influences research outcomes

    Directory of Open Access Journals (Sweden)

    Jorge Faber

    2014-08-01

    Full Text Available Sample size calculation is part of the early stages of conducting an epidemiological, clinical or lab study. In preparing a scientific paper, there are ethical and methodological indications for its use. Two investigations conducted with the same methodology and achieving equivalent results, but different only in terms of sample size, may point the researcher in different directions when it comes to making clinical decisions. Therefore, ideally, samples should not be small and, contrary to what one might think, should not be excessive. The aim of this paper is to discuss in clinical language the main implications of the sample size when interpreting a study.

  2. Biostatistics Series Module 5: Determining Sample Size.

    Science.gov (United States)

    Hazra, Avijit; Gogtay, Nithya

    2016-01-01

    Determining the appropriate sample size for a study, whatever be its type, is a fundamental aspect of biomedical research. An adequate sample ensures that the study will yield reliable information, regardless of whether the data ultimately suggests a clinically important difference between the interventions or elements being studied. The probability of Type 1 and Type 2 errors, the expected variance in the sample and the effect size are the essential determinants of sample size in interventional studies. Any method for deriving a conclusion from experimental data carries with it some risk of drawing a false conclusion. Two types of false conclusion may occur, called Type 1 and Type 2 errors, whose probabilities are denoted by the symbols α and β. A Type 1 error occurs when one concludes that a difference exists between the groups being compared when, in reality, it does not. This is akin to a false positive result. A Type 2 error occurs when one concludes that a difference does not exist when, in reality, a difference does exist, and it is equal to or larger than the effect size defined by the alternative to the null hypothesis. This may be viewed as a false negative result. When considering the risk of Type 2 error, it is more intuitive to think in terms of power of the study or (1 - β). Power denotes the probability of detecting a difference when a difference does exist between the groups being compared. Smaller α or larger power will increase sample size. Conventional acceptable values for power and α are 80% or above and 5% or below, respectively, when calculating sample size. Increasing variance in the sample tends to increase the sample size required to achieve a given power level. The effect size is the smallest clinically important difference that is sought to be detected and, rather than statistical convention, is a matter of past experience and clinical judgment. Larger samples are required if smaller differences are to be detected. Although the

  3. Basic Statistical Concepts for Sample Size Estimation

    Directory of Open Access Journals (Sweden)

    Vithal K Dhulkhed

    2008-01-01

    Full Text Available For grant proposals the investigator has to include an estimation of sample size. The size of the sample should be adequate so that there is sufficient data to reliably answer the research question being addressed by the study. At the very planning stage of the study the investigator has to involve the statistician. To have a meaningful dialogue with the statistician, every research worker should be familiar with the basic concepts of statistics. This paper is concerned with simple principles of sample size calculation. Concepts are explained based on logic rather than rigorous mathematical calculations to help the reader assimilate the fundamentals.

  4. Experimental determination of size distributions: analyzing proper sample sizes

    Science.gov (United States)

    Buffo, A.; Alopaeus, V.

    2016-04-01

    The measurement of various particle size distributions is a crucial aspect for many applications in the process industry. Size distribution is often related to the final product quality, as in crystallization or polymerization. In other cases it is related to the correct evaluation of heat and mass transfer, as well as reaction rates, depending on the interfacial area between the different phases or to the assessment of yield stresses of polycrystalline metals/alloys samples. The experimental determination of such distributions often involves laborious sampling procedures and the statistical significance of the outcome is rarely investigated. In this work, we propose a novel rigorous tool, based on inferential statistics, to determine the number of samples needed to obtain reliable measurements of size distribution, according to specific requirements defined a priori. Such methodology can be adopted regardless of the measurement technique used.

  5. Comparison of Bayesian Sample Size Criteria: ACC, ALC, and WOC.

    Science.gov (United States)

    Cao, Jing; Lee, J Jack; Alber, Susan

    2009-12-01

    A challenge for implementing performance based Bayesian sample size determination is selecting which of several methods to use. We compare three Bayesian sample size criteria: the average coverage criterion (ACC) which controls the coverage rate of fixed length credible intervals over the predictive distribution of the data, the average length criterion (ALC) which controls the length of credible intervals with a fixed coverage rate, and the worst outcome criterion (WOC) which ensures the desired coverage rate and interval length over all (or a subset of) possible datasets. For most models, the WOC produces the largest sample size among the three criteria, and sample sizes obtained by the ACC and the ALC are not the same. For Bayesian sample size determination for normal means and differences between normal means, we investigate, for the first time, the direction and magnitude of differences between the ACC and ALC sample sizes. For fixed hyperparameter values, we show that the difference of the ACC and ALC sample size depends on the nominal coverage, and not on the nominal interval length. There exists a threshold value of the nominal coverage level such that below the threshold the ALC sample size is larger than the ACC sample size, and above the threshold the ACC sample size is larger. Furthermore, the ACC sample size is more sensitive to changes in the nominal coverage. We also show that for fixed hyperparameter values, there exists an asymptotic constant ratio between the WOC sample size and the ALC (ACC) sample size. Simulation studies are conducted to show that similar relationships among the ACC, ALC, and WOC may hold for estimating binomial proportions. We provide a heuristic argument that the results can be generalized to a larger class of models.

  6. Did modeling overestimate the transmission potential of pandemic (H1N1-2009)? Sample size estimation for post-epidemic seroepidemiological studies.

    Directory of Open Access Journals (Sweden)

    Hiroshi Nishiura

    Full Text Available BACKGROUND: Seroepidemiological studies before and after the epidemic wave of H1N1-2009 are useful for estimating population attack rates with a potential to validate early estimates of the reproduction number, R, in modeling studies. METHODOLOGY/PRINCIPAL FINDINGS: Since the final epidemic size, the proportion of individuals in a population who become infected during an epidemic, is not the result of a binomial sampling process because infection events are not independent of each other, we propose the use of an asymptotic distribution of the final size to compute approximate 95% confidence intervals of the observed final size. This allows the comparison of the observed final sizes against predictions based on the modeling study (R = 1.15, 1.40 and 1.90), which also yields simple formulae for determining sample sizes for future seroepidemiological studies. We examine a total of eleven published seroepidemiological studies of H1N1-2009 that took place after observing the peak incidence in a number of countries. Observed seropositive proportions in six studies appear to be smaller than that predicted from R = 1.40; four of the six studies sampled serum less than one month after the reported peak incidence. The comparison of the observed final sizes against R = 1.15 and 1.90 reveals that all eleven studies appear not to be significantly deviating from the prediction with R = 1.15, but final sizes in nine studies indicate overestimation if the value R = 1.90 is used. CONCLUSIONS: Sample sizes of published seroepidemiological studies were too small to assess the validity of model predictions except when R = 1.90 was used. We recommend the use of the proposed approach in determining the sample size of post-epidemic seroepidemiological studies, calculating the 95% confidence interval of observed final size, and conducting relevant hypothesis testing instead of the use of methods that rely on a binomial proportion.
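
    The deterministic relation that links an assumed reproduction number to an expected attack rate is the standard final-size equation, which can be solved numerically as in the sketch below (for R = 1.40 this gives an attack rate of roughly 0.5). The asymptotic variance and sample-size formulae of the paper are not reproduced here.

```python
# Sketch of the final-size relation z = 1 - exp(-R*z) used to convert an assumed
# reproduction number R into an expected attack rate; the paper's confidence
# interval machinery for the observed final size is not reproduced.
import math
from scipy.optimize import brentq

def final_size(R):
    """Non-trivial root of z = 1 - exp(-R*z) for R > 1 (z = 0 is the trivial root)."""
    return brentq(lambda z: z - (1.0 - math.exp(-R * z)), 1e-9, 1.0 - 1e-12)

for R in (1.15, 1.40, 1.90):
    print(R, round(final_size(R), 3))
```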

  7. Sample size calculation in metabolic phenotyping studies.

    Science.gov (United States)

    Billoir, Elise; Navratil, Vincent; Blaise, Benjamin J

    2015-09-01

    The number of samples needed to identify significant effects is a key question in biomedical studies, with consequences on experimental designs, costs and potential discoveries. In metabolic phenotyping studies, sample size determination remains a complex step. This is due particularly to the multiple hypothesis-testing framework and the top-down hypothesis-free approach, with no a priori known metabolic target. Until now, there was no standard procedure available to address this purpose. In this review, we discuss sample size estimation procedures for metabolic phenotyping studies. We release an automated implementation of the Data-driven Sample size Determination (DSD) algorithm for MATLAB and GNU Octave. Original research concerning DSD was published elsewhere. DSD allows the determination of an optimized sample size in metabolic phenotyping studies. The procedure uses analytical data only from a small pilot cohort to generate an expanded data set. The statistical recoupling of variables procedure is used to identify metabolic variables, and their intensity distributions are estimated by Kernel smoothing or log-normal density fitting. Statistically significant metabolic variations are evaluated using the Benjamini-Yekutieli correction and processed for data sets of various sizes. Optimal sample size determination is achieved in a context of biomarker discovery (at least one statistically significant variation) or metabolic exploration (a maximum of statistically significant variations). DSD toolbox is encoded in MATLAB R2008A (Mathworks, Natick, MA) for Kernel and log-normal estimates, and in GNU Octave for log-normal estimates (Kernel density estimates are not robust enough in GNU Octave). It is available at http://www.prabi.fr/redmine/projects/dsd/repository, with a tutorial at http://www.prabi.fr/redmine/projects/dsd/wiki. © The Author 2015. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.
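
    The DSD toolbox itself is distributed for MATLAB and GNU Octave; purely as an illustration of the multiplicity step mentioned above, the Benjamini-Yekutieli correction can be applied to a vector of p values in Python as follows (random p values stand in for the per-variable tests).

```python
# Hedged illustration of the Benjamini-Yekutieli step only, not of the DSD workflow.
import numpy as np
from statsmodels.stats.multitest import multipletests

rng = np.random.default_rng(0)
pvals = rng.uniform(size=500)                      # stand-in for per-variable p values
reject, p_adjusted, _, _ = multipletests(pvals, alpha=0.05, method="fdr_by")
print(int(reject.sum()), "variables flagged after BY correction")
```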

  8. Sample size calculation for comparing two negative binomial rates.

    Science.gov (United States)

    Zhu, Haiyuan; Lakkis, Hassan

    2014-02-10

    The negative binomial model has been increasingly used to model count data in recent clinical trials. It is frequently chosen over the Poisson model for the overdispersed count data that are commonly seen in clinical trials. One of the challenges of applying the negative binomial model in clinical trial design is sample size estimation. In practice, simulation methods have frequently been used for sample size estimation. In this paper, an explicit formula is developed to calculate sample size based on the negative binomial model. Depending on different approaches to estimating the variance under the null hypothesis, three variations of the sample size formula are proposed and discussed. Important characteristics of the formula include its accuracy and its ability to explicitly incorporate the dispersion parameter and exposure time. The performance of each variation of the formula is assessed using simulations.
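
    The paper's three variants differ in how the null variance is estimated; the sketch below implements one commonly used closed form (variance of the log rate ratio evaluated under the alternative, with dispersion k and exposure time t) and should be read as an approximation in the spirit of the abstract rather than as the paper's exact formula. The trial numbers are hypothetical.

```python
# Commonly used approximation for the per-group sample size when comparing two
# negative binomial event rates; not necessarily the paper's exact variance choice.
import math
from scipy.stats import norm

def nb_sample_size(rate1, rate2, dispersion, exposure=1.0, alpha=0.05, power=0.9):
    z_a, z_b = norm.ppf(1 - alpha / 2), norm.ppf(power)
    var_log_rr = (1 / (exposure * rate1) + dispersion) + (1 / (exposure * rate2) + dispersion)
    return math.ceil((z_a + z_b) ** 2 * var_log_rr / math.log(rate1 / rate2) ** 2)

# Hypothetical trial: control rate 0.8 events/year, 30% rate reduction, dispersion 0.5.
print(nb_sample_size(rate1=0.8, rate2=0.56, dispersion=0.5, exposure=1.0))
```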

  9. Power and sample size for the S:T repeated measures design combined with a linear mixed-effects model allowing for missing data.

    Science.gov (United States)

    Tango, Toshiro

    2017-02-13

    Tango (Biostatistics 2016) proposed a new repeated measures design, called the S:T repeated measures design, combined with generalized linear mixed-effects models and sample size calculations for a test of the average treatment effect that depend not only on the number of subjects but also on the number of repeated measures taken before and after randomization per subject and used for analysis. The main advantages of the proposed design combined with the generalized linear mixed-effects models are that (1) it can easily handle missing data by applying likelihood-based ignorable analyses under the missing-at-random assumption and (2) it may lead to a reduction in sample size compared with the simple pre-post design. In this article, we present formulas for calculating power and sample sizes for a test of the average treatment effect allowing for missing data within the framework of the S:T repeated measures design with a continuous response variable combined with a linear mixed-effects model. Examples are provided to illustrate the use of these formulas.
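
    Tango's formulas for the generalized linear mixed-effects setting are not reproduced in the record above. A classical continuous-outcome analogue (a Frison and Pocock-style ANCOVA variance with S baseline and T follow-up measures under a compound-symmetry correlation rho) conveys why adding repeated measures shrinks the required sample size; the sketch below is that analogue, not the S:T design formula, and its inputs are hypothetical.

```python
# Classical analogue only (not Tango's GLMM formulas): per-group sample size for a
# design with S baseline and T follow-up measures, compound-symmetry correlation rho.
import math
from scipy.stats import norm

def n_per_group_ST(delta, sigma, rho, S, T, alpha=0.05, power=0.9):
    z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
    design_factor = (1 + (T - 1) * rho) / T - S * rho**2 / (1 + (S - 1) * rho)
    return math.ceil(2 * sigma**2 * design_factor * z**2 / delta**2)

# Hypothetical inputs: effect of 0.4 SD, rho = 0.6, one baseline and three follow-ups.
print(n_per_group_ST(delta=0.4, sigma=1.0, rho=0.6, S=1, T=3))
```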

  10. Linear models for airborne-laser-scanning-based operational forest inventory with small field sample size and highly correlated LiDAR data

    Science.gov (United States)

    Junttila, Virpi; Kauranne, Tuomo; Finley, Andrew O.; Bradford, John B.

    2015-01-01

    Modern operational forest inventory often uses remotely sensed data that cover the whole inventory area to produce spatially explicit estimates of forest properties through statistical models. The data obtained by airborne light detection and ranging (LiDAR) correlate well with many forest inventory variables, such as the tree height, the timber volume, and the biomass. To construct an accurate model over thousands of hectares, LiDAR data must be supplemented with several hundred field sample measurements of forest inventory variables. This can be costly and time consuming. Different LiDAR-data-based and spatial-data-based sampling designs can reduce the number of field sample plots needed. However, problems arising from the features of the LiDAR data, such as a large number of predictors compared with the sample size (overfitting) or a strong correlation among predictors (multicollinearity), may decrease the accuracy and precision of the estimates and predictions. To overcome these problems, a Bayesian linear model with the singular value decomposition of predictors, combined with regularization, is proposed. The model performance in predicting different forest inventory variables is verified in ten inventory areas from two continents, where the number of field sample plots is reduced using different sampling designs. The results show that, with an appropriate field plot selection strategy and the proposed linear model, the total relative error of the predicted forest inventory variables is only 5%–15% larger using 50 field sample plots than the error of a linear model estimated with several hundred field sample plots when we sum up the error due to both the model noise variance and the model’s lack of fit.
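
    The core numerical idea (regularizing a regression on highly collinear LiDAR predictors through the singular value decomposition) can be sketched as below; the paper's model is Bayesian and includes plot-selection strategies that are not reproduced here, and the simulated data are purely illustrative.

```python
# Hedged sketch: ridge-type shrinkage computed from the SVD of a collinear
# predictor matrix, beta = V diag(s/(s^2 + lam)) U^T y. Not the paper's Bayesian model.
import numpy as np

def ridge_via_svd(X, y, lam=1.0):
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return Vt.T @ ((s / (s**2 + lam)) * (U.T @ y))

rng = np.random.default_rng(1)
n_plots, n_metrics = 50, 40            # few field plots, many correlated LiDAR metrics
latent = rng.normal(size=(n_plots, 5))
X = latent @ rng.normal(size=(5, n_metrics)) + 0.05 * rng.normal(size=(n_plots, n_metrics))
y = X @ rng.normal(size=n_metrics) + rng.normal(size=n_plots)
print(ridge_via_svd(X, y, lam=10.0)[:5])   # first few shrunken coefficients
```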

  11. Power Analysis and Sample Size Determination in Metabolic Phenotyping.

    Science.gov (United States)

    Blaise, Benjamin J; Correia, Gonçalo; Tin, Adrienne; Young, J Hunter; Vergnaud, Anne-Claire; Lewis, Matthew; Pearce, Jake T M; Elliott, Paul; Nicholson, Jeremy K; Holmes, Elaine; Ebbels, Timothy M D

    2016-05-17

    Estimation of statistical power and sample size is a key aspect of experimental design. However, in metabolic phenotyping, there is currently no accepted approach for these tasks, in large part due to the unknown nature of the expected effect. In such hypothesis-free science, neither the number nor the class of important analytes nor the effect size is known a priori. We introduce a new approach, based on multivariate simulation, which deals effectively with the highly correlated structure and high dimensionality of metabolic phenotyping data. First, a large data set is simulated based on the characteristics of a pilot study investigating a given biomedical issue. An effect of a given size, corresponding either to a discrete (classification) or continuous (regression) outcome, is then added. Different sample sizes are modeled by randomly selecting data sets of various sizes from the simulated data. We investigate different methods for effect detection, including univariate and multivariate techniques. Our framework allows us to investigate the complex relationship between sample size, power, and effect size for real multivariate data sets. For instance, we demonstrate for an example pilot data set that certain features achieve a power of 0.8 for a sample size of 20 samples, or that a cross-validated predictivity Q²Y of 0.8 is reached with an effect size of 0.2 and 200 samples. We exemplify the approach for both nuclear magnetic resonance and liquid chromatography-mass spectrometry data from humans and the model organism C. elegans.
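
    A stripped-down version of the simulate-then-subsample idea, restricted to the univariate testing branch (their multivariate and Q²Y analyses are not reproduced), might look like the following; the pilot characteristics, effect size and number of affected features are hypothetical.

```python
# Generic power-by-simulation sketch in the spirit of the approach above
# (univariate t-tests with FDR control only); all settings are hypothetical.
import numpy as np
from scipy import stats
from statsmodels.stats.multitest import multipletests

rng = np.random.default_rng(42)
n_features, n_affected, effect, alpha = 200, 10, 0.8, 0.05

def power_at(n_per_group, n_sim=200):
    found = 0.0
    for _ in range(n_sim):
        ctrl = rng.normal(size=(n_per_group, n_features))
        case = rng.normal(size=(n_per_group, n_features))
        case[:, :n_affected] += effect                     # inject the simulated effect
        pvals = stats.ttest_ind(case, ctrl, axis=0).pvalue
        reject = multipletests(pvals, alpha=alpha, method="fdr_bh")[0]
        found += reject[:n_affected].mean()                # fraction of true effects recovered
    return found / n_sim

for n in (10, 20, 40):
    print(n, round(power_at(n), 2))
```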

  12. Letter to the editor : Design-Based Versus Model-Based Sampling Strategies: Comment on R. J. Barnes' "Bounding the Required Sample Size for Geologic Site Characterization"

    NARCIS (Netherlands)

    Gruijter, de J.J.; Braak, ter C.J.F.

    1992-01-01

    Two fundamentally different sources of randomness exist on which design and inference in spatial sampling can be based: (a) variation that would occur on resampling the same spatial population with other sampling configurations generated by the same design, and (b) variation occurring on sampling

  13. Estimating population size for Capercaillie (Tetrao urogallus L.) with spatial capture-recapture models based on genotypes from one field sample

    Science.gov (United States)

    Mollet, Pierre; Kery, Marc; Gardner, Beth; Pasinelli, Gilberto; Royle, Andy

    2015-01-01

    We conducted a survey of an endangered and cryptic forest grouse, the capercaillie Tetrao urogallus, based on droppings collected on two sampling occasions in eight forest fragments in central Switzerland in early spring 2009. We used genetic analyses to sex and individually identify birds. We estimated sex-dependent detection probabilities and population size using a modern spatial capture-recapture (SCR) model for the data from pooled surveys. A total of 127 capercaillie genotypes were identified (77 males, 46 females, and 4 of unknown sex). The SCR model yielded a total population size estimate (posterior mean) of 137.3 capercaillies (posterior sd 4.2, 95% CRI 130–147). The observed sex ratio was skewed towards males (0.63). The posterior mean of the sex ratio under the SCR model was 0.58 (posterior sd 0.02, 95% CRI 0.54–0.61), suggesting a male-biased sex ratio in our study area. A subsampling simulation study indicated that a reduced sampling effort representing 75% of the actual detections would still yield practically acceptable estimates of total size and sex ratio in our population. Hence, field work and financial effort could be reduced without compromising accuracy when the SCR model is used to estimate key population parameters of cryptic species.

  14. Estimating Population Size for Capercaillie (Tetrao urogallus L.) with Spatial Capture-Recapture Models Based on Genotypes from One Field Sample.

    Directory of Open Access Journals (Sweden)

    Pierre Mollet

    Full Text Available We conducted a survey of an endangered and cryptic forest grouse, the capercaillie Tetrao urogallus, based on droppings collected on two sampling occasions in eight forest fragments in central Switzerland in early spring 2009. We used genetic analyses to sex and individually identify birds. We estimated sex-dependent detection probabilities and population size using a modern spatial capture-recapture (SCR) model for the data from pooled surveys. A total of 127 capercaillie genotypes were identified (77 males, 46 females, and 4 of unknown sex). The SCR model yielded a total population size estimate (posterior mean) of 137.3 capercaillies (posterior sd 4.2, 95% CRI 130-147). The observed sex ratio was skewed towards males (0.63). The posterior mean of the sex ratio under the SCR model was 0.58 (posterior sd 0.02, 95% CRI 0.54-0.61), suggesting a male-biased sex ratio in our study area. A subsampling simulation study indicated that a reduced sampling effort representing 75% of the actual detections would still yield practically acceptable estimates of total size and sex ratio in our population. Hence, field work and financial effort could be reduced without compromising accuracy when the SCR model is used to estimate key population parameters of cryptic species.

  15. Effects of lidar pulse density and sample size on a model-assisted approach to estimate forest inventory variables

    Science.gov (United States)

    Jacob Strunk; Hailemariam Temesgen; Hans-Erik Andersen; James P. Flewelling; Lisa Madsen

    2012-01-01

    Using lidar in an area-based model-assisted approach to forest inventory has the potential to increase estimation precision for some forest inventory variables. This study documents the bias and precision of a model-assisted (regression estimation) approach to forest inventory with lidar-derived auxiliary variables relative to lidar pulse density and the number of...

  16. On power and sample size calculation in ethnic sensitivity studies.

    Science.gov (United States)

    Zhang, Wei; Sethuraman, Venkat

    2011-01-01

    In ethnic sensitivity studies, it is of interest to know whether the same dose has the same effect over populations in different regions. Glasbrenner and Rosenkranz (2006) proposed a criterion for ethnic sensitivity studies in the context of different dose-exposure models. Their method is liberal in the sense that their sample size will not achieve the target power. We will show that the power function can be easily calculated by numeric integration, and the sample size can be determined by bisection.
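
    The specific dose-exposure power function of the paper is not given in the record, but the bisection step it recommends is generic: once power(n) can be evaluated (by numerical integration in their setting, by a normal approximation in the stand-in below), the smallest adequate n follows from a monotone search. All parameters below are hypothetical.

```python
# Illustrative only: a normal-approximation two-sample power curve stands in for
# the paper's numerically integrated power function; the point is the bisection.
import math
from scipy.stats import norm

def power(n, delta=0.3, sigma=1.0, alpha=0.05):
    return norm.cdf(delta / (sigma * math.sqrt(2.0 / n)) - norm.ppf(1 - alpha / 2))

def smallest_n(target=0.9, lo=2, hi=100_000):
    while lo < hi:                                   # bisection on a monotone power curve
        mid = (lo + hi) // 2
        if power(mid) >= target:
            hi = mid
        else:
            lo = mid + 1
    return lo

n_req = smallest_n()
print(n_req, round(power(n_req), 3))
```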

  17. Defining sample size and sampling strategy for dendrogeomorphic rockfall reconstructions

    Science.gov (United States)

    Morel, Pauline; Trappmann, Daniel; Corona, Christophe; Stoffel, Markus

    2015-05-01

    Optimized sampling strategies have been recently proposed for dendrogeomorphic reconstructions of mass movements with a large spatial footprint, such as landslides, snow avalanches, and debris flows. Such guidelines have, by contrast, been largely missing for rockfalls and cannot be transposed owing to the sporadic nature of this process and the occurrence of individual rocks and boulders. Based on a data set of 314 European larch (Larix decidua Mill.) trees (i.e., 64 trees/ha), growing on an active rockfall slope, this study bridges this gap and proposes an optimized sampling strategy for the spatial and temporal reconstruction of rockfall activity. Using random extractions of trees, iterative mapping, and a stratified sampling strategy based on an arbitrary selection of trees, we investigate subsets of the full tree-ring data set to define optimal sample size and sampling design for the development of frequency maps of rockfall activity. Spatially, our results demonstrate that the sampling of only 6 representative trees per ha can be sufficient to yield a reasonable mapping of the spatial distribution of rockfall frequencies on a slope, especially if the oldest and most heavily affected individuals are included in the analysis. At the same time, however, sampling such a low number of trees risks causing significant errors especially if nonrepresentative trees are chosen for analysis. An increased number of samples therefore improves the quality of the frequency maps in this case. Temporally, we demonstrate that at least 40 trees/ha are needed to obtain reliable rockfall chronologies. These results will facilitate the design of future studies, decrease the cost-benefit ratio of dendrogeomorphic studies and thus will permit production of reliable reconstructions with reasonable temporal efforts.

  18. Sample size estimation and sampling techniques for selecting a representative sample

    OpenAIRE

    Aamir Omair

    2014-01-01

    Introduction: The purpose of this article is to provide a general understanding of the concepts of sampling as applied to health-related research. Sample Size Estimation: It is important to select a representative sample in quantitative research in order to be able to generalize the results to the target population. The sample should be of the required sample size and must be selected using an appropriate probability sampling technique. There are many hidden biases which can adversely affect ...

  19. A stochastic simulation model to determine the sample size of repeated national surveys to document freedom from bovine herpesvirus 1 (BoHV-1) infection

    Directory of Open Access Journals (Sweden)

    Schwermer Heinzpeter

    2007-05-01

    Full Text Available Abstract Background International trade regulations require that countries document their livestock's sanitary status in general, and freedom from specific infective agents in detail, when import restrictions are to be applied. The latter is generally achieved by large national serological surveys and risk assessments. The paper describes the basic structure and application of a generic stochastic model for risk-based sample size calculation of consecutive national surveys to document freedom from contagious disease agents in livestock. Methods In the model, disease spread during the time period between two consecutive surveys was considered, either from undetected infections within the domestic population or from imported infected animals. The @Risk model consists of three parts: the domestic spread between two national surveys; the infection of domestic herds by animals imported from countries with a sanitary status comparable to or lower than that of Switzerland; and a summary sheet that sums up the numbers of infected herds resulting from all infection pathways to derive the pre-survey prevalence in the domestic population. From this, the pre-survey probability of freedom from infection and the required survey sample sizes were calculated. A scenario for the detection of infected herds by general surveillance was included optionally. Results The model highlights the importance of residual domestic infection spread and the characteristics of the different import pathways. The sensitivity analysis revealed that the number of infected but undetected domestic herds and the multiplicative between-survey spread factor were most strongly correlated with the pre-survey probability of freedom from infection and the resulting sample size, respectively. Compared to its deterministic precursor, the stochastic model was therefore more sensitive to the previous survey's results. Undetected spread of infection in the domestic population between two surveys gained more
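
    As a much simplified companion to the stochastic model described above, the classical closed-form survey size for documenting freedom from infection at a chosen design prevalence and test sensitivity is sketched below; the paper's risk-based model accounts for between-survey spread and import pathways, which this formula does not. The inputs are hypothetical.

```python
# Hedged, simplified companion only: classical freedom-from-infection survey size
# for a design prevalence p_design and test sensitivity se, assuming a large
# population and zero positives observed.
import math

def freedom_sample_size(p_design, se=1.0, confidence=0.95):
    return math.ceil(math.log(1 - confidence) / math.log(1 - p_design * se))

print(freedom_sample_size(p_design=0.002, se=0.95))   # hypothetical 0.2% design prevalence
```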

  20. Sample Size Requirements for Traditional and Regression-Based Norms.

    Science.gov (United States)

    Oosterhuis, Hannah E M; van der Ark, L Andries; Sijtsma, Klaas

    2016-04-01

    Test norms enable determining the position of an individual test taker in the group. The most frequently used approach to obtain test norms is traditional norming. Regression-based norming may be more efficient than traditional norming and is rapidly growing in popularity, but little is known about its technical properties. A simulation study was conducted to compare the sample size requirements for traditional and regression-based norming by examining the 95% interpercentile ranges for percentile estimates as a function of sample size, norming method, size of covariate effects on the test score, test length, and number of answer categories in an item. Provided the assumptions of the linear regression model hold in the data, for a subdivision of the total group into eight equal-size subgroups, we found that regression-based norming requires samples 2.5 to 5.5 times smaller than traditional norming. Sample size requirements are presented for each norming method, test length, and number of answer categories. We emphasize that additional research is needed to establish sample size requirements when the assumptions of the linear regression model are violated.

  1. Sample size estimation and sampling techniques for selecting a representative sample

    Directory of Open Access Journals (Sweden)

    Aamir Omair

    2014-01-01

    Full Text Available Introduction: The purpose of this article is to provide a general understanding of the concepts of sampling as applied to health-related research. Sample Size Estimation: It is important to select a representative sample in quantitative research in order to be able to generalize the results to the target population. The sample should be of the required sample size and must be selected using an appropriate probability sampling technique. There are many hidden biases which can adversely affect the outcome of the study. Important factors to consider for estimating the sample size include the size of the study population, confidence level, expected proportion of the outcome variable (for categorical variables) / standard deviation of the outcome variable (for numerical variables), and the required precision (margin of accuracy) from the study. The more the precision required, the greater is the required sample size. Sampling Techniques: The probability sampling techniques applied for health related research include simple random sampling, systematic random sampling, stratified random sampling, cluster sampling, and multistage sampling. These are more recommended than the nonprobability sampling techniques, because the results of the study can be generalized to the target population.
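
    The proportion-based calculation summarized above has a standard closed form, sketched below with an optional finite-population correction; the planning values are hypothetical.

```python
# Standard sample-size formula for estimating a proportion p with absolute
# precision d at a given confidence level, with an optional finite-population correction.
import math
from scipy.stats import norm

def sample_size_proportion(p, d, confidence=0.95, population=None):
    z = norm.ppf(0.5 + confidence / 2)
    n = z**2 * p * (1 - p) / d**2
    if population is not None:
        n = n / (1 + (n - 1) / population)          # finite-population correction
    return math.ceil(n)

print(sample_size_proportion(p=0.30, d=0.05))                    # about 323
print(sample_size_proportion(p=0.30, d=0.05, population=2000))   # smaller with the correction
```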

  2. Application of the ESRI Geostatistical Analyst for Determining the Adequacy and Sample Size Requirements of Ozone Distribution Models in the Carpathian and Sierra Nevada Mountains

    Directory of Open Access Journals (Sweden)

    Witold Fraczek

    2001-01-01

    Full Text Available Models of O3 distribution in two mountain ranges, the Carpathians in Central Europe and the Sierra Nevada in California were constructed using ArcGIS Geostatistical Analyst extension (ESRI, Redlands, CA) using kriging and cokriging methods. The adequacy of the spatially interpolated ozone (O3) concentrations and sample size requirements for ozone passive samplers was also examined. In case of the Carpathian Mountains, only a general surface of O3 distribution could be obtained, partially due to a weak correlation between O3 concentration and elevation, and partially due to small numbers of unevenly distributed sample sites. In the Sierra Nevada Mountains, the O3 monitoring network was much denser and more evenly distributed, and additional climatologic information was available. As a result the estimated surfaces were more precise and reliable than those created for the Carpathians. The final maps of O3 concentrations for Sierra Nevada were derived from a cokriging algorithm based on two secondary variables (elevation and maximum temperature) as well as the determined geographic trend. Evenly distributed and sufficient numbers of sample points are a key factor for model accuracy and reliability.

  3. Application of the ESRI Geostatistical Analyst for determining the adequacy and sample size requirements of ozone distribution models in the Carpathian and Sierra Nevada Mountains.

    Science.gov (United States)

    Fraczek, W; Bytnerowicz, A; Arbaugh, M J

    2001-12-07

    Models of O3 distribution in two mountain ranges, the Carpathians in Central Europe and the Sierra Nevada in California were constructed using ArcGIS Geostatistical Analyst extension (ESRI, Redlands, CA) using kriging and cokriging methods. The adequacy of the spatially interpolated ozone (O3) concentrations and sample size requirements for ozone passive samplers was also examined. In case of the Carpathian Mountains, only a general surface of O3 distribution could be obtained, partially due to a weak correlation between O3 concentration and elevation, and partially due to small numbers of unevenly distributed sample sites. In the Sierra Nevada Mountains, the O3 monitoring network was much denser and more evenly distributed, and additional climatologic information was available. As a result the estimated surfaces were more precise and reliable than those created for the Carpathians. The final maps of O3 concentrations for Sierra Nevada were derived from cokriging algorithm based on two secondary variables--elevation and maximum temperature as well as the determined geographic trend. Evenly distributed and sufficient numbers of sample points are a key factor for model accuracy and reliability.

  4. Power and sample size determination in the Rasch model: evaluation of the robustness of a numerical method to non-normality of the latent trait.

    Directory of Open Access Journals (Sweden)

    Alice Guilleux

    Full Text Available Patient-reported outcomes (PRO) have gained importance in clinical and epidemiological research and aim at assessing quality of life, anxiety or fatigue for instance. Item Response Theory (IRT) models are increasingly used to validate and analyse PRO. Such models relate observed variables to a latent variable (unobservable variable) which is commonly assumed to be normally distributed. A priori sample size determination is important to obtain adequately powered studies to determine clinically important changes in PRO. In previous developments, the Raschpower method has been proposed for the determination of the power of the test of group effect for the comparison of PRO in cross-sectional studies with an IRT model, the Rasch model. The objective of this work was to evaluate the robustness of this method (which assumes a normal distribution for the latent variable) to violations of distributional assumption. The statistical power of the test of group effect was estimated by the empirical rejection rate in data sets simulated using a non-normally distributed latent variable. It was compared to the power obtained with the Raschpower method. In both cases, the data were analyzed using a latent regression Rasch model including a binary covariate for group effect. For all situations, both methods gave comparable results whatever the deviations from the model assumptions. Given the results, the Raschpower method seems to be robust to the non-normality of the latent trait for determining the power of the test of group effect.
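
    Raschpower itself evaluates power for a latent-regression Rasch model; the toy sketch below only illustrates the generic notion of power as an empirical rejection rate under a deliberately skewed latent variable, using a simple two-group test, and makes no claim about the Rasch setting. The effect size and distribution are hypothetical.

```python
# Toy illustration of power estimated as an empirical rejection rate under a
# skewed (centred gamma) latent trait; not the Raschpower procedure.
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

def rejection_rate(n_per_group, group_effect=0.4, n_sim=2000, alpha=0.05):
    rejections = 0
    for _ in range(n_sim):
        g0 = rng.gamma(shape=2.0, scale=1.0, size=n_per_group) - 2.0
        g1 = rng.gamma(shape=2.0, scale=1.0, size=n_per_group) - 2.0 + group_effect
        rejections += stats.ttest_ind(g0, g1).pvalue < alpha
    return rejections / n_sim

print(rejection_rate(100))   # empirical power for a hypothetical group effect of 0.4
```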

  5. Publication bias in psychology: a diagnosis based on the correlation between effect size and sample size

    National Research Council Canada - National Science Library

    Kühberger, Anton; Fritz, Astrid; Scherndl, Thomas

    2014-01-01

    .... We extracted p values, effect sizes, and sample sizes of all empirical papers, and calculated the correlation between effect size and sample size, and investigated the distribution of p values...

  6. Bayesian sample size for diagnostic test studies in the absence of a gold standard: Comparing identifiable with non-identifiable models.

    Science.gov (United States)

    Dendukuri, Nandini; Bélisle, Patrick; Joseph, Lawrence

    2010-11-20

    Diagnostic tests rarely provide perfect results. The misclassification induced by imperfect sensitivities and specificities of diagnostic tests must be accounted for when planning prevalence studies or investigations into properties of new tests. Previous work has shown that applying a single imperfect test to estimate prevalence can often result in very large sample size requirements, and that sometimes even an infinite sample size is insufficient for precise estimation because the problem is non-identifiable. Adding a second test can sometimes reduce the sample size substantially, but infinite sample sizes can still occur as the problem remains non-identifiable. We investigate the further improvement possible when three diagnostic tests are to be applied. We first develop methods required for studies when three conditionally independent tests are available, using different Bayesian criteria. We then apply these criteria to prototypic scenarios, showing that large sample size reductions can occur compared to when only one or two tests are used. As the problem is now identifiable, infinite sample sizes cannot occur except in pathological situations. Finally, we relax the conditional independence assumption, demonstrating in this once again non-identifiable situation that sample sizes may substantially grow and possibly be infinite. We apply our methods to the planning of two infectious disease studies, the first designed to estimate the prevalence of Strongyloides infection, and the second relating to estimating the sensitivity of a new test for tuberculosis transmission. The much smaller sample sizes that are typically required when three as compared to one or two tests are used should encourage researchers to plan their studies using more than two diagnostic tests whenever possible. User-friendly software is available for both design and analysis stages greatly facilitating the use of these methods.

  7. Sample size in psychological research over the past 30 years.

    Science.gov (United States)

    Marszalek, Jacob M; Barber, Carolyn; Kohlhart, Julie; Holmes, Cooper B

    2011-04-01

    The American Psychological Association (APA) Task Force on Statistical Inference was formed in 1996 in response to a growing body of research demonstrating methodological issues that threatened the credibility of psychological research, and made recommendations to address them. One issue was the small, even dramatically inadequate, size of samples used in studies published by leading journals. The present study assessed the progress made since the Task Force's final report in 1999. Sample sizes reported in four leading APA journals in 1955, 1977, 1995, and 2006 were compared using nonparametric statistics, while data from the last two waves were fit to a hierarchical generalized linear growth model for more in-depth analysis. Overall, results indicate that the recommendations for increasing sample sizes have not been integrated in core psychological research, although results slightly vary by field. This and other implications are discussed in the context of current methodological critique and practice.

  8. Estimation of individual reference intervals in small sample sizes

    DEFF Research Database (Denmark)

    Hansen, Ase Marie; Garde, Anne Helene; Eller, Nanna Hurwitz

    2007-01-01

    of that order of magnitude for all topics in question. Therefore, new methods to estimate reference intervals for small sample sizes are needed. We present an alternative method based on variance component models. The models are based on data from 37 men and 84 women taking into account biological variation...... presented in this study. The presented method enables occupational health researchers to calculate reference intervals for specific groups, i.e. smokers versus non-smokers, etc. In conclusion, the variance component models provide an appropriate tool to estimate reference intervals based on small sample...

  9. 7 CFR 52.803 - Sample unit size.

    Science.gov (United States)

    2010-01-01

    Title 7 (Agriculture), 2010 edition. United States Standards for Grades of Frozen Red Tart Pitted Cherries, Sample Unit Size, § 52.803 Sample unit size. Compliance with requirements for size and the various quality factors is based on the...

  10. 7 CFR 52.775 - Sample unit size.

    Science.gov (United States)

    2010-01-01

    Title 7 (Agriculture), 2010 edition. United States Standards for Grades of Canned Red Tart Pitted Cherries, Sample Unit Size, § 52.775 Sample unit size. Compliance with requirements for the size and the various quality factors is based on the...

  11. The choice of sample size for mortality forecasting : A Bayesian learning approach

    NARCIS (Netherlands)

    Li, Hong; De Waegenaere, Anja; Melenberg, Bertrand

    2015-01-01

    Forecasted mortality rates using mortality models proposed in the recent literature are sensitive to the sample size. In this paper we propose a method based on Bayesian learning to determine model-specific posterior distributions of the sample sizes. In particular, the sample size is included as an

  12. Sample size calculation for meta-epidemiological studies.

    Science.gov (United States)

    Giraudeau, Bruno; Higgins, Julian P T; Tavernier, Elsa; Trinquart, Ludovic

    2016-01-30

    Meta-epidemiological studies are used to compare treatment effect estimates between randomized clinical trials with and without a characteristic of interest. To our knowledge, there is presently nothing to help researchers to a priori specify the required number of meta-analyses to be included in a meta-epidemiological study. We derived a theoretical power function and sample size formula in the framework of a hierarchical model that allows for variation in the impact of the characteristic between trials within a meta-analysis and between meta-analyses. A simulation study revealed that the theoretical function overestimated power (because of the assumption of equal weights for each trial within and between meta-analyses). We also propose a simulation approach that allows for relaxing the constraints used in the theoretical approach and is more accurate. We illustrate that the two variables that mostly influence power are the number of trials per meta-analysis and the proportion of trials with the characteristic of interest. We derived a closed-form power function and sample size formula for estimating the impact of trial characteristics in meta-epidemiological studies. Our analytical results can be used as a 'rule of thumb' for sample size calculation for a meta-epidemiologic study. A more accurate sample size can be derived with a simulation study.

  13. (Sample) Size Matters: Defining Error in Planktic Foraminiferal Isotope Measurement

    Science.gov (United States)

    Lowery, C.; Fraass, A. J.

    2015-12-01

    Planktic foraminifera have been used as carriers of stable isotopic signals since the pioneering work of Urey and Emiliani. In those heady days, instrumental limitations required hundreds of individual foraminiferal tests to return a usable value. This had the fortunate side-effect of smoothing any seasonal to decadal changes within the planktic foram population, which generally turns over monthly, removing that potential noise from each sample. With the advent of more sensitive mass spectrometers, smaller sample sizes have now become standard. This has been a tremendous advantage, allowing longer time series with the same investment of time and energy. Unfortunately, the use of smaller numbers of individuals to generate a data point has lessened the amount of time averaging in the isotopic analysis and decreased precision in paleoceanographic datasets. With fewer individuals per sample, the differences between individual specimens will result in larger variation, and therefore error, and less precise values for each sample. Unfortunately, most workers (the authors included) do not make a habit of reporting the error associated with their sample size. We have created an open-source model in R to quantify the effect of sample sizes under various realistic and highly modifiable parameters (calcification depth, diagenesis in a subset of the population, improper identification, vital effects, mass, etc.). For example, a sample in which only 1 in 10 specimens is diagenetically altered can be off by >0.3‰ δ18O VPDB or ~1°C. Additionally, and perhaps more importantly, we show that under unrealistically ideal conditions (perfect preservation, etc.) it takes ~5 individuals from the mixed-layer to achieve an error of less than 0.1‰. Including just the unavoidable vital effects inflates that number to ~10 individuals to achieve ~0.1‰. Combining these errors with the typical machine error inherent in mass spectrometers makes this a vital consideration moving forward.
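
    The authors' model is written in R and covers many more parameters; a back-of-the-envelope Monte Carlo in the same spirit, with entirely hypothetical isotope values, shows how the scatter of the sample mean shrinks as more individuals are pooled and how a diagenetically altered subset inflates it.

```python
# Back-of-the-envelope Monte Carlo, not the authors' R model; all values hypothetical.
import numpy as np

rng = np.random.default_rng(3)

def mean_d18o_scatter(n_individuals, frac_altered=0.1, offset=3.0, spread=0.3, n_sim=5000):
    """Standard deviation of the sample-mean d18O across simulated picks of n individuals."""
    means = np.empty(n_sim)
    for i in range(n_sim):
        vals = rng.normal(0.0, spread, size=n_individuals)
        altered = rng.random(n_individuals) < frac_altered
        vals[altered] += offset                       # diagenetic overprint on a subset
        means[i] = vals.mean()
    return means.std()

for n in (1, 5, 10, 30):
    print(n, round(mean_d18o_scatter(n), 2))
```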

  14. On bootstrap sample size in extreme value theory

    NARCIS (Netherlands)

    J.L. Geluk (Jaap); L.F.M. de Haan (Laurens)

    2002-01-01

    It has been known for a long time that for bootstrapping the probability distribution of the maximum of a sample consistently, the bootstrap sample size needs to be of smaller order than the original sample size. See Jun Shao and Dongsheng Tu (1995), Ex. 3.9, p. 123. We show that the same
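
    A quick numerical illustration of the m-out-of-n point (not the paper's theory): with resamples of the full size n, the bootstrap distribution of the maximum degenerates onto the observed maximum, whereas resamples of size about sqrt(n) do not. The exponential data and sizes below are arbitrary.

```python
# Numerical illustration of the m-out-of-n bootstrap for the sample maximum; the
# data and the choice m = sqrt(n) are illustrative only.
import numpy as np

rng = np.random.default_rng(11)
n = 5_000
x = rng.exponential(size=n)

def bootstrap_max(sample, m, n_boot=1000):
    idx = rng.integers(0, len(sample), size=(n_boot, m))
    return sample[idx].max(axis=1)

naive = bootstrap_max(x, m=n)                     # m = n: piles up on max(x)
reduced = bootstrap_max(x, m=int(np.sqrt(n)))     # m << n: non-degenerate
print((naive == x.max()).mean(), (reduced == x.max()).mean())
```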

  15. Sample size determination in clinical trials with multiple endpoints

    CERN Document Server

    Sozu, Takashi; Hamasaki, Toshimitsu; Evans, Scott R

    2015-01-01

    This book integrates recent methodological developments for calculating the sample size and power in trials with more than one endpoint considered as multiple primary or co-primary, offering an important reference work for statisticians working in this area. The determination of sample size and the evaluation of power are fundamental and critical elements in the design of clinical trials. If the sample size is too small, important effects may go unnoticed; if the sample size is too large, it represents a waste of resources and unethically puts more participants at risk than necessary. Recently many clinical trials have been designed with more than one endpoint considered as multiple primary or co-primary, creating a need for new approaches to the design and analysis of these clinical trials. The book focuses on the evaluation of power and sample size determination when comparing the effects of two interventions in superiority clinical trials with multiple endpoints. Methods for sample size calculation in clin...

  16. Simple and multiple linear regression: sample size considerations.

    Science.gov (United States)

    Hanley, James A

    2016-11-01

    The suggested "two subjects per variable" (2SPV) rule of thumb in the Austin and Steyerberg article is a chance to bring out some long-established and quite intuitive sample size considerations for both simple and multiple linear regression. This article distinguishes two of the major uses of regression models that imply very different sample size considerations, neither served well by the 2SPV rule. The first is etiological research, which contrasts mean Y levels at differing "exposure" (X) values and thus tends to focus on a single regression coefficient, possibly adjusted for confounders. The second research genre guides clinical practice. It addresses Y levels for individuals with different covariate patterns or "profiles." It focuses on the profile-specific (mean) Y levels themselves, estimating them via linear compounds of regression coefficients and covariates. By drawing on long-established closed-form variance formulae that lie beneath the standard errors in multiple regression, and by rearranging them for heuristic purposes, one arrives at quite intuitive sample size considerations for both research genres. Copyright © 2016 Elsevier Inc. All rights reserved.

  17. How Small Is Big: Sample Size and Skewness.

    Science.gov (United States)

    Piovesana, Adina; Senior, Graeme

    2016-09-21

    Sample sizes of 50 have been cited as sufficient to obtain stable means and standard deviations in normative test data. The influence of skewness on this minimum number, however, has not been evaluated. Normative test data with varying levels of skewness were compiled for 12 measures from 7 tests collected as part of ongoing normative studies in Brisbane, Australia. Means and standard deviations were computed from sample sizes of 10 to 100 drawn with replacement from larger samples of 272 to 973 cases. The minimum sample size was determined by the number at which both mean and standard deviation estimates remained within the 90% confidence intervals surrounding the population estimates. Sample sizes of greater than 85 were found to generate stable means and standard deviations regardless of the level of skewness, with smaller samples required in skewed distributions. A formula was derived to compute recommended sample size at differing levels of skewness.

  18. Cutoff sample size estimation for survival data: a simulation study

    OpenAIRE

    2014-01-01

    This thesis demonstrates the possible cutoff sample size point that balances goodness of estimation and study expenditure by a practical cancer case. As it is crucial to determine the sample size in designing an experiment, researchers attempt to find the suitable sample size that achieves desired power and budget efficiency at the same time. The thesis shows how simulation can be used for sample size and precision calculations with survival data. The presentation concentrates on the simula...

  19. Size Effect in Continuum Modeling

    Energy Technology Data Exchange (ETDEWEB)

    Lu, Wei-Yang [Sandia National Lab. (SNL-CA), Livermore, CA (United States). Mechanics of Materials; Foulk, James W. [Sandia National Lab. (SNL-CA), Livermore, CA (United States). Mechanics of Materials; Huestis, Edwin M. [Sandia National Lab. (SNL-CA), Livermore, CA (United States). Mechanics of Materials; Connelly, Kevin [Sandia National Lab. (SNL-CA), Livermore, CA (United States). Mechanics of Materials; Song, Bo [Sandia National Lab. (SNL-CA), Livermore, CA (United States). Mechanics of Materials; Yang, Nancy Y. C. [Sandia National Lab. (SNL-CA), Livermore, CA (United States). Engineered Materials

    2008-09-01

    The mechanical properties of some materials (Cu, Ni, Ag, etc.) have been shown to develop a strong dependence on geometric dimensions, resulting in a size effect. Several theories have been proposed to model size effects, but they have been based on very few experiments conducted at appropriate scales. Some experimental results implied that size effects are caused by increasing strain gradients and have been used to confirm many strain gradient theories. On the other hand, some recent experiments show that a size effect exists in the absence of strain gradients. This report describes a brief analytical and experimental study that tries to clarify the material and experimental issues surrounding the most influential size-effect experiments by Fleck et al. (1994). This effort is intended to improve the understanding of size effects in order to further develop predictive models.

  20. MetSizeR: selecting the optimal sample size for metabolomic studies using an analysis based approach

    Science.gov (United States)

    2013-01-01

    Background Determining sample sizes for metabolomic experiments is important but due to the complexity of these experiments, there are currently no standard methods for sample size estimation in metabolomics. Since pilot studies are rarely done in metabolomics, currently existing sample size estimation approaches which rely on pilot data can not be applied. Results In this article, an analysis based approach called MetSizeR is developed to estimate sample size for metabolomic experiments even when experimental pilot data are not available. The key motivation for MetSizeR is that it considers the type of analysis the researcher intends to use for data analysis when estimating sample size. MetSizeR uses information about the data analysis technique and prior expert knowledge of the metabolomic experiment to simulate pilot data from a statistical model. Permutation based techniques are then applied to the simulated pilot data to estimate the required sample size. Conclusions The MetSizeR methodology, and a publicly available software package which implements the approach, are illustrated through real metabolomic applications. Sample size estimates, informed by the intended statistical analysis technique, and the associated uncertainty are provided. PMID:24261687

  1. Publication bias in psychology: a diagnosis based on the correlation between effect size and sample size.

    Directory of Open Access Journals (Sweden)

    Anton Kühberger

    Full Text Available The p value obtained from a significance test provides no information about the magnitude or importance of the underlying phenomenon. Therefore, additional reporting of effect size is often recommended. Effect sizes are theoretically independent from sample size. Yet this may not hold true empirically: non-independence could indicate publication bias. We investigate whether effect size is independent from sample size in psychological research. We randomly sampled 1,000 psychological articles from all areas of psychological research. We extracted p values, effect sizes, and sample sizes of all empirical papers, and calculated the correlation between effect size and sample size, and investigated the distribution of p values. We found a negative correlation of r = -.45 [95% CI: -.53; -.35] between effect size and sample size. In addition, we found an inordinately high number of p values just passing the boundary of significance. Additional data showed that neither implicit nor explicit power analysis could account for this pattern of findings. The negative correlation between effect size and sample size, and the biased distribution of p values, indicate pervasive publication bias in the entire field of psychology.

  2. Publication Bias in Psychology: A Diagnosis Based on the Correlation between Effect Size and Sample Size

    Science.gov (United States)

    Kühberger, Anton; Fritz, Astrid; Scherndl, Thomas

    2014-01-01

    Background The p value obtained from a significance test provides no information about the magnitude or importance of the underlying phenomenon. Therefore, additional reporting of effect size is often recommended. Effect sizes are theoretically independent from sample size. Yet this may not hold true empirically: non-independence could indicate publication bias. Methods We investigate whether effect size is independent from sample size in psychological research. We randomly sampled 1,000 psychological articles from all areas of psychological research. We extracted p values, effect sizes, and sample sizes of all empirical papers, and calculated the correlation between effect size and sample size, and investigated the distribution of p values. Results We found a negative correlation of r = −.45 [95% CI: −.53; −.35] between effect size and sample size. In addition, we found an inordinately high number of p values just passing the boundary of significance. Additional data showed that neither implicit nor explicit power analysis could account for this pattern of findings. Conclusion The negative correlation between effect size and sample size, and the biased distribution of p values, indicate pervasive publication bias in the entire field of psychology. PMID:25192357

  3. 40 CFR 80.127 - Sample size guidelines.

    Science.gov (United States)

    2010-07-01

    Title 40 (Protection of Environment), 2010 edition. Environmental Protection Agency, Air Programs, Regulation of Fuels and Fuel Additives, Attest Engagements, § 80.127 Sample size guidelines. In performing...

  4. Approaches to sample size determination for multivariate data

    NARCIS (Netherlands)

    Saccenti, Edoardo; Timmerman, Marieke E.

    2016-01-01

    Sample size determination is a fundamental step in the design of experiments. Methods for sample size determination are abundant for univariate analysis methods, but scarce in the multivariate case. Omics data are multivariate in nature and are commonly investigated using multivariate statistical

  5. Sample Size Requirements for Estimating Pearson, Spearman and Kendall Correlations.

    Science.gov (United States)

    Bonett, Douglas G.; Wright, Thomas A.

    2000-01-01

    Reviews interval estimates of the Pearson, Kendall tau-alpha, and Spearman correlations and proposes an improved standard error for the Spearman correlation. Examines the sample size required to yield a confidence interval having the desired width. Findings show accurate results from a two-stage approximation to the sample size. (SLD)
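
    The article's improved Spearman standard error is not reproduced in the record; for the Pearson case, the width-based calculation it discusses is commonly approximated through the Fisher z transformation, as in the sketch below with hypothetical planning values.

```python
# Fisher-z approximation for the sample size giving a desired confidence-interval
# width for a Pearson correlation; the article's Spearman refinement is not included.
import math
from scipy.stats import norm

def ci_width_pearson(r, n, confidence=0.95):
    z = norm.ppf(0.5 + confidence / 2)
    zr = math.atanh(r)
    lo, hi = math.tanh(zr - z / math.sqrt(n - 3)), math.tanh(zr + z / math.sqrt(n - 3))
    return hi - lo

def n_for_width(r, width, confidence=0.95):
    n = 4
    while ci_width_pearson(r, n, confidence) > width:
        n += 1
    return n

print(n_for_width(r=0.4, width=0.2))   # planning value r = 0.4, desired total width 0.2
```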

  6. Determination of sample size in genome-scale RNAi screens.

    Science.gov (United States)

    Zhang, Xiaohua Douglas; Heyse, Joseph F

    2009-04-01

    For genome-scale RNAi research, it is critical to investigate the sample size required to achieve a reasonably low false negative rate (FNR) and false positive rate. The analysis in this article reveals that current sample size design practice contributes to the occurrence of low signal-to-noise ratios in genome-scale RNAi projects. The analysis suggests that (i) an arrangement of 16 wells per plate is acceptable and an arrangement of 20-24 wells per plate is preferable for a negative control to be used for hit selection in a primary screen without replicates; (ii) in a confirmatory screen or a primary screen with replicates, a sample size of 3 is not large enough, and there is a large reduction in FNRs when sample size increases from 3 to 4. To find a tradeoff between benefit and cost, any sample size between 4 and 11 is a reasonable choice. If the main focus is the selection of siRNAs with strong effects, a sample size of 4 or 5 is a good choice. If we want to have enough power to detect siRNAs with moderate effects, the sample size needs to be 8, 9, 10 or 11. These discoveries about sample size bring insight to the design of genome-scale RNAi screen experiments.

  7. Optimized Heart Sampling and Systematic Evaluation of Cardiac Therapies in Mouse Models of Ischemic Injury: Assessment of Cardiac Remodeling and Semi-Automated Quantification of Myocardial Infarct Size.

    Science.gov (United States)

    Valente, Mariana; Araújo, Ana; Esteves, Tiago; Laundos, Tiago L; Freire, Ana G; Quelhas, Pedro; Pinto-do-Ó, Perpétua; Nascimento, Diana S

    2015-12-02

    Cardiac therapies are commonly tested preclinically in small-animal models of myocardial infarction. Following functional evaluation, post-mortem histological analysis is essential to assess morphological and molecular alterations underlying the effectiveness of treatment. However, non-methodical and inadequate sampling of the left ventricle often leads to misinterpretations and variability, making direct study comparisons unreliable. Protocols are provided for representative sampling of the ischemic mouse heart followed by morphometric analysis of the left ventricle. Extending the use of this sampling to other types of in situ analysis is also illustrated through the assessment of neovascularization and cellular engraftment in a cell-based therapy setting. This is of interest to the general cardiovascular research community as it details methods for standardization and simplification of histo-morphometric evaluation of emergent heart therapies. © 2015 by John Wiley & Sons, Inc. Copyright © 2015 John Wiley & Sons, Inc.

  8. [Effect sizes, statistical power and sample sizes in "the Japanese Journal of Psychology"].

    Science.gov (United States)

    Suzukawa, Yumi; Toyoda, Hideki

    2012-04-01

    This study analyzed the statistical power of research studies published in the "Japanese Journal of Psychology" in 2008 and 2009. Sample effect sizes and sample statistical powers were calculated for each statistical test and analyzed with respect to the analytical methods and the fields of the studies. The results show that in fields such as perception, cognition or learning, the effect sizes were relatively large, although the sample sizes were small. At the same time, because of the small sample sizes, some meaningful effects could not be detected. In the other fields, because of the large sample sizes, meaningless effects could be detected. This implies that researchers who could not obtain large enough effect sizes would use larger samples to obtain significant results.

  9. Sample Size Calculations for Population Size Estimation Studies Using Multiplier Methods With Respondent-Driven Sampling Surveys.

    Science.gov (United States)

    Fearon, Elizabeth; Chabata, Sungai T; Thompson, Jennifer A; Cowan, Frances M; Hargreaves, James R

    2017-09-14

    While guidance exists for obtaining population size estimates using multiplier methods with respondent-driven sampling surveys, we lack specific guidance for making sample size decisions. Our objective is to guide the design of multiplier-method population size estimation studies using respondent-driven sampling surveys so as to reduce the random error around the estimate obtained. The population size estimate is obtained by dividing the number of individuals receiving a service or the number of unique objects distributed (M) by the proportion of individuals in a representative survey who report receipt of the service or object (P). We have developed an approach to sample size calculation, interpreting methods to estimate the variance around estimates obtained using multiplier methods in conjunction with research into design effects and respondent-driven sampling. We describe an application to estimate the number of female sex workers in Harare, Zimbabwe. There is high variance in estimates. Random error around the size estimate reflects uncertainty from M and P, particularly when the estimate of P in the respondent-driven sampling survey is low. As expected, sample size requirements are higher when the design effect of the survey is assumed to be greater. We suggest a method for investigating the effects of sample size on the precision of a population size estimate obtained using multiplier methods and respondent-driven sampling. Uncertainty in the size estimate is high, particularly when P is small, so, balancing against other potential sources of bias, we advise researchers to consider longer service-attendance reference periods and to distribute more unique objects, which is likely to result in a higher estimate of P in the respondent-driven sampling survey.
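
    The estimator itself is the simple ratio N = M / P; a delta-method sketch of the estimate and its interval, inflating the variance of P by an assumed design effect for the respondent-driven sample, is shown below. The numbers are hypothetical and this is not the authors' exact variance procedure.

```python
# Delta-method sketch for the multiplier estimate N = M / P with an assumed RDS
# design effect; hypothetical inputs, not the authors' exact procedure.
import math
from scipy.stats import norm

def multiplier_estimate(M, p_hat, n_survey, design_effect=2.0, confidence=0.95):
    N_hat = M / p_hat
    var_p = design_effect * p_hat * (1 - p_hat) / n_survey
    se_N = M * math.sqrt(var_p) / p_hat**2            # delta method for g(P) = M / P
    z = norm.ppf(0.5 + confidence / 2)
    return N_hat, (N_hat - z * se_N, N_hat + z * se_N)

# e.g. 600 unique objects distributed; 20% of a 500-woman RDS sample report receipt.
print(multiplier_estimate(M=600, p_hat=0.20, n_survey=500))
```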

  10. Learning maximum entropy models from finite-size data sets: A fast data-driven algorithm allows sampling from the posterior distribution

    Science.gov (United States)

    Ferrari, Ulisse

    2016-08-01

    Maximum entropy models provide the least constrained probability distributions that reproduce statistical properties of experimental datasets. In this work we characterize the learning dynamics that maximizes the log-likelihood in the case of large but finite datasets. We first show how the steepest descent dynamics is not optimal as it is slowed down by the inhomogeneous curvature of the model parameters' space. We then provide a way for rectifying this space which relies only on dataset properties and does not require large computational efforts. We conclude by solving the long-time limit of the parameters' dynamics including the randomness generated by the systematic use of Gibbs sampling. In this stochastic framework, rather than converging to a fixed point, the dynamics reaches a stationary distribution, which for the rectified dynamics reproduces the posterior distribution of the parameters. We sum up all these insights in a "rectified" data-driven algorithm that is fast and, by sampling from the parameters' posterior, avoids both under- and overfitting along all the directions of the parameters' space. Through the learning of pairwise Ising models from the recording of a large population of retina neurons, we show how our algorithm outperforms the steepest descent method.

  11. Random-effects linear modeling and sample size tables for two special crossover designs of average bioequivalence studies: the four-period, two-sequence, two-formulation and six-period, three-sequence, three-formulation designs.

    Science.gov (United States)

    Diaz, Francisco J; Berg, Michel J; Krebill, Ron; Welty, Timothy; Gidal, Barry E; Alloway, Rita; Privitera, Michael

    2013-12-01

    Due to concern and debate in the epilepsy medical community and to the current interest of the US Food and Drug Administration (FDA) in revising approaches to the approval of generic drugs, the FDA is currently supporting ongoing bioequivalence studies of antiepileptic drugs, the EQUIGEN studies. During the design of these crossover studies, the researchers could not find commercial or non-commercial statistical software that quickly allowed computation of sample sizes for their designs, particularly software implementing the FDA requirement of using random-effects linear models for the analyses of bioequivalence studies. This article presents tables for sample-size evaluations of average bioequivalence studies based on the two crossover designs used in the EQUIGEN studies: the four-period, two-sequence, two-formulation design, and the six-period, three-sequence, three-formulation design. Sample-size computations assume that random-effects linear models are used in bioequivalence analyses with crossover designs. Random-effects linear models have been traditionally viewed by many pharmacologists and clinical researchers as just mathematical devices to analyze repeated-measures data. In contrast, a modern view of these models attributes an important mathematical role in theoretical formulations in personalized medicine to them, because these models not only have parameters that represent average patients, but also have parameters that represent individual patients. Moreover, the notation and language of random-effects linear models have evolved over the years. Thus, another goal of this article is to provide a presentation of the statistical modeling of data from bioequivalence studies that highlights the modern view of these models, with special emphasis on power analyses and sample-size computations.

  12. Modeling Size Polydisperse Granular Flows

    Science.gov (United States)

    Lueptow, Richard M.; Schlick, Conor P.; Isner, Austin B.; Umbanhowar, Paul B.; Ottino, Julio M.

    2014-11-01

    Modeling size segregation of granular materials has important applications in many industrial processes and geophysical phenomena. We have developed a continuum model for granular multi- and polydisperse size segregation based on flow kinematics, which we obtain from discrete element method (DEM) simulations. The segregation depends on dimensionless control parameters that are functions of flow rate, particle sizes, collisional diffusion coefficient, shear rate, and flowing layer depth. To test the theoretical approach, we model segregation in tri-disperse quasi-2D heap flow and log-normally distributed polydisperse quasi-2D chute flow. In both cases, the segregated particle size distributions match results from full-scale DEM simulations and experiments. While the theory was applied to size segregation in steady quasi-2D flows here, the approach can be readily generalized to include additional drivers of segregation such as density and shape as well as other geometries where the flow field can be characterized including rotating tumbler flow and three-dimensional bounded heap flow. Funded by The Dow Chemical Company and NSF Grant CMMI-1000469.

  13. Determination of the optimal sample size for a clinical trial accounting for the population size.

    Science.gov (United States)

    Stallard, Nigel; Miller, Frank; Day, Simon; Hee, Siew Wan; Madan, Jason; Zohar, Sarah; Posch, Martin

    2017-07-01

    The problem of choosing a sample size for a clinical trial is a very common one. In some settings, such as rare diseases or other small populations, the large sample sizes usually associated with the standard frequentist approach may be infeasible, suggesting that the sample size chosen should reflect the size of the population under consideration. Incorporation of the population size is possible in a decision-theoretic approach either explicitly by assuming that the population size is fixed and known, or implicitly through geometric discounting of the gain from future patients reflecting the expected population size. This paper develops such approaches. Building on previous work, an asymptotic expression is derived for the sample size for single and two-arm clinical trials in the general case of a clinical trial with a primary endpoint with a distribution of one parameter exponential family form that optimizes a utility function that quantifies the cost and gain per patient as a continuous function of this parameter. It is shown that as the size of the population, N, or the expected size, N*, in the case of geometric discounting, becomes large, the optimal trial size is O(N^(1/2)) or O(N*^(1/2)). The sample size obtained from the asymptotic expression is also compared with the exact optimal sample size in examples with responses with Bernoulli and Poisson distributions, showing that the asymptotic approximations can also be reasonable in relatively small sample sizes. © 2016 The Author. Biometrical Journal published by WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  14. Sample Size Determination: A Comparison of Attribute, Continuous Variable, and Cell Size Methods.

    Science.gov (United States)

    Clark, Philip M.

    1984-01-01

    Describes three methods of sample size determination, each having its use in the investigation of social science problems: Attribute method; Continuous Variable method; Galtung's Cell Size method. Statistical generalization, benefits of the cell size method (ease of use, trivariate analysis and trichotomized variables), and choice of method are…

  15. RNAseqPS: A Web Tool for Estimating Sample Size and Power for RNAseq Experiment

    OpenAIRE

    Yan Guo; Shilin Zhao; Chung-I Li; Quanhu Sheng; Yu Shyr

    2014-01-01

    Sample size and power determination is the first step in the experimental design of a successful study. Sample size and power calculation is required for applications for National Institutes of Health (NIH) funding. Sample size and power calculation is well established for traditional biological studies such as mouse model, genome wide association study (GWAS), and microarray studies. Recent developments in high-throughput sequencing technology have allowed RNAseq to replace microarray as the...

  16. Size selective sampling using mobile, 3D nanoporous membranes.

    Science.gov (United States)

    Randall, Christina L; Gillespie, Aubri; Singh, Siddarth; Leong, Timothy G; Gracias, David H

    2009-02-01

    We describe the fabrication of 3D membranes with precisely patterned surface nanoporosity and their utilization in size selective sampling. The membranes were self-assembled as porous cubes from lithographically fabricated 2D templates (Leong et al., Langmuir 23:8747-8751, 2007) with face dimensions of 200 microm, volumes of 8 nL, and monodisperse pores ranging in size from approximately 10 microm to 100 nm. As opposed to conventional sampling and filtration schemes where fluid is moved across a static membrane, we demonstrate sampling by instead moving the 3D nanoporous membrane through the fluid. This new scheme allows for straightforward sampling in small volumes, with little to no loss. Membranes with five porous faces and one open face were moved through fluids to sample and retain nanoscale beads and cells based on pore size. Additionally, cells retained within the membranes were subsequently cultured and multiplied using standard cell culture protocols upon retrieval.

  17. Calculating sample size in trials using historical controls.

    Science.gov (United States)

    Zhang, Song; Cao, Jing; Ahn, Chul

    2010-08-01

    Makuch and Simon [Sample size considerations for non-randomised comparative studies. J Chronic Dis 1980; 33: 175-81.] developed a sample size formula for historical control trials. When assessing power, they assumed the true control treatment effect to be equal to the observed effect from the historical control group. Many researchers have pointed out that the Makuch-Simon approach does not preserve the nominal power and type I error when considering the uncertainty in the true historical control treatment effect. Our objective is to develop a sample size formula that properly accounts for the underlying randomness in the observations from the historical control group. We reveal the extremely skewed nature in the distributions of power and type I error, obtained over all the random realizations of the historical control data. The skewness motivates us to derive a sample size formula that controls the percentiles, instead of the means, of the power and type I error. A closed-form sample size formula is developed to control arbitrary percentiles of power and type I error for historical control trials. A simulation study further demonstrates that this approach preserves the operational characteristics in a more realistic scenario where the population variances are unknown and replaced by sample variances. The closed-form sample size formula is derived for continuous outcomes. The formula is more complicated for binary or survival time outcomes. We have derived a closed-form sample size formula that controls the percentiles instead of means of power and type I error in historical control trials, which have extremely skewed distributions over all the possible realizations of historical control data.
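
    The skewness described above is easy to reproduce with a short simulation of our own construction (illustrative parameters, not the paper's settings): the observed historical control mean is treated as if it were the true control value, and the conditional power of a one-sided z-test of the new treatment arm against that fixed value is computed for many random realizations of the historical data.

    ```python
    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(1)
    mu0, delta, sigma = 0.0, 0.5, 1.0        # true control mean, treatment effect, SD (assumed)
    n_hist, n_new, alpha = 50, 32, 0.025     # historical and new-trial sizes (assumed)

    # Realizations of the observed historical mean, then the power of a z-test of the
    # new arm against that observed mean treated as a known constant.
    m0 = rng.normal(mu0, sigma / np.sqrt(n_hist), size=100_000)
    se = sigma / np.sqrt(n_new)
    cond_power = norm.sf(norm.isf(alpha) - (mu0 + delta - m0) / se)

    q10, q90 = np.percentile(cond_power, [10, 90])
    print(f"mean {cond_power.mean():.2f}, median {np.median(cond_power):.2f}, "
          f"10th-90th percentile {q10:.2f}-{q90:.2f}")
    ```

    The resulting distribution of conditional power is bounded above by 1 and has a long lower tail, which is the feature that motivates controlling percentiles rather than means.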

  18. Modeling and Sizing of Supercapacitors

    Directory of Open Access Journals (Sweden)

    PETREUS, D.

    2008-06-01

    Full Text Available Faced with numerous challenges raised by the requirements of modern industries for higher power and higher energy, the study of supercapacitors has started playing an important role in offering viable solutions for some of these requirements. This paper presents surface redox reaction based modeling in order to study the origin of the high capacity of the EDLC (electrical double-layer capacitor) for a better understanding of the working principles of supercapacitors. Some application-dependent sizing methods are also presented, since proper sizing can increase the efficiency and the life cycle of supercapacitor based systems.

  19. Conservative Sample Size Determination for Repeated Measures Analysis of Covariance.

    Science.gov (United States)

    Morgan, Timothy M; Case, L Douglas

    2013-07-05

    In the design of a randomized clinical trial with one pre and multiple post randomized assessments of the outcome variable, one needs to account for the repeated measures in determining the appropriate sample size. Unfortunately, one seldom has a good estimate of the variance of the outcome measure, let alone the correlations among the measurements over time. We show how sample sizes can be calculated by making conservative assumptions regarding the correlations for a variety of covariance structures. The most conservative choice for the correlation depends on the covariance structure and the number of repeated measures. In the absence of good estimates of the correlations, the sample size is often based on a two-sample t-test, making the 'ultra' conservative and unrealistic assumption that there are zero correlations between the baseline and follow-up measures while at the same time assuming there are perfect correlations between the follow-up measures. Compared to the case of taking a single measurement, substantial savings in sample size can be realized by accounting for the repeated measures, even with very conservative assumptions regarding the parameters of the assumed correlation matrix. Assuming compound symmetry, the sample size from the two-sample t-test calculation can be reduced at least 44%, 56%, and 61% for repeated measures analysis of covariance by taking 2, 3, and 4 follow-up measures, respectively. The results offer a rational basis for determining a fairly conservative, yet efficient, sample size for clinical trials with repeated measures and a baseline value.
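
    The quoted savings can be reproduced from a single variance expression, under our reading of the setup (an assumption, not a quotation of the paper's derivation): for ANCOVA on the mean of k follow-up measurements with the baseline as covariate, under compound symmetry with correlation rho, the variance multiplier relative to a two-sample t-test is (1 + (k - 1) * rho) / k - rho^2, and the conservative choice of rho is the one that maximizes it.

    ```python
    import numpy as np

    def variance_factor(k, rho):
        """Residual variance of the follow-up mean after baseline adjustment,
        relative to sigma^2, under compound symmetry with correlation rho."""
        return (1 + (k - 1) * rho) / k - rho ** 2

    for k in (2, 3, 4):
        rho_grid = np.linspace(0, 1, 100_001)
        worst = variance_factor(k, rho_grid).max()   # most conservative correlation
        print(f"k={k}: worst-case factor {worst:.4f} "
              f"-> about {100 * (1 - worst):.0f}% smaller sample")
    # Prints reductions of roughly 44%, 56% and 61%, matching the figures above.
    ```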

  20. Publication Bias in Psychology: A Diagnosis Based on the Correlation between Effect Size and Sample Size: e105825

    National Research Council Canada - National Science Library

    Anton Kühberger; Astrid Fritz; Thomas Scherndl

    2014-01-01

    .... We extracted p values, effect sizes, and sample sizes of all empirical papers, and calculated the correlation between effect size and sample size, and investigated the distribution of p values...

  1. Estimating hidden population size using Respondent-Driven Sampling data.

    Science.gov (United States)

    Handcock, Mark S; Gile, Krista J; Mar, Corinne M

    Respondent-Driven Sampling (RDS) is an approach to sampling design and inference in hard-to-reach human populations. It is often used in situations where the target population is rare and/or stigmatized in the larger population, so that it is prohibitively expensive to contact them through the available frames. Common examples include injecting drug users, men who have sex with men, and female sex workers. Most analysis of RDS data has focused on estimating aggregate characteristics, such as disease prevalence. However, RDS is often conducted in settings where the population size is unknown and of great independent interest. This paper presents an approach to estimating the size of a target population based on data collected through RDS. The proposed approach uses a successive sampling approximation to RDS to leverage information in the ordered sequence of observed personal network sizes. The inference uses the Bayesian framework, allowing for the incorporation of prior knowledge. A flexible class of priors for the population size is used that aids elicitation. An extensive simulation study provides insight into the performance of the method for estimating population size under a broad range of conditions. A further study shows the approach also improves estimation of aggregate characteristics. Finally, the method demonstrates sensible results when used to estimate the size of known networked populations from the National Longitudinal Study of Adolescent Health, and when used to estimate the size of a hard-to-reach population at high risk for HIV.

  2. Sample-Size Planning for More Accurate Statistical Power: A Method Adjusting Sample Effect Sizes for Publication Bias and Uncertainty.

    Science.gov (United States)

    Anderson, Samantha F; Kelley, Ken; Maxwell, Scott E

    2017-09-01

    The sample size necessary to obtain a desired level of statistical power depends in part on the population value of the effect size, which is, by definition, unknown. A common approach to sample-size planning uses the sample effect size from a prior study as an estimate of the population value of the effect to be detected in the future study. Although this strategy is intuitively appealing, effect-size estimates, taken at face value, are typically not accurate estimates of the population effect size because of publication bias and uncertainty. We show that the use of this approach often results in underpowered studies, sometimes to an alarming degree. We present an alternative approach that adjusts sample effect sizes for bias and uncertainty, and we demonstrate its effectiveness for several experimental designs. Furthermore, we discuss an open-source R package, BUCSS, and user-friendly Web applications that we have made available to researchers so that they can easily implement our suggested methods.

  3. Sample size considerations for historical control studies with survival outcomes

    Science.gov (United States)

    Zhu, Hong; Zhang, Song; Ahn, Chul

    2015-01-01

    Historical control trials (HCTs) are frequently conducted to compare an experimental treatment with a control treatment from a previous study, when they are applicable and favored over a randomized clinical trial (RCT) due to feasibility, ethics and cost concerns. Makuch and Simon developed a sample size formula for historical control (HC) studies with binary outcomes, assuming that the observed response rate in the HC group is the true response rate. This method was extended by Dixon and Simon to specify sample size for HC studies comparing survival outcomes. For HC studies with binary and continuous outcomes, many researchers have shown that the popular Makuch and Simon method does not preserve the nominal power and type I error, and suggested alternative approaches. For HC studies with survival outcomes, we reveal through simulation that the conditional power and type I error over all the random realizations of the HC data have highly skewed distributions. Therefore, the sampling variability of the HC data needs to be appropriately accounted for in determining sample size. A flexible sample size formula that controls arbitrary percentiles, instead of means, of the conditional power and type I error, is derived. Although an explicit sample size formula with survival outcomes is not available, the computation is straightforward. Simulations demonstrate that the proposed method preserves the operational characteristics in a more realistic scenario where the true hazard rate of the HC group is unknown. A real data application of an advanced non-small cell lung cancer (NSCLC) clinical trial is presented to illustrate sample size considerations for HC studies in comparison of survival outcomes. PMID:26098200

  4. Current sample size conventions: Flaws, harms, and alternatives

    Directory of Open Access Journals (Sweden)

    Bacchetti Peter

    2010-03-01

    Full Text Available Abstract Background The belief remains widespread that medical research studies must have statistical power of at least 80% in order to be scientifically sound, and peer reviewers often question whether power is high enough. Discussion This requirement and the methods for meeting it have severe flaws. Notably, the true nature of how sample size influences a study's projected scientific or practical value precludes any meaningful blanket designation of a sample size as "inadequate". More rational alternatives include value of information methods, simple choices based on cost or feasibility that have recently been justified, sensitivity analyses that examine a meaningful array of possible findings, and following previous analogous studies. To promote more rational approaches, research training should cover the issues presented here, peer reviewers should be extremely careful before raising issues of "inadequate" sample size, and reports of completed studies should not discuss power. Summary Common conventions and expectations concerning sample size are deeply flawed, cause serious harm to the research process, and should be replaced by more rational alternatives.

  5. Power and Sample Size Calculations for Contrast Analysis in ANCOVA.

    Science.gov (United States)

    Shieh, Gwowen

    2017-01-01

    Analysis of covariance (ANCOVA) is commonly used in behavioral and educational research to reduce the error variance and improve the power of analysis of variance by adjusting the covariate effects. For planning and evaluating randomized ANCOVA designs, a simple sample-size formula has been proposed to account for the variance deflation factor in the comparison of two treatment groups. The objective of this article is to highlight an overlooked and potential problem of the existing approximation and to provide an alternative and exact solution of power and sample size assessments for testing treatment contrasts. Numerical investigations are conducted to reveal the relative performance of the two procedures as a reliable technique to accommodate the covariate features that make ANCOVA design particularly distinctive. The described approach has important advantages over the current method in general applicability, methodological justification, and overall accuracy. To enhance the practical usefulness, computer algorithms are presented to implement the recommended power calculations and sample-size determinations.

  6. Causality in Statistical Power: Isomorphic Properties of Measurement, Research Design, Effect Size, and Sample Size

    Science.gov (United States)

    Heidel, R. Eric

    2016-01-01

    Statistical power is the ability to detect a significant effect, given that the effect actually exists in a population. Like most statistical concepts, statistical power tends to induce cognitive dissonance in hepatology researchers. However, planning for statistical power by an a priori sample size calculation is of paramount importance when designing a research study. There are five specific empirical components that make up an a priori sample size calculation: the scale of measurement of the outcome, the research design, the magnitude of the effect size, the variance of the effect size, and the sample size. A framework grounded in the phenomenon of isomorphism, or interdependencies amongst different constructs with similar forms, will be presented to understand the isomorphic effects of decisions made on each of the five aforementioned components of statistical power. PMID:27073717

  7. Causality in Statistical Power: Isomorphic Properties of Measurement, Research Design, Effect Size, and Sample Size

    Directory of Open Access Journals (Sweden)

    R. Eric Heidel

    2016-01-01

    Full Text Available Statistical power is the ability to detect a significant effect, given that the effect actually exists in a population. Like most statistical concepts, statistical power tends to induce cognitive dissonance in hepatology researchers. However, planning for statistical power by an a priori sample size calculation is of paramount importance when designing a research study. There are five specific empirical components that make up an a priori sample size calculation: the scale of measurement of the outcome, the research design, the magnitude of the effect size, the variance of the effect size, and the sample size. A framework grounded in the phenomenon of isomorphism, or interdependencies amongst different constructs with similar forms, will be presented to understand the isomorphic effects of decisions made on each of the five aforementioned components of statistical power.

  8. Sample size in orthodontic randomized controlled trials: are numbers justified?

    Science.gov (United States)

    Koletsi, Despina; Pandis, Nikolaos; Fleming, Padhraig S

    2014-02-01

    Sample size calculations are advocated by the Consolidated Standards of Reporting Trials (CONSORT) group to justify sample sizes in randomized controlled trials (RCTs). This study aimed to analyse the reporting of sample size calculations in trials published as RCTs in orthodontic speciality journals. The performance of sample size calculations was assessed and calculations verified where possible. Related aspects, including number of authors; parallel, split-mouth, or other design; single- or multi-centre study; region of publication; type of data analysis (intention-to-treat or per-protocol basis); and number of participants recruited and lost to follow-up, were considered. Of 139 RCTs identified, complete sample size calculations were reported in 41 studies (29.5 per cent). Parallel designs were typically adopted (n = 113; 81 per cent), with 80 per cent (n = 111) involving two arms and 16 per cent having three arms. Data analysis was conducted on an intention-to-treat (ITT) basis in a small minority of studies (n = 18; 13 per cent). According to the calculations presented, overall, a median of 46 participants were required to demonstrate sufficient power to highlight meaningful differences (typically at a power of 80 per cent). The median number of participants recruited was 60, with a median of 4 participants being lost to follow-up. Our finding indicates good agreement between projected numbers required and those verified (median discrepancy: 5.3 per cent), although only a minority of trials (29.5 per cent) could be examined. Although sample size calculations are often reported in trials published as RCTs in orthodontic speciality journals, presentation is suboptimal and in need of significant improvement.

  9. On an Approach to Bayesian Sample Sizing in Clinical Trials

    CERN Document Server

    Muirhead, Robb J

    2012-01-01

    This paper explores an approach to Bayesian sample size determination in clinical trials. The approach falls into the category of what is often called "proper Bayesian", in that it does not mix frequentist concepts with Bayesian ones. A criterion for a "successful trial" is defined in terms of a posterior probability, its probability is assessed using the marginal distribution of the data, and this probability forms the basis for choosing sample sizes. We illustrate with a standard problem in clinical trials, that of establishing superiority of a new drug over a control.
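
    A minimal Monte Carlo sketch of this kind of "proper Bayesian" sizing, under assumptions of our own (a two-arm binary-response setting, conjugate Beta priors used both to generate the data and to analyse them, and an arbitrary success threshold): a trial is "successful" if the posterior probability that the new drug beats control exceeds 0.975, and the probability of success is evaluated under the marginal (prior predictive) distribution of the data for each candidate per-arm sample size.

    ```python
    import numpy as np

    rng = np.random.default_rng(7)

    def prob_success(n, a_t=4, b_t=2, a_c=2, b_c=2, threshold=0.975,
                     n_trials=2000, n_post=4000):
        """P(successful trial) under the prior predictive, for n patients per arm."""
        successes = 0
        for _ in range(n_trials):
            p_t, p_c = rng.beta(a_t, b_t), rng.beta(a_c, b_c)    # draw "truth" from the priors
            y_t, y_c = rng.binomial(n, p_t), rng.binomial(n, p_c)
            post_t = rng.beta(a_t + y_t, b_t + n - y_t, n_post)  # conjugate posterior draws
            post_c = rng.beta(a_c + y_c, b_c + n - y_c, n_post)
            if (post_t > post_c).mean() >= threshold:            # posterior P(p_t > p_c)
                successes += 1
        return successes / n_trials

    for n in (50, 100, 200, 400):
        print(f"n per arm {n:4d}: P(success) ~ {prob_success(n):.2f}")
    ```

    The smallest n whose probability of success reaches a pre-agreed target would then be the chosen sample size.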

  10. Sample Size Calculations for Precise Interval Estimation of the Eta-Squared Effect Size

    Science.gov (United States)

    Shieh, Gwowen

    2015-01-01

    Analysis of variance is one of the most frequently used statistical analyses in the behavioral, educational, and social sciences, and special attention has been paid to the selection and use of an appropriate effect size measure of association in analysis of variance. This article presents the sample size procedures for precise interval estimation…

  11. Sample size considerations for clinical research studies in nuclear cardiology.

    Science.gov (United States)

    Chiuzan, Cody; West, Erin A; Duong, Jimmy; Cheung, Ken Y K; Einstein, Andrew J

    2015-12-01

    Sample size calculation is an important element of research design that investigators need to consider in the planning stage of the study. Funding agencies and research review panels request a power analysis, for example, to determine the minimum number of subjects needed for an experiment to be informative. Calculating the right sample size is crucial to gaining accurate information and ensures that research resources are used efficiently and ethically. The simple question "How many subjects do I need?" does not always have a simple answer. Before calculating the sample size requirements, a researcher must address several aspects, such as purpose of the research (descriptive or comparative), type of samples (one or more groups), and data being collected (continuous or categorical). In this article, we describe some of the most frequent methods for calculating the sample size with examples from nuclear cardiology research, including for t tests, analysis of variance (ANOVA), non-parametric tests, correlation, Chi-squared tests, and survival analysis. For the ease of implementation, several examples are also illustrated via user-friendly free statistical software.
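
    As a concrete instance of the simplest calculation mentioned above, the snippet below uses statsmodels to solve a two-sample t-test design in both directions; the effect size, alpha and power targets are illustrative choices, not values from the article.

    ```python
    from statsmodels.stats.power import TTestIndPower

    analysis = TTestIndPower()

    # Per-group n needed to detect a medium standardized effect (d = 0.5)
    # with 80% power at a two-sided alpha of 0.05.
    n_per_group = analysis.solve_power(effect_size=0.5, power=0.80, alpha=0.05)
    print(f"n per group: {n_per_group:.1f}")   # roughly 64 per group

    # Power actually achieved if only 40 patients per group can be recruited.
    power_at_40 = analysis.solve_power(effect_size=0.5, nobs1=40, alpha=0.05)
    print(f"power with n = 40 per group: {power_at_40:.2f}")
    ```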

  12. Optimal and maximin sample sizes for multicentre cost-effectiveness trials.

    Science.gov (United States)

    Manju, Md Abu; Candel, Math J J M; Berger, Martijn P F

    2015-10-01

    This paper deals with the optimal sample sizes for a multicentre trial in which the cost-effectiveness of two treatments in terms of net monetary benefit is studied. A bivariate random-effects model, with the treatment-by-centre interaction effect being random and the main effect of centres fixed or random, is assumed to describe both costs and effects. The optimal sample sizes concern the number of centres and the number of individuals per centre in each of the treatment conditions. These numbers maximize the efficiency or power for given research costs or minimize the research costs at a desired level of efficiency or power. Information on model parameters and sampling costs is required to calculate these optimal sample sizes. In the case of limited information on relevant model parameters, sample size formulas are derived for so-called maximin sample sizes which guarantee a power level at the lowest study costs. Four different maximin sample sizes are derived based on the signs of the lower bounds of two model parameters, with one case being the worst compared to the others. We numerically evaluate the efficiency of this worst case rather than of the others. Finally, an expression is derived for calculating optimal and maximin sample sizes that yield sufficient power to test the cost-effectiveness of two treatments. © The Author(s) 2015.

  13. Consultants' forum: should post hoc sample size calculations be done?

    Science.gov (United States)

    Walters, Stephen J

    2009-01-01

    Pre-study sample size calculations for clinical trial research protocols are now mandatory. When an investigator is designing a study to compare the outcomes of an intervention, an essential step is the calculation of sample sizes that will allow a reasonable chance (power) of detecting a pre-determined difference (effect size) in the outcome variable, at a given level of statistical significance. Frequently studies will recruit fewer patients than the initial pre-study sample size calculation suggested. Investigators are faced with the fact that their study may be inadequately powered to detect the pre-specified treatment effect and the statistical analysis of the collected outcome data may or may not report a statistically significant result. If the data produces a "non-statistically significant result" then investigators are frequently tempted to ask the question "Given the actual final study size, what is the power of the study, now, to detect a treatment effect or difference?" The aim of this article is to debate whether or not it is desirable to answer this question and to undertake a power calculation, after the data have been collected and analysed.

  14. Sample size calculation in cost-effectiveness cluster randomized trials: optimal and maximin approaches.

    Science.gov (United States)

    Manju, Md Abu; Candel, Math J J M; Berger, Martijn P F

    2014-07-10

    In this paper, the optimal sample sizes at the cluster and person levels for each of two treatment arms are obtained for cluster randomized trials where the cost-effectiveness of treatments on a continuous scale is studied. The optimal sample sizes maximize the efficiency or power for a given budget or minimize the budget for a given efficiency or power. Optimal sample sizes require information on the intra-cluster correlations (ICCs) for effects and costs, the correlations between costs and effects at individual and cluster levels, the ratio of the variance of effects translated into costs to the variance of the costs (the variance ratio), sampling and measuring costs, and the budget. When planning a study, information on the model parameters is usually not available. To overcome this local optimality problem, the current paper also presents maximin sample sizes. The maximin sample sizes turn out to be rather robust against misspecifying the correlation between costs and effects at the cluster and individual levels but may lose much efficiency when misspecifying the variance ratio. The robustness of the maximin sample sizes against misspecifying the ICCs depends on the variance ratio. The maximin sample sizes are robust under misspecification of the ICC for costs for realistic values of the variance ratio greater than one but not robust under misspecification of the ICC for effects. Finally, we show how to calculate optimal or maximin sample sizes that yield sufficient power for a test on the cost-effectiveness of an intervention.

  15. Rock sampling. [method for controlling particle size distribution

    Science.gov (United States)

    Blum, P. (Inventor)

    1971-01-01

    A method for sampling rock and other brittle materials and for controlling resultant particle sizes is described. The method involves cutting grooves in the rock surface to provide a grouping of parallel ridges and subsequently machining the ridges to provide a powder specimen. The machining step may comprise milling, drilling, lathe cutting or the like; but a planing step is advantageous. Control of the particle size distribution is effected primarily by changing the height and width of these ridges. This control exceeds that obtainable by conventional grinding.

  16. Sample size cognizant detection of signals in white noise

    CERN Document Server

    Rao, N Raj

    2007-01-01

    The detection and estimation of signals in noisy, limited data is a problem of interest to many scientific and engineering communities. We present a computationally simple, sample eigenvalue based procedure for estimating the number of high-dimensional signals in white noise when there are relatively few samples. We highlight a fundamental asymptotic limit of sample eigenvalue based detection of weak high-dimensional signals from a limited sample size and discuss its implication for the detection of two closely spaced signals. This motivates our heuristic definition of the 'effective number of identifiable signals.' Numerical simulations are used to demonstrate the consistency of the algorithm with respect to the effective number of signals and the superior performance of the algorithm with respect to Wax and Kailath's "asymptotically consistent" MDL based estimator.
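
    The sketch below illustrates the basic idea of sample-eigenvalue-based detection under assumptions of our own; it is a simplified threshold rule, not the MDL-style estimator of the abstract. Data are p-dimensional white noise plus a few planted signals, and sample-covariance eigenvalues exceeding the approximate noise edge sigma^2 * (1 + sqrt(p/n))^2 are counted as signals; with few samples, weak signals fall below the edge and become effectively unidentifiable.

    ```python
    import numpy as np

    rng = np.random.default_rng(3)
    p, n, sigma2 = 100, 200, 1.0
    signal_snrs = [8.0, 3.0, 0.3]                   # the last planted signal is weak

    # White noise plus rank-one signals along random directions.
    X = rng.normal(scale=np.sqrt(sigma2), size=(n, p))
    for snr in signal_snrs:
        u = rng.normal(size=p)
        u /= np.linalg.norm(u)
        X += np.sqrt(snr) * rng.normal(size=(n, 1)) * u

    S = X.T @ X / n                                 # sample covariance
    eigvals = np.linalg.eigvalsh(S)[::-1]
    edge = sigma2 * (1 + np.sqrt(p / n)) ** 2       # approximate largest noise eigenvalue
    k_hat = int((eigvals > edge).sum())
    print("estimated number of signals:", k_hat, "(3 planted, one of them weak)")
    ```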

  17. Power and sample size in cost-effectiveness analysis.

    Science.gov (United States)

    Laska, E M; Meisner, M; Siegel, C

    1999-01-01

    For resource allocation under a constrained budget, optimal decision rules for mutually exclusive programs require that the treatment with the highest incremental cost-effectiveness ratio (ICER) below a willingness-to-pay (WTP) criterion be funded. This is equivalent to determining the treatment with the smallest net health cost. The designer of a cost-effectiveness study needs to select a sample size so that the power to reject the null hypothesis, the equality of the net health costs of two treatments, is high. A recently published formula derived under normal distribution theory overstates sample-size requirements. Using net health costs, the authors present simple methods for power analysis based on conventional normal and on nonparametric statistical theory.
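
    A hedged sketch of a standard normal-theory calculation on the net benefit scale (not necessarily the formula the authors derive): the incremental net monetary benefit is Delta_NMB = lambda * Delta_E - Delta_C, its per-patient variance follows from the variances of effects and costs and their correlation, and the usual two-sample formula gives the per-arm sample size. Every numerical input below is invented for illustration.

    ```python
    from scipy.stats import norm

    def n_per_arm(lam, dE, dC, sd_E, sd_C, rho, alpha=0.05, power=0.80):
        """Per-arm sample size to detect an incremental net monetary benefit."""
        d_nmb = lam * dE - dC                                   # incremental NMB
        var_nmb = lam**2 * sd_E**2 + sd_C**2 - 2 * lam * rho * sd_E * sd_C
        z = norm.isf(alpha / 2) + norm.isf(1 - power)
        return 2 * z**2 * var_nmb / d_nmb**2

    # Illustrative inputs: WTP of 20,000 per QALY, +0.05 QALYs, +500 in extra cost,
    # SDs of 0.2 QALYs and 2,000 in cost, weak cost-effect correlation.
    print(round(n_per_arm(lam=20_000, dE=0.05, dC=500, sd_E=0.2, sd_C=2_000, rho=0.1)))
    ```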

  18. Hydrophobicity of soil samples and soil size fractions

    Energy Technology Data Exchange (ETDEWEB)

    Lowen, H.A.; Dudas, M.J. [Alberta Univ., Edmonton, AB (Canada). Dept. of Renewable Resources; Roy, J.L. [Imperial Oil Resources Canada, Calgary, AB (Canada); Johnson, R.L. [Alberta Research Council, Vegreville, AB (Canada); McGill, W.B. [Alberta Univ., Edmonton, AB (Canada). Dept. of Renewable Resources

    2001-07-01

    The inability of dry soil to absorb water droplets within 10 seconds or less is defined as soil hydrophobicity. The severity, persistence and circumstances causing it vary greatly. There is a possibility that hydrophobicity in Alberta is a symptom of crude oil spills. In this study, the authors investigated the severity of soil hydrophobicity, as determined by the molarity of ethanol droplet (MED) test, and the dichloromethane extractable organic (DEO) concentration. The soil samples were collected from pedons within 12 hydrophobic soil sites, located northeast from Calgary to Cold Lake, Alberta. All the sites were located at an elevation ranging from 450 metres to 990 metres above sea level. The samples represented the Chernozemic, Gleysolic, Luvisolic, and Solonetzic soil orders. The results obtained indicated that MED and DEO were positively correlated in whole soil samples. No relationships were found between MED and DEO in soil samples divided into size fractions. More severe hydrophobicity and lower DEO concentrations were exhibited by the clay- and silt-sized particles of the less than 53 micrometre fraction than by the coarser fraction (between 53 and 2000 micrometres). It was concluded that hydrophobicity was not restricted to a particular soil particle size class. 5 refs., 4 figs.

  19. Using an EM Covariance Matrix to Estimate Structural Equation Models with Missing Data: Choosing an Adjusted Sample Size to Improve the Accuracy of Inferences

    Science.gov (United States)

    Enders, Craig K.; Peugh, James L.

    2004-01-01

    Two methods, direct maximum likelihood (ML) and the expectation maximization (EM) algorithm, can be used to obtain ML parameter estimates for structural equation models with missing data (MD). Although the 2 methods frequently produce identical parameter estimates, it may be easier to satisfy missing at random assumptions using EM. However, no…

  20. Sample size of the reference sample in a case-augmented study.

    Science.gov (United States)

    Ghosh, Palash; Dewanji, Anup

    2017-05-01

    The case-augmented study, in which a case sample is augmented with a reference (random) sample from the source population with only covariate information known, is becoming popular in different areas of applied science such as pharmacovigilance, ecology, and econometrics. In general, the case sample is available from some source (for example, hospital database, case registry, etc.); however, the reference sample is required to be drawn from the corresponding source population. The required minimum size of the reference sample is an important issue in this regard. In this work, we address the minimum sample size calculation and discuss related issues. Copyright © 2017 John Wiley & Sons, Ltd.

  1. Blinded sample size reestimation in non-inferiority trials with binary endpoints.

    Science.gov (United States)

    Friede, Tim; Mitchell, Charles; Müller-Velten, Günther

    2007-12-01

    Sample size calculations in the planning of clinical trials depend on good estimates of the model parameters involved. When the estimates of these parameters have a high degree of uncertainty attached to them, it is advantageous to reestimate the sample size after an internal pilot study. For non-inferiority trials with binary outcome we compare the performance of Type I error rate and power between fixed-size designs and designs with sample size reestimation. The latter design shows itself to be effective in correcting sample size and power of the tests when misspecification of nuisance parameters occurs with the former design. (c) 2007 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim

  2. Sample size for monitoring sirex populations and their natural enemies

    Directory of Open Access Journals (Sweden)

    Susete do Rocio Chiarello Penteado

    2016-09-01

    Full Text Available The woodwasp Sirex noctilio Fabricius (Hymenoptera: Siricidae) was introduced in Brazil in 1988 and became the main pest in pine plantations. It has spread to about 1,000,000 ha, at different population levels, in the states of Rio Grande do Sul, Santa Catarina, Paraná, São Paulo and Minas Gerais. Control is done mainly by using a nematode, Deladenus siricidicola Bedding (Nematoda: Neothylenchidae). The evaluation of the efficiency of natural enemies has been difficult because there are no appropriate sampling systems. This study tested a hierarchical sampling system to define the sample size to monitor the S. noctilio population and the efficiency of their natural enemies, which was found to be perfectly adequate.

  4. Hierarchical modeling of cluster size in wildlife surveys

    Science.gov (United States)

    Royle, J. Andrew

    2008-01-01

    Clusters or groups of individuals are the fundamental unit of observation in many wildlife sampling problems, including aerial surveys of waterfowl, marine mammals, and ungulates. Explicit accounting of cluster size in models for estimating abundance is necessary because detection of individuals within clusters is not independent and detectability of clusters is likely to increase with cluster size. This induces a cluster size bias in which the average cluster size in the sample is larger than in the population at large. Thus, failure to account for the relationship between detectability and cluster size will tend to yield a positive bias in estimates of abundance or density. I describe a hierarchical modeling framework for accounting for cluster-size bias in animal sampling. The hierarchical model consists of models for the observation process conditional on the cluster size distribution and the cluster size distribution conditional on the total number of clusters. Optionally, a spatial model can be specified that describes variation in the total number of clusters per sample unit. Parameter estimation, model selection, and criticism may be carried out using conventional likelihood-based methods. An extension of the model is described for the situation where measurable covariates at the level of the sample unit are available. Several candidate models within the proposed class are evaluated for aerial survey data on mallard ducks (Anas platyrhynchos).
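
    The size bias described above can be demonstrated with a simulation of our own construction (it is not the hierarchical model itself): clusters are detected with a probability that grows with their size, so the mean size among detected clusters exceeds the population mean, and weighting each detected cluster by the inverse of its detection probability removes the bias.

    ```python
    import numpy as np

    rng = np.random.default_rng(11)
    sizes = 1 + rng.poisson(3.0, size=20_000)     # true cluster sizes in the population
    p_detect = 1 - (1 - 0.25) ** sizes            # each member detected independently w.p. 0.25
    detected = rng.random(sizes.size) < p_detect
    obs = sizes[detected]

    w = 1.0 / p_detect[detected]                  # Horvitz-Thompson style weights
    print("population mean size:   ", round(float(sizes.mean()), 2))
    print("naive mean, detected:   ", round(float(obs.mean()), 2))
    print("detection-weighted mean:", round(float((w * obs).sum() / w.sum()), 2))
    ```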

  5. Sample size and precision in NIH peer review.

    Directory of Open Access Journals (Sweden)

    David Kaplan

    Full Text Available The Working Group on Peer Review of the Advisory Committee to the Director of NIH has recommended that at least 4 reviewers should be used to assess each grant application. A sample size analysis of the number of reviewers needed to evaluate grant applications reveals that a substantially larger number of evaluators are required to provide the level of precision that is currently mandated. NIH should adjust their peer review system to account for the number of reviewers needed to provide adequate precision in their evaluations.

  6. Comparing Server Energy Use and Efficiency Using Small Sample Sizes

    Energy Technology Data Exchange (ETDEWEB)

    Coles, Henry C.; Qin, Yong; Price, Phillip N.

    2014-11-01

    This report documents a demonstration that compared the energy consumption and efficiency of a limited sample size of server-type IT equipment from different manufacturers by measuring power at the server power supply power cords. The results are specific to the equipment and methods used. However, it is hoped that those responsible for IT equipment selection can use the methods described to choose models that optimize energy use efficiency. The demonstration was conducted in a data center at Lawrence Berkeley National Laboratory in Berkeley, California. It was performed with five servers of similar mechanical and electronic specifications; three from Intel and one each from Dell and Supermicro. Server IT equipment is constructed using commodity components, server manufacturer-designed assemblies, and control systems. Server compute efficiency is constrained by the commodity component specifications and integration requirements. The design freedom, outside of the commodity component constraints, provides room for the manufacturer to offer a product with competitive efficiency that meets market needs at a compelling price. A goal of the demonstration was to compare and quantify the server efficiency for three different brands. The efficiency is defined as the average compute rate (computations per unit of time) divided by the average energy consumption rate. The research team used an industry standard benchmark software package to provide a repeatable software load to obtain the compute rate and provide a variety of power consumption levels. Energy use when the servers were in an idle state (not providing computing work) was also measured. At high server compute loads, all brands, using the same key components (processors and memory), had similar results; therefore, from these results, it could not be concluded that one brand is more efficient than the other brands. The test results show that the power consumption variability caused by the key components as a

  7. Procedures manual for the recommended ARB (Air Resources Board) sized chemical sample method (cascade cyclones)

    Energy Technology Data Exchange (ETDEWEB)

    McCain, J.D.; Dawes, S.S.; Farthing, W.E.

    1986-05-01

    The report is Attachment No. 2 to the Final Report of ARB Contract A3-092-32 and provides a tutorial on the use of Cascade (Series) Cyclones to obtain size-fractionated particulate samples from industrial flue gases at stationary sources. The instrumentation and procedures described are designed to protect the purity of the collected samples so that post-test chemical analysis may be performed for organic and inorganic compounds, including instrumental analysis for trace elements. The instrumentation described collects bulk quantities for each of six size fractions over the range 10 to 0.4 micrometer diameter. The report describes the operating principles, calibration, and empirical modeling of small cyclone performance. It also discusses the preliminary calculations, operation, sample retrieval, and data analysis associated with the use of cyclones to obtain size-segregated samples and to measure particle-size distributions.

  8. Sample size reduction in groundwater surveys via sparse data assimilation

    KAUST Repository

    Hussain, Z.

    2013-04-01

    In this paper, we focus on sparse signal recovery methods for data assimilation in groundwater models. The objective of this work is to exploit the commonly understood spatial sparsity in hydrodynamic models and thereby reduce the number of measurements needed to image a dynamic groundwater profile. To achieve this we employ a Bayesian compressive sensing framework that lets us adaptively select the next measurement to reduce the estimation error. An extension to the Bayesian compressive sensing framework is also proposed which incorporates the additional model information to estimate system states from even fewer measurements. Instead of using cumulative imaging-like measurements, such as those used in standard compressive sensing, we use sparse binary matrices. This choice of measurements can be interpreted as randomly sampling only a small subset of dug wells at each time step, instead of sampling the entire grid. Therefore, this framework offers groundwater surveyors a significant reduction in surveying effort without compromising the quality of the survey. © 2013 IEEE.
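
    Below is a toy stand-in for the idea of recovering a spatially sparse field from a few sparse binary "which cells were sampled" measurements. It uses a nonnegative Lasso from scikit-learn rather than the adaptive Bayesian compressive sensing framework of the abstract, and every dimension and parameter is an arbitrary assumption.

    ```python
    import numpy as np
    from sklearn.linear_model import Lasso

    rng = np.random.default_rng(5)
    n_cells, n_meas, n_nonzero = 400, 80, 8

    # Sparse (mostly zero) anomaly over a grid of cells.
    x_true = np.zeros(n_cells)
    support = rng.choice(n_cells, n_nonzero, replace=False)
    x_true[support] = rng.uniform(1, 3, n_nonzero)

    # Each measurement sums a small random subset of cells (a sparse binary row),
    # mimicking sampling only a few wells per time step rather than the whole grid.
    A = (rng.random((n_meas, n_cells)) < 0.05).astype(float)
    y = A @ x_true + 0.01 * rng.normal(size=n_meas)

    x_hat = Lasso(alpha=0.01, positive=True, max_iter=50_000).fit(A, y).coef_
    found = set(np.flatnonzero(x_hat > 0.5)) & set(support)
    rel_err = np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true)
    print(f"recovered {len(found)} of {n_nonzero} nonzero cells, relative error {rel_err:.2f}")
    ```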

  9. Progressive prediction method for failure data with small sample size

    Institute of Scientific and Technical Information of China (English)

    WANG Zhi-hua; FU Hui-min; LIU Cheng-rui

    2011-01-01

    The small-sample prediction problem that commonly arises in reliability analysis is addressed in this paper with the progressive prediction method. The modeling and estimation procedure, as well as the forecast and confidence limit formulas of the progressive auto-regressive (PAR) method, are discussed in detail. The PAR model not only inherits the simple linear features of the auto-regressive (AR) model but is also applicable to nonlinear systems. An application is illustrated for predicting future fatigue failures of tantalum electrolytic capacitors. Forecasting results of the PAR model were compared with an auto-regressive moving average (ARMA) model, and the PAR method performs well and shows promise for future applications.

  10. Quantifying the degree of bias from using county-scale data in species distribution modeling: Can increasing sample size or using county-averaged environmental data reduce distributional overprediction?

    Science.gov (United States)

    Collins, Steven D; Abbott, John C; McIntyre, Nancy E

    2017-08-01

    Citizen-science databases have been used to develop species distribution models (SDMs), although many taxa may be only georeferenced to county. It is tacitly assumed that SDMs built from county-scale data should be less precise than those built with more accurate localities, but the extent of the bias is currently unknown. Our aims in this study were to illustrate the effects of using county-scale data on the spatial extent and accuracy of SDMs relative to true locality data and to compare potential compensatory methods (including increased sample size and using overall county environmental averages rather than point locality environmental data). To do so, we developed SDMs in maxent with PRISM-derived BIOCLIM parameters for 283 and 230 species of odonates (dragonflies and damselflies) and butterflies, respectively, for five subsets from the OdonataCentral and Butterflies and Moths of North America citizen-science databases: (1) a true locality dataset, (2) a corresponding sister dataset of county-centroid coordinates, (3) a dataset where the average environmental conditions within each county were assigned to each record, (4) a 50/50% mix of true localities and county-centroid coordinates, and (5) a 50/50% mix of true localities and records assigned the average environmental conditions within each county. These mixtures allowed us to quantify the degree of bias from county-scale data. Models developed with county centroids overpredicted the extent of suitable habitat by 15% on average compared to true locality models, although larger sample sizes (>100 locality records) reduced this disparity. Assigning county-averaged environmental conditions did not offer consistent improvement, however. Because county-level data are of limited value for developing SDMs except for species that are widespread and well collected or that inhabit regions where small, climatically uniform counties predominate, three means of encouraging more accurate georeferencing in citizen

  11. Modelling of Size Effect with Regularised Continua

    Directory of Open Access Journals (Sweden)

    H. Askes

    2004-01-01

    Full Text Available A nonlocal damage continuum and a viscoplastic damage continuum are used to model size effects. Three-point bending specimens are analysed, whereby a distinction is made between unnotched specimens, specimens with a constant notch and specimens with a proportionally scaled notch. Numerical finite element simulations have been performed for specimen sizes in a range of 1:64. Size effects are established in terms of nominal strength and compared to existing size effect models from the literature. 

  12. Enhancing sampling design in mist-net bat surveys by accounting for sample size optimization

    Science.gov (United States)

    Trevelin, Leonardo Carreira; Novaes, Roberto Leonan Morim; Colas-Rosas, Paul François; Benathar, Thayse Cristhina Melo; Peres, Carlos A.

    2017-01-01

    The advantages of mist-netting, the main technique used in Neotropical bat community studies to date, include logistical implementation, standardization and sampling representativeness. Nonetheless, study designs still have to deal with issues of detectability related to how different species behave and use the environment. Yet there is considerable sampling heterogeneity across available studies in the literature. Here, we approach the problem of sample size optimization. We evaluated the common sense hypothesis that the first six hours comprise the period of peak night activity for several species, thereby resulting in a representative sample for the whole night. To this end, we combined re-sampling techniques, species accumulation curves, threshold analysis, and community concordance of species compositional data, and applied them to datasets of three different Neotropical biomes (Amazonia, Atlantic Forest and Cerrado). We show that the strategy of restricting sampling to only six hours of the night frequently results in incomplete sampling representation of the entire bat community investigated. From a quantitative standpoint, results corroborated the existence of a major Sample Area effect in all datasets, although for the Amazonia dataset the six-hour strategy was significantly less species-rich after extrapolation, and for the Cerrado dataset it was more efficient. From the qualitative standpoint, however, results demonstrated that, for all three datasets, the identity of species that are effectively sampled will be inherently impacted by choices of sub-sampling schedule. We also propose an alternative six-hour sampling strategy (at the beginning and the end of a sample night) which performed better when resampling Amazonian and Atlantic Forest datasets on bat assemblages. Given the observed magnitude of our results, we propose that sample representativeness has to be carefully weighed against study objectives, and recommend that the trade-off between

  14. Size Matters: FTIR Spectral Analysis of Apollo Regolith Samples Exhibits Grain Size Dependence.

    Science.gov (United States)

    Martin, Dayl; Joy, Katherine; Pernet-Fisher, John; Wogelius, Roy; Morlok, Andreas; Hiesinger, Harald

    2017-04-01

    The MErcury Radiometer and Thermal Infrared Spectrometer (MERTIS) on the upcoming BepiColombo mission is designed to analyse the surface of Mercury in thermal infrared wavelengths (7-14 μm) to investigate the physical properties of the surface materials [1]. Laboratory analyses of analogue materials are useful for investigating how various sample properties alter the resulting infrared spectrum. Laboratory FTIR analysis of Apollo fine regolith samples shows that a high proportion of glassy material (>60%) causes a 'flattening' of the spectrum, with reflectance in the Reststrahlen Band (RB) region reduced by as much as 30% in comparison to samples that are dominated by a high proportion of crystalline material. Apollo 15401,147 is an immature regolith with a high proportion of volcanic glass pyroclastic beads [2]. The high mafic mineral content results in a systematic shift in the Christiansen Feature (CF - the point of lowest reflectance) to longer wavelength: 8.6 μm. The glass beads dominate the spectrum, displaying a broad peak around the main Si-O stretch band (at 10.8 μm). As such, individual mineral components of this sample cannot be resolved from the average spectrum alone. Apollo 67481,96 is a sub-mature regolith composed dominantly of anorthite plagioclase [2]. The CF position of the average spectrum is shifted to shorter wavelengths (8.2 μm) due to the higher proportion of felsic minerals. Its average spectrum is dominated by anorthite reflectance bands at 8.7, 9.1, 9.8, and 10.8 μm. The average reflectance is greater than in the other samples due to a lower proportion of glassy material. In each soil, the smallest fractions (0-25 and 25-63 μm) have CF positions 0.1-0.4 μm higher than the larger grain sizes. Also, the bulk-sample spectra most closely resemble the 0-25 μm sieved size fraction spectrum, indicating that this size fraction of each sample dominates the bulk spectrum regardless of other physical properties. This has implications for surface analyses of other Solar System bodies where some mineral phases or components

  15. RNAseqPS: A Web Tool for Estimating Sample Size and Power for RNAseq Experiment.

    Science.gov (United States)

    Guo, Yan; Zhao, Shilin; Li, Chung-I; Sheng, Quanhu; Shyr, Yu

    2014-01-01

    Sample size and power determination is the first step in the experimental design of a successful study. Sample size and power calculation is required for applications for National Institutes of Health (NIH) funding. Sample size and power calculation is well established for traditional biological studies such as mouse model, genome wide association study (GWAS), and microarray studies. Recent developments in high-throughput sequencing technology have allowed RNAseq to replace microarray as the technology of choice for high-throughput gene expression profiling. However, the sample size and power analysis of RNAseq technology is an underdeveloped area. Here, we present RNAseqPS, an advanced online RNAseq power and sample size calculation tool based on the Poisson and negative binomial distributions. RNAseqPS was built using the Shiny package in R. It provides an interactive graphical user interface that allows the users to easily conduct sample size and power analysis for RNAseq experimental design. RNAseqPS can be accessed directly at http://cqs.mc.vanderbilt.edu/shiny/RNAseqPS/.
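
    RNAseqPS itself is an online Shiny tool, but the flavour of a negative-binomial power calculation can be sketched with a small simulation. The per-gene test below (a Welch t-test on log counts) and all parameter values are illustrative assumptions, not the tool's actual procedure.

        import numpy as np
        from scipy import stats

        def nb_counts(rng, mu, dispersion, size):
            """Negative binomial counts via a gamma-Poisson mixture:
            mean mu, variance mu + dispersion * mu**2."""
            lam = rng.gamma(shape=1.0 / dispersion, scale=mu * dispersion, size=size)
            return rng.poisson(lam)

        def simulated_power(n_per_group, mu=100.0, fold_change=2.0, dispersion=0.2,
                            alpha=0.05, n_sims=2000, rng=None):
            """Fraction of simulated experiments in which a single differentially
            expressed gene is detected (crude Welch t-test on log2 counts)."""
            rng = np.random.default_rng(rng)
            hits = 0
            for _ in range(n_sims):
                g1 = nb_counts(rng, mu, dispersion, n_per_group)
                g2 = nb_counts(rng, mu * fold_change, dispersion, n_per_group)
                _, p = stats.ttest_ind(np.log2(g1 + 1.0), np.log2(g2 + 1.0),
                                       equal_var=False)
                hits += p < alpha
            return hits / n_sims

        for n in (3, 5, 10):
            print(n, round(simulated_power(n), 3))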

  16. A normative inference approach for optimal sample sizes in decisions from experience.

    Science.gov (United States)

    Ostwald, Dirk; Starke, Ludger; Hertwig, Ralph

    2015-01-01

    "Decisions from experience" (DFE) refers to a body of work that emerged in research on behavioral decision making over the last decade. One of the major experimental paradigms employed to study experience-based choice is the "sampling paradigm," which serves as a model of decision making under limited knowledge about the statistical structure of the world. In this paradigm respondents are presented with two payoff distributions, which, in contrast to standard approaches in behavioral economics, are specified not in terms of explicit outcome-probability information, but by the opportunity to sample outcomes from each distribution without economic consequences. Participants are encouraged to explore the distributions until they feel confident enough to decide from which they would prefer to draw from in a final trial involving real monetary payoffs. One commonly employed measure to characterize the behavior of participants in the sampling paradigm is the sample size, that is, the number of outcome draws which participants choose to obtain from each distribution prior to terminating sampling. A natural question that arises in this context concerns the "optimal" sample size, which could be used as a normative benchmark to evaluate human sampling behavior in DFE. In this theoretical study, we relate the DFE sampling paradigm to the classical statistical decision theoretic literature and, under a probabilistic inference assumption, evaluate optimal sample sizes for DFE. In our treatment we go beyond analytically established results by showing how the classical statistical decision theoretic framework can be used to derive optimal sample sizes under arbitrary, but numerically evaluable, constraints. Finally, we critically evaluate the value of deriving optimal sample sizes under this framework as testable predictions for the experimental study of sampling behavior in DFE.

  17. A normative inference approach for optimal sample sizes in decisions from experience

    Directory of Open Access Journals (Sweden)

    Dirk eOstwald

    2015-09-01

    Full Text Available Decisions from experience (DFE) refers to a body of work that emerged in research on behavioral decision making over the last decade. One of the major experimental paradigms employed to study experience-based choice is the sampling paradigm, which serves as a model of decision making under limited knowledge about the statistical structure of the world. In this paradigm respondents are presented with two payoff distributions, which, in contrast to standard approaches in behavioral economics, are specified not in terms of explicit outcome-probability information, but by the opportunity to sample outcomes from each distribution without economic consequences. Participants are encouraged to explore the distributions until they feel confident enough to decide from which they would prefer to draw in a final trial involving real monetary payoffs. One commonly employed measure to characterize the behavior of participants in the sampling paradigm is the sample size, that is, the number of outcome draws which participants choose to obtain from each distribution prior to terminating sampling. A natural question that arises in this context concerns the optimal sample size, which could be used as a normative benchmark to evaluate human sampling behavior in DFE. In this theoretical manuscript, we relate the DFE sampling paradigm to the classical statistical decision theoretic literature and, under a probabilistic inference assumption, evaluate optimal sample sizes for decisions from experience. In our treatment we go beyond analytically established results by showing how the classical statistical decision theoretic framework can be used to derive optimal sample sizes under arbitrary, but numerically evaluable, constraints. Finally, we critically evaluate the value of deriving optimal sample sizes under this framework as testable predictions for the experimental study of sampling behavior in DFE.

  18. Power and Sample Size Calculations for Logistic Regression Tests for Differential Item Functioning

    Science.gov (United States)

    Li, Zhushan

    2014-01-01

    Logistic regression is a popular method for detecting uniform and nonuniform differential item functioning (DIF) effects. Theoretical formulas for the power and sample size calculations are derived for likelihood ratio tests and Wald tests based on the asymptotic distribution of the maximum likelihood estimators for the logistic regression model.…

  20. Constrained statistical inference: sample-size tables for ANOVA and regression.

    Science.gov (United States)

    Vanbrabant, Leonard; Van De Schoot, Rens; Rosseel, Yves

    2014-01-01

    Researchers in the social and behavioral sciences often have clear expectations about the order/direction of the parameters in their statistical model. For example, a researcher might expect that regression coefficient β1 is larger than β2 and β3. The corresponding hypothesis is H: β1 > {β2, β3} and this is known as an (order) constrained hypothesis. A major advantage of testing such a hypothesis is that power can be gained and hence a smaller sample size is needed. This article discusses this gain in sample size reduction when an increasing number of constraints is included in the hypothesis. The main goal is to present sample-size tables for constrained hypotheses. A sample-size table contains the necessary sample size at a pre-specified power (say, 0.80) for an increasing number of constraints. To obtain sample-size tables, two Monte Carlo simulations were performed, one for ANOVA and one for multiple regression. Three results are salient. First, in an ANOVA the needed sample size decreases by 30-50% when complete ordering of the parameters is taken into account. Second, small deviations from the imposed order have only a minor impact on the power. Third, at the maximum number of constraints, the linear regression results are comparable with the ANOVA results. However, in the case of fewer constraints, ordering the parameters (e.g., β1 > β2) results in a higher power than assigning a positive or a negative sign to the parameters (e.g., β1 > 0).
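
    The power gain from exploiting an expected ordering can be illustrated with a small Monte Carlo experiment. The one-sided linear-trend contrast below is only a crude stand-in for formal order-constrained inference, and the group means, standard deviation and sample size are arbitrary.

        import numpy as np
        from scipy import stats

        def power_sim(n, means=(0.0, 0.25, 0.5), sd=1.0, alpha=0.05,
                      n_sims=5000, rng=None):
            """Compare the omnibus one-way ANOVA F-test with a one-sided
            linear-trend contrast that uses the expected order mu1 <= mu2 <= mu3."""
            rng = np.random.default_rng(rng)
            w = np.array([-1.0, 0.0, 1.0])              # trend contrast weights
            hits_f = hits_trend = 0
            for _ in range(n_sims):
                groups = [rng.normal(m, sd, n) for m in means]
                _, p_f = stats.f_oneway(*groups)        # unconstrained omnibus test
                hits_f += p_f < alpha
                est = sum(wi * g.mean() for wi, g in zip(w, groups))
                se = np.sqrt(sum(wi**2 * g.var(ddof=1) / n for wi, g in zip(w, groups)))
                p_trend = stats.norm.sf(est / se)       # one-sided, uses the ordering
                hits_trend += p_trend < alpha
            return hits_f / n_sims, hits_trend / n_sims

        print(power_sim(30))   # the directional test typically needs fewer subjects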

  1. Sample size re-estimation in paired comparative diagnostic accuracy studies with a binary response.

    Science.gov (United States)

    McCray, Gareth P J; Titman, Andrew C; Ghaneh, Paula; Lancaster, Gillian A

    2017-07-14

    The sample size required to power a study to a nominal level in a paired comparative diagnostic accuracy study, i.e. studies in which the diagnostic accuracy of two testing procedures is compared relative to a gold standard, depends on the conditional dependence between the two tests - the lower the dependence the greater the sample size required. A priori, we usually do not know the dependence between the two tests and thus cannot determine the exact sample size required. One option is to use the implied sample size for the maximal negative dependence, giving the largest possible sample size. However, this is potentially wasteful of resources and unnecessarily burdensome on study participants as the study is likely to be overpowered. A more accurate estimate of the sample size can be determined at a planned interim analysis point where the sample size is re-estimated. This paper discusses a sample size estimation and re-estimation method based on the maximum likelihood estimates, under an implied multinomial model, of the observed values of conditional dependence between the two tests and, if required, prevalence, at a planned interim. The method is illustrated by comparing the accuracy of two procedures for the detection of pancreatic cancer, one procedure using the standard battery of tests, and the other using the standard battery with the addition of a PET/CT scan all relative to the gold standard of a cell biopsy. Simulation of the proposed method illustrates its robustness under various conditions. The results show that the type I error rate of the overall experiment is stable using our suggested method and that the type II error rate is close to or above nominal. Furthermore, the instances in which the type II error rate is above nominal are in the situations where the lowest sample size is required, meaning a lower impact on the actual number of participants recruited. We recommend multinomial model maximum likelihood estimation of the conditional

  2. Species-genetic diversity correlations in habitat fragmentation can be biased by small sample sizes.

    Science.gov (United States)

    Nazareno, Alison G; Jump, Alistair S

    2012-06-01

    Predicted parallel impacts of habitat fragmentation on genes and species lie at the core of conservation biology, yet tests of this rule are rare. In a recent article in Ecology Letters, Struebig et al. (2011) report that declining genetic diversity accompanies declining species diversity in tropical forest fragments. However, this study estimates diversity in many populations through extrapolation from very small sample sizes. Using the data of this recent work, we show that results estimated from the smallest sample sizes drive the species-genetic diversity correlation (SGDC), owing to a false-positive association between habitat fragmentation and loss of genetic diversity. Small sample sizes are a persistent problem in habitat fragmentation studies, the results of which often do not fit simple theoretical models. It is essential, therefore, that data assessing the proposed SGDC are sufficient in order that conclusions be robust.

  3. A simple method for estimating genetic diversity in large populations from finite sample sizes

    Directory of Open Access Journals (Sweden)

    Rajora Om P

    2009-12-01

    Full Text Available Abstract Background Sample size is one of the critical factors affecting the accuracy of the estimation of population genetic diversity parameters. Small sample sizes often lead to significant errors in determining the allelic richness, which is one of the most important and commonly used estimators of genetic diversity in populations. Correct estimation of allelic richness in natural populations is challenging since they often do not conform to model assumptions. Here, we introduce a simple and robust approach to estimate the genetic diversity in large natural populations based on the empirical data for finite sample sizes. Results We developed a non-linear regression model to infer genetic diversity estimates in large natural populations from finite sample sizes. The allelic richness values predicted by our model were in good agreement with those observed in the simulated data sets and the true allelic richness observed in the source populations. The model has been validated using simulated population genetic data sets with different evolutionary scenarios implied in the simulated populations, as well as large microsatellite and allozyme experimental data sets for four conifer species with contrasting patterns of inherent genetic diversity and mating systems. Our model was a better predictor for allelic richness in natural populations than the widely-used Ewens sampling formula, coalescent approach, and rarefaction algorithm. Conclusions Our regression model was capable of accurately estimating allelic richness in natural populations regardless of the species and marker system. This regression modeling approach is free from assumptions and can be widely used for population genetic and conservation applications.
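
    For comparison, the classical rarefaction estimator that the regression model is benchmarked against can be written in a few lines; the allele counts used in the example are made up.

        import numpy as np
        from scipy.special import gammaln

        def rarefied_allelic_richness(allele_counts, g):
            """Expected number of distinct alleles in a random subsample of g gene
            copies, using the hypergeometric rarefaction formula."""
            counts = np.asarray(allele_counts, dtype=float)
            N = counts.sum()

            def log_comb(n, k):
                return gammaln(n + 1) - gammaln(k + 1) - gammaln(n - k + 1)

            # P(allele i absent from the subsample) = C(N - N_i, g) / C(N, g)
            p_absent = np.zeros_like(counts)
            possible = (N - counts) >= g      # otherwise the allele cannot be missed
            p_absent[possible] = np.exp(log_comb(N - counts[possible], g) - log_comb(N, g))
            return float(np.sum(1.0 - p_absent))

        # hypothetical locus: 4 alleles observed in 60 sampled gene copies
        print(rarefied_allelic_richness([30, 20, 8, 2], g=20))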

  4. On the repeated measures designs and sample sizes for randomized controlled trials.

    Science.gov (United States)

    Tango, Toshiro

    2016-04-01

    For the analysis of longitudinal or repeated measures data, generalized linear mixed-effects models provide a flexible and powerful tool to deal with heterogeneity among subject response profiles. However, the typical statistical design adopted in usual randomized controlled trials is an analysis of covariance type analysis using a pre-defined pair of "pre-post" data, in which pre-(baseline) data are used as a covariate for adjustment together with other covariates. Then, the major design issue is to calculate the sample size or the number of subjects allocated to each treatment group. In this paper, we propose a new repeated measures design and sample size calculations combined with generalized linear mixed-effects models that depend not only on the number of subjects but on the number of repeated measures before and after randomization per subject used for the analysis. The main advantages of the proposed design combined with the generalized linear mixed-effects models are (1) it can easily handle missing data by applying the likelihood-based ignorable analyses under the missing at random assumption and (2) it may lead to a reduction in sample size, compared with the simple pre-post design. The proposed designs and the sample size calculations are illustrated with real data arising from randomized controlled trials.

  5. Developing optimum sample size and multistage sampling plans for Lobesia botrana (Lepidoptera: Tortricidae) larval infestation and injury in northern Greece.

    Science.gov (United States)

    Ifoulis, A A; Savopoulou-Soultani, M

    2006-10-01

    The purpose of this research was to quantify the spatial pattern and develop a sampling program for larvae of Lobesia botrana Denis and Schiffermüller (Lepidoptera: Tortricidae), an important vineyard pest in northern Greece. Taylor's power law and Iwao's patchiness regression were used to model the relationship between the mean and the variance of larval counts. Analysis of covariance was carried out, separately for infestation and injury, with combined second and third generation data, for vine and half-vine sample units. Common regression coefficients were estimated to permit use of the sampling plan over a wide range of conditions. Optimum sample sizes for infestation and injury, at three levels of precision, were developed. An investigation of a multistage sampling plan with a nested analysis of variance showed that if the goal of sampling is focusing on larval infestation, three grape clusters should be sampled in a half-vine; if the goal of sampling is focusing on injury, then two grape clusters per half-vine are recommended.
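
    A minimal sketch of the enumerative part of such a plan: fit Taylor's power law to mean-variance pairs and convert it into the number of sample units needed for a fixed precision D (the ratio of the standard error to the mean). The mean-variance data below are invented, and the paper's multistage allocation is not reproduced.

        import numpy as np

        def fit_taylor(means, variances):
            """Fit Taylor's power law  s^2 = a * m^b  by log-log regression."""
            b, log_a = np.polyfit(np.log(means), np.log(variances), 1)
            return np.exp(log_a), b

        def optimum_sample_size(m, a, b, D=0.25):
            """Sample units needed so that SE/mean = D at mean density m:
            n = a * m**(b - 2) / D**2 (follows directly from s^2 = a * m^b)."""
            return a * m ** (b - 2) / D ** 2

        # invented mean/variance pairs of larval counts per sample unit
        means = np.array([0.4, 0.9, 1.8, 3.2, 5.5])
        variances = np.array([0.7, 2.0, 5.1, 11.0, 24.0])
        a, b = fit_taylor(means, variances)
        print(a, b, optimum_sample_size(2.0, a, b, D=0.25))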

  6. A Size-based Ecosystem Model

    DEFF Research Database (Denmark)

    Ravn-Jonsen, Lars

     Ecosystem Management requires models that can link the ecosystem level to the operation level. This link can be created by an ecosystem production model. Because the function of the individual fish in the marine ecosystem, seen in trophic context, is closely related to its size, the model groups...... fish according to size. The model summarises individual predation events into ecosystem level properties, and thereby uses the law of conversation of mass as a framework. This paper provides the background, the conceptual model, basic assumptions, integration of fishing activities, mathematical...... completion, and a numeric implementation. Using two experiments, the model's ability to act as tool for economic production analysis and regulation design testing is demonstrated. The presented model is the simplest possible and is built on the principles of (i) size, as the attribute that determines...

  7. Sample-size calculations for multi-group comparison in population pharmacokinetic experiments.

    Science.gov (United States)

    Ogungbenro, Kayode; Aarons, Leon

    2010-01-01

    This paper describes an approach for calculating sample size for population pharmacokinetic experiments that involve hypothesis testing based on multi-group comparison detecting the difference in parameters between groups under mixed-effects modelling. This approach extends what has been described for generalized linear models and nonlinear population pharmacokinetic models that involve only binary covariates to more complex nonlinear population pharmacokinetic models. The structural nonlinear model is linearized around the random effects to obtain the marginal model and the hypothesis testing involving model parameters is based on Wald's test. This approach provides an efficient and fast method for calculating sample size for hypothesis testing in population pharmacokinetic models. The approach can also handle different design problems such as unequal allocation of subjects to groups and unbalanced sampling times between and within groups. The results obtained following application to a one compartment intravenous bolus dose model that involved three different hypotheses under different scenarios showed good agreement between the power obtained from NONMEM simulations and nominal power.

  8. Sample size for cluster randomized trials: effect of coefficient of variation of cluster size and analysis method.

    Science.gov (United States)

    Eldridge, Sandra M; Ashby, Deborah; Kerry, Sally

    2006-10-01

    Cluster randomized trials are increasingly popular. In many of these trials, cluster sizes are unequal. This can affect trial power, but standard sample size formulae for these trials ignore this. Previous studies addressing this issue have mostly focused on continuous outcomes or methods that are sometimes difficult to use in practice. We show how a simple formula can be used to judge the possible effect of unequal cluster sizes for various types of analyses and both continuous and binary outcomes. We explore the practical estimation of the coefficient of variation of cluster size required in this formula and demonstrate the formula's performance for a hypothetical but typical trial randomizing UK general practices. The simple formula provides a good estimate of sample size requirements for trials analysed using cluster-level analyses weighting by cluster size and a conservative estimate for other types of analyses. For trials randomizing UK general practices the coefficient of variation of cluster size depends on variation in practice list size, variation in incidence or prevalence of the medical condition under examination, and practice and patient recruitment strategies, and for many trials is expected to be approximately 0.65. Individual-level analyses can be noticeably more efficient than some cluster-level analyses in this context. When the coefficient of variation is <0.23, the effect of adjustment for variable cluster size on sample size is negligible. Most trials randomizing UK general practices and many other cluster randomized trials should account for variable cluster size in their sample size calculations.
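
    An adjustment of the kind discussed above can be coded directly; the formula below, DE = 1 + ((cv^2 + 1)*m_bar - 1)*ICC, is the commonly cited variable-cluster-size design effect, and the example numbers are illustrative rather than taken from the paper.

        def design_effect(mean_cluster_size, icc, cv=0.0):
            """Design effect allowing for variable cluster size:
            DE = 1 + ((cv**2 + 1) * m_bar - 1) * icc."""
            return 1.0 + ((cv ** 2 + 1.0) * mean_cluster_size - 1.0) * icc

        def clusters_per_arm(n_individual, mean_cluster_size, icc, cv=0.0):
            """Inflate an individually randomised sample size and convert to clusters."""
            de = design_effect(mean_cluster_size, icc, cv)
            return n_individual * de / mean_cluster_size

        # e.g. 50 patients per practice, ICC = 0.02, cv of practice size = 0.65
        print(design_effect(50, 0.02, cv=0.0))    # about 1.98 if cluster sizes were equal
        print(design_effect(50, 0.02, cv=0.65))   # about 2.40 with unequal sizes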

  9. Threshold-dependent sample sizes for selenium assessment with stream fish tissue

    Science.gov (United States)

    Hitt, Nathaniel P.; Smith, David R.

    2015-01-01

    Natural resource managers are developing assessments of selenium (Se) contamination in freshwater ecosystems based on fish tissue concentrations. We evaluated the effects of sample size (i.e., number of fish per site) on the probability of correctly detecting mean whole-body Se values above a range of potential management thresholds. We modeled Se concentrations as gamma distributions with shape and scale parameters fitting an empirical mean-to-variance relationship in data from southwestern West Virginia, USA (63 collections, 382 individuals). We used parametric bootstrapping techniques to calculate statistical power as the probability of detecting true mean concentrations up to 3 mg Se/kg above management thresholds ranging from 4 to 8 mg Se/kg. Sample sizes required to achieve 80% power varied as a function of management thresholds and Type I error tolerance (α). Higher thresholds required more samples than lower thresholds because populations were more heterogeneous at higher mean Se levels. For instance, to assess a management threshold of 4 mg Se/kg, a sample of eight fish could detect an increase of approximately 1 mg Se/kg with 80% power (given α = 0.05), but this sample size would be unable to detect such an increase from a management threshold of 8 mg Se/kg with more than a coin-flip probability. Increasing α decreased sample size requirements to detect above-threshold mean Se concentrations with 80% power. For instance, at an α-level of 0.05, an 8-fish sample could detect an increase of approximately 2 units above a threshold of 8 mg Se/kg with 80% power, but when α was relaxed to 0.2, this sample size was more sensitive to increasing mean Se concentrations, allowing detection of an increase of approximately 1.2 units with equivalent power. Combining individuals into 2- and 4-fish composite samples for laboratory analysis did not decrease power because the reduced number of laboratory samples was compensated for by increased
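
    The parametric-bootstrap logic can be sketched as follows; the gamma mean-to-variance function used here is a placeholder, not the relationship fitted to the West Virginia data.

        import numpy as np
        from scipy import stats

        def power_above_threshold(n_fish, threshold, delta, var_fun,
                                  alpha=0.05, n_sims=5000, rng=None):
            """Probability that a one-sided, one-sample t-test detects a true mean
            of (threshold + delta) mg Se/kg, with samples drawn from a gamma
            distribution whose variance follows var_fun(mean)."""
            rng = np.random.default_rng(rng)
            mu = threshold + delta
            var = var_fun(mu)
            shape, scale = mu ** 2 / var, var / mu      # gamma parameterisation
            hits = 0
            for _ in range(n_sims):
                x = rng.gamma(shape, scale, size=n_fish)
                t = (x.mean() - threshold) / (x.std(ddof=1) / np.sqrt(n_fish))
                hits += stats.t.sf(t, n_fish - 1) < alpha
            return hits / n_sims

        # placeholder mean-to-variance relation: variance grows with the mean
        print(power_above_threshold(8, threshold=4.0, delta=1.0,
                                    var_fun=lambda m: 0.25 * m ** 1.5))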

  10. A web application for sample size and power calculation in case-control microbiome studies.

    Science.gov (United States)

    Mattiello, Federico; Verbist, Bie; Faust, Karoline; Raes, Jeroen; Shannon, William D; Bijnens, Luc; Thas, Olivier

    2016-07-01

    When designing a case-control study to investigate differences in microbial composition, it is fundamental to assess the sample sizes needed to detect a hypothesized difference with sufficient statistical power. Our application includes power calculation for (i) a recoded version of the two-sample generalized Wald test of the 'HMP' R-package for comparing community composition, and (ii) the Wilcoxon-Mann-Whitney test for comparing operational taxonomic unit-specific abundances between two samples (optional). The simulation-based power calculations make use of the Dirichlet-Multinomial model to describe and generate abundances. The web interface allows for easy specification of sample and effect sizes. As an illustration of our application, we compared the statistical power of the two tests, with and without stratification of samples. We observed that statistical power increases considerably when stratification is employed, meaning that fewer samples are needed to detect the same effect size with the same power. The web interface is written in R code using Shiny (RStudio Inc., 2016) and is available at https://fedematt.shinyapps.io/shinyMB. The R code for the recoded generalized Wald test can be found at https://github.com/mafed/msWaldHMP. Contact: Federico.Mattiello@UGent.be. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
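
    The simulation idea behind the per-OTU option can be sketched in a few lines; the Dirichlet concentrations, sequencing depth and group sizes below are invented for illustration and the code is not the application's actual implementation.

        import numpy as np
        from scipy import stats

        def dm_sample(rng, alpha, depth, n):
            """Draw n Dirichlet-Multinomial count vectors with concentration alpha."""
            probs = rng.dirichlet(alpha, size=n)
            return np.vstack([rng.multinomial(depth, p) for p in probs])

        def otu_power(n_per_group, alpha_ctrl, alpha_case, otu=0, depth=10000,
                      n_sims=500, level=0.05, rng=None):
            """Simulated power of a Wilcoxon-Mann-Whitney test on the relative
            abundance of one OTU in a two-group comparison."""
            rng = np.random.default_rng(rng)
            hits = 0
            for _ in range(n_sims):
                c = dm_sample(rng, alpha_ctrl, depth, n_per_group)
                t = dm_sample(rng, alpha_case, depth, n_per_group)
                rel_c = c[:, otu] / c.sum(axis=1)
                rel_t = t[:, otu] / t.sum(axis=1)
                _, p = stats.mannwhitneyu(rel_c, rel_t, alternative="two-sided")
                hits += p < level
            return hits / n_sims

        base = np.array([20.0, 10.0, 5.0, 5.0, 10.0])
        shift = base.copy()
        shift[0] *= 2            # doubled concentration for OTU 0 in the case group
        print(otu_power(25, base, shift))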

  11. Estimating variable effective population sizes from multiple genomes: a sequentially markov conditional sampling distribution approach.

    Science.gov (United States)

    Sheehan, Sara; Harris, Kelley; Song, Yun S

    2013-07-01

    Throughout history, the population size of modern humans has varied considerably due to changes in environment, culture, and technology. More accurate estimates of population size changes, and when they occurred, should provide a clearer picture of human colonization history and help remove confounding effects from natural selection inference. Demography influences the pattern of genetic variation in a population, and thus genomic data of multiple individuals sampled from one or more present-day populations contain valuable information about the past demographic history. Recently, Li and Durbin developed a coalescent-based hidden Markov model, called the pairwise sequentially Markovian coalescent (PSMC), for a pair of chromosomes (or one diploid individual) to estimate past population sizes. This is an efficient, useful approach, but its accuracy in the very recent past is hampered by the fact that, because of the small sample size, only few coalescence events occur in that period. Multiple genomes from the same population contain more information about the recent past, but are also more computationally challenging to study jointly in a coalescent framework. Here, we present a new coalescent-based method that can efficiently infer population size changes from multiple genomes, providing access to a new store of information about the recent past. Our work generalizes the recently developed sequentially Markov conditional sampling distribution framework, which provides an accurate approximation of the probability of observing a newly sampled haplotype given a set of previously sampled haplotypes. Simulation results demonstrate that we can accurately reconstruct the true population histories, with a significant improvement over the PSMC in the recent past. We apply our method, called diCal, to the genomes of multiple human individuals of European and African ancestry to obtain a detailed population size change history during recent times.

  12. Size variation in samples of fossil and recent murid teeth

    NARCIS (Netherlands)

    Freudenthal, M.; Martín Suárez, E.

    1990-01-01

    The variability coefficient proposed by Freudenthal & Cuenca Bescós (1984) for samples of fossil cricetid teeth, is calculated for about 200 samples of fossil and recent murid teeth. The results are discussed, and compared with those obtained for the Cricetidae.

  14. Modelling complete particle-size distributions from operator estimates of particle-size

    Science.gov (United States)

    Roberson, Sam; Weltje, Gert Jan

    2014-05-01

    Estimates of particle-size made by operators in the field and laboratory represent a vast and relatively untapped data archive. The wide spatial distribution of particle-size estimates makes them ideal for constructing geological models and soil maps. This study uses a large data set from the Netherlands (n = 4837) containing both operator estimates of particle size and complete particle-size distributions measured by laser granulometry. This study introduces a logit-based constrained-cubic-spline (CCS) algorithm to interpolate complete particle-size distributions from operator estimates. The CCS model is compared to four other models: (i) a linear interpolation; (ii) a log-hyperbolic interpolation; (iii) an empirical logistic function; and (iv) an empirical arctan function. Operator estimates were found to be both inaccurate and imprecise; only 14% of samples were successfully classified using the Dutch classification scheme for fine sediment. Operator estimates of sediment particle-size encompass the same range of values as particle-size distributions measured by laser analysis. However, the distributions measured by laser analysis show that most of the sand percentage values lie between zero and one, so the majority of the variability in the data is lost because operator estimates are made to the nearest 1% at best, and more frequently to the nearest 5%. A method for constructing complete particle-size distributions from operator estimates of sediment texture using a logit constrained cubic spline (CCS) interpolation algorithm is presented. This model and four other previously published methods are compared to establish the best approach to modelling particle-size distributions. The logit-CCS model is the most accurate method, although both logit-linear and log-linear interpolation models provide reasonable alternatives. Models based on empirical distribution functions are less accurate than interpolation algorithms for modelling particle-size distributions in

  15. Sample size for estimating the mean concentration of organisms in ballast water.

    Science.gov (United States)

    Costa, Eliardo G; Lopes, Rubens M; Singer, Julio M

    2016-09-15

    We consider the computation of sample sizes for estimating the mean concentration of organisms in ballast water. Given the possible heterogeneity of their distribution in the tank, we adopt a negative binomial model to obtain confidence intervals for the mean concentration. We show that the results obtained by Chen and Chen (2012) in a different set-up hold for the proposed model and use them to develop algorithms to compute sample sizes both in cases where the mean concentration is known to lie in some bounded interval or where there is no information about its range. We also construct simple diagrams that may be easily employed to decide for compliance with the D-2 regulation of the International Maritime Organization (IMO). Copyright © 2016 Elsevier Ltd. All rights reserved.
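
    A rough normal-approximation version of the calculation is sketched below; the exact negative-binomial intervals and the bounded-mean algorithms of the paper are not reproduced, and the numbers in the example are arbitrary.

        import math
        from scipy import stats

        def nb_sample_size(mu, k, half_width, conf=0.95):
            """Number of aliquots so that an approximate confidence interval for
            the mean organism concentration has the requested half-width, for
            negative binomial counts with mean mu and dispersion k
            (variance mu + mu**2 / k)."""
            z = stats.norm.ppf(0.5 + conf / 2.0)
            return math.ceil(z ** 2 * (mu + mu ** 2 / k) / half_width ** 2)

        # e.g. mean near the D-2 limit of 10 organisms/m3, strong clumping (k = 0.5)
        print(nb_sample_size(mu=10.0, k=0.5, half_width=2.0))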

  16. Exact sampling hardness of Ising spin models

    Science.gov (United States)

    Fefferman, B.; Foss-Feig, M.; Gorshkov, A. V.

    2017-09-01

    We study the complexity of classically sampling from the output distribution of an Ising spin model, which can be implemented naturally in a variety of atomic, molecular, and optical systems. In particular, we construct a specific example of an Ising Hamiltonian that, after time evolution starting from a trivial initial state, produces a particular output configuration with probability very nearly proportional to the square of the permanent of a matrix with arbitrary integer entries. In a similar spirit to boson sampling, the ability to sample classically from the probability distribution induced by time evolution under this Hamiltonian would imply unlikely complexity theoretic consequences, suggesting that the dynamics of such a spin model cannot be efficiently simulated with a classical computer. Physical Ising spin systems capable of achieving problem-size instances (i.e., qubit numbers) large enough so that classical sampling of the output distribution is classically difficult in practice may be achievable in the near future. Unlike boson sampling, our current results only imply hardness of exact classical sampling, leaving open the important question of whether a much stronger approximate-sampling hardness result holds in this context. The latter is most likely necessary to enable a convincing experimental demonstration of quantum supremacy. As referenced in a recent paper [A. Bouland, L. Mancinska, and X. Zhang, in Proceedings of the 31st Conference on Computational Complexity (CCC 2016), Leibniz International Proceedings in Informatics (Schloss Dagstuhl-Leibniz-Zentrum für Informatik, Dagstuhl, 2016)], our result completes the sampling hardness classification of two-qubit commuting Hamiltonians.

  17. Two Test Items to Explore High School Students' Beliefs of Sample Size When Sampling from Large Populations

    Science.gov (United States)

    Bill, Anthony; Henderson, Sally; Penman, John

    2010-01-01

    Two test items that examined high school students' beliefs of sample size for large populations using the context of opinion polls conducted prior to national and state elections were developed. A trial of the two items with 21 male and 33 female Year 9 students examined their naive understanding of sample size: over half of students chose a…

  18. Effect of sample size on the fluid flow through a single fractured granitoid

    Institute of Scientific and Technical Information of China (English)

    Kunal Kumar Singh; Devendra Narain Singh; Ranjith Pathegama Gamage

    2016-01-01

    Most deep geological engineered structures, such as rock caverns, nuclear waste disposal repositories, metro rail tunnels, and multi-layer underground parking, are constructed within hard crystalline rocks because of their high quality and low matrix permeability. In such rocks, fluid flows mainly through fractures. Quantification of fractures, along with the behavior of the fluid flow through them at different scales, becomes quite important. Earlier studies have revealed the influence of sample size on the confining stress-permeability relationship, and it has been demonstrated that the permeability of the fractured rock mass decreases with an increase in sample size. However, most researchers have employed numerical simulations to model fluid flow through the fracture/fracture network, or laboratory investigations on intact rock samples with diameters ranging between 38 mm and 45 cm and a diameter-to-length ratio of 1:2, using different experimental methods. Also, the confining stress, σ3, has been considered to be less than 30 MPa, and the effect of fracture roughness has been ignored. In the present study, an extension of the previous studies on "laboratory simulation of flow through single fractured granite" was conducted, in which consistent fluid flow experiments were performed on cylindrical samples of granitoids of two different sizes (38 mm and 54 mm in diameter), containing a "rough walled single fracture". These experiments were performed under varied confining pressure (σ3 = 5-40 MPa), fluid pressure (fp up to 25 MPa), and fracture roughness. The results indicate that a nonlinear relationship exists between the discharge, Q, and the effective confining pressure, σef, and that Q decreases with an increase in σef. Also, the effects of sample size and fracture roughness do not persist when σef ≥ 20 MPa. It is expected that such a study will be quite useful in correlating and extrapolating the laboratory scale investigations to in-situ scale and

  19. Utility of Inferential Norming with Smaller Sample Sizes

    Science.gov (United States)

    Zhu, Jianjun; Chen, Hsin-Yi

    2011-01-01

    We examined the utility of inferential norming using small samples drawn from the larger "Wechsler Intelligence Scales for Children-Fourth Edition" (WISC-IV) standardization data set. The quality of the norms was estimated with multiple indexes such as polynomial curve fit, percentage of cases receiving the same score, average absolute…

  20. Influence of macroinvertebrate sample size on bioassessment of streams

    NARCIS (Netherlands)

    Vlek, H.E.; Sporka, F.; Krno, I.

    2006-01-01

    In order to standardise biological assessment of surface waters in Europe, a standardised method for sampling, sorting and identification of benthic macroinvertebrates in running waters was developed during the AQEM project. The AQEM method has proved to be relatively time-consuming. Hence, this stu

  1. Finite-sample-size effects on convection in mushy layers

    CERN Document Server

    Zhong, Jin-Qiang; Wells, Andrew J; Wettlaufer, John S

    2012-01-01

    We report theoretical and experimental investigations of the flow instability responsible for the mushy-layer mode of convection and the formation of chimneys, drainage channels devoid of solid, during steady-state solidification of aqueous ammonium chloride. Under certain growth conditions a state of steady mushy-layer growth with no flow is unstable to the onset of convection, resulting in the formation of chimneys. We present regime diagrams to quantify the state of the flow as a function of the initial liquid concentration, the porous-medium Rayleigh number, and the sample width. For a given liquid concentration, increasing both the porous-medium Rayleigh number and the sample width caused the system to change from a stable state of no flow to a different state with the formation of chimneys. Decreasing the concentration ratio destabilized the system and promoted the formation of chimneys. As the initial liquid concentration increased, onset of convection and formation of chimneys occurred at larger value...

  2. Presentation of the intrasubject coefficient of variation for sample size planning in bioequivalence studies.

    Science.gov (United States)

    Hauschke, D; Steinijans, W V; Diletti, E; Schall, R; Luus, H G; Elze, M; Blume, H

    1994-07-01

    Bioequivalence studies are generally performed as crossover studies and, therefore, information on the intrasubject coefficient of variation is needed for sample size planning. Unfortunately, this information is usually not presented in publications on bioequivalence studies, and only the pooled inter- and intrasubject coefficient of variation for either test or reference formulation is reported. Thus, the essential information for sample size planning of future studies is not made available to other researchers. In order to overcome such shortcomings, the presentation of results from bioequivalence studies should routinely include the intrasubject coefficient of variation. For the relevant coefficients of variation, theoretical background together with modes of calculation and presentation are given in this communication with particular emphasis on the multiplicative model.
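
    As a sketch of how the reported intrasubject CV feeds into planning, the snippet below converts a residual MSE from the log-scale crossover ANOVA into a CV and then estimates TOST power for a 2x2 design by simulation; it is a generic illustration under standard assumptions, not a method taken from the paper.

        import numpy as np
        from scipy import stats

        def cv_intra_from_mse(mse_log):
            """Intrasubject CV from the residual MSE of the crossover ANOVA on
            log-transformed data: CV = sqrt(exp(MSE) - 1)."""
            return np.sqrt(np.exp(mse_log) - 1.0)

        def tost_power_2x2(n_total, cv_intra, gmr=0.95, limits=(0.8, 1.25),
                           alpha=0.05, n_sims=20000, rng=None):
            """Simulated power of the two one-sided tests (TOST) procedure for
            average bioequivalence in a 2x2 crossover with n_total subjects."""
            rng = np.random.default_rng(rng)
            sigma_w = np.sqrt(np.log(1.0 + cv_intra ** 2))   # within-subject SD (log scale)
            df = n_total - 2
            se = sigma_w * np.sqrt(2.0 / n_total)
            tcrit = stats.t.ppf(1 - alpha, df)
            delta = rng.normal(np.log(gmr), se, n_sims)      # estimated log ratio
            s2 = sigma_w ** 2 * rng.chisquare(df, n_sims) / df
            se_hat = np.sqrt(s2 * 2.0 / n_total)
            lo, hi = np.log(limits[0]), np.log(limits[1])
            ok = ((delta - lo) / se_hat > tcrit) & ((delta - hi) / se_hat < -tcrit)
            return ok.mean()

        print(cv_intra_from_mse(0.04))             # MSE of 0.04 corresponds to CV of about 20%
        print(tost_power_2x2(24, cv_intra=0.20))   # power for 24 subjects at that CV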

  3. Calculating sample sizes for cluster randomized trials: we can keep it simple and efficient !

    NARCIS (Netherlands)

    van Breukelen, Gerard J.P.; Candel, Math J.J.M.

    2012-01-01

    Objective: Simple guidelines for efficient sample sizes in cluster randomized trials with unknown intraclass correlation and varying cluster sizes. Methods: A simple equation is given for the optimal number of clusters and sample size per cluster. Here, optimal means maximizing power for a given

  4. Sample to sample fluctuations in the random energy model

    Energy Technology Data Exchange (ETDEWEB)

    Derrida, B. (Service de Physique Theorique, CEN Saclay, 91 - Gif-sur-Yvette (France)); Toulouse, G. (E.S.P.C.I., 75 - Paris (France))

    1985-03-15

    In the spin glass phase, mean field theory says that the weights of the valleys vary from sample to sample. Exact expressions for the probability laws of these fluctuations are derived, from the random energy model, without recourse to the replica method.

  5. Distance software: design and analysis of distance sampling surveys for estimating population size.

    Science.gov (United States)

    Thomas, Len; Buckland, Stephen T; Rexstad, Eric A; Laake, Jeff L; Strindberg, Samantha; Hedley, Sharon L; Bishop, Jon Rb; Marques, Tiago A; Burnham, Kenneth P

    2010-02-01

    1. Distance sampling is a widely used technique for estimating the size or density of biological populations. Many distance sampling designs and most analyses use the software Distance. 2. We briefly review distance sampling and its assumptions, outline the history, structure and capabilities of Distance, and provide hints on its use. 3. Good survey design is a crucial prerequisite for obtaining reliable results. Distance has a survey design engine, with a built-in geographic information system, that allows properties of different proposed designs to be examined via simulation, and survey plans to be generated. 4. A first step in analysis of distance sampling data is modelling the probability of detection. Distance contains three increasingly sophisticated analysis engines for this: conventional distance sampling, which models detection probability as a function of distance from the transect and assumes all objects at zero distance are detected; multiple-covariate distance sampling, which allows covariates in addition to distance; and mark-recapture distance sampling, which relaxes the assumption of certain detection at zero distance. 5. All three engines allow estimation of density or abundance, stratified if required, with associated measures of precision calculated either analytically or via the bootstrap. 6. Advanced analysis topics covered include the use of multipliers to allow analysis of indirect surveys (such as dung or nest surveys), the density surface modelling analysis engine for spatial and habitat modelling, and information about accessing the analysis engines directly from other software. 7. Synthesis and applications. Distance sampling is a key method for producing abundance and density estimates in challenging field conditions. The theory underlying the methods continues to expand to cope with realistic estimation situations. In step with theoretical developments, state-of-the-art software that implements these methods is described that makes the methods

  6. Limitations of mRNA amplification from small-size cell samples

    Directory of Open Access Journals (Sweden)

    Myklebost Ola

    2005-10-01

    Full Text Available Abstract Background Global mRNA amplification has become a widely used approach to obtain gene expression profiles from limited material. An important concern is the reliable reflection of the starting material in the results obtained. This is especially important with extremely low quantities of input RNA where stochastic effects due to template dilution may be present. This aspect remains under-documented in the literature, as quantitative measures of data reliability are most often lacking. To address this issue, we examined the sensitivity levels of each transcript in 3 different cell sample sizes. ANOVA analysis was used to estimate the overall effects of reduced input RNA in our experimental design. In order to estimate the validity of decreasing sample sizes, we examined the sensitivity levels of each transcript by applying a novel model-based method, TransCount. Results From expression data, TransCount provided estimates of absolute transcript concentrations in each examined sample. The results from TransCount were used to calculate the Pearson correlation coefficient between transcript concentrations for different sample sizes. The correlations were clearly transcript copy number dependent. A critical level was observed where stochastic fluctuations became significant. The analysis allowed us to pinpoint the gene specific number of transcript templates that defined the limit of reliability with respect to number of cells from that particular source. In the sample amplifying from 1000 cells, transcripts expressed with at least 121 transcripts/cell were statistically reliable and for 250 cells, the limit was 1806 transcripts/cell. Above these thresholds, correlation between our data sets was at acceptable values for reliable interpretation. Conclusion These results imply that the reliability of any amplification experiment must be validated empirically to justify that any gene exists in sufficient quantity in the input material. This

  7. Evaluation of design flood estimates with respect to sample size

    Science.gov (United States)

    Kobierska, Florian; Engeland, Kolbjorn

    2016-04-01

    Estimation of design floods forms the basis for hazard management related to flood risk and is a legal obligation when building infrastructure such as dams, bridges and roads close to water bodies. Flood inundation maps used for land use planning are also produced based on design flood estimates. In Norway, the current guidelines for design flood estimates give recommendations on which data, probability distribution, and method to use depending on the length of the local record. If less than 30 years of local data is available, an index flood approach is recommended where the local observations are used for estimating the index flood and regional data are used for estimating the growth curve. For 30-50 years of data, a 2-parameter distribution is recommended, and for more than 50 years of data, a 3-parameter distribution should be used. Many countries have national guidelines for flood frequency estimation, and recommended distributions include the log Pearson III, generalized logistic and generalized extreme value distributions. For estimating distribution parameters, ordinary and linear moments, maximum likelihood and Bayesian methods are used. The aim of this study is to re-evaluate the guidelines for local flood frequency estimation. In particular, we wanted to answer the following questions: (i) Which distribution gives the best fit to the data? (ii) Which estimation method provides the best fit to the data? (iii) Does the answer to (i) and (ii) depend on local data availability? To answer these questions we set up a test bench for local flood frequency analysis using data-based cross-validation methods. The criteria were based on indices describing stability and reliability of design flood estimates. Stability is used as a criterion since design flood estimates should not excessively depend on the data sample. The reliability indices describe to what degree design flood predictions can be trusted.
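
    One building block of such a test bench, fitting a three-parameter GEV by maximum likelihood and reading off a design flood for a chosen return period, can be sketched with scipy; the synthetic 40-year record stands in for a real gauging series.

        import numpy as np
        from scipy import stats

        def design_flood(annual_maxima, return_period=100.0):
            """Fit a GEV to annual maximum discharges (maximum likelihood) and
            return the design flood for the given return period."""
            shape, loc, scale = stats.genextreme.fit(annual_maxima)
            return stats.genextreme.ppf(1.0 - 1.0 / return_period, shape, loc, scale)

        # synthetic 40-year record, just to exercise the function
        rng = np.random.default_rng(1)
        q = stats.genextreme.rvs(-0.1, loc=300, scale=80, size=40, random_state=rng)
        print(design_flood(q, 100))   # estimated 100-year flood (m3/s)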

  8. A Web-based Simulator for Sample Size and Power Estimation in Animal Carcinogenicity Studies

    Directory of Open Access Journals (Sweden)

    Hojin Moon

    2002-12-01

    Full Text Available A Web-based statistical tool for sample size and power estimation in animal carcinogenicity studies is presented in this paper. It can be used to provide a design with sufficient power for detecting a dose-related trend in the occurrence of a tumor of interest when competing risks are present. The tumors of interest typically are occult tumors for which the time to tumor onset is not directly observable. It is applicable to rodent tumorigenicity assays that have either a single terminal sacrifice or multiple (interval) sacrifices. The design is achieved by varying sample size per group, number of sacrifices, number of sacrificed animals at each interval, if any, and scheduled time points for sacrifice. Monte Carlo simulation is carried out in this tool to simulate experiments of rodent bioassays because no closed-form solution is available. It takes design parameters for sample size and power estimation as inputs through the World Wide Web. The core program is written in C and executed in the background. It communicates with the Web front end via a Component Object Model interface passing an Extensible Markup Language string. The proposed statistical tool is illustrated with an animal study in lung cancer prevention research.

  9. Estimating the Size of a Large Network and its Communities from a Random Sample

    CERN Document Server

    Chen, Lin; Crawford, Forrest W

    2016-01-01

    Most real-world networks are too large to be measured or studied directly and there is substantial interest in estimating global network properties from smaller sub-samples. One of the most important global properties is the number of vertices/nodes in the network. Estimating the number of vertices in a large network is a major challenge in computer science, epidemiology, demography, and intelligence analysis. In this paper we consider a population random graph G = (V;E) from the stochastic block model (SBM) with K communities/blocks. A sample is obtained by randomly choosing a subset W and letting G(W) be the induced subgraph in G of the vertices in W. In addition to G(W), we observe the total degree of each sampled vertex and its block membership. Given this partial information, we propose an efficient PopULation Size Estimation algorithm, called PULSE, that correctly estimates the size of the whole population as well as the size of each community. To support our theoretical analysis, we perform an exhausti...

  10. SAMPLE SIZE DETERMINATION IN NON-RADOMIZED SURVIVAL STUDIES WITH NON-CENSORED AND CENSORED DATA

    Directory of Open Access Journals (Sweden)

    S FAGHIHZADEH

    2003-06-01

    likelihood function; when the data include censored cases, an estimate of the probability of censorship should be considered. After obtaining the variance of the maximum likelihood estimator, considering its asymptotic normal distribution, and using the coefficient of determination, formulas have been derived. The derived sample size formulas can attain the required power for a test adjusted for the effect of other explanatory covariates. Discussion: application of regression models in non-randomized survival analysis helps to derive suitable formulas for determining sample size in both randomized and non-randomized studies at a given error level, so as to attain the necessary statistical power. In Cox's semiparametric proportional hazards model, since the variance of the parameter cannot be stated in a simple form, a simulation model can be used. When the coefficient of determination is fairly large, the power based on the log-rank test overestimates the true power, but when the coefficient of determination is near zero the difference between the powers decreases. As the regression coefficient of determination increases, the difference between the log-rank test and the adjusted test proposed in this paper increases.

  11. Mathematical model of a cell size checkpoint.

    Directory of Open Access Journals (Sweden)

    Marco Vilela

    Full Text Available How cells regulate their size from one generation to the next has remained an enigma for decades. Recently, a molecular mechanism that links cell size and cell cycle was proposed in fission yeast. This mechanism involves changes in the spatial cellular distribution of two proteins, Pom1 and Cdr2, as the cell grows. Pom1 inhibits Cdr2 while Cdr2 promotes the G2 → M transition. Cdr2 is localized in the middle cell region (midcell), whereas the concentration of Pom1 is highest at the cell tips and declines towards the midcell. In short cells, Pom1 efficiently inhibits Cdr2. However, as cells grow, the Pom1 concentration at midcell decreases such that Cdr2 becomes activated at some critical size. In this study, the chemistry of Pom1 and Cdr2 was modeled using a deterministic reaction-diffusion-convection system interacting with a deterministic model describing microtubule dynamics. Simulations mimicked experimental data from wild-type (WT) fission yeast growing at normal and reduced rates; they also mimicked the behavior of a Pom1 overexpression mutant and WT yeast exposed to a microtubule depolymerizing drug. A mechanism linking cell size and cell cycle, involving the downstream action of Cdr2 on Wee1 phosphorylation, is proposed.

  12. Sampling bee communities using pan traps: alternative methods increase sample size

    Science.gov (United States)

    Monitoring of the status of bee populations and inventories of bee faunas require systematic sampling. Efficiency and ease of implementation has encouraged the use of pan traps to sample bees. Efforts to find an optimal standardized sampling method for pan traps have focused on pan trap color. Th...

  13. Adjustable virtual pore-size filter for automated sample preparation using acoustic radiation force

    Energy Technology Data Exchange (ETDEWEB)

    Jung, B; Fisher, K; Ness, K; Rose, K; Mariella, R

    2008-05-22

    We present a rapid and robust size-based separation method for high throughput microfluidic devices using acoustic radiation force. We developed a finite element modeling tool to predict the two-dimensional acoustic radiation force field perpendicular to the flow direction in microfluidic devices. Here we compare the results from this model with experimental parametric studies including variations of the PZT driving frequencies and voltages as well as various particle sizes, compressibilities, and densities. These experimental parametric studies also provide insight into the development of an adjustable 'virtual' pore-size filter as well as optimal operating conditions for various microparticle sizes. We demonstrated the separation of Saccharomyces cerevisiae and MS2 bacteriophage using acoustic focusing. The acoustic radiation force did not affect the MS2 viruses, and their concentration profile remained unchanged. With optimized design of our microfluidic flow system we were able to achieve yields of > 90% for the MS2 with > 80% of the S. cerevisiae being removed in this continuous-flow sample preparation device.

  14. Distribution of the two-sample t-test statistic following blinded sample size re-estimation.

    Science.gov (United States)

    Lu, Kaifeng

    2016-05-01

    We consider the blinded sample size re-estimation based on the simple one-sample variance estimator at an interim analysis. We characterize the exact distribution of the standard two-sample t-test statistic at the final analysis. We describe a simulation algorithm for the evaluation of the probability of rejecting the null hypothesis at given treatment effect. We compare the blinded sample size re-estimation method with two unblinded methods with respect to the empirical type I error, the empirical power, and the empirical distribution of the standard deviation estimator and final sample size. We characterize the type I error inflation across the range of standardized non-inferiority margin for non-inferiority trials, and derive the adjusted significance level to ensure type I error control for given sample size of the internal pilot study. We show that the adjusted significance level increases as the sample size of the internal pilot study increases. Copyright © 2016 John Wiley & Sons, Ltd.
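
    A Monte Carlo sketch of the blinded re-estimation scheme is given below; it checks the empirical type I error and power by brute force rather than using the exact distribution derived in the paper, and the planning values are arbitrary.

        import numpy as np
        from scipy import stats

        def simulate_bssr(n1=20, delta_planned=0.5, sigma_true=1.0, delta_true=0.5,
                          alpha=0.05, power=0.9, n_max=500, n_sims=5000, rng=None):
            """Blinded sample size re-estimation: the interim variance is the
            one-sample variance of the pooled (blinded) data, the per-arm sample
            size is re-computed from it, and the standard two-sample t-test is
            applied at the final analysis."""
            rng = np.random.default_rng(rng)
            z = stats.norm.ppf(1 - alpha / 2) + stats.norm.ppf(power)
            rejections = 0
            for _ in range(n_sims):
                a1 = rng.normal(0.0, sigma_true, n1)
                b1 = rng.normal(delta_true, sigma_true, n1)
                s2_blinded = np.var(np.concatenate([a1, b1]), ddof=1)   # ignores labels
                n_per_arm = int(np.ceil(2 * z ** 2 * s2_blinded / delta_planned ** 2))
                n_per_arm = min(max(n_per_arm, n1), n_max)
                a = np.concatenate([a1, rng.normal(0.0, sigma_true, n_per_arm - n1)])
                b = np.concatenate([b1, rng.normal(delta_true, sigma_true, n_per_arm - n1)])
                _, p = stats.ttest_ind(a, b)
                rejections += p < alpha
            return rejections / n_sims

        print("empirical power :", simulate_bssr(delta_true=0.5))
        print("type I error    :", simulate_bssr(delta_true=0.0))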

  15. Autoregressive Prediction with Rolling Mechanism for Time Series Forecasting with Small Sample Size

    Directory of Open Access Journals (Sweden)

    Zhihua Wang

    2014-01-01

    Full Text Available Reasonable prediction is of significant practical value for the analysis of stochastic and unstable time series with small or limited sample sizes. Motivated by the rolling idea in grey theory and the practical relevance of very short-term forecasting or 1-step-ahead prediction, a novel autoregressive (AR) prediction approach with a rolling mechanism is proposed. In the modeling procedure, a newly developed AR equation, which can be used to model nonstationary time series, is constructed in each prediction step. Meanwhile, the data window for the next step-ahead forecast rolls on by adding the most recent prediction result while deleting the first value of the previously used sample data set. This rolling mechanism is efficient, offering improved forecasting accuracy, applicability to limited and unstable data situations, and little computational effort. The general performance, the influence of sample size, the nonlinear dynamic mechanism, the significance of the observed trends, and the innovation variance are illustrated and verified with Monte Carlo simulations. The proposed methodology is then applied to several practical data sets, including multiple building settlement sequences and two economic series.
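
    The rolling mechanism itself is easy to prototype; the least-squares AR fit and the short settlement-like series below are illustrative stand-ins for the models and datasets used in the paper.

        import numpy as np

        def ar_fit(x, p):
            """Ordinary least-squares fit of an AR(p) model with intercept."""
            X = np.column_stack([np.ones(len(x) - p)] +
                                [x[p - k - 1:len(x) - k - 1] for k in range(p)])
            y = x[p:]
            coef, *_ = np.linalg.lstsq(X, y, rcond=None)
            return coef

        def rolling_ar_forecast(x, steps, p=2):
            """Multi-step forecast with a rolling window: each new prediction is
            appended to the window and the oldest value dropped before the AR
            model is re-fitted."""
            window = list(np.asarray(x, dtype=float))
            preds = []
            for _ in range(steps):
                w = np.array(window)
                c = ar_fit(w, p)
                next_val = c[0] + c[1:] @ w[-1:-p - 1:-1]   # lags 1..p
                preds.append(next_val)
                window.append(next_val)   # roll on: add the prediction...
                window.pop(0)             # ...and drop the oldest value
            return np.array(preds)

        # tiny settlement-like series (illustrative numbers only)
        series = [2.1, 2.9, 3.6, 4.1, 4.5, 4.8, 5.0, 5.1]
        print(rolling_ar_forecast(series, steps=3, p=2))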

  16. CT dose survey in adults: what sample size for what precision?

    Energy Technology Data Exchange (ETDEWEB)

    Taylor, Stephen [Hopital Ambroise Pare, Department of Radiology, Mons (Belgium); Muylem, Alain van [Hopital Erasme, Department of Pneumology, Brussels (Belgium); Howarth, Nigel [Clinique des Grangettes, Department of Radiology, Chene-Bougeries (Switzerland); Gevenois, Pierre Alain [Hopital Erasme, Department of Radiology, Brussels (Belgium); Tack, Denis [EpiCURA, Clinique Louis Caty, Department of Radiology, Baudour (Belgium)

    2017-01-15

    To determine variability of volume computed tomographic dose index (CTDIvol) and dose-length product (DLP) data, and propose a minimum sample size to achieve an expected precision. CTDIvol and DLP values of 19,875 consecutive CT acquisitions of abdomen (7268), thorax (3805), lumbar spine (3161), cervical spine (1515) and head (4106) were collected in two centers. Their variabilities were investigated according to sample size (10 to 1000 acquisitions) and patient body weight categories (no weight selection, 67-73 kg and 60-80 kg). The 95 % confidence interval in percentage of their median (CI95/med) value was calculated for increasing sample sizes. We deduced the sample size that set a 95 % CI lower than 10 % of the median (CI95/med ≤ 10 %). The sample size ensuring CI95/med ≤ 10 % ranged from 15 to 900 depending on the body region and the dose descriptor considered. In sample sizes recommended by regulatory authorities (i.e., from 10-20 patients), mean CTDIvol and DLP of one sample ranged from 0.50 to 2.00 times its actual value extracted from 2000 samples. The sampling error in CTDIvol and DLP means is high in dose surveys based on small samples of patients. Sample size should be increased at least tenfold to decrease this variability. (orig.)
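
    The sampling experiment behind such a survey recommendation can be reproduced in outline; the lognormal "DLP" values below are placeholders for a real dose registry, and the sample mean is used here as the dose descriptor summarised per sample.

        import numpy as np

        def ci95_over_median(values, sample_size, n_draws=2000, rng=None):
            """Width of the 95% interval of the sample mean across repeated random
            samples of a given size, expressed as a percentage of the population
            median."""
            rng = np.random.default_rng(rng)
            med = np.median(values)
            means = [rng.choice(values, sample_size, replace=False).mean()
                     for _ in range(n_draws)]
            lo, hi = np.percentile(means, [2.5, 97.5])
            return 100.0 * (hi - lo) / med

        def minimum_sample_size(values, target=10.0,
                                sizes=(10, 20, 50, 100, 200, 500, 900)):
            """Smallest tabulated sample size whose CI95/med drops below the target."""
            for n in sizes:
                if ci95_over_median(values, n) <= target:
                    return n
            return None

        # placeholder right-skewed dose data, standing in for a real registry
        rng = np.random.default_rng(0)
        dlp = rng.lognormal(mean=6.0, sigma=0.5, size=20000)
        print(minimum_sample_size(dlp))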

  17. Origin of sample size effect: Stochastic dislocation formation in crystalline metals at small scales

    Science.gov (United States)

    Huang, Guan-Rong; Huang, J. C.; Tsai, W. Y.

    2016-12-01

    In crystalline metals at small scales, the dislocation density is increased by stochastic events in the dislocation network, leading to a universal power law across various material structures. In this work, we develop a model for the probability distribution of dislocation density that describes dislocation formation in terms of a chain reaction. The leading-order terms of the steady-state probability distribution give physical and quantitative insight into the scaling exponent n in the power law of the sample size effect. This approach is found to be consistent with experimental n values over a wide range.

  18. Variance Estimation, Design Effects, and Sample Size Calculations for Respondent-Driven Sampling

    National Research Council Canada - National Science Library

    Salganik, Matthew J

    2006-01-01

    A recently developed statistical approach called respondent-driven sampling improves our ability to study hidden populations by allowing researchers to make unbiased estimates of the prevalence...

  19. Implications of sampling design and sample size for national carbon accounting systems

    Science.gov (United States)

    Michael Köhl; Andrew Lister; Charles T. Scott; Thomas Baldauf; Daniel. Plugge

    2011-01-01

    Countries willing to adopt a REDD regime need to establish a national Measurement, Reporting and Verification (MRV) system that provides information on forest carbon stocks and carbon stock changes. Due to the extensive areas covered by forests the information is generally obtained by sample based surveys. Most operational sampling approaches utilize a combination of...

  20. Implications of sampling design and sample size for national carbon accounting systems.

    Science.gov (United States)

    Köhl, Michael; Lister, Andrew; Scott, Charles T; Baldauf, Thomas; Plugge, Daniel

    2011-11-08

    Countries willing to adopt a REDD regime need to establish a national Measurement, Reporting and Verification (MRV) system that provides information on forest carbon stocks and carbon stock changes. Due to the extensive areas covered by forests, the information is generally obtained by sample-based surveys. Most operational sampling approaches utilize a combination of earth-observation data and in-situ field assessments as data sources. We compared the cost-efficiency of four different sampling design alternatives (simple random sampling, regression estimators, stratified sampling, 2-phase sampling with regression estimators) that have been proposed in the scope of REDD. Three of the design alternatives provide for a combination of in-situ and earth-observation data. For different settings of remote sensing coverage, cost per field plot, cost of remote sensing imagery, correlation between attributes quantified in remote sensing and field data, and population variability, the percent standard error over total survey cost was calculated. The cost-efficiency of forest carbon stock assessments is driven by the sampling design chosen. Our results indicate that the cost of remote sensing imagery is decisive for the cost-efficiency of a sampling design. The variability of the sample population impairs cost-efficiency, but does not reverse the pattern of cost-efficiency of the individual design alternatives. Our results clearly indicate that it is important to consider cost-efficiency in the development of forest carbon stock assessments and the selection of remote sensing techniques. The development of MRV systems for REDD needs to be based on a sound optimization process that compares different data sources and sampling designs with respect to their cost-efficiency. This helps to reduce the uncertainties related to the quantification of carbon stocks and to increase the financial benefits from adopting a REDD regime.

  1. Developing Criteria for Sample Sizes in Jet Engine Analytical Component Inspections and the Associated Confidence Levels

    Science.gov (United States)

    1988-09-01

    The samples taken from each population will not be random samples; they will be nonprobability, purposive samples. More specifically, they... section will justify why statistical techniques based on the assumption of a random sample will be used. First, this is the only possible method of... (AFIT/GSM/LSM/88S-22)

  2. Optimal adaptive group sequential design with flexible timing of sample size determination.

    Science.gov (United States)

    Cui, Lu; Zhang, Lanju; Yang, Bo

    2017-04-26

    Flexible sample size designs, including group sequential and sample size re-estimation designs, have been used as alternatives to fixed sample size designs to achieve more robust statistical power and better trial efficiency. In this work, a new representation of the sample size re-estimation design suggested by Cui et al. [5,6] is introduced as an adaptive group sequential design with flexible timing of sample size determination. This generalized adaptive group sequential design allows a one-time sample size determination either before the start or in the mid-course of a clinical study. The new approach leads to possible design optimization on an expanded space of design parameters. Its equivalence to the sample size re-estimation design proposed by Cui et al. provides further insight into re-estimation design and helps to address common confusions and misunderstandings. Issues in designing flexible sample size trials, including design objectives, performance evaluation and implementation, are touched upon and illustrated with an example. Copyright © 2017. Published by Elsevier Inc.

  3. Effect of sample size on the fluid flow through a single fractured granitoid

    Directory of Open Access Journals (Sweden)

    Kunal Kumar Singh

    2016-06-01

    Most deep geological engineered structures, such as rock caverns, nuclear waste disposal repositories, metro rail tunnels and multi-layer underground parking, are constructed within hard crystalline rocks because of their high quality and low matrix permeability. In such rocks, fluid flows mainly through fractures. Quantification of fractures, along with the behavior of the fluid flow through them at different scales, becomes quite important. Earlier studies have revealed the influence of sample size on the confining stress–permeability relationship, and it has been demonstrated that the permeability of a fractured rock mass decreases with an increase in sample size. However, most researchers have employed numerical simulations to model fluid flow through the fracture/fracture network, or laboratory investigations on intact rock samples with diameters ranging between 38 mm and 45 cm and a diameter-to-length ratio of 1:2, using different experimental methods. Also, the confining stress, σ3, has been considered to be less than 30 MPa and the effect of fracture roughness has been ignored. In the present study, an extension of the previous studies on "laboratory simulation of flow through single fractured granite" was conducted, in which consistent fluid flow experiments were performed on cylindrical granitoid samples of two different sizes (38 mm and 54 mm in diameter), each containing a "rough-walled single fracture". These experiments were performed under varied confining pressure (σ3 = 5–40 MPa), fluid pressure (fp ≤ 25 MPa), and fracture roughness. The results indicate that a nonlinear relationship exists between the discharge, Q, and the effective confining pressure, σeff, and that Q decreases with an increase in σeff. Also, the effects of sample size and fracture roughness do not persist when σeff ≥ 20 MPa. It is expected that such a study will be quite useful in correlating and extrapolating the laboratory

  4. Thermomagnetic behavior of magnetic susceptibility – heating rate and sample size effects

    Directory of Open Access Journals (Sweden)

    Diana Jordanova

    2016-01-01

    Thermomagnetic analysis of magnetic susceptibility k(T) was carried out for a number of natural powder materials from soils, baked clay and anthropogenic dust samples using fast (11 °C/min) and slow (6.5 °C/min) heating rates available in the furnace of the Kappabridge KLY2 (Agico). Based on additional data for the mineralogy, grain size and magnetic properties of the studied samples, the behaviour of the k(T) cycles and the observed differences between the curves for fast and slow heating rates are interpreted in terms of mineralogical transformations and Curie temperatures (Tc). The effect of different sample sizes is also explored, using large and small volumes of powder material. It is found that soil samples show enhanced information on mineralogical transformations and the appearance of new strongly magnetic phases when using the fast heating rate and a large sample size. This approach moves the transformation to higher temperature, but enhances the amplitude of the signal of the newly created phase. A large sample size gives prevalence to the local micro-environment created by evolving gases released during transformations. The example from an archaeological brick reveals the effect of different sample sizes on the observed Curie temperatures on heating and cooling curves, when the magnetic carrier is substituted magnetite (Mn0.2Fe2.70O4). A large sample size leads to bigger differences in Tc on heating and cooling, while a small sample size results in similar Tc values for both heating rates.

  5. Model catalysis by size-selected cluster deposition

    Energy Technology Data Exchange (ETDEWEB)

    Anderson, Scott [Univ. of Utah, Salt Lake City, UT (United States)

    2015-11-20

    This report summarizes the accomplishments during the last four years of the subject grant. Results are presented for experiments in which size-selected model catalysts were studied under surface science and aqueous electrochemical conditions. Strong effects of cluster size were found, and by correlating the size effects with size-dependent physical properties of the samples measured by surface science methods, it was possible to deduce mechanistic insights, such as the factors that control the rate-limiting step in the reactions. Results are presented for CO oxidation, CO binding energetics and geometries, and electronic effects under surface science conditions, and for the electrochemical oxygen reduction reaction, ethanol oxidation reaction, and for oxidation of carbon by water.

  6. Sample Size Estimation for Non-Inferiority Trials: Frequentist Approach versus Decision Theory Approach.

    Directory of Open Access Journals (Sweden)

    A C Bouman

    Non-inferiority trials are performed when the main therapeutic effect of the new therapy is expected to be not unacceptably worse than that of the standard therapy, and the new therapy is expected to have advantages over the standard therapy in costs or other (health) consequences. These advantages however are not included in the classic frequentist approach of sample size calculation for non-inferiority trials. In contrast, the decision theory approach of sample size calculation does include these factors. The objective of this study is to compare the conceptual and practical aspects of the frequentist approach and decision theory approach of sample size calculation for non-inferiority trials, thereby demonstrating that the decision theory approach is more appropriate for sample size calculation of non-inferiority trials. The frequentist approach and decision theory approach of sample size calculation for non-inferiority trials are compared and applied to a case of a non-inferiority trial on individually tailored duration of elastic compression stocking therapy compared to two years of elastic compression stocking therapy for the prevention of post-thrombotic syndrome after deep vein thrombosis. The two approaches differ substantially in conceptual background, analytical approach, and input requirements. The sample size calculated according to the frequentist approach yielded 788 patients, using a power of 80% and a one-sided significance level of 5%. The decision theory approach indicated that the optimal sample size was 500 patients, with a net value of €92 million. This study demonstrates and explains the differences between the classic frequentist approach and the decision theory approach of sample size calculation for non-inferiority trials. We argue that the decision theory approach of sample size estimation is most suitable for sample size calculation of non-inferiority trials.

  7. Are sample sizes clear and justified in RCTs published in dental journals?

    Directory of Open Access Journals (Sweden)

    Despina Koletsi

    Sample size calculations are advocated by the CONSORT group to justify sample sizes in randomized controlled trials (RCTs). The aim of this study was primarily to evaluate the reporting of sample size calculations, to establish the accuracy of these calculations in dental RCTs and to explore potential predictors associated with adequate reporting. Electronic searching was undertaken in eight leading specific and general dental journals. Replication of sample size calculations was undertaken where possible. Assumed variances or odds for control and intervention groups were also compared against those observed. The relationship between parameters including journal type, number of authors, trial design, involvement of methodologist, single-/multi-center study and region and year of publication, and the accuracy of sample size reporting was assessed using univariable and multivariable logistic regression. Of 413 RCTs identified, sufficient information to allow replication of sample size calculations was provided in only 121 studies (29.3%). Recalculations demonstrated an overall median overestimation of sample size of 15.2% after provisions for losses to follow-up. There was evidence that journal, methodologist involvement (OR = 1.97, CI: 1.10, 3.53), multi-center settings (OR = 1.86, CI: 1.01, 3.43) and time since publication (OR = 1.24, CI: 1.12, 1.38) were significant predictors of adequate description of sample size assumptions. Among journals JCP had the highest odds of adequately reporting sufficient data to permit sample size recalculation, followed by AJODO and JDR, with 61% (OR = 0.39, CI: 0.19, 0.80) and 66% (OR = 0.34, CI: 0.15, 0.75) lower odds, respectively. Both assumed variances and odds were found to underestimate the observed values. Presentation of sample size calculations in the dental literature is suboptimal; incorrect assumptions may have a bearing on the power of RCTs.

  8. Are sample sizes clear and justified in RCTs published in dental journals?

    Science.gov (United States)

    Koletsi, Despina; Fleming, Padhraig S; Seehra, Jadbinder; Bagos, Pantelis G; Pandis, Nikolaos

    2014-01-01

    Sample size calculations are advocated by the CONSORT group to justify sample sizes in randomized controlled trials (RCTs). The aim of this study was primarily to evaluate the reporting of sample size calculations, to establish the accuracy of these calculations in dental RCTs and to explore potential predictors associated with adequate reporting. Electronic searching was undertaken in eight leading specific and general dental journals. Replication of sample size calculations was undertaken where possible. Assumed variances or odds for control and intervention groups were also compared against those observed. The relationship between parameters including journal type, number of authors, trial design, involvement of methodologist, single-/multi-center study and region and year of publication, and the accuracy of sample size reporting was assessed using univariable and multivariable logistic regression. Of 413 RCTs identified, sufficient information to allow replication of sample size calculations was provided in only 121 studies (29.3%). Recalculations demonstrated an overall median overestimation of sample size of 15.2% after provisions for losses to follow-up. There was evidence that journal, methodologist involvement (OR = 1.97, CI: 1.10, 3.53), multi-center settings (OR = 1.86, CI: 1.01, 3.43) and time since publication (OR = 1.24, CI: 1.12, 1.38) were significant predictors of adequate description of sample size assumptions. Among journals JCP had the highest odds of adequately reporting sufficient data to permit sample size recalculation, followed by AJODO and JDR, with 61% (OR = 0.39, CI: 0.19, 0.80) and 66% (OR = 0.34, CI: 0.15, 0.75) lower odds, respectively. Both assumed variances and odds were found to underestimate the observed values. Presentation of sample size calculations in the dental literature is suboptimal; incorrect assumptions may have a bearing on the power of RCTs.

  9. Sample Size Estimation for Non-Inferiority Trials: Frequentist Approach versus Decision Theory Approach.

    Science.gov (United States)

    Bouman, A C; ten Cate-Hoek, A J; Ramaekers, B L T; Joore, M A

    2015-01-01

    Non-inferiority trials are performed when the main therapeutic effect of the new therapy is expected to be not unacceptably worse than that of the standard therapy, and the new therapy is expected to have advantages over the standard therapy in costs or other (health) consequences. These advantages however are not included in the classic frequentist approach of sample size calculation for non-inferiority trials. In contrast, the decision theory approach of sample size calculation does include these factors. The objective of this study is to compare the conceptual and practical aspects of the frequentist approach and decision theory approach of sample size calculation for non-inferiority trials, thereby demonstrating that the decision theory approach is more appropriate for sample size calculation of non-inferiority trials. The frequentist approach and decision theory approach of sample size calculation for non-inferiority trials are compared and applied to a case of a non-inferiority trial on individually tailored duration of elastic compression stocking therapy compared to two years elastic compression stocking therapy for the prevention of post thrombotic syndrome after deep vein thrombosis. The two approaches differ substantially in conceptual background, analytical approach, and input requirements. The sample size calculated according to the frequentist approach yielded 788 patients, using a power of 80% and a one-sided significance level of 5%. The decision theory approach indicated that the optimal sample size was 500 patients, with a net value of €92 million. This study demonstrates and explains the differences between the classic frequentist approach and the decision theory approach of sample size calculation for non-inferiority trials. We argue that the decision theory approach of sample size estimation is most suitable for sample size calculation of non-inferiority trials.

  10. Detecting spatial structures in throughfall data: The effect of extent, sample size, sampling design, and variogram estimation method

    Science.gov (United States)

    Voss, Sebastian; Zimmermann, Beate; Zimmermann, Alexander

    2016-09-01

    In the last decades, an increasing number of studies analyzed spatial patterns in throughfall by means of variograms. The estimation of the variogram from sample data requires an appropriate sampling scheme: most importantly, a large sample and a layout of sampling locations that often has to serve both variogram estimation and geostatistical prediction. While some recommendations on these aspects exist, they focus on Gaussian data and high ratios of the variogram range to the extent of the study area. However, many hydrological data, and throughfall data in particular, do not follow a Gaussian distribution. In this study, we examined the effect of extent, sample size, sampling design, and calculation method on variogram estimation of throughfall data. For our investigation, we first generated non-Gaussian random fields based on throughfall data with large outliers. Subsequently, we sampled the fields with three extents (plots with edge lengths of 25 m, 50 m, and 100 m), four common sampling designs (two grid-based layouts, transect and random sampling) and five sample sizes (50, 100, 150, 200, 400). We then estimated the variogram parameters by method-of-moments (non-robust and robust estimators) and residual maximum likelihood. Our key findings are threefold. First, the choice of the extent has a substantial influence on the estimation of the variogram. A comparatively small ratio of the extent to the correlation length is beneficial for variogram estimation. Second, a combination of a minimum sample size of 150, a design that ensures the sampling of small distances and variogram estimation by residual maximum likelihood offers a good compromise between accuracy and efficiency. Third, studies relying on method-of-moments based variogram estimation may have to employ at least 200 sampling points for reliable variogram estimates. These suggested sample sizes exceed the number recommended by studies dealing with Gaussian data by up to 100 %. Given that most previous
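
    The method-of-moments (Matheron) estimator referred to above can be written in a few lines. The sketch below computes an empirical semivariogram for a randomly sampled, skewed field; the plot size, the sample size of 150, and the lag bins follow the spirit of the abstract, but the simulated values carry no real spatial structure, and the robust and residual-maximum-likelihood variants are not shown.

```python
import numpy as np

def empirical_variogram(coords, values, bin_edges):
    """Matheron (method-of-moments) estimator:
    gamma(h) = 1 / (2 N(h)) * sum over pairs in the lag bin of (z_i - z_j)^2."""
    d = np.sqrt(((coords[:, None, :] - coords[None, :, :]) ** 2).sum(axis=-1))
    sq = (values[:, None] - values[None, :]) ** 2
    iu = np.triu_indices(len(values), k=1)                # each pair counted once
    dist, sqdiff = d[iu], sq[iu]
    centers, gamma = [], []
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        m = (dist >= lo) & (dist < hi)
        if m.any():
            centers.append(0.5 * (lo + hi))
            gamma.append(0.5 * sqdiff[m].mean())
    return np.array(centers), np.array(gamma)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 150                                               # suggested minimum sample size
    coords = rng.uniform(0, 50, size=(n, 2))              # random design on a 50 m plot
    values = rng.lognormal(0.0, 1.0, size=n)              # skewed, throughfall-like values
    for h, g in zip(*empirical_variogram(coords, values, np.linspace(0, 25, 11))):
        print(f"lag {h:5.1f} m  semivariance {g:7.3f}")
```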

  11. Integrated spatial sampling modeling of geospatial data

    Institute of Scientific and Technical Information of China (English)

    LI Lianfa; WANG Jinfeng

    2004-01-01

    Spatial sampling is a necessary and important method for extracting geospatial data, and its methodology directly affects the results of geo-analysis. To address the shortcomings of separate models of spatial sampling, this article analyzes three crucial elements of spatial sampling (frame, correlation and decision diagram) and derives a general integrated model. The program Spatial Sampling Integration (SSI) has been developed with the Component Object Model (COM) to realize the general integrated model. In two practical applications, i.e. the design of a monitoring network for natural disasters and a sampling survey of the areas of non-cultivated land, SSI produced accurate results at less cost, better realizing the cost-effective goal of sampling geo-objects with spatial correlation. The two cases exemplify the expanded application and convenient implementation of the general integrated model with embedded components in an integrated environment, which can also be extended to other spatial analysis modeling.

  12. The Effects of Sample Size on Expected Value, Variance and Fraser Efficiency for Nonparametric Independent Two Sample Tests

    Directory of Open Access Journals (Sweden)

    Ismet DOGAN

    2015-10-01

    Objective: Choosing the most efficient statistical test is one of the essential problems of statistics. Asymptotic relative efficiency is a notion which enables the quantitative comparison, in large samples, of two different tests used for testing the same statistical hypothesis. The notion of the asymptotic efficiency of tests is more complicated than that of the asymptotic efficiency of estimates. This paper discusses the effect of sample size on the expected values and variances of non-parametric tests for two independent samples and determines the most efficient test for different sample sizes using the Fraser efficiency value. Material and Methods: Since calculating the power value for comparison of the tests is often not practical, using the asymptotic relative efficiency value is favorable. Asymptotic relative efficiency is an indispensable technique for comparing and ordering statistical tests in large samples. It is especially useful in nonparametric statistics, where there exist numerous heuristic tests such as the linear rank tests. In this study, the sample size is taken as 2 ≤ n ≤ 50. Results: In both balanced and unbalanced cases, it is found that, as the sample size increases, the expected values and variances of all the tests discussed in this paper increase as well. Additionally, considering the Fraser efficiency, the Mann-Whitney U test is found to be the most efficient test among the non-parametric tests used for the comparison of two independent samples, regardless of their sizes. Conclusion: According to Fraser efficiency, the Mann-Whitney U test is the most efficient test.

  13. The PowerAtlas: a power and sample size atlas for microarray experimental design and research

    Directory of Open Access Journals (Sweden)

    Wang Jelai

    2006-02-01

    Background: Microarrays permit biologists to simultaneously measure the mRNA abundance of thousands of genes. An important issue facing investigators planning microarray experiments is how to estimate the sample size required for good statistical power. What is the projected sample size or number of replicate chips needed to address the multiple hypotheses with acceptable accuracy? Statistical methods exist for calculating power based upon a single hypothesis, using estimates of the variability in data from pilot studies. There is, however, a need for methods to estimate power and/or required sample sizes in situations where multiple hypotheses are being tested, such as in microarray experiments. In addition, investigators frequently do not have pilot data to estimate the sample sizes required for microarray studies. Results: To address this challenge, we have developed a Microarray PowerAtlas. The atlas enables estimation of statistical power by allowing investigators to appropriately plan studies by building upon previous studies that have similar experimental characteristics. Currently, there are sample sizes and power estimates based on 632 experiments from Gene Expression Omnibus (GEO). The PowerAtlas also permits investigators to upload their own pilot data and derive power and sample size estimates from these data. This resource will be updated regularly with new datasets from GEO and other databases such as The Nottingham Arabidopsis Stock Center (NASC). Conclusion: This resource provides a valuable tool for investigators who are planning efficient microarray studies and estimating required sample sizes.
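
    For orientation only, the sketch below shows a generic per-gene power calculation for a two-group comparison at a Bonferroni-adjusted significance level, using the noncentral t distribution; this is not the PowerAtlas methodology, and the number of genes, effect size and candidate sample sizes are hypothetical.

```python
import numpy as np
from scipy import stats

def two_sample_power(n_per_group, effect_size, alpha):
    """Power of a two-sided two-sample t-test (equal n, equal variances)
    for a standardized effect size (mean difference / SD)."""
    df = 2 * n_per_group - 2
    ncp = effect_size * np.sqrt(n_per_group / 2.0)        # noncentrality parameter
    tcrit = stats.t.ppf(1 - alpha / 2, df)
    return (1 - stats.nct.cdf(tcrit, df, ncp)) + stats.nct.cdf(-tcrit, df, ncp)

if __name__ == "__main__":
    n_genes = 10000                                       # hypothetical number of tests
    alpha = 0.05 / n_genes                                # Bonferroni-adjusted per-gene level
    for n in (3, 5, 8, 12, 20):
        print(n, round(two_sample_power(n, effect_size=1.5, alpha=alpha), 3))
```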

  14. Optimal designs of the median run length based double sampling X chart for minimizing the average sample size.

    Directory of Open Access Journals (Sweden)

    Wei Lin Teoh

    Designs of the double sampling (DS) X chart are traditionally based on the average run length (ARL) criterion. However, the shape of the run length distribution changes with the process mean shift, ranging from highly skewed when the process is in-control to almost symmetric when the mean shift is large. Therefore, we show that the ARL is a complicated performance measure and that the median run length (MRL) is a more meaningful measure to depend on. This is because the MRL provides an intuitive and fair representation of the central tendency, especially for the right-skewed run length distribution. Since the DS X chart can effectively reduce the sample size without reducing the statistical efficiency, this paper proposes two optimal designs of the MRL-based DS X chart, for minimizing (i) the in-control average sample size (ASS) and (ii) both the in-control and out-of-control ASSs. Comparisons with the optimal MRL-based EWMA X and Shewhart X charts demonstrate the superiority of the proposed optimal MRL-based DS X chart, as the latter requires a smaller sample size on average while maintaining the same detection speed as the two former charts. An example involving added potassium sorbate in a yoghurt manufacturing process is used to illustrate the effectiveness of the proposed MRL-based DS X chart in reducing the sample size needed.

  15. New method to estimate the sample size for calculation of a proportion assuming binomial distribution.

    Science.gov (United States)

    Vallejo, Adriana; Muniesa, Ana; Ferreira, Chelo; de Blas, Ignacio

    2013-10-01

    Currently, the formula used to calculate the sample size for estimating a proportion (such as a prevalence) is based on the Normal distribution; however, it could instead be based on a Binomial distribution, whose confidence interval can be calculated using the Wilson Score method. Comparing the two formulae (Normal and Binomial distributions), the variation in the width of the confidence intervals is most relevant in the tails and at the center of the curves. In order to calculate the needed sample size, we simulated an iterative sampling procedure, which shows an underestimation of the sample size for prevalence values close to 0 or 1, and an overestimation for values close to 0.5. Based on these results, we propose an algorithm based on the Wilson Score method that provides sample size values similar to those obtained empirically by simulation.
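
    A minimal sketch of the idea (not necessarily the authors' exact algorithm) is to search for the smallest n whose Wilson score interval half-width at the anticipated prevalence does not exceed the desired absolute precision, and to compare it with the classical normal-approximation formula:

```python
import math

def wilson_halfwidth(p, n, z=1.96):
    """Half-width of the Wilson score confidence interval for a proportion."""
    denom = 1 + z * z / n
    return (z / denom) * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))

def sample_size_wilson(p, precision, z=1.96, n_max=10**6):
    """Smallest n whose Wilson interval half-width at prevalence p
    does not exceed the desired absolute precision."""
    n = 1
    while wilson_halfwidth(p, n, z) > precision and n < n_max:
        n += 1
    return n

def sample_size_normal(p, precision, z=1.96):
    """Classical normal-approximation formula, for comparison."""
    return math.ceil(z * z * p * (1 - p) / precision ** 2)

if __name__ == "__main__":
    for p in (0.01, 0.05, 0.5, 0.95, 0.99):
        print(p, sample_size_normal(p, 0.03), sample_size_wilson(p, 0.03))
```

    Consistent with the abstract, the normal-approximation size comes out smaller than the Wilson-based size for prevalences near 0 or 1 and slightly larger near 0.5.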

  16. PIXE-PIGE analysis of size-segregated aerosol samples from remote areas

    Science.gov (United States)

    Calzolai, G.; Chiari, M.; Lucarelli, F.; Nava, S.; Taccetti, F.; Becagli, S.; Frosini, D.; Traversi, R.; Udisti, R.

    2014-01-01

    The chemical characterization of size-segregated samples is helpful to study the aerosol effects on both human health and environment. The sampling with multi-stage cascade impactors (e.g., Small Deposit area Impactor, SDI) produces inhomogeneous samples, with a multi-spot geometry and a non-negligible particle stratification.

  17. Light propagation in tissues: effect of finite size of tissue sample

    Science.gov (United States)

    Melnik, Ivan S.; Dets, Sergiy M.; Rusina, Tatyana V.

    1995-12-01

    Laser beam propagation inside tissues with different lateral dimensions has been considered. The scattering and anisotropic properties of tissue critically determine the spatial fluence distribution and predict the sizes of tissue specimens for which deviations of this distribution can be neglected. Along the axis of the incident beam, the fluence rate depends only weakly on sample size, whereas it increases relatively (by more than 20%) towards the lateral boundaries. Finite-size effects were considered to be substantial only for samples with sizes comparable to the diameter of the laser beam. Interstitial irradiance patterns simulated by the Monte Carlo method were compared with direct measurements in human brain specimens.

  18. Sample size adjustment designs with time-to-event outcomes: A caution.

    Science.gov (United States)

    Freidlin, Boris; Korn, Edward L

    2017-08-01

    Sample size adjustment designs, which allow increasing the study sample size based on interim analysis of outcome data from a randomized clinical trial, have been increasingly promoted in the biostatistical literature. Although it is recognized that group sequential designs can be at least as efficient as sample size adjustment designs, many authors argue that a key advantage of these designs is their flexibility; interim sample size adjustment decisions can incorporate information and business interests external to the trial. Recently, Chen et al. (Clinical Trials 2015) considered sample size adjustment applications in the time-to-event setting using a design (CDL) that limits adjustments to situations where the interim results are promising. The authors demonstrated that while CDL provides little gain in unconditional power (versus fixed-sample-size designs), there is a considerable increase in conditional power for trials in which the sample size is adjusted. In time-to-event settings, sample size adjustment allows an increase in the number of events required for the final analysis. This can be achieved by either (a) following the original study population until the additional events are observed thus focusing on the tail of the survival curves or (b) enrolling a potentially large number of additional patients thus focusing on the early differences in survival curves. We use the CDL approach to investigate performance of sample size adjustment designs in time-to-event trials. Through simulations, we demonstrate that when the magnitude of the true treatment effect changes over time, interim information on the shape of the survival curves can be used to enrich the final analysis with events from the time period with the strongest treatment effect. In particular, interested parties have the ability to make the end-of-trial treatment effect larger (on average) based on decisions using interim outcome data. Furthermore, in "clinical null" cases where there is no

  19. A NONPARAMETRIC PROCEDURE OF THE SAMPLE SIZE DETERMINATION FOR SURVIVAL RATE TEST

    Institute of Scientific and Technical Information of China (English)

    2000-01-01

    Objective This paper proposes a nonparametric procedure of the sample size determination for survival rate test. Methods Using the classical asymptotic normal procedure yields the required homogenetic effective sample size and using the inverse operation with the prespecified value of the survival function of censoring times yields the required sample size. Results It is matched with the rate test for censored data, does not involve survival distributions, and reduces to its classical counterpart when there is no censoring. The observed power of the test coincides with the prescribed power under usual clinical conditions. Conclusion It can be used for planning survival studies of chronic diseases.

  20. Mixture model analysis of complex samples

    NARCIS (Netherlands)

    Wedel, M; ter Hofstede, F; Steenkamp, JBEM

    1998-01-01

    We investigate the effects of a complex sampling design on the estimation of mixture models. An approximate or pseudo-likelihood approach is proposed to obtain consistent estimates of class-specific parameters when the sample arises from such a complex design. The effects of ignoring the sample design…

  1. Effect of sample aliquot size on the limit of detection and reproducibility of clinical assays.

    Science.gov (United States)

    Chen, Guorong; Kobayashi, Lori; Nazarenko, Irina

    2007-11-01

    Nucleic acid amplification technologies have significantly improved the limit of detection (LOD) of diagnostic assays. The ability of these assays to amplify fewer than 10 target copies of DNA or RNA imposes new requirements on the preparation of clinical samples. We report a statistical method to determine how large an aliquot is necessary to reproducibly provide a detectable number of cells. We determined the success probability (p) based on aliquot size and sample volume. The binomial distribution, based on p and the concentration of cells in the sample, was used to calculate the probability of getting no target objects in an aliquot and to determine the minimum number of objects per aliquot necessary to generate a reproducible clinical assay. The described method was applied to find the minimum aliquot volume required for a set LOD, false-negative rate (FNR), and %CV. For example, under the conditions examined, FNRs of 47.2% and 44.9% were obtained. This probability model is a useful tool to predict the impact of aliquot volume on the LOD and reproducibility of clinical assays. Even for samples in which pathogens are homogeneously distributed, it is theoretically impossible to collect a single pathogen consistently if the concentration of pathogens is below a certain limit.
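
    The core calculation can be sketched with a Poisson approximation to the binomial model described above: if target organisms are homogeneously distributed at concentration c per mL, an aliquot of v mL contains no targets with probability exp(-c·v), so the aliquot volume needed to keep the sampling-only false-negative rate below a target is v ≥ -ln(FNR)/c. The concentration and FNR target below are hypothetical, and this is an approximation rather than the authors' exact binomial derivation.

```python
import math

def p_zero_targets(conc_per_ml, aliquot_ml):
    """Poisson probability that an aliquot contains no target organisms."""
    return math.exp(-conc_per_ml * aliquot_ml)

def min_aliquot_volume(conc_per_ml, fnr_target):
    """Smallest aliquot volume (mL) keeping the sampling-only false-negative
    rate below fnr_target, assuming a homogeneous (Poisson) distribution."""
    return -math.log(fnr_target) / conc_per_ml

if __name__ == "__main__":
    conc = 5.0                      # hypothetical: 5 target organisms per mL of specimen
    for v in (0.1, 0.2, 0.5, 1.0):
        print(f"aliquot {v:.1f} mL -> P(no target) = {p_zero_targets(conc, v):.3f}")
    print("volume needed for FNR <= 5%:", round(min_aliquot_volume(conc, 0.05), 2), "mL")
```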

  2. Mineralogical, optical, geochemical, and particle size properties of four sediment samples for optical physics research

    Science.gov (United States)

    Bice, K.; Clement, S. C.

    1981-01-01

    X-ray diffraction and spectroscopy were used to investigate the mineralogical and chemical properties of the Calvert, Ball Old Mine, Ball Martin, and Jordan Sediments. The particle size distribution and index of refraction of each sample were determined. The samples are composed primarily of quartz, kaolinite, and illite. The clay minerals are most abundant in the finer particle size fractions. The chemical properties of the four samples are similar. The Calvert sample is most notably different in that it contains a relatively high amount of iron. The dominant particle size fraction in each sample is silt, with lesser amounts of clay and sand. The indices of refraction of the sediments are the same with the exception of the Calvert sample which has a slightly higher value.

  3. Importance Sampling for the Infinite Sites Model*

    OpenAIRE

    Hobolth, Asger; Uyenoyama, Marcy K; Wiuf, Carsten

    2008-01-01

    Importance sampling or Markov chain Monte Carlo sampling is required for state-of-the-art statistical analysis of population genetics data. The applicability of these sampling-based inference techniques depends crucially on the proposal distribution. In this paper, we discuss importance sampling for the infinite sites model. The infinite sites assumption is attractive because it constrains the number of possible genealogies, thereby allowing for the analysis of larger data sets. We recall th...

  4. Frictional behaviour of sandstone: A sample-size dependent triaxial investigation

    Science.gov (United States)

    Roshan, Hamid; Masoumi, Hossein; Regenauer-Lieb, Klaus

    2017-01-01

    The frictional behaviour of rocks, from the initial stage of loading to final shear displacement along the formed shear plane, has been widely investigated in the past. However, the effect of sample size on such frictional behaviour has not attracted much attention. This is mainly related to the limitations of rock testing facilities as well as the complex mechanisms involved in the sample-size dependent frictional behaviour of rocks. In this study, a suite of advanced triaxial experiments was performed on Gosford sandstone samples of different sizes and at different confining pressures. The post-peak response of the rock along the formed shear plane has been captured for analysis, with particular interest in sample-size dependency. Several important phenomena have been observed from the results of this study: a) the rate of transition from brittleness to ductility in rock is sample-size dependent, with relatively smaller samples showing a faster transition toward ductility at any confining pressure; b) the sample size influences the angle of the formed shear band; and c) the friction coefficient of the formed shear plane is sample-size dependent, with relatively smaller samples exhibiting a lower friction coefficient than larger samples. We interpret our results in terms of a thermodynamic approach in which the frictional properties for finite deformation are viewed as encompassing a multitude of ephemeral slipping surfaces prior to the formation of the through-going fracture. The final fracture itself is seen as a result of the self-organisation of a sufficiently large ensemble of micro-slip surfaces and is therefore consistent with the theory of thermodynamics. This assumption vindicates the use of classical rock mechanics experiments to constrain the failure of pressure-sensitive rocks, and the future imaging of these micro-slips opens an exciting path for research into rock failure mechanisms.

  5. Sample size determination for logistic regression on a logit-normal distribution.

    Science.gov (United States)

    Kim, Seongho; Heath, Elisabeth; Heilbrun, Lance

    2017-06-01

    Although the sample size for simple logistic regression can be readily determined using currently available methods, the sample size calculation for multiple logistic regression requires additional information, such as the coefficient of determination (R²) of a covariate of interest with the other covariates, which is often unavailable in practice. The response variable of logistic regression follows a logit-normal distribution, which can be generated from a logistic transformation of a normal distribution. Using this property of logistic regression, we propose new methods of determining the sample size for simple and multiple logistic regression using a normal transformation of outcome measures. Simulation studies and a motivating example show several advantages of the proposed methods over the existing methods: (i) no need for R² for multiple logistic regression, (ii) applicability to interim or group-sequential designs, and (iii) a much smaller required sample size.
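
    Because the authors' logit-normal derivation is not reproduced here, the sketch below instead estimates power for simple logistic regression by Monte Carlo simulation of the Wald test (using statsmodels); the baseline log-odds, odds ratio, and candidate sample sizes are hypothetical planning values.

```python
import numpy as np
import statsmodels.api as sm

def logistic_power(n, beta0, beta1, n_sim=500, alpha=0.05, seed=0):
    """Monte Carlo power of the Wald test for beta1 in a simple logistic
    regression with a standard-normal covariate."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_sim):
        x = rng.standard_normal(n)
        prob = 1.0 / (1.0 + np.exp(-(beta0 + beta1 * x)))
        y = rng.binomial(1, prob)
        try:
            fit = sm.Logit(y, sm.add_constant(x)).fit(disp=0)
            if fit.pvalues[1] < alpha:
                rejections += 1
        except Exception:           # skip rare non-converged or separated samples
            continue
    return rejections / n_sim

if __name__ == "__main__":
    # Hypothetical planning values: baseline log-odds -1, odds ratio 1.5 per SD of x
    for n in (100, 200, 400):
        print(n, logistic_power(n, beta0=-1.0, beta1=np.log(1.5)))
```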

  6. Monotonicity in the Sample Size of the Length of Classical Confidence Intervals

    CERN Document Server

    Kagan, Abram M

    2012-01-01

    It is proved that the average length of standard confidence intervals for the parameters of gamma and normal distributions monotonically decreases with the sample size. The proofs are based on fine properties of the classical gamma function.
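
    The claim can be checked numerically for the normal-mean case: the expected length of the classical t-interval is 2·t_{1-α/2, n-1}·E[S]/√n with E[S] = σ·c4(n), where c4(n) = √(2/(n-1))·Γ(n/2)/Γ((n-1)/2). The sketch below evaluates this exactly and confirms the monotone decrease; it is a numerical illustration, not the paper's proof.

```python
import numpy as np
from scipy import stats
from scipy.special import gammaln

def expected_t_interval_length(n, sigma=1.0, alpha=0.05):
    """Exact expected length of the classical t-interval for a normal mean:
    2 * t_{1-alpha/2, n-1} * E[S] / sqrt(n), where E[S] = sigma * c4(n)."""
    c4 = np.sqrt(2.0 / (n - 1)) * np.exp(gammaln(n / 2.0) - gammaln((n - 1) / 2.0))
    tq = stats.t.ppf(1 - alpha / 2, n - 1)
    return 2.0 * tq * sigma * c4 / np.sqrt(n)

if __name__ == "__main__":
    lengths = [expected_t_interval_length(n) for n in range(2, 31)]
    print("monotone decreasing:", all(b < a for a, b in zip(lengths, lengths[1:])))
    print([round(v, 3) for v in lengths[:8]])
```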

  7. A Maximum Entropy Modelling of the Rain Drop Size Distribution

    Directory of Open Access Journals (Sweden)

    Francisco J. Tapiador

    2011-01-01

    This paper presents a maximum entropy approach to Rain Drop Size Distribution (RDSD) modelling. It is shown that this approach allows (1) the use of a physically consistent rationale to select a particular probability density function (pdf), (2) an alternative method for parameter estimation based on expectations of the population instead of sample moments, and (3) the development of a progressive method of modelling by updating the pdf as new empirical information becomes available. The method is illustrated with both synthetic and real RDSD data, the latter coming from a laser disdrometer network specifically designed to measure the spatial variability of the RDSD.

  8. Sample size choices for XRCT scanning of highly unsaturated soil mixtures

    Directory of Open Access Journals (Sweden)

    Smith Jonathan C.

    2016-01-01

    Highly unsaturated soil mixtures (clay, sand and gravel) are used as building materials in many parts of the world, and there is increasing interest in understanding their mechanical and hydraulic behaviour. In the laboratory, x-ray computed tomography (XRCT) is becoming more widely used to investigate the microstructures of soils; however, a crucial issue for such investigations is the choice of sample size, especially concerning the scanning of soil mixtures where there will be a range of particle and void sizes. In this paper we present a discussion (centred around a new set of XRCT scans) on sample sizing for the scanning of samples comprising soil mixtures, where a balance has to be made between realistic representation of the soil components and the desire for high resolution scanning. We also comment on the appropriateness of differing sample sizes in comparison to sample sizes used for other geotechnical testing. Void size distributions for the samples are presented, and from these some hypotheses are made as to the roles of inter- and intra-aggregate voids in the mechanical behaviour of highly unsaturated soils.

  9. Sample size planning with the cost constraint for testing superiority and equivalence of two independent groups.

    Science.gov (United States)

    Guo, Jiin-Huarng; Chen, Hubert J; Luh, Wei-Ming

    2011-11-01

    The allocation of sufficient participants into different experimental groups for various research purposes under given constraints is an important practical problem faced by researchers. We address the problem of sample size determination between two independent groups for unequal and/or unknown variances when both the power and the differential cost are taken into consideration. We apply the well-known Welch approximate test to derive various sample size allocation ratios by minimizing the total cost or, equivalently, maximizing the statistical power. Two types of hypotheses including superiority/non-inferiority and equivalence of two means are each considered in the process of sample size planning. A simulation study is carried out and the proposed method is validated in terms of Type I error rate and statistical power. As a result, the simulation study reveals that the proposed sample size formulas are very satisfactory under various variances and sample size allocation ratios. Finally, a flowchart, tables, and figures of several sample size allocations are presented for practical reference.
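
    A common closed-form special case of this cost-constrained problem (a sketch under a normal approximation, not necessarily the paper's exact Welch-based procedure) allocates subjects in the ratio n1/n2 = (σ1/σ2)·√(c2/c1), which minimizes the variance of the mean difference for a fixed total cost, and then scales the group sizes to meet a one-sided power requirement; all planning values below are hypothetical.

```python
from math import ceil, sqrt
from scipy.stats import norm

def cost_optimal_sizes(sigma1, sigma2, c1, c2, delta, alpha=0.05, power=0.80):
    """Group sizes for a one-sided two-sample comparison of means meeting a
    normal-approximation power requirement under the cost-optimal allocation
    n1/n2 = (sigma1/sigma2) * sqrt(c2/c1)."""
    r = (sigma1 / sigma2) * sqrt(c2 / c1)                 # optimal allocation ratio
    z = norm.ppf(1 - alpha) + norm.ppf(power)
    n2 = (sigma1 ** 2 / r + sigma2 ** 2) * (z / delta) ** 2
    return ceil(r * n2), ceil(n2)

if __name__ == "__main__":
    # Hypothetical planning values: unequal SDs, group 1 twice as costly per subject
    n1, n2 = cost_optimal_sizes(sigma1=12, sigma2=8, c1=200, c2=100, delta=5)
    print("n1 =", n1, " n2 =", n2, " total cost =", 200 * n1 + 100 * n2)
```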

  10. A margin based approach to determining sample sizes via tolerance bounds.

    Energy Technology Data Exchange (ETDEWEB)

    Newcomer, Justin T.; Freeland, Katherine Elizabeth

    2013-09-01

    This paper proposes a tolerance bound approach for determining sample sizes. With this new methodology we begin to think of sample size in the context of uncertainty exceeding margin. As the sample size decreases the uncertainty in the estimate of margin increases. This can be problematic when the margin is small and only a few units are available for testing. In this case there may be a true underlying positive margin to requirements but the uncertainty may be too large to conclude we have sufficient margin to those requirements with a high level of statistical confidence. Therefore, we provide a methodology for choosing a sample size large enough such that an estimated QMU uncertainty based on the tolerance bound approach will be smaller than the estimated margin (assuming there is positive margin). This ensures that the estimated tolerance bound will be within performance requirements and the tolerance ratio will be greater than one, supporting a conclusion that we have sufficient margin to the performance requirements. In addition, this paper explores the relationship between margin, uncertainty, and sample size and provides an approach and recommendations for quantifying risk when sample sizes are limited.
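
    For normally distributed data, the one-sided tolerance factor used in such a bound can be computed exactly from the noncentral t distribution. The sketch below finds the smallest sample size for which a lower tolerance bound, computed from assumed planning values of the mean and standard deviation, still exceeds a requirement; the planning values, coverage, and confidence levels are hypothetical, and the approach is a generic illustration rather than the report's full QMU methodology.

```python
import numpy as np
from scipy import stats

def one_sided_tolerance_factor(n, coverage=0.90, confidence=0.95):
    """Exact one-sided normal tolerance factor k: with the stated confidence,
    at least `coverage` of the population lies above x_bar - k * s."""
    ncp = stats.norm.ppf(coverage) * np.sqrt(n)
    return stats.nct.ppf(confidence, n - 1, ncp) / np.sqrt(n)

def smallest_n_with_margin(xbar, s, requirement, coverage=0.90, confidence=0.95,
                           n_max=200):
    """Smallest sample size for which the lower tolerance bound, computed from
    assumed planning values of the mean and SD, still exceeds the requirement."""
    for n in range(3, n_max + 1):
        k = one_sided_tolerance_factor(n, coverage, confidence)
        if xbar - k * s > requirement:
            return n, k
    return None, None

if __name__ == "__main__":
    # Hypothetical performance data: mean 105, SD 4, requirement of at least 95
    n, k = smallest_n_with_margin(xbar=105.0, s=4.0, requirement=95.0)
    print("n =", n, " k =", round(k, 3))
```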

  11. Shrinkage anisotropy characteristics from soil structure and initial sample/layer size

    CERN Document Server

    Chertkov, V Y

    2014-01-01

    The objective of this work is a physical prediction of such soil shrinkage anisotropy characteristics as variation with drying of (i) different sample/layer sizes and (ii) the shrinkage geometry factor. With that, a new presentation of the shrinkage anisotropy concept is suggested through the sample/layer size ratios. The work objective is reached in two steps. First, the relations are derived between the indicated soil shrinkage anisotropy characteristics and three different shrinkage curves of a soil relating to: small samples (without cracking at shrinkage), sufficiently large samples (with internal cracking), and layers of similar thickness. Then, the results of a recent work with respect to the physical prediction of the three shrinkage curves are used. These results connect the shrinkage curves with the initial sample size/layer thickness as well as characteristics of soil texture and structure (both inter- and intra-aggregate) as physical parameters. The parameters determining the reference shrinkage c...

  12. Quantification of errors in ordinal outcome scales using shannon entropy: effect on sample size calculations.

    Directory of Open Access Journals (Sweden)

    Pitchaiah Mandava

    OBJECTIVE: Clinical trial outcomes often involve an ordinal scale of subjective functional assessments, but the optimal way to quantify results is not clear. In stroke, for the most commonly used scale, the modified Rankin Score (mRS), a range of scores ("Shift") is proposed as superior to dichotomization because of greater information transfer. The influence of known uncertainties in mRS assessment has not been quantified. We hypothesized that errors caused by uncertainties could be quantified by applying information theory. Using Shannon's model, we quantified errors of the "Shift" compared to dichotomized outcomes using published distributions of mRS uncertainties and applied this model to clinical trials. METHODS: We identified 35 randomized stroke trials that met inclusion criteria. Each trial's mRS distribution was multiplied with the noise distribution from published mRS inter-rater variability to generate an error percentage for "shift" and dichotomized cut-points. For the SAINT I neuroprotectant trial, considered positive by "shift" mRS while the larger follow-up SAINT II trial was negative, we recalculated the sample size required if classification uncertainty was taken into account. RESULTS: Considering the full mRS range, the error rate was 26.1%±5.31 (mean±SD). Error rates were lower for all dichotomizations tested using cut-points (e.g. mRS 1: 6.8%±2.89; overall p<0.001). Taking errors into account, SAINT I would have required 24% more subjects than were randomized. CONCLUSION: We show that when uncertainty in assessments is considered, the lowest error rates are obtained with dichotomization. While using the full range of the mRS is conceptually appealing, the gain in information is counter-balanced by a decrease in reliability. The resultant errors need to be considered since the sample size may otherwise be underestimated. In principle, we have outlined an approach to error estimation for any condition in which there are uncertainties in outcome assessment. We

  13. Effect of sample volume size and sampling method on feline longitudinal myocardial velocity profiles from color tissue Doppler imaging.

    Science.gov (United States)

    Granström, Sara; Pipper, Christian Bressen; Møgelvang, Rasmus; Sogaard, Peter; Willesen, Jakob Lundgren; Koch, Jørgen

    2012-12-01

    The aims of this study were to compare the effect of sample volume (SV) size settings and sampling method on measurement variability and on peak systolic (s'), early diastolic (e') and late diastolic (a') longitudinal myocardial velocities obtained using color tissue Doppler imaging (cTDI) in cats. Twenty cats with normal echocardiograms and 20 cats with hypertrophic cardiomyopathy were included. We quantified and compared the empirical variance and average absolute values of s', e' and a' for three cardiac cycles using eight different SV settings (length 1, 2, 3 and 5 mm; width 1 and 2 mm) and three methods of sampling (end-diastolic sampling with manual tracking of the SV, end-systolic sampling without tracking, and random-frame sampling without tracking). No significant difference in empirical variance could be demonstrated between most of the tested SVs. However, the two settings with a length of 1 mm resulted in a significantly higher variance compared with all settings where the SV length exceeded 2 mm. The sampling method also had a significant effect on the variability of measurements (p = 0.003), and manual tracking obtained the lowest variance. No difference in average values of s', e' or a' could be found between any of the SV settings or sampling methods. Within the tested range of SV settings, an SV length of 1 mm resulted in higher measurement variability compared with SV lengths of 3 and 5 mm, and should therefore be avoided. Manual tracking of the sample volume is recommended. Copyright © 2012 Elsevier B.V. All rights reserved.

  14. Blinded sample size re-estimation in three-arm trials with 'gold standard' design.

    Science.gov (United States)

    Mütze, Tobias; Friede, Tim

    2017-10-15

    In this article, we study blinded sample size re-estimation in the 'gold standard' design with an internal pilot study for normally distributed outcomes. The 'gold standard' design is a three-arm clinical trial design that includes an active and a placebo control in addition to an experimental treatment. We focus on the absolute margin approach to hypothesis testing in three-arm trials, in which the non-inferiority of the experimental treatment and the assay sensitivity are assessed by pairwise comparisons. We compare several blinded sample size re-estimation procedures in a simulation study assessing operating characteristics including power and type I error. We find that sample size re-estimation based on the popular one-sample variance estimator results in overpowered trials. Moreover, sample size re-estimation based on unbiased variance estimators such as the Xing-Ganju variance estimator results in underpowered trials, as expected, because an overestimation of the variance, and thus of the sample size, is in general required for the re-estimation procedure to eventually meet the target power. To overcome this problem, we propose an inflation factor for the sample size re-estimation with the Xing-Ganju variance estimator and show that this approach results in adequately powered trials. Because of favorable features of the Xing-Ganju variance estimator, such as unbiasedness and a distribution independent of the group means, the inflation factor does not depend on the nuisance parameter and, therefore, can be calculated prior to a trial. Moreover, we prove that sample size re-estimation based on the Xing-Ganju variance estimator does not bias the effect estimate. Copyright © 2017 John Wiley & Sons, Ltd.
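
    To make the blinded idea concrete, the sketch below applies the simple one-sample (lumped) variance estimator criticized above to a two-arm internal pilot and recomputes the per-group sample size from a normal approximation; it is a generic two-arm illustration, not the three-arm 'gold standard' design or the Xing-Ganju estimator, and the effect size, variance, and pilot size are hypothetical. The blinded estimate is inflated by the hidden treatment effect, which is the mechanism behind the overpowering reported above.

```python
import numpy as np
from scipy.stats import norm

def blinded_one_sample_variance(pooled_data):
    """Blinded (lumped) variance estimate: sample variance of all interim
    observations pooled across arms, ignoring treatment labels."""
    return np.var(pooled_data, ddof=1)

def reestimated_n_per_group(var_hat, delta, alpha=0.025, power=0.90):
    """Normal-approximation per-group sample size for a two-arm comparison,
    recomputed from the blinded interim variance estimate."""
    z = norm.ppf(1 - alpha) + norm.ppf(power)
    return int(np.ceil(2.0 * var_hat * (z / delta) ** 2))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical internal pilot: 20 + 20 observations, within-group SD 10, true effect 4
    pilot = np.concatenate([rng.normal(0, 10, 20), rng.normal(4, 10, 20)])
    v = blinded_one_sample_variance(pilot)      # inflated by the hidden treatment effect
    print("blinded variance:", round(v, 1),
          " re-estimated n per group:", reestimated_n_per_group(v, delta=4.0))
```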

  15. Sampling Weights in Latent Variable Modeling

    Science.gov (United States)

    Asparouhov, Tihomir

    2005-01-01

    This article reviews several basic statistical tools needed for modeling data with sampling weights that are implemented in Mplus Version 3. These tools are illustrated in simulation studies for several latent variable models including factor analysis with continuous and categorical indicators, latent class analysis, and growth models. The…

  16. Bayesian forecasting of the recurrent earthquakes and its predictive performance for a small sample size

    Science.gov (United States)

    Nomura, S.; Ogata, Y.

    2010-12-01

    This study is concerned with probability forecasting by the Brownian Passage Time (BPT) model, especially in the case where only a few records of recurrent earthquakes from an active fault are available. We adopt the Bayesian predictive distribution that takes the relevant prior information and all possibilities for the model parameters into account. We utilize the size of single-event displacements U and the slip rate V across the segment to calculate the mean recurrence time T=U/V, around which the past recurrence intervals are distributed (Figure 1). We then make use of the best-fitted prior distribution for the BPT variation coefficient (the shape parameter, α) selected by the Akaike Bayesian Information Criterion (ABIC), while the ERC uses the same common estimate α=0.24. Applying this prior distribution, we can see that α takes various values among the faults but shows some locational tendencies (Figure 2). For example, α values tend to be higher in the center of Honshu island, where the faults are densely populated. We compare the goodness of fit and probability forecasts between the conventional models and our proposed model using historical or simulated datasets. The Bayesian predictor shows very stable and superior performance for small samples or variable recurrence times. Figure 1: The relation between the mean recurrence time from slip data and the past recurrence intervals, with error bars. Figure 2: The map of active faults on land and in subduction zones in Japan, with colors showing the Bayes estimates of the variation coefficient α.
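
    The BPT renewal model coincides with an inverse Gaussian distribution, so a plug-in hazard forecast can be sketched with scipy; the mapping to scipy's parameterization (shape = α², scale = μ/α²) and the fault parameters below are assumptions for illustration, and the abstract's Bayesian predictive distribution would additionally average this probability over the posterior of (μ, α).

```python
from scipy.stats import invgauss

def bpt_conditional_prob(mu, alpha, elapsed, dt):
    """P(event in (elapsed, elapsed + dt] | no event by `elapsed`) under a BPT
    renewal model with mean recurrence mu and aperiodicity alpha, assuming the
    mapping to scipy's inverse Gaussian: shape = alpha**2, scale = mu / alpha**2."""
    dist = invgauss(alpha ** 2, scale=mu / alpha ** 2)
    return (dist.cdf(elapsed + dt) - dist.cdf(elapsed)) / dist.sf(elapsed)

if __name__ == "__main__":
    # Hypothetical fault: mean recurrence T = U/V = 1000 yr, alpha = 0.24,
    # 800 yr elapsed since the last event, 30-yr forecast window
    print(round(bpt_conditional_prob(mu=1000, alpha=0.24, elapsed=800, dt=30), 4))
```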

  17. SMALL SAMPLE SIZE IN 2X2 CROSS OVER DESIGNS: CONDITIONS OF DETERMINATION

    Directory of Open Access Journals (Sweden)

    B SOLEYMANI

    2001-09-01

    Introduction: Determination of a small sample size in some clinical trials is a matter of importance. In cross-over studies, which are one type of clinical trial, the matter is more significant. In this article, the conditions under which determination of a small sample size in cross-over studies is possible were considered, and the effect of deviation from normality on the matter is shown. Methods: The present study considers 2x2 cross-over studies in which the variable of interest is quantitative and measurable on a ratio or interval scale. The method of consideration is based on the use of the distributions of the variable and of the sample mean, the central limit theorem, the method of sample size determination in two groups, and the cumulant or moment generating function. Results: For normal variables, or variables transformable to normality, there are no restricting factors other than the significance level and the power of the test for determination of the sample size; but in the case of non-normal variables, the sample size should be determined large enough to guarantee the normality of the sample mean's distribution. Discussion: In cross-over studies in which, because of an existing theoretical basis, a small sample size can be computed, it should not be used without taking the applied worth of the results into consideration. While determining sample size, in addition to the variance, it is necessary to consider the distribution of the variable, particularly through its skewness and kurtosis coefficients; the greater the deviation from normality, the larger the required sample. Since in medical studies most continuous variables are close to normally distributed, a small number of samples often seems to be adequate for convergence of the sample mean to the normal distribution.

  18. Age differences in body size stereotyping in a sample of preschool girls.

    Science.gov (United States)

    Harriger, Jennifer A

    2015-01-01

    Researchers have demonstrated that societal concerns about dieting and body size have led to an increase in negative attitudes toward obese people and that girls as young as 3 years old endorse similar body size stereotypes as have been previously found with adults. Few studies, however, have examined age differences in their participants. A sample of 102 girls (3-5-years-old) completed measures of body size stereotyping. Results indicate that while body-size stereotyping is present by age 3, pro-thin beliefs may develop prior to anti-fat beliefs. Implications and future directions for research with preschool children are discussed.

  19. Norm Block Sample Sizes: A Review of 17 Individually Administered Intelligence Tests

    Science.gov (United States)

    Norfolk, Philip A.; Farmer, Ryan L.; Floyd, Randy G.; Woods, Isaac L.; Hawkins, Haley K.; Irby, Sarah M.

    2015-01-01

    The representativeness, recency, and size of norm samples strongly influence the accuracy of inferences drawn from their scores. Inadequate norm samples may lead to inflated or deflated scores for individuals and poorer prediction of developmental and academic outcomes. The purpose of this study was to apply Kranzler and Floyd's method for…

  20. Page sample size in web accessibility testing: how many pages is enough?

    NARCIS (Netherlands)

    Velleman, Eric; Geest, van der Thea

    2013-01-01

    Various countries and organizations use a different sampling approach and sample size of web pages in accessibility conformance tests. We are conducting a systematic analysis to determine how many pages is enough for testing whether a website is compliant with standard accessibility guidelines. This

  1. Assessing terpene content variability of whitebark pine in order to estimate representative sample size

    Directory of Open Access Journals (Sweden)

    Stefanović Milena

    2013-01-01

    Full Text Available In studies of population variability, particular attention has to be paid to the selection of a representative sample. The aim of this study was to assess the size of a new representative sample on the basis of the variability of the chemical content of the initial sample, using a whitebark pine population as an example. The statistical analysis included the content of 19 characteristics (terpene hydrocarbons and their derivatives) of the initial sample of 10 elements (trees). It was determined that the new sample should contain 20 trees so that the mean value calculated from it represents the underlying population with a probability higher than 95 %. Determining the lower limit of the representative sample size that guarantees satisfactory reliability of generalization proved to be very important for achieving cost efficiency of the research. [Project of the Ministry of Science of the Republic of Serbia, No. OI-173011, No. TR-37002 and No. III-43007]

  2. n4Studies: Sample Size Calculation for an Epidemiological Study on a Smart Device

    Directory of Open Access Journals (Sweden)

    Chetta Ngamjarus

    2016-05-01

    Full Text Available Objective: This study aimed to develop a sample size application (called “n4Studies”) for free use on iPhone and Android devices and to compare its sample size functions with those of other applications and software. Methods: The Objective-C programming language was used to create the application for the iPhone OS (operating system), while JavaScript, jQuery Mobile, PhoneGap and jStat were used to develop it for Android phones. Other sample size applications were searched for in the Apple App Store and Google Play Store, and their characteristics and sample size functions were collected. Spearman's rank correlation was used to investigate the relationship between the number of sample size functions and price. Results: “n4Studies” provides several functions for sample size and power calculations for various epidemiological study designs. It can be downloaded from the Apple App Store and Google Play Store. Compared with other applications, n4Studies covers more types of epidemiological study designs and gives results similar to GRANMO for estimating an infinite/finite population mean or proportion, to BioStats for comparing two independent means, and to the EpiCal application for comparing two independent proportions. When using the same parameters, n4Studies gives results similar to STATA, the epicalc package in R, PS, G*Power, and OpenEpi. Conclusion: “n4Studies” can be an alternative tool for calculating the sample size. It may be useful to students, lecturers and researchers in conducting their research projects.
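
    For reference, one of the textbook calculations such apps implement — estimating a population proportion with and without a finite population correction — can be sketched as follows. This is an illustrative re-implementation, not the n4Studies source, and the inputs are made-up values.

    ```python
    # Sketch: sample size for estimating a proportion, infinite and finite population.
    from math import ceil
    from scipy.stats import norm

    def n_infinite_proportion(p, d, alpha=0.05):
        z = norm.ppf(1 - alpha / 2)
        return ceil(z**2 * p * (1 - p) / d**2)

    def n_finite_proportion(p, d, N, alpha=0.05):
        n0 = n_infinite_proportion(p, d, alpha)
        return ceil(n0 / (1 + (n0 - 1) / N))      # finite population correction

    print(n_infinite_proportion(p=0.30, d=0.05))        # 323
    print(n_finite_proportion(p=0.30, d=0.05, N=2000))  # 279
    ```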

  3. Size and complexity in model financial systems.

    Science.gov (United States)

    Arinaminpathy, Nimalan; Kapadia, Sujit; May, Robert M

    2012-11-06

    The global financial crisis has precipitated an increasing appreciation of the need for a systemic perspective toward financial stability. For example: What role do large banks play in systemic risk? How should capital adequacy standards recognize this role? How is stability shaped by concentration and diversification in the financial system? We explore these questions using a deliberately simplified, dynamic model of a banking system that combines three different channels for direct transmission of contagion from one bank to another: liquidity hoarding, asset price contagion, and the propagation of defaults via counterparty credit risk. Importantly, we also introduce a mechanism for capturing how swings in "confidence" in the system may contribute to instability. Our results highlight that the importance of relatively large, well-connected banks in system stability scales more than proportionately with their size: the impact of their collapse arises not only from their connectivity, but also from their effect on confidence in the system. Imposing tougher capital requirements on larger banks than smaller ones can thus enhance the resilience of the system. Moreover, these effects are more pronounced in more concentrated systems, and continue to apply, even when allowing for potential diversification benefits that may be realized by larger banks. We discuss some tentative implications for policy, as well as conceptual analogies in ecosystem stability and in the control of infectious diseases.

  4. Size and complexity in model financial systems

    Science.gov (United States)

    Arinaminpathy, Nimalan; Kapadia, Sujit; May, Robert M.

    2012-01-01

    The global financial crisis has precipitated an increasing appreciation of the need for a systemic perspective toward financial stability. For example: What role do large banks play in systemic risk? How should capital adequacy standards recognize this role? How is stability shaped by concentration and diversification in the financial system? We explore these questions using a deliberately simplified, dynamic model of a banking system that combines three different channels for direct transmission of contagion from one bank to another: liquidity hoarding, asset price contagion, and the propagation of defaults via counterparty credit risk. Importantly, we also introduce a mechanism for capturing how swings in “confidence” in the system may contribute to instability. Our results highlight that the importance of relatively large, well-connected banks in system stability scales more than proportionately with their size: the impact of their collapse arises not only from their connectivity, but also from their effect on confidence in the system. Imposing tougher capital requirements on larger banks than smaller ones can thus enhance the resilience of the system. Moreover, these effects are more pronounced in more concentrated systems, and continue to apply, even when allowing for potential diversification benefits that may be realized by larger banks. We discuss some tentative implications for policy, as well as conceptual analogies in ecosystem stability and in the control of infectious diseases. PMID:23091020

  5. Operating characteristics of sample size re-estimation with futility stopping based on conditional power.

    Science.gov (United States)

    Lachin, John M

    2006-10-15

    Various methods have been described for re-estimating the final sample size in a clinical trial based on an interim assessment of the treatment effect. Many re-weight the observations after re-sizing so as to control the pursuant inflation in the type I error probability alpha. Lan and Trost (Estimation of parameters and sample size re-estimation. Proceedings of the American Statistical Association Biopharmaceutical Section 1997; 48-51) proposed a simple procedure based on conditional power calculated under the current trend in the data (CPT). The study is terminated for futility if CPT falls at or below a lower bound CL, continued unchanged if CPT is at or above an upper bound CU, or re-sized by a factor m to yield CPT = CU if CPT lies between CL and CU. The type I error probability forgone by stopping for futility can balance the inflation due to sample size re-estimation, thus permitting any form of final analysis with no re-weighting. Herein the statistical properties of this approach are described, including an evaluation of the probabilities of stopping for futility or re-sizing, the distribution of the re-sizing factor m, and the unconditional type I and II error probabilities alpha and beta. Since futility stopping does not allow a type I error but commits a type II error, then as the probability of stopping for futility increases, alpha decreases and beta increases. An iterative procedure is described for choosing the critical test value and the futility stopping boundary so as to ensure that the specified alpha and beta are obtained. However, inflation in beta is controlled by reducing the probability of futility stopping, which in turn dramatically increases the possible re-sizing factor m. The procedure is also generalized to limit the maximum sample size inflation factor, such as at m_max = 4. However, doing so then allows a non-trivial fraction of studies to be re-sized at this level while still having low conditional power. These properties also apply to other methods for sample size re-estimation with a provision for stopping for futility. Sample size re-estimation procedures should be used with caution.
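
    A generic sketch of conditional power computed under the current trend — the quantity CPT that drives the stopping and re-sizing rules above — is given below. It is a textbook-style approximation using the Brownian-motion formulation, not Lachin's exact operating-characteristic calculations, and the interim values are invented.

    ```python
    # Sketch: conditional power under the current trend at information fraction t.
    from scipy.stats import norm

    def conditional_power(z_t, t, alpha=0.025):
        """P(final one-sided test rejects | interim data), drift set to the trend."""
        theta_hat = z_t / t**0.5                  # drift estimated from the data
        b_t = z_t * t**0.5                        # Brownian-motion value B(t)
        mean_final = b_t + theta_hat * (1 - t)    # E[B(1) | B(t)] under the trend
        sd_final = (1 - t) ** 0.5
        return 1 - norm.cdf((norm.ppf(1 - alpha) - mean_final) / sd_final)

    print(conditional_power(z_t=1.2, t=0.5))      # CPT for a mid-trial look
    ```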

  6. Effect of dislocation pile-up on size-dependent yield strength in finite single-crystal micro-samples

    Energy Technology Data Exchange (ETDEWEB)

    Pan, Bo; Shibutani, Yoji, E-mail: sibutani@mech.eng.osaka-u.ac.jp [Department of Mechanical Engineering, Osaka University, Suita 565-0871 (Japan); Zhang, Xu [State Key Laboratory for Strength and Vibration of Mechanical Structures, School of Aerospace, Xi'an Jiaotong University, Xi'an 710049 (China); School of Mechanics and Engineering Science, Zhengzhou University, Zhengzhou 450001 (China); Shang, Fulin [State Key Laboratory for Strength and Vibration of Mechanical Structures, School of Aerospace, Xi'an Jiaotong University, Xi'an 710049 (China)

    2015-07-07

    Recent research has explained that the steeply increasing yield strength in metals depends on decreasing sample size. In this work, we derive a statistical physical model of the yield strength of finite single-crystal micro-pillars that depends on single-ended dislocation pile-up inside the micro-pillars. We show that this size effect can be explained almost completely by considering the stochastic lengths of the dislocation source and the dislocation pile-up length in the single-crystal micro-pillars. The Hall–Petch-type relation holds even in a microscale single-crystal, which is characterized by its dislocation source lengths. Our quantitative conclusions suggest that the number of dislocation sources and pile-ups are significant factors for the size effect. They also indicate that starvation of dislocation sources is another reason for the size effect. Moreover, we investigated the explicit relationship between the stacking fault energy and the dislocation “pile-up” effect inside the sample: materials with low stacking fault energy exhibit an obvious dislocation pile-up effect. Our proposed physical model predicts a sample strength that agrees well with experimental data, and our model can give a more precise prediction than the current single arm source model, especially for materials with low stacking fault energy.

  7. Power and sample size calculations for Mendelian randomization studies using one genetic instrument.

    Science.gov (United States)

    Freeman, Guy; Cowling, Benjamin J; Schooling, C Mary

    2013-08-01

    Mendelian randomization, which is instrumental variable analysis using genetic variants as instruments, is an increasingly popular method of making causal inferences from observational studies. In order to design efficient Mendelian randomization studies, it is essential to calculate the sample sizes required. We present formulas for calculating the power of a Mendelian randomization study using one genetic instrument to detect an effect of a given size, and the minimum sample size required to detect effects for given levels of significance and power, using asymptotic statistical theory. We apply the formulas to some example data and compare the results with those from simulation methods. Power and sample size calculations using these formulas should be more straightforward to carry out than simulation approaches. These formulas make explicit that the sample size needed for Mendelian randomization study is inversely proportional to the square of the correlation between the genetic instrument and the exposure and proportional to the residual variance of the outcome after removing the effect of the exposure, as well as inversely proportional to the square of the effect size.
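
    As a rough illustration of the proportionality stated above, the following sketch computes a sample size from a normal approximation. It is not claimed to reproduce the authors' exact formulas, and all parameter values (causal effect, instrument strength, variances) are hypothetical.

    ```python
    # Sketch: MR sample size consistent with the stated proportionality
    # n ~ residual variance of Y / (effect^2 * rho^2 between instrument and exposure).
    from math import ceil
    from scipy.stats import norm

    def mr_sample_size(beta_xy, rho2_gx, var_y_given_x, var_x=1.0,
                       alpha=0.05, power=0.80):
        """Approximate sample size for one-instrument Mendelian randomization."""
        z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
        return ceil(z**2 * var_y_given_x / (beta_xy**2 * rho2_gx * var_x))

    # Weak instrument (rho^2 = 0.02) and a modest causal effect: n is large.
    print(mr_sample_size(beta_xy=0.2, rho2_gx=0.02, var_y_given_x=1.0))  # ~9,800
    ```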

  8. Size Evolution and Stochastic Models: Explaining Ostracod Size through Probabilistic Distributions

    Science.gov (United States)

    Krawczyk, M.; Decker, S.; Heim, N. A.; Payne, J.

    2014-12-01

    The biovolume of animals has functioned as an important benchmark for measuring evolution throughout geologic time. In our project, we examined the observed average body size of ostracods over time in order to understand the mechanism of size evolution in these marine organisms. The body size of ostracods has varied since the beginning of the Ordovician, when the first true ostracods appeared. We created a stochastic branching model to generate possible evolutionary trees of ostracod size. Using stratigraphic ranges for ostracods compiled from over 750 genera in the Treatise on Invertebrate Paleontology, we calculated overall speciation and extinction rates for our model. At each timestep in our model, new lineages can evolve or existing lineages can become extinct. Newly evolved lineages are assigned sizes based on their parent genera. We parameterized our model to generate neutral and directional changes in ostracod size to compare with the observed data. New sizes were chosen via a normal distribution: the neutral model selected new size differentials centered on zero, allowing an equal chance of larger or smaller ostracods at each speciation, whereas the directional model centered the distribution on a negative value, giving a larger chance of smaller ostracods. Our data strongly suggest that ostracod evolution has followed a model that directionally pushes mean ostracod size down rather than a neutral model. Our model was able to match the magnitude of the size decrease, but it produced a constant linear decrease, whereas the actual data show a much more rapid initial decrease followed by a constant size. The nuance of the observed trends ultimately suggests a more complex mechanism of size evolution. In conclusion, probabilistic methods can provide valuable insight into possible evolutionary mechanisms determining size evolution in ostracods.
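
    A minimal sketch of this kind of stochastic branching model is given below; the speciation/extinction probabilities, step size, and drift are placeholders rather than the rates fitted from the Treatise data.

    ```python
    # Sketch: branching model of lineage sizes with neutral or directional steps.
    import random

    def simulate(n_steps=400, p_speciate=0.05, p_extinct=0.04,
                 drift=-0.01, sd=0.05, seed=1):
        random.seed(seed)
        sizes = [0.0]                       # log size of each living lineage
        means = []
        for _ in range(n_steps):
            nxt = []
            for s in sizes:
                if random.random() < p_extinct:
                    continue                # lineage goes extinct
                nxt.append(s)
                if random.random() < p_speciate:
                    # daughter size drawn around the parent; drift < 0 gives the
                    # directional (size-decreasing) model, drift = 0 the neutral one
                    nxt.append(s + random.gauss(drift, sd))
            sizes = nxt or [0.0]            # re-seed if the clade dies out
            means.append(sum(sizes) / len(sizes))
        return means

    trend = simulate()
    print(trend[0], trend[-1])              # mean size drifts downward over time
    ```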

  9. A simple nomogram for sample size for estimating sensitivity and specificity of medical tests

    Directory of Open Access Journals (Sweden)

    Malhotra Rajeev

    2010-01-01

    Full Text Available Sensitivity and specificity measure the inherent validity of a diagnostic test against a gold standard. Researchers develop new diagnostic methods to reduce cost, risk, invasiveness, and time. An adequate sample size is a must to precisely estimate the validity of a diagnostic test. In practice, researchers generally decide on the sample size arbitrarily, either at their convenience or from previous literature. We have devised a simple nomogram that yields a statistically valid sample size for an anticipated sensitivity or specificity. MS Excel version 2007 was used to derive the values required to plot the nomogram, using varying absolute precision, known prevalence of disease, and a 95% confidence level, from the formula already available in the literature. The nomogram plot was obtained by suitably arranging the lines and distances to conform to this formula. The nomogram can easily be used to determine the sample size for estimating the sensitivity or specificity of a diagnostic test with the required precision at a 95% confidence level. Sample sizes at the 90% and 99% confidence levels can also be obtained by multiplying the number obtained for the 95% confidence level by 0.70 and 1.75, respectively. A nomogram instantly provides the required number of subjects by just moving a ruler and can be used repeatedly without redoing the calculations; it can also be applied for reverse calculations. This nomogram is not applicable to hypothesis-testing set-ups and applies only when both the diagnostic test and the gold standard give dichotomous results.
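
    The standard formula that such nomograms are built on (a Buderer-type calculation) can be sketched as follows; the anticipated sensitivity, specificity, precision, and prevalence below are illustrative values only.

    ```python
    # Sketch: sample size for estimating sensitivity or specificity with given precision.
    from math import ceil
    from scipy.stats import norm

    def n_for_sensitivity(sens, precision, prevalence, alpha=0.05):
        z = norm.ppf(1 - alpha / 2)
        return ceil(z**2 * sens * (1 - sens) / (precision**2 * prevalence))

    def n_for_specificity(spec, precision, prevalence, alpha=0.05):
        z = norm.ppf(1 - alpha / 2)
        return ceil(z**2 * spec * (1 - spec) / (precision**2 * (1 - prevalence)))

    print(n_for_sensitivity(sens=0.90, precision=0.05, prevalence=0.20))   # 692
    print(n_for_specificity(spec=0.85, precision=0.05, prevalence=0.20))   # 245
    ```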

  10. Demonstration of Multi- and Single-Reader Sample Size Program for Diagnostic Studies software.

    Science.gov (United States)

    Hillis, Stephen L; Schartz, Kevin M

    2015-02-01

    The recently released software Multi- and Single-Reader Sample Size Program for Diagnostic Studies, written by Kevin Schartz and Stephen Hillis, performs sample size computations for diagnostic reader-performance studies. The program computes the sample size needed to detect a specified difference in a reader performance measure between two modalities when using the analysis methods initially proposed by Dorfman, Berbaum, and Metz (DBM) and Obuchowski and Rockette (OR), and later unified and improved by Hillis and colleagues. A commonly used reader performance measure is the area under the receiver-operating-characteristic curve. The program can be used with typical reader-performance measures, which can be estimated parametrically or nonparametrically. The program has an easy-to-use, step-by-step, intuitive interface that walks the user through the entry of the needed information. Features of the software include the following: (1) choice of several study designs; (2) choice of inputs obtained from either OR or DBM analyses; (3) choice of three different inference situations: both readers and cases random, readers fixed and cases random, and readers random and cases fixed; (4) choice of two types of hypotheses: equivalence or noninferiority; (6) choice of two output formats: power for specified case and reader sample sizes, or a listing of case-reader combinations that provide a specified power; (7) choice of single or multi-reader analyses; and (8) functionality in Windows, Mac OS, and Linux.

  11. An approximate approach to sample size determination in bioequivalence testing with multiple pharmacokinetic responses.

    Science.gov (United States)

    Tsai, Chen-An; Huang, Chih-Yang; Liu, Jen-Pei

    2014-08-30

    The approval of generic drugs requires evidence of average bioequivalence (ABE) for both the area under the concentration–time curve and the peak concentration Cmax. The bioequivalence (BE) hypothesis can be decomposed into non-inferiority (NI) and non-superiority (NS) hypotheses. Most regulatory agencies employ the two one-sided tests (TOST) procedure to test ABE between two formulations. Because it is based on the intersection–union principle, the TOST procedure is conservative in terms of the type I error rate, while the type II error rate is the sum of the type II error rates for the NI and NS null hypotheses. When the difference in population means between two treatments is not 0, no closed-form solution for the sample size for the BE hypothesis is available. Current methods provide sample sizes with either insufficient or unnecessarily excessive power. We suggest an approximate method for sample size determination, which also provides the type II error rate for each of the NI and NS hypotheses. In addition, the proposed method extends readily from one pharmacokinetic (PK) response to determining the sample size required for multiple PK responses. We report the results of a numerical study. An R code is provided to calculate the sample size for BE testing based on the proposed methods.

  12. The impact of particle size selective sampling methods on occupational assessment of airborne beryllium particulates.

    Science.gov (United States)

    Sleeth, Darrah K

    2013-05-01

    In 2010, the American Conference of Governmental Industrial Hygienists (ACGIH) formally changed its Threshold Limit Value (TLV) for beryllium from a 'total' particulate sample to an inhalable particulate sample. This change may have important implications for workplace air sampling of beryllium. A history of particle size-selective sampling methods, with a special focus on beryllium, will be provided. The current state of the science on inhalable sampling will also be presented, including a look to the future at what new methods or technology may be on the horizon. This includes new sampling criteria focused on particle deposition in the lung, proposed changes to the existing inhalable convention, as well as how the issues facing beryllium sampling may help drive other changes in sampling technology.

  13. Analysis of hysteretic spin transition and size effect in 3D spin crossover compounds investigated by Monte Carlo Entropic sampling technique in the framework of the Ising-type model

    National Research Council Canada - National Science Library

    Chiruta, Daniel; Linares, J; Dahoo, Pierre-Richard; Dimian, Mihai

    2015-01-01

    .... In this contribution we solve the corresponding Hamiltonian for a three-dimensional SCO system taking into account short-range and long-range interaction using a biased Monte Carlo entropic sampling...

  14. Ports: Definition and study of types, sizes and business models

    Directory of Open Access Journals (Sweden)

    Ivan Roa

    2013-09-01

    Full Text Available Purpose: In the world today there are thousands of port facilities of different types and sizes competing to capture a share of the seaborne freight market. This article aims to determine the most common port type and size, in order to find out which business model is applied in that segment and what the legal status of the companies managing such infrastructure is. Design/methodology/approach: To achieve this goal, we carried out research on a representative sample of 800 ports worldwide, which handle 90% of the world's containerized port cargo, and then determined the legal status of the companies that manage them. Findings: The results indicate a dominant port type and size, mostly managed by companies operating under a concession model. Research limitations/implications: This research studies only ports that handle freight (basically containerized), ignoring other activities such as fishing, military, tourism or recreation. Originality/value: The investigation shows that the vast majority of port facilities in the studied segment are governed by a similar corporate model and subject to pressure from the markets, which increasingly demand efficiency and service. Consequently, terminals tend to be concessioned to private operators in a process that might be called privatization, although in the strictest sense of the term this is not entirely accurate, because ownership of the land never ceases to be public.

  15. Information-based sample size re-estimation in group sequential design for longitudinal trials.

    Science.gov (United States)

    Zhou, Jing; Adewale, Adeniyi; Shentu, Yue; Liu, Jiajun; Anderson, Keaven

    2014-09-28

    Group sequential design has become more popular in clinical trials because it allows for trials to stop early for futility or efficacy to save time and resources. However, this approach is less well-known for longitudinal analysis. We have observed repeated cases of studies with longitudinal data where there is an interest in early stopping for a lack of treatment effect or in adapting sample size to correct for inappropriate variance assumptions. We propose an information-based group sequential design as a method to deal with both of these issues. Updating the sample size at each interim analysis makes it possible to maintain the target power while controlling the type I error rate. We will illustrate our strategy with examples and simulations and compare the results with those obtained using fixed design and group sequential design without sample size re-estimation.
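
    The core idea — fix the target statistical information and let the sample size track the interim variance estimate — can be sketched as follows for a two-arm comparison of means. This is a generic illustration, not the authors' algorithm, and the variance values are invented.

    ```python
    # Sketch: information-based sample size re-estimation for a two-arm mean comparison.
    from math import ceil
    from scipy.stats import norm

    def target_information(delta, alpha=0.05, power=0.90):
        """Statistical information needed to detect a mean difference `delta`."""
        return ((norm.ppf(1 - alpha / 2) + norm.ppf(power)) / delta) ** 2

    def n_per_arm(delta, sigma, alpha=0.05, power=0.90):
        """Two-arm comparison of means: information at n per arm is n / (2 sigma^2)."""
        return ceil(2 * sigma**2 * target_information(delta, alpha, power))

    planned = n_per_arm(delta=1.0, sigma=2.0)   # design-stage variance assumption
    updated = n_per_arm(delta=1.0, sigma=2.6)   # larger interim variance estimate
    print(planned, updated)                     # the sample size scales with sigma^2
    ```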

  16. A comparison of methods for sample size estimation for non-inferiority studies with binary outcomes.

    Science.gov (United States)

    Julious, Steven A; Owen, Roger J

    2011-12-01

    Non-inferiority trials are motivated in the context of clinical research where a proven active treatment exists and placebo-controlled trials are no longer acceptable for ethical reasons. Instead, active-controlled trials are conducted where a treatment is compared to an established treatment with the objective of demonstrating that it is non-inferior to this treatment. We review and compare the methodologies for calculating sample sizes and suggest appropriate methods to use. We demonstrate how the simplest method of using the anticipated response is predominantly consistent with simulations. In the context of trials with binary outcomes with expected high proportions of positive responses, we show how the sample size is quite sensitive to assumptions about the control response. We recommend when designing such a study that sensitivity analyses be performed with respect to the underlying assumptions and that the Bayesian methods described in this article be adopted to assess sample size.
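
    A minimal sketch of a commonly used normal-approximation sample-size formula for a binary non-inferiority comparison is given below. It is not necessarily one of the specific methods compared in the paper, and the example simply illustrates the sensitivity to the assumed control response noted above.

    ```python
    # Sketch: non-inferiority sample size per arm for binary outcomes (normal approx.).
    from math import ceil
    from scipy.stats import norm

    def n_per_arm(p_control, p_test, margin, alpha=0.025, power=0.90):
        """One-sided non-inferiority test of proportions, margin > 0."""
        z = norm.ppf(1 - alpha) + norm.ppf(power)
        var = p_test * (1 - p_test) + p_control * (1 - p_control)
        return ceil(z**2 * var / (p_test - p_control + margin) ** 2)

    # High response rates: the answer is sensitive to the assumed control response.
    print(n_per_arm(p_control=0.90, p_test=0.90, margin=0.10))  # 190
    print(n_per_arm(p_control=0.85, p_test=0.85, margin=0.10))  # 268
    ```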

  17. Modelling the effect of size-asymmetric competition on size inequality

    DEFF Research Database (Denmark)

    Rasmussen, Camilla Ruø; Weiner, Jacob

    2017-01-01

    Abstract The concept of size asymmetry in resource competition among plants, in which larger individuals obtain a disproportionate share of contested resources, appears to be very straightforward, but the effects of size asymmetry on growth and size variation among individuals have proved … to be controversial. It has often been assumed that competition among individual plants in a population has to be size-asymmetric to result in higher size inequality than in the absence of competition, but here we question this inference. Using very simple, individual-based models, we investigate how size symmetry … irrespective of their sizes, can, under some assumptions, result in higher size inequality than when competition is absent. We demonstrate our approach by applying it to data from a greenhouse experiment investigating the size symmetry of belowground competition between pairs of Triticum aestivum (wheat…

  18. Sample size requirements for indirect association studies of gene-environment interactions (G x E).

    Science.gov (United States)

    Hein, Rebecca; Beckmann, Lars; Chang-Claude, Jenny

    2008-04-01

    Association studies accounting for gene-environment interactions (G x E) may be useful for detecting genetic effects. Although current technology enables very dense marker spacing in genetic association studies, the true disease variants may not be genotyped. Thus, causal genes are searched for by indirect association using genetic markers in linkage disequilibrium (LD) with the true disease variants. Sample sizes needed to detect G x E effects in indirect case-control association studies depend on the true genetic main effects, disease allele frequencies, whether marker and disease allele frequencies match, LD between loci, main effects and prevalence of environmental exposures, and the magnitude of interactions. We explored variables influencing sample sizes needed to detect G x E, compared these sample sizes with those required to detect genetic marginal effects, and provide an algorithm for power and sample size estimations. Required sample sizes may be heavily inflated if LD between marker and disease loci decreases. More than 10,000 case-control pairs may be required to detect G x E. However, given weak true genetic main effects, moderate prevalence of environmental exposures, as well as strong interactions, G x E effects may be detected with smaller sample sizes than those needed for the detection of genetic marginal effects. Moreover, in this scenario, rare disease variants may only be detectable when G x E is included in the analyses. Thus, the analysis of G x E appears to be an attractive option for the detection of weak genetic main effects of rare variants that may not be detectable in the analysis of genetic marginal effects only.

  19. Estimation of grain size in asphalt samples using digital image analysis

    Science.gov (United States)

    Källén, Hanna; Heyden, Anders; Lindh, Per

    2014-09-01

    Asphalt is made of a mixture of stones of different sizes and a binder called bitumen; the size distribution of the stones is determined by the recipe of the asphalt. One quality check of asphalt is to see whether the actual size distribution of asphalt samples is consistent with the recipe. This is usually done by first extracting the binder using methylene chloride and then sieving the stones to see how much passes each sieve size. Methylene chloride is highly toxic, and it is desirable to find the size distribution in some other way. In this paper we find the size distribution by slicing up the asphalt sample and using image analysis techniques to analyze the cross-sections. First the stones are segmented from the background (bitumen), and then rectangles are fitted to the detected stones. We then estimate the sizes of the stones using the width of the rectangle. The result is compared with both the recipe for the asphalt and the result from the standard analysis method, and our method shows good correlation with both.
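
    A sketch of such an image-analysis pipeline, assuming OpenCV is available, is shown below. The file name is hypothetical and the thresholding and filtering choices are illustrative, not the authors' exact processing.

    ```python
    # Sketch: segment stones from the bitumen background, fit rotated rectangles,
    # and take the rectangle width as the size estimate for each stone.
    import cv2
    import numpy as np

    def stone_widths(image_path, min_area_px=50):
        gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
        _, mask = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_SIMPLE)
        widths = []
        for c in contours:
            if cv2.contourArea(c) < min_area_px:
                continue                       # skip noise and tiny fragments
            (_, _), (w, h), _ = cv2.minAreaRect(c)
            widths.append(min(w, h))           # width of the fitted rectangle
        return np.array(widths)

    # widths = stone_widths("asphalt_slice.png")   # hypothetical file name
    # print(np.percentile(widths, [10, 50, 90]))
    ```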

  20. The impact of metrology study sample size on uncertainty in IAEA safeguards calculations

    Directory of Open Access Journals (Sweden)

    Burr Tom

    2016-01-01

    Full Text Available Quantitative conclusions by the International Atomic Energy Agency (IAEA) regarding States' nuclear material inventories and flows are provided in the form of material balance evaluations (MBEs). MBEs use facility estimates of the material unaccounted for together with verification data to monitor for possible nuclear material diversion. Verification data consist of paired measurements (usually operators' declarations and inspectors' verification results) that are analysed one-item-at-a-time to detect significant differences. Also, to check for patterns, an overall difference of the operator-inspector values based on a “D” (difference) statistic is used. The estimated detection probability (DP) and false alarm probability (FAP) depend on the assumed measurement error model and its random and systematic error variances, which are estimated using data from previous inspections (used in metrology studies to characterize measurement error variance components). Therefore, the sample sizes in both the previous and current inspections will impact the estimated DP and FAP, as is illustrated by simulated numerical examples. The examples include the application of a new expression for the variance of the D statistic assuming a multiplicative measurement error model, and a new application of both random and systematic error variances in one-item-at-a-time testing.

  1. Impact of multicollinearity on small sample hydrologic regression models

    Science.gov (United States)

    Kroll, Charles N.; Song, Peter

    2013-06-01

    Often hydrologic regression models are developed with ordinary least squares (OLS) procedures. The use of OLS with highly correlated explanatory variables produces multicollinearity, which creates highly sensitive parameter estimators with inflated variances and improper model selection. It is not clear how to best address multicollinearity in hydrologic regression models. Here a Monte Carlo simulation is developed to compare four techniques to address multicollinearity: OLS, OLS with variance inflation factor screening (VIF), principal component regression (PCR), and partial least squares regression (PLS). The performance of these four techniques was observed for varying sample sizes, correlation coefficients between the explanatory variables, and model error variances consistent with hydrologic regional regression models. The negative effects of multicollinearity are magnified at smaller sample sizes, higher correlations between the variables, and larger model error variances (smaller R2). The Monte Carlo simulation indicates that if the true model is known, multicollinearity is present, and the estimation and statistical testing of regression parameters are of interest, then PCR or PLS should be employed. If the model is unknown, or if the interest is solely in model predictions, it is recommended that OLS be employed, since using more complicated techniques did not produce any improvement in model performance. A leave-one-out cross-validation case study was also performed using low-streamflow data sets from the eastern United States. Results indicate that OLS with stepwise selection generally produces models across study regions with varying levels of multicollinearity that are as good as biased regression techniques such as PCR and PLS.
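
    For reference, the VIF screening statistic used in the second strategy can be computed with plain NumPy as follows; this is a generic sketch with a simulated design matrix, not the study's hydrologic data.

    ```python
    # Sketch: variance inflation factors, VIF_j = 1 / (1 - R_j^2),
    # where R_j^2 comes from regressing column j on the remaining columns.
    import numpy as np

    def vif(X):
        out = []
        for j in range(X.shape[1]):
            y = X[:, j]
            Z = np.delete(X, j, axis=1)
            Z = np.column_stack([np.ones(len(Z)), Z])     # add intercept
            beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
            resid = y - Z @ beta
            r2 = 1 - resid.var() / y.var()
            out.append(1.0 / (1.0 - r2))
        return out

    rng = np.random.default_rng(0)
    x1 = rng.normal(size=50)
    x2 = x1 + rng.normal(scale=0.3, size=50)              # highly correlated with x1
    x3 = rng.normal(size=50)
    print(vif(np.column_stack([x1, x2, x3])))             # large VIFs for x1 and x2
    ```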

  2. Sample size and sampling errors as the source of dispersion in chemical analyses. [for high-Ti lunar basalt

    Science.gov (United States)

    Clanton, U. S.; Fletcher, C. R.

    1976-01-01

    The paper describes a Monte Carlo model for simulation of two-dimensional representations of thin sections of some of the more common igneous rock textures. These representations are extrapolated to three dimensions to develop a volume of 'rock'. The model (here applied to a medium-grained high-Ti basalt) can be used to determine a statistically significant sample for a lunar rock or to predict the probable errors in the oxide contents that can occur during the analysis of a sample that is not representative of the parent rock.

  3. Estimating sample size for landscape-scale mark-recapture studies of North American migratory tree bats

    Science.gov (United States)

    Ellison, Laura E.; Lukacs, Paul M.

    2014-01-01

    Concern for migratory tree-roosting bats in North America has grown because of possible population declines from wind energy development. This concern has driven interest in estimating population-level changes. Mark-recapture methodology is one possible analytical framework for assessing bat population changes, but the sample size requirements to produce reliable estimates have not been established. To illustrate the sample sizes necessary for a mark-recapture-based monitoring program, we conducted power analyses using a statistical model that allows reencounters of live and dead marked individuals. We ran 1,000 simulations for each of five broad sample size categories in a Burnham joint model, and then compared the proportion of simulations in which 95% confidence intervals overlapped between and among years for a 4-year study. Additionally, we conducted sensitivity analyses of sample size to various capture probabilities and recovery probabilities. More than 50,000 individuals per year would need to be captured and released to accurately determine 10% and 15% declines in annual survival. To detect more dramatic declines of 33% or 50% in survival over four years, sample sizes of 25,000 or 10,000 per year, respectively, would be sufficient. Sensitivity analyses reveal that increasing recovery of dead marked individuals may be more valuable than increasing capture probability of marked individuals. Because of the extraordinary effort that would be required, we advise caution if such a mark-recapture effort is initiated, given the difficulty of attaining reliable estimates. We make recommendations for which techniques show the most promise for mark-recapture studies of bats, because some techniques violate the assumptions of mark-recapture methodology when used to mark bats.

  4. Building predictive models of soil particle-size distribution

    Directory of Open Access Journals (Sweden)

    Alessandro Samuel-Rosa

    2013-04-01

    Full Text Available Is it possible to build predictive models (PMs) of soil particle-size distribution (psd) in a region with complex geology and a young and unstable land surface? The main objective of this study was to answer this question. A set of 339 soil samples from a small slope catchment in Southern Brazil was used to build PMs of psd in the surface soil layer. Multiple linear regression models were constructed using terrain attributes (elevation, slope, catchment area, convergence index, and topographic wetness index). The PMs explained more than half of the data variance. This performance is similar to (or even better than) that of the conventional soil mapping approach. For some size fractions, the PM performance can reach 70 %. The largest uncertainties were observed in geologically more complex areas. Therefore, significant improvements in the predictions can only be achieved if accurate geological data are made available. Meanwhile, PMs built on terrain attributes are efficient in predicting the particle-size distribution (psd) of soils in regions of complex geology.

  5. Estimating design effect and calculating sample size for respondent-driven sampling studies of injection drug users in the United States.

    Science.gov (United States)

    Wejnert, Cyprian; Pham, Huong; Krishna, Nevin; Le, Binh; DiNenno, Elizabeth

    2012-05-01

    Respondent-driven sampling (RDS) has become increasingly popular for sampling hidden populations, including injecting drug users (IDU). However, RDS data are unique and require specialized analysis techniques, many of which remain underdeveloped. RDS sample size estimation requires knowing design effect (DE), which can only be calculated post hoc. Few studies have analyzed RDS DE using real world empirical data. We analyze estimated DE from 43 samples of IDU collected using a standardized protocol. We find the previous recommendation that sample size be at least doubled, consistent with DE = 2, underestimates true DE and recommend researchers use DE = 4 as an alternate estimate when calculating sample size. A formula for calculating sample size for RDS studies among IDU is presented. Researchers faced with limited resources may wish to accept slightly higher standard errors to keep sample size requirements low. Our results highlight dangers of ignoring sampling design in analysis.
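
    In the spirit of the recommendation above, the resulting calculation can be sketched as computing a simple-random-sampling size and inflating it by a design effect of 4; the proportion and precision below are generic illustrations, not values taken from the paper.

    ```python
    # Sketch: RDS sample size = design effect x simple-random-sample size.
    from math import ceil
    from scipy.stats import norm

    def rds_sample_size(p, d, design_effect=4.0, alpha=0.05):
        z = norm.ppf(1 - alpha / 2)
        n_srs = z**2 * p * (1 - p) / d**2        # simple random sample size
        return ceil(design_effect * n_srs)

    print(rds_sample_size(p=0.30, d=0.05))       # 4 x ~323 = 1291
    ```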

  6. Data size reduction strategy for the classification of breath and air samples using multicapillary column-ion mobility spectrometry.

    Science.gov (United States)

    Szymańska, Ewa; Brodrick, Emma; Williams, Mark; Davies, Antony N; van Manen, Henk-Jan; Buydens, Lutgarde M C

    2015-01-20

    Ion mobility spectrometry combined with multicapillary column separation (MCC-IMS) is a well-known technology for detecting volatile organic compounds (VOCs) in gaseous samples. Due to their large data size, processing of MCC-IMS spectra is still the main bottleneck of data analysis, and there is an increasing need for data analysis strategies in which the size of MCC-IMS data is reduced to enable further analysis. In our study, the first untargeted chemometric strategy is developed and employed in the analysis of MCC-IMS spectra from 264 breath and ambient air samples. This strategy does not comprise identification of compounds as a primary step but includes several preprocessing steps and a discriminant analysis. Data size is significantly reduced in three steps. Wavelet transform, mask construction, and sparse-partial least squares-discriminant analysis (s-PLS-DA) allow data size reduction with down to 50 variables relevant to the goal of analysis. The influence and compatibility of the data reduction tools are studied by applying different settings of the developed strategy. Loss of information after preprocessing is evaluated, e.g., by comparing the performance of classification models for different classes of samples. Finally, the interpretability of the classification models is evaluated, and regions of spectra that are related to the identification of potential analytical biomarkers are successfully determined. This work will greatly enable the standardization of analytical procedures across different instrumentation types promoting the adoption of MCC-IMS technology in a wide range of diverse application fields.

  7. Effect of sample moisture content on XRD-estimated cellulose crystallinity index and crystallite size

    Science.gov (United States)

    Umesh P. Agarwal; Sally A. Ralph; Carlos Baez; Richard S. Reiner; Steve P. Verrill

    2017-01-01

    Although X-ray diffraction (XRD) has been the most widely used technique to investigate crystallinity index (CrI) and crystallite size (L200) of cellulose materials, there are not many studies that have taken into account the role of sample moisture on these measurements. The present investigation focuses on a variety of celluloses and cellulose...
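
    For context, the two quantities studied above are conventionally computed as follows: a sketch of the standard Segal crystallinity index and the Scherrer crystallite-size formula, with made-up peak values; the paper's own moisture-dependent measurements are not reproduced here.

    ```python
    # Sketch: Segal CrI and Scherrer crystallite size for the cellulose (200) peak.
    import math

    def segal_cri(i_200, i_am):
        """CrI (%) from the (200) peak maximum and the amorphous minimum intensity."""
        return 100.0 * (i_200 - i_am) / i_200

    def scherrer_size(fwhm_deg, two_theta_deg, wavelength_nm=0.15418, k=0.9):
        """Crystallite size L (nm) from the full width at half maximum of a peak."""
        beta = math.radians(fwhm_deg)
        theta = math.radians(two_theta_deg / 2.0)
        return k * wavelength_nm / (beta * math.cos(theta))

    print(segal_cri(i_200=2400.0, i_am=620.0))              # ~74 %
    print(scherrer_size(fwhm_deg=1.5, two_theta_deg=22.7))  # ~5.4 nm
    ```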

  8. Size Distributions and Characterization of Native and Ground Samples for Toxicology Studies

    Science.gov (United States)

    McKay, David S.; Cooper, Bonnie L.; Taylor, Larry A.

    2010-01-01

    This slide presentation shows charts and graphs that review the particle size distribution and characterization of natural and ground samples for toxicology studies. There are graphs which show the volume distribution versus the number distribution for natural occurring dust, jet mill ground dust, and ball mill ground dust.

  9. Sample Size Calculation for Estimating or Testing a Nonzero Squared Multiple Correlation Coefficient

    Science.gov (United States)

    Krishnamoorthy, K.; Xia, Yanping

    2008-01-01

    The problems of hypothesis testing and interval estimation of the squared multiple correlation coefficient of a multivariate normal distribution are considered. It is shown that available one-sided tests are uniformly most powerful, and the one-sided confidence intervals are uniformly most accurate. An exact method of calculating sample size to…

  11. Analysis of variograms with various sample sizes from a multispectral image

    Science.gov (United States)

    Variogram plays a crucial role in remote sensing application and geostatistics. It is very important to estimate variogram reliably from sufficient data. In this study, the analysis of variograms with various sample sizes of remotely sensed data was conducted. A 100x100-pixel subset was chosen from ...

  12. B-graph sampling to estimate the size of a hidden population

    NARCIS (Netherlands)

    Spreen, M.; Bogaerts, S.

    2015-01-01

    Link-tracing designs are often used to estimate the size of hidden populations by utilizing the relational links between their members. A major problem in studies of hidden populations is the lack of a convenient sampling frame. The most frequently applied design in studies of hidden populations is

  13. Got Power? A Systematic Review of Sample Size Adequacy in Health Professions Education Research

    Science.gov (United States)

    Cook, David A.; Hatala, Rose

    2015-01-01

    Many education research studies employ small samples, which in turn lowers statistical power. We re-analyzed the results of a meta-analysis of simulation-based education to determine study power across a range of effect sizes, and the smallest effect that could be plausibly excluded. We systematically searched multiple databases through May 2011,…

  14. The Influence of Virtual Sample Size on Confidence and Causal-Strength Judgments

    Science.gov (United States)

    Liljeholm, Mimi; Cheng, Patricia W.

    2009-01-01

    The authors investigated whether confidence in causal judgments varies with virtual sample size--the frequency of cases in which the outcome is (a) absent before the introduction of a generative cause or (b) present before the introduction of a preventive cause. Participants were asked to evaluate the influence of various candidate causes on an…

  15. Required sample size for monitoring stand dynamics in strict forest reserves: a case study

    Science.gov (United States)

    Diego Van Den Meersschaut; Bart De Cuyper; Kris Vandekerkhove; Noel Lust

    2000-01-01

    Stand dynamics in European strict forest reserves are commonly monitored using inventory densities of 5 to 15 percent of the total surface. The assumption that these densities guarantee a representative image of certain parameters is critically analyzed in a case study for the parameters basal area and stem number. The required sample sizes for different accuracy and...

  16. Sample size planning for the coefficient of variation from the accuracy in parameter estimation approach.

    Science.gov (United States)

    Kelley, Ken

    2007-11-01

    The accuracy in parameter estimation approach to sample size planning is developed for the coefficient of variation, where the goal of the method is to obtain an accurate parameter estimate by achieving a sufficiently narrow confidence interval. The first method allows researchers to plan the sample size so that the expected width of the confidence interval for the population coefficient of variation is sufficiently narrow. A modification allows a desired degree of assurance to be incorporated into the method, so that the obtained confidence interval will be sufficiently narrow with some specified probability (e.g., 85% assurance that the 95% confidence interval will be no wider than the desired width). Tables of necessary sample size are provided for a variety of scenarios, to help researchers planning a study in which the coefficient of variation is of interest to choose a sample size that yields a sufficiently narrow confidence interval, optionally with a specified assurance of the interval being sufficiently narrow. Freely available computer routines have been developed that allow researchers to easily implement all of the methods discussed in the article.
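
    A rough sketch of the underlying idea — increase n until the expected confidence-interval width for the coefficient of variation falls below a target — is shown below using a simple large-sample standard error. It is only an approximation, not Kelley's exact AIPE procedure, and the inputs are illustrative.

    ```python
    # Sketch: smallest n whose approximate CI for the CV is narrower than a target width.
    from scipy.stats import norm

    def n_for_cv_width(cv, target_width, alpha=0.05):
        z = norm.ppf(1 - alpha / 2)
        n = 2
        while True:
            se = cv * (1.0 / (2.0 * (n - 1)) + cv**2 / n) ** 0.5   # approx. SE of cv
            if 2 * z * se <= target_width:
                return n
            n += 1

    print(n_for_cv_width(cv=0.25, target_width=0.05))   # expected width <= 0.05
    ```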

  17. 46 CFR 160.002-2 - Size and models.

    Science.gov (United States)

    2010-10-01

    ...: SPECIFICATIONS AND APPROVAL LIFESAVING EQUIPMENT Life Preservers, Kapok, Adult and Child (Jacket Type), Models 3 and 5 § 160.002-2 Size and models. Each life preserver specified in this subpart is to be a: (a) Model...

  18. Gutenberg-Richter b-value maximum likelihood estimation and sample size

    Science.gov (United States)

    Nava, F. A.; Márquez-Ramírez, V. H.; Zúñiga, F. R.; Ávila-Barrientos, L.; Quinteros, C. B.

    2017-01-01

    The Aki-Utsu maximum likelihood method is widely used for estimation of the Gutenberg-Richter b-value, but not all authors are conscious of the method's limitations and implicit requirements. The Aki-Utsu method requires a representative estimate of the population mean magnitude, a requirement seldom satisfied in b-value studies, particularly those that use data from small geographic and/or time windows, such as b-mapping and b-vs-time studies. Monte Carlo simulation methods are used to determine how large a sample is necessary to achieve representativity, particularly for rounded magnitudes. The size of a representative sample depends only weakly on the actual b-value. It is shown that, for commonly used precisions, small samples give meaningless estimates of b. Our results give estimates of the probability of obtaining correct estimates of b, for a given desired precision, for samples of different sizes. We submit that all published studies reporting b-value estimates should include information about the size of the samples used.
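
    A sketch of the Aki-Utsu estimator (with Utsu's correction for magnitudes rounded to a bin width dm) and a crude Monte Carlo check of its scatter versus sample size is given below; the true b-value, completeness magnitude, and sample sizes are illustrative, not the paper's settings.

    ```python
    # Sketch: Aki-Utsu b-value estimate and its sampling scatter for different n.
    import math
    import random

    def b_value(mags, m_min, dm=0.1):
        mean_m = sum(mags) / len(mags)
        return math.log10(math.e) / (mean_m - (m_min - dm / 2.0))

    def simulate_scatter(b_true=1.0, m_min=3.0, dm=0.1, n=50, reps=2000, seed=0):
        random.seed(seed)
        beta = b_true * math.log(10.0)
        ests = []
        for _ in range(reps):
            # Gutenberg-Richter magnitudes: exponential above m_min, rounded to dm.
            mags = [round((m_min + random.expovariate(beta)) / dm) * dm
                    for _ in range(n)]
            ests.append(b_value(mags, m_min, dm))
        ests.sort()
        return ests[len(ests) // 20], ests[-len(ests) // 20]   # ~5th / 95th percentile

    print(simulate_scatter(n=50))    # wide spread: small samples are unreliable
    print(simulate_scatter(n=2000))  # much tighter around b = 1.0
    ```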

  19. Factors Influencing Sample Size for Internal Audit Evidence Collection in the Public Sector in Kenya

    Directory of Open Access Journals (Sweden)

    Kamau Charles Guandaru

    2017-01-01

    Full Text Available The internal audit department has the role of providing objective assurance and consulting services designed to add value and improve an organization's operations. In performing this role, internal auditors are required to provide an auditor's opinion supported by sufficient and reliable audit evidence. Since auditors are not in a position to examine 100% of the records and transactions, they are required to sample a few and draw conclusions on the basis of the sample selected. The literature suggests several factors which affect the sample size for audit purposes of internal auditors in the public sector in Kenya. This research collected data from 32 public sector internal auditors and carried out simple regression and correlation analysis on the data so as to test hypotheses and draw conclusions on the factors affecting the sample size for audit purposes of internal auditors in the public sector in Kenya. The study found that materiality of the audit issue, type of information available, source of information, degree of risk of misstatement, and auditor skills and independence are some of the factors influencing sample size determination for the purposes of internal audit evidence collection in the public sector in Kenya.

  20. Size selective isocyanate aerosols personal air sampling using porous plastic foams

    Energy Technology Data Exchange (ETDEWEB)

    Cong Khanh Huynh; Trinh Vu Duc, E-mail: chuynh@hospvd.c [Institut Universitaire Romand de Sante au Travail (IST), 21 rue du Bugnon - CH-1011 Lausanne (Switzerland)

    2009-02-01

    As part of a European project (SMT4-CT96-2137), various European institutions specialized in occupational hygiene (BGIA, HSL, IOM, INRS, IST, Ambiente e Lavoro) established a programme of scientific collaboration to develop one or more prototypes of a European personal sampler for the simultaneous collection of three dust fractions: inhalable, thoracic and respirable. These samplers, based on existing sampling heads (IOM, GSP and cassettes), use polyurethane plastic foam (PUF), selected according to its porosity, both as the sampling substrate and as the particle size separator. In this study, the authors present an original application of size-selective personal air sampling using chemically impregnated PUF to capture and derivatize isocyanate aerosols in industrial spray-painting shops.

  1. Evaluation of different sized blood sampling tubes for thromboelastometry, platelet function, and platelet count

    DEFF Research Database (Denmark)

    Andreasen, Jo Bønding; Pistor-Riebold, Thea Unger; Knudsen, Ingrid Hell;

    2014-01-01

    Background: To minimise the volume of blood used for diagnostic procedures, especially in children, we investigated whether the size of sample tubes affected whole blood coagulation analyses. Methods: We included 20 healthy individuals for rotational thromboelastometry (RoTEM®) analyses … Results: Platelet count remained stable using a 3.6 mL tube during the entire observation period of 120 min (p=0.74), but decreased significantly after 60 min when using tubes smaller than 3.6 mL (p … blood sampling tubes. Therefore, 1.8 mL tubes should … be preferred for RoTEM® analyses in order to minimise the volume of blood drawn. With regard to platelet aggregation analysed by impedance aggregometry, tubes of different size cannot be used interchangeably. If platelet count is determined later than 10 min after blood sampling using tubes containing citrate …

  2. ¹⁰Be measurements at MALT using reduced-size samples of bulk sediments

    Energy Technology Data Exchange (ETDEWEB)

    Horiuchi, Kazuho, E-mail: kh@cc.hirosaki-u.ac.jp [Graduate School of Science and Technology, Hirosaki University, 3, Bunkyo-chou, Hirosaki, Aomori 036-8561 (Japan); Oniyanagi, Itsumi [Graduate School of Science and Technology, Hirosaki University, 3, Bunkyo-chou, Hirosaki, Aomori 036-8561 (Japan); Wasada, Hiroshi [Institute of Geology and Paleontology, Graduate school of Science, Tohoku University, 6-3, Aramaki Aza-Aoba, Aoba-ku, Sendai 980-8578 (Japan); Matsuzaki, Hiroyuki [MALT, School of Engineering, University of Tokyo, 2-11-16, Yayoi, Bunkyo-ku, Tokyo 113-0032 (Japan)

    2013-01-15

    In order to establish ¹⁰Be measurements on reduced-size (1-10 mg) samples of bulk sediments, we investigated four different pretreatment designs using lacustrine and marginal-sea sediments and the AMS system of the Micro Analysis Laboratory, Tandem accelerator (MALT) at University of Tokyo. The ¹⁰Be concentrations obtained from the samples of 1-10 mg agreed within a precision of 3-5% with the values previously determined using corresponding ordinary-size (~200 mg) samples and the same AMS system. This fact demonstrates reliable determinations of ¹⁰Be with milligram levels of recent bulk sediments at MALT. On the other hand, a clear decline of the BeO⁻ beam with tens of micrograms of ⁹Be carrier suggests that the combination of ten milligrams of sediments and a few hundred micrograms of the ⁹Be carrier is more convenient at this stage.

  3. Performance of a reciprocal shaker in mechanical dispersion of soil samples for particle-size analysis

    Directory of Open Access Journals (Sweden)

    Thayse Aparecida Dourado

    2012-08-01

    Full Text Available The dispersion of the samples in soil particle-size analysis is a fundamental step, which is commonly achieved with a combination of chemical agents and mechanical agitation. The purpose of this study was to evaluate the efficiency of a low-speed reciprocal shaker for the mechanical dispersion of soil samples of different textural classes. The particle size of 61 soil samples was analyzed in four replications, using the pipette method to determine the clay fraction and sieving to determine the coarse, fine and total sand fractions. The silt content was obtained by difference. To evaluate the performance, the results of the reciprocal shaker (RSh) were compared with data for the same soil samples available in reports of the Proficiency testing for Soil Analysis Laboratories of the Agronomic Institute of Campinas (Prolab/IAC). The accuracy was analyzed based on the maximum and minimum values defining the confidence intervals for the particle-size fractions of each soil sample. Graphical indicators were also used for data comparison, based on dispersion and linear fits. The descriptive statistics indicated predominantly low variability in more than 90 % of the results for sand, medium-textured and clay samples, and in 68 % of the results for heavy clay samples, indicating satisfactory repeatability of measurements with the RSh. Medium variability was most frequently associated with the silt fraction, followed by the fine sand fraction. The sensitivity analyses indicated an accuracy of 100 % for the three main separates (total sand, silt and clay) in all 52 samples of the textural classes heavy clay, clay and medium. For the nine sand soil samples, the average accuracy was 85.2 %; the highest deviations were observed for the silt fraction. In relation to the linear fits, the correlation coefficients of 0.93 (silt) or > 0.93 (total sand and clay), as well as differences between the slope coefficients and unity of < 0.16, indicated a high correlation between the

  4. Cancer progression modeling using static sample data.

    Science.gov (United States)

    Sun, Yijun; Yao, Jin; Nowak, Norma J; Goodison, Steve

    2014-01-01

    As molecular profiling data continues to accumulate, the design of integrative computational analyses that can provide insights into the dynamic aspects of cancer progression becomes feasible. Here, we present a novel computational method for the construction of cancer progression models based on the analysis of static tumor samples. We demonstrate the reliability of the method with simulated data, and describe the application to breast cancer data. Our findings support a linear, branching model for breast cancer progression. An interactive model facilitates the identification of key molecular events in the advance of disease to malignancy.

  5. PIXE–PIGE analysis of size-segregated aerosol samples from remote areas

    Energy Technology Data Exchange (ETDEWEB)

    Calzolai, G., E-mail: calzolai@fi.infn.it [Department of Physics and Astronomy, University of Florence and National Institute of Nuclear Physics (INFN), Via G. Sansone 1, 50019 Sesto Fiorentino (Italy); Chiari, M.; Lucarelli, F.; Nava, S.; Taccetti, F. [Department of Physics and Astronomy, University of Florence and National Institute of Nuclear Physics (INFN), Via G. Sansone 1, 50019 Sesto Fiorentino (Italy); Becagli, S.; Frosini, D.; Traversi, R.; Udisti, R. [Department of Chemistry, University of Florence, Via della Lastruccia 3, 50019 Sesto Fiorentino (Italy)

    2014-01-01

    The chemical characterization of size-segregated samples is helpful to study the aerosol effects on both human health and environment. The sampling with multi-stage cascade impactors (e.g., Small Deposit area Impactor, SDI) produces inhomogeneous samples, with a multi-spot geometry and a non-negligible particle stratification. At LABEC (Laboratory of nuclear techniques for the Environment and the Cultural Heritage), an external beam line is fully dedicated to PIXE–PIGE analysis of aerosol samples. PIGE is routinely used as a sidekick of PIXE to correct the underestimation of PIXE in quantifying the concentration of the lightest detectable elements, like Na or Al, due to X-ray absorption inside the individual aerosol particles. In this work PIGE has been used to study proper attenuation correction factors for SDI samples: relevant attenuation effects have been observed also for stages collecting smaller particles, and consequent implications on the retrieved aerosol modal structure have been evidenced.

  6. Sample size calculation for microarray experiments with blocked one-way design

    Directory of Open Access Journals (Sweden)

    Jung Sin-Ho

    2009-05-01

    Full Text Available Abstract Background One of the main objectives of microarray analysis is to identify differentially expressed genes for different types of cells or treatments. Many statistical methods have been proposed to assess the treatment effects in microarray experiments. Results In this paper, we consider discovery of the genes that are differentially expressed among K (> 2) treatments when each set of K arrays constitutes a block. In this case, the array data among K treatments tend to be correlated because of the block effect. We propose to use the blocked one-way ANOVA F-statistic to test if each gene is differentially expressed among K treatments. The marginal p-values are calculated using a permutation method accounting for the block effect, adjusting for the multiplicity of the testing procedure by controlling the false discovery rate (FDR). We propose a sample size calculation method for microarray experiments with a blocked one-way design. With the FDR level and effect sizes of genes specified, our formula provides a sample size for a given number of true discoveries. Conclusion The calculated sample size is shown via simulations to provide an accurate number of true discoveries while controlling the FDR at the desired level.

  7. Prediction errors in learning drug response from gene expression data - influence of labeling, sample size, and machine learning algorithm.

    Science.gov (United States)

    Bayer, Immanuel; Groth, Philip; Schneckener, Sebastian

    2013-01-01

    Model-based prediction is dependent on many choices ranging from the sample collection and prediction endpoint to the choice of algorithm and its parameters. Here we studied the effects of such choices, exemplified by predicting sensitivity (as IC50) of cancer cell lines towards a variety of compounds. For this, we used three independent sample collections and applied several machine learning algorithms for predicting a variety of endpoints for drug response. We compared all possible models for combinations of sample collections, algorithm, drug, and labeling to an identically generated null model. The predictability of treatment effects varies among compounds, i.e. response could be predicted for some but not for all. The choice of sample collection plays a major role towards lowering the prediction error, as does sample size. However, we found that no algorithm was able to consistently outperform the other and there was no significant difference between regression and two- or three class predictors in this experimental setting. These results indicate that response-modeling projects should direct efforts mainly towards sample collection and data quality, rather than method adjustment.

  8. Bayesian sample size calculation for estimation of the difference between two binomial proportions.

    Science.gov (United States)

    Pezeshk, Hamid; Nematollahi, Nader; Maroufy, Vahed; Marriott, Paul; Gittins, John

    2013-12-01

    In this study, we discuss a decision theoretic or fully Bayesian approach to the sample size question in clinical trials with binary responses. Data are assumed to come from two binomial distributions. A Dirichlet distribution is assumed to describe prior knowledge of the two success probabilities p1 and p2. The parameter of interest is p = p1 - p2. The optimal size of the trial is obtained by maximising the expected net benefit function. The methodology presented in this article extends previous work by the assumption of dependent prior distributions for p1 and p2.
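
    A rough Monte Carlo sketch of the decision-theoretic idea is given below; it is not the authors' derivation. Independent Beta priors stand in for the Dirichlet prior on the two success probabilities, the trial is declared positive when the posterior probability that p1 - p2 > 0 exceeds a threshold, and the optimal sample size maximizes expected benefit minus recruitment cost. All prior parameters, benefit, cost and threshold values are illustrative assumptions.

        import numpy as np

        rng = np.random.default_rng(1)

        # Illustrative assumptions (not taken from the paper)
        a1, b1 = 2.0, 2.0           # Beta prior for p1 (new treatment)
        a2, b2 = 2.0, 2.0           # Beta prior for p2 (control)
        GAIN, COST = 1.0e6, 500.0   # benefit if a truly better treatment is adopted; cost per patient
        THRESH = 0.95               # posterior probability of p1 > p2 required to adopt

        def expected_net_benefit(n_per_arm, n_sim=4000, n_post=2000):
            """Prior-predictive Monte Carlo estimate of the expected net benefit."""
            total = 0.0
            for _ in range(n_sim):
                p1, p2 = rng.beta(a1, b1), rng.beta(a2, b2)
                x1 = rng.binomial(n_per_arm, p1)
                x2 = rng.binomial(n_per_arm, p2)
                # posterior draws of p = p1 - p2 (conjugate Beta updates)
                diff = (rng.beta(a1 + x1, b1 + n_per_arm - x1, n_post)
                        - rng.beta(a2 + x2, b2 + n_per_arm - x2, n_post))
                if np.mean(diff > 0.0) > THRESH:        # adopt the new treatment
                    total += GAIN * (p1 > p2)           # benefit only if it is truly better
            return total / n_sim - COST * 2 * n_per_arm

        candidates = [25, 50, 100, 200, 400]
        enb = {n: expected_net_benefit(n) for n in candidates}
        print(enb, "-> optimal n per arm:", max(enb, key=enb.get))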

  9. Profit based phase II sample size determination when adaptation by design is adopted

    OpenAIRE

    Martini, D.

    2014-01-01

    Background. Adaptation by design consists in conservatively estimating the phase III sample size on the basis of phase II data, and can be applied in almost all therapeutic areas; it is based on the assumption that the effect size of the drug is the same in phase II and phase III trials, which is a very common scenario assumed in product development. Adaptation by design reduces the probability of underpowered experiments and can improve the overall success probability of phase II and III tria...

  10. "PowerUp"!: A Tool for Calculating Minimum Detectable Effect Sizes and Minimum Required Sample Sizes for Experimental and Quasi-Experimental Design Studies

    Science.gov (United States)

    Dong, Nianbo; Maynard, Rebecca

    2013-01-01

    This paper and the accompanying tool are intended to complement existing power analysis tools by offering a tool based on the framework of Minimum Detectable Effect Sizes (MDES) formulae that can be used in determining sample size requirements and in estimating minimum detectable effect sizes for a range of individual- and…

  11. The effects of focused transducer geometry and sample size on the measurement of ultrasonic transmission properties

    Energy Technology Data Exchange (ETDEWEB)

    Atkins, T J; Duck, F A; Tooley, M A [Department of Medical Physics and Bioengineering, Royal United Hospital, Combe Park, Bath BA1 3NG (United Kingdom); Humphrey, V F, E-mail: timothy.atkins@nhs.net [Institute of Sound and Vibration Research, University of Southampton, Southampton SO17 1BJ (United Kingdom)

    2011-02-01

    The response of two coaxially aligned weakly focused ultrasonic transducers, typical of those employed for measuring the attenuation of small samples using the immersion method, has been investigated. The effects of the sample size on transmission measurements have been analyzed by integrating the sound pressure distribution functions of the radiator and receiver over different limits to determine the size of the region that contributes to the system response. The results enable the errors introduced into measurements of attenuation to be estimated as a function of sample size. A theoretical expression has been used to examine how the transducer separation affects the receiver output. The calculations are compared with an experimental study of the axial response of three unpaired transducers in water. The separation of each transducer pair giving the maximum response was determined, and compared with the field characteristics of the individual transducers. The optimum transducer separation, for accurate estimation of sample properties, was found to fall between the sum of the focal distances and the sum of the geometric focal lengths as this reduced diffraction errors.

  12. The role of the upper sample size limit in two-stage bioequivalence designs.

    Science.gov (United States)

    Karalis, Vangelis

    2013-11-01

    Two-stage designs (TSDs) are currently recommended by the regulatory authorities for bioequivalence (BE) assessment. The TSDs presented until now rely on an assumed geometric mean ratio (GMR) value of the BE metric in stage I in order to avoid inflation of type I error. In contrast, this work proposes a more realistic TSD in which sample size re-estimation relies not only on the variability of stage I, but also on the observed GMR. In these cases, an upper sample size limit (UL) is introduced in order to prevent inflation of type I error. The aim of this study is to unveil the impact of the UL on two TSD bioequivalence approaches which are based entirely on the interim results. Monte Carlo simulations were used to investigate several different scenarios of UL levels, within-subject variability, different starting numbers of subjects, and GMR. The use of a UL leads to no inflation of type I error. As UL values increase, the probability of declaring BE becomes higher. The starting sample size and the variability of the study affect type I error. Increased UL levels result in higher total sample sizes of the TSD, which are more pronounced for highly variable drugs.

  13. Use of High-Frequency In-Home Monitoring Data May Reduce Sample Sizes Needed in Clinical Trials.

    Directory of Open Access Journals (Sweden)

    Hiroko H Dodge

    Full Text Available Trials in Alzheimer's disease are increasingly focusing on prevention in asymptomatic individuals. This poses a challenge in examining treatment effects since currently available approaches are often unable to detect cognitive and functional changes among asymptomatic individuals. Resultant small effect sizes require large sample sizes using biomarkers or secondary measures for randomized controlled trials (RCTs). Better assessment approaches and outcomes capable of capturing subtle changes during asymptomatic disease stages are needed. We aimed to develop a new approach to track changes in functional outcomes by using individual-specific distributions (as opposed to group norms) of unobtrusive continuously monitored in-home data. Our objective was to compare sample sizes required to achieve sufficient power to detect prevention trial effects in trajectories of outcomes in two scenarios: (1) annually assessed neuropsychological test scores (a conventional approach), and (2) the likelihood of having subject-specific low performance thresholds, both modeled as a function of time. One hundred nineteen cognitively intact subjects were enrolled and followed over 3 years in the Intelligent Systems for Assessing Aging Change (ISAAC) study. Using the difference in empirically identified time slopes between those who remained cognitively intact during follow-up (normal control, NC) and those who transitioned to mild cognitive impairment (MCI), we estimated comparative sample sizes required to achieve up to 80% statistical power over a range of effect sizes for detecting reductions in the difference in time slopes between NC and MCI incidence before transition. Sample size estimates indicated approximately 2000 subjects with a follow-up duration of 4 years would be needed to achieve a 30% effect size when the outcome is an annually assessed memory test score. When the outcome is likelihood of low walking speed defined using the individual-specific distributions of

  14. Statistical power calculation and sample size determination for environmental studies with data below detection limits

    Science.gov (United States)

    Shao, Quanxi; Wang, You-Gan

    2009-09-01

    Power calculation and sample size determination are critical in designing environmental monitoring programs. The traditional approach based on comparing the mean values may become statistically inappropriate and even invalid when substantial proportions of the response values are below the detection limits or censored because strong distributional assumptions have to be made on the censored observations when implementing the traditional procedures. In this paper, we propose a quantile methodology that is robust to outliers and can also handle data with a substantial proportion of below-detection-limit observations without the need of imputing the censored values. As a demonstration, we applied the methods to a nutrient monitoring project, which is a part of the Perth Long-Term Ocean Outlet Monitoring Program. In this example, the sample size required by our quantile methodology is, in fact, smaller than that by the traditional t-test, illustrating the merit of our method.
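
    The idea of avoiding imputation of censored values can be illustrated with a small simulation that is simpler than, and not identical to, the authors' quantile methodology: two sites are compared through the proportion of observations exceeding a threshold placed above the detection limit, so values below the detection limit never need to be filled in, and the power at a candidate sample size is estimated by repetition. The distributions, detection limit and threshold are assumed for illustration.

        import numpy as np
        from scipy import stats

        rng = np.random.default_rng(0)

        DL, THRESH = 1.0, 2.0   # detection limit and comparison threshold (threshold > DL)
        N, ALPHA = 60, 0.05     # candidate sample size per site, significance level

        def one_trial():
            # Hypothetical lognormal concentrations at a reference and an impacted site
            ref = rng.lognormal(mean=0.0, sigma=0.8, size=N)
            imp = rng.lognormal(mean=0.5, sigma=0.8, size=N)
            # Left-censoring below DL loses no information for a threshold above DL
            ref = np.where(ref < DL, DL / 2.0, ref)
            imp = np.where(imp < DL, DL / 2.0, imp)
            table = [[np.sum(imp > THRESH), np.sum(imp <= THRESH)],
                     [np.sum(ref > THRESH), np.sum(ref <= THRESH)]]
            return stats.fisher_exact(table)[1] < ALPHA   # exact test on exceedance counts

        power = np.mean([one_trial() for _ in range(2000)])
        print(f"Estimated power with n = {N} per site: {power:.2f}")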

  15. Robust inference in sample selection models

    KAUST Repository

    Zhelonkin, Mikhail

    2015-11-20

    The problem of non-random sample selectivity often occurs in practice in many fields. The classical estimators introduced by Heckman are the backbone of the standard statistical analysis of these models. However, these estimators are very sensitive to small deviations from the distributional assumptions which are often not satisfied in practice. We develop a general framework to study the robustness properties of estimators and tests in sample selection models. We derive the influence function and the change-of-variance function of Heckman's two-stage estimator, and we demonstrate the non-robustness of this estimator and its estimated variance to small deviations from the model assumed. We propose a procedure for robustifying the estimator, prove its asymptotic normality and give its asymptotic variance. Both cases with and without an exclusion restriction are covered. This allows us to construct a simple robust alternative to the sample selection bias test. We illustrate the use of our new methodology in an analysis of ambulatory expenditures and we compare the performance of the classical and robust methods in a Monte Carlo simulation study.
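
    For readers unfamiliar with the estimator being robustified, the classical (non-robust) Heckman two-step procedure can be sketched as below; the robust version developed in the paper is not reproduced. The simulated data, coefficients and the use of statsmodels are illustrative assumptions.

        import numpy as np
        import statsmodels.api as sm
        from scipy.stats import norm

        rng = np.random.default_rng(42)
        n = 2000

        # Simulated data: the outcome is observed only for selected units
        z = rng.normal(size=(n, 2))                 # z[:, 1] enters selection only (exclusion restriction)
        x = z[:, 0]
        e = rng.multivariate_normal([0, 0], [[1, 0.6], [0.6, 1]], size=n)
        selected = (0.5 + z @ np.array([1.0, 1.0]) + e[:, 0]) > 0
        y = 1.0 + 2.0 * x + e[:, 1]

        # Step 1: probit for the selection equation, then the inverse Mills ratio
        Zc = sm.add_constant(z)
        probit = sm.Probit(selected.astype(float), Zc).fit(disp=0)
        xb = Zc @ probit.params
        imr = norm.pdf(xb) / norm.cdf(xb)

        # Step 2: outcome regression on the selected sample, augmented with the IMR
        Xc = sm.add_constant(np.column_stack([x[selected], imr[selected]]))
        two_step = sm.OLS(y[selected], Xc).fit()
        print(two_step.params)   # intercept, slope on x, coefficient on the IMR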

  16. Sample sizing of biological materials analyzed by energy dispersion X-ray fluorescence

    Energy Technology Data Exchange (ETDEWEB)

    Paiva, Jose D.S.; Franca, Elvis J.; Magalhaes, Marcelo R.L.; Almeida, Marcio E.S.; Hazin, Clovis A., E-mail: dan-paiva@hotmail.com, E-mail: ejfranca@cnen.gov.br, E-mail: marcelo_rlm@hotmail.com, E-mail: maensoal@yahoo.com.br, E-mail: chazin@cnen.gov.b [Centro Regional de Ciencias Nucleares do Nordeste (CRCN-NE/CNEN-PE), Recife, PE (Brazil)

    2013-07-01

    Analytical portions used in chemical analyses are usually less than 1 g. Errors resulting from the sampling are barely evaluated, since this type of study is a time-consuming procedure, with high costs for the chemical analysis of a large number of samples. The energy dispersion X-ray fluorescence (EDXRF) is a non-destructive and fast analytical technique with the possibility of determining several chemical elements. Therefore, the aim of this study was to provide information on the minimum analytical portion for quantification of chemical elements in biological matrices using EDXRF. Three species were sampled in mangroves from Pernambuco, Brazil. Tree leaves were washed with distilled water, oven-dried at 60 °C and milled to a 0.5 mm particle size. Ten test-portions of approximately 500 mg for each species were transferred to vials sealed with polypropylene film. The quality of the analytical procedure was evaluated using the reference materials IAEA V10 Hay Powder, SRM 2976 Apple Leaves. After energy calibration, all samples were analyzed under vacuum for 100 seconds for each group of chemical elements. The voltage used was 15 kV and 50 kV for chemical elements of atomic number lower than 22 and the others, respectively. For the best analytical conditions, EDXRF was capable of estimating the sample size uncertainty for further determination of chemical elements in leaves. (author)

  17. A Complete Sample of Megaparsec Size Double Radio Sources from SUMSS

    CERN Document Server

    Saripalli, L; Subramanian, R; Boyce, E

    2005-01-01

    We present a complete sample of megaparsec-size double radio sources compiled from the Sydney University Molonglo Sky Survey (SUMSS). Almost complete redshift information has been obtained for the sample. The sample has the following defining criteria: Galactic latitude |b| > 12.5 deg, declination 5 arcmin. All the sources have projected linear size larger than 0.7 Mpc (assuming H_o = 71 km/s/Mpc). The sample is chosen from a region of the sky covering 2100 square degrees. In this paper, we present 843-MHz radio images of the extended radio morphologies made using the Molonglo Observatory Synthesis Telescope (MOST), higher resolution radio observations of any compact radio structures using the Australia Telescope Compact Array (ATCA), and low resolution optical spectra of the host galaxies from the 2.3-m Australian National University (ANU) telescope at Siding Spring Observatory. The sample presented here is the first in the southern hemisphere and significantly enhances the database of known giant radio sou...

  18. Glycolytic activities in size-fractionated water samples: emphasis on rhamnosidase, arabinosidase and fucosidase activities

    OpenAIRE

    Vanessa Colombo-Corbi; Maria José Dellamano-Oliveira; Armando Augusto Henriques Vieira

    2011-01-01

    Glycolytic activities of eight enzymes in size-fractionated water samples from a eutrophic tropical reservoir are presented in this study, including enzymes assayed for the first time in a freshwater environment. Among these enzymes, rhamnosidase, arabinosidase and fucosidase presented high activity in the free-living fraction, while glucosidase, mannosidase and galactosidase exhibited high activity in the attached fraction. The low activity registered for rhamnosidase, arabinosidase and fuco...

  19. Comparing spectral densities of stationary time series with unequal sample sizes

    OpenAIRE

    Hildebrandt, Thimo; Preuß, Philip

    2012-01-01

    This paper deals with the comparison of several stationary processes with unequal sample sizes. We provide a detailed theoretical framework on the testing problem for equality of spectral densities in the bivariate case, after which the generalization of our approach to the m dimensional case and to other statistical applications (like testing for zero correlation or clustering of time series data with different length) is straightforward. We prove asymptotic normality of an appropriately sta...

  20. A contemporary decennial global Landsat sample of changing agricultural field sizes

    Science.gov (United States)

    White, Emma; Roy, David

    2014-05-01

    Agriculture has caused significant human induced Land Cover Land Use (LCLU) change, with dramatic cropland expansion in the last century and significant increases in productivity over the past few decades. Satellite data have been used for agricultural applications including cropland distribution mapping, crop condition monitoring, crop production assessment and yield prediction. Satellite based agricultural applications are less reliable when the sensor spatial resolution is small relative to the field size. However, to date, studies of agricultural field size distributions and their change have been limited, even though this information is needed to inform the design of agricultural satellite monitoring systems. Moreover, the size of agricultural fields is a fundamental description of rural landscapes and provides an insight into the drivers of rural LCLU change. In many parts of the world field sizes may have increased. Increasing field sizes cause a subsequent decrease in the number of fields and therefore decreased landscape spatial complexity with impacts on biodiversity, habitat, soil erosion, plant-pollinator interactions, and impacts on the diffusion of herbicides, pesticides, disease pathogens, and pests. The Landsat series of satellites provide the longest record of global land observations, with 30m observations available since 1982. Landsat data are used to examine contemporary field size changes in a period (1980 to 2010) when significant global agricultural changes have occurred. A multi-scale sampling approach is used to locate global hotspots of field size change by examination of a recent global agricultural yield map and literature review. Nine hotspots are selected where significant field size change is apparent and where change has been driven by technological advancements (Argentina and U.S.), abrupt societal changes (Albania and Zimbabwe), government land use and agricultural policy changes (China, Malaysia, Brazil), and/or constrained by

  1. 46 CFR 160.005-2 - Size and model.

    Science.gov (United States)

    2010-10-01

    ...: SPECIFICATIONS AND APPROVAL LIFESAVING EQUIPMENT Life Preservers, Fibrous Glass, Adult and Child (Jacket Type), Models 52 and 56 § 160.005-2 Size and model. Each life preserver specified in this subpart is a: (a...

  2. A random energy model for size dependence : recurrence vs. transience

    NARCIS (Netherlands)

    Külske, Christof

    1998-01-01

    We investigate the size dependence of disordered spin models having an infinite number of Gibbs measures in the framework of a simplified 'random energy model for size dependence'. We introduce two versions (involving either independent random walks or branching processes), that can be seen as gener

  3. Williams Test Required Sample Size For Determining The Minimum Effective Dose

    Directory of Open Access Journals (Sweden)

    Mustafa Agah TEKINDAL

    2016-04-01

    Full Text Available Objective: The biological activity of a substance may be explored through a series of experiments on increased or decreased doses of such substance. One of the purposes in studies of this sort is the determination of the minimum effective dose. Use of an appropriate sample size has an indisputable effect on the reliability of the decisions made in studies conducted for this purpose. This study attempts to provide a summary of the sample sizes, in different scenarios, needed by researchers using the Williams test, by taking into consideration the number of groups in dose-response studies as well as the minimal clinically significant difference, the standard deviation, and the test’s power through asymptotic power analyses. Material and Methods: With the Type I error taken as 0.05, scenarios were determined for different sample sizes for each group (5 to 100, with an increase of 5 at a time) and different numbers of groups (from 3 to 10, with an increase of 1 at a time). The minimal clinically significant difference refers to the difference between the control group and the experimental group. In this instance, when the control group is zero and takes a specific average value, it refers to the difference from the experimental group. In the present study, such differences are defined from 1 to 10 with an increase of 1 at a time. Because the test’s power changes when the standard deviation changes, the relevant value was varied in all scenarios from 1 to 10 with an increase of 1 at a time to explore the test’s power. Dose-response distributions are skewed. In the present study, data were derived from the Poisson distribution with parameter λ = 1, which was determined in accordance with dose-response curves. Results: When the changes occurring in the determined scenarios are considered, it can be said, in general, that the significant difference must be set between 1 and 3 and the standard deviation between 1 and 2. Conclusion: It is certain that change in the number

  4. Paper coatings with multi-scale roughness evaluated at different sampling sizes

    Energy Technology Data Exchange (ETDEWEB)

    Samyn, Pieter, E-mail: Pieter.Samyn@UGent.be [Ghent University - Department of Textiles, Technologiepark 907, B-9052 Zwijnaarde (Belgium); Van Erps, Juergen; Thienpont, Hugo [Vrije Universiteit Brussels - Department of Applied Physics and Photonics, Pleinlaan 2, B-1050 Brussels (Belgium); Schoukens, Gustaaf [Ghent University - Department of Textiles, Technologiepark 907, B-9052 Zwijnaarde (Belgium)

    2011-04-15

    Papers have a complex hierarchical structure and the end-user functionalities such as hydrophobicity are controlled by a finishing layer. The application of an organic nanoparticle coating and drying of the aqueous dispersion results in a unique surface morphology with microscale domains that are internally patterned with nanoparticles. Better understanding of the multi-scale surface roughness patterns is obtained by monitoring the topography with non-contact profilometry (NCP) and atomic force microscopy (AFM) at different sampling areas ranging from 2000 μm × 2000 μm to 0.5 μm × 0.5 μm. The statistical roughness parameters are uniquely related to each other over the different measuring techniques and sampling sizes, as they are purely statistically determined. However, they cannot be directly extrapolated over the different sampling areas as they represent transitions at the nano-, micro-to-nano and microscale level. Therefore, the spatial roughness parameters including the correlation length and the specific frequency bandwidth should be taken into account for each measurement, which both allow for direct correlation of roughness data at different sampling sizes.

  5. Subspace Leakage Analysis and Improved DOA Estimation With Small Sample Size

    Science.gov (United States)

    Shaghaghi, Mahdi; Vorobyov, Sergiy A.

    2015-06-01

    Classical methods of DOA estimation such as the MUSIC algorithm are based on estimating the signal and noise subspaces from the sample covariance matrix. For a small number of samples, such methods are exposed to performance breakdown, as the sample covariance matrix can largely deviate from the true covariance matrix. In this paper, the problem of DOA estimation performance breakdown is investigated. We consider the structure of the sample covariance matrix and the dynamics of the root-MUSIC algorithm. The performance breakdown in the threshold region is associated with the subspace leakage where some portion of the true signal subspace resides in the estimated noise subspace. In this paper, the subspace leakage is theoretically derived. We also propose a two-step method which improves the performance by modifying the sample covariance matrix such that the amount of the subspace leakage is reduced. Furthermore, we introduce a phenomenon named as root-swap which occurs in the root-MUSIC algorithm in the low sample size region and degrades the performance of the DOA estimation. A new method is then proposed to alleviate this problem. Numerical examples and simulation results are given for uncorrelated and correlated sources to illustrate the improvement achieved by the proposed methods. Moreover, the proposed algorithms are combined with the pseudo-noise resampling method to further improve the performance.
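
    The small-sample setting described above can be reproduced with a plain (spectral) MUSIC estimator in a few lines; the sketch below only illustrates how the noise subspace is taken from a sample covariance built from few snapshots, and does not implement the paper's subspace-leakage correction, root-swap remedy or pseudo-noise resampling. The array geometry, source angles and snapshot count are assumptions.

        import numpy as np
        from scipy.signal import find_peaks

        rng = np.random.default_rng(7)

        M, K, D = 8, 12, 2                    # sensors, snapshots (small), number of sources
        doas = np.deg2rad([-10.0, 15.0])      # true directions of arrival
        noise_var = 10 ** (-5.0 / 10)         # 5 dB SNR

        def steering(theta, m=M):
            # half-wavelength spaced uniform linear array
            return np.exp(1j * np.pi * np.arange(m)[:, None] * np.sin(np.atleast_1d(theta)))

        A = steering(doas)                                                    # M x D
        S = (rng.standard_normal((D, K)) + 1j * rng.standard_normal((D, K))) / np.sqrt(2)
        N = np.sqrt(noise_var / 2) * (rng.standard_normal((M, K)) + 1j * rng.standard_normal((M, K)))
        X = A @ S + N                                                         # received snapshots

        R = X @ X.conj().T / K                      # sample covariance: noisy when K is small
        _, V = np.linalg.eigh(R)                    # eigenvectors, ascending eigenvalues
        En = V[:, : M - D]                          # estimated noise subspace

        grid = np.deg2rad(np.linspace(-90, 90, 1801))
        p = 1.0 / np.linalg.norm(En.conj().T @ steering(grid), axis=0) ** 2   # MUSIC pseudospectrum
        peaks, _ = find_peaks(p)
        top = peaks[np.argsort(p[peaks])[-D:]]
        print("estimated DOAs (deg):", np.sort(np.rad2deg(grid[top])))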

  6. Evaluation of Sampling Recommendations From the Influenza Virologic Surveillance Right Size Roadmap for Idaho.

    Science.gov (United States)

    Rosenthal, Mariana; Anderson, Katey; Tengelsen, Leslie; Carter, Kris; Hahn, Christine; Ball, Christopher

    2017-08-24

    The Right Size Roadmap was developed by the Association of Public Health Laboratories and the Centers for Disease Control and Prevention to improve influenza virologic surveillance efficiency. Guidelines were provided to state health departments regarding representativeness and statistical estimates of specimen numbers needed for seasonal influenza situational awareness, rare or novel influenza virus detection, and rare or novel influenza virus investigation. The aim of this study was to compare Roadmap sampling recommendations with Idaho's influenza virologic surveillance to determine implementation feasibility. We calculated the proportion of medically attended influenza-like illness (MA-ILI) from Idaho's influenza-like illness surveillance among outpatients during October 2008 to May 2014, applied data to Roadmap-provided sample size calculators, and compared calculations with actual numbers of specimens tested for influenza by the Idaho Bureau of Laboratories (IBL). We assessed representativeness among patients' tested specimens to census estimates by age, sex, and health district residence. Among outpatients surveilled, Idaho's mean annual proportion of MA-ILI was 2.30% (20,834/905,818) during a 5-year period. Thus, according to Roadmap recommendations, Idaho needs to collect 128 specimens from MA-ILI patients/week for situational awareness, 1496 influenza-positive specimens/week for detection of a rare or novel influenza virus at 0.2% prevalence, and after detection, 478 specimens/week to confirm true prevalence is ≤2% of influenza-positive samples. The mean number of respiratory specimens Idaho tested for influenza/week, excluding the 2009-2010 influenza season, ranged from 6 to 24. Various influenza virus types and subtypes were collected and specimen submission sources were representative in terms of geographic distribution, patient age range and sex, and disease severity. Insufficient numbers of respiratory specimens are submitted to IBL for influenza

  7. Reducing sample size in experiments with animals: historical controls and related strategies.

    Science.gov (United States)

    Kramer, Matthew; Font, Enrique

    2017-02-01

    Reducing the number of animal subjects used in biomedical experiments is desirable for ethical and practical reasons. Previous reviews of the benefits of reducing sample sizes have focused on improving experimental designs and methods of statistical analysis, but reducing the size of control groups has been considered rarely. We discuss how the number of current control animals can be reduced, without loss of statistical power, by incorporating information from historical controls, i.e. subjects used as controls in similar previous experiments. Using example data from published reports, we describe how to incorporate information from historical controls under a range of assumptions that might be made in biomedical experiments. Assuming more similarities between historical and current controls yields higher savings and allows the use of smaller current control groups. We conducted simulations, based on typical designs and sample sizes, to quantify how different assumptions about historical controls affect the power of statistical tests. We show that, under our simulation conditions, the number of current control subjects can be reduced by more than half by including historical controls in the analyses. In other experimental scenarios, control groups may be unnecessary. Paying attention to both the function and to the statistical requirements of control groups would result in reducing the total number of animals used in experiments, saving time, effort and money, and bringing research with animals within ethically acceptable bounds. © 2015 Cambridge Philosophical Society.
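
    A small simulation makes the saving concrete under the most optimistic of the assumptions discussed above, namely that historical controls are fully exchangeable with current ones; the outcome model, group sizes and number of historical animals are invented for illustration.

        import numpy as np
        from scipy import stats

        rng = np.random.default_rng(3)

        MU_C, MU_T, SD = 10.0, 11.0, 2.0    # assumed control / treated means and common SD
        N_HIST, N_SIMS = 60, 4000           # historical control animals already on file

        def power(n_treat, n_ctrl, use_hist):
            hits = 0
            for _ in range(N_SIMS):
                t = rng.normal(MU_T, SD, n_treat)
                c = rng.normal(MU_C, SD, n_ctrl)
                if use_hist:    # pool historical controls, assumed exchangeable
                    c = np.concatenate([c, rng.normal(MU_C, SD, N_HIST)])
                hits += stats.ttest_ind(t, c).pvalue < 0.05
            return hits / N_SIMS

        print("20 current controls, no pooling  :", power(20, 20, use_hist=False))
        print(" 8 current controls + historical :", power(20, 8, use_hist=True))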

  8. A Bounding Surface Plasticity Model for Intact Rock Exhibiting Size-Dependent Behaviour

    Science.gov (United States)

    Masoumi, Hossein; Douglas, Kurt J.; Russell, Adrian R.

    2016-01-01

    A new constitutive model for intact rock is presented recognising that rock strength, stiffness and stress-strain behaviour are affected by the size of the rock being subjected to loading. The model is formulated using bounding surface plasticity theory. It is validated against a new and extensive set of unconfined compression and triaxial compression test results for Gosford sandstone. The samples tested had diameters ranging from 19 to 145 mm and length-to-diameter ratios of 2. The model captures the continuous nonlinear stress-strain behaviour from initial loading, through peak strength to large shear strains, including transition from brittle to ductile behaviour. The size dependency was accounted for through a unified size effect law applied to the unconfined compressive strength—a key model input parameter. The unconfined compressive strength increases with sample size before peaking and then decreasing with further increasing sample size. Inside the constitutive model two hardening laws act simultaneously, each driven by plastic shear strains. The elasticity is stress level dependent. Simple linear loading and bounding surfaces are adopted, defined using the Mohr-Coulomb criterion, along with a non-associated flow rule. The model simulates well the stress-strain behaviour of Gosford sandstone at confining pressures ranging from 0 to 30 MPa for the variety of sample sizes considered.

  9. Influence of pH, Temperature and Sample Size on Natural and Enforced Syneresis of Precipitated Silica

    Directory of Open Access Journals (Sweden)

    Sebastian Wilhelm

    2015-12-01

    Full Text Available The production of silica is performed by mixing an inorganic, silicate-based precursor and an acid. Monomeric silicic acid forms and polymerizes to amorphous silica particles. Both further polymerization and agglomeration of the particles lead to a gel network. Since polymerization continues after gelation, the gel network consolidates. This rather slow process is known as “natural syneresis” and strongly influences the product properties (e.g., agglomerate size, porosity or internal surface). “Enforced syneresis” is the superposition of natural syneresis with a mechanical, external force. Enforced syneresis may be used either for analytical or preparative purposes. Hereby, two open key aspects are of particular interest. On the one hand, the question arises whether natural and enforced syneresis are analogous processes with respect to their dependence on the process parameters: pH, temperature and sample size. On the other hand, a method is desirable that allows for correlating natural and enforced syneresis behavior. We can show that the pH-, temperature- and sample size-dependency of natural and enforced syneresis are indeed analogous. It is possible to predict natural syneresis using a correlative model. We found that our model predicts maximum volume shrinkages between 19% and 30% in comparison to measured values of 20% for natural syneresis.

  10. Effects of LiDAR point density, sampling size and height threshold on estimation accuracy of crop biophysical parameters.

    Science.gov (United States)

    Luo, Shezhou; Chen, Jing M; Wang, Cheng; Xi, Xiaohuan; Zeng, Hongcheng; Peng, Dailiang; Li, Dong

    2016-05-30

    Vegetation leaf area index (LAI), height, and aboveground biomass are key biophysical parameters. Corn is an important and globally distributed crop, and reliable estimations of these parameters are essential for corn yield forecasting, health monitoring and ecosystem modeling. Light Detection and Ranging (LiDAR) is considered an effective technology for estimating vegetation biophysical parameters. However, the estimation accuracies of these parameters are affected by multiple factors. In this study, we first estimated corn LAI, height and biomass (R2 = 0.80, 0.874 and 0.838, respectively) using the original LiDAR data (7.32 points/m2), and the results showed that LiDAR data could accurately estimate these biophysical parameters. Second, comprehensive research was conducted on the effects of LiDAR point density, sampling size and height threshold on the estimation accuracy of LAI, height and biomass. Our findings indicated that LiDAR point density had an important effect on the estimation accuracy for vegetation biophysical parameters, however, high point density did not always produce highly accurate estimates, and reduced point density could deliver reasonable estimation results. Furthermore, the results showed that sampling size and height threshold were additional key factors that affect the estimation accuracy of biophysical parameters. Therefore, the optimal sampling size and the height threshold should be determined to improve the estimation accuracy of biophysical parameters. Our results also implied that a higher LiDAR point density, larger sampling size and height threshold were required to obtain accurate corn LAI estimation when compared with height and biomass estimations. In general, our results provide valuable guidance for LiDAR data acquisition and estimation of vegetation biophysical parameters using LiDAR data.

  11. Approaches to Sample Size Determination for Multivariate Data: Applications to PCA and PLS-DA of Omics Data.

    Science.gov (United States)

    Saccenti, Edoardo; Timmerman, Marieke E

    2016-08-01

    Sample size determination is a fundamental step in the design of experiments. Methods for sample size determination are abundant for univariate analysis methods, but scarce in the multivariate case. Omics data are multivariate in nature and are commonly investigated using multivariate statistical methods, such as principal component analysis (PCA) and partial least-squares discriminant analysis (PLS-DA). No simple approaches to sample size determination exist for PCA and PLS-DA. In this paper we will introduce important concepts and offer strategies for (minimally) required sample size estimation when planning experiments to be analyzed using PCA and/or PLS-DA.

  12. A comment on sampling error in the standardized mean difference with unequal sample sizes: avoiding potential errors in meta-analytic and primary research.

    Science.gov (United States)

    Laczo, Roxanne M; Sackett, Paul R; Bobko, Philip; Cortina, José M

    2005-07-01

    The authors discuss potential confusion in conducting primary studies and meta-analyses on the basis of differences between groups. First, the authors show that a formula for the sampling error of the standardized mean difference (d) that is based on equal group sample sizes can produce substantially biased results if applied with markedly unequal group sizes. Second, the authors show that the same concerns are present when primary analyses or meta-analyses are conducted with point-biserial correlations, as the point-biserial correlation (r) is a transformation of d. Third, the authors examine the practice of correcting a point-biserial r for unequal sample sizes and note that such correction would also increase the sampling error of the corrected r. Correcting rs for unequal sample sizes, but using the standard formula for sampling error in uncorrected r, can result in bias. The authors offer a set of recommendations for conducting meta-analyses of group differences.
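
    The size of the bias is easy to check numerically with the usual large-sample approximations: var(d) ≈ (n1 + n2)/(n1 n2) + d²/(2(n1 + n2)), whereas the equal-n shortcut replaces the first term with 4/N. The group sizes below are invented to show the effect of a markedly unbalanced design.

        import math

        def var_d_unequal(d, n1, n2):
            """Large-sample variance of the standardized mean difference."""
            return (n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2))

        def var_d_equal_shortcut(d, n_total):
            """Shortcut that implicitly assumes two groups of equal size."""
            return 4.0 / n_total + d ** 2 / (2 * n_total)

        d, n1, n2 = 0.5, 30, 270    # markedly unequal groups (illustrative)
        v_ok = var_d_unequal(d, n1, n2)
        v_bad = var_d_equal_shortcut(d, n1 + n2)
        print(f"unequal-n variance: {v_ok:.4f} (SE {math.sqrt(v_ok):.3f})")
        print(f"equal-n shortcut  : {v_bad:.4f} (SE {math.sqrt(v_bad):.3f})")
        # With 30 vs 270 subjects the shortcut understates the variance by roughly a factor of 2.7.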

  13. Cluster-size dependent randomization traffic flow model

    Institute of Scientific and Technical Information of China (English)

    Gao Kun; Wang Bing-Hong; Fu Chuan-Ji; Lu Yu-Feng

    2007-01-01

    In order to exhibit the meta-stable states, several slow-to-start rules have been investigated as modifications to the Nagel-Schreckenberg (NS) model. These models can reproduce some realistic phenomena which are absent in the original NS model. But in these models, the size of a cluster is still not considered as a useful parameter. In real traffic, the slow-to-start motion of a standing vehicle often depends on the degree of congestion, which can be measured by the clusters' size. According to this idea, we propose a cluster-size dependent slow-to-start model based on the speed-dependent slow-to-start rule (VDR) model. It gives expected results through simulations. Compared with the VDR model, our new model has a better traffic efficiency and shows richer complex characteristics.
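
    A minimal cellular-automaton sketch of the idea follows: it is the standard Nagel-Schreckenberg update on a ring, except that a stopped vehicle's randomization probability grows with the size of the jam it sits in. The particular functional form and parameter values are assumptions, not the authors' calibration.

        import numpy as np

        rng = np.random.default_rng(5)

        L, N_CARS, VMAX = 200, 60, 5
        P_RAND, P0 = 0.15, 0.3          # ordinary randomization; base slow-to-start probability

        pos = np.sort(rng.choice(L, N_CARS, replace=False))
        vel = np.zeros(N_CARS, dtype=int)

        def cluster_size(cell, occupied):
            """Length of the run of consecutive occupied cells containing `cell`."""
            size, k = 1, (cell + 1) % L
            while k in occupied:
                size, k = size + 1, (k + 1) % L
            k = (cell - 1) % L
            while k in occupied:
                size, k = size + 1, (k - 1) % L
            return size

        for _ in range(100):                                    # time steps (parallel update)
            occupied = set(pos.tolist())
            gaps = (np.roll(pos, -1) - pos - 1) % L             # empty cells to the car ahead
            for i in range(N_CARS):
                v = min(vel[i] + 1, VMAX, gaps[i])              # accelerate, avoid collision
                if vel[i] == 0:
                    # cluster-size dependent slow-to-start: larger jams restart more slowly
                    p = min(1.0, P0 * (1.0 + 0.1 * (cluster_size(pos[i], occupied) - 1)))
                else:
                    p = P_RAND
                if rng.random() < p:
                    v = max(v - 1, 0)                           # randomization / slow-to-start
                vel[i] = v
            pos = (pos + vel) % L
            order = np.argsort(pos)
            pos, vel = pos[order], vel[order]

        print("mean speed after 100 steps:", vel.mean())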

  14. Efficient adaptive designs with mid-course sample size adjustment in clinical trials

    CERN Document Server

    Bartroff, Jay

    2011-01-01

    Adaptive designs have been proposed for clinical trials in which the nuisance parameters or alternative of interest are unknown or likely to be misspecified before the trial. Whereas most previous works on adaptive designs and mid-course sample size re-estimation have focused on two-stage or group sequential designs in the normal case, we consider here a new approach that involves at most three stages and is developed in the general framework of multiparameter exponential families. Not only does this approach maintain the prescribed type I error probability, but it also provides a simple but asymptotically efficient sequential test whose finite-sample performance, measured in terms of the expected sample size and power functions, is shown to be comparable to the optimal sequential design, determined by dynamic programming, in the simplified normal mean case with known variance and prespecified alternative, and superior to the existing two-stage designs and also to adaptive group sequential designs when the al...

  15. Enhanced Z-LDA for Small Sample Size Training in Brain-Computer Interface Systems

    Directory of Open Access Journals (Sweden)

    Dongrui Gao

    2015-01-01

    Full Text Available Background. Usually the training set of an online brain-computer interface (BCI) experiment is small. A small training set lacks enough information to train the classifier thoroughly, resulting in poor classification performance during online testing. Methods. In this paper, on the basis of Z-LDA, we further calculate the classification probability of Z-LDA and then use it to select reliable samples from the testing set to enlarge the training set, aiming to mine additional information from the testing set to adjust the biased classification boundary obtained from the small training set. The proposed approach is an extension of the previous Z-LDA and is named enhanced Z-LDA (EZ-LDA). Results. We evaluated the classification performance of LDA, Z-LDA, and EZ-LDA on simulation and real BCI datasets with different sizes of training samples, and the classification results showed that EZ-LDA achieved the best classification performance. Conclusions. EZ-LDA is promising for dealing with the small sample size training problem that usually exists in online BCI systems.
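
    A simplified self-training sketch in the same spirit (using scikit-learn's ordinary LDA rather than the authors' Z-LDA) looks as follows: test trials classified with high posterior probability are given pseudo-labels and fed back into the small training set before a final refit. The simulated data and the confidence threshold are assumptions.

        import numpy as np
        from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

        rng = np.random.default_rng(0)

        def make_data(n_per_class, shift=0.6, d=10):
            X0 = rng.normal(0.0, 1.0, (n_per_class, d))
            X1 = rng.normal(shift, 1.0, (n_per_class, d))
            return np.vstack([X0, X1]), np.r_[np.zeros(n_per_class), np.ones(n_per_class)]

        X_tr, y_tr = make_data(15)       # small calibration set, typical of online BCI
        X_te, y_te = make_data(200)

        base = LinearDiscriminantAnalysis().fit(X_tr, y_tr)
        print("baseline accuracy  :", base.score(X_te, y_te))

        # Enlarge the training set with confidently classified test trials (pseudo-labels)
        proba = base.predict_proba(X_te)
        pred = base.predict(X_te)
        keep = proba.max(axis=1) > 0.90                 # assumed confidence threshold
        X_aug = np.vstack([X_tr, X_te[keep]])
        y_aug = np.r_[y_tr, pred[keep]]                 # pseudo-labels, not the true labels

        enhanced = LinearDiscriminantAnalysis().fit(X_aug, y_aug)
        print("after augmentation :", enhanced.score(X_te, y_te))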

  16. Sample size calculations in human electrophysiology (EEG and ERP) studies: A systematic review and recommendations for increased rigor.

    Science.gov (United States)

    Larson, Michael J; Carbine, Kaylie A

    2017-01-01

    There is increasing focus across scientific fields on adequate sample sizes to ensure non-biased and reproducible effects. Very few studies, however, report sample size calculations or even the information needed to accurately calculate sample sizes for grants and future research. We systematically reviewed 100 randomly selected clinical human electrophysiology studies from six high impact journals that frequently publish electroencephalography (EEG) and event-related potential (ERP) research to determine the proportion of studies that reported sample size calculations, as well as the proportion of studies reporting the necessary components to complete such calculations. Studies were coded by the two authors blinded to the other's results. Inter-rater reliability was 100% for the sample size calculations and kappa above 0.82 for all other variables. Zero of the 100 studies (0%) reported sample size calculations. 77% utilized repeated-measures designs, yet zero studies (0%) reported the necessary variances and correlations among repeated measures to accurately calculate future sample sizes. Most studies (93%) reported study statistical values (e.g., F or t values). Only 40% reported effect sizes, 56% reported mean values, and 47% reported indices of variance (e.g., standard deviations/standard errors). Absence of such information hinders accurate determination of sample sizes for study design, grant applications, and meta-analyses of research and whether studies were adequately powered to detect effects of interest. Increased focus on sample size calculations, utilization of registered reports, and presenting information detailing sample size calculations and statistics for future researchers are needed and will increase sample size-related scientific rigor in human electrophysiology research.
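
    The variances and correlations among repeated measures matter because, for example, the sample size of a simple within-subject (paired) contrast scales with sigma²(1 − rho). A hedged sketch with invented ERP-like numbers shows how strongly the correlation drives the answer.

        from scipy.stats import norm

        def paired_n(delta, sd, rho, alpha=0.05, power=0.80):
            """Approximate number of participants for a two-condition within-subject contrast.

            delta : smallest mean difference of interest
            sd    : standard deviation of the measure in each condition
            rho   : correlation between the two repeated measures
            """
            sd_diff = (2.0 * sd ** 2 * (1.0 - rho)) ** 0.5
            z = norm.ppf(1.0 - alpha / 2.0) + norm.ppf(power)
            return (z * sd_diff / delta) ** 2

        # Illustrative values, not taken from any of the reviewed studies
        for rho in (0.3, 0.6, 0.9):
            print(f"rho = {rho}: n ~ {paired_n(delta=1.0, sd=2.0, rho=rho):.0f} participants")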

  17. Soybean yield modeling using bootstrap methods for small samples

    Energy Technology Data Exchange (ETDEWEB)

    Dalposso, G.A.; Uribe-Opazo, M.A.; Johann, J.A.

    2016-11-01

    One of the problems that occurs when working with regression models concerns the sample size: since the statistical methods used in inferential analyses are asymptotic, a small sample may compromise the analysis because the estimates will be biased. An alternative is to use the bootstrap methodology, which in its non-parametric version does not require guessing or knowing the probability distribution that generated the original sample. In this work we used a set of soybean yield data and physical and chemical soil properties based on a small number of samples to determine a multiple linear regression model. Bootstrap methods were used for variable selection, identification of influential points and determination of confidence intervals of the model parameters. The results showed that the bootstrap methods enabled us to select the physical and chemical soil properties that were significant in the construction of the soybean yield regression model, to construct the confidence intervals of the parameters and to identify the points that had a strong influence on the estimated parameters. (Author)
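
    A minimal non-parametric (case-resampling) bootstrap for the coefficients of a small-sample linear model, in the spirit of the study, can be written as below; the data are simulated and percentile intervals are used for simplicity, whereas the paper also uses the bootstrap for variable selection and influence diagnostics.

        import numpy as np

        rng = np.random.default_rng(2024)

        # Small simulated data set standing in for yield versus two soil properties
        n = 25
        X = np.column_stack([np.ones(n), rng.normal(6.0, 0.5, n), rng.normal(20.0, 4.0, n)])
        y = X @ np.array([1.0, 0.4, 0.05]) + rng.normal(0.0, 0.3, n)

        def ols(X, y):
            return np.linalg.lstsq(X, y, rcond=None)[0]

        B = 5000
        boot = np.empty((B, X.shape[1]))
        for b in range(B):
            idx = rng.integers(0, n, n)           # resample cases with replacement
            boot[b] = ols(X[idx], y[idx])

        est = ols(X, y)
        lo, hi = np.percentile(boot, [2.5, 97.5], axis=0)
        for j, name in enumerate(["intercept", "soil property 1", "soil property 2"]):
            print(f"{name:16s} estimate {est[j]: .3f}   95% CI [{lo[j]: .3f}, {hi[j]: .3f}]")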

  18. Accelerating inference for diffusions observed with measurement error and large sample sizes using approximate Bayesian computation

    DEFF Research Database (Denmark)

    Picchini, Umberto; Forman, Julie Lyng

    2016-01-01

    In recent years, dynamical modelling has been provided with a range of breakthrough methods to perform exact Bayesian inference. However, it is often computationally unfeasible to apply exact statistical methodologies in the context of large data sets and complex models. This paper considers … a nonlinear stochastic differential equation model observed with correlated measurement errors and an application to protein folding modelling. An approximate Bayesian computation (ABC)-MCMC algorithm is suggested to allow inference for model parameters within reasonable time constraints. The ABC algorithm … applications. A simulation study is conducted to compare our strategy with exact Bayesian inference, the latter being two orders of magnitude slower than ABC-MCMC for the considered set-up. Finally, the ABC algorithm is applied to a large-size protein data set. The suggested methodology is fairly general...

  19. An In Situ Method for Sizing Insoluble Residues in Precipitation and Other Aqueous Samples.

    Science.gov (United States)

    Axson, Jessica L; Creamean, Jessie M; Bondy, Amy L; Capracotta, Sonja S; Warner, Katy Y; Ault, Andrew P

    2015-01-01

    Particles are frequently incorporated into clouds or precipitation, influencing climate by acting as cloud condensation or ice nuclei, taking up coatings during cloud processing, and removing species through wet deposition. Many of these particles, particularly ice nuclei, can remain suspended within cloud droplets/crystals as insoluble residues. While previous studies have measured the soluble or bulk mass of species within clouds and precipitation, no studies to date have determined the number concentration and size distribution of insoluble residues in precipitation or cloud water using in situ methods. Herein, for the first time we demonstrate that Nanoparticle Tracking Analysis (NTA) is a powerful in situ method for determining the total number concentration, number size distribution, and surface area distribution of insoluble residues in precipitation, both of rain and melted snow. The method uses 500 μL or less of liquid sample and does not require sample modification. Number concentrations for the insoluble residues in aqueous precipitation samples ranged from 2.0-3.0(±0.3)×10⁸ particles cm⁻³, while surface area ranged from 1.8(±0.7)-3.2(±1.0)×10⁷ μm² cm⁻³. Number size distributions peaked between 133-150 nm, with both single and multi-modal character, while surface area distributions peaked between 173-270 nm. Comparison with electron microscopy of particles up to 10 μm shows that, by number, > 97% of residues are < 1 μm in diameter, the upper limit of the NTA. The range of concentration and distribution properties indicates that insoluble residue properties vary with ambient aerosol concentrations, cloud microphysics, and meteorological dynamics. NTA has great potential for studying the role that insoluble residues play in critical atmospheric processes.

  20. A Bayesian cost-benefit approach to the determination of sample size in clinical trials.

    Science.gov (United States)

    Kikuchi, Takashi; Pezeshk, Hamid; Gittins, John

    2008-01-15

    Current practice for sample size computations in clinical trials is largely based on frequentist or classical methods. These methods have the drawback of requiring a point estimate of the variance of the treatment effect and are based on arbitrary settings of type I and II errors. They also do not directly address the question of achieving the best balance between the cost of the trial and the possible benefits from using the new treatment, and fail to consider the important fact that the number of users depends on the evidence for improvement compared with the current treatment. Our approach, Behavioural Bayes (or BeBay for short), assumes that the number of patients switching to the new medical treatment depends on the strength of the evidence that is provided by clinical trials, and takes a value between zero and the number of potential patients. The better a new treatment, the more the number of patients who want to switch to it and the more the benefit is obtained. We define the optimal sample size to be the sample size that maximizes the expected net benefit resulting from a clinical trial. Gittins and Pezeshk (Drug Inf. Control 2000; 34:355-363; The Statistician 2000; 49(2):177-187) used a simple form of benefit function and assumed paired comparisons between two medical treatments and that the variance of the treatment effect is known. We generalize this setting, by introducing a logistic benefit function, and by extending the more usual unpaired case, without assuming the variance to be known.

  1. Sample size planning for composite reliability coefficients: accuracy in parameter estimation via narrow confidence intervals.

    Science.gov (United States)

    Terry, Leann; Kelley, Ken

    2012-11-01

    Composite measures play an important role in psychology and related disciplines. Composite measures almost always have error. Correspondingly, it is important to understand the reliability of the scores from any particular composite measure. However, the point estimates of the reliability of composite measures are fallible and thus all such point estimates should be accompanied by a confidence interval. When confidence intervals are wide, there is much uncertainty in the population value of the reliability coefficient. Given the importance of reporting confidence intervals for estimates of reliability, coupled with the undesirability of wide confidence intervals, we develop methods that allow researchers to plan sample size in order to obtain narrow confidence intervals for population reliability coefficients. We first discuss composite reliability coefficients and then provide a discussion on confidence interval formation for the corresponding population value. Using the accuracy in parameter estimation approach, we develop two methods to obtain accurate estimates of reliability by planning sample size. The first method provides a way to plan sample size so that the expected confidence interval width for the population reliability coefficient is sufficiently narrow. The second method ensures that the confidence interval width will be sufficiently narrow with some desired degree of assurance (e.g., 99% assurance that the 95% confidence interval for the population reliability coefficient will be less than W units wide). The effectiveness of our methods was verified with Monte Carlo simulation studies. We demonstrate how to easily implement the methods with easy-to-use and freely available software. ©2011 The British Psychological Society.

  2. Impact of sample size on principal component analysis ordination of an environmental data set: effects on eigenstructure

    National Research Council Canada - National Science Library

    S. Shahid Shaukat; Toqeer Ahmed Rao; Moazzam A. Khan

    2016-01-01

    ...) on the eigenvalues and eigenvectors resulting from principal component analysis (PCA). For each sample size, 100 bootstrap samples were drawn from environmental data matrix pertaining to water quality variables (p = 22...

  3. Spatial Distribution and Minimum Sample Size for Overwintering Larvae of the Rice Stem Borer Chilo suppressalis (Walker) in Paddy Fields.

    Science.gov (United States)

    Arbab, A

    2014-10-01

    The rice stem borer, Chilo suppressalis (Walker), feeds almost exclusively in paddy fields in most regions of the world. The study of its spatial distribution is fundamental for designing correct control strategies, improving sampling procedures, and adopting precise agricultural techniques. Field experiments were conducted during 2011 and 2012 to estimate the spatial distribution pattern of the overwintering larvae. Data were analyzed using five distribution indices and two regression models (Taylor and Iwao). All of the indices and Taylor's model indicated a random spatial distribution pattern of the rice stem borer overwintering larvae. Iwao's patchiness regression was inappropriate for our data, as shown by the non-homogeneity of variance, whereas Taylor's power law fitted the data well. The coefficients of Taylor's power law for the combined 2 years of data were a = -0.1118, b = 0.9202 ± 0.02, and r² = 96.81. Taylor's power law parameters were used to compute the minimum sample size needed to estimate populations at three fixed precision levels (5, 10, and 25%) at the 0.05 probability level. Results based on these equation parameters suggest that the minimum sample sizes needed for a precision level of 0.25 were 74 and 20 rice stubbles when the average density is near 0.10 and 0.20 larvae per rice stubble, respectively.
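
    The standard way to turn Taylor's power law coefficients into a minimum sample size is n = (z/D)² · a · m^(b−2), where z is the standard normal quantile for the chosen confidence level and D is the fixed precision expressed as a proportion of the mean. The sketch below assumes that the reported intercept of −0.1118 is log10(a); it illustrates the formula rather than reproduces the abstract's figures of 74 and 20, which depend on the exact precision definition the authors used.

        from scipy.stats import norm

        # Taylor's power law: s^2 = a * m^b, with the coefficients reported above
        LOG10_A, B_SLOPE = -0.1118, 0.9202
        A = 10 ** LOG10_A            # assumes the reported intercept is log10(a)

        def min_sample_size(mean_density, precision, alpha=0.05):
            """n = (z / D)^2 * a * m^(b - 2), precision D given as SE / mean."""
            z = norm.ppf(1.0 - alpha / 2.0)
            return (z / precision) ** 2 * A * mean_density ** (B_SLOPE - 2.0)

        for m in (0.10, 0.20):
            for D in (0.05, 0.10, 0.25):
                print(f"m = {m:.2f} larvae/stubble, D = {D:.2f}: n ~ {min_sample_size(m, D):.0f}")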

  4. Magnetic response and critical current properties of mesoscopic-size YBCO superconducting samples

    Energy Technology Data Exchange (ETDEWEB)

    Lisboa-Filho, P N [UNESP - Universidade Estadual Paulista, Grupo de Materiais Avancados, Departamento de Fisica, Bauru (Brazil); Deimling, C V; Ortiz, W A, E-mail: plisboa@fc.unesp.b [Grupo de Supercondutividade e Magnetismo, Departamento de Fisica, Universidade Federal de Sao Carlos, Sao Carlos (Brazil)

    2010-01-15

    In this contribution, superconducting specimens of YBa₂Cu₃O₇₋δ were synthesized by a modified polymeric precursor method, yielding a ceramic powder with particles of mesoscopic size. Samples of this powder were then pressed into pellets and sintered under different conditions. The critical current density was analyzed by isothermal AC-susceptibility measurements as a function of the excitation field, as well as with isothermal DC-magnetization runs at different values of the applied field. Relevant features of the magnetic response could be associated with the microstructure of the specimens and, in particular, with the superconducting intra- and intergranular critical current properties.

  5. QUANTUM SIZE EFFECTS IN THE ATTRACTIVE HUBBARD-MODEL

    NARCIS (Netherlands)

    BORMANN, D; SCHNEIDER, T; FRICK, M

    1992-01-01

    We investigate superconducting pair correlations in the attractive Hubbard model on a finite square lattice. Our aim is to understand the pronounced size dependence which they display in the weak and intermediate coupling regimes. These size effects originate from the electronic shell structure of f

  6. The theoretical foundations for size spectrum models of fish communities

    DEFF Research Database (Denmark)

    Andersen, Ken Haste; Jacobsen, Nis Sand; Farnsworth, K.D.

    2016-01-01

    assessment of fisheries. We describe the fundamental concepts in size-based models about food encounter and the bioenergetic budget of individuals. Within the general framework, three model types have emerged that differ in their degree of complexity: the food-web, the trait-based and the community model. ... We demonstrate the differences between the models through examples of their response to fishing and their dynamic behavior. We review implementations of size spectrum models and describe important variations concerning the functional response, whether growth is food-dependent or fixed...

  7. SIMPLIFIED MATHEMATICAL MODEL OF SMALL SIZED UNMANNED AIRCRAFT VEHICLE LAYOUT

    Directory of Open Access Journals (Sweden)

    2016-01-01

    Full Text Available A strong reduction of the design period for new aircraft, using new technology based on artificial intelligence, is a key problem mentioned in forecasts of leading aerospace industry research centers. This article covers an approach to the development of quick aerodynamic design methods based on artificial neural networks. The problem is solved for the classical scheme of a small-sized unmanned aircraft vehicle (UAV). The principal parts of the method are a mathematical model of the layout, a layout generator for this type of aircraft built on artificial neural networks, an automatic selection module for cleaning the variety of layouts generated in automatic mode, a robust direct computational fluid dynamics method, and aerodynamic characteristics approximators based on artificial neural networks. Methods based on artificial neural networks occupy an intermediate position between computational fluid dynamics methods or experiments and simplified engineering approaches. The use of ANNs for estimating aerodynamic characteristics puts limitations on the input data. For this task the layout must be presented as a vector with a dimension not exceeding several hundred. The vector components must include all main parameters conventionally used for layout description and completely capture the most important aerodynamic and structural properties. The first stage of the work is presented in the paper. A simplified mathematical model of a small-sized UAV was developed. To estimate the range of geometrical parameters of the layouts, a review of existing vehicles was carried out. The result of the work is an algorithm and computer software for generating layouts based on ANN technology. 10000 samples were generated, and a dataset containing the geometrical and aerodynamic characteristics of the layouts was created.

  8. Waif goodbye! Average-size female models promote positive body image and appeal to consumers.

    Science.gov (United States)

    Diedrichs, Phillippa C; Lee, Christina

    2011-10-01

    Despite consensus that exposure to media images of thin fashion models is associated with poor body image and disordered eating behaviours, few attempts have been made to enact change in the media. This study sought to investigate an effective alternative to current media imagery, by exploring the advertising effectiveness of average-size female fashion models, and their impact on the body image of both women and men. A sample of 171 women and 120 men were assigned to one of three advertisement conditions: no models, thin models and average-size models. Women and men rated average-size models as equally effective in advertisements as thin and no models. For women with average and high levels of internalisation of cultural beauty ideals, exposure to average-size female models was associated with a significantly more positive body image state in comparison to exposure to thin models and no models. For men reporting high levels of internalisation, exposure to average-size models was also associated with a more positive body image state in comparison to viewing thin models. These findings suggest that average-size female models can promote positive body image and appeal to consumers.

  9. Multipartite geometric entanglement in finite size XY model

    Energy Technology Data Exchange (ETDEWEB)

    Blasone, Massimo; Dell' Anno, Fabio; De Siena, Silvio; Giampaolo, Salvatore Marco; Illuminati, Fabrizio, E-mail: blasone@sa.infn.i [Dipartimento di Matematica e Informatica, Universita degli Studi di Salerno, Via Ponte don Melillo, I-84084 Fisciano (Italy)

    2009-06-01

    We investigate the behavior of the multipartite entanglement in the finite size XY model by means of the hierarchical geometric measure of entanglement. By selecting specific components of the hierarchy, we study both global entanglement and genuinely multipartite entanglement.

  10. Effects of sample size and intraspecific variation in phylogenetic comparative studies: a meta-analytic review.

    Science.gov (United States)

    Garamszegi, László Z; Møller, Anders P

    2010-11-01

    Comparative analyses aim to explain interspecific variation in phenotype among taxa. In this context, phylogenetic approaches are generally applied to control for similarity due to common descent, because such phylogenetic relationships can produce spurious similarity in phenotypes (known as phylogenetic inertia or bias). On the other hand, these analyses largely ignore potential biases due to within-species variation. Phylogenetic comparative studies inherently assume that species-specific means from intraspecific samples of modest sample size are biologically meaningful. However, within-species variation is often significant, because measurement errors, within- and between-individual variation, seasonal fluctuations, and differences among populations can all reduce the repeatability of a trait. Although simulations revealed that low repeatability can increase the type I error in a phylogenetic study, researchers only exercise great care in accounting for similarity in phenotype due to common phylogenetic descent, while problems posed by intraspecific variation are usually neglected. A meta-analysis of 194 comparative analyses all adjusting for similarity due to common phylogenetic descent revealed that only a few studies reported intraspecific repeatabilities, and hardly any considered or partially dealt with errors arising from intraspecific variation. This is intriguing, because the meta-analytic data suggest that the effect of heterogeneous sampling can be as important as phylogenetic bias, and thus they should be equally controlled in comparative studies. We provide recommendations about how to handle such effects of heterogeneous sampling.

  11. Bolton tooth size ratio among a Qatari population sample: An odontometric study

    Science.gov (United States)

    Hashim, Hayder A; AL-Sayed, Najah; AL-Hussain, Hashim

    2017-01-01

    Objectives: To establish the overall and anterior Bolton ratios among a sample of the Qatari population, to investigate whether there is a difference between males and females, and to compare the results with those obtained by Bolton. Materials and Methods: The current study consisted of 100 orthodontic study participants (50 males and 50 females) with different malocclusions and ages ranging between 15 and 20 years. An electronic digital caliper was used to measure the mesiodistal tooth width of all maxillary and mandibular permanent teeth except second and third molars. Student's t-test was used to compare tooth-size ratios between males and females and between the results of the present study and Bolton's results. Results: The anterior and overall ratios in Qatari individuals were 78.6 ± 3.4 and 91.8 ± 3.1, respectively. The tooth-size ratios were slightly greater in males than in females; however, the differences were not statistically significant (P > 0.05). There were no significant differences in the overall ratio between Qatari individuals and Bolton's results (P > 0.05), whereas statistically significant differences were observed in the anterior ratio (P = 0.007). Conclusions: Within the limitations of the present study, a definite conclusion was difficult to establish; thus, a further study with a larger sample in each malocclusion group is required. PMID:28197399

  12. Exact calculation of power and sample size in bioequivalence studies using two one-sided tests.

    Science.gov (United States)

    Shen, Meiyu; Russek-Cohen, Estelle; Slud, Eric V

    2015-01-01

    The number of subjects in a pharmacokinetic two-period two-treatment crossover bioequivalence study is typically small, most often less than 60. The most common approach to testing for bioequivalence is the two one-sided tests procedure. No explicit mathematical formula for the power function in the context of the two one-sided tests procedure exists in the statistical literature, although the exact power based on Owen's special case of bivariate noncentral t-distribution has been tabulated and graphed. Several approximations have previously been published for the probability of rejection in the two one-sided tests procedure for crossover bioequivalence studies. These approximations and associated sample size formulas are reviewed in this article and compared for various parameter combinations with exact power formulas derived here, which are computed analytically as univariate integrals and which have been validated by Monte Carlo simulations. The exact formulas for power and sample size are shown to improve markedly in realistic parameter settings over the previous approximations.
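
    As a rough companion to the exact formulas discussed above, the rejection probability of the TOST procedure can also be checked by simulation. The sketch below is not the authors' univariate-integral implementation; it simulates the log-scale treatment difference and its estimated standard error for a balanced 2x2 crossover under assumed CV and true-ratio values.

```python
# Simulates the log-scale difference and its estimated SE for a balanced 2x2
# crossover; cv is the within-subject CV, theta the BE margin ln(1.25).
import numpy as np
from scipy import stats

def tost_power_mc(n_total, cv, true_ratio=0.95, alpha=0.05,
                  theta=np.log(1.25), n_sim=50_000, seed=1):
    rng = np.random.default_rng(seed)
    sigma_w = np.sqrt(np.log(1.0 + cv ** 2))        # within-subject log-scale SD
    df = n_total - 2
    se = sigma_w * np.sqrt(2.0 / n_total)           # SE of the treatment difference
    delta_hat = rng.normal(np.log(true_ratio), se, n_sim)
    s2 = sigma_w ** 2 * rng.chisquare(df, n_sim) / df
    se_hat = np.sqrt(s2 * 2.0 / n_total)
    t_crit = stats.t.ppf(1.0 - alpha, df)
    lower = (delta_hat + theta) / se_hat > t_crit   # reject H0: ratio <= 0.80
    upper = (delta_hat - theta) / se_hat < -t_crit  # reject H0: ratio >= 1.25
    return float(np.mean(lower & upper))

print(tost_power_mc(n_total=24, cv=0.25))           # e.g. power for N = 24, CV = 25%
```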

  13. A simple model for sizing stand alone photovoltaic systems

    Energy Technology Data Exchange (ETDEWEB)

    Sidrach-de-Cardona, M. [Departamento Fisica Aplicada II, ETSI Informatica, Universidad de Malaga, 29071 Malaga (Spain); Mora Lopez, Ll. [Departamento Lenguajes y C. Computacion, ETSI Informatica, Universidad de Malaga, 29071 Malaga (Spain)

    1998-08-24

    We consider a general model for sizing a stand-alone photovoltaic system, using as energy input data the information available in any radiation atlas. The parameters of the model are estimated by multivariate linear regression. The results obtained from a numerical sizing method were used as initial input data to fit the model. The expression proposed allows us to determine the photovoltaic array size, with a coefficient of determination ranging from 0.94 to 0.98. System parameters and mean monthly values of daily global radiation on the solar module surface are taken as independent variables in the model. It is also shown that the proposed model can be used with the same accuracy for other locations not considered in the estimation of the model.

  14. Modeling particle size distributions by the Weibull distribution function

    Energy Technology Data Exchange (ETDEWEB)

    Fang, Zhigang (Rogers Tool Works, Rogers, AR (United States)); Patterson, B.R.; Turner, M.E. Jr (Univ. of Alabama, Birmingham, AL (United States))

    1993-10-01

    A method is proposed for modeling two- and three-dimensional particle size distributions using the Weibull distribution function. Experimental results show that, for tungsten particles in liquid phase sintered W-14Ni-6Fe, the experimental cumulative section size distributions were well fit by the Weibull probability function, which can also be used to compute the corresponding relative frequency distributions. Modeling the two-dimensional section size distributions facilitates the use of the Saltykov or other methods for unfolding three-dimensional (3-D) size distributions with minimal irregularities. Fitting the unfolded cumulative 3-D particle size distribution with the Weibull function enables computation of the statistical distribution parameters from the parameters of the fit Weibull function.
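
    A minimal sketch of the fitting step described above, using simulated sizes in place of measured section sizes; the distribution parameters are placeholders, and the paper's unfolding (Saltykov) step is not reproduced here.

```python
# Simulated particle sizes in place of measured section sizes (placeholders).
import numpy as np
from scipy import stats

sizes = stats.weibull_min.rvs(c=2.2, scale=8.0, size=400, random_state=2)

shape, loc, scale = stats.weibull_min.fit(sizes, floc=0)        # two-parameter Weibull fit
x = np.sort(sizes)
ecdf = np.arange(1, x.size + 1) / x.size                        # empirical cumulative distribution
fitted = stats.weibull_min.cdf(x, shape, loc=0, scale=scale)
print(f"shape = {shape:.2f}, scale = {scale:.2f}, max |ECDF - fit| = {np.abs(ecdf - fitted).max():.3f}")
```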

  15. What can we learn from studies based on small sample sizes? Comment on Regan, Lakhanpal, and Anguiano (2012).

    Science.gov (United States)

    Johnson, David R; Bachan, Lauren K

    2013-08-01

    In a recent article, Regan, Lakhanpal, and Anguiano (2012) highlighted the lack of evidence for different relationship outcomes between arranged and love-based marriages. Yet the sample size (n = 58) used in the study is insufficient for making such inferences. This reply discusses and demonstrates how small sample sizes reduce the utility of this research.

  16. Size-specific sensitivity: Applying a new structured population model

    Energy Technology Data Exchange (ETDEWEB)

    Easterling, M.R.; Ellner, S.P.; Dixon, P.M.

    2000-03-01

    Matrix population models require the population to be divided into discrete stage classes. In many cases, especially when classes are defined by a continuous variable, such as length or mass, there are no natural breakpoints, and the division is artificial. The authors introduce the integral projection model, which eliminates the need for division into discrete classes, without requiring any additional biological assumptions. Like a traditional matrix model, the integral projection model provides estimates of the asymptotic growth rate, stable size distribution, reproductive values, and sensitivities of the growth rate to changes in vital rates. However, where the matrix model represents the size distributions, reproductive value, and sensitivities as step functions (constant within a stage class), the integral projection model yields smooth curves for each of these as a function of individual size. The authors describe a method for fitting the model to data, and they apply this method to data on an endangered plant species, northern monkshood (Aconitum noveboracense), with individuals classified by stem diameter. The matrix and integral models yield similar estimates of the asymptotic growth rate, but the reproductive values and sensitivities in the matrix model are sensitive to the choice of stage classes. The integral projection model avoids this problem and yields size-specific sensitivities that are not affected by stage duration. These general properties of the integral projection model will make it advantageous for other populations where there is no natural division of individuals into stage classes.
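
    The core computation of an integral projection model can be sketched as follows: discretize the kernel on a size mesh and take the dominant eigenvalue of the resulting matrix as the asymptotic growth rate. The vital-rate functions below are hypothetical stand-ins, not the monkshood estimates.

```python
# Hypothetical vital rates on a size mesh; the kernel is
# K(y, x) = s(x) g(y|x) + f(x) c(y), discretized with the midpoint rule.
import numpy as np

n_mesh, lower, upper = 100, 0.0, 10.0
h = (upper - lower) / n_mesh
x = lower + h * (np.arange(n_mesh) + 0.5)                  # mesh midpoints (e.g. stem diameter)

surv = 1.0 / (1.0 + np.exp(-(0.5 * x - 2.0)))              # survival s(x)
def growth(y, x):                                          # growth kernel g(y|x), Gaussian
    return np.exp(-0.5 * ((y - (1.0 + 0.9 * x)) / 0.8) ** 2) / (0.8 * np.sqrt(2 * np.pi))
fec = 0.05 * x                                             # fecundity f(x)
recruit = np.exp(-0.5 * ((x - 1.0) / 0.5) ** 2) / (0.5 * np.sqrt(2 * np.pi))   # offspring size c(y)

Y, X = np.meshgrid(x, x, indexing="ij")                    # rows: next size y, cols: current size x
K = h * (growth(Y, X) * surv[None, :] + np.outer(recruit, fec))
lam = np.max(np.real(np.linalg.eigvals(K)))
print(f"asymptotic growth rate (dominant eigenvalue) = {lam:.3f}")
```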

  17. A discussion of different methods for estimating sample size (样本含量估算方法探讨)

    Institute of Scientific and Technical Information of China (English)

    喻宁芳

    2014-01-01

    Objective: To introduce and compare different methods of estimating sample size in medical experimental design. Methods: Sample sizes were calculated with different methods, using an experimental study of the effect of a PI3K inhibitor on airway inflammation in a murine asthma model as an example. Results: (1) the formula-based calculation required 12 animals; (2) the Simple procedure in the PASS software required 10; (3) the Stata software required 8; the verified power was 1-β > 0.9 in each case. Conclusion: The sample sizes estimated by the three methods are all reasonable and valid. Investigators can take several calculated results as a basis, analyse the nature of the experimental study, and weigh the influence of research cost, feasibility and ethical requirements on the sample size to determine the most appropriate number of samples.
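
    The formula-based route mentioned above typically reduces, for a two-group comparison of means, to the textbook normal-approximation expression sketched below; the inputs here are hypothetical and are not the parameters of the cited mouse study.

```python
# Textbook normal-approximation formula for comparing two independent means:
# n per group = 2 * ((z_{1-alpha/2} + z_{1-beta}) * sigma / delta)^2, rounded up.
# Inputs below are hypothetical, not the parameters of the cited mouse study.
import math
from scipy import stats

def n_per_group(sigma, delta, alpha=0.05, power=0.90):
    z_a = stats.norm.ppf(1 - alpha / 2)
    z_b = stats.norm.ppf(power)
    return math.ceil(2 * ((z_a + z_b) * sigma / delta) ** 2)

print(n_per_group(sigma=1.0, delta=1.5))   # hypothetical 1.5-SD effect -> about 10 per group
```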

  18. Sample size requirements and analysis of tag recoveries for paired releases of lake trout

    Science.gov (United States)

    Elrod, Joseph H.; Frank, Anthony

    1990-01-01

    A simple chi-square test can be used to analyze recoveries from a paired-release experiment to determine whether differential survival occurs between two groups of fish. The sample size required for analysis is a function of (1) the proportion of fish stocked, (2) the expected proportion at recovery, (3) the level of significance (α) at which the null hypothesis is tested, and (4) the power (1-β) of the statistical test. Detection of a 20% change from a stocking ratio of 50:50 requires a sample of 172 (α=0.10; 1-β=0.80) to 459 (α=0.01; 1-β=0.95) fish. Pooling samples from replicate pairs is sometimes an appropriate way to increase statistical precision without increasing numbers stocked or sampling intensity. Summing over time is appropriate if catchability or survival of the two groups of fish does not change relative to each other through time. Twelve pairs of identical groups of yearling lake trout Salvelinus namaycush were marked with coded wire tags and stocked into Lake Ontario. Recoveries of fish at ages 2-8 showed differences of 1-14% from the initial stocking ratios. Mean tag recovery rates were 0.217%, 0.156%, 0.128%, 0.121%, 0.093%, 0.042%, and 0.016% for ages 2-8, respectively. At these rates, stocking 12,100-29,700 fish per group would yield samples of 172-459 fish at ages 2-8 combined.
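
    A rough normal-approximation sketch of the sample-size reasoning described above (detecting a shift from a 50:50 recovery ratio); it is not the authors' chi-square formulation, so the resulting numbers approximate rather than reproduce the published range of 172-459.

```python
# Normal-approximation sample size for detecting a shift in a single proportion
# from p0 = 0.5 to p1 = 0.6 (a 20% change from a 50:50 ratio). This is not the
# authors' chi-square formulation, so the printed values only approximate 172-459.
import math
from scipy import stats

def n_detect_shift(p0=0.5, p1=0.6, alpha=0.10, power=0.80):
    z_a = stats.norm.ppf(1 - alpha / 2)
    z_b = stats.norm.ppf(power)
    num = z_a * math.sqrt(p0 * (1 - p0)) + z_b * math.sqrt(p1 * (1 - p1))
    return math.ceil((num / (p1 - p0)) ** 2)

print(n_detect_shift(alpha=0.10, power=0.80), n_detect_shift(alpha=0.01, power=0.95))
```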

  19. Size-resolved culturable airborne bacteria sampled in rice field, sanitary landfill, and waste incineration sites.

    Science.gov (United States)

    Heo, Yongju; Park, Jiyeon; Lim, Sung-Il; Hur, Hor-Gil; Kim, Daesung; Park, Kihong

    2010-08-01

    Size-resolved bacterial concentrations in atmospheric aerosols sampled with a six-stage viable impactor at rice field, sanitary landfill, and waste incinerator sites were determined. Culture-based and polymerase chain reaction (PCR) methods were used to identify the airborne bacteria. The culturable bacteria concentration in total suspended particles (TSP) was highest (848 colony-forming units (CFU)/m³) at the sanitary landfill sampling site, while the rice field sampling site had the lowest (125 CFU/m³). The closed landfill would be the main source of the observed bacteria concentration at the sanitary landfill. The rice field sampling site was fully covered by rice grain under wet conditions before harvest and made no significant contribution to the airborne bacteria concentration; this is probably because dry conditions favor the suspension of soil particles, and because this area had limited personnel and vehicle flow. The respirable fraction, calculated from particles smaller than 3.3 μm, was highest (26%) at the sanitary landfill sampling site, followed by the waste incinerator (19%) and rice field (10%) sites, which is lower than the respirable fractions reported in the previous literature. We identified 58 species in 23 genera of culturable bacteria; Microbacterium, Staphylococcus, and Micrococcus were the most abundant genera at the sanitary landfill, waste incinerator, and rice field sites, respectively. An antibiotic resistance test on these bacteria (Micrococcus sp., Microbacterium sp., and Staphylococcus sp.) showed that Staphylococcus sp. had the strongest resistance to both antibiotics (25.0% resistance to 32 μg ml⁻¹ of chloramphenicol and 62.5% resistance to 4 μg ml⁻¹ of gentamicin).

  20. Holonic Business Process Modeling in Small to Medium Sized Enterprises

    Directory of Open Access Journals (Sweden)

    Nur Budi Mulyono

    2012-01-01

    Full Text Available Holonic modeling analysis, the application of systems thinking to design, management, and improvement, is used in a novel context for business process modeling. An approach and techniques based on holons and holarchies are presented specifically for process modeling development in small and medium-sized enterprises. The fitness of the approach is compared with the well-known reductionist, or task-breakdown, approach. The strengths and weaknesses of holonic modeling are discussed with an illustrative case example in terms of its suitability for an Indonesian small and medium-sized industry. The novel ideas in this paper have a great impact on the way analysts should perceive business processes. Future research will apply the approach in a supply chain context. Key words: business process, holonic modeling, operations management, small to medium-sized enterprise

  1. A cold finger cooling system for the efficient graphitisation of microgram-sized carbon samples

    Science.gov (United States)

    Yang, Bin; Smith, A. M.; Hua, Quan

    2013-01-01

    At ANSTO, we use the Bosch reaction to convert sample CO2 to graphite for production of our radiocarbon AMS targets. Key to the efficient graphitisation of ultra-small samples are the type of iron catalyst used and the effective trapping of water vapour during the reaction. Here we report a simple liquid nitrogen cooling system that enables us to rapidly adjust the temperature of the cold finger in our laser-heated microfurnace. This has led to an improvement in the graphitisation of microgram-sized carbon samples. This simple system uses modest amounts of liquid nitrogen (typically <200 mL/h during graphitisation) and is compact and reliable. We have used it to produce over 120 AMS targets containing between 5 and 20 μg of carbon, with conversion efficiencies for 5 μg targets ranging from 80% to 100%. In addition, this cooling system has been adapted for use with our conventional graphitisation reactors and has also improved their performance.

  2. Effects of sample size on estimation of rainfall extremes at high temperatures

    Directory of Open Access Journals (Sweden)

    B. Boessenkool

    2017-09-01

    Full Text Available High precipitation quantiles tend to rise with temperature, following the so-called Clausius–Clapeyron (CC) scaling. It is often reported that the CC-scaling relation breaks down and even reverts for very high temperatures. In our study, we investigate this reversal using observational climate data from 142 stations across Germany. One of the suggested meteorological explanations for the breakdown is limited moisture supply. Here we argue that, instead, it could simply originate from undersampling. As rainfall frequency generally decreases with higher temperatures, rainfall intensities as dictated by CC scaling are less likely to be recorded than for moderate temperatures. Empirical quantiles are conventionally estimated from order statistics via various forms of plotting position formulas. They have in common that their largest representable return period is given by the sample size. In small samples, high quantiles are underestimated accordingly. The small-sample effect is weaker, or disappears completely, when using parametric quantile estimates from a generalized Pareto distribution (GPD) fitted with L-moments. For those, we obtain quantiles of rainfall intensities that continue to rise with temperature.
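
    A minimal illustration of the contrast drawn above between empirical and GPD-based quantiles in a small sample; the data are simulated and the fit uses maximum likelihood rather than the L-moments approach of the paper.

```python
# Simulated rainfall excesses; the paper fits the GPD with L-moments, here MLE is used.
import numpy as np
from scipy import stats

excesses = stats.genpareto.rvs(c=0.1, scale=2.0, size=40, random_state=3)   # small sample

q = 0.99                                          # target non-exceedance probability
empirical = np.quantile(excesses, q)              # capped by the largest observation
c_hat, loc, scale = stats.genpareto.fit(excesses, floc=0)
parametric = stats.genpareto.ppf(q, c_hat, loc=0, scale=scale)
print(f"empirical 99% quantile: {empirical:.2f}, GPD 99% quantile: {parametric:.2f}")
```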

  3. Holonic Business Process Modeling in Small to Medium Sized Enterprises

    OpenAIRE

    Nur Budi Mulyono; Tezar Yuliansyah Saputra; Nur Arief Rahmatsyah

    2012-01-01

    Holonic modeling analysis which is the application of system thinking in design, manage, and improvement, is used in a novel context for business process modeling. An approach and techniques of holon and holarchies is presented specifically for small and medium sized enterprise process modeling development. The fitness of the approach is compared with well known reductionist or task breakdown approach. The strength and weaknesses of the holonic modeling is discussed with illustrating case exa...

  4. What about N? A methodological study of sample-size reporting in focus group studies

    Directory of Open Access Journals (Sweden)

    Glenton Claire

    2011-03-01

    Full Text Available Abstract Background Focus group studies are increasingly published in health related journals, but we know little about how researchers use this method, particularly how they determine the number of focus groups to conduct. The methodological literature commonly advises researchers to follow principles of data saturation, although practical advice on how to do this is lacking. Our objectives were, firstly, to describe the current status of sample size in focus group studies reported in health journals and, secondly, to assess whether and how researchers explain the number of focus groups they carry out. Methods We searched PubMed for studies that had used focus groups and that had been published in open access journals during 2008, and extracted data on the number of focus groups and on any explanation authors gave for this number. We also did a qualitative assessment of the papers with regard to how the number of groups was explained and discussed. Results We identified 220 papers published in 117 journals. In these papers insufficient reporting of sample sizes was common. The number of focus groups conducted varied greatly (mean 8.4, median 5, range 1 to 96). Thirty seven (17%) studies attempted to explain the number of groups. Six studies referred to rules of thumb in the literature, three stated that they were unable to organize more groups for practical reasons, while 28 studies stated that they had reached a point of saturation. Among those stating that they had reached a point of saturation, several appeared not to have followed principles from grounded theory, where data collection and analysis is an iterative process until saturation is reached. Studies with high numbers of focus groups did not offer explanations for the number of groups. Too much data as a study weakness was not an issue discussed in any of the reviewed papers. Conclusions Based on these findings we suggest that journals adopt more stringent requirements for focus group method

  5. A two-stage Bayesian design with sample size reestimation and subgroup analysis for phase II binary response trials.

    Science.gov (United States)

    Zhong, Wei; Koopmeiners, Joseph S; Carlin, Bradley P

    2013-11-01

    Frequentist sample size determination for binary outcome data in a two-arm clinical trial requires initial guesses of the event probabilities for the two treatments. Misspecification of these event rates may lead to a poor estimate of the necessary sample size. In contrast, the Bayesian approach, which considers the treatment effect to be a random variable with some distribution, may offer a better, more flexible approach. The Bayesian sample size proposed by Whitehead et al. (2008) for exploratory studies on efficacy justifies the acceptable minimum sample size by a "conclusiveness" condition. In this work, we introduce a new two-stage Bayesian design with sample size reestimation at the interim stage. Our design inherits the properties of good interpretation and easy implementation from Whitehead et al. (2008), generalizes their method to a two-sample setting, and uses a fully Bayesian predictive approach to reduce an overly large initial sample size when necessary. Moreover, our design can be extended to allow patient-level covariates via logistic regression, now adjusting sample size within each subgroup based on interim analyses. We illustrate the benefits of our approach with a design in non-Hodgkin lymphoma with a simple binary covariate (patient gender), offering an initial step toward within-trial personalized medicine. Copyright © 2013 Elsevier Inc. All rights reserved.
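
    The predictive step of such a design can be illustrated with a simple beta-binomial sketch: given interim data, compute the posterior predictive probability that the final posterior will meet a conclusiveness-style threshold for each candidate additional sample size. This is a generic illustration, not the authors' Whitehead-based two-stage design, and all thresholds below are hypothetical.

```python
# Illustrative beta-binomial sketch (hypothetical thresholds, not the authors'
# design): posterior predictive probability that, after n_add more patients,
# the final posterior satisfies P(p > p0 | data) >= post_thresh.
import numpy as np
from scipy import stats

def predictive_success_prob(x, n, n_add, p0=0.3, post_thresh=0.95, a=1.0, b=1.0):
    k = np.arange(n_add + 1)                                    # possible future successes
    pred = stats.betabinom.pmf(k, n_add, a + x, b + n - x)      # posterior predictive pmf
    post_prob = stats.beta.sf(p0, a + x + k, b + (n - x) + (n_add - k))
    return float(np.sum(pred * (post_prob >= post_thresh)))

x, n = 9, 20                                                    # hypothetical interim data
for n_add in (10, 20, 30, 40):
    print(n_add, round(predictive_success_prob(x, n, n_add), 3))
```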

  6. The quality of the reported sample size calculations in randomized controlled trials indexed in PubMed.

    Science.gov (United States)

    Lee, Paul H; Tse, Andy C Y

    2017-05-01

    There are limited data on the quality of reporting of the information essential for replicating the sample size calculation, as well as on the accuracy of the calculation itself. We examined the current quality of reporting of the sample size calculation in randomized controlled trials (RCTs) published in PubMed and the variation in reporting across study design, study characteristics, and journal impact factor. We also reviewed the targeted sample size reported in trial registries. We reviewed and analyzed all RCTs published in December 2014 in journals indexed in PubMed. The 2014 Impact Factors of the journals were used as proxies for their quality. Of the 451 analyzed papers, 58.1% reported an a priori sample size calculation. Nearly all papers provided the level of significance (97.7%) and desired power (96.6%), and most of the papers reported the minimum clinically important effect size (73.3%). The median percentage difference between the reported and the recalculated sample size was 0.0% (inter-quartile range -4.6% to 3.0%). The accuracy of the reported sample size was better for studies published in journals that endorsed the CONSORT statement and in journals with an impact factor. A total of 98 papers provided a targeted sample size in trial registries; about two-thirds of these papers (n=62) reported a sample size calculation, but only 25 (40.3%) showed no discrepancy with the number reported in the trial registries. The reporting of the sample size calculation in RCTs published in PubMed-indexed journals and in trial registries was poor. The CONSORT statement should be more widely endorsed. Copyright © 2016 European Federation of Internal Medicine. Published by Elsevier B.V. All rights reserved.

  7. OPC model sampling evaluation and weakpoint "in-situ" improvement

    Science.gov (United States)

    Fu, Nan; Elshafie, Shady; Ning, Guoxiang; Roling, Stefan

    2016-10-01

    One of the major challenges for optical proximity correction (OPC) models is to maximize the coverage of real design features using the sampling pattern. Normally, OPC model building is based on 1-D and 2-D test patterns with systematically changing pitches in alignment with design rules. However, features with different optical and geometric properties will generate weak-points where OPC simulation cannot precisely predict resist contours on the wafer, owing to the effectively infinite variety of IC designs and the limited number of model test patterns. In this paper, optical property data of real design features were collected from full chips and classified for comparison with the same kind of data from OPC test patterns. Sample coverage could therefore be visually mapped according to the different optical properties. Design features that are beyond OPC model capability were distinguished by their optical properties and marked as weak-points. New patterns with similar optical properties would then be added to the model-build site list. Further, an alternative and more efficient method was created in this paper to improve the treatment of issue features and remove weak-points without rebuilding models. Since certain classes of optical properties will generate weak-points, an OPC-integrated repair algorithm was developed and implemented to scan the full chip for optical properties, locate those features, and then optimize the OPC treatment or apply precise sizing on site. This is named an "in-situ" weak-point improvement flow, which includes issue-feature definition, allocation in the full chip, and real-time improvement.

  8. Determination of reference limits: statistical concepts and tools for sample size calculation.

    Science.gov (United States)

    Wellek, Stefan; Lackner, Karl J; Jennen-Steinmetz, Christine; Reinhard, Iris; Hoffmann, Isabell; Blettner, Maria

    2014-12-01

    Reference limits are estimators for 'extreme' percentiles of the distribution of a quantitative diagnostic marker in the healthy population. In most cases, interest will be in the 90% or 95% reference intervals. The standard parametric method of determining reference limits consists of computing quantities of the form X̅±c·S. The proportion of covered values in the underlying population coincides with the specificity obtained when a measurement value falling outside the corresponding reference region is classified as diagnostically suspect. Nonparametrically, reference limits are estimated by means of so-called order statistics. In both approaches, the precision of the estimate depends on the sample size. We present computational procedures for calculating minimally required numbers of subjects to be enrolled in a reference study. The much more sophisticated concept of reference bands replacing statistical reference intervals in case of age-dependent diagnostic markers is also discussed.
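
    A minimal sketch of the two estimation approaches contrasted above, on simulated marker values; it ignores the tolerance-band and sample-size machinery the paper develops and simply computes a central 95% reference interval both ways.

```python
# Simulated healthy-population marker values; replace with real reference data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
marker = rng.normal(5.0, 1.2, size=120)

c = stats.norm.ppf(0.975)                                  # multiplier for a central 95% interval
parametric = (marker.mean() - c * marker.std(ddof=1),
              marker.mean() + c * marker.std(ddof=1))      # X-bar +/- c*S
nonparametric = np.quantile(marker, [0.025, 0.975])        # order-statistic (percentile) limits
print("parametric:", np.round(parametric, 2), "nonparametric:", np.round(nonparametric, 2))
```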

  9. Analyzing insulin samples by size-exclusion chromatography: a column degradation study.

    Science.gov (United States)

    Teska, Brandon M; Kumar, Amit; Carpenter, John F; Wempe, Michael F

    2015-04-01

    Investigating insulin analogs and probing their intrinsic stability at physiological temperature, we observed significant degradation in the size-exclusion chromatography (SEC) signal over a moderate number of insulin sample injections, which generated concerns about the quality of the separations. Therefore, our research goal was to identify the cause(s) for the observed signal degradation and attempt to mitigate the degradation in order to extend SEC column lifespan. In these studies, we used multiangle light scattering, nuclear magnetic resonance, and gas chromatography-mass spectrometry methods to evaluate column degradation. The results from these studies illustrate: (1) that zinc ions introduced by the insulin product produced the observed column performance issues; and (2) that including ethylenediaminetetraacetic acid, a zinc chelator, in the mobile phase helped to maintain column performance.

  10. Sample Size Dependence of Second Magnetization Peak in Type-II Superconductors

    Institute of Scientific and Technical Information of China (English)

    2003-01-01

    We show that the second magnetization peak (SMP), i.e., an increase in the magnetization hysteresis loop width in type-II superconductors, vanishes for samples smaller than a critical size. We argue that the SMP is not related to critical current enhancement but can be well explained within the framework of thermomagnetic flux-jump instability theory, where flux jumps reduce the absolute irreversible magnetization relative to the isothermal critical-state value at low enough magnetic fields. The recovery of the isothermal critical state with increasing field leads to the SMP. The low-field SMP takes place in both low-Tc conventional and high-Tc unconventional superconductors. Our results show that the restoration of the isothermal critical state is responsible for the occurrence of the SMP in both cases.

  11. Estimating survival rates in ecological studies with small unbalanced sample sizes: an alternative Bayesian point estimator

    Directory of Open Access Journals (Sweden)

    Christian Damgaard

    2011-12-01

    Full Text Available Increasingly, the survival rates in experimental ecology are presented using odds ratios or log response ratios, but the use of ratio metrics has a problem when all the individuals have either died or survived in only one replicate. In the empirical ecological literature, the problem often has been ignored or circumvented by different, more or less ad hoc approaches. Here, it is argued that the best summary statistic for communicating ecological results of frequency data in studies with small unbalanced samples may be the mean of the posterior distribution of the survival rate. The developed approach may be particularly useful when effect size indexes, such as odds ratios, are needed to compare frequency data between treatments, sites or studies.
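
    The point about ratio metrics can be made concrete with a tiny sketch: under a uniform Beta(1, 1) prior, the posterior mean survival rate remains well defined even when every individual in a replicate died or survived, exactly where the odds become 0 or infinite. The replicate counts below are hypothetical.

```python
# Hypothetical replicates as (survivors, total); not data from the paper.
surviving = [(0, 5), (5, 5), (3, 4)]

for x, n in surviving:
    odds = x / (n - x) if 0 < x < n else (float("inf") if x == n else 0.0)
    posterior_mean = (x + 1) / (n + 2)          # mean of Beta(1 + x, 1 + n - x) under a uniform prior
    print(f"x={x}, n={n}: odds={odds}, posterior mean survival={posterior_mean:.2f}")
```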

  12. Stratifying patients with peripheral neuropathic pain based on sensory profiles: algorithm and sample size recommendations

    Science.gov (United States)

    Vollert, Jan; Maier, Christoph; Attal, Nadine; Bennett, David L.H.; Bouhassira, Didier; Enax-Krumova, Elena K.; Finnerup, Nanna B.; Freynhagen, Rainer; Gierthmühlen, Janne; Haanpää, Maija; Hansson, Per; Hüllemann, Philipp; Jensen, Troels S.; Magerl, Walter; Ramirez, Juan D.; Rice, Andrew S.C.; Schuh-Hofer, Sigrid; Segerdahl, Märta; Serra, Jordi; Shillo, Pallai R.; Sindrup, Soeren; Tesfaye, Solomon; Themistocleous, Andreas C.; Tölle, Thomas R.; Treede, Rolf-Detlef; Baron, Ralf

    2017-01-01

    Abstract In a recent cluster analysis, it has been shown that patients with peripheral neuropathic pain can be grouped into 3 sensory phenotypes based on quantitative sensory testing profiles, which are mainly characterized by either sensory loss, intact sensory function and mild thermal hyperalgesia and/or allodynia, or loss of thermal detection and mild mechanical hyperalgesia and/or allodynia. Here, we present an algorithm for allocation of individual patients to these subgroups. The algorithm is nondeterministic—ie, a patient can be sorted to more than one phenotype—and can separate patients with neuropathic pain from healthy subjects (sensitivity: 78%, specificity: 94%). We evaluated the frequency of each phenotype in a population of patients with painful diabetic polyneuropathy (n = 151), painful peripheral nerve injury (n = 335), and postherpetic neuralgia (n = 97) and propose sample sizes of study populations that need to be screened to reach a subpopulation large enough to conduct a phenotype-stratified study. The most common phenotype in diabetic polyneuropathy was sensory loss (83%), followed by mechanical hyperalgesia (75%) and thermal hyperalgesia (34%, note that percentages are overlapping and not additive). In peripheral nerve injury, frequencies were 37%, 59%, and 50%, and in postherpetic neuralgia, frequencies were 31%, 63%, and 46%. For parallel study design, either the estimated effect size of the treatment needs to be high (>0.7) or only phenotypes that are frequent in the clinical entity under study can realistically be performed. For crossover design, populations under 200 patients screened are sufficient for all phenotypes and clinical entities with a minimum estimated treatment effect size of 0.5. PMID:28595241

  13. Presentation of coefficient of variation for bioequivalence sample-size calculation.

    Science.gov (United States)

    Lee, Yi Lin; Mak, Wen Yao; Looi, Irene; Wong, Jia Woei; Yuen, Kah Hay

    2017-07-01

    The current study aimed to further contribute information on the intrasubject coefficient of variation (CV) from 43 bioequivalence studies conducted by our center. Consistent with Yuen et al. (2001), the current work also attempted to evaluate the effect of the different parameters (AUC0-t, AUC0-∞, and Cmax) used in the estimation of the study power. Furthermore, we estimated the number of subjects required for each study from the intrasubject CV of AUC0-∞, also taking into consideration the minimum sample-size requirement set by the US FDA. A total of 37 immediate-release and 6 extended-release formulations of 28 different active pharmaceutical ingredients (APIs) were evaluated. Of the studies conducted, 10 did not achieve satisfactory statistical power on two or more parameters; 4 consistently scored poorly across all three parameters. In general, intrasubject CV values calculated from Cmax were more variable than those from either AUC0-t or AUC0-∞. Twenty of the 43 studies did not achieve more than 80% power when power was calculated from the Cmax value, compared with only 11 (AUC0-∞) and 8 (AUC0-t) studies. This finding is consistent with Steinijans et al. (1995) [2] and Yuen et al. (2001) [3]. In conclusion, the CV values obtained from AUC0-t and AUC0-∞ were similar, while those derived from Cmax were consistently more variable. Hence, the CV derived from AUC rather than Cmax should be used in sample-size calculation to achieve a sufficient, yet practical, test power.

  14. Measuring proteins with greater speed and resolution while reducing sample size.

    Science.gov (United States)

    Hsieh, Vincent H; Wyatt, Philip J

    2017-08-30

    A multi-angle light scattering (MALS) system, combined with chromatographic separation, directly measures the absolute molar mass, size and concentration of the eluate species. The measurement of these crucial properties in solution is essential in basic macromolecular characterization and all research and production stages of bio-therapeutic products. We developed a new MALS methodology that has overcome the long-standing, stubborn barrier to microliter-scale peak volumes and achieved the highest resolution and signal-to-noise performance of any MALS measurement. The novel design simultaneously facilitates online dynamic light scattering (DLS) measurements. As National Institute of Standards and Technology (NIST) new protein standard reference material (SRM 8671) is becoming the benchmark molecule against which many biomolecular analytical techniques are assessed and evaluated, we present its measurement results as a demonstration of the unique capability of our system to swiftly resolve and measure sharp (20~25 µL full-width-half-maximum) chromatography peaks. Precise measurements of protein mass and size can be accomplished 10 times faster than before with improved resolution. In the meantime the sample amount required for such measurements is reduced commensurately. These abilities will have far-reaching impacts at every stage of the development and production of biologics and bio-therapeutic formulations.

  15. Response Surface Modelling of Electrosprayed Polyacrylonitrile Nanoparticle Size

    Directory of Open Access Journals (Sweden)

    Sanaz Khademolqorani

    2014-01-01

    Full Text Available Electrospraying (electrohydrodynamic spraying) is a method of liquid atomization by electrical forces. Spraying solutions or suspensions allows the production of fine particles, down to nanometer size. These particles are interesting for a wide variety of applications, thanks to their unprecedented chemical and physical behaviour in comparison with their bulk form. Knowledge of the particle size in powders is important in many studies employing nanoparticles. In this paper, the effect of some process parameters on the size of electrosprayed polyacrylonitrile particles is presented in the form of a response surface model. The model is obtained by employing a factorial design and response surface methodology to evaluate the influence of the parameters on polyacrylonitrile nanoparticle size. Four electrospraying parameters, namely applied voltage, electrospraying solution concentration, flow rate, and syringe needle diameter, were considered.

  16. A regression-based differential expression detection algorithm for microarray studies with ultra-low sample size.

    Science.gov (United States)

    Vasiliu, Daniel; Clamons, Samuel; McDonough, Molly; Rabe, Brian; Saha, Margaret

    2015-01-01

    Global gene expression analysis using microarrays and, more recently, RNA-seq, has allowed investigators to understand biological processes at a system level. However, the identification of differentially expressed genes in experiments with small sample size, high dimensionality, and high variance remains challenging, limiting the usability of these tens of thousands of publicly available, and possibly many more unpublished, gene expression datasets. We propose a novel variable selection algorithm for ultra-low-n microarray studies using generalized linear model-based variable selection with a penalized binomial regression algorithm called penalized Euclidean distance (PED). Our method uses PED to build a classifier on the experimental data to rank genes by importance. In place of cross-validation, which is required by most similar methods but not reliable for experiments with small sample size, we use a simulation-based approach to additively build a list of differentially expressed genes from the rank-ordered list. Our simulation-based approach maintains a low false discovery rate while maximizing the number of differentially expressed genes identified, a feature critical for downstream pathway analysis. We apply our method to microarray data from an experiment perturbing the Notch signaling pathway in Xenopus laevis embryos. This dataset was chosen because it showed very little differential expression according to limma, a powerful and widely-used method for microarray analysis. Our method was able to detect a significant number of differentially expressed genes in this dataset and suggest future directions for investigation. Our method is easily adaptable for analysis of data from RNA-seq and other global expression experiments with low sample size and high dimensionality.

  17. A regression-based differential expression detection algorithm for microarray studies with ultra-low sample size.

    Directory of Open Access Journals (Sweden)

    Daniel Vasiliu

    Full Text Available Global gene expression analysis using microarrays and, more recently, RNA-seq, has allowed investigators to understand biological processes at a system level. However, the identification of differentially expressed genes in experiments with small sample size, high dimensionality, and high variance remains challenging, limiting the usability of these tens of thousands of publicly available, and possibly many more unpublished, gene expression datasets. We propose a novel variable selection algorithm for ultra-low-n microarray studies using generalized linear model-based variable selection with a penalized binomial regression algorithm called penalized Euclidean distance (PED. Our method uses PED to build a classifier on the experimental data to rank genes by importance. In place of cross-validation, which is required by most similar methods but not reliable for experiments with small sample size, we use a simulation-based approach to additively build a list of differentially expressed genes from the rank-ordered list. Our simulation-based approach maintains a low false discovery rate while maximizing the number of differentially expressed genes identified, a feature critical for downstream pathway analysis. We apply our method to microarray data from an experiment perturbing the Notch signaling pathway in Xenopus laevis embryos. This dataset was chosen because it showed very little differential expression according to limma, a powerful and widely-used method for microarray analysis. Our method was able to detect a significant number of differentially expressed genes in this dataset and suggest future directions for investigation. Our method is easily adaptable for analysis of data from RNA-seq and other global expression experiments with low sample size and high dimensionality.

  18. Homeopathy: statistical significance versus the sample size in experiments with Toxoplasma gondii

    Directory of Open Access Journals (Sweden)

    Ana Lúcia Falavigna Guilherme

    2011-09-01

    , examined in its full length. This study was approved by the Ethics Committee for animal experimentation of the UEM (Protocol 036/2009). The data were compared using the Mann-Whitney and bootstrap tests [7] with the statistical software BioStat 5.0. Results and discussion: There was no significant difference when the data were analyzed with the Mann-Whitney test, even when multiplying the "n" ten times (p=0.0618). The number of cysts observed was 4.5 ± 3.3 in the BIOT 200DH group and 12.8 ± 9.7 in the CONTROL group. Table 1 shows the results obtained using the bootstrap analysis for each dataset enlarged from 2n up to 2n+5, with their respective p-values. By randomly adding elements to the different groups one by one, gradually increasing the samples, we determined the sample size needed to statistically confirm the results seen experimentally. With 17 mice in the BIOT 200DH group and 19 in the CONTROL group, statistical significance was already observed. This result suggests that experiments involving highly diluted substances and infection of mice with T. gondii should work with experimental groups of at least 17 animals. Despite the current and relevant ethical discussions about the number of animals used in experimental procedures, the number of animals involved in each experiment must suit the characteristics of the question being studied. In the case of experiments involving highly diluted substances, experimental animal models are still rudimentary and the biological effects observed also appear to be individualized, as described in the homeopathy literature [8]. The fact that statistical significance was achieved by increasing the sample in this trial points to a rare event with strongly individual behavior, which is difficult to demonstrate in a dataset analyzed simply by comparing means or medians. Conclusion: Bootstrap seems to be an interesting methodology for the analysis of data obtained from experiments with highly diluted
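
    The bootstrap comparison described above can be sketched generically as follows; the cyst counts are hypothetical, not the study data, and the resampling scheme is a simple percentile-style comparison of group means rather than the authors' BioStat procedure.

```python
# Hypothetical cyst counts for two small groups; not the study data.
import numpy as np

rng = np.random.default_rng(5)
treated = np.array([2, 4, 1, 9, 5, 3, 7, 5])
control = np.array([12, 3, 25, 10, 8, 20, 6, 18])

n_boot = 10_000
t_means = rng.choice(treated, (n_boot, treated.size), replace=True).mean(axis=1)
c_means = rng.choice(control, (n_boot, control.size), replace=True).mean(axis=1)
print("P(bootstrap treated mean < control mean) =", np.mean(t_means < c_means))
```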

  19. Statistical tests with accurate size and power for balanced linear mixed models.

    Science.gov (United States)

    Muller, Keith E; Edwards, Lloyd J; Simpson, Sean L; Taylor, Douglas J

    2007-08-30

    The convenience of linear mixed models for Gaussian data has led to their widespread use. Unfortunately, standard mixed model tests often have greatly inflated test size in small samples. Many applications with correlated outcomes in medical imaging and other fields have simple properties which do not require the generality of a mixed model. Alternatively, stating the special cases as a general linear multivariate model allows analysing them with either the univariate or multivariate approach to repeated measures (UNIREP, MULTIREP). Even in small samples, an appropriate UNIREP or MULTIREP test always controls test size and has a good power approximation, in sharp contrast to mixed model tests. Hence, mixed model tests should never be used when one of the UNIREP tests (uncorrected, Huynh-Feldt, Geisser-Greenhouse, Box conservative) or MULTIREP tests (Wilks, Hotelling-Lawley, Roy's, Pillai-Bartlett) applies. Convenient methods give exact power for the uncorrected and Box conservative tests. Simulations demonstrate that new power approximations for all four UNIREP tests eliminate most inaccuracy in existing methods. In turn, free software implements the approximations to give a better choice of sample size. Two repeated measures power analyses illustrate the methods. The examples highlight the advantages of examining the entire response surface of power as a function of sample size, mean differences, and variability.

  20. Exploring Explanations of Subglacial Bedform Sizes Using Statistical Models.

    Directory of Open Access Journals (Sweden)

    John K Hillier

    Full Text Available Sediments beneath modern ice sheets exert a key control on their flow, but are largely inaccessible except through geophysics or boreholes. In contrast, palaeo-ice sheet beds are accessible, and typically characterised by numerous bedforms. However, the interaction between bedforms and ice flow is poorly constrained and it is not clear how bedform sizes might reflect ice flow conditions. To better understand this link we present a first exploration of a variety of statistical models to explain the size distribution of some common subglacial bedforms (i.e., drumlins, ribbed moraine, MSGL. By considering a range of models, constructed to reflect key aspects of the physical processes, it is possible to infer that the size distributions are most effectively explained when the dynamics of ice-water-sediment interaction associated with bedform growth is fundamentally random. A 'stochastic instability' (SI model, which integrates random bedform growth and shrinking through time with exponential growth, is preferred and is consistent with other observations of palaeo-bedforms and geophysical surveys of active ice sheets. Furthermore, we give a proof-of-concept demonstration that our statistical approach can bridge the gap between geomorphological observations and physical models, directly linking measurable size-frequency parameters to properties of ice sheet flow (e.g., ice velocity. Moreover, statistically developing existing models as proposed allows quantitative predictions to be made about sizes, making the models testable; a first illustration of this is given for a hypothesised repeat geophysical survey of bedforms under active ice. Thus, we further demonstrate the potential of size-frequency distributions of subglacial bedforms to assist the elucidation of subglacial processes and better constrain ice sheet models.

  1. A Laser-Deposition Approach to Compositional-Spread Discovery of Materials on Conventional Sample Sizes

    Energy Technology Data Exchange (ETDEWEB)

    Christen, Hans M [ORNL; Okubo, Isao [ORNL; Rouleau, Christopher M [ORNL; Jellison Jr, Gerald Earle [ORNL; Puretzky, Alexander A [ORNL; Geohegan, David B [ORNL; Lowndes, Douglas H [ORNL

    2005-01-01

    Parallel (multi-sample) approaches, such as discrete combinatorial synthesis or continuous compositional-spread (CCS), can significantly increase the rate of materials discovery and process optimization. Here we review our generalized CCS method, based on pulsed-laser deposition, in which the synchronization between laser firing and substrate translation (behind a fixed slit aperture) yields the desired variations of composition and thickness. In situ alloying makes this approach applicable to the non-equilibrium synthesis of metastable phases. Deposition on a heater plate with a controlled spatial temperature variation can additionally be used for growth-temperature-dependence studies. Composition and temperature variations are controlled on length scales large enough to yield sample sizes sufficient for conventional characterization techniques (such as temperature-dependent measurements of resistivity or magnetic properties). This technique has been applied to various experimental studies, and we present here the results for the growth of electro-optic materials (Sr{sub x}Ba{sub 1-x}Nb{sub 2}O{sub 6}) and magnetic perovskites (Sr{sub 1-x}Ca{sub x}RuO{sub 3}), and discuss the application to the understanding and optimization of catalysts used in the synthesis of dense forests of carbon nanotubes.

  2. A laser-deposition approach to compositional-spread discovery of materials on conventional sample sizes

    Science.gov (United States)

    Christen, Hans M.; Ohkubo, Isao; Rouleau, Christopher M.; Jellison, Gerald E., Jr.; Puretzky, Alex A.; Geohegan, David B.; Lowndes, Douglas H.

    2005-01-01

    Parallel (multi-sample) approaches, such as discrete combinatorial synthesis or continuous compositional-spread (CCS), can significantly increase the rate of materials discovery and process optimization. Here we review our generalized CCS method, based on pulsed-laser deposition, in which the synchronization between laser firing and substrate translation (behind a fixed slit aperture) yields the desired variations of composition and thickness. In situ alloying makes this approach applicable to the non-equilibrium synthesis of metastable phases. Deposition on a heater plate with a controlled spatial temperature variation can additionally be used for growth-temperature-dependence studies. Composition and temperature variations are controlled on length scales large enough to yield sample sizes sufficient for conventional characterization techniques (such as temperature-dependent measurements of resistivity or magnetic properties). This technique has been applied to various experimental studies, and we present here the results for the growth of electro-optic materials (SrxBa1-xNb2O6) and magnetic perovskites (Sr1-xCaxRuO3), and discuss the application to the understanding and optimization of catalysts used in the synthesis of dense forests of carbon nanotubes.

  3. Weighted piecewise LDA for solving the small sample size problem in face verification.

    Science.gov (United States)

    Kyperountas, Marios; Tefas, Anastasios; Pitas, Ioannis

    2007-03-01

    A novel algorithm that can be used to boost the performance of face-verification methods that utilize Fisher's criterion is presented and evaluated. The algorithm is applied to similarity, or matching error, data and provides a general solution for overcoming the "small sample size" (SSS) problem, where the lack of sufficient training samples causes improper estimation of a linear separation hyperplane between the classes. Two independent phases constitute the proposed method. Initially, a set of weighted piecewise discriminant hyperplanes are used in order to provide a more accurate discriminant decision than the one produced by the traditional linear discriminant analysis (LDA) methodology. The expected classification ability of this method is investigated throughout a series of simulations. The second phase defines proper combinations for person-specific similarity scores and describes an outlier removal process that further enhances the classification ability. The proposed technique has been tested on the M2VTS and XM2VTS frontal face databases. Experimental results indicate that the proposed framework greatly improves the face-verification performance.

  4. Evaluation of pump pulsation in respirable size-selective sampling: part I. Pulsation measurements.

    Science.gov (United States)

    Lee, Eun Gyung; Lee, Larry; Möhlmann, Carsten; Flemmer, Michael M; Kashon, Michael; Harper, Martin

    2014-01-01

    Pulsations generated by personal sampling pumps modulate the airflow through the sampling trains, thereby varying sampling efficiencies and possibly invalidating collection or monitoring. The purpose of this study was to characterize pulsations generated by personal sampling pumps relative to a nominal flow rate at the inlet of different respirable cyclones. Experiments were conducted using a factorial combination of 13 widely used sampling pumps (11 medium and 2 high volumetric flow rate pumps having a diaphragm mechanism) and 7 cyclones [10-mm nylon, also known as Dorr-Oliver (DO), Higgins-Dewell (HD), GS-1, GS-3, Aluminum, GK2.69, and FSP-10]. A hot-wire anemometer probe cemented to the inlet of each cyclone type was used to obtain pulsation readings. The three medium flow rate pump models showing the highest, a midrange, and the lowest pulsations, together with the two high flow rate pump models, were tested for each cyclone type with dust-loaded filters (0.05, 0.21, and 1.25 mg) to determine the effects of filter loading on pulsations. The effects of different tubing materials and lengths on pulsations were also investigated. The fundamental frequency range was 22-110 Hz and the magnitude of pulsation as a proportion of the mean flow rate ranged from 4.4 to 73.1%. Most pump/cyclone combinations generated pulse magnitudes ≥10% (48 out of 59 combinations), while pulse shapes varied considerably. Pulsation magnitudes were not considerably different for the clean and dust-loaded filters for the DO, HD, and Aluminum cyclones, but no consistent pattern was observed for the other cyclone types. Tubing material had less effect on pulsations than tubing length; when the tubing length was 183 cm, pronounced damping was observed for a pump with high pulsation (>60%) for all tested tubing materials except the Tygon Inert tubing. The findings in this study prompted a further study to determine the possibility of shifts in cyclone sampling efficiency due to sampling pump pulsations.

  5. Modeling size effect in the SMA response: a gradient theory

    Science.gov (United States)

    Tabesh, Majid; Boyd, James G.; Lagoudas, Dimitris C.

    2014-03-01

    Shape memory alloys (SMAs) show a size effect in their response. The critical stresses, for instance, for the start of the martensite and austenite transformations are reported to increase in some SMA wires for diameters below 100 μm. Such behavior cannot be simulated using conventional theories, which lack an intrinsic length scale in their constitutive modeling. To capture the size effect, a thermodynamically consistent constitutive model is developed that, in addition to the conventional internal variables of martensitic volume fraction and transformation strain, contains the spatial gradient of the martensitic volume fraction as an internal variable. The developed theory is simplified for 1D cases and analytical solutions for pure bending of SMA beams are presented. The gradient model captures the size effect in the response of the studied SMA structures.

  6. Maximum type I error rate inflation from sample size reassessment when investigators are blind to treatment labels.

    Science.gov (United States)

    Żebrowska, Magdalena; Posch, Martin; Magirr, Dominic

    2016-05-30

    Consider a parallel group trial for the comparison of an experimental treatment to a control, where the second-stage sample size may depend on the blinded primary endpoint data as well as on additional blinded data from a secondary endpoint. For the setting of normally distributed endpoints, we demonstrate that this may lead to an inflation of the type I error rate if the null hypothesis holds for the primary but not the secondary endpoint. We derive upper bounds for the inflation of the type I error rate, both for trials that employ random allocation and for those that use block randomization. We illustrate the worst-case sample size reassessment rule in a case study. For both randomization strategies, the maximum type I error rate increases with the effect size in the secondary endpoint and the correlation between endpoints. The maximum inflation increases with smaller block sizes if information on the block size is used in the reassessment rule. Based on our findings, we do not question the well-established use of blinded sample size reassessment methods with nuisance parameter estimates computed from the blinded interim data of the primary endpoint. However, we demonstrate that the type I error rate control of these methods relies on the application of specific, binding, pre-planned and fully algorithmic sample size reassessment rules and does not extend to general or unplanned sample size adjustments based on blinded data. © 2015 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.

  7. Large sample area and size are needed for forest soil seed bank studies to ensure low discrepancy with standing vegetation.

    Directory of Open Access Journals (Sweden)

    You-xin Shen

    Full Text Available A large number of small-sized samples invariably shows that woody species are absent from forest soil seed banks, leading to a large discrepancy with the seedling bank on the forest floor. We ask: 1 Does this conventional sampling strategy limit the detection of seeds of woody species? 2 Are large sample areas and sample sizes needed for higher recovery of seeds of woody species? We collected 100 samples that were 10 cm (length × 10 cm (width × 10 cm (depth, referred to as larger number of small-sized samples (LNSS in a 1 ha forest plot, and placed them to germinate in a greenhouse, and collected 30 samples that were 1 m × 1 m × 10 cm, referred to as small number of large-sized samples (SNLS and placed them (10 each in a nearby secondary forest, shrub land and grass land. Only 15.7% of woody plant species of the forest stand were detected by the 100 LNSS, contrasting with 22.9%, 37.3% and 20.5% woody plant species being detected by SNLS in the secondary forest, shrub land and grassland, respectively. The increased number of species vs. sampled areas confirmed power-law relationships for forest stand, the LNSS and SNLS at all three recipient sites. Our results, although based on one forest, indicate that conventional LNSS did not yield a high percentage of detection for woody species, but SNLS strategy yielded a higher percentage of detection for woody species in the seed bank if samples were exposed to a better field germination environment. A 4 m2 minimum sample area derived from power equations is larger than the sampled area in most studies in the literature. Increased sample size also is needed to obtain an increased sample area if the number of samples is to remain relatively low.

  8. The use of summary statistics for sample size allocation for food composition surveys and an application to the potato group.

    Science.gov (United States)

    Tsukakoshi, Yoshiki; Yasui, Akemi

    2011-11-01

    To give a quantitative guide to sample size allocation for developing sampling designs for a food composition survey, we discuss sampling strategies that account for the importance of each food (namely, its consumption or production), the variability of its composition, and the restrictions imposed by the resources available for sample collection and analysis. Two strategies, 'proportional' and 'Neyman' allocation, are considered; both incorporate the consumed quantity of foods, and we review some available statistics for allocation issues. The Neyman optimal strategy allocates a smaller sample size to starch than the proportional strategy, because the former also incorporates the variability in composition. Both strategies improved the accuracy of estimated dietary nutrient intake more than equal sample size allocation, as illustrated in the sketch below. These strategies will be useful, as we often face sample size allocation problems in which we must decide whether to sample, say, five white potatoes and five taros or nine white potatoes and one taro. Allocating sufficient sample size to important foodstuffs is essential for assuring data quality. Nevertheless, the food composition table should be as comprehensive as possible.
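
    For a fixed total number of analytical samples n, proportional allocation assigns n_h in proportion to W_h (the consumption weight of food group h), while Neyman allocation assigns n_h in proportion to W_h * S_h, where S_h is the compositional standard deviation within group h. The sketch below illustrates the two rules; the weights and standard deviations are invented for illustration and are not the paper's data.

      import numpy as np

      def allocate(total_n, weights, sds=None):
          """Proportional allocation if sds is None, otherwise Neyman allocation."""
          w = np.asarray(weights, dtype=float)
          score = w if sds is None else w * np.asarray(sds, dtype=float)
          return np.round(total_n * score / score.sum()).astype(int)

      # Hypothetical food groups: consumption weights and within-group SD of a nutrient
      groups  = ["white potato", "taro", "sweet potato"]
      weights = [0.70, 0.10, 0.20]    # relative consumed quantity
      sds     = [2.0, 6.0, 4.0]       # compositional variability (e.g., g/100 g)

      print("proportional:", dict(zip(groups, allocate(10, weights))))
      print("Neyman      :", dict(zip(groups, allocate(10, weights, sds))))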

  9. Cuffed Endotracheal Tube Size and Leakage in Pediatric Tracheal Models

    Directory of Open Access Journals (Sweden)

    Jun Hyun Kim

    2014-04-01

    Full Text Available Object: Cuffed endotracheal tubes are increasingly used in pediatric patients in the hope that they can reduce air leakage and tube size mismatch simply by inflating the cuff. The authors compared the influence of various tube sizes and cuff pressures on air leakage around the cuff in artificial tracheal models. Methods: Six PVC cylinders of different internal diameters (ID: 8.15, 8.50, 9.70, 12.05, 14.50, and 20.00 mm) were prepared. An artificial lung connected to each cylinder was ventilated with an anesthesia machine. Cuffed endotracheal tubes of different sizes (ID 3.0~8.0) were placed in the cylinders and the cuffs were inflated to various pressures (15, 20, 25, 30 and 35 cm H2O). Expiratory tidal volume was measured, and a loss of more than 25% of the baseline expiratory tidal volume was considered significant air leakage. Results: Tubes of ID 5.0 or larger did not show significant air leakage in any tracheal model, provided the inflated cuff size was larger than the cylinder ID, except for the ID 5.5 tube at cuff pressures of 15 and 20 cm H2O in the 12.05 mm cylinder. Tubes of ID 4.5 or smaller, which have shorter and smaller cuffs than tubes of ID 5.0 or larger, leaked significantly in all tracheal models, except for the ID 4.5 tube at a cuff pressure of 35 cm H2O in the 8.50 mm cylinder. Conclusion: In PVC pediatric tracheal models, tubes of ID 4.5 or smaller are inferior to tubes of ID 5.0 or larger in preventing air leakage and may need a higher cuff pressure to reduce air leakage. Further clinical studies could be designed based on our results.

  10. Memory-Optimized Software Synthesis from Dataflow Program Graphs with Large Size Data Samples

    Directory of Open Access Journals (Sweden)

    Hyunok Oh

    2003-05-01

    Full Text Available In multimedia and graphics applications, data samples of nonprimitive type require a significant amount of buffer memory. This paper addresses the problem of minimizing the buffer memory requirement for such applications in embedded software synthesis from graphical dataflow programs based on the synchronous dataflow (SDF) model with a given execution order of nodes. We propose a memory minimization technique that separates global memory buffers from local pointer buffers: the global buffers store live data samples and the local buffers store the pointers to the global buffer entries. The proposed algorithm reduces memory by 67% for a JPEG encoder and by 40% for an H.263 encoder compared with unshared versions, and by 22% compared with the previous sharing algorithm for the H.263 encoder. Through extensive buffer sharing optimization, we believe that automatic software synthesis from dataflow program graphs achieves code quality comparable to manually optimized code in terms of memory requirement.

  11. Assumptions behind size-based ecosystem models are realistic

    DEFF Research Database (Denmark)

    Andersen, Ken Haste; Blanchard, Julia L.; Fulton, Elizabeth A.;

    2016-01-01

    A recent publication about balanced harvesting (Froese et al., ICES Journal of Marine Science; doi:10.1093/icesjms/fsv122) contains several erroneous statements about size-spectrum models. We refute the statements by showing that the assumptions pertaining to size-spectrum models discussed by Froese et al. are realistic and consistent. We further show that the assumption about density-dependence being described by a stock recruitment relationship is responsible for determining whether a peak in the cohort biomass of a population occurs late or early in life. Finally, we argue...

  12. Modelling the impact of intrinsic size and luminosity correlations on magnification estimation

    CERN Document Server

    Ciarlariello, Sandro

    2016-01-01

    Spatial correlations of the observed sizes and luminosities of galaxies can be used to estimate the magnification that arises through weak gravitational lensing. However, the intrinsic properties of galaxies can be similarly correlated through local physical effects, and these present a possible contamination to the weak lensing estimation. In an earlier paper (Ciarlariello et al. 2015) we modelled the intrinsic size correlations using the halo model, assuming the galaxy sizes reflect the mass in the associated halo. Here we extend this work to consider galaxy magnitudes and show that these may be even more affected by intrinsic correlations than galaxy sizes, making this a bigger systematic for measurements of the weak lensing signal. We also quantify how these intrinsic correlations are affected by sample selection criteria based on sizes and magnitudes.

  13. Experimental study on biopsy sampling using new flexible cryoprobes: influence of activation time, probe size, tissue consistency, and contact pressure of the probe on the size of the biopsy specimen.

    Science.gov (United States)

    Franke, Karl-Josef; Szyrach, Mara; Nilius, Georg; Hetzel, Jürgen; Hetzel, Martin; Ruehle, Karl-Heinz; Enderle, Markus D

    2009-08-01

    Cryoextraction is a procedure for recanalization of airways obstructed by exophytically growing tumors. Biopsy samples obtained with this method can be used for histological diagnosis. The objective of this study was to evaluate the parameters influencing the size of cryobiopsies in an in vitro animal model. New flexible cryoprobes with different diameters were used to extract biopsies from lung tissue. These biopsies were compared with forceps biopsies (the gold standard) in terms of biopsy size. Tissue dependency of the biopsy size was analyzed by comparing biopsies taken from the lung, the liver, and gastric mucosa. The effect of contact pressure exerted by the tip of the cryoprobe on the tissue was analyzed separately on liver tissue. Biopsy size was estimated by measuring the weight and the diameter. Weight and diameter of cryobiopsies correlated positively with longer activation times and larger diameters of the cryoprobe. The weight of the biopsies was tissue dependent, as was the biopsy diameter. The biopsy size increased when the probe was pressed onto the tissue during cooling. Cryobiopsies can be taken from different tissue types with flexible cryoprobes. The size of the samples depends on tissue type, probe diameter, application time, and the pressure exerted by the probe on the tissue. Even the cryoprobe with the smallest diameter can provide larger biopsies in lung than a forceps biopsy. It can be expected that the same parameters influence the sample size of biopsies in vivo.

  14. Geospatial modeling of fire-size distributions in historical low-severity fire regimes

    Science.gov (United States)

    McKenzie, D.; Kellogg, L. B.; Larkin, N. K.

    2006-12-01

    Low-severity fires are recorded by fire-scarred trees. These records can provide temporal depth for reconstructing fire history because one tree may record dozens of separate fires over time, thereby providing adequate sample size for estimating fire frequency. Estimates of actual fire perimeters from these point-based records are uncertain, however, because fire boundaries can only be located approximately. We indirectly estimate fire-size distributions without attempting to establish individual fire perimeters. The slope and intercept of the interval-area function, a power-law relationship between sample area and mean fire-free intervals for that area, provide surrogates for the moments of a fire-size distribution, given a distribution of fire-free intervals. Analogously, by deconstructing variograms that use a binary distance measure (Sorensen's index) for the similarity of the time-series of fires recorded by pairs of recorder trees, we provide estimates of modal fire size. We link both variograms and interval-area functions to fire-size distributions by simulating fire-size distributions on neutral landscapes with and without right-censoring to represent topographic controls on maximum fire size. From parameters of the two functions produced by simulations we can back-estimate means and variances of fire sizes on real landscapes. This scale-based modeling provides a robust alternative to empirical and heuristic methods and a means to extrapolate estimates of fire-size distributions to unsampled landscapes.
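
    The pairwise similarity entering those variograms can be computed directly from the fire-year lists of two recorder trees; the standard Sorensen (Dice) form is 2|A∩B| / (|A| + |B|). A minimal sketch with invented fire years follows; the variogram construction itself is not reproduced here.

      def sorensen(record_a, record_b):
          """Sorensen (Dice) similarity between two sets of recorded fire years."""
          a, b = set(record_a), set(record_b)
          if not a and not b:
              return 1.0
          return 2.0 * len(a & b) / (len(a) + len(b))

      # Hypothetical fire-scar records (calendar years) for two recorder trees
      tree1 = [1700, 1714, 1729, 1748, 1765, 1800]
      tree2 = [1714, 1748, 1765, 1783, 1800, 1822]
      print(f"Sorensen similarity = {sorensen(tree1, tree2):.2f}")  # 4 shared years -> 0.67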

  15. Transgender Population Size in the United States: a Meta-Regression of Population-Based Probability Samples

    Science.gov (United States)

    Sevelius, Jae M.

    2017-01-01

    Background. Transgender individuals have a gender identity that differs from the sex they were assigned at birth. The population size of transgender individuals in the United States is not well-known, in part because official records, including the US Census, do not include data on gender identity. Population surveys today more often collect transgender-inclusive gender-identity data, and secular trends in culture and the media have created a somewhat more favorable environment for transgender people. Objectives. To estimate the current population size of transgender individuals in the United States and evaluate any trend over time. Search methods. In June and July 2016, we searched PubMed, Cumulative Index to Nursing and Allied Health Literature, and Web of Science for national surveys, as well as “gray” literature, through an Internet search. We limited the search to 2006 through 2016. Selection criteria. We selected population-based surveys that used probability sampling and included self-reported transgender-identity data. Data collection and analysis. We used random-effects meta-analysis to pool eligible surveys and used meta-regression to address our hypothesis that the transgender population size estimate would increase over time. We used subsample and leave-one-out analysis to assess for bias. Main results. Our meta-regression model, based on 12 surveys covering 2007 to 2015, explained 62.5% of model heterogeneity, with a significant effect for each unit increase in survey year (F = 17.122; df = 1,10; b = 0.026%; P = .002). Extrapolating these results to 2016 suggested a current US population size of 390 adults per 100 000, or almost 1 million adults nationally. This estimate may be more indicative for younger adults, who represented more than 50% of the respondents in our analysis. Authors’ conclusions. Future national surveys are likely to observe higher numbers of transgender people. The large variety in questions used to ask

  16. Can we gain precision by sampling with probabilities proportional to size in surveying recent landscape changes in the Netherlands?

    NARCIS (Netherlands)

    Brus, D.J.; Nieuwenhuizen, W.; Koomen, A.J.M.

    2006-01-01

    Seventy-two squares of 100 ha were selected by stratified random sampling with probabilities proportional to size (pps) to survey landscape changes in the period 1996-2003. The area of the plots times the urbanization pressure was used as a size measure. The central question of this study is whether

  17. Size and power considerations for testing loglinear models using φ-divergence test statistics under product-multinomial sampling

    Institute of Scientific and Technical Information of China (English)

    金应华; 吴耀华

    2009-01-01

    Discrete data are assumed to be distributed according to a product-multinomial distribution whose probabilities follow a loglinear model. Under this model, Ref. [Jin Y H, Wu Y H. Minimum φ-divergence estimator and hierarchical testing in log-linear models under product-multinomial sampling. Journal of Statistical Planning and Inference, 2009, 139: 3488-3500] studied several hypothesis-testing problems, including hierarchical (nested) tests, using φ-divergence test statistics built on the minimum φ-divergence estimator (MφE), which is a generalization of the maximum likelihood estimator. Building on those results, this paper derives an approximation to the power function of one of these tests, as well as the asymptotic distributions of the test statistics under a contiguous sequence of hypotheses. A simulation study investigates which member of the power-divergence family performs best and finds that, in terms of simulated sizes and powers, the Cressie-Read test statistic is an attractive alternative to the Pearson-based and likelihood-ratio-based test statistics.
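
    For reference, the power-divergence family that contains the Pearson, likelihood-ratio and Cressie-Read statistics as special cases can be written in the commonly used form (with observed counts O_i and expected counts E_i under the fitted loglinear model)

      T_\lambda = \frac{2}{\lambda(\lambda+1)} \sum_i O_i \left[ \left( \frac{O_i}{E_i} \right)^{\lambda} - 1 \right],

    where λ = 1 gives Pearson's X², λ → 0 gives the likelihood-ratio statistic G², and λ = 2/3 gives the Cressie-Read statistic; this is textbook notation, not the φ-divergence parametrization used in the cited reference.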

  18. Sample Size Calculation: Inaccurate A Priori Assumptions for Nuisance Parameters Can Greatly Affect the Power of a Randomized Controlled Trial.

    Directory of Open Access Journals (Sweden)

    Elsa Tavernier

    Full Text Available We aimed to examine the extent to which inaccurate assumptions for nuisance parameters used to calculate sample size can affect the power of a randomized controlled trial (RCT. In a simulation study, we separately considered an RCT with continuous, dichotomous or time-to-event outcomes, with associated nuisance parameters of standard deviation, success rate in the control group and survival rate in the control group at some time point, respectively. For each type of outcome, we calculated a required sample size N for a hypothesized treatment effect, an assumed nuisance parameter and a nominal power of 80%. We then assumed a nuisance parameter associated with a relative error at the design stage. For each type of outcome, we randomly drew 10,000 relative errors of the associated nuisance parameter (from empirical distributions derived from a previously published review. Then, retro-fitting the sample size formula, we derived, for the pre-calculated sample size N, the real power of the RCT, taking into account the relative error for the nuisance parameter. In total, 23%, 0% and 18% of RCTs with continuous, binary and time-to-event outcomes, respectively, were underpowered (i.e., the real power was 90%. Even with proper calculation of sample size, a substantial number of trials are underpowered or overpowered because of imprecise knowledge of nuisance parameters. Such findings raise questions about how sample size for RCTs should be determined.

  19. Finite size scaling in the planar Lebwohl-Lasher model

    Science.gov (United States)

    Mondal, Enakshi; Roy, Soumen Kumar

    2003-06-01

    The standard finite-size scaling method for second-order phase transitions has been applied to Monte Carlo data obtained for a planar Lebwohl-Lasher lattice model using the Wolff cluster algorithm. We obtain Tc and the exponents γ, ν, and z, and the results differ from those obtained by other investigators.
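
    For background, the textbook second-order finite-size-scaling ansatz used in such analyses (a standard form, not taken from the paper itself) relates the susceptibility χ and the autocorrelation time τ to the linear system size L near the critical temperature:

      \chi(t, L) = L^{\gamma/\nu}\, \tilde{\chi}\!\left(t\, L^{1/\nu}\right), \qquad t = \frac{T - T_c}{T_c}, \qquad \tau \sim L^{z} \ \text{at} \ T_c,

    so that collapsing χ L^{-γ/ν} against t L^{1/ν} yields estimates of T_c, γ and ν, while the dynamic exponent z follows from the growth of autocorrelation times with L.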

  20. A Dynamic Lot-Sizing Model with Demand Time Windows

    NARCIS (Netherlands)

    C.Y. Lee (Chung-Yee); S. Çetinkaya; A.P.M. Wagelmans (Albert)

    1999-01-01

    textabstractOne of the basic assumptions of the classical dynamic lot-sizing model is that the aggregate demand of a given period must be satisfied in that period. Under this assumption, if backlogging is not allowed then the demand of a given period cannot be delivered earlier or later than the

  1. A dynamic lot-sizing model with demand time windows

    NARCIS (Netherlands)

    C.Y. Lee (Chung-Yee); S. Cetinkaya; A.P.M. Wagelmans (Albert)

    1999-01-01

    textabstractOne of the basic assumptions of the classical dynamic lot-sizing model is that the aggregate demand of a given period must be satisfied in that period. Under this assumption, if backlogging is not allowed then the demand of a given period cannot be delivered earlier or later than the

  2. Resilience and Critical Stock Size in a Stochastic Recruitment Model

    NARCIS (Netherlands)

    Grasman, J.; Huiskes, M.J.

    2001-01-01

    A stochastic model for fish recruitment is fitted to data after performing an age-structured stock assessment. The main aim is to investigate the relation between safe levels of spawning stock size and fish stock resilience. Resilience indicators, such as stock recovery time and the frequency that a

  3. A dynamic lot-sizing model with demand time windows

    NARCIS (Netherlands)

    C.Y. Lee (Chung-Yee); S. Cetinkaya; A.P.M. Wagelmans (Albert)

    1999-01-01

    textabstractOne of the basic assumptions of the classical dynamic lot-sizing model is that the aggregate demand of a given period must be satisfied in that period. Under this assumption, if backlogging is not allowed then the demand of a given period cannot be delivered earlier or later than the pe

  4. A Dynamic Lot-Sizing Model with Demand Time Windows

    NARCIS (Netherlands)

    C.Y. Lee (Chung-Yee); S. Çetinkaya; A.P.M. Wagelmans (Albert)

    1999-01-01

    textabstractOne of the basic assumptions of the classical dynamic lot-sizing model is that the aggregate demand of a given period must be satisfied in that period. Under this assumption, if backlogging is not allowed then the demand of a given period cannot be delivered earlier or later than the per

  5. Parabolic Free Boundary Price Formation Models Under Market Size Fluctuations

    KAUST Repository

    Markowich, Peter A.

    2016-10-04

    In this paper we propose an extension of the Lasry-Lions price formation model which includes fluctuations of the numbers of buyers and vendors. We analyze the model in the case of deterministic and stochastic market size fluctuations and present results on the long time asymptotic behavior and numerical evidence and conjectures on periodic, almost periodic, and stochastic fluctuations. The numerical simulations extend the theoretical statements and give further insights into price formation dynamics.

  6. Importance of sample size for the estimation of repeater F waves in amyotrophic lateral sclerosis.

    Science.gov (United States)

    Fang, Jia; Liu, Ming-Sheng; Guan, Yu-Zhou; Cui, Bo; Cui, Li-Ying

    2015-02-20

    In amyotrophic lateral sclerosis (ALS), repeater F waves are increased. Accurate assessment of repeater F waves requires an adequate sample size. We studied the F waves of left ulnar nerves in ALS patients. Based on the presence or absence of pyramidal signs in the left upper limb, the ALS patients were divided into two groups: one group with pyramidal signs designated as the P group and the other without pyramidal signs designated as the NP group. The Index repeating neurons (RN) and Index repeater F waves (Freps) were compared among the P, NP and control groups following 20 and 100 stimuli, respectively. For each group, the Index RN and Index Freps obtained from 20 and 100 stimuli were compared. In the P group, the Index RN (P = 0.004) and Index Freps (P = 0.001) obtained from 100 stimuli were significantly higher than from 20 stimuli. For F waves obtained from 20 stimuli, no significant differences were identified between the P and NP groups for Index RN (P = 0.052) and Index Freps (P = 0.079); the Index RN (P < 0.001) and Index Freps (P < 0.001) of the P group were significantly higher than in the control group, and the Index RN (P = 0.002) of the NP group was significantly higher than in the control group. For F waves obtained from 100 stimuli, the Index RN (P < 0.001) and Index Freps (P < 0.001) of the P group were significantly higher than in the NP group, and the Index RN (P < 0.001) and Index Freps (P < 0.001) of the P and NP groups were significantly higher than in the control group. Increased repeater F waves reflect increased excitability of the motor neuron pool and indicate upper motor neuron dysfunction in ALS. For an accurate evaluation of repeater F waves in ALS patients, especially those with moderate to severe muscle atrophy, 100 stimuli would be required.

  7. Data with hierarchical structure: impact of intraclass correlation and sample size on type-I error.

    Science.gov (United States)

    Musca, Serban C; Kamiejski, Rodolphe; Nugier, Armelle; Méot, Alain; Er-Rafiy, Abdelatif; Brauer, Markus

    2011-01-01

    Least squares analyses (e.g., ANOVAs, linear regressions) of hierarchical data lead to Type-I error rates that depart severely from the nominal Type-I error rate assumed. Thus, when least squares methods are used to analyze hierarchical data coming from designs in which some groups are assigned to the treatment condition, and others to the control condition (i.e., the widely used "groups nested under treatment" experimental design), the Type-I error rate is seriously inflated, leading too often to the incorrect rejection of the null hypothesis (i.e., the incorrect conclusion of an effect of the treatment). To highlight the severity of the problem, we present simulations showing how the Type-I error rate is affected under different conditions of intraclass correlation and sample size. For all simulations the Type-I error rate after application of the popular Kish (1965) correction is also considered, and the limitations of this correction technique discussed. We conclude with suggestions on how one should collect and analyze data bearing a hierarchical structure.
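
    The inflation described above is easy to reproduce: generate clustered data with a given intraclass correlation under the null hypothesis, analyze it with an ordinary t-test that ignores clustering, and count rejections. The settings below are illustrative and not the simulation design of the paper.

      import numpy as np
      from scipy.stats import ttest_ind

      rng = np.random.default_rng(0)

      def rejection_rate(icc, n_clusters=10, cluster_size=20, n_sims=2000, alpha=0.05):
          """Empirical Type-I error of a t-test that ignores clustering (null hypothesis true)."""
          rejections = 0
          for _ in range(n_sims):
              arms = []
              for _ in range(2):  # treatment and control arms, clusters nested in arms
                  cluster_effects = rng.normal(0.0, np.sqrt(icc), n_clusters)
                  y = cluster_effects[:, None] + rng.normal(0.0, np.sqrt(1 - icc),
                                                            (n_clusters, cluster_size))
                  arms.append(y.ravel())
              if ttest_ind(arms[0], arms[1]).pvalue < alpha:
                  rejections += 1
          return rejections / n_sims

      for icc in (0.0, 0.05, 0.20):
          print(f"ICC = {icc:.2f}: empirical Type-I error = {rejection_rate(icc):.3f}")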

  8. Data with hierarchical structure: impact of intraclass correlation and sample size on Type-I error

    Directory of Open Access Journals (Sweden)

    Serban C Musca

    2011-04-01

    Full Text Available Least squares analyses (e.g., ANOVAs, linear regressions) of hierarchical data lead to Type-I error rates that depart severely from the nominal Type-I error rate assumed. Thus, when least squares methods are used to analyze hierarchical data coming from designs in which some groups are assigned to the treatment condition, and others to the control condition (i.e., the widely used "groups nested under treatment" experimental design), the Type-I error rate is seriously inflated, leading too often to the incorrect rejection of the null hypothesis (i.e., the incorrect conclusion of an effect of the treatment). To highlight the severity of the problem, we present simulations showing how the Type-I error rate is affected under different conditions of intraclass correlation and sample size. For all simulations the Type-I error rate after application of the popular Kish (1965) correction is also considered, and the limitations of this correction technique discussed. We conclude with suggestions on how one should collect and analyze data bearing a hierarchical structure.

  9. Sampling hazelnuts for aflatoxin: Effects of sample size and accetp/reject limit on reducing risk of misclassifying lots

    Science.gov (United States)

    About 100 countries have established regulatory limits for aflatoxin in food and feeds. Because these limits vary widely among regulating countries, the Codex Committee on Food Additives and Contaminants (CCFAC) began work in 2004 to harmonize aflatoxin limits and sampling plans for aflatoxin in alm...

  10. Media Exposure: How Models Simplify Sampling

    DEFF Research Database (Denmark)

    Mortensen, Peter Stendahl

    1998-01-01

    In media planning, the distribution of exposures to more ad spots in more media (print, TV, radio) is crucial to the evaluation of the campaign. If such information should be sampled, it would only be possible in expensive panel-studies (eg TV-meter panels). Alternatively, the distribution of exp...

  11. Media Exposure: How Models Simplify Sampling

    DEFF Research Database (Denmark)

    Mortensen, Peter Stendahl

    1998-01-01

    In media planning, the distribution of exposures to more ad spots in more media (print, TV, radio) is crucial to the evaluation of the campaign. If such information should be sampled, it would only be possible in expensive panel-studies (eg TV-meter panels). Alternatively, the distribution of exp...

  12. Comparing interval estimates for small sample ordinal CFA models.

    Science.gov (United States)

    Natesan, Prathiba

    2015-01-01

    Robust maximum likelihood (RML) and asymptotically generalized least squares (AGLS) methods have been recommended for fitting ordinal structural equation models. Studies show that some of these methods underestimate standard errors. However, these studies have not investigated the coverage and bias of interval estimates. An estimate with a reasonable standard error could still be severely biased. This can only be known by systematically investigating the interval estimates. The present study compares Bayesian, RML, and AGLS interval estimates of factor correlations in ordinal confirmatory factor analysis (CFA) models for small sample data. Six sample sizes, 3 factor correlations, and 2 factor score distributions (multivariate normal and multivariate mildly skewed) were studied. Two Bayesian prior specifications, informative and relatively less informative, were studied. Undercoverage of confidence intervals and underestimation of standard errors was common in non-Bayesian methods. Underestimated standard errors may lead to inflated Type-I error rates. Non-Bayesian intervals were more positively than negatively biased; that is, most intervals that did not contain the true value were greater than the true value. Some non-Bayesian methods had non-converging and inadmissible solutions for small samples and non-normal data. Bayesian empirical standard error estimates for informative and relatively less informative priors were closer to the average standard errors of the estimates. The coverage of Bayesian credibility intervals was closer to what was expected, with overcoverage in a few cases. Although some Bayesian credibility intervals were wider, they reflected the nature of statistical uncertainty that comes with the data (e.g., small sample). Bayesian point estimates were also more accurate than non-Bayesian estimates. The results illustrate the importance of analyzing coverage and bias of interval estimates, and how ignoring interval estimates can be misleading.

  13. Evaluation of char combustion models: measurement and analysis of variability in char particle size and density

    Energy Technology Data Exchange (ETDEWEB)

    Maloney, Daniel J; Monazam, Esmail R; Casleton, Kent H; Shaddix, Christopher R

    2008-08-01

    Char samples representing a range of combustion conditions and extents of burnout were obtained from a well-characterized laminar flow combustion experiment. Individual particles from the parent coal and char samples were characterized to determine distributions in particle volume, mass, and density at different extents of burnout. The data were then compared with predictions from a comprehensive char combustion model referred to as the char burnout kinetics model (CBK). The data clearly reflect the particle-to-particle heterogeneity of the parent coal and show a significant broadening in the size and density distributions of the chars resulting from both devolatilization and combustion. Data for chars prepared in a lower oxygen content environment (6% oxygen by vol.) are consistent with zone II type combustion behavior, where most of the combustion occurs near the particle surface. At higher oxygen contents (12% by vol.), the data show indications of more burning occurring in the particle interior. The CBK model does a good job of predicting the general nature of the development of size and density distributions during burning, but the input distribution of particle size and density is critical to obtaining good predictions. A significant reduction in particle size was observed to occur as a result of devolatilization. For comprehensive combustion models to provide accurate predictions, this size reduction phenomenon needs to be included in devolatilization models so that representative char distributions are carried through the calculations.

  14. Cascades in the Threshold Model for varying system sizes

    Science.gov (United States)

    Karampourniotis, Panagiotis; Sreenivasan, Sameet; Szymanski, Boleslaw; Korniss, Gyorgy

    2015-03-01

    A classical model in opinion dynamics is the Threshold Model (TM), which aims to model the spread of a new opinion based on the social drive of peer pressure. Under the TM, a node adopts a new opinion only when the fraction of its first neighbors possessing that opinion exceeds a pre-assigned threshold. Cascades in the TM depend on multiple parameters, such as the number and selection strategy of the initially active nodes (initiators), and the threshold distribution of the nodes. For a uniform threshold in the network there is a critical fraction of initiators for which a transition from small to large cascades occurs, which for ER graphs is largely independent of the system size. Here, we study the spread contribution of each newly assigned initiator under the TM for different initiator selection strategies for synthetic graphs of various sizes. We observe that for ER graphs, when large cascades occur, the spread contribution of the added initiator at the transition point is independent of the system size, while the contribution of the rest of the initiators converges to zero at infinite system size. This property is used for the identification of large transitions for various threshold distributions. Supported in part by ARL NS-CTA, ARO, ONR, and DARPA.
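
    A minimal version of a Threshold Model cascade on an Erdos-Renyi graph is sketched below, with a uniform threshold and randomly chosen initiators; the parameters are illustrative and are not those of the study.

      import random
      import networkx as nx

      def threshold_cascade(g, initiators, threshold):
          """A node adopts once the adopting fraction of its neighbors exceeds the threshold."""
          active = set(initiators)
          changed = True
          while changed:
              changed = False
              for node in g.nodes():
                  if node in active:
                      continue
                  neigh = list(g.neighbors(node))
                  if neigh and sum(n in active for n in neigh) / len(neigh) > threshold:
                      active.add(node)
                      changed = True
          return len(active) / g.number_of_nodes()

      random.seed(1)
      g = nx.gnp_random_graph(2000, 10 / 2000, seed=1)  # ER graph with mean degree ~10
      for frac in (0.05, 0.15, 0.25):
          initiators = random.sample(list(g.nodes()), int(frac * g.number_of_nodes()))
          size = threshold_cascade(g, initiators, threshold=0.35)
          print(f"initiator fraction {frac:.2f} -> final cascade size {size:.2f}")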

  15. Modeling of Microporosity Size Distribution in Aluminum Alloy A356

    Science.gov (United States)

    Yao, Lu; Cockcroft, Steve; Zhu, Jindong; Reilly, Carl

    2011-12-01

    Porosity is one of the most common defects to degrade the mechanical properties of aluminum alloys. Prediction of pore size, therefore, is critical to optimize the quality of castings. Moreover, to the design engineer, knowledge of the inherent pore population in a casting is essential to avoid potential fatigue failure of the component. In this work, the size distribution of the porosity was modeled based on the assumptions that the hydrogen pores are nucleated heterogeneously and that the nucleation site distribution is a Gaussian function of hydrogen supersaturation in the melt. The pore growth is simulated as a hydrogen-diffusion-controlled process, which is driven by the hydrogen concentration gradient at the pore liquid interface. Directionally solidified A356 (Al-7Si-0.3Mg) alloy castings were used to evaluate the predictive capability of the proposed model. The cast pore volume fraction and size distributions were measured using X-ray microtomography (XMT). Comparison of the experimental and simulation results showed that good agreement could be obtained in terms of both porosity fraction and size distribution. The model can effectively evaluate the effect of hydrogen content, heterogeneous pore nucleation population, cooling conditions, and degassing time on microporosity formation.

  16. SINGLE SAMPLING PLANS FOR VARIABLES INDEXED BY AQL AND AOQL UNDER EWMA MODEL

    Directory of Open Access Journals (Sweden)

    J. R. Singh

    2012-12-01

    Full Text Available The objective of this paper is to provide expressions for evaluating the sample size (n), acceptance parameter (k), Average Outgoing Quality (AOQ) and Operating Characteristic (OC) function under the Exponentially Weighted Moving Average (EWMA) model. The paper provides an investigation into the robustness of the single sampling procedure indexed by Acceptance Quality Level (AQL) and Average Outgoing Quality Level (AOQL).

  17. Sample Size Estimation for Negative Binomial Regression Comparing Rates of Recurrent Events with Unequal Follow-Up Time.

    Science.gov (United States)

    Tang, Yongqiang

    2015-01-01

    A sample size formula is derived for negative binomial regression for the analysis of recurrent events, in which subjects can have unequal follow-up time. We obtain sharp lower and upper bounds on the required size, which are easy to compute. The upper bound is generally only slightly larger than the required size and hence can be used to approximate the sample size. The lower and upper size bounds can be decomposed into two terms. The first term relies on the mean number of events in each group, and the second term depends on two factors that measure, respectively, the extent of between-subject variability in event rates and follow-up time. Simulation studies are conducted to assess the performance of the proposed method. An application of our formulae to a multiple sclerosis trial is provided.
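
    As a point of comparison (not the exact bounds derived in the paper), a commonly used large-sample approximation for the per-group size when comparing two negative binomial event rates μ1 and μ2 with common dispersion k and mean follow-up t treats the variance of the log rate ratio as [(1/(μ1 t) + k) + (1/(μ2 t) + k)] / n:

      from math import ceil, log
      from scipy.stats import norm

      def nb_sample_size(mu1, mu2, k, t, alpha=0.05, power=0.80):
          """Approximate per-group n for a two-sided test of the NB rate ratio.
          mu1, mu2: event rates; k: dispersion (Var = mu*t + k*(mu*t)^2); t: mean follow-up."""
          z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
          var_log_rr = (1.0 / (mu1 * t) + k) + (1.0 / (mu2 * t) + k)
          return ceil(z ** 2 * var_log_rr / log(mu1 / mu2) ** 2)

      # Illustrative values, e.g. relapse rates of 0.8 vs 0.6 per year over 2 years of follow-up
      print(nb_sample_size(mu1=0.8, mu2=0.6, k=0.5, t=2.0))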

  18. Investigation of FE model size definition for surface coating application

    Science.gov (United States)

    Chen, Yanhong; Zhuang, Weimin; Wang, Shiwen; Lin, Jianguo; Balint, Daniel; Shan, Debin

    2012-09-01

    Efficient prediction of the mechanical performance of coating structures has been a constant concern since the dawn of surface engineering. However, the predictive models presented in early research are normally based on traditional solid mechanics and thus cannot predict coating performance accurately. Also, the high computational costs that originate from the particular structure of surface coating systems (a large difference in the length scales of coating and substrate) are not well addressed by these models. To meet the needs for accurate prediction and low computational cost, a multi-axial continuum damage mechanics (CDM)-based constitutive model is introduced for the investigation of the load bearing capacity and fracture properties of coatings. Material parameters within the proposed constitutive model are determined for a typical coating (TiN) and substrate (Cu) system. An efficient numerical subroutine is developed to implement the determined constitutive model in the commercial FE solver ABAQUS through the user-defined subroutine VUMAT. By changing the geometrical sizes of the FE models, a series of computations are carried out to investigate (1) loading features, (2) stress distributions, and (3) failure features of the coating system. The results show that there is a critical displacement corresponding to each FE model size, and only if the applied normal loading displacement is smaller than the critical displacement can a reasonable prediction be achieved. Finally, a 3D map of the critical displacement is generated to provide guidance for users to determine an FE model with a suitable geometrical size for surface coating simulations. This paper presents an effective modelling approach for the prediction of the mechanical performance of surface coatings.

  19. Size reduction techniques for vital compliant VHDL simulation models

    Science.gov (United States)

    Rich, Marvin J.; Misra, Ashutosh

    2006-08-01

    A method and system select delay values from a VHDL standard delay file that correspond to an instance of a logic gate in a logic model. The system then collects all the delay values of the selected instance and builds super generics for the rise time and the fall time of that instance. The system repeats this process for every delay value in the standard delay file (310) that corresponds to every instance of every logic gate in the logic model. The system then outputs a reduced-size standard delay file (314) containing the super generics for every instance of every logic gate in the logic model.

  20. Obtained effect size as a function of sample size in approved antidepressants: a real-world illustration in support of better trial design.

    Science.gov (United States)

    Gibertini, Michael; Nations, Kari R; Whitaker, John A

    2012-03-01

    The high failure rate of antidepressant trials has spurred exploration of the factors that affect trial sensitivity. In the current analysis, the Food and Drug Administration antidepressant drug registration trial data compiled by Turner et al. are extended to include the most recently approved antidepressants. The expanded dataset is examined to further establish the likely population effect size (ES) for monoaminergic antidepressants and to demonstrate the relationship between observed ES and sample size in trials of compounds with proven efficacy. Results indicate that the overall underlying ES for antidepressants is approximately 0.30, and that the variability in observed ES across trials is related to the sample size of the trial. The current data provide a unique real-world illustration of an often underappreciated statistical truism: that small-N trials are more likely to mislead than to inform, and that by aligning sample size to the population ES, the risks of both erroneously high and low effects are minimized. The results in the current study make this abstract concept concrete and will help drug developers arrive at informed gate decisions with greater confidence and fewer risks, improving the odds of success for future antidepressant trials.
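
    The practical implication can be made concrete with standard two-sample power arithmetic (not the authors' dataset): with a population effect size of d = 0.30, the per-arm sample size needed for 80% power at two-sided α = 0.05 is roughly 2(z_{0.975} + z_{0.80})² / d², about 175, and smaller trials leave the observed effect size with a wide sampling distribution.

      from math import ceil, sqrt
      from scipy.stats import norm

      d = 0.30                          # assumed population effect size
      z = norm.ppf(0.975) + norm.ppf(0.80)
      n_per_arm = ceil(2 * (z / d) ** 2)
      print(f"n per arm for 80% power: {n_per_arm}")  # about 175

      # Approximate standard error of the observed Cohen's d for a given per-arm n
      for n in (30, 75, n_per_arm):
          se_d = sqrt(2 / n + d ** 2 / (4 * n))
          print(f"n = {n:4d}: SE(observed d) = {se_d:.2f}")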

  1. Sample size estimation to substantiate freedom from disease for clustered binary data with a specific risk profile

    DEFF Research Database (Denmark)

    Kostoulas, P.; Nielsen, Søren Saxmose; Browne, W. J.;

    2013-01-01

    SUMMARY Disease cases are often clustered within herds or generally groups that share common characteristics. Sample size formulae must adjust for the within-cluster correlation of the primary sampling units. Traditionally, the intra-cluster correlation coefficient (ICC), which is an average... and power when applied to these groups. We propose the use of the variance partition coefficient (VPC), which measures the clustering of infection/disease for individuals with a common risk profile. Sample size estimates are obtained separately for those groups that exhibit markedly different heterogeneity...

  2. Coagulation-Fragmentation Model for Animal Group-Size Statistics

    Science.gov (United States)

    Degond, Pierre; Liu, Jian-Guo; Pego, Robert L.

    2017-04-01

    We study coagulation-fragmentation equations inspired by a simple model proposed in fisheries science to explain data for the size distribution of schools of pelagic fish. Although the equations lack detailed balance and admit no H-theorem, we are able to develop a rather complete description of equilibrium profiles and large-time behavior, based on recent developments in complex function theory for Bernstein and Pick functions. In the large-population continuum limit, a scaling-invariant regime is reached in which all equilibria are determined by a single scaling profile. This universal profile exhibits power-law behavior crossing over from exponent -2/3 for small size to -3/2 for large size, with an exponential cutoff.
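
    Schematically, the universal profile described above behaves as a power law with a crossover and an exponential cutoff; written in a generic, illustrative form with crossover scale x_* and cutoff scale x_c (the precise profile is given in the paper),

      f(x) \sim x^{-2/3} \ \ (x \ll x_*), \qquad f(x) \sim x^{-3/2}\, e^{-x/x_c} \ \ (x \gg x_*).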

  3. Coagulation-Fragmentation Model for Animal Group-Size Statistics

    Science.gov (United States)

    Degond, Pierre; Liu, Jian-Guo; Pego, Robert L.

    2016-10-01

    We study coagulation-fragmentation equations inspired by a simple model proposed in fisheries science to explain data for the size distribution of schools of pelagic fish. Although the equations lack detailed balance and admit no H-theorem, we are able to develop a rather complete description of equilibrium profiles and large-time behavior, based on recent developments in complex function theory for Bernstein and Pick functions. In the large-population continuum limit, a scaling-invariant regime is reached in which all equilibria are determined by a single scaling profile. This universal profile exhibits power-law behavior crossing over from exponent -2/3 for small size to -3/2 for large size, with an exponential cutoff.

  4. Semantic Importance Sampling for Statistical Model Checking

    Science.gov (United States)

    2015-01-16

    approach called Statistical Model Checking (SMC) [16], which relies on Monte Carlo-based simulations to solve this verification task more scalably... Conclusion: Statistical model checking (SMC) is a prominent approach for rigorous analysis of stochastic systems using Monte Carlo simulations. In this... Monte Carlo simulations, for computing the bounded probability that a specific event occurs during a stochastic system's execution. Estimating the

  5. Importance of Sample Size for the Estimation of Repeater F Waves in Amyotrophic Lateral Sclerosis

    Directory of Open Access Journals (Sweden)

    Jia Fang

    2015-01-01

    Full Text Available Background: In amyotrophic lateral sclerosis (ALS), repeater F waves are increased. Accurate assessment of repeater F waves requires an adequate sample size. Methods: We studied the F waves of left ulnar nerves in ALS patients. Based on the presence or absence of pyramidal signs in the left upper limb, the ALS patients were divided into two groups: One group with pyramidal signs designated as P group and the other without pyramidal signs designated as NP group. The Index repeating neurons (RN) and Index repeater F waves (Freps) were compared among the P, NP and control groups following 20 and 100 stimuli respectively. For each group, the Index RN and Index Freps obtained from 20 and 100 stimuli were compared. Results: In the P group, the Index RN (P = 0.004) and Index Freps (P = 0.001) obtained from 100 stimuli were significantly higher than from 20 stimuli. For F waves obtained from 20 stimuli, no significant differences were identified between the P and NP groups for Index RN (P = 0.052) and Index Freps (P = 0.079); The Index RN (P < 0.001) and Index Freps (P < 0.001) of the P group were significantly higher than the control group; The Index RN (P = 0.002) of the NP group was significantly higher than the control group. For F waves obtained from 100 stimuli, the Index RN (P < 0.001) and Index Freps (P < 0.001) of the P group were significantly higher than the NP group; The Index RN (P < 0.001) and Index Freps (P < 0.001) of the P and NP groups were significantly higher than the control group. Conclusions: Increased repeater F waves reflect increased excitability of motor neuron pool and indicate upper motor neuron dysfunction in ALS. For an accurate evaluation of repeater F waves in ALS patients especially those with moderate to severe muscle atrophy, 100 stimuli would be required.

  6. Bilayer Thickness Mismatch Controls Domain Size in Model Membranes

    Energy Technology Data Exchange (ETDEWEB)

    Heberle, Frederick A [ORNL; Petruzielo, Robin S [ORNL; Pan, Jianjun [ORNL; Drazba, Paul [ORNL; Kucerka, Norbert [Canadian Neutron Beam Centre and Comelius University (Slovakia); Feigenson, Gerald [Cornell University; Katsaras, John [ORNL

    2013-01-01

    The observation of lateral phase separation in lipid bilayers has received considerable attention, especially in connection to lipid raft phenomena in cells. It is widely accepted that rafts play a central role in cellular processes, notably signal transduction. While micrometer-sized domains are observed with some model membrane mixtures, rafts much smaller than 100 nm, beyond the reach of optical microscopy, are now thought to exist, both in vitro and in vivo. We have used small-angle neutron scattering, a probe-free technique, to measure the size of nanoscopic membrane domains in unilamellar vesicles with unprecedented accuracy. These experiments were performed using a four-component model system containing fixed proportions of cholesterol and the saturated phospholipid 1,2-distearoyl-sn-glycero-3-phosphocholine (DSPC), mixed with varying amounts of the unsaturated phospholipids 1-palmitoyl-2-oleoyl-sn-glycero-3-phosphocholine (POPC) and 1,2-dioleoyl-sn-glycero-3-phosphocholine (DOPC). We find that liquid domain size increases with the extent of acyl chain unsaturation (DOPC:POPC ratio). Furthermore, we find a direct correlation between domain size and the mismatch in bilayer thickness of the coexisting liquid-ordered and liquid-disordered phases, suggesting a dominant role for line tension in controlling domain size. While this result is expected from line tension theories, we provide the first experimental verification in free-floating bilayers. Importantly, we also find that changes in bilayer thickness, which accompany changes in the degree of lipid chain unsaturation, are entirely confined to the disordered phase. Together, these results suggest how the size of functional domains in homeothermic cells may be regulated through changes in lipid composition.

  7. EMPIRICAL MODEL FOR HYDROCYCLONES CORRECTED CUT SIZE CALCULATION

    Directory of Open Access Journals (Sweden)

    André Carlos Silva

    2012-12-01

    Full Text Available Hydrocyclones are devices used worldwide in mineral processing for desliming, classification, selective classification, thickening and pre-concentration. A hydrocyclone is composed of one cylindrical and one conical section joined together, without any moving parts, and it is capable of performing granular material separation in pulp. The mineral particle separation mechanism acting in a hydrocyclone is complex and its mathematical modelling is usually empirical. The most widely used model for the hydrocyclone corrected cut size is that proposed by Plitt. Over the years many revisions and corrections to Plitt's model have been proposed. The present paper presents a modification of the Plitt model constant, obtained by exponential regression of simulated data for three different hydrocyclone geometries: Rietema, Bradley and Krebs. To validate the proposed model, literature data for phosphate ore obtained using fifteen different hydrocyclone geometries are used. The proposed model shows a correlation equal to 88.2% between experimental and calculated corrected cut size, while the correlation obtained using Plitt's model is 11.5%.

  8. On Angular Sampling Methods for 3-D Spatial Channel Models

    DEFF Research Database (Denmark)

    Fan, Wei; Jämsä, Tommi; Nielsen, Jesper Ødum

    2015-01-01

    This paper discusses generating three dimensional (3D) spatial channel models with emphasis on the angular sampling methods. Three angular sampling methods, i.e. modified uniform power sampling, modified uniform angular sampling, and random pairing methods are proposed and investigated in detail....

  9. Evaluation of 1H NMR relaxometry for the assessment of pore size distribution in soil samples

    NARCIS (Netherlands)

    Jaeger, F.; Bowe, S.; As, van H.; Schaumann, G.E.

    2009-01-01

    1H NMR relaxometry is used in earth science as a non-destructive and time-saving method to determine pore size distributions (PSD) in porous media with pore sizes ranging from nm to mm. This is a broader range than generally reported for results from X-ray computed tomography (X-ray CT) scanning, wh

  10. Grain size of loess and paleosol samples: what are we measuring?

    Science.gov (United States)

    Varga, György; Kovács, János; Szalai, Zoltán; Újvári, Gábor

    2017-04-01

    Particle size falling into a particularly narrow range is among the most important properties of windblown mineral dust deposits. Therefore, various aspects of aeolian sedimentation and post-depositional alterations can be reconstructed only from precise grain size data. The present study aims at (1) reviewing grain size data obtained from different measurements, (2) discussing the major reasons for disagreements between data obtained by frequently applied particle sizing techniques, and (3) assessing the importance of particle shape in particle sizing. Grain size data of terrestrial aeolian dust deposits (loess and paleosol) were determined by laser scattering instruments (Fritsch Analysette 22 Microtec Plus, Horiba Partica La-950 v2 and Malvern Mastersizer 3000 with a Hydro Lv unit), while particle size and shape distributions were acquired by a Malvern Morphologi G3-ID. Laser scattering results reveal that the optical parameter settings of the measurements have significant effects on the grain size distributions, especially for the fine-grained fractions. Funding from the Innovation Office (Hungary) under contract NKFI 120620 is gratefully acknowledged. The work was additionally supported (for G. Varga) by the Bolyai János Research Scholarship of the Hungarian Academy of Sciences.

  11. Raindrop size distribution: Fitting performance of common theoretical models

    Science.gov (United States)

    Adirosi, E.; Volpi, E.; Lombardo, F.; Baldini, L.

    2016-10-01

    Modelling the raindrop size distribution (DSD) is a fundamental issue in connecting remote sensing observations with reliable precipitation products for hydrological applications. To date, various standard probability distributions have been proposed to build DSD models. Relevant questions to ask are how often and how well such models fit empirical data, given that advances in both data availability and the technology used to estimate DSDs have allowed many of the deficiencies of early analyses to be mitigated. Therefore, we present a comprehensive follow-up of a previous study on the comparison of statistical fitting of three common DSD models against 2D-Video Distrometer (2DVD) data, which are unique in that the size of individual drops is determined accurately. Using the maximum likelihood method, we fit models based on the lognormal, gamma and Weibull distributions to more than 42,000 one-minute drop-by-drop records taken from the field campaigns of the NASA Ground Validation program of the Global Precipitation Measurement (GPM) mission. In order to check the adequacy between the models and the measured data, we investigate the goodness of fit of each distribution using the Kolmogorov-Smirnov test. Then, we apply a specific model selection technique to evaluate the relative quality of each model. Results show that the gamma distribution has the lowest KS rejection rate, while the Weibull distribution is the most frequently rejected. Ranking for each minute the statistical models that pass the KS test, it can be argued that probability distributions whose tails are exponentially bounded, i.e. light-tailed distributions, seem adequate to model the natural variability of DSDs. However, in line with our previous study, we also found that frequency distributions of empirical DSDs could be heavy-tailed in a number of cases, which may result in severe uncertainty in estimating statistical moments and bulk variables.
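
    The fitting-and-testing step described above can be sketched as follows: fit a gamma distribution to one minute of drop diameters by maximum likelihood and apply a Kolmogorov-Smirnov test, with the usual caveat that the KS p-value is optimistic when the parameters are estimated from the same data. Synthetic drop diameters stand in for the 2DVD observations.

      import numpy as np
      from scipy import stats

      rng = np.random.default_rng(42)
      # Synthetic one-minute sample of drop diameters (mm); stands in for 2DVD data
      diameters = rng.gamma(shape=3.0, scale=0.4, size=500)

      # Maximum-likelihood gamma fit (location fixed at zero) and KS goodness-of-fit test
      shape, loc, scale = stats.gamma.fit(diameters, floc=0)
      ks_stat, p_value = stats.kstest(diameters, "gamma", args=(shape, loc, scale))
      print(f"fitted shape = {shape:.2f}, scale = {scale:.2f}")
      print(f"KS statistic = {ks_stat:.3f}, p-value = {p_value:.3f}")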

  12. Modeling grain size variations of aeolian gypsum deposits at White Sands, New Mexico, using AVIRIS imagery

    Science.gov (United States)

    Ghrefat, H.A.; Goodell, P.C.; Hubbard, B.E.; Langford, R.P.; Aldouri, R.E.

    2007-01-01

Visible and Near-Infrared (VNIR) through Short Wavelength Infrared (SWIR) (0.4–2.5 μm) AVIRIS data, along with laboratory spectral measurements and analyses of field samples, were used to characterize grain size variations in aeolian gypsum deposits across barchan-transverse, parabolic, and barchan dunes at White Sands, New Mexico, USA. All field samples contained a mineralogy of ~100% gypsum. In order to document grain size variations at White Sands, surficial gypsum samples were collected along three transects parallel to the prevailing downwind direction. Grain size analyses were carried out on the samples by sieving them into seven size fractions ranging from 45 to 621 μm, which were subjected to spectral measurements. Absorption band depths of the size fractions were determined after applying an automated continuum-removal procedure to each spectrum. Then, the relationship between absorption band depth and gypsum size fraction was established using a linear regression. Three software processing steps were carried out to measure the grain size variations of gypsum in the Dune Area using AVIRIS data. AVIRIS mapping results, field work and laboratory analysis all show that the interdune areas have lower absorption band depth values and consist of finer grained gypsum deposits. In contrast, the dune crest areas have higher absorption band depth values and consist of coarser grained gypsum deposits. Based on laboratory estimates, a representative barchan-transverse dune (Transect 1) has a mean grain size of 1.16 φ (449 μm). The error bar results show that the error ranges from −50 to +50 μm. Mean grain size for a representative parabolic dune (Transect 2) is 1.51 φ (352 μm), and 1.52 φ (347 μm) for a representative barchan dune (Transect 3). T-test results confirm that there are differences in the grain size distributions between barchan and parabolic dunes and between interdune and dune crest areas. The t-test results
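
    A minimal sketch of the continuum-removal band depth calculation and the band-depth-to-grain-size regression described above. The spectrum, shoulder wavelengths and calibration numbers are invented for illustration and are not taken from the AVIRIS study.

```python
import numpy as np

# Hypothetical reflectance spectrum around a gypsum absorption feature.
wavelengths = np.linspace(1.70, 1.85, 151)          # micrometres
reflectance = 0.9 - 0.25 * np.exp(-((wavelengths - 1.78) / 0.02) ** 2)

def band_depth(wl, refl):
    """Continuum-removed band depth: 1 - R/continuum at the band centre."""
    # Straight-line continuum between the two shoulders of the feature.
    continuum = np.interp(wl, [wl[0], wl[-1]], [refl[0], refl[-1]])
    removed = refl / continuum
    return 1.0 - removed.min()

# Hypothetical calibration set: band depths measured on sieved size fractions.
sizes_um = np.array([45, 90, 150, 250, 350, 450, 621], dtype=float)
depths = np.array([0.08, 0.11, 0.15, 0.19, 0.22, 0.25, 0.28])

slope, intercept = np.polyfit(depths, sizes_um, 1)   # linear regression
print(f"grain size (um) ~ {slope:.1f} * band_depth + {intercept:.1f}")
print("example pixel:", slope * band_depth(wavelengths, reflectance) + intercept)
```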

  13. A Gamma Model for Mixture STR Samples

    DEFF Research Database (Denmark)

    Christensen, Susanne; Bøttcher, Susanne Gammelgaard; Morling, Niels

This project investigates the behavior of the PCR Amplification Kit. A number of known DNA profiles are mixed two by two in "known" proportions and analyzed. Gamma distribution models are fitted to the resulting data to learn to what extent the actual mixing proportions can be rediscovered in the amplifier output, and thereby the question of confidence in separate DNA profiles suggested by an output is addressed.
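
    A minimal simulation sketch of the idea, assuming (hypothetically) that each contributor's allele peak heights are gamma distributed with a mean proportional to that contributor's share of the template; all numbers are synthetic and the shape/scale values are arbitrary.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
true_proportion = 0.7          # contributor A's share of the template DNA
shape = 10.0                   # common gamma shape (assumed)
scale_total = 200.0            # overall signal level (arbitrary RFU units)

# Simulated allele peak heights for the two contributors.
peaks_a = rng.gamma(shape, true_proportion * scale_total, size=40)
peaks_b = rng.gamma(shape, (1 - true_proportion) * scale_total, size=40)

# Fit a gamma distribution to each contributor's peaks (location fixed at 0).
a_shape, _, a_scale = stats.gamma.fit(peaks_a, floc=0)
b_shape, _, b_scale = stats.gamma.fit(peaks_b, floc=0)

# Estimated mixing proportion from the ratio of fitted means.
mean_a, mean_b = a_shape * a_scale, b_shape * b_scale
print("estimated proportion:", mean_a / (mean_a + mean_b))
```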

  14. State-space size considerations for disease-progression models.

    Science.gov (United States)

    Regnier, Eva D; Shechter, Steven M

    2013-09-30

Markov models of disease progression are widely used to model transitions in patients' health state over time. Usually, patients' health status may be classified according to a set of ordered health states. Modelers lump similar health states together into a finite, and usually small, number of states that form the basis of a Markov chain disease-progression model. This increases the number of observations available to estimate each parameter in the transition probability matrix. However, lumping together observably distinct health states also obscures distinctions among them and may reduce the predictive power of the model. Moreover, as we demonstrate, precision in estimating the model parameters does not necessarily improve as the number of states in the model declines. This paper explores the tradeoff between the lumping error introduced by grouping distinct health states and the sampling error that arises when there are insufficient patient data to precisely estimate the transition probability matrix. Copyright © 2013 John Wiley & Sons, Ltd.
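
    A small illustration of the lumping tradeoff described above, using hypothetical transition counts: aggregating counts over lumped states gives more observations per estimated probability, but the distinction between the lumped states is lost.

```python
import numpy as np

# Hypothetical observed one-step transition counts between four ordered
# health states (rows = from, columns = to).
counts = np.array([
    [80, 15,  4,  1],
    [10, 60, 25,  5],
    [ 2, 12, 55, 31],
    [ 0,  1,  9, 90],
])

def transition_matrix(c):
    """Row-normalised maximum-likelihood estimate of transition probabilities."""
    return c / c.sum(axis=1, keepdims=True)

print("4-state model:\n", transition_matrix(counts).round(3))

# Lump the two middle states into one by summing their counts: fewer
# parameters and more data per cell, at the cost of lost resolution.
lump = np.array([
    [1, 0, 0],
    [0, 1, 0],
    [0, 1, 0],
    [0, 0, 1],
])
lumped_counts = lump.T @ counts @ lump
print("3-state (lumped) model:\n", transition_matrix(lumped_counts).round(3))
```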

  15. Automated Gel Size Selection to Improve the Quality of Next-generation Sequencing Libraries Prepared from Environmental Water Samples.

    Science.gov (United States)

    Uyaguari-Diaz, Miguel I; Slobodan, Jared R; Nesbitt, Matthew J; Croxen, Matthew A; Isaac-Renton, Judith; Prystajecky, Natalie A; Tang, Patrick

    2015-04-17

    Next-generation sequencing of environmental samples can be challenging because of the variable DNA quantity and quality in these samples. High quality DNA libraries are needed for optimal results from next-generation sequencing. Environmental samples such as water may have low quality and quantities of DNA as well as contaminants that co-precipitate with DNA. The mechanical and enzymatic processes involved in extraction and library preparation may further damage the DNA. Gel size selection enables purification and recovery of DNA fragments of a defined size for sequencing applications. Nevertheless, this task is one of the most time-consuming steps in the DNA library preparation workflow. The protocol described here enables complete automation of agarose gel loading, electrophoretic analysis, and recovery of targeted DNA fragments. In this study, we describe a high-throughput approach to prepare high quality DNA libraries from freshwater samples that can be applied also to other environmental samples. We used an indirect approach to concentrate bacterial cells from environmental freshwater samples; DNA was extracted using a commercially available DNA extraction kit, and DNA libraries were prepared using a commercial transposon-based protocol. DNA fragments of 500 to 800 bp were gel size selected using Ranger Technology, an automated electrophoresis workstation. Sequencing of the size-selected DNA libraries demonstrated significant improvements to read length and quality of the sequencing reads.

  16. Squares of different sizes: effect of geographical projection on model parameter estimates in species distribution modeling.

    Science.gov (United States)

    Budic, Lara; Didenko, Gregor; Dormann, Carsten F

    2016-01-01

In species distribution analyses, environmental predictors and distribution data for large spatial extents are often available in long-lat format, such as degree raster grids. Long-lat projections suffer from unequal cell sizes, as a degree of longitude decreases in length from approximately 110 km at the equator to 0 km at the poles. Here we investigate whether long-lat and equal-area projections yield similar model parameter estimates, or result in a consistent bias. We analyzed the environmental effects on the distribution of 12 ungulate species with a northern distribution, as models for these species should display the strongest effect of projectional distortion. Additionally, we chose four species with entirely continental distributions to investigate the effect of incomplete cell coverage at the coast. We expected that including model weights proportional to the actual cell area should compensate for the observed bias in model coefficients, and similarly that using the land coverage of a cell should decrease bias for species with coastal distributions. As anticipated, model coefficients differed between long-lat and equal-area projections. Having progressively smaller, and therefore more numerous, cells at higher latitudes influenced the importance of parameters in the models, increased the sample size for the northernmost parts of species ranges, and reduced the subcell variability of those areas. However, this bias could be largely removed by weighting long-lat cells by the area they cover, and marginally by correcting for land coverage. Overall we found little effect of using long-lat rather than equal-area projections in our analysis. The fitted relationship between environmental parameters and occurrence probability differed only very little between the two projection types. We still recommend using equal-area projections to avoid possible bias. More importantly, our results suggest that the cell area and the proportion of a cell covered by land should be
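
    A hedged sketch of the weighting idea, using synthetic presence/absence data: each long-lat cell is weighted by the area it actually covers, here taken proportional to cos(latitude). The predictor, coefficient values and the scikit-learn call are illustrative assumptions, not the models fitted in the study.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
n = 2000
latitude = rng.uniform(40, 80, n)                  # degrees north
temperature = 25 - 0.5 * latitude + rng.normal(0, 2, n)

# Hypothetical occurrence probability driven by temperature only.
p = 1 / (1 + np.exp(-(-2 + 0.4 * temperature)))
presence = rng.binomial(1, p)

X = temperature.reshape(-1, 1)

# Unweighted fit treats every long-lat cell as equally large.
unweighted = LogisticRegression().fit(X, presence)

# Weight each cell by its true area, proportional to cos(latitude).
weights = np.cos(np.radians(latitude))
weighted = LogisticRegression().fit(X, presence, sample_weight=weights)

print("unweighted slope:   ", unweighted.coef_[0][0])
print("area-weighted slope:", weighted.coef_[0][0])
```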

  17. Importance of Sample Size for the Estimation of Repeater F Waves in Amyotrophic Lateral Sclerosis

    Institute of Scientific and Technical Information of China (English)

    Jia Fang; Ming-Sheng Liu; Yu-Zhou Guan; Bo Cui; Li-Ying Cui

    2015-01-01

Background: In amyotrophic lateral sclerosis (ALS), repeater F waves are increased. Accurate assessment of repeater F waves requires an adequate sample size. Methods: We studied the F waves of left ulnar nerves in ALS patients. Based on the presence or absence of pyramidal signs in the left upper limb, the ALS patients were divided into two groups: one group with pyramidal signs designated as the P group and the other without pyramidal signs designated as the NP group. The Index repeating neurons (RN) and Index repeater F waves (Freps) were compared among the P, NP and control groups following 20 and 100 stimuli, respectively. For each group, the Index RN and Index Freps obtained from 20 and 100 stimuli were compared. Results: In the P group, the Index RN (P = 0.004) and Index Freps (P = 0.001) obtained from 100 stimuli were significantly higher than those from 20 stimuli. For F waves obtained from 20 stimuli, no significant differences were identified between the P and NP groups for Index RN (P = 0.052) and Index Freps (P = 0.079); the Index RN (P < 0.001) and Index Freps (P < 0.001) of the P group were significantly higher than in the control group; the Index RN (P = 0.002) of the NP group was significantly higher than in the control group. For F waves obtained from 100 stimuli, the Index RN (P < 0.001) and Index Freps (P < 0.001) of the P group were significantly higher than in the NP group; the Index RN (P < 0.001) and Index Freps (P < 0.001) of the P and NP groups were significantly higher than in the control group. Conclusions: Increased repeater F waves reflect increased excitability of the motor neuron pool and indicate upper motor neuron dysfunction in ALS. For an accurate evaluation of repeater F waves in ALS patients, especially those with moderate to severe muscle atrophy, 100 stimuli would be required.

  18. Multiscale sampling of plant diversity: Effects of minimum mapping unit size

    Science.gov (United States)

    Stohlgren, T.J.; Chong, G.W.; Kalkhan, M.A.; Schell, L.D.

    1997-01-01

    Only a small portion of any landscape can be sampled for vascular plant diversity because of constraints of cost (salaries, travel time between sites, etc.). Often, the investigator decides to reduce the cost of creating a vegetation map by increasing the minimum mapping unit (MMU), and/or by reducing the number of vegetation classes to be considered. Questions arise about what information is sacrificed when map resolution is decreased. We compared plant diversity patterns from vegetation maps made with 100-ha, 50-ha, 2-ha, and 0.02-ha MMUs in a 754-ha study area in Rocky Mountain National Park, Colorado, United States, using four 0.025-ha and 21 0.1-ha multiscale vegetation plots. We developed and tested species-log(area) curves, correcting the curves for within-vegetation type heterogeneity with Jaccard's coefficients. Total species richness in the study area was estimated from vegetation maps at each resolution (MMU), based on the corrected species-area curves, total area of the vegetation type, and species overlap among vegetation types. With the 0.02-ha MMU, six vegetation types were recovered, resulting in an estimated 552 species (95% CI = 520-583 species) in the 754-ha study area (330 plant species were observed in the 25 plots). With the 2-ha MMU, five vegetation types were recognized, resulting in an estimated 473 species for the study area. With the 50-ha MMU, 439 plant species were estimated for the four vegetation types recognized in the study area. With the 100-ha MMU, only three vegetation types were recognized, resulting in an estimated 341 plant species for the study area. Locally rare species and keystone ecosystems (areas of high or unique plant diversity) were missed at the 2-ha, 50-ha, and 100-ha scales. To evaluate the effects of minimum mapping unit size requires: (1) an initial stratification of homogeneous, heterogeneous, and rare habitat types; and (2) an evaluation of within-type and between-type heterogeneity generated by environmental
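
    A minimal sketch of a species-log(area) fit and extrapolation of the kind described above, with hypothetical plot data; it omits the Jaccard-based heterogeneity correction and the species-overlap accounting used in the study.

```python
import numpy as np

# Hypothetical plot data for one vegetation type: plot area (ha) and the
# number of vascular plant species recorded in each plot.
areas = np.array([0.025, 0.025, 0.1, 0.1, 0.1, 0.1])
richness = np.array([28, 31, 45, 48, 43, 50])

# Fit the species-log(area) curve S = a + b * log10(A).
b, a = np.polyfit(np.log10(areas), richness, 1)

# Extrapolate to the total mapped area of the vegetation type (e.g. 120 ha).
# Extrapolating this far beyond the plot sizes is only defensible with the
# corrections described in the abstract; this is purely illustrative.
total_area_ha = 120.0
estimated_richness = a + b * np.log10(total_area_ha)
print(f"S = {a:.1f} + {b:.1f} * log10(A)  ->  ~{estimated_richness:.0f} species")
```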

  19. Sampling of illicit drugs for quantitative analysis--part II. Study of particle size and its influence on mass reduction.

    Science.gov (United States)

    Bovens, M; Csesztregi, T; Franc, A; Nagy, J; Dujourdy, L

    2014-01-01

The basic goal in sampling for the quantitative analysis of illicit drugs is to maintain the average concentration of the drug in the material from its original seized state (the primary sample) all the way through to the analytical sample, where the effect of particle size is most critical. The size of the largest particles of different authentic illicit drug materials, in their original state and after homogenisation, using manual or mechanical procedures, was measured using a microscope with a camera attachment. The comminution methods employed included pestle and mortar (manual) and various ball and knife mills (mechanical). The drugs investigated were amphetamine, heroin, cocaine and herbal cannabis. It was shown that comminution of illicit drug materials using these techniques reduces the nominal particle size from approximately 600 μm down to between 200 and 300 μm. It was demonstrated that the choice of 1 g increments for the primary samples of powdered drugs and cannabis resin, which were used in the heterogeneity part of our study (Part I), was correct for the routine quantitative analysis of illicit seized drugs. For herbal cannabis we found that the appropriate increment size was larger. Based on the results of this study we can generally state that: an analytical sample weight of between 20 and 35 mg of an illicit powdered drug, with an assumed purity of 5% or higher, would be considered appropriate and would generate a sampling RSD (RSD_sampling) in the same region as the analytical RSD (RSD_analysis) for a typical quantitative method of analysis for the most common, powdered, illicit drugs. For herbal cannabis, with an assumed purity of 1% THC (tetrahydrocannabinol) or higher, an analytical sample weight of approximately 200 mg would be appropriate. In Part III we will pull together our homogeneity studies and particle size investigations and use them to devise sampling plans and sample preparations suitable for the quantitative instrumental analysis of the most common illicit
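
    A simplified way to see why a 20-35 mg analytical sample can suffice for a 5% powdered drug once particles are milled to 200-300 μm: if the drug is carried by roughly uniform spherical particles, the number of drug particles in the sample is approximately Poisson, so the sampling RSD is about 1/sqrt(expected drug-particle count). This is a back-of-the-envelope model, not the sampling-theory treatment used in the study; the density and diameter values are assumptions.

```python
import numpy as np

def sampling_rsd(sample_mass_mg, particle_diameter_um, drug_mass_fraction,
                 particle_density_g_cm3=1.3):
    """Approximate sampling RSD assuming the drug is carried by uniform
    spherical particles whose count in the sample is Poisson distributed."""
    radius_cm = particle_diameter_um * 1e-4 / 2
    particle_mass_mg = particle_density_g_cm3 * (4 / 3) * np.pi * radius_cm**3 * 1e3
    drug_mass_mg = sample_mass_mg * drug_mass_fraction
    n_drug_particles = drug_mass_mg / particle_mass_mg
    return 1 / np.sqrt(n_drug_particles)   # relative standard deviation

for mass in (20, 35, 200):
    rsd = sampling_rsd(mass, particle_diameter_um=250, drug_mass_fraction=0.05)
    print(f"{mass:4d} mg sample: RSD_sampling ~ {100 * rsd:.1f} %")
```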

  20. Core size effect on the dry and saturated ultrasonic pulse velocity of limestone samples.

    Science.gov (United States)

    Ercikdi, Bayram; Karaman, Kadir; Cihangir, Ferdi; Yılmaz, Tekin; Aliyazıcıoğlu, Şener; Kesimal, Ayhan

    2016-12-01

This study presents the effect of core length on the saturated (UPVsat) and dry (UPVdry) P-wave velocities of four different biomicritic limestone samples, namely light grey (BL-LG), dark grey (BL-DG), reddish (BL-R) and yellow (BL-Y), using core samples of different lengths (25–125 mm) at a constant diameter (54.7 mm). The saturated P-wave velocity (UPVsat) of all core samples generally decreased with increasing sample length. However, the dry P-wave velocity (UPVdry) of samples obtained from the BL-LG and BL-Y limestones increased with increasing sample length. In contrast to the literature, the dry P-wave velocity (UPVdry) values of core samples with lengths of 75, 100 and 125 mm were consistently higher (2.8–46.2%) than the saturated (UPVsat) values. Chemical and mineralogical analyses have shown that the P-wave velocity is very sensitive to the calcite and clay minerals potentially leading to the weakening/disintegration of rock samples in the presence of water. Severe fluctuations in UPV values were observed between sample lengths of 25 and 75 mm; thereafter, a trend of stabilization was observed. The maximum variation of UPV values between the sample lengths of 75 mm and 125 mm was only 7.3%. Therefore, the threshold core sample length was interpreted as 75 mm for UPV measurement in the biomicritic limestone samples used in this study.

  1. Large scale inference in the Infinite Relational Model: Gibbs sampling is not enough

    DEFF Research Database (Denmark)

    Albers, Kristoffer Jon; Moth, Andreas Leon Aagard; Mørup, Morten

    2013-01-01

The stochastic block-model and its non-parametric extension, the Infinite Relational Model (IRM), have become key tools for discovering group-structure in complex networks. Identifying these groups is a combinatorial inference problem which is usually solved by Gibbs sampling. However, whether Gibbs sampling suffices and can be scaled to the modeling of large-scale real-world complex networks has not been examined sufficiently. In this paper we evaluate the performance and mixing ability of Gibbs sampling in the IRM by implementing a high-performance Gibbs sampler. We find that Gibbs sampling can be computationally scaled to handle millions of nodes and billions of links. Investigating the behavior of the Gibbs sampler for different network sizes, we find that the mixing ability decreases drastically with the network size, clearly indicating a need...

  2. Discrepancies in sample size calculations and data analyses reported in randomised trials: comparison of publications with protocols

    DEFF Research Database (Denmark)

    Chan, A.W.; Hrobjartsson, A.; Jorgensen, K.J.

    2008-01-01

OBJECTIVE: To evaluate how often sample size calculations and methods of statistical analysis are pre-specified or changed in randomised trials. DESIGN: Retrospective cohort study. DATA SOURCE: Protocols and journal publications of published randomised parallel group trials initially approved in 1994-5 by the scientific-ethics committees for Copenhagen and Frederiksberg, Denmark (n=70). MAIN OUTCOME MEASURE: Proportion of protocols and publications that did not provide key information about sample size calculations and statistical methods; proportion of trials with discrepancies between information presented in the protocol and the publication. RESULTS: Only 11/62 trials described existing sample size calculations fully and consistently in both the protocol and the publication. The method of handling protocol deviations was described in 37 protocols and 43 publications. The method...

  3. Sample Size Planning for the Squared Multiple Correlation Coefficient: Accuracy in Parameter Estimation via Narrow Confidence Intervals.

    Science.gov (United States)

    Kelley, Ken

    2008-01-01

Methods of sample size planning are developed from the accuracy in parameter estimation approach in the multiple regression context in order to obtain a sufficiently narrow confidence interval for the population squared multiple correlation coefficient when regressors are random. Approximate and exact methods are developed that provide the necessary sample size so that the expected width of the confidence interval will be sufficiently narrow. Modifications of these methods are then developed so that the necessary sample size will lead to sufficiently narrow confidence intervals with no less than some desired degree of assurance. Computer routines have been developed and are included within the MBESS R package so that the methods discussed in the article can be implemented. The methods and computer routines are demonstrated using an empirical example linking innovation in the health services industry with previous innovation, personality factors, and group climate characteristics.

  4. PET/CT in cancer: moderate sample sizes may suffice to justify replacement of a regional gold standard

    DEFF Research Database (Denmark)

    Gerke, Oke; Poulsen, Mads Hvid; Bouchelouche, Kirsten

    2009-01-01

PURPOSE: For certain cancer indications, the current patient evaluation strategy is a perfect but locally restricted gold standard procedure. If positron emission tomography/computed tomography (PET/CT) can be shown to be reliable within the gold standard region and if it can be argued that PET/CT also performs well in adjacent areas, then sample sizes in accuracy studies can be reduced. PROCEDURES: Traditional standard power calculations for demonstrating sensitivities of both 80% and 90% are shown. The argument is then described in general terms and demonstrated by an ongoing study of metastasized prostate cancer. RESULTS: An added value in accuracy of PET/CT in adjacent areas can outweigh a downsized target level of accuracy in the gold standard region, justifying smaller sample sizes. CONCLUSIONS: If PET/CT provides an accuracy benefit in adjacent regions, then sample sizes can be reduced...
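
    A sketch of the kind of "traditional standard power calculation" mentioned above, using the normal approximation for a one-sided test that sensitivity exceeds a threshold; the 80%/90% figures follow the abstract, while the alpha and power values are assumed conventions.

```python
from math import ceil, sqrt
from scipy.stats import norm

def n_for_sensitivity(p0, p1, alpha=0.05, power=0.80):
    """Normal-approximation sample size (number of disease-positive cases)
    needed to show sensitivity p1 exceeds a threshold p0, one-sided test."""
    z_a, z_b = norm.ppf(1 - alpha), norm.ppf(power)
    num = z_a * sqrt(p0 * (1 - p0)) + z_b * sqrt(p1 * (1 - p1))
    return ceil((num / (p1 - p0)) ** 2)

# e.g. demonstrate a sensitivity of 90% against an 80% threshold
print(n_for_sensitivity(0.80, 0.90))
```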

  5. Bayesian hierarchical model used to analyze regression between fish body size and scale size: application to rare fish species Zingel asper

    Directory of Open Access Journals (Sweden)

    Fontez B.

    2014-04-01

Full Text Available Back-calculation makes it possible to increase the data available on fish growth. The accuracy of back-calculation models is of paramount importance for growth analysis. Frequentist and Bayesian hierarchical approaches were used for the regression between fish body size and scale size for the rare fish species Zingel asper. The Bayesian approach permits more reliable estimation of back-calculated size, taking into account biological information and cohort variability. This method greatly improves the estimation of back-calculated length when sampling is uneven and/or small.

  6. Modeling the impact of soil aggregate size on selenium immobilization

    Directory of Open Access Journals (Sweden)

    M. F. Kausch

    2012-09-01

Full Text Available Soil aggregates are mm- to cm-sized microporous structures separated by macropores. Whereas fast advective transport prevails in macropores, advection is inhibited by the low permeability of intra-aggregate micropores. This can lead to mass transfer limitations and the formation of aggregate-scale concentration gradients affecting the distribution and transport of redox-sensitive elements. Selenium (Se) mobilized through irrigation of seleniferous soils has emerged as a major aquatic contaminant. In the absence of oxygen, the bioavailable oxyanions selenate, Se(VI), and selenite, Se(IV), can be microbially reduced to solid, elemental Se, Se(0), and anoxic microzones within soil aggregates are thought to promote this process in otherwise well aerated soils.

    To evaluate the impact of soil aggregate size on selenium retention, we developed a dynamic 2-D reactive transport model of selenium cycling in a single idealized aggregate surrounded by a macropore. The model was developed based on flow-through-reactor experiments involving artificial soil aggregates (diameter: 2.5 cm) made of sand and containing Enterobacter cloacae SLD1a-1 that reduces Se(VI) via Se(IV) to Se(0). Aggregates were surrounded by a constant flow providing Se(VI) and pyruvate under oxic or anoxic conditions. In the model, reactions were implemented with double-Monod rate equations coupled to the transport of pyruvate, O2, and Se species. The spatial and temporal dynamics of the model were validated with data from the experiments, and predictive simulations were performed covering aggregate sizes between 1 and 2.5 cm in diameter.

    Simulations predict that selenium retention scales with aggregate size. Depending on O2, Se(VI), and pyruvate concentrations, selenium retention was 4–23 times higher in 2.5-cm aggregates compared to 1-cm aggregates. Under oxic conditions, aggregate size and pyruvate concentrations were found to have a positive synergistic effect on selenium retention.

  7. Modeling the impact of soil aggregate size on selenium immobilization

    Science.gov (United States)

    Kausch, M. F.; Pallud, C. E.

    2013-03-01

    Soil aggregates are mm- to cm-sized microporous structures separated by macropores. Whereas fast advective transport prevails in macropores, advection is inhibited by the low permeability of intra-aggregate micropores. This can lead to mass transfer limitations and the formation of aggregate scale concentration gradients affecting the distribution and transport of redox sensitive elements. Selenium (Se) mobilized through irrigation of seleniferous soils has emerged as a major aquatic contaminant. In the absence of oxygen, the bioavailable oxyanions selenate, Se(VI), and selenite, Se(IV), can be microbially reduced to solid, elemental Se, Se(0), and anoxic microzones within soil aggregates are thought to promote this process in otherwise well-aerated soils. To evaluate the impact of soil aggregate size on selenium retention, we developed a dynamic 2-D reactive transport model of selenium cycling in a single idealized aggregate surrounded by a macropore. The model was developed based on flow-through-reactor experiments involving artificial soil aggregates (diameter: 2.5 cm) made of sand and containing Enterobacter cloacae SLD1a-1 that reduces Se(VI) via Se(IV) to Se(0). Aggregates were surrounded by a constant flow providing Se(VI) and pyruvate under oxic or anoxic conditions. In the model, reactions were implemented with double-Monod rate equations coupled to the transport of pyruvate, O2, and Se species. The spatial and temporal dynamics of the model were validated with data from experiments, and predictive simulations were performed covering aggregate sizes 1-2.5 cm in diameter. Simulations predict that selenium retention scales with aggregate size. Depending on O2, Se(VI), and pyruvate concentrations, selenium retention was 4-23 times higher in 2.5 cm aggregates compared to 1 cm aggregates. Under oxic conditions, aggregate size and pyruvate concentrations were found to have a positive synergistic effect on selenium retention. Promoting soil aggregation on
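
    For readers unfamiliar with double-Monod kinetics, the sketch below integrates a zero-dimensional (well-mixed) version of such a rate law with SciPy; all rate constants and concentrations are placeholders, not the calibrated parameters of the 2-D aggregate model.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Hypothetical double-Monod reduction of Se(VI) with pyruvate as the electron
# donor in a well-mixed volume; parameter values are illustrative only.
mu_max = 2.0       # 1/day, maximum specific reduction rate
K_se = 0.1         # mM, half-saturation constant for Se(VI)
K_pyr = 0.5        # mM, half-saturation constant for pyruvate
Y = 0.3            # mM pyruvate consumed per mM Se(VI) reduced
B = 1.0            # lumped biomass factor (dimensionless)

def rhs(t, y):
    se, pyr = y
    # Double-Monod rate: limited by both the electron acceptor and the donor.
    rate = mu_max * B * se / (K_se + se) * pyr / (K_pyr + pyr)
    return [-rate, -Y * rate]

sol = solve_ivp(rhs, (0, 10), y0=[1.0, 5.0])
print("Se(VI) remaining after 10 days (mM):", sol.y[0, -1])
```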

  8. Modeling the impact of soil aggregate size on selenium immobilization

    Directory of Open Access Journals (Sweden)

    M. F. Kausch

    2013-03-01

Full Text Available Soil aggregates are mm- to cm-sized microporous structures separated by macropores. Whereas fast advective transport prevails in macropores, advection is inhibited by the low permeability of intra-aggregate micropores. This can lead to mass transfer limitations and the formation of aggregate-scale concentration gradients affecting the distribution and transport of redox-sensitive elements. Selenium (Se) mobilized through irrigation of seleniferous soils has emerged as a major aquatic contaminant. In the absence of oxygen, the bioavailable oxyanions selenate, Se(VI), and selenite, Se(IV), can be microbially reduced to solid, elemental Se, Se(0), and anoxic microzones within soil aggregates are thought to promote this process in otherwise well-aerated soils. To evaluate the impact of soil aggregate size on selenium retention, we developed a dynamic 2-D reactive transport model of selenium cycling in a single idealized aggregate surrounded by a macropore. The model was developed based on flow-through-reactor experiments involving artificial soil aggregates (diameter: 2.5 cm) made of sand and containing Enterobacter cloacae SLD1a-1 that reduces Se(VI) via Se(IV) to Se(0). Aggregates were surrounded by a constant flow providing Se(VI) and pyruvate under oxic or anoxic conditions. In the model, reactions were implemented with double-Monod rate equations coupled to the transport of pyruvate, O2, and Se species. The spatial and temporal dynamics of the model were validated with data from experiments, and predictive simulations were performed covering aggregate sizes 1–2.5 cm in diameter. Simulations predict that selenium retention scales with aggregate size. Depending on O2, Se(VI), and pyruvate concentrations, selenium retention was 4–23 times higher in 2.5 cm aggregates compared to 1 cm aggregates. Under oxic conditions, aggregate size and pyruvate concentrations were found to have a positive synergistic effect on selenium retention. Promoting soil

  9. Ultrastructural model for size selectivity in glomerular filtration.

    Science.gov (United States)

    Edwards, A; Daniels, B S; Deen, W M

    1999-06-01

    A theoretical model was developed to relate the size selectivity of the glomerular barrier to the structural characteristics of the individual layers of the capillary wall. Thicknesses and other linear dimensions were evaluated, where possible, from previous electron microscopic studies. The glomerular basement membrane (GBM) was represented as a homogeneous material characterized by a Darcy permeability and by size-dependent hindrance coefficients for diffusion and convection, respectively; those coefficients were estimated from recent data obtained with isolated rat GBM. The filtration slit diaphragm was modeled as a single row of cylindrical fibers of equal radius but nonuniform spacing. The resistances of the remainder of the slit channel, and of the endothelial fenestrae, to macromolecule movement were calculated to be negligible. The slit diaphragm was found to be the most restrictive part of the barrier. Because of that, macromolecule concentrations in the GBM increased, rather than decreased, in the direction of flow. Thus the overall sieving coefficient (ratio of Bowman's space concentration to that in plasma) was predicted to be larger for the intact capillary wall than for a hypothetical structure with no GBM. In other words, because the slit diaphragm and GBM do not act independently, the overall sieving coefficient is not simply the product of those for GBM alone and the slit diaphragm alone. Whereas the calculated sieving coefficients were sensitive to the structural features of the slit diaphragm and to the GBM hindrance coefficients, variations in GBM thickness or filtration slit frequency were predicted to have little effect. The ability of the ultrastructural model to represent fractional clearance data in vivo was at least equal to that of conventional pore models with the same number of adjustable parameters. The main strength of the present approach, however, is that it provides a framework for relating structural findings to the size

  10. STUDIES ON PROPERTY OF SAMPLE SIZE AND DIFFERENT TRAITS FOR CORE COLLECTIONS BASED ON GENOTYPIC VALUES OF COTTON

    Institute of Scientific and Technical Information of China (English)

    2001-01-01

Studies were conducted on specific core collections constructed on the basis of different traits and sample sizes, using the method of stepwise clustering with three sampling strategies based on genotypic values of cotton. A total of 21 traits (11 agronomy traits, 5 fiber traits and 5 seed traits) were used to construct the main core collections. Specific core collections, as representatives of the initial collection, were constructed by agronomy, fiber or seed traits, respectively. Compared with the main core collection, the specific core collections tended to have similar properties for maintaining the genetic diversity of agronomy, seed or fiber traits. Core collections developed with sample sizes of about 17% (P2 = 0.17) and 24% (P1 = 0.24) under the three sampling strategies could be quite representative of the initial collection.

  11. The effect of the sample size and location on contrast ultrasound measurement of perfusion parameters.

    Science.gov (United States)

    Leinonen, Merja R; Raekallio, Marja R; Vainio, Outi M; Ruohoniemi, Mirja O; O'Brien, Robert T

    2011-01-01

    Contrast-enhanced ultrasound can be used to quantify tissue perfusion based on region of interest (ROI) analysis. The effect of the location and size of the ROI on the obtained perfusion parameters has been described in phantom, ex vivo and in vivo studies. We assessed the effects of location and size of the ROI on perfusion parameters in the renal cortex of 10 healthy, anesthetized cats using Definity contrast-enhanced ultrasound to estimate the importance of the ROI on quantification of tissue perfusion with contrast-enhanced ultrasound. Three separate sets of ROIs were placed in the renal cortex, varying in location, size or depth. There was a significant inverse association between increased depth or increased size of the ROI and peak intensity (P < 0.05). There was no statistically significant difference in the peak intensity between the ROIs placed in a row in the near field cortex. There was no significant difference in the ROIs with regard to arrival time, time to peak intensity and wash-in rate. When comparing two different ROIs in a patient with focal lesions, such as suspected neoplasia or infarction, the ROIs should always be placed at same depth and be as similar in size as possible.

  12. Evaluating the performance of species richness estimators: sensitivity to sample grain size

    DEFF Research Database (Denmark)

    Hortal, Joaquín; Borges, Paulo A. V.; Gaspar, Clara

    2006-01-01

Data obtained with standardized sampling of 78 transects in natural forest remnants of five islands were aggregated at seven different grains (i.e. ways of defining a single sample): islands, natural areas, transects, pairs of traps, traps, database records and individuals, to assess the effect of using different grains in biodiversity studies. Owing to their inherent formulas, several nonparametric and asymptotic estimators are insensitive to differences in the way the samples are aggregated. Thus, they could be used to compare species richness scores obtained from different sampling strategies. Our results also point out...

  13. Inferring Population Size History from Large Samples of Genome-Wide Molecular Data - An Approximate Bayesian Computation Approach.

    Directory of Open Access Journals (Sweden)

    Simon Boitard

    2016-03-01

Full Text Available Inferring the ancestral dynamics of effective population size is a long-standing question in population genetics, which can now be tackled much more accurately thanks to the massive genomic data available in many species. Several promising methods that take advantage of whole-genome sequences have been recently developed in this context. However, they can only be applied to rather small samples, which limits their ability to estimate recent population size history. Besides, they can be very sensitive to sequencing or phasing errors. Here we introduce a new approximate Bayesian computation approach named PopSizeABC that allows estimating the evolution of the effective population size through time, using a large sample of complete genomes. This sample is summarized using the folded allele frequency spectrum and the average zygotic linkage disequilibrium at different bins of physical distance, two classes of statistics that are widely used in population genetics and can be easily computed from unphased and unpolarized SNP data. Our approach provides accurate estimations of past population sizes, from the very first generations before present back to the expected time to the most recent common ancestor of the sample, as shown by simulations under a wide range of demographic scenarios. When applied to samples of 15 or 25 complete genomes in four cattle breeds (Angus, Fleckvieh, Holstein and Jersey), PopSizeABC revealed a series of population declines, related to historical events such as domestication or modern breed creation. We further highlight that our approach is robust to sequencing errors, provided summary statistics are computed from SNPs with common alleles.
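
    A generic ABC rejection sketch showing the inference pattern that PopSizeABC builds on: draw parameters from a prior, simulate summary statistics under each draw, and keep the draws whose summaries lie close to the observed ones. The one-dimensional summary, prior range and tolerance here are toy assumptions, not the AFS/LD statistics used by the method.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy "observed" summary statistic that grows with effective population size
# N_e (purely illustrative).
true_ne = 5000
observed = rng.normal(loc=np.log(true_ne), scale=0.2)

# ABC rejection step.
prior_draws = rng.uniform(100, 50_000, size=100_000)   # prior on N_e
summaries = rng.normal(loc=np.log(prior_draws), scale=0.2)
accepted = prior_draws[np.abs(summaries - observed) < 0.05]

print(f"posterior mean N_e ~ {accepted.mean():.0f} ({accepted.size} accepted)")
```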

  14. Statistical traffic modeling of MPEG frame size: Experiments and Analysis

    Directory of Open Access Journals (Sweden)

    Haniph A. Latchman

    2009-12-01

Full Text Available For guaranteed quality of service (QoS) and sufficient bandwidth in a communication network which provides an integrated multimedia service, it is important to obtain an analytical and tractable model of the compressed MPEG data. This paper presents a statistical approach to a group-of-pictures (GOP) MPEG frame size model to increase network traffic performance in a communication network. We extract MPEG frame data from commercial DVD movies and make probability histograms to analyze the statistical characteristics of the MPEG frame data. Six candidate probability distributions are considered here and their parameters are obtained from the empirical data using maximum likelihood estimation (MLE). This paper shows that the lognormal distribution is the best fitting model of MPEG-2 total frame data.
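
    A minimal sketch of MLE fitting and model comparison for frame-size data, using synthetic lognormal samples in place of parsed MPEG-2 frames; the candidate set and the use of AIC for ranking are illustrative choices, not the paper's exact procedure.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
# Synthetic GOP-level total frame sizes in bytes; a real analysis would parse
# them from an MPEG-2 bitstream.
frame_sizes = rng.lognormal(mean=10.5, sigma=0.6, size=5000)

candidates = {
    "lognormal": (stats.lognorm, {"floc": 0}),
    "gamma": (stats.gamma, {"floc": 0}),
    "weibull": (stats.weibull_min, {"floc": 0}),
    "normal": (stats.norm, {}),
}

for name, (dist, fixed) in candidates.items():
    params = dist.fit(frame_sizes, **fixed)           # maximum likelihood fit
    loglik = np.sum(dist.logpdf(frame_sizes, *params))
    k = len(params) - len(fixed)                      # count free parameters only
    aic = 2 * k - 2 * loglik
    print(f"{name:10s}  AIC = {aic:,.0f}")
```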

  15. ACTINIDE REMOVAL PROCESS SAMPLE ANALYSIS, CHEMICAL MODELING, AND FILTRATION EVALUATION

    Energy Technology Data Exchange (ETDEWEB)

    Martino, C.; Herman, D.; Pike, J.; Peters, T.

    2014-06-05

Filtration within the Actinide Removal Process (ARP) currently limits the throughput in interim salt processing at the Savannah River Site. In this process, batches of salt solution with Monosodium Titanate (MST) sorbent are concentrated by crossflow filtration. The filtrate is subsequently processed to remove cesium in the Modular Caustic Side Solvent Extraction Unit (MCU) followed by disposal in saltstone grout. The concentrated MST slurry is washed and sent to the Defense Waste Processing Facility (DWPF) for vitrification. During recent ARP processing, there has been a degradation of filter performance manifested as the inability to maintain high filtrate flux throughout a multi-batch cycle. The objectives of this effort were to characterize the feed streams, to determine if solids (in addition to MST) are precipitating and causing the degraded performance of the filters, and to assess the particle size and rheological data to address potential filtration impacts. Equilibrium modelling with OLI Analyzer™ and OLI ESP™ was performed to determine chemical components at risk of precipitation and to simulate the ARP process. The performance of ARP filtration was evaluated to review potential causes of the observed filter behavior. Task activities for this study included extensive physical and chemical analysis of samples from the Late Wash Pump Tank (LWPT) and the Late Wash Hold Tank (LWHT) within ARP as well as samples of the tank farm feed from Tank 49H. The samples from the LWPT and LWHT were obtained from several stages of processing of Salt Batch 6D, Cycle 6, Batch 16.

  16. Modeling group size and scalar stress by logistic regression from an archaeological perspective.

    Directory of Open Access Journals (Sweden)

    Gianmarco Alberti

Full Text Available Johnson's scalar stress theory, describing the mechanics of (and the remedies to) the increase in in-group conflictuality that parallels the increase in group size, provides scholars with a useful theoretical framework for the understanding of different aspects of the material culture of past communities (i.e., social organization, communal food consumption, ceramic style, architecture and settlement layout). Due to its relevance in archaeology and anthropology, the article aims at proposing a predictive model of the critical level of scalar stress on the basis of community size. Drawing upon Johnson's theory and on Dunbar's findings on the cognitive constraints to human group size, a model is built by means of logistic regression on the basis of data on colony fissioning among the Hutterites of North America. On the grounds of the theoretical framework sketched in the first part of the article, the absence or presence of colony fissioning is considered an expression of a non-critical vs. critical level of scalar stress for the sake of model building. The model, which is also tested against a sample of archaeological and ethnographic cases: (a) confirms the existence of a significant relationship between critical scalar stress and group size, setting the issue on firmer statistical grounds; (b) allows calculating the intercept and slope of the logistic regression model, which can be used at any time to estimate the probability that a community experienced a critical level of scalar stress; (c) allows locating a critical scalar stress threshold at community size 127 (95% CI: 122-132), while the maximum probability of critical scalar stress is predicted at size 158 (95% CI: 147-170). The model ultimately provides grounds to assess, for the sake of any further archaeological/anthropological interpretation, the probability that a group reached a hot spot of size development critical for its internal cohesion.
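
    A hedged sketch of the core estimation step: fit a logistic regression of fissioning on group size and read off the size at which the predicted probability reaches 0.5 (the negative intercept divided by the slope). The data are simulated around the reported threshold of 127, not the Hutterite colony records.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
# Hypothetical colony sizes and fission events (1 = colony split).
size = rng.uniform(60, 200, 300)
p_true = 1 / (1 + np.exp(-(size - 127) / 10))
fission = rng.binomial(1, p_true)

X = sm.add_constant(size)
fit = sm.Logit(fission, X).fit(disp=False)
b0, b1 = fit.params

# Size at which the predicted probability of critical scalar stress is 0.5.
threshold = -b0 / b1
print(f"estimated critical group size ~ {threshold:.0f}")
```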

  17. A Monte-Carlo simulation analysis for evaluating the severity distribution functions (SDFs) calibration methodology and determining the minimum sample-size requirements.

    Science.gov (United States)

    Shirazi, Mohammadali; Reddy Geedipally, Srinivas; Lord, Dominique

    2017-01-01

    Severity distribution functions (SDFs) are used in highway safety to estimate the severity of crashes and conduct different types of safety evaluations and analyses. Developing a new SDF is a difficult task and demands significant time and resources. To simplify the process, the Highway Safety Manual (HSM) has started to document SDF models for different types of facilities. As such, SDF models have recently been introduced for freeway and ramps in HSM addendum. However, since these functions or models are fitted and validated using data from a few selected number of states, they are required to be calibrated to the local conditions when applied to a new jurisdiction. The HSM provides a methodology to calibrate the models through a scalar calibration factor. However, the proposed methodology to calibrate SDFs was never validated through research. Furthermore, there are no concrete guidelines to select a reliable sample size. Using extensive simulation, this paper documents an analysis that examined the bias between the 'true' and 'estimated' calibration factors. It was indicated that as the value of the true calibration factor deviates further away from '1', more bias is observed between the 'true' and 'estimated' calibration factors. In addition, simulation studies were performed to determine the calibration sample size for various conditions. It was found that, as the average of the coefficient of variation (CV) of the 'KAB' and 'C' crashes increases, the analyst needs to collect a larger sample size to calibrate SDF models. Taking this observation into account, sample-size guidelines are proposed based on the average CV of crash severities that are used for the calibration process.

  18. Method to study sample object size limit of small-angle x-ray scattering computed tomography

    Science.gov (United States)

    Choi, Mina; Ghammraoui, Bahaa; Badal, Andreu; Badano, Aldo

    2016-03-01

    Small-angle x-ray scattering (SAXS) imaging is an emerging medical tool that can be used for in vivo detailed tissue characterization and has the potential to provide added contrast to conventional x-ray projection and CT imaging. We used a publicly available MC-GPU code to simulate x-ray trajectories in a SAXS-CT geometry for a target material embedded in a water background material with varying sample sizes (1, 3, 5, and 10 mm). Our target materials were water solution of gold nanoparticle (GNP) spheres with a radius of 6 nm and a water solution with dissolved serum albumin (BSA) proteins due to their well-characterized scatter profiles at small angles and highly scattering properties. The background material was water. Our objective is to study how the reconstructed scatter profile degrades at larger target imaging depths and increasing sample sizes. We have found that scatter profiles of the GNP in water can still be reconstructed at depths up to 5 mm embedded at the center of a 10 mm sample. Scatter profiles of BSA in water were also reconstructed at depths up to 5 mm in a 10 mm sample but with noticeable signal degradation as compared to the GNP sample. This work presents a method to study the sample size limits for future SAXS-CT imaging systems.

  19. 45 CFR Appendix C to Part 1356 - Calculating Sample Size for NYTD Follow-Up Populations

    Science.gov (United States)

    2010-10-01

Appendix C to Part 1356, Calculating Sample Size for NYTD Follow-Up Populations (Public Welfare Regulations Relating to Public Welfare). 1. Using the Finite Population Correction: The Finite Population Correction (FPC) is applied when the sample is drawn from a population of one to 5,000 youth, because the sample is more...
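
    A small illustration of how the finite population correction shrinks a required sample size as the population gets smaller. The starting value n0 = 385 is the familiar infinite-population size for a proportion with a ±5% margin at 95% confidence; it is only an assumed starting point, not a figure from the regulation.

```python
from math import ceil

def sample_size_with_fpc(n0, population_size):
    """Adjust an infinite-population sample size n0 with the finite
    population correction: n = n0 / (1 + (n0 - 1) / N)."""
    return ceil(n0 / (1 + (n0 - 1) / population_size))

for N in (500, 2000, 5000):
    print(f"population {N}: adjusted sample size {sample_size_with_fpc(385, N)}")
```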

  20. The accuracy of instrumental neutron activation analysis of kilogram-size inhomogeneous samples.

    Science.gov (United States)

    Blaauw, M; Lakmaker, O; van Aller, P

    1997-07-01

The feasibility of quantitative instrumental neutron activation analysis (INAA) of samples in the kilogram range without internal standardization has been demonstrated by Overwater et al. (Anal. Chem. 1996, 68, 341). In their studies, however, they demonstrated only the agreement between the "corrected" γ-ray spectrum of homogeneous large samples and that of small samples of the same material. In this paper, the k0 calibration of the IRI facilities for large samples is described and, this time in terms of (trace) element concentrations, some of Overwater's results for homogeneous materials are presented again, as well as results obtained from inhomogeneous materials and subsamples thereof. It is concluded that large-sample INAA can be as accurate as ordinary INAA, even when applied to inhomogeneous materials.

  1. Investigation of a Gamma model for mixture STR samples

    DEFF Research Database (Denmark)

    Christensen, Susanne; Bøttcher, Susanne Gammelgaard; Lauritzen, Steffen L.

The behaviour of the PCR Amplification Kit, when used for mixture STR samples, is investigated. A model based on the Gamma distribution is fitted to the amplifier output for constructed mixtures, and the assumptions of the model are evaluated via residual analysis.

  2. (I Can't Get No) Saturation: A simulation and guidelines for sample sizes in qualitative research.

    Science.gov (United States)

    van Rijnsoever, Frank J

    2017-01-01

    I explore the sample size in qualitative research that is required to reach theoretical saturation. I conceptualize a population as consisting of sub-populations that contain different types of information sources that hold a number of codes. Theoretical saturation is reached after all the codes in the population have been observed once in the sample. I delineate three different scenarios to sample information sources: "random chance," which is based on probability sampling, "minimal information," which yields at least one new code per sampling step, and "maximum information," which yields the largest number of new codes per sampling step. Next, I use simulations to assess the minimum sample size for each scenario for systematically varying hypothetical populations. I show that theoretical saturation is more dependent on the mean probability of observing codes than on the number of codes in a population. Moreover, the minimal and maximal information scenarios are significantly more efficient than random chance, but yield fewer repetitions per code to validate the findings. I formulate guidelines for purposive sampling and recommend that researchers follow a minimum information scenario.
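
    A minimal simulation in the spirit of the "random chance" scenario: sources are drawn at random and saturation is declared once every code has been seen at least once. The number of codes and the per-source observation probability are arbitrary assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(6)

def sources_until_saturation(code_probs, rng):
    """Random-chance scenario: sample information sources until every code
    has been observed at least once."""
    seen = np.zeros(len(code_probs), dtype=bool)
    n = 0
    while not seen.all():
        seen |= rng.random(len(code_probs)) < code_probs
        n += 1
    return n

# 30 codes; each source exhibits a given code with probability 0.2.
code_probs = np.full(30, 0.2)
draws = [sources_until_saturation(code_probs, rng) for _ in range(1000)]
print("median sample size to reach saturation:", int(np.median(draws)))
```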

  3. The economic production lot size model with several production rates

    DEFF Research Database (Denmark)

    Larsen, Christian

We study an extension of the economic production lot size model, where more than one production rate can be used during a cycle. The production rates and their corresponding runtimes are decision variables. We decompose the problem into two subproblems. First, we show that all production rates should be chosen in the interval between the demand rate and the production rate which minimizes unit production costs, and should be used in increasing order. Then, given the production rates, we derive closed-form solutions for the optimal runtimes as well as the minimum average cost. Finally we...

  4. Does size matter? An investigation into the Rey Complex Figure in a pediatric clinical sample.

    Science.gov (United States)

    Loughan, Ashlee R; Perna, Robert B; Galbreath, Jennifer D

    2014-01-01

    The Rey Complex Figure Test (RCF) copy requires visuoconstructional skills and significant attentional, organizational, and problem-solving skills. Most scoring schemes codify a subset of the details involved in figure construction. Research is unclear regarding the meaning of figure size. The research hypothesis of our inquiry is that size of the RCF copy will have neuropsychological significance. Data from 95 children (43 girls, 52 boys; ages 6-18 years) with behavioral and academic issues revealed that larger figure drawings were associated with higher RCF total scores and significantly higher scores across many neuropsychological tests including the Wechsler Individual Achievement Test-Second Edition (WIAT-II) Word Reading (F = 5.448, p = .022), WIAT-II Math Reasoning (F = 6.365, p = .013), Children's Memory Scale Visual Delay (F = 4.015, p = .048), Trail-Making Test-Part A (F = 5.448, p = .022), and RCF Recognition (F = 4.862, p = .030). Results indicated that wider figures were associated with higher cognitive functioning, which may be part of an adaptive strategy in helping facilitate accurate and relative proportions of the complex details presented in the RCF. Overall, this study initiates the investigation of the RCF size and the relationship between size and a child's neuropsychological profile.

  5. Ion size effect on colloidal forces within the primitive model

    Directory of Open Access Journals (Sweden)

    J.Wu

    2005-01-01

    Full Text Available The effect of ion size on the mean force between a pair of isolated charged particles in an electrolyte solution is investigated using Monte Carlo simulations within the framework of the primitive model where both colloidal particles and small ions are represented by charged hard spheres and the solvent is treated as a dielectric continuum. It is found that the short-ranged attraction between like-charged macroions diminishes as the diameter of the intermediating divalent counterions and coions increases and the maximum attractive force is approximately a linear function of the counterion diameter. This size effect contradicts the prediction of the Asakura-Oosawa theory suggesting that an increase in the excluded volume of small ions would lead to a stronger depletion between colloidal particles. Interestingly, the simulation results indicate that both the hard-sphere collision and the electrostatic contributions to the mean force are insensitive to the size disparity of colloidal particles with the same average diameter.

  6. Population Validity and Cross-Validity: Applications of Distribution Theory for Testing Hypotheses, Setting Confidence Intervals, and Determining Sample Size

    Science.gov (United States)

    Algina, James; Keselman, H. J.

    2008-01-01

    Applications of distribution theory for the squared multiple correlation coefficient and the squared cross-validation coefficient are reviewed, and computer programs for these applications are made available. The applications include confidence intervals, hypothesis testing, and sample size selection. (Contains 2 tables.)

  7. (I Can’t Get No) Saturation: A simulation and guidelines for sample sizes in qualitative research

    NARCIS (Netherlands)

    van Rijnsoever, Frank J.|info:eu-repo/dai/nl/314100334

    2017-01-01

I explore the sample size in qualitative research that is required to reach theoretical saturation. I conceptualize a population as consisting of sub-populations that contain different types of information sources that hold a number of codes. Theoretical saturation is reached after all the codes in the population have been observed once in the sample.

  8. (I Can’t Get No) Saturation: A Simulation and Guidelines for Minimum Sample Sizes in Qualitative Research

    NARCIS (Netherlands)

    van Rijnsoever, F.J.

    2015-01-01

This paper explores the sample size in qualitative research that is required to reach theoretical saturation. I conceptualize a population as consisting of sub-populations that contain different types of information sources that hold a number of codes. Theoretical saturation is reached after all the codes in the population have been observed once in the sample.

  9. Methods for flexible sample-size design in clinical trials: Likelihood, weighted, dual test, and promising zone approaches.

    Science.gov (United States)

    Shih, Weichung Joe; Li, Gang; Wang, Yining

    2016-03-01

    Sample size plays a crucial role in clinical trials. Flexible sample-size designs, as part of the more general category of adaptive designs that utilize interim data, have been a popular topic in recent years. In this paper, we give a comparative review of four related methods for such a design. The likelihood method uses the likelihood ratio test with an adjusted critical value. The weighted method adjusts the test statistic with given weights rather than the critical value. The dual test method requires both the likelihood ratio statistic and the weighted statistic to be greater than the unadjusted critical value. The promising zone approach uses the likelihood ratio statistic with the unadjusted value and other constraints. All four methods preserve the type-I error rate. In this paper we explore their properties and compare their relationships and merits. We show that the sample size rules for the dual test are in conflict with the rules of the promising zone approach. We delineate what is necessary to specify in the study protocol to ensure the validity of the statistical procedure and what can be kept implicit in the protocol so that more flexibility can be attained for confirmatory phase III trials in meeting regulatory requirements. We also prove that under mild conditions, the likelihood ratio test still preserves the type-I error rate when the actual sample size is larger than the re-calculated one.

  10. Inert gases in a terra sample - Measurements in six grain-size fractions and two single particles from Lunar 20.

    Science.gov (United States)

    Heymann, D.; Lakatos, S.; Walton, J. R.

    1973-01-01

    Review of the results of inert gas measurements performed on six grain-size fractions and two single particles from four samples of Luna 20 material. Presented and discussed data include the inert gas contents, element and isotope systematics, radiation ages, and Ar-36/Ar-40 systematics.

  11. Inert gases in a terra sample - Measurements in six grain-size fractions and two single particles from Lunar 20.

    Science.gov (United States)

    Heymann, D.; Lakatos, S.; Walton, J. R.

    1973-01-01

    Review of the results of inert gas measurements performed on six grain-size fractions and two single particles from four samples of Luna 20 material. Presented and discussed data include the inert gas contents, element and isotope systematics, radiation ages, and Ar-36/Ar-40 systematics.

  12. (I Can’t Get No) Saturation: A Simulation and Guidelines for Minimum Sample Sizes in Qualitative Research

    NARCIS (Netherlands)

    van Rijnsoever, F.J.

    2015-01-01

This paper explores the sample size in qualitative research that is required to reach theoretical saturation. I conceptualize a population as consisting of sub-populations that contain different types of information sources that hold a number of codes. Theoretical saturation is reached after all the codes in the population have been observed once in the sample.

  13. Annual design-based estimation for the annualized inventories of forest inventory and analysis: sample size determination

    Science.gov (United States)

    Hans T. Schreuder; Jin-Mann S. Lin; John Teply

    2000-01-01

    The Forest Inventory and Analysis units in the USDA Forest Service have been mandated by Congress to go to an annualized inventory where a certain percentage of plots, say 20 percent, will be measured in each State each year. Although this will result in an annual sample size that will be too small for reliable inference for many areas, it is a sufficiently large...

  14. Use of pharmacogenetics in bioequivalence studies to reduce sample size: an example with mirtazapine and CYP2D6.

    Science.gov (United States)

    González-Vacarezza, N; Abad-Santos, F; Carcas-Sansuan, A; Dorado, P; Peñas-Lledó, E; Estévez-Carrizo, F; Llerena, A

    2013-10-01

    In bioequivalence studies, intra-individual variability (CV(w)) is critical in determining sample size. In particular, highly variable drugs may require enrollment of a greater number of subjects. We hypothesize that a strategy to reduce pharmacokinetic CV(w), and hence sample size and costs, would be to include subjects with decreased metabolic enzyme capacity for the drug under study. Therefore, two mirtazapine studies with a two-way, two-period crossover design (n=68) were re-analysed to calculate the total CV(w) and the CV(w)s in three different CYP2D6 genotype groups (0, 1 and ≥ 2 active genes). The results showed that a 29.2 or 15.3% sample size reduction would have been possible if recruitment had been restricted to individuals carrying 0, or 0 plus 1, CYP2D6 active genes, owing to the lower CV(w). This suggests that there may be a role for pharmacogenetics in the design of bioequivalence studies to reduce sample size and costs, thus introducing a new paradigm for the biopharmaceutical evaluation of drug products.
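
    For orientation, the sketch below shows the standard normal-approximation relationship between intra-individual CV(w) and the total sample size of a 2x2 crossover bioequivalence study; it is a generic textbook approximation, not the authors' calculation, and the assumed GMR, power and CV(w) values are placeholders.

```python
# Rough illustration (not the authors' calculation): normal-approximation sample size
# for a 2x2 crossover average-bioequivalence study, showing how a lower intra-individual
# CVw shrinks the required sample size.  GMR, power and alpha are assumed values.
import numpy as np
from scipy.stats import norm

def be_sample_size(cv_w, gmr=0.95, alpha=0.05, power=0.80):
    s_w2 = np.log(cv_w**2 + 1.0)                      # within-subject log-scale variance
    delta = np.log(1.25) - abs(np.log(gmr))           # distance to the closer BE limit
    n = 2.0 * (norm.ppf(1 - alpha) + norm.ppf(power))**2 * s_w2 / delta**2
    return int(np.ceil(n))                            # total subjects in the 2x2 crossover

for cv in (0.30, 0.25, 0.20):                         # e.g. all subjects vs enriched genotype groups
    print(f"CVw = {cv:.0%} -> approx. n = {be_sample_size(cv)}")
```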

  15. A Comparison of the Exact Kruskal-Wallis Distribution to Asymptotic Approximations for All Sample Sizes up to 105

    Science.gov (United States)

    Meyer, J. Patrick; Seaman, Michael A.

    2013-01-01

    The authors generated exact probability distributions for sample sizes up to 35 in each of three groups ("n" less than or equal to 105) and up to 10 in each of four groups ("n" less than or equal to 40). They compared the exact distributions to the chi-square, gamma, and beta approximations. The beta approximation was best in…

  16. A mathematical model for the product mixing and lot-sizing problem by considering stochastic demand

    Directory of Open Access Journals (Sweden)

    Dionicio Neira Rodado

    2016-11-01

    Full Text Available The product-mix planning and lot size decisions are among the most fundamental research themes for the operations research community. The fact that markets have become more unpredictable has rapidly increased the importance of these issues. Currently, decision makers need to work with product-mix planning and lot size decision models that introduce stochastic variables related to demands, lead times, etc. However, some real mathematical models involving stochastic variables are not capable of obtaining good solutions within short computing times. Several heuristics and metaheuristics have been developed to deal with lot decision problems in order to obtain high quality results within short computing times. Nevertheless, the search for an efficient model considering product mix and lot size with stochastic demand remains a prominent research area. This paper aims to develop a general model for the product-mix and lot size decision within a stochastic demand environment, by introducing the Economic Value Added (EVA) as the objective function of a product portfolio selection. The proposed stochastic model has been solved by using a Sample Average Approximation (SAA) scheme and obtains high quality results within acceptable computing times.
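
    To make the SAA idea concrete, the following toy sketch applies it to a single-product, newsvendor-style lot-size decision under sampled demand; this is a deliberate simplification and stands in for, rather than reproduces, the paper's multi-product EVA-based model.

```python
# Toy sketch of the Sample Average Approximation (SAA) idea for a lot-size decision under
# stochastic demand.  Single product, newsvendor-style economics; all costs and the demand
# distribution are assumed for illustration.
import numpy as np

rng = np.random.default_rng(42)
price, unit_cost, salvage = 12.0, 7.0, 2.0                       # assumed economics
demand_samples = rng.lognormal(mean=4.0, sigma=0.4, size=2000)   # sampled demand scenarios

def avg_profit(q, demand):
    sales = np.minimum(q, demand)
    leftover = np.maximum(q - demand, 0.0)
    return np.mean(price * sales + salvage * leftover - unit_cost * q)

candidate_lots = np.arange(10, 201)
profits = np.array([avg_profit(q, demand_samples) for q in candidate_lots])
best = candidate_lots[profits.argmax()]
print("SAA-optimal lot size:", best, "expected profit ≈", round(profits.max(), 2))
```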

  17. Flow Through a Laboratory Sediment Sample by Computer Simulation Modeling

    Science.gov (United States)

    2006-09-07

    Flow through a laboratory sediment sample by computer simulation modeling R.B. Pandeya’b*, Allen H. Reeda, Edward Braithwaitea, Ray Seyfarth0, J.F...through a laboratory sediment sample by computer simulation modeling 5a. CONTRACT NUMBER 5b. GRANT NUMBER 5c. PROGRAM ELEMENT NUMBER 6. AUTHOR(S

  18. How well does end-member modelling analysis of grain size data work?

    Science.gov (United States)

    Schulte, Philipp; Dietze, Michael; Dietze, Elisabeth

    2014-05-01

    End-member modelling analysis (EMMA) is a powerful and flexible statistical approach to identify and quantify generic sediment transport processes from multimodal grain-size distributions. EMMA was introduced over 15 years ago and is now available in different implementations: encapsulated FORTRAN code (Weltje, 1997), a Matlab script (Dietze et al., 2012) and the R package EMMAgeo (Dietze and Dietze, 2013). EMMA has mainly been used to reconstruct past sedimentation processes in a variety of sedimentary environments (marine, aeolian, lacustrine). Typically, it is rather difficult to assess how meaningful and how well the model performs in a certain environment, since neither the actual process end-members (generic grain-size distributions sorted by a certain sediment transport process) nor their individual contributions to each sample are known a priori. To allow a comprehensive performance test, we sampled a set of four known process end-members: alluvial sand (main mode: 0.70±0.55 φ), dune sand (main mode: 1.35±0.60 φ), loess (main mode: 4.71±0.65 φ) and overbank deposit (main mode: 5.81±1.62 φ). High resolution grain-size information is based on laser-diffraction analysis (116 classes). The four process end-members were artificially mixed with random, but known, proportions to yield 100 samples. This mixed data set was measured again with the laser particle size analyser and served as input for EMMA within the R package EMMAgeo. This contribution discusses the ability of EMMA to identify and characterise the four distinct process end-members and quantify their contributions to each sample. Different ways to estimate uncertainties are presented. Further evaluations focus on the influence of the number of included samples, the number of grain-size classes, vertical mixing of samples (simulating turbation) and self-similarity of process end-members. Dietze E, et al. 2012. An end-member algorithm for deciphering modern detrital processes from lake sediments of Lake Donggi Cona, NE
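
    The validation strategy described above (mixing known end-members with random but recorded proportions) can be mimicked numerically as in the sketch below; the Gaussian stand-ins for the end-member distributions and the Dirichlet mixing weights are assumptions, whereas the real study mixed measured distributions on 116 laser-diffraction classes.

```python
# Sketch of the validation strategy: mix four known grain-size end-members with random but
# recorded proportions to create synthetic samples that an end-member modelling algorithm
# (e.g. EMMAgeo) should then unmix.  Gaussian end-member shapes (in phi units) and Dirichlet
# weights are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
phi = np.linspace(-1, 10, 116)                                  # grain-size classes (phi)
modes, widths = [0.70, 1.35, 4.71, 5.81], [0.55, 0.60, 0.65, 1.62]

def gaussian_em(mode, width):
    g = np.exp(-0.5 * ((phi - mode) / width) ** 2)
    return g / g.sum()                                          # each distribution sums to 1

end_members = np.array([gaussian_em(m, w) for m, w in zip(modes, widths)])  # 4 x 116
weights = rng.dirichlet(alpha=np.ones(4), size=100)             # known mixing proportions
mixtures = weights @ end_members                                # 100 synthetic samples

print(mixtures.shape, "row sums ≈ 1:", np.allclose(mixtures.sum(axis=1), 1.0))
```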

  19. Preferential sampling and Bayesian geostatistics: Statistical modeling and examples.

    Science.gov (United States)

    Cecconi, Lorenzo; Grisotto, Laura; Catelan, Dolores; Lagazio, Corrado; Berrocal, Veronica; Biggeri, Annibale

    2016-08-01

    Preferential sampling refers to any situation in which the spatial process and the sampling locations are not stochastically independent. In this paper, we present two examples of geostatistical analysis in which the usual assumption of stochastic independence between the point process and the measurement process is violated. To account for preferential sampling, we specify a flexible and general Bayesian geostatistical model that includes a shared spatial random component. We apply the proposed model to two different case studies that allow us to highlight three different modeling and inferential aspects of geostatistical modeling under preferential sampling: (1) continuous or finite spatial sampling frame; (2) underlying causal model and relevant covariates; and (3) inferential goals related to mean prediction surface or prediction uncertainty.

  20. Sample size considerations in active-control non-inferiority trials with binary data based on the odds ratio.

    Science.gov (United States)

    Siqueira, Arminda Lucia; Todd, Susan; Whitehead, Anne

    2015-08-01

    This paper presents an approximate closed form sample size formula for determining non-inferiority in active-control trials with binary data. We use the odds-ratio as the measure of the relative treatment effect, derive the sample size formula based on the score test and compare it with a second, well-known formula based on the Wald test. Both closed form formulae are compared with simulations based on the likelihood ratio test. Within the range of parameter values investigated, the score test closed form formula is reasonably accurate when non-inferiority margins are based on odds-ratios of about 0.5 or above and when the magnitude of the odds ratio under the alternative hypothesis lies between about 1 and 2.5. The accuracy generally decreases as the odds ratio under the alternative hypothesis moves upwards from 1. As the non-inferiority margin odds ratio decreases from 0.5, the score test closed form formula increasingly overestimates the sample size irrespective of the magnitude of the odds ratio under the alternative hypothesis. The Wald test closed form formula is also reasonably accurate in the cases where the score test closed form formula works well. Outside these scenarios, the Wald test closed form formula can either underestimate or overestimate the sample size, depending on the magnitude of the non-inferiority margin odds ratio and the odds ratio under the alternative hypothesis. Although neither approximation is accurate for all cases, both approaches lead to satisfactory sample size calculation for non-inferiority trials with binary data where the odds ratio is the parameter of interest. © The Author(s) 2014.
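
    As a point of reference, the sketch below gives the generic Wald-style closed form for a non-inferiority sample size on the odds-ratio scale, based on the usual large-sample variance of log(OR); it may differ in detail from the formulas derived in the paper, and the control event rate, margin and power are assumed values.

```python
# Generic Wald-style closed-form sample size for a non-inferiority comparison on the
# odds-ratio scale, using the usual large-sample variance of log(OR).  This is the textbook
# form of a Wald-based formula and may differ in detail from the paper's expressions;
# p_control, the margin and the power are assumptions.
import numpy as np
from scipy.stats import norm

def n_per_group(p_control, or_alt, or_margin, alpha=0.025, power=0.9):
    p1 = or_alt * p_control / (1 - p_control + or_alt * p_control)   # event rate, test arm
    var_terms = 1 / (p_control * (1 - p_control)) + 1 / (p1 * (1 - p1))
    effect = np.log(or_alt) - np.log(or_margin)                      # distance from the margin
    n = (norm.ppf(1 - alpha) + norm.ppf(power))**2 * var_terms / effect**2
    return int(np.ceil(n))

print(n_per_group(p_control=0.4, or_alt=1.0, or_margin=0.5))
```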

  1. Modeling photoacoustic spectral features of micron-sized particles.

    Science.gov (United States)

    Strohm, Eric M; Gorelikov, Ivan; Matsuura, Naomi; Kolios, Michael C

    2014-10-07

    The photoacoustic signal generated from particles when irradiated by light is determined by attributes of the particle such as the size, speed of sound, morphology and the optical absorption coefficient. Unique features such as periodically varying minima and maxima are observed throughout the photoacoustic signal power spectrum, where the periodicity depends on these physical attributes. The frequency content of the photoacoustic signals can be used to obtain the physical attributes of unknown particles by comparison to analytical solutions of homogeneous symmetric geometric structures, such as spheres. However, analytical solutions do not exist for irregularly shaped particles, inhomogeneous particles or particles near structures. A finite element model (FEM) was used to simulate photoacoustic wave propagation from four different particle configurations: a homogeneous particle suspended in water, a homogeneous particle on a reflecting boundary, an inhomogeneous particle with an absorbing shell and non-absorbing core, and an irregularly shaped particle such as a red blood cell. Biocompatible perfluorocarbon droplets, 3-5 μm in diameter containing optically absorbing nanoparticles were used as the representative ideal particles, as they are spherical, homogeneous, optically translucent, and have known physical properties. The photoacoustic spectrum of micron-sized single droplets in suspension and on a reflecting boundary were measured over the frequency range of 100-500 MHz and compared directly to analytical models and the FEM. Good agreement between the analytical model, FEM and measured values were observed for a droplet in suspension, where the spectral minima agreed to within a 3.3 MHz standard deviation. For a droplet on a reflecting boundary, spectral features were correctly reproduced using the FEM but not the analytical model. The photoacoustic spectra from other common particle configurations such as particle with an absorbing shell and a

  3. An environmental sampling model for combining judgment and randomly placed samples

    Energy Technology Data Exchange (ETDEWEB)

    Sego, Landon H.; Anderson, Kevin K.; Matzke, Brett D.; Sieber, Karl; Shulman, Stanley; Bennett, James; Gillen, M.; Wilson, John E.; Pulsipher, Brent A.

    2007-08-23

    In the event of the release of a lethal agent (such as anthrax) inside a building, law enforcement and public health responders take samples to identify and characterize the contamination. Sample locations may be rapidly chosen based on available incident details and professional judgment. To achieve greater confidence of whether or not a room or zone was contaminated, or to certify that detectable contamination is not present after decontamination, we consider a Bayesian model for combining the information gained from both judgment and randomly placed samples. We investigate the sensitivity of the model to the parameter inputs and make recommendations for its practical use.
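
    The authors' Bayesian model is more elaborate than anything shown here, but a minimal Beta-Binomial stand-in, sketched below under assumed prior parameters and sample counts, conveys how prior belief from judgment sampling can be combined with clean results from randomly placed samples.

```python
# Minimal Beta-Binomial stand-in (NOT the authors' model, which is more elaborate):
# professional judgment is encoded as a Beta prior on the fraction of contaminated grid
# cells in a zone, and non-detect results from randomly placed samples update it.
# Prior parameters and sample counts are illustrative assumptions.
from scipy.stats import beta

prior_a, prior_b = 1.0, 20.0          # judgment-based prior: contamination believed unlikely
n_random_clean = 30                   # randomly placed samples, all non-detects

post = beta(prior_a, prior_b + n_random_clean)
threshold = 0.05                      # zone deemed acceptable if contaminated fraction < 5%
print("P(contaminated fraction < 5%) =", round(post.cdf(threshold), 3))
```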

  4. Sample Size Effect of Magnetomechanical Response for Magnetic Elastomers by Using Permanent Magnets

    Directory of Open Access Journals (Sweden)

    Tsubasa Oguro

    2017-01-01

    Full Text Available The size effect of the magnetomechanical response of chemically cross-linked, disk-shaped magnetic elastomers placed on a permanent magnet has been investigated by unidirectional compression tests. A cylindrical permanent magnet with a size of 35 mm in diameter and 15 mm in height was used to create the magnetic field. The magnetic field strength was approximately 420 mT at the center of the upper surface of the magnet. The diameter of the magnetoelastic polymer disks was varied from 14 mm to 35 mm, whereas the height was kept constant (5 mm in the undeformed state). We studied the influence of the disk diameter on the stress-strain behavior of the magnetic elastomers in the presence and absence of the magnetic field. It was found that the smallest magnetic elastomer, with 14 mm diameter, did not exhibit a measurable magnetomechanical response to the magnetic field. In contrast, the magnetic elastomers with diameters larger than 30 mm contracted in the direction parallel to the mechanical stress and elongated strongly in the perpendicular direction. An explanation is put forward to interpret this size-dependent behavior by taking into account the nonuniform distribution of the magnetic field produced by the permanent magnet.

  5. The proportionator: unbiased stereological estimation using biased automatic image analysis and non-uniform probability proportional to size sampling

    DEFF Research Database (Denmark)

    Gardi, Jonathan Eyal; Nyengaard, Jens Randel; Gundersen, Hans Jørgen Gottlieb

    2008-01-01

    The proportionator is a novel and radically different approach to sampling with microscopes, based on well-known statistical theory (probability proportional to size - PPS sampling). It uses automatic image analysis, with a large range of options, to assign a weight to every field of view in the section; the desired number of fields are then sampled automatically with probability proportional to the weight and presented to the expert observer. Using any known stereological probe and estimator, the correct count in these fields leads to a simple, unbiased estimate of the total amount of structure in the sections. Because of its entirely different sampling strategy, based on known but non-uniform sampling probabilities, the proportionator for the first time allows the real CE at the section level to be automatically estimated (not just predicted), unbiased - for all estimators and at no extra cost to the user.
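
    A minimal sketch of the PPS sampling idea underlying the proportionator is given below, with synthetic image-analysis weights and counts; the Hansen-Hurwitz form of the estimator is used here for simplicity.

```python
# Sketch of probability-proportional-to-size (PPS) sampling: fields of view get image-analysis
# weights, fields are drawn with probability proportional to weight, and counts in the sampled
# fields give an unbiased (Hansen-Hurwitz) estimate of the total.  Weights/counts are synthetic.
import numpy as np

rng = np.random.default_rng(3)
n_fields = 500
weights = rng.gamma(shape=2.0, scale=1.0, size=n_fields)        # automatic image-analysis weights
true_counts = rng.poisson(weights * 2.0)                        # structure roughly follows weight

p = weights / weights.sum()                                     # selection probabilities
n_sample = 25
sampled = rng.choice(n_fields, size=n_sample, replace=True, p=p)

estimate = np.mean(true_counts[sampled] / p[sampled])           # Hansen-Hurwitz estimator of the total
print("true total:", true_counts.sum(), " PPS estimate:", round(estimate, 1))
```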

  6. A mathematical model to predict the size of the pellets formed in freeze pelletization techniques: parameters affecting pellet size.

    Science.gov (United States)

    Cheboyina, Sreekhar; O'Haver, John; Wyandt, Christy M

    2006-01-01

    A mathematical model was developed based on the theory of drop formation to predict the size of the pellets formed in the freeze pelletization process. Further the model was validated by studying the effect of various parameters on the pellet size such as viscosity of the pellet forming and column liquids, surface/interfacial tension, density difference between pellet forming and column liquids; size, shape, and material of construction of the needle tips and temperatures maintained in the columns. In this study, pellets were prepared from different matrices including polyethylene glycols and waxes. The column liquids studied were silicone oils and aqueous glycerol solutions. The surface/interfacial tension, density difference between pellet forming and column liquids and needle tip size were found to be the most important factors affecting pellet size. The viscosity of the column liquid was not found to significantly affect the size of the pellets. The size of the pellets was also not affected by the pellet forming liquids of low viscosities. An increase in the initial column temperature slightly decreased the pellet size. The mathematical model developed was found to successfully predict the size of the pellets with an average error of 3.32% for different matrices that were studied.
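
    For intuition about the parameters listed above, the sketch below applies a classical drop-detachment (Tate's law) balance relating needle-tip size, interfacial tension and density difference to pellet diameter; this is a textbook relation used for illustration and not necessarily the authors' exact model, and the numerical inputs are assumed.

```python
# Classical drop-detachment (Tate's law) estimate of pellet diameter, consistent with the
# parameters listed above (interfacial tension, density difference, needle-tip size) but not
# necessarily identical to the authors' model.  At detachment the interfacial force
# pi * d_tip * gamma balances the buoyant weight of the drop; the numbers are illustrative.
import math

def pellet_diameter(d_tip, gamma, delta_rho, g=9.81):
    volume = math.pi * d_tip * gamma / (delta_rho * g)      # m^3, from the force balance
    return (6.0 * volume / math.pi) ** (1.0 / 3.0)          # equivalent sphere diameter, m

d = pellet_diameter(d_tip=0.6e-3, gamma=0.030, delta_rho=150.0)  # 0.6 mm tip, 30 mN/m, 150 kg/m^3
print(f"predicted pellet diameter ≈ {d*1e3:.2f} mm")
```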

  7. Randomized comparison of 3 different-sized biopsy forceps for quality of sampling in Barrett's esophagus.

    Science.gov (United States)

    Gonzalez, Susana; Yu, Woojin M; Smith, Michael S; Slack, Kristen N; Rotterdam, Heidrun; Abrams, Julian A; Lightdale, Charles J

    2010-11-01

    Several types of forceps are available for use in sampling Barrett's esophagus (BE). Few data exist with regard to biopsy quality for histologic assessment. To evaluate sampling quality of 3 different forceps in patients with BE. Single-center, randomized clinical trial. Consecutive patients with BE undergoing upper endoscopy. Patients randomized to have biopsy specimens taken with 1 of 3 types of forceps: standard, large capacity, or jumbo. Specimen adequacy was defined a priori as a well-oriented biopsy sample 2 mm or greater in diameter and with at least muscularis mucosa present. A total of 65 patients were enrolled and analyzed (standard forceps, n = 21; large-capacity forceps, n = 21; jumbo forceps, n = 23). Compared with jumbo forceps, a significantly higher proportion of biopsy samples with large-capacity forceps were adequate (37.8% vs 25.2%, P = .002). Of the standard forceps biopsy samples, 31.9% were adequate, which was not significantly different from specimens taken with large-capacity (P = .20) or jumbo (P = .09) forceps. Biopsy specimens taken with jumbo forceps had the largest diameter (median, 3.0 mm vs 2.5 mm [standard] vs 2.8 mm [large capacity]; P = .0001). However, jumbo forceps had the lowest proportion of specimens that were well oriented (overall P = .001). Heterogeneous patient population precluded dysplasia detection analyses. Our results challenge the requirement of jumbo forceps and therapeutic endoscopes to properly perform the Seattle protocol. We found that standard and large-capacity forceps used with standard upper endoscopes produced biopsy samples at least as adequate as those obtained with jumbo forceps and therapeutic endoscopes in patients with BE. Copyright © 2010 American Society for Gastrointestinal Endoscopy. Published by Mosby, Inc. All rights reserved.

  8. Particle-size distribution models for the conversion of Chinese data to FAO/USDA system.

    Science.gov (United States)

    Shangguan, Wei; Dai, YongJiu; García-Gutiérrez, Carlos; Yuan, Hua

    2014-01-01

    We investigated eleven particle-size distribution (PSD) models to determine the appropriate models for describing the PSDs of 16349 Chinese soil samples. These data are based on three soil texture classification schemes, including one ISSS (International Society of Soil Science) scheme with four data points and two Katschinski schemes with five and six data points, respectively. The adjusted coefficient of determination (r2), Akaike's information criterion (AIC), and geometric mean error ratio (GMER) were used to evaluate model performance. The soil data were converted to the USDA (United States Department of Agriculture) standard using PSD models and the fractal concept. The performance of the PSD models was affected by soil texture and by the fraction classification scheme. The performance of the PSD models also varied with the clay content of the soils. The Anderson, Fredlund, modified logistic growth, Skaggs, and Weibull models were the best.
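
    As an illustration of fitting one of the evaluated model families, the sketch below fits a Weibull-type cumulative PSD curve to a small ISSS-style texture record; the four data points are invented for the example.

```python
# Sketch of fitting one of the evaluated PSD model families (a Weibull-type cumulative curve)
# to a four-point ISSS-style texture record.  The data points are invented; the study fitted
# eleven model families to 16,349 measured samples.
import numpy as np
from scipy.optimize import curve_fit

def weibull_cdf(d, a, b):
    return 1.0 - np.exp(-(d / a) ** b)        # cumulative mass fraction finer than diameter d

diam_mm = np.array([0.002, 0.02, 0.2, 2.0])   # ISSS class limits (clay / silt / fine / coarse sand)
cum_frac = np.array([0.18, 0.45, 0.80, 1.00]) # assumed cumulative fractions

(a_hat, b_hat), _ = curve_fit(weibull_cdf, diam_mm, cum_frac,
                              p0=[0.05, 0.5], bounds=([1e-6, 0.05], [10.0, 5.0]))
fitted = weibull_cdf(diam_mm, a_hat, b_hat)
r2 = 1 - np.sum((cum_frac - fitted)**2) / np.sum((cum_frac - cum_frac.mean())**2)
print(f"a = {a_hat:.4f} mm, b = {b_hat:.3f}, r2 = {r2:.3f}")
```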

  9. Early detection of nonnative alleles in fish populations: When sample size actually matters

    Science.gov (United States)

    Croce, Patrick Della; Poole, Geoffrey C.; Payne, Robert A.; Gresswell, Bob

    2017-01-01

    Reliable detection of nonnative alleles is crucial for the conservation of sensitive native fish populations at risk of introgression. Typically, nonnative alleles in a population are detected through the analysis of genetic markers in a sample of individuals. Here we show that common assumptions associated with such analyses yield substantial overestimates of the likelihood of detecting nonnative alleles. We present a revised equation to estimate the likelihood of detecting nonnative alleles in a population with a given level of admixture. The new equation incorporates the effects of the genotypic structure of the sampled population and shows that conventional methods overestimate the likelihood of detection, especially when nonnative or F-1 hybrid individuals are present. Under such circumstances—which are typical of early stages of introgression and therefore most important for conservation efforts—our results show that improved detection of nonnative alleles arises primarily from increasing the number of individuals sampled rather than increasing the number of genetic markers analyzed. Using the revised equation, we describe a new approach to determining the number of individuals to sample and the number of diagnostic markers to analyze when attempting to monitor the arrival of nonnative alleles in native populations.
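
    The conventional calculation that the abstract says overestimates detection can be written down in a few lines, as below; the revised equation accounting for genotypic structure is given in the paper and is not reproduced here, and the admixture level, sample sizes and marker counts are examples.

```python
# The conventional calculation criticized above: if a fraction q of alleles in the population
# is nonnative and alleles are treated as independent draws, the chance of seeing at least one
# nonnative allele in N sampled diploid individuals scored at m diagnostic markers is
# 1 - (1 - q)^(2*N*m).  The paper's revised equation, which accounts for the genotypic
# structure of the sample, is not reproduced here; q, N and m are examples.
def conventional_detection_prob(q, n_individuals, n_markers):
    return 1.0 - (1.0 - q) ** (2 * n_individuals * n_markers)

for n, m in [(10, 4), (20, 4), (10, 8)]:
    print(f"N={n:2d}, markers={m}: P(detect) = {conventional_detection_prob(0.01, n, m):.3f}")
```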

  10. In situ detection of small-size insect pests sampled on traps using multifractal analysis

    Science.gov (United States)

    Xia, Chunlei; Lee, Jang-Myung; Li, Yan; Chung, Bu-Keun; Chon, Tae-Soo

    2012-02-01

    We introduce a multifractal analysis for detecting small-size pests (e.g., whiteflies) in images of a sticky trap in situ. An automatic attraction system is utilized for collecting pests from greenhouse plants. We applied multifractal analysis to segment whitefly images based on local singularity and global image characteristics. According to the theory of multifractal dimension, candidate whitefly blobs are initially defined from the sticky-trap image. Two schemes, fixed thresholding and regional minima detection, were utilized for feature extraction of candidate whitefly image areas. The experiment was conducted with field images in a greenhouse. Detection results were compared with other adaptive segmentation algorithms. The F measure combining precision and recall was higher for the proposed multifractal analysis (96.5%) than for conventional methods such as Watershed (92.2%) and Otsu (73.1%). The true positive rate of the multifractal analysis was 94.3%, with a minimal false positive rate of 1.3%. Detection performance was further tested via human observation. Agreement between manual and automatic counting was markedly higher with multifractal analysis (R2=0.992) than with Watershed (R2=0.895) or Otsu (R2=0.353), ensuring that overall detection of small-size pests is most feasible with multifractal analysis under field conditions.

  11. Sample size requirements for studies of treatment effects on beta-cell function in newly diagnosed type 1 diabetes.

    Directory of Open Access Journals (Sweden)

    John M Lachin

    Full Text Available Preservation of β-cell function as measured by stimulated C-peptide has recently been accepted as a therapeutic target for subjects with newly diagnosed type 1 diabetes. In recently completed studies conducted by the Type 1 Diabetes Trial Network (TrialNet), repeated 2-hour Mixed Meal Tolerance Tests (MMTT) were obtained for up to 24 months from 156 subjects with up to 3 months duration of type 1 diabetes at the time of study enrollment. These data provide the information needed to more accurately determine the sample size needed for future studies of the effects of new agents on the 2-hour area under the curve (AUC) of the C-peptide values. The natural log(x), log(x+1) and square-root (√x) transformations of the AUC were assessed. In general, a transformation of the data is needed to better satisfy the normality assumptions for commonly used statistical tests. Statistical analyses of the raw and transformed data are provided to estimate the mean levels over time and the residual variation in untreated subjects that allow sample size calculations for future studies at either 12 or 24 months of follow-up and among children 8-12 years of age, adolescents (13-17 years) and adults (18+ years). The sample size needed to detect a given relative (percentage) difference with treatment versus control is greater at 24 months than at 12 months of follow-up, and differs among age categories. Owing to greater residual variation among those 13-17 years of age, a larger sample size is required for this age group. Methods are also described for assessment of sample size for mixtures of subjects among the age categories. Statistical expressions are presented for the presentation of analyses of log(x+1) and √x transformed values in terms of the original units of measurement (pmol/ml). Analyses using different transformations are described for the TrialNet study of masked anti-CD20 (rituximab) versus masked placebo. These results provide the information needed to
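
    Residual-variation estimates of this kind typically feed a two-arm sample-size formula on the transformed scale; a generic version is sketched below, with an assumed residual SD and target relative difference that are placeholders rather than the TrialNet estimates.

```python
# Generic two-arm sample-size calculation on the log(x+1) scale, of the kind the residual
# variation estimates above are meant to feed.  The residual SD and the targeted relative
# (percentage) treatment difference are assumed placeholders, not the TrialNet estimates.
import numpy as np
from scipy.stats import norm

def n_per_arm(sd_log, relative_diff, alpha=0.05, power=0.85):
    delta = abs(np.log(1.0 - relative_diff))                # difference of log-means
    return int(np.ceil(2 * sd_log**2 * (norm.ppf(1 - alpha/2) + norm.ppf(power))**2 / delta**2))

print(n_per_arm(sd_log=0.45, relative_diff=0.30))           # detect a 30% relative difference
```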

  12. Mass Balance Model, A study of contamination effects in AMS 14C sample analysis

    NARCIS (Netherlands)

    Prokopiou, Markella

    2010-01-01

    In this training thesis a background correction analysis, also known as mass balance model, was implemented to study the contamination effects in AMS 14C sample processing. A variety of backgrounds and standards with sizes ranging from 50 μg C to 1500 μg

  13. Queuing theory models used for port equipment sizing

    Science.gov (United States)

    Dragu, V.; Dinu, O.; Ruscă, A.; Burciu, Ş.; Roman, E. A.

    2017-08-01

    The significant growth of volumes and distances in road transportation has led to the need for solutions that increase the market share of water transportation, together with the handling and transfer technologies within its terminals. It is widely known that most of the time is consumed within transport terminals (loading/unloading/transfer), hence the need to constantly develop handling techniques and technologies in line with the size of the goods flows, so that the total waiting time of ships within ports is reduced. Port development should be achieved by harmonizing the contradictory interests of the port administration and its users. Port administrators aim to increase profit, whereas users want savings through an increase in consumers' surplus. The difficulty is that the transport demand-supply equilibrium must be reached at costs and goods quantities transiting the port that satisfy the interests of both parties involved. This paper presents a port equipment sizing model that uses queueing theory so that the sum of the costs of ships waiting for operations and of equipment usage is minimized. Ship operation within the port is treated as a queueing (waiting) system whose parameters are then used to determine the main costs for ships and port equipment.
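
    A minimal M/M/c illustration of the cost trade-off described above is sketched below: the number of berths is chosen to minimise the sum of ship waiting costs and equipment costs; the arrival rate, service rate and unit costs are assumed.

```python
# Minimal M/M/c illustration of the sizing trade-off: pick the number of berths/equipment
# units c that minimises (ship waiting cost) + (equipment cost).  Arrival rate, service rate
# and unit costs are assumed for illustration.
import math

def erlang_c(c, a):                        # P(wait) for offered load a = lambda/mu, requires c > a
    summ = sum(a**k / math.factorial(k) for k in range(c))
    top = a**c / math.factorial(c) * c / (c - a)
    return top / (summ + top)

def mean_wait(lam, mu, c):                 # Wq for an M/M/c queue
    a = lam / mu
    return erlang_c(c, a) / (c * mu - lam)

lam, mu = 4.0, 1.0                         # ships/day arriving; ships/day served per berth
wait_cost, berth_cost = 20000.0, 15000.0   # per ship-day waiting; per berth-day

best = None
for c in range(5, 12):
    total = wait_cost * lam * mean_wait(lam, mu, c) + berth_cost * c
    print(f"c={c}: total daily cost ≈ {total:,.0f}")
    best = (total, c) if best is None or total < best[0] else best
print("cost-minimising number of berths:", best[1])
```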

  14. Numerical modelling of riverbed grain size stratigraphic evolution

    Institute of Scientific and Technical Information of China (English)

    Peng HU; Zhi-xian CAO; Gareth PENDER; Huai-han LIU

    2014-01-01

    For several decades, quantification of riverbed grain size stratigraphic evolution has been based upon the active layer formulation (ALF), which unfortunately involves considerable uncertainty. While it is the sediment exchange across the bed surface that directly affects the riverbed stratigraphy, the ALF assumes that the sediment fraction at the lower interface of the active layer is a linear function of the sediment fraction in the flow. Here it is proposed that the sediment fraction of the sediment exchange flux be used directly in estimating the sediment fraction at the lower surface of the active layer. Together with the size-specific mass conservation for riverbed sediment, the modified approach is referred to as the surface-based formulation (SBF). When incorporated into a coupled non-capacity modelling framework for fluvial processes, the SBF leads to results that agree as well as or better than those using the ALF with laboratory and field observations. This is illustrated for typical cases featuring bed aggradation and degradation due to graded bed-load sediment transport. Systematic experiments on graded sediment transport by unsteady flows are warranted to further test the modified formulation.
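
    For context, the classical active-layer (Hirano-type) mass conservation for size fraction k is commonly written as below; the notation follows general usage and may differ from the paper's symbols, and the SBF discussed above changes how the interface fraction f_Ik is specified rather than this balance itself.

```latex
% Classical active-layer (Hirano-type) mass conservation for size fraction k, given for context.
% F_k: fraction in the active layer, L_a: active-layer thickness, f_{Ik}: fraction at the lower
% interface, q_{bk}: fractional bed-load flux, \eta: bed elevation, \lambda_p: bed porosity.
\begin{equation}
(1-\lambda_p)\left[\,\frac{\partial (F_k L_a)}{\partial t}
  + f_{Ik}\,\frac{\partial (\eta - L_a)}{\partial t}\right]
  = -\,\frac{\partial q_{bk}}{\partial x}
\end{equation}
```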

  15. Evaluation of pump pulsation in respirable size-selective sampling: Part III. Investigation of European standard methods.

    Science.gov (United States)

    Soo, Jhy-Charm; Lee, Eun Gyung; Lee, Larry A; Kashon, Michael L; Harper, Martin

    2014-10-01

    Lee et al. (Evaluation of pump pulsation in respirable size-selective sampling: part I. Pulsation measurements. Ann Occup Hyg 2014a;58:60-73) introduced an approach to measure pump pulsation (PP) using a real-world sampling train, while the European Standards (EN) (EN 1232-1997 and EN 12919-1999) suggest measuring PP using a resistor in place of the sampler. The goal of this study is to characterize PP according to both EN methods and to determine the relationship of PP between the published method (Lee et al., 2014a) and the EN methods. Additional test parameters were investigated to determine whether the test conditions suggested by the EN methods were appropriate for measuring pulsations. Experiments were conducted using a factorial combination of personal sampling pumps (six medium- and two high-volumetric flow rate pumps), back pressures (six medium- and seven high-flow rate pumps), resistors (two types), tubing lengths between a pump and resistor (60 and 90 cm), and different flow rates (2 and 2.5 l min(-1) for the medium- and 4.4, 10, and 11.2 l min(-1) for the high-flow rate pumps). The selection of sampling pumps and the ranges of back pressure were based on measurements obtained in the previous study (Lee et al., 2014a). Among six medium-flow rate pumps, only the Gilian5000 and the Apex IS conformed to the 10% criterion specified in EN 1232-1997. Although the AirChek XR5000 exceeded the 10% limit, the average PP (10.9%) was close to the criterion. One high-flow rate pump, the Legacy (PP=8.1%), conformed to the 10% criterion in EN 12919-1999, while the Elite12 did not (PP=18.3%). Conducting supplemental tests with additional test parameters beyond those used in the two subject EN standards did not strengthen the characterization of PPs. For the selected test conditions, a linear regression model [PP(EN) = 0.014 + 0.375 × PP(NIOSH), adjusted R2 = 0.871] was developed to determine the PP relationship between the published method (Lee et al., 2014a) and the EN methods.

  16. Propagation of Uncertainty in System Parameters of a LWR Model by Sampling MCNPX Calculations - Burnup Analysis

    Science.gov (United States)

    Campolina, Daniel de A. M.; Lima, Claubia P. B.; Veloso, Maria Auxiliadora F.

    2014-06-01

    For all of the physical components that comprise a nuclear system there is an associated uncertainty. Assessing the impact of these uncertainties in the simulation of fissionable material systems is essential for best-estimate calculations, which have been replacing conservative model calculations as computational power increases. Propagating uncertainty in a simulation with a Monte Carlo code by sampling the input parameters has only recently become practical because of the huge computational effort required. In this work, a sample space of MCNPX calculations was used to propagate the uncertainty. The sample size was optimized using the Wilks formula for a 95th percentile and a two-sided statistical tolerance interval of 95%. Uncertainties in input parameters of the reactor included geometry dimensions and densities. The results demonstrate the capability of the sampling-based method for burnup calculations when the sample size is optimized and many parameter uncertainties are investigated together in the same input.
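
    The first-order Wilks sample sizes referred to above can be recovered with a few lines of code, as sketched below; this reproduces the familiar 59 (one-sided 95/95) and 93 (two-sided 95/95) run counts and is not the paper's optimisation procedure.

```python
# First-order Wilks tolerance-limit sample sizes.  For a two-sided 95%/95% statistical
# tolerance interval (as used above) the familiar answer is 93 runs; the one-sided
# 95th-percentile case gives 59.  This only reproduces those standard numbers.
def wilks_one_sided(beta=0.95, gamma=0.95):
    n = 1
    while 1 - beta**n < gamma:
        n += 1
    return n

def wilks_two_sided(beta=0.95, gamma=0.95):
    n = 2
    while 1 - beta**n - n * (1 - beta) * beta**(n - 1) < gamma:
        n += 1
    return n

print("one-sided 95/95:", wilks_one_sided())   # 59
print("two-sided 95/95:", wilks_two_sided())   # 93
```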

  17. Basic distribution free identification tests for small size samples of environmental data

    Energy Technology Data Exchange (ETDEWEB)

    Federico, A.G.; Musmeci, F. [ENEA, Centro Ricerche Casaccia, Rome (Italy). Dipt. Ambiente

    1998-01-01

    Testing two or more data sets for the hypothesis that they are sampled from the same population is often required in environmental data analysis. Typically the available samples have a small number of data points and the assumption of normal distributions is often not realistic. On the other hand, the spread of today's powerful personal computers opens up new opportunities based on massive use of CPU resources. The paper reviews the problem and introduces two feasible non-parametric approaches based on the intrinsic equiprobability properties of the data samples. The first is based on full resampling, while the second is based on a bootstrap approach. An easy-to-use program is presented. A case study is given, based on the Chernobyl children contamination data.
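
    In the spirit of the two resampling approaches described above, the sketch below runs a simple distribution-free permutation test of the hypothesis that two small samples come from the same population; the data are synthetic.

```python
# Minimal distribution-free resampling test, in the spirit of the approaches described above
# (exhaustive resampling / bootstrap): permute the pooled data to obtain the null distribution
# of the difference in means.  The data here are synthetic.
import numpy as np

rng = np.random.default_rng(7)
sample_a = rng.lognormal(0.0, 0.5, size=12)        # small environmental samples
sample_b = rng.lognormal(0.3, 0.5, size=10)

observed = abs(sample_a.mean() - sample_b.mean())
pooled = np.concatenate([sample_a, sample_b])

n_perm, hits = 20000, 0
for _ in range(n_perm):
    rng.shuffle(pooled)
    diff = abs(pooled[:len(sample_a)].mean() - pooled[len(sample_a):].mean())
    hits += diff >= observed
print("permutation p-value ≈", (hits + 1) / (n_perm + 1))
```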

  18. Second generation laser-heated microfurnace for the preparation of microgram-sized graphite samples

    Science.gov (United States)

    Yang, Bin; Smith, A. M.; Long, S.

    2015-10-01

    We present construction details and test results for two second-generation laser-heated microfurnaces (LHF-II) used to prepare graphite samples for Accelerator Mass Spectrometry (AMS) at ANSTO. Based on systematic studies aimed at optimising the performance of our prototype laser-heated microfurnace (LHF-I) (Smith et al., 2007 [1]; Smith et al., 2010 [2,3]; Yang et al., 2014 [4]), we have designed the LHF-II to have the following features: (i) it has a small reactor volume of 0.25 mL allowing us to completely graphitise carbon dioxide samples containing as little as 2 μg of C, (ii) it can operate over a large pressure range (0-3 bar) and so has the capacity to graphitise CO2 samples containing up to 100 μg of C; (iii) it is compact, with three valves integrated into the microfurnace body, (iv) it is compatible with our new miniaturised conventional graphitisation furnaces (MCF), also designed for small samples, and shares a common vacuum system. Early tests have shown that the extraneous carbon added during graphitisation in each LHF-II is of the order of 0.05 μg, assuming 100 pMC activity, similar to that of the prototype unit. We use a 'budget' fibre packaged array for the diode laser with custom built focusing optics. The use of a new infrared (IR) thermometer with a short focal length has allowed us to decrease the height of the light-proof safety enclosure. These innovations have produced a cheaper and more compact device. As with the LHF-I, feedback control of the catalyst temperature and logging of the reaction parameters is managed by a LabVIEW interface.

  19. Second generation laser-heated microfurnace for the preparation of microgram-sized graphite samples

    Energy Technology Data Exchange (ETDEWEB)

    Yang, Bin; Smith, A.M.; Long, S.

    2015-10-15

    We present construction details and test results for two second-generation laser-heated microfurnaces (LHF-II) used to prepare graphite samples for Accelerator Mass Spectrometry (AMS) at ANSTO. Based on systematic studies aimed at optimising the performance of our prototype laser-heated microfurnace (LHF-I) (Smith et al., 2007 [1]; Smith et al., 2010 [2,3]; Yang et al., 2014 [4]), we have designed the LHF-II to have the following features: (i) it has a small reactor volume of 0.25 mL allowing us to completely graphitise carbon dioxide samples containing as little as 2 μg of C, (ii) it can operate over a large pressure range (0–3 bar) and so has the capacity to graphitise CO2 samples containing up to 100 μg of C; (iii) it is compact, with three valves integrated into the microfurnace body, (iv) it is compatible with our new miniaturised conventional graphitisation furnaces (MCF), also designed for small samples, and shares a common vacuum system. Early tests have shown that the extraneous carbon added during graphitisation in each LHF-II is of the order of 0.05 μg, assuming 100 pMC activity, similar to that of the prototype unit. We use a ‘budget’ fibre packaged array for the diode laser with custom built focusing optics. The use of a new infrared (IR) thermometer with a short focal length has allowed us to decrease the height of the light-proof safety enclosure. These innovations have produced a cheaper and more compact device. As with the LHF-I, feedback control of the catalyst temperature and logging of the reaction parameters is managed by a LabVIEW interface.

  20. Simple capture-recapture models permitting unequal catchability and variable sampling effort.

    Science.gov (United States)

    Agresti, A

    1994-06-01

    We consider two capture-recapture models that imply that the logit of the probability of capture is an additive function of an animal catchability parameter and a parameter reflecting the sampling effort. The models are special cases of the Rasch model, and satisfy the property of quasi-symmetry. One model is log-linear and the other is a latent class model. For the log-linear model, point and interval estimates of the population size are easily obtained using standard software, such as GLIM.
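
    The simplest special case of such models, a two-occasion log-linear fit under independence, is sketched below; under that model the projected unobserved cell reproduces the Lincoln-Petersen estimate, whereas the paper's Rasch-type models add catchability and sampling-effort terms. The counts are illustrative.

```python
# Simplest two-occasion special case of a log-linear capture-recapture model: under
# independence the unobserved cell n00 is projected from the observed cells, which is the
# Lincoln-Petersen estimate.  The paper's models add animal-catchability and sampling-effort
# terms; the counts below are illustrative.
n11, n10, n01 = 27, 45, 38            # caught both times / first only / second only

n00_hat = n10 * n01 / n11             # projected never-captured cell (independence model)
N_hat = n11 + n10 + n01 + n00_hat
print(f"estimated population size N ≈ {N_hat:.1f}")
```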

  1. Sizes and ages of SDSS ellipticals: Comparison with hierarchical galaxy formation models

    CERN Document Server

    Shankar, Francesco; Bernardi, Mariangela; Dai, Xinyu; Hyde, Joseph B; Sheth, Ravi K

    2009-01-01

    In a sample of about 45,700 early-type galaxies extracted from SDSS, we find that the shape, normalization, and dispersion around the mean size-stellar mass relation is the same for young and old systems, provided the stellar mass is greater than 3*10^10 Msun. This is difficult to reproduce in pure passive evolution models, which generically predict older galaxies to be much more compact than younger ones of the same stellar mass. However, this aspect of our measurements is well reproduced by hierarchical models of galaxy formation. Whereas the models predict more compact galaxies at high redshifts, subsequent minor, dry mergers increase the sizes of the more massive objects, resulting in a flat size-age relation at the present time. At lower masses, the models predict that mergers are less frequent, so that the expected anti-correlation between age and size is not completely erased. This is in good agreement with our data: below 3*10^10 Msun, the effective radius R_e is a factor of ~2 lower for older galaxie...

  2. [Sample size for the estimation of F-wave parameters in healthy volunteers and amyotrophic lateral sclerosis patients].

    Science.gov (United States)

    Fang, J; Cui, L Y; Liu, M S; Guan, Y Z; Ding, Q Y; Du, H; Li, B H; Wu, S

    2017-03-07

    Objective: The study aimed to investigate whether the sample sizes required for F-wave studies differ according to the nerve examined, the F-wave parameter of interest, and whether amyotrophic lateral sclerosis (ALS) patients or healthy subjects are studied. Methods: The F-waves in the median, ulnar, tibial, and deep peroneal nerves of 55 ALS patients and 52 healthy subjects were studied to assess the effect of sample size on the accuracy of measurements of the following F-wave parameters: minimum latency, maximum latency, mean latency, persistence, chronodispersion, and mean and maximum amplitude. A hundred stimuli were used in the F-wave study. The values obtained from 100 stimuli were considered "true" values and were compared with the corresponding values from smaller samples of 20, 40, 60 and 80 stimuli. F-wave parameters obtained from different sample sizes were compared between the ALS patients and the normal controls. Results: Significant differences were not detected with samples above 60 stimuli for chronodispersion in all four nerves in normal participants. Significant differences were not detected with samples above 40 stimuli for maximum F-wave amplitude in the median, ulnar and tibial nerves in normal participants. When comparing ALS patients and normal controls, significant differences were detected for F-wave latency in the median nerve (Z=-3.560; Z=-3.243), F-wave chronodispersion (Z=-3.152), F-wave persistence in the median nerve (Z=6.139), F-wave amplitude in the tibial nerve (t=2.981), F-wave amplitude in the ulnar nerve (Z=-2.134; Z=-2.552), F-wave persistence in the tibial nerve (Z=2.119), and F-wave amplitude in the peroneal nerve (t=2.693). Conclusions: The sample size required for an F-wave study differed according to the nerve, the F-wave parameter, and whether ALS patients or healthy subjects were studied.

  3. Investigating effects of sample pretreatment on protein stability using size-exclusion chromatography and high-resolution continuum source atomic absorption spectrometry.

    Science.gov (United States)

    Rakow, Tobias; El Deeb, Sami; Hahne, Thomas; El-Hady, Deia Abd; AlBishri, Hassan M; Wätzig, Hermann

    2014-09-01

    In this study, size-exclusion chromatography and high-resolution atomic absorption spectrometry methods have been developed and evaluated to test the stability of proteins during sample pretreatment. This especially includes different storage conditions but also adsorption before or even during the chromatographic process. For the development of the size exclusion method, a Biosep S3000 5 μm column was used for investigating a series of representative model proteins, namely bovine serum albumin, ovalbumin, monoclonal immunoglobulin G antibody, and myoglobin. Ambient temperature storage was found to be harmful to all model proteins, whereas short-term storage up to 14 days could be done in an ordinary refrigerator. Freezing the protein solutions was always complicated and had to be evaluated for each protein in the corresponding solvent. To keep the proteins in their native state a gentle freezing temperature should be chosen, hence liquid nitrogen should be avoided. Furthermore, a high-resolution continuum source atomic absorption spectrometry method was developed to observe the adsorption of proteins on container material and chromatographic columns. Adsorption to any container led to a sample loss and lowered the recovery rates. During the pretreatment and high-performance size-exclusion chromatography, adsorption caused sample losses of up to 33%.

  4. Sample size determinations for group-based randomized clinical trials with different levels of data hierarchy between experimental and control arms.

    Science.gov (United States)

    Heo, Moonseong; Litwin, Alain H; Blackstock, Oni; Kim, Namhee; Arnsten, Julia H

    2017-02-01

    We derived sample size formulae for detecting main effects in group-based randomized clinical trials with different levels of data hierarchy between experimental and control arms. Such designs are necessary when experimental interventions need to be administered to groups of subjects whereas control conditions need to be administered to individual subjects. This type of trial, often referred to as a partially nested or partially clustered design, has been implemented for management of chronic diseases such as diabetes and is beginning to emerge more commonly in wider clinical settings. Depending on the research setting, the level of hierarchy of data structure for the experimental arm can be three or two, whereas that for the control arm is two or one. Such different levels of data hierarchy assume correlation structures of outcomes that are different between arms, regardless of whether research settings require two or three level data structure for the experimental arm. Therefore, the different correlations should be taken into account for statistical modeling and for sample size determinations. To this end, we considered mixed-effects linear models with different correlation structures between experimental and control arms to theoretically derive and empirically validate the sample size formulae with simulation studies.

  5. Evidence for a Global Sampling Process in Extraction of Summary Statistics of Item Sizes in a Set.

    Science.gov (United States)

    Tokita, Midori; Ueda, Sachiyo; Ishiguchi, Akira

    2016-01-01

    Several studies have shown that our visual system may construct a "summary statistical representation" over groups of visual objects. Although there is a general understanding that human observers can accurately represent sets of a variety of features, many questions on how summary statistics, such as an average, are computed remain unanswered. This study investigated sampling properties of visual information used by human observers to extract two types of summary statistics of item sets, average and variance. We presented three models of ideal observers to extract the summary statistics: a global sampling model without sampling noise, global sampling model with sampling noise, and limited sampling model. We compared the performance of an ideal observer of each model with that of human observers using statistical efficiency analysis. Results suggest that summary statistics of items in a set may be computed without representing individual items, which makes it possible to discard the limited sampling account. Moreover, the extraction of summary statistics may not necessarily require the representation of individual objects with focused attention when the sets of items are larger than 4.

  6. Laboratory simulation and modeling of size, shape distributed interstellar graphite dust analogues: A comparative study

    Science.gov (United States)

    Boruah, Manash J.; Gogoi, Ankur; Ahmed, Gazi A.

    2016-06-01

    The computation of the light scattering properties of size and shape distributed interstellar graphite dust analogues using discrete dipole approximation (DDA) is presented. The light scattering properties of dust particles of arbitrary shapes having sizes ranging from 0.5 to 5.0 μm were computed using DDSCAT 7.3.0 software package and an indigenously developed post-processing tool for size and shape averaging. In order to model realistic samples of graphite dust and compute their light scattering properties using DDA, different target geometries were generated to represent the graphite particle composition in terms of surface smoothness, surface roughness and aggregation or their combination, for using as the target for DDSCAT calculations. A comparison of the theoretical volume scattering function at 543.5 nm and 632.8 nm incident wavelengths with laboratory simulation is also presented in this paper.

  7. Sampling considerations when analyzing micrometric-sized particles in a liquid jet using laser induced breakdown spectroscopy

    Energy Technology Data Exchange (ETDEWEB)

    Faye, C.B.; Amodeo, T.; Fréjafon, E. [Institut National de l' Environnement Industriel et des Risques (INERIS/DRC/CARA/NOVA), Parc Technologique Alata, BP 2, 60550 Verneuil-En-Halatte (France); Delepine-Gilon, N. [Institut des Sciences Analytiques, 5 rue de la Doua, 69100 Villeurbanne (France); Dutouquet, C., E-mail: christophe.dutouquet@ineris.fr [Institut National de l' Environnement Industriel et des Risques (INERIS/DRC/CARA/NOVA), Parc Technologique Alata, BP 2, 60550 Verneuil-En-Halatte (France)

    2014-01-01

    Pollution of water is a matter of concern all over the earth. Particles are known to play an important role in the transportation of pollutants in this medium. In addition, the emergence of new materials such as NOAA (Nano-Objects, their Aggregates and their Agglomerates) emphasizes the need to develop adapted instruments for their detection. Surveillance of pollutants in particulate form in waste waters in industries involved in nanoparticle manufacturing and processing is a telling example of possible applications of such instrumental development. The LIBS (laser-induced breakdown spectroscopy) technique coupled with the liquid jet as sampling mode for suspensions was deemed as a potential candidate for on-line and real time monitoring. With the final aim in view to obtain the best detection limits, the interaction of nanosecond laser pulses with the liquid jet was examined. The evolution of the volume sampled by laser pulses was estimated as a function of the laser energy applying conditional analysis when analyzing a suspension of micrometric-sized particles of borosilicate glass. An estimation of the sampled depth was made. Along with the estimation of the sampled volume, the evolution of the SNR (signal to noise ratio) as a function of the laser energy was investigated as well. Eventually, the laser energy and the corresponding fluence optimizing both the sampling volume and the SNR were determined. The obtained results highlight intrinsic limitations of the liquid jet sampling mode when using 532 nm nanosecond laser pulses with suspensions. - Highlights: • Micrometric-sized particles in suspensions are analyzed using LIBS and a liquid jet. • The evolution of the sampling volume is estimated as a function of laser energy. • The sampling volume happens to saturate beyond a certain laser fluence. • Its value was found much lower than the beam diameter times the jet thickness. • Particles proved not to be entirely vaporized.

  8. Power and sample size calculations in the presence of phenotype errors for case/control genetic association studies

    Directory of Open Access Journals (Sweden)

    Finch Stephen J

    2005-04-01

    Full Text Available Abstract Background Phenotype error causes reduction in power to detect genetic association. We present a quantification of the effect of phenotype error, also known as diagnostic error, on power and sample size calculations for case-control genetic association studies between a marker locus and a disease phenotype. We consider the classic Pearson chi-square test for independence as our test of genetic association. To determine asymptotic power analytically, we compute the distribution's non-centrality parameter, which is a function of the case and control sample sizes, genotype frequencies, disease prevalence, and phenotype misclassification probabilities. We derive the non-centrality parameter in the presence of phenotype errors and equivalent formulas for misclassification cost (the percentage increase in the minimum sample size needed to maintain constant asymptotic power at a fixed significance level for each percentage increase in a given misclassification parameter). We use a linear Taylor series approximation for the cost of phenotype misclassification to determine lower bounds for the relative costs of misclassifying a true affected (respectively, unaffected) individual as a control (respectively, case). Power is verified by computer simulation. Results Our major findings are that: (i) the median absolute difference between analytic power with our method and simulation power was 0.001 and the absolute difference was no larger than 0.011; (ii) as the disease prevalence approaches 0, the cost of misclassifying an unaffected individual as a case becomes infinitely large, while the cost of misclassifying an affected individual as a control approaches 0. Conclusion Our work enables researchers to specifically quantify power loss and minimum sample size requirements in the presence of phenotype errors, thereby allowing for more realistic study design. For most diseases of current interest, verifying that cases are correctly classified is of paramount importance.
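
    The link between the non-centrality parameter and asymptotic power can be made concrete as below; the per-subject non-centrality values are placeholders rather than quantities derived from the paper's genetic model.

```python
# Asymptotic power of a Pearson chi-square test from its non-centrality parameter (ncp), and
# the sample-size inflation when phenotype misclassification shrinks the per-subject ncp.
# The per-subject ncp values are placeholders, not the paper's genetic-model quantities.
from scipy.stats import chi2, ncx2

def power(ncp, df=2, alpha=0.05):
    crit = chi2.ppf(1 - alpha, df)
    return 1 - ncx2.cdf(crit, df, ncp)

n = 1000
ncp_clean, ncp_error = 0.012 * n, 0.009 * n     # per-subject ncp without / with phenotype error
print("power, no error:   ", round(power(ncp_clean), 3))
print("power, with error: ", round(power(ncp_error), 3))
print("misclassification cost ≈ {:.0f}% more subjects".format(100 * (0.012 / 0.009 - 1)))
```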

  9. THE BERRY-ESSEEN BOUND AND LIL OF PP CM STATISTIC TESTING UNIFORM DISTRIBUTION ON SPHERE ABOUT DIMENSION AND SAMPLE SIZE

    Institute of Scientific and Technical Information of China (English)

    LIU Yixing; CHENG Ping

    2000-01-01

    Cheng [1] gave the limit distribution of the weighted PP Cramér-von Mises test statistic when dimension and sample size tend to infinity simultaneously, under the underlying distribution being the uniform distribution on S^(p-1) = {a : ‖a‖ = 1, a ∈ R^p}; this limit distribution is the standard normal distribution N(0, 1). In this paper, we give the Berry-Esseen bound of this statistic converging to the normal distribution and the law of the iterated logarithm.

  10. Ultrasonic Techniques for Air Void Size Distribution and Property Evaluation in Both Early-Age and Hardened Concrete Samples

    Directory of Open Access Journals (Sweden)

    Shuaicheng Guo

    2017-03-01

    Full Text Available Entrained air voids can improve the freeze-thaw durability of concrete and also affect its mechanical and transport properties. Therefore, it is important to measure the air void structure and understand its influence on concrete performance for quality control. This paper aims to measure the evolution of the air void structure at both the early-age and hardened stages with an ultrasonic technique, and evaluates its influence on concrete properties. Three samples with different air entrainment agent contents were specially prepared. The air void structure was determined with an optimized inverse analysis that minimizes the error between experimental and theoretical attenuation. The early-age measurements showed that the air void content over the whole size range slightly decreases with curing time. The air void size distribution of the hardened samples (at Day 28) was compared with American Society for Testing and Materials (ASTM) C457 test results. The air void size distributions obtained with different amounts of air entrainment agent were also favorably compared. In addition, the transport properties, compressive strength, and dynamic modulus of the concrete samples were evaluated. The transport property of the concrete decreased with curing age, in accordance with the shrinkage of the air voids. The correlation of early-age strength development and hardened dynamic modulus with the ultrasonic parameters was also evaluated. The existence of clustered air voids in the interfacial transition zone (ITZ) area was found to cause severe compressive strength loss. The results indicate that the developed ultrasonic technique has potential for air void size distribution measurement, and demonstrate the influence of air void structure evolution on concrete properties during both the early-age and hardened stages.

  11. The N-pact factor: evaluating the quality of empirical journals with respect to sample size and statistical power.

    Science.gov (United States)

    Fraley, R Chris; Vazire, Simine

    2014-01-01

    The authors evaluate the quality of research reported in major journals in social-personality psychology by ranking those journals with respect to their N-pact Factors (NF): the statistical power of the empirical studies they publish to detect typical effect sizes. Power is a particularly important attribute for evaluating research quality because, relative to studies that have low power, studies that have high power are more likely to (a) provide accurate estimates of effects, (b) produce literatures with low false-positive rates, and (c) lead to replicable findings. The authors show that the average sample size in social-personality research is 104 and that the power to detect the typical effect size in the field is approximately 50%. Moreover, they show that there is considerable variation among journals in the sample sizes and power of the studies they publish, with some journals consistently publishing higher-power studies than others. The authors hope that these rankings will be of use to authors who are choosing where to submit their best work, provide hiring and promotion committees with a superior way of quantifying journal quality, and encourage competition among journals to improve their NF rankings.
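
    For readers who want to reproduce the flavor of this calculation, the approximate two-tailed power to detect a correlation at a given sample size can be obtained from the Fisher z transformation. The sketch below assumes a "typical" effect size of r = 0.20 and alpha = 0.05; both are assumptions made for illustration, not values quoted from the article.

```python
# Approximate power for detecting a correlation r at sample size n (two-tailed),
# via the Fisher z transformation. r = 0.20 is an assumed "typical" effect size.
import math
from scipy.stats import norm

def correlation_power(r, n, alpha=0.05):
    z_effect = math.atanh(r) * math.sqrt(n - 3)   # expected standardized statistic
    z_crit = norm.ppf(1 - alpha / 2)
    return norm.sf(z_crit - z_effect) + norm.cdf(-z_crit - z_effect)

print(round(correlation_power(0.20, 104), 2))     # ~0.53, i.e. roughly 50% power
```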

  12. Performance of Random Effects Model Estimators under Complex Sampling Designs

    Science.gov (United States)

    Jia, Yue; Stokes, Lynne; Harris, Ian; Wang, Yan

    2011-01-01

    In this article, we consider estimation of parameters of random effects models from samples collected via complex multistage designs. Incorporation of sampling weights is one way to reduce estimation bias due to unequal probabilities of selection. Several weighting methods have been proposed in the literature for estimating the parameters of…

  13. Alpha spectrometric characterization of process-related particle size distributions from active particle sampling at the Los Alamos National Laboratory uranium foundry

    Energy Technology Data Exchange (ETDEWEB)

    Plionis, Alexander A [Los Alamos National Laboratory; Peterson, Dominic S [Los Alamos National Laboratory; Tandon, Lav [Los Alamos National Laboratory; Lamont, Stephen P [Los Alamos National Laboratory

    2009-01-01

    Uranium particles within the respirable size range pose a significant hazard to the health and safety of workers. Significant differences in the deposition and incorporation patterns of aerosols within the respirable range can be identified and integrated into sophisticated health physics models. Data characterizing the uranium particle size distribution resulting from specific foundry-related processes are needed. Using personal air sampling cascade impactors, particles collected from several foundry processes were sorted by activity median aerodynamic diameter onto various Marple substrates. After an initial gravimetric assessment of each impactor stage, the substrates were analyzed by alpha spectrometry to determine the uranium content of each stage. Alpha spectrometry provides rapid nondestructive isotopic data that can distinguish process uranium from natural sources and the degree of uranium contribution to the total accumulated particle load. In addition, the particle size bins utilized by the impactors provide adequate resolution to determine whether a process particle size distribution is lognormal, bimodal, or trimodal. Data on process uranium particle size values and distributions facilitate the development of more sophisticated and accurate models for internal dosimetry, resulting in an improved understanding of foundry worker health and safety.

  14. A GMM-Based Test for Normal Disturbances of the Heckman Sample Selection Model

    Directory of Open Access Journals (Sweden)

    Michael Pfaffermayr

    2014-10-01

    Full Text Available The Heckman sample selection model relies on the assumption of normal and homoskedastic disturbances. However, before considering more general, alternative semiparametric models that do not need the normality assumption, it seems useful to test this assumption. Following Meijer and Wansbeek (2007), the present contribution derives a GMM-based pseudo-score LM test on whether the third and fourth moments of the disturbances of the outcome equation of the Heckman model conform to those implied by the truncated normal distribution. The test is easy to calculate, and in Monte Carlo simulations it shows good performance for sample sizes of 1000 or larger.
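
    The GMM pseudo-score LM statistic itself is specific to the Heckman outcome equation, but the underlying idea of checking third and fourth moments against those implied by normality can be illustrated with a much simpler moment-based residual check; the Jarque-Bera test below is a stand-in for illustration, not the test derived in the paper.

```python
# Illustration only: a skewness/kurtosis (Jarque-Bera) check on residuals.
# This is NOT the GMM pseudo-score LM test of the paper, which targets the
# truncated-normal moments of the Heckman outcome equation.
import numpy as np
from scipy.stats import jarque_bera

rng = np.random.default_rng(0)
residuals = rng.normal(size=1000)             # placeholder residuals

stat, pvalue = jarque_bera(residuals)
print(f"JB = {stat:.2f}, p = {pvalue:.3f}")   # large p-value: no evidence against normality
```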

  15. A Rounding by Sampling Approach to the Minimum Size k-Arc Connected Subgraph Problem

    CERN Document Server

    Laekhanukit, Bundit; Singh, Mohit

    2012-01-01

    In the k-arc connected subgraph problem, we are given a directed graph G and an integer k, and the goal is to find a subgraph of minimum cost such that there are at least k arc-disjoint paths between any pair of vertices. We give a simple (1 + 1/k)-approximation to the unweighted variant of the problem, where all arcs of G have the same cost. This improves on the 1 + 2/k approximation of Gabow et al. [GGTW09]. Similar to the 2-approximation algorithm for this problem [FJ81], our algorithm simply takes the union of a k in-arborescence and a k out-arborescence. The main difference is in the selection of the two arborescences. Here, inspired by the recent applications of the rounding by sampling method (see e.g. [AGM+ 10, MOS11, OSS11, AKS12]), we select the arborescences randomly by sampling from a distribution on unions of k arborescences that is defined based on an extreme point solution of the linear programming relaxation of the problem. In the analysis, we crucially utilize the sparsity property of the ext...

  16. RNA Profiling for Biomarker Discovery: Practical Considerations for Limiting Sample Sizes

    Directory of Open Access Journals (Sweden)

    Danny J. Kelly

    2005-01-01

    Full Text Available We have compared microarray data generated on Affymetrix™ chips from standard (8 micrograms) or low (100 nanograms) amounts of total RNA. We evaluated the gene signals and gene fold-change estimates obtained from the two methods and validated a subset of the results by real-time polymerase chain reaction assays. The correlation of low-RNA-derived gene signals with gene signals obtained from standard RNA was poor for genes of low to moderate abundance. Genes with high abundance showed better correlation in signals between the two methods. The signal correlation between the low RNA and standard RNA methods was improved by including a reference sample in the microarray analysis. In contrast, the fold-change estimates for genes were better correlated between the two methods regardless of the magnitude of gene signals. A reference-sample-based method is suggested for studies that would end up comparing gene signal data from a combination of low and standard RNA templates; no such referencing appears to be necessary when comparing fold-changes of gene expression between standard and low template reactions.

  17. Shrinkage-based diagonal Hotelling’s tests for high-dimensional small sample size data

    KAUST Repository

    Dong, Kai

    2015-09-16

    DNA sequencing techniques bring novel tools and also statistical challenges to genetic research. In addition to detecting differentially expressed genes, testing the significance of gene sets or pathway analysis has been recognized as an equally important problem. Owing to the “large p, small n” paradigm, the traditional Hotelling’s T2 test suffers from the singularity problem and therefore is not valid in this setting. In this paper, we propose a shrinkage-based diagonal Hotelling’s test for both one-sample and two-sample cases. We also suggest several different ways to derive the approximate null distribution under different scenarios of p and n for our proposed shrinkage-based test. Simulation studies show that the proposed method performs comparably to existing competitors when n is moderate or large, but it is better when n is small. In addition, we analyze four gene expression data sets and they demonstrate the advantage of our proposed shrinkage-based diagonal Hotelling’s test.
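
    A minimal sketch of a diagonal Hotelling-type statistic with variance shrinkage is given below; the shrinkage rule (toward the median per-feature variance) and the absence of a reference null distribution are simplifying assumptions made for illustration, not the estimator or the approximate null distributions derived in the paper.

```python
# Sketch of a one-sample diagonal Hotelling-type statistic with a simple
# variance shrinkage toward the median variance (illustrative assumptions only).
import numpy as np

def diagonal_t2(X, mu0, lam=0.5):
    """X: n x p data matrix, mu0: hypothesized mean vector, lam: shrinkage weight."""
    n, p = X.shape
    xbar = X.mean(axis=0)
    s2 = X.var(axis=0, ddof=1)                         # per-feature sample variances
    s2_shrunk = (1 - lam) * s2 + lam * np.median(s2)   # shrink toward the median variance
    return n * np.sum((xbar - mu0) ** 2 / s2_shrunk)

rng = np.random.default_rng(1)
X = rng.normal(size=(10, 200))                         # "large p, small n" toy data
print(diagonal_t2(X, mu0=np.zeros(200)))
```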

  18. Dealing with large sample sizes: comparison of a new one spot dot blot method to western blot.

    Science.gov (United States)

    Putra, Sulistyo Emantoko Dwi; Tsuprykov, Oleg; Von Websky, Karoline; Ritter, Teresa; Reichetzeder, Christoph; Hocher, Berthold

    2014-01-01

    Western blot is the gold standard method to determine individual protein expression levels. However, western blot is technically difficult to perform with large sample sizes because it is a time-consuming and labor-intensive process. Dot blot is often used instead when dealing with large sample sizes, but the main disadvantage of existing dot blot techniques is the absence of signal normalization to a housekeeping protein. In this study we established a one dot two development signals (ODTDS) dot blot method employing two different signal development systems. The first signal, from the protein of interest, was detected by horseradish peroxidase (HRP). The second signal, detecting the housekeeping protein, was obtained by using alkaline phosphatase (AP). Inter-assay variation within ODTDS dot blot and western blot and intra-assay variation between the two methods were low (1.04-5.71%) as assessed by the coefficient of variation. The ODTDS dot blot technique can be used instead of western blot when dealing with large sample sizes without a reduction in accuracy.

  19. The Effect of Small Calibration Sample Sizes on TOEFL IRT-Based Equating.

    Science.gov (United States)

    Tang, K. Linda; And Others

    This study compared the performance of the LOGIST and BILOG computer programs on item response theory (IRT) based scaling and equating for the Test of English as a Foreign Language (TOEFL) using real and simulated data and two calibration structures. Applications of IRT for the TOEFL program are based on the three-parameter logistic (3PL) model.…
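
    For context, the 3PL model mentioned above gives the probability of a correct item response as a logistic function of examinee ability with discrimination, difficulty, and guessing parameters. The sketch below uses one common parameterization (without the 1.7 scaling constant) and made-up parameter values.

```python
# Three-parameter logistic (3PL) IRT item response function; parameter values
# below are made up for illustration.
import math

def p_correct_3pl(theta, a, b, c):
    """theta: ability, a: discrimination, b: difficulty, c: guessing parameter."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

print(p_correct_3pl(theta=0.5, a=1.2, b=0.0, c=0.2))   # ~0.72
```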

  20. Bayesian forecasting of recurrent earthquakes and predictive performance for a small sample size

    Science.gov (United States)

    Nomura, S.; Ogata, Y.; Komaki, F.; Toda, S.

    2011-04-01

    This paper presents a Bayesian method of probability forecasting for the renewal of recurrent earthquakes. When only limited records of characteristic earthquakes on a fault are available, relevant prior distributions for renewal model parameters are essential to computing unbiased, stable time-dependent earthquake probabilities. We also use event slip and geological slip rate data combined with historical earthquake records to improve our forecast model. We apply the Brownian Passage Time (BPT) model and make use of the best-fit prior distribution for its coefficient of variation (the shape parameter, alpha) relative to the mean recurrence time, because the Earthquake Research Committee (ERC) of Japan uses the BPT model for long-term forecasting. Currently, more than 110 active faults have been evaluated by the ERC, but most include very few paleoseismic events. We objectively select the prior distribution with the Akaike Bayesian Information Criterion using all available recurrence data, including the ERC datasets. These data also include mean recurrence times estimated from slip per event divided by long-term slip rate. By comparing the goodness of fit to the historical record and simulated data, we show that the proposed predictor provides more stable performance than plug-in predictors, such as maximum likelihood estimates and the predictor currently adopted by the ERC.
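
    The BPT renewal density is equivalent to an inverse Gaussian distribution with mean recurrence time mu and aperiodicity (coefficient of variation) alpha, so a time-dependent forecast of the kind described above can be sketched as follows; the fault parameters and forecast window are made-up values, not results from the paper.

```python
# Sketch of a BPT renewal forecast: conditional probability of an event in the
# next `window` years given `t_elapsed` years since the last event. The BPT
# density equals an inverse Gaussian with mean mu and coefficient of variation
# alpha; the numbers below are hypothetical.
from scipy.stats import invgauss

def bpt_conditional_prob(t_elapsed, window, mu, alpha):
    dist = invgauss(alpha**2, scale=mu / alpha**2)   # mean mu, CV alpha
    return (dist.cdf(t_elapsed + window) - dist.cdf(t_elapsed)) / dist.sf(t_elapsed)

print(bpt_conditional_prob(t_elapsed=800, window=30, mu=1000, alpha=0.24))
```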

  1. Distribution of human waste samples in relation to sizing waste processing in space

    Science.gov (United States)

    Parker, Dick; Gallagher, S. K.

    1992-01-01

    Human waste processing for closed ecological life support systems (CELSS) in space requires that there be an accurate knowledge of the quantity of wastes produced. Because initial CELSS will be handling relatively few individuals, it is important to know the variation that exists in the production of wastes rather than relying upon mean values that could result in undersizing equipment for a specific crew. On the other hand, because of the costs of orbiting equipment, it is important to design the equipment with a minimum of excess capacity because of the weight that extra capacity represents. A considerable quantity of information that had been independently gathered on waste production was examined in order to obtain estimates of equipment sizing requirements for handling waste loads from crews of 2 to 20 individuals. The recommended design for a crew of 8 should hold 34.5 liters per day (4315 ml/person/day) for urine and stool water and a little more than 1.25 kg per day (154 g/person/day) of human waste solids and sanitary supplies.

  2. The effect of sample size on fresh plasma thromboplastin ISI determination

    DEFF Research Database (Denmark)

    Poller, L; Van Den Besselaar, A M; Jespersen, J;

    1999-01-01

    The possibility of reduction of numbers of fresh coumarin and normal plasmas has been studied in a multicentre manual prothrombin (PT) calibration of high international sensitivity index (ISI) rabbit and low ISI human reference thromboplastins at 14 laboratories. The number of calibrant plasmas...... was reduced progressively by a computer program which generated random numbers to provide 1000 different selections for each reduced sample at each participant laboratory. Results were compared with those of the full set of 20 normal and 60 coumarin plasma calibrations. With the human reagent, 20 coumarins...... and seven normals still achieved the W.H.O. precision limit (3% CV of the slope), but with the rabbit reagent reduction coumarins with 17 normal plasmas led to unacceptable CV. Little reduction of numbers from the full set of 80 fresh plasmas appears advisable. For maximum confidence, when calibrating...

  3. Determination of minimum sample size for fault diagnosis of automobile hydraulic brake system using power analysis

    Directory of Open Access Journals (Sweden)

    V. Indira

    2015-03-01

    Full Text Available The hydraulic brake is considered to be one of the most important components in automobile engineering. Condition monitoring and fault diagnosis of such a component are essential for the safety of passengers and vehicles and to minimize unexpected maintenance time. A vibration-based machine learning approach to condition monitoring of the hydraulic brake system is gaining momentum. Training and testing the classifier are two important activities in the process of feature classification. This study proposes a systematic statistical method called power analysis to find the minimum number of samples required to train the classifier with statistical stability so as to obtain good classification accuracy. Descriptive statistical features have been used, and the most contributory features have been selected using the C4.5 decision tree algorithm. The results of the power analysis have also been verified using a decision tree algorithm, namely C4.5.
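
    As a generic illustration of solving for a minimum sample size by power analysis (not the exact procedure used in the study), one can fix an assumed effect size, significance level, and target power and solve for n; the effect size below is an assumption.

```python
# Generic power analysis sketch: minimum n per group for a two-sample t-test
# under an assumed effect size (Cohen's d = 0.8), alpha = 0.05 and power = 0.90.
from statsmodels.stats.power import TTestIndPower

n_per_group = TTestIndPower().solve_power(effect_size=0.8, alpha=0.05, power=0.90)
print(round(n_per_group))   # ~34 per group under these assumptions
```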

  4. Strategies for minimizing sample size for use in airborne LiDAR-based forest inventory

    Science.gov (United States)

    Junttila, Virpi; Finley, Andrew O.; Bradford, John B.; Kauranne, Tuomo

    2013-01-01

    Recently, airborne Light Detection And Ranging (LiDAR) has emerged as a highly accurate remote sensing modality for use in operational-scale forest inventories. Inventories conducted with the help of LiDAR are most often model-based, i.e. they use variables derived from LiDAR point clouds as the predictive variables, which are calibrated using field plots. The measurement of the necessary field plots is a time-consuming and statistically sensitive process. Because of this, current practice often presumes that hundreds of plots will be collected. But since these plots are only used to calibrate regression models, it should be possible to minimize the number of plots needed by carefully selecting the plots to be measured. In the current study, we compare several systematic and random methods for calibration plot selection, with the specific aim that they be used in LiDAR-based regression models for forest parameters, especially above-ground biomass. The primary criteria compared are based both on spatial representativity and on coverage of the variability of the measured forest features. In the former case, it is also important to take into account spatial auto-correlation between the plots. The results indicate that choosing the plots in a way that ensures ample coverage of both spatial and feature-space variability improves the performance of the corresponding models, and that adequate coverage of the variability in the feature space is the most important condition that should be met by the set of plots collected.
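
    One simple way to operationalize "coverage of feature-space variability" (a sketch under stated assumptions, not one of the specific selection strategies compared in the study) is to cluster candidate plots on their LiDAR metrics and keep the plot nearest each cluster centre:

```python
# Sketch: select a calibration subset that covers LiDAR feature-space variability
# by k-means clustering candidate plots and keeping the plot closest to each
# centroid. This is an illustration, not one of the strategies compared above.
import numpy as np
from sklearn.cluster import KMeans

def select_calibration_plots(features, n_plots, seed=0):
    """features: (n_candidates, n_features) array of LiDAR metrics per plot."""
    km = KMeans(n_clusters=n_plots, n_init=10, random_state=seed).fit(features)
    chosen = []
    for k, centre in enumerate(km.cluster_centers_):
        members = np.where(km.labels_ == k)[0]
        dists = np.linalg.norm(features[members] - centre, axis=1)
        chosen.append(members[np.argmin(dists)])
    return np.array(chosen)

rng = np.random.default_rng(0)
candidates = rng.normal(size=(500, 6))           # placeholder LiDAR metrics
print(select_calibration_plots(candidates, n_plots=40))
```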

  5. Monte Carlo path sampling approach to modeling aeolian sediment transport

    Science.gov (United States)

    Hardin, E. J.; Mitasova, H.; Mitas, L.

    2011-12-01

    Coastal communities and vital infrastructure are subject to coastal hazards including storm surge and hurricanes. Coastal dunes offer protection by acting as natural barriers from waves and storm surge. During storms, these landforms and their protective function can erode; however, they can also erode even in the absence of storms due to daily wind and waves. Costly and often controversial beach nourishment and coastal construction projects are common erosion mitigation practices. With a more complete understanding of coastal morphology, the efficacy and consequences of anthropogenic activities could be better predicted. Currently, research on coastal landscape evolution is focused on waves and storm surge, while only limited effort is devoted to understanding aeolian forces. Aeolian transport occurs when the wind supplies a shear stress that exceeds a critical value, consequently ejecting sand grains into the air. If the grains are too heavy to be suspended, they fall back to the grain bed, where the collision ejects more grains. This is called saltation and is the salient process by which sand mass is transported. The shear stress required to dislodge grains is related to turbulent air speed. Subsequently, as sand mass is injected into the air, the wind loses speed along with its ability to eject more grains. In this way, the flux of saltating grains feeds back on itself and aeolian transport becomes nonlinear. Aeolian sediment transport is difficult to study experimentally for reasons arising from the orders-of-magnitude difference between grain size and dune size. It is difficult to study theoretically because aeolian transport is highly nonlinear, especially over complex landscapes. Current computational approaches have limitations as well; single-grain models are mathematically simple but are computationally intractable even with modern computing power, whereas cellular automata-based approaches are computationally efficient
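
    The threshold condition described above can be made concrete with the classic Bagnold fluid-threshold relation for the critical shear velocity; this is a standard textbook expression used here only as a worked example, not part of the Monte Carlo path sampling model of the abstract.

```python
# Worked example of the threshold condition: Bagnold fluid-threshold expression
# u*_t = A * sqrt(((rho_s - rho_a) / rho_a) * g * d), with A ~ 0.1. Values are
# typical textbook numbers for quartz sand in air, not data from the study.
import math

def threshold_shear_velocity(d, rho_s=2650.0, rho_a=1.225, g=9.81, A=0.1):
    """Fluid threshold shear velocity (m/s) for grain diameter d (m)."""
    return A * math.sqrt((rho_s - rho_a) / rho_a * g * d)

u_star_t = threshold_shear_velocity(250e-6)      # 0.25 mm quartz sand
tau_t = 1.225 * u_star_t**2                      # corresponding shear stress (Pa)
print(f"u*_t ~ {u_star_t:.2f} m/s, tau_t ~ {tau_t:.3f} Pa")   # ~0.23 m/s
```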

  6. PROCESS MODELLING OF ROCK SAMPLE HANDLING IN PETROPHYSICAL LABORATORY

    Directory of Open Access Journals (Sweden)

    Adaleta Perković

    2010-12-01

    Full Text Available The everyday procedures carried out in a petrophysical laboratory can be defined as a complete cycle of business processes. The sample handling process is one of the most significant and demanding procedures. It starts with sample receipt in the laboratory; subsequently, a series of analyses and measurements is carried out, resulting in petrophysical parameters. The sample handling process ends with sample storage and archiving of the obtained measurement data. A process model is used to describe the repeating activities. The sample handling process is presented graphically using an eEPC diagram (extended Event-Driven Process Chain), which describes the process in terms of events. The created process model jointly binds static laboratory resources (measuring instruments, computers and data), speeds up the process by increasing user efficiency, and improves data and information exchange. Besides the flow of activities, the sample handling model includes information about the system components (laboratory equipment and software applications) that carry out the activities. The described model, with minor modifications and adaptations, can be used in any laboratory that deals with samples (the paper is published in Croatian).

  7. Estimating the Size of the Methamphetamine-Using Population in New York City Using Network Sampling Techniques.

    Science.gov (United States)

    Dombrowski, Kirk; Khan, Bilal; Wendel, Travis; McLean, Katherine; Misshula, Evan; Curtis, Ric

    2012-12-01

    As part of a recent study of the dynamics of the retail market for methamphetamine use in New York City, we used network sampling methods to estimate the size of the total networked population. This process involved sampling from respondents' lists of co-use contacts, which in turn became the basis for capture-recapture estimation. Recapture sampling was based on links to other respondents derived from demographic and "telefunken" matching procedures, the latter being an anonymized version of telephone number matching. This paper describes the matching process used to discover the links between the solicited contacts and project respondents, the capture-recapture calculation, the estimation of "false matches", and the development of confidence intervals for the final population estimates. A final population of 12,229 was estimated, with a range of 8,235-23,750. The techniques described here have the special virtue of deriving an estimate for a hidden population while retaining respondent anonymity and the anonymity of network alters, but they likely require a larger sample size than the 132 persons interviewed here to attain acceptable confidence levels for the estimate.
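
    The capture-recapture step can be illustrated with the simplest two-sample (Chapman-corrected Lincoln-Petersen) estimator; the study's actual calculation additionally corrects for false matches and network structure, and the counts below are made up.

```python
# Simplest two-sample capture-recapture (Chapman) estimate of a hidden
# population size; the study's estimator additionally handles false matches
# and network structure. Counts below are hypothetical.
def chapman_estimate(n1, n2, m):
    """n1: first-capture size, n2: second-capture size, m: recaptures."""
    return (n1 + 1) * (n2 + 1) / (m + 1) - 1

print(round(chapman_estimate(n1=132, n2=400, m=4)))   # rough estimate with made-up counts
```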

  8. Application of nested sampling in statistical physics: the Potts model

    CERN Document Server

    Pfeifenberger, Manuel J

    2016-01-01

    We present a systematic benchmark study of the nested sampling algorithm on the basis of the Potts model. This model exhibits a first-order phase transition for $q>4$ at the critical temperature. The numerical evaluation of the partition function and thermodynamic observables, which involves high-dimensional sums of sharply structured multi-modal density functions, represents a major challenge to most standard numerical techniques, such as Markov chain Monte Carlo. Nested sampling, on the other hand, is particularly suited for such problems. In this paper we employ both nested sampling and thermodynamic integration to evaluate the partition function of the Potts model. In both cases individual moves are based on Swendsen-Wang updates. An autocorrelation time analysis of both algorithms shows that the severe slowing down of thermodynamic integration around the critical temperature does not occur in nested sampling. In addition, we show how thermodynamic variables can be computed with high accuracy from th...
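
    A minimal, generic nested sampling loop is sketched below on a toy one-dimensional problem (uniform prior, Gaussian likelihood). The constrained-prior draw is done by naive rejection purely for illustration; the paper instead uses Swendsen-Wang moves restricted by the current likelihood floor, and the final live-point contribution to the evidence is omitted here for brevity.

```python
# Minimal nested-sampling sketch for a toy 1D problem (uniform prior on [0, 1],
# Gaussian likelihood). Not the Potts-model implementation of the paper.
import numpy as np

def log_likelihood(x, mu=0.3, sigma=0.05):
    return -0.5 * ((x - mu) / sigma) ** 2 - np.log(sigma * np.sqrt(2 * np.pi))

def nested_sampling(n_live=100, n_iter=800, seed=0):
    rng = np.random.default_rng(seed)
    live = rng.random(n_live)
    log_l = log_likelihood(live)
    log_z = -np.inf
    log_width = np.log(1.0 - np.exp(-1.0 / n_live))       # first prior shell width
    for _ in range(n_iter):
        worst = int(np.argmin(log_l))
        log_z = np.logaddexp(log_z, log_width + log_l[worst])   # accumulate evidence
        # Replace the worst live point by a prior draw above the likelihood floor
        # (naive rejection; the paper uses constrained Swendsen-Wang updates).
        while True:
            x_new = rng.random()
            if log_likelihood(x_new) > log_l[worst]:
                live[worst], log_l[worst] = x_new, log_likelihood(x_new)
                break
        log_width -= 1.0 / n_live                         # prior volume shrinks each step
    return log_z                                          # final live-point term omitted

print(nested_sampling())   # close to 0 here, since the likelihood is a normalized density
```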

  9. The effects of different syringe volume, needle size and sample volume on blood gas analysis in syringes washed with heparin

    Science.gov (United States)

    Küme, Tuncay; Şişman, Ali Rıza; Solak, Ahmet; Tuğlu, Birsen; Çinkooğlu, Burcu; Çoker, Canan

    2012-01-01

    Introduction: We evaluated the effect of different syringe volumes, needle sizes and sample volumes on blood gas analysis in syringes washed with heparin. Materials and methods: In this multi-step experimental study, percent dilution ratios (PDRs) and final heparin concentrations (FHCs) were calculated by the gravimetric method to determine the effect of syringe volume (1, 2, 5 and 10 mL), needle size (20, 21, 22, 25 and 26 G) and sample volume (0.5, 1, 2, 5 and 10 mL). The effect of different PDRs and FHCs on blood gas and electrolyte parameters was determined. The erroneous results from nonstandardized sampling were evaluated against RiliBAK's TEa. Results: The increase in PDRs and FHCs was associated with decreasing syringe volume, increasing needle size and decreasing sample volume: from 2.0% and 100 IU/mL in the 10 mL syringe to 7.0% and 351 IU/mL in the 1 mL syringe; from 4.9% and 245 IU/mL with 26 G to 7.6% and 380 IU/mL with 20 G needles combined with the 1 mL syringe; and from 2.0% and 100 IU/mL in a completely filled sample to 34% and 1675 IU/mL in a 0.5 mL sample drawn into the 10 mL syringe. There was no statistical difference in pH, but the percent decreases in pCO2, K+, iCa2+ and iMg2+ and the percent increases in pO2 and Na+ were statistically significant compared with completely filled syringes. All changes in pH and pO2 were acceptable, but the changes in pCO2, Na+, K+ and iCa2+ exceeded the TEa limits except in completely filled syringes. Conclusions: The changes in PDRs and FHCs due to nonstandardized sampling in syringes washed with liquid heparin give rise to erroneous test results for pCO2 and electrolytes. PMID:22838185
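
    The dilution quantities reported above follow from simple arithmetic once a residual heparin (dead-space) volume and a nominal heparin concentration are assumed; the values in the sketch below are hypothetical, not measurements from the study.

```python
# Back-of-envelope illustration of how the dilution quantities are defined,
# with hypothetical values: a heparin dead-space volume left after washing and
# a nominal heparin concentration (both assumptions, not measured values).
def dilution_metrics(sample_ml, dead_space_ml, heparin_iu_per_ml=5000.0):
    total_ml = sample_ml + dead_space_ml
    pdr = 100.0 * dead_space_ml / total_ml                 # percent dilution ratio
    fhc = dead_space_ml * heparin_iu_per_ml / total_ml     # final heparin conc. (IU/mL)
    return pdr, fhc

# e.g. a 2 mL blood sample drawn into a syringe retaining 0.05 mL of heparin wash
print(dilution_metrics(sample_ml=2.0, dead_space_ml=0.05))
```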

  10. Modeling the Effects of Beam Size and Flaw Morphology on Ultrasonic Pulse/Echo Sizing of Delaminations in Carbon Composites

    Science.gov (United States)

    Margetan, Frank J.; Leckey, Cara A.; Barnard, Dan

    2012-01-01

    The size and shape of a delamination in a multi-layered structure can be estimated in various ways from an ultrasonic pulse/echo image. For example, the -6dB contours of the measured response provide one simple estimate of the boundary. More sophisticated approaches can be imagined in which one adjusts the proposed boundary to bring measured and predicted UT images into optimal agreement. Such approaches require suitable models of the inspection process. In this paper we explore issues pertaining to model-based size estimation for delaminations in carbon fiber reinforced laminates. In particular we consider the influence on sizing when the delamination is non-planar or partially transmitting in certain regions. Two models for predicting broadband sonic time-domain responses are considered: (1) a fast "simple" model using paraxial beam expansions and Kirchhoff and phase-screen approximations; and (2) the more exact (but computationally intensive) 3D elastodynamic finite integration technique (EFIT). Model-to-model and model-to-experiment comparisons are made for delaminations in uniaxial composite plates, and the simple model is then used to critique the -6dB rule for delamination sizing.
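
    The -6dB rule mentioned above amounts to thresholding the pulse/echo amplitude map at half of its peak value and measuring the area of the resulting region; the sketch below uses a synthetic amplitude map and pixel pitch as placeholder assumptions.

```python
# Sketch of the simple -6 dB sizing rule: threshold a pulse/echo C-scan amplitude
# map at half the peak amplitude and take the area of the resulting region as the
# flaw-size estimate. The array and pixel pitch are synthetic placeholders.
import numpy as np

def six_db_area(cscan, pixel_area_mm2):
    """cscan: 2-D array of peak amplitudes; returns estimated flaw area in mm^2."""
    mask = cscan >= cscan.max() / 2.0        # -6 dB = half of peak amplitude
    return mask.sum() * pixel_area_mm2

# Synthetic "delamination": a Gaussian amplitude blob on a 0.5 mm pixel grid.
y, x = np.mgrid[-20:20, -20:20] * 0.5
cscan = np.exp(-(x**2 + y**2) / (2 * 4.0**2))
print(six_db_area(cscan, pixel_area_mm2=0.25))
```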