sample selection estimation: Topics by WorldWideScience.org

Sample records for sample selection estimation

Sample size estimation and sampling techniques for selecting a representative sample

Directory of Open Access Journals (Sweden)

Aamir Omair

2014-01-01

Full Text Available Introduction: The purpose of this article is to provide a general understanding of the concepts of sampling as applied to health-related research. Sample Size Estimation: It is important to select a representative sample in quantitative research in order to be able to generalize the results to the target population. The sample should be of the required sample size and must be selected using an appropriate probability sampling technique. There are many hidden biases which can adversely affect the outcome of the study. Important factors to consider for estimating the sample size include the size of the study population, confidence level, expected proportion of the outcome variable (for categorical variables/standard deviation of the outcome variable (for numerical variables, and the required precision (margin of accuracy from the study. The more the precision required, the greater is the required sample size. Sampling Techniques: The probability sampling techniques applied for health related research include simple random sampling, systematic random sampling, stratified random sampling, cluster sampling, and multistage sampling. These are more recommended than the nonprobability sampling techniques, because the results of the study can be generalized to the target population.
Sampling point selection for energy estimation in the quasicontinuum method

NARCIS (Netherlands)

Beex, L.A.A.; Peerlings, R.H.J.; Geers, M.G.D.

2010-01-01

The quasicontinuum (QC) method reduces computational costs of atomistic calculations by using interpolation between a small number of so-called repatoms to represent the displacements of the complete lattice and by selecting a small number of sampling atoms to estimate the total potential energy of
Failure Probability Estimation Using Asymptotic Sampling and Its Dependence upon the Selected Sampling Scheme

Directory of Open Access Journals (Sweden)

Martinásková Magdalena

2017-12-01

Full Text Available The article examines the use of Asymptotic Sampling (AS for the estimation of failure probability. The AS algorithm requires samples of multidimensional Gaussian random vectors, which may be obtained by many alternative means that influence the performance of the AS method. Several reliability problems (test functions have been selected in order to test AS with various sampling schemes: (i Monte Carlo designs; (ii LHS designs optimized using the Periodic Audze-Eglājs (PAE criterion; (iii designs prepared using Sobol’ sequences. All results are compared with the exact failure probability value.
Semiparametric efficient and robust estimation of an unknown symmetric population under arbitrary sample selection bias

KAUST Repository

Ma, Yanyuan

2013-09-01

We propose semiparametric methods to estimate the center and shape of a symmetric population when a representative sample of the population is unavailable due to selection bias. We allow an arbitrary sample selection mechanism determined by the data collection procedure, and we do not impose any parametric form on the population distribution. Under this general framework, we construct a family of consistent estimators of the center that is robust to population model misspecification, and we identify the efficient member that reaches the minimum possible estimation variance. The asymptotic properties and finite sample performance of the estimation and inference procedures are illustrated through theoretical analysis and simulations. A data example is also provided to illustrate the usefulness of the methods in practice. © 2013 American Statistical Association.
Accounting for animal movement in estimation of resource selection functions: sampling and data analysis.

Science.gov (United States)

Forester, James D; Im, Hae Kyung; Rathouz, Paul J

2009-12-01

Patterns of resource selection by animal populations emerge as a result of the behavior of many individuals. Statistical models that describe these population-level patterns of habitat use can miss important interactions between individual animals and characteristics of their local environment; however, identifying these interactions is difficult. One approach to this problem is to incorporate models of individual movement into resource selection models. To do this, we propose a model for step selection functions (SSF) that is composed of a resource-independent movement kernel and a resource selection function (RSF). We show that standard case-control logistic regression may be used to fit the SSF; however, the sampling scheme used to generate control points (i.e., the definition of availability) must be accommodated. We used three sampling schemes to analyze simulated movement data and found that ignoring sampling and the resource-independent movement kernel yielded biased estimates of selection. The level of bias depended on the method used to generate control locations, the strength of selection, and the spatial scale of the resource map. Using empirical or parametric methods to sample control locations produced biased estimates under stronger selection; however, we show that the addition of a distance function to the analysis substantially reduced that bias. Assuming a uniform availability within a fixed buffer yielded strongly biased selection estimates that could be corrected by including the distance function but remained inefficient relative to the empirical and parametric sampling methods. As a case study, we used location data collected from elk in Yellowstone National Park, USA, to show that selection and bias may be temporally variable. Because under constant selection the amount of bias depends on the scale at which a resource is distributed in the landscape, we suggest that distance always be included as a covariate in SSF analyses. This approach to
Estimating the residential demand function for natural gas in Seoul with correction for sample selection bias

International Nuclear Information System (INIS)

Yoo, Seung-Hoon; Lim, Hea-Jin; Kwak, Seung-Jun

2009-01-01

Over the last twenty years, the consumption of natural gas in Korea has increased dramatically. This increase has mainly resulted from the rise of consumption in the residential sector. The main objective of the study is to estimate households' demand function for natural gas by applying a sample selection model using data from a survey of households in Seoul. The results show that there exists a selection bias in the sample and that failure to correct for sample selection bias distorts the mean estimate, of the demand for natural gas, downward by 48.1%. In addition, according to the estimation results, the size of the house, the dummy variable for dwelling in an apartment, the dummy variable for having a bed in an inner room, and the household's income all have positive relationships with the demand for natural gas. On the other hand, the size of the family and the price of gas negatively contribute to the demand for natural gas. (author)
Optimal Selection of the Sampling Interval for Estimation of Modal Parameters by an ARMA- Model

DEFF Research Database (Denmark)

Kirkegaard, Poul Henning

1993-01-01

Optimal selection of the sampling interval for estimation of the modal parameters by an ARMA-model for a white noise loaded structure modelled as a single degree of- freedom linear mechanical system is considered. An analytical solution for an optimal uniform sampling interval, which is optimal...
Robust inference in sample selection models

KAUST Repository

Zhelonkin, Mikhail; Genton, Marc G.; Ronchetti, Elvezio

2015-01-01

The problem of non-random sample selectivity often occurs in practice in many fields. The classical estimators introduced by Heckman are the backbone of the standard statistical analysis of these models. However, these estimators are very sensitive to small deviations from the distributional assumptions which are often not satisfied in practice. We develop a general framework to study the robustness properties of estimators and tests in sample selection models. We derive the influence function and the change-of-variance function of Heckman's two-stage estimator, and we demonstrate the non-robustness of this estimator and its estimated variance to small deviations from the model assumed. We propose a procedure for robustifying the estimator, prove its asymptotic normality and give its asymptotic variance. Both cases with and without an exclusion restriction are covered. This allows us to construct a simple robust alternative to the sample selection bias test. We illustrate the use of our new methodology in an analysis of ambulatory expenditures and we compare the performance of the classical and robust methods in a Monte Carlo simulation study.
Robust inference in sample selection models

KAUST Repository

Zhelonkin, Mikhail

2015-11-20

The problem of non-random sample selectivity often occurs in practice in many fields. The classical estimators introduced by Heckman are the backbone of the standard statistical analysis of these models. However, these estimators are very sensitive to small deviations from the distributional assumptions which are often not satisfied in practice. We develop a general framework to study the robustness properties of estimators and tests in sample selection models. We derive the influence function and the change-of-variance function of Heckman\\'s two-stage estimator, and we demonstrate the non-robustness of this estimator and its estimated variance to small deviations from the model assumed. We propose a procedure for robustifying the estimator, prove its asymptotic normality and give its asymptotic variance. Both cases with and without an exclusion restriction are covered. This allows us to construct a simple robust alternative to the sample selection bias test. We illustrate the use of our new methodology in an analysis of ambulatory expenditures and we compare the performance of the classical and robust methods in a Monte Carlo simulation study.
Evaluation of sampling strategies to estimate crown biomass

Directory of Open Access Journals (Sweden)

Krishna P Poudel

2015-01-01

Full Text Available Background Depending on tree and site characteristics crown biomass accounts for a significant portion of the total aboveground biomass in the tree. Crown biomass estimation is useful for different purposes including evaluating the economic feasibility of crown utilization for energy production or forest products, fuel load assessments and fire management strategies, and wildfire modeling. However, crown biomass is difficult to predict because of the variability within and among species and sites. Thus the allometric equations used for predicting crown biomass should be based on data collected with precise and unbiased sampling strategies. In this study, we evaluate the performance different sampling strategies to estimate crown biomass and to evaluate the effect of sample size in estimating crown biomass. Methods Using data collected from 20 destructively sampled trees, we evaluated 11 different sampling strategies using six evaluation statistics: bias, relative bias, root mean square error (RMSE, relative RMSE, amount of biomass sampled, and relative biomass sampled. We also evaluated the performance of the selected sampling strategies when different numbers of branches (3, 6, 9, and 12 are selected from each tree. Tree specific log linear model with branch diameter and branch length as covariates was used to obtain individual branch biomass. Results Compared to all other methods stratified sampling with probability proportional to size estimation technique produced better results when three or six branches per tree were sampled. However, the systematic sampling with ratio estimation technique was the best when at least nine branches per tree were sampled. Under the stratified sampling strategy, selecting unequal number of branches per stratum produced approximately similar results to simple random sampling, but it further decreased RMSE when information on branch diameter is used in the design and estimation phases. Conclusions Use of
Variable selection and estimation for longitudinal survey data

KAUST Repository

Wang, Li

2014-09-01

There is wide interest in studying longitudinal surveys where sample subjects are observed successively over time. Longitudinal surveys have been used in many areas today, for example, in the health and social sciences, to explore relationships or to identify significant variables in regression settings. This paper develops a general strategy for the model selection problem in longitudinal sample surveys. A survey weighted penalized estimating equation approach is proposed to select significant variables and estimate the coefficients simultaneously. The proposed estimators are design consistent and perform as well as the oracle procedure when the correct submodel was known. The estimating function bootstrap is applied to obtain the standard errors of the estimated parameters with good accuracy. A fast and efficient variable selection algorithm is developed to identify significant variables for complex longitudinal survey data. Simulated examples are illustrated to show the usefulness of the proposed methodology under various model settings and sampling designs. © 2014 Elsevier Inc.
Systematic sampling of discrete and continuous populations: sample selection and the choice of estimator

Science.gov (United States)

Harry T. Valentine; David L. R. Affleck; Timothy G. Gregoire

2009-01-01

Systematic sampling is easy, efficient, and widely used, though it is not generally recognized that a systematic sample may be drawn from the population of interest with or without restrictions on randomization. The restrictions or the lack of them determine which estimators are unbiased, when using the sampling design as the basis for inference. We describe the...
Surface Estimation, Variable Selection, and the Nonparametric Oracle Property.

Science.gov (United States)

Storlie, Curtis B; Bondell, Howard D; Reich, Brian J; Zhang, Hao Helen

2011-04-01

Variable selection for multivariate nonparametric regression is an important, yet challenging, problem due, in part, to the infinite dimensionality of the function space. An ideal selection procedure should be automatic, stable, easy to use, and have desirable asymptotic properties. In particular, we define a selection procedure to be nonparametric oracle (np-oracle) if it consistently selects the correct subset of predictors and at the same time estimates the smooth surface at the optimal nonparametric rate, as the sample size goes to infinity. In this paper, we propose a model selection procedure for nonparametric models, and explore the conditions under which the new method enjoys the aforementioned properties. Developed in the framework of smoothing spline ANOVA, our estimator is obtained via solving a regularization problem with a novel adaptive penalty on the sum of functional component norms. Theoretical properties of the new estimator are established. Additionally, numerous simulated and real examples further demonstrate that the new approach substantially outperforms other existing methods in the finite sample setting.
Hybrid nested sampling algorithm for Bayesian model selection applied to inverse subsurface flow problems

International Nuclear Information System (INIS)

Elsheikh, Ahmed H.; Wheeler, Mary F.; Hoteit, Ibrahim

2014-01-01

A Hybrid Nested Sampling (HNS) algorithm is proposed for efficient Bayesian model calibration and prior model selection. The proposed algorithm combines, Nested Sampling (NS) algorithm, Hybrid Monte Carlo (HMC) sampling and gradient estimation using Stochastic Ensemble Method (SEM). NS is an efficient sampling algorithm that can be used for Bayesian calibration and estimating the Bayesian evidence for prior model selection. Nested sampling has the advantage of computational feasibility. Within the nested sampling algorithm, a constrained sampling step is performed. For this step, we utilize HMC to reduce the correlation between successive sampled states. HMC relies on the gradient of the logarithm of the posterior distribution, which we estimate using a stochastic ensemble method based on an ensemble of directional derivatives. SEM only requires forward model runs and the simulator is then used as a black box and no adjoint code is needed. The developed HNS algorithm is successfully applied for Bayesian calibration and prior model selection of several nonlinear subsurface flow problems
Hybrid nested sampling algorithm for Bayesian model selection applied to inverse subsurface flow problems

Energy Technology Data Exchange (ETDEWEB)

Elsheikh, Ahmed H., E-mail: aelsheikh@ices.utexas.edu [Institute for Computational Engineering and Sciences (ICES), University of Texas at Austin, TX (United States); Institute of Petroleum Engineering, Heriot-Watt University, Edinburgh EH14 4AS (United Kingdom); Wheeler, Mary F. [Institute for Computational Engineering and Sciences (ICES), University of Texas at Austin, TX (United States); Hoteit, Ibrahim [Department of Earth Sciences and Engineering, King Abdullah University of Science and Technology (KAUST), Thuwal (Saudi Arabia)

2014-02-01

A Hybrid Nested Sampling (HNS) algorithm is proposed for efficient Bayesian model calibration and prior model selection. The proposed algorithm combines, Nested Sampling (NS) algorithm, Hybrid Monte Carlo (HMC) sampling and gradient estimation using Stochastic Ensemble Method (SEM). NS is an efficient sampling algorithm that can be used for Bayesian calibration and estimating the Bayesian evidence for prior model selection. Nested sampling has the advantage of computational feasibility. Within the nested sampling algorithm, a constrained sampling step is performed. For this step, we utilize HMC to reduce the correlation between successive sampled states. HMC relies on the gradient of the logarithm of the posterior distribution, which we estimate using a stochastic ensemble method based on an ensemble of directional derivatives. SEM only requires forward model runs and the simulator is then used as a black box and no adjoint code is needed. The developed HNS algorithm is successfully applied for Bayesian calibration and prior model selection of several nonlinear subsurface flow problems.
Hybrid nested sampling algorithm for Bayesian model selection applied to inverse subsurface flow problems

KAUST Repository

Elsheikh, Ahmed H.

2014-02-01

A Hybrid Nested Sampling (HNS) algorithm is proposed for efficient Bayesian model calibration and prior model selection. The proposed algorithm combines, Nested Sampling (NS) algorithm, Hybrid Monte Carlo (HMC) sampling and gradient estimation using Stochastic Ensemble Method (SEM). NS is an efficient sampling algorithm that can be used for Bayesian calibration and estimating the Bayesian evidence for prior model selection. Nested sampling has the advantage of computational feasibility. Within the nested sampling algorithm, a constrained sampling step is performed. For this step, we utilize HMC to reduce the correlation between successive sampled states. HMC relies on the gradient of the logarithm of the posterior distribution, which we estimate using a stochastic ensemble method based on an ensemble of directional derivatives. SEM only requires forward model runs and the simulator is then used as a black box and no adjoint code is needed. The developed HNS algorithm is successfully applied for Bayesian calibration and prior model selection of several nonlinear subsurface flow problems. © 2013 Elsevier Inc.
Estimating HIES Data through Ratio and Regression Methods for Different Sampling Designs

Directory of Open Access Journals (Sweden)

Faqir Muhammad

2007-01-01

Full Text Available In this study, comparison has been made for different sampling designs, using the HIES data of North West Frontier Province (NWFP for 2001-02 and 1998-99 collected from the Federal Bureau of Statistics, Statistical Division, Government of Pakistan, Islamabad. The performance of the estimators has also been considered using bootstrap and Jacknife. A two-stage stratified random sample design is adopted by HIES. In the first stage, enumeration blocks and villages are treated as the first stage Primary Sampling Units (PSU. The sample PSU’s are selected with probability proportional to size. Secondary Sampling Units (SSU i.e., households are selected by systematic sampling with a random start. They have used a single study variable. We have compared the HIES technique with some other designs, which are: Stratified Simple Random Sampling. Stratified Systematic Sampling. Stratified Ranked Set Sampling. Stratified Two Phase Sampling. Ratio and Regression methods were applied with two study variables, which are: Income (y and Household sizes (x. Jacknife and Bootstrap are used for variance replication. Simple Random Sampling with sample size (462 to 561 gave moderate variances both by Jacknife and Bootstrap. By applying Systematic Sampling, we received moderate variance with sample size (467. In Jacknife with Systematic Sampling, we obtained variance of regression estimator greater than that of ratio estimator for a sample size (467 to 631. At a sample size (952 variance of ratio estimator gets greater than that of regression estimator. The most efficient design comes out to be Ranked set sampling compared with other designs. The Ranked set sampling with jackknife and bootstrap, gives minimum variance even with the smallest sample size (467. Two Phase sampling gave poor performance. Multi-stage sampling applied by HIES gave large variances especially if used with a single study variable.
Adaptive measurement selection for progressive damage estimation

Science.gov (United States)

Zhou, Wenfan; Kovvali, Narayan; Papandreou-Suppappola, Antonia; Chattopadhyay, Aditi; Peralta, Pedro

2011-04-01

Noise and interference in sensor measurements degrade the quality of data and have a negative impact on the performance of structural damage diagnosis systems. In this paper, a novel adaptive measurement screening approach is presented to automatically select the most informative measurements and use them intelligently for structural damage estimation. The method is implemented efficiently in a sequential Monte Carlo (SMC) setting using particle filtering. The noise suppression and improved damage estimation capability of the proposed method is demonstrated by an application to the problem of estimating progressive fatigue damage in an aluminum compact-tension (CT) sample using noisy PZT sensor measurements.
A sampling strategy for estimating plot average annual fluxes of chemical elements from forest soils

NARCIS (Netherlands)

Brus, D.J.; Gruijter, de J.J.; Vries, de W.

2010-01-01

A sampling strategy for estimating spatially averaged annual element leaching fluxes from forest soils is presented and tested in three Dutch forest monitoring plots. In this method sampling locations and times (days) are selected by probability sampling. Sampling locations were selected by
High-dimensional model estimation and model selection

CERN Multimedia

CERN. Geneva

2015-01-01

I will review concepts and algorithms from high-dimensional statistics for linear model estimation and model selection. I will particularly focus on the so-called p>>n setting where the number of variables p is much larger than the number of samples n. I will focus mostly on regularized statistical estimators that produce sparse models. Important examples include the LASSO and its matrix extension, the Graphical LASSO, and more recent non-convex methods such as the TREX. I will show the applicability of these estimators in a diverse range of scientific applications, such as sparse interaction graph recovery and high-dimensional classification and regression problems in genomics.

Estimating nonlinear selection gradients using quadratic regression coefficients: double or nothing?

Science.gov (United States)

Stinchcombe, John R; Agrawal, Aneil F; Hohenlohe, Paul A; Arnold, Stevan J; Blows, Mark W

2008-09-01

The use of regression analysis has been instrumental in allowing evolutionary biologists to estimate the strength and mode of natural selection. Although directional and correlational selection gradients are equal to their corresponding regression coefficients, quadratic regression coefficients must be doubled to estimate stabilizing/disruptive selection gradients. Based on a sample of 33 papers published in Evolution between 2002 and 2007, at least 78% of papers have not doubled quadratic regression coefficients, leading to an appreciable underestimate of the strength of stabilizing and disruptive selection. Proper treatment of quadratic regression coefficients is necessary for estimation of fitness surfaces and contour plots, canonical analysis of the gamma matrix, and modeling the evolution of populations on an adaptive landscape.
Gender Wage Gap : A Semi-Parametric Approach With Sample Selection Correction

NARCIS (Netherlands)

Picchio, M.; Mussida, C.

2010-01-01

Sizeable gender differences in employment rates are observed in many countries. Sample selection into the workforce might therefore be a relevant issue when estimating gender wage gaps. This paper proposes a new semi-parametric estimator of densities in the presence of covariates which incorporates
Estimation of reference intervals from small samples: an example using canine plasma creatinine.

Science.gov (United States)

Geffré, A; Braun, J P; Trumel, C; Concordet, D

2009-12-01

According to international recommendations, reference intervals should be determined from at least 120 reference individuals, which often are impossible to achieve in veterinary clinical pathology, especially for wild animals. When only a small number of reference subjects is available, the possible bias cannot be known and the normality of the distribution cannot be evaluated. A comparison of reference intervals estimated by different methods could be helpful. The purpose of this study was to compare reference limits determined from a large set of canine plasma creatinine reference values, and large subsets of this data, with estimates obtained from small samples selected randomly. Twenty sets each of 120 and 27 samples were randomly selected from a set of 1439 plasma creatinine results obtained from healthy dogs in another study. Reference intervals for the whole sample and for the large samples were determined by a nonparametric method. The estimated reference limits for the small samples were minimum and maximum, mean +/- 2 SD of native and Box-Cox-transformed values, 2.5th and 97.5th percentiles by a robust method on native and Box-Cox-transformed values, and estimates from diagrams of cumulative distribution functions. The whole sample had a heavily skewed distribution, which approached Gaussian after Box-Cox transformation. The reference limits estimated from small samples were highly variable. The closest estimates to the 1439-result reference interval for 27-result subsamples were obtained by both parametric and robust methods after Box-Cox transformation but were grossly erroneous in some cases. For small samples, it is recommended that all values be reported graphically in a dot plot or histogram and that estimates of the reference limits be compared using different methods.
Sampling and estimating recreational use.

Science.gov (United States)

Timothy G. Gregoire; Gregory J. Buhyoff

1999-01-01

Probability sampling methods applicable to estimate recreational use are presented. Both single- and multiple-access recreation sites are considered. One- and two-stage sampling methods are presented. Estimation of recreational use is presented in a series of examples.
A comparison of small-area estimation techniques to estimate selected stand attributes using LiDAR-derived auxiliary variables

Science.gov (United States)

Michael E. Goerndt; Vicente J. Monleon; Hailemariam. Temesgen

2011-01-01

One of the challenges often faced in forestry is the estimation of forest attributes for smaller areas of interest within a larger population. Small-area estimation (SAE) is a set of techniques well suited to estimation of forest attributes for small areas in which the existing sample size is small and auxiliary information is available. Selected SAE methods were...
Are we under-estimating the association between autism symptoms?: The importance of considering simultaneous selection when using samples of individuals who meet diagnostic criteria for an autism spectrum disorder.

Science.gov (United States)

Murray, Aja Louise; McKenzie, Karen; Kuenssberg, Renate; O'Donnell, Michael

2014-11-01

The magnitude of symptom inter-correlations in diagnosed individuals has contributed to the evidence that autism spectrum disorders (ASD) is a fractionable disorder. Such correlations may substantially under-estimate the population correlations among symptoms due to simultaneous selection on the areas of deficit required for diagnosis. Using statistical simulations of this selection mechanism, we provide estimates of the extent of this bias, given different levels of population correlation between symptoms. We then use real data to compare domain inter-correlations in the Autism Spectrum Quotient, in those with ASD versus a combined ASD and non-ASD sample. Results from both studies indicate that samples restricted to individuals with a diagnosis of ASD potentially substantially under-estimate the magnitude of association between features of ASD.
Estimation of AUC or Partial AUC under Test-Result-Dependent Sampling.

Science.gov (United States)

Wang, Xiaofei; Ma, Junling; George, Stephen; Zhou, Haibo

2012-01-01

The area under the ROC curve (AUC) and partial area under the ROC curve (pAUC) are summary measures used to assess the accuracy of a biomarker in discriminating true disease status. The standard sampling approach used in biomarker validation studies is often inefficient and costly, especially when ascertaining the true disease status is costly and invasive. To improve efficiency and reduce the cost of biomarker validation studies, we consider a test-result-dependent sampling (TDS) scheme, in which subject selection for determining the disease state is dependent on the result of a biomarker assay. We first estimate the test-result distribution using data arising from the TDS design. With the estimated empirical test-result distribution, we propose consistent nonparametric estimators for AUC and pAUC and establish the asymptotic properties of the proposed estimators. Simulation studies show that the proposed estimators have good finite sample properties and that the TDS design yields more efficient AUC and pAUC estimates than a simple random sampling (SRS) design. A data example based on an ongoing cancer clinical trial is provided to illustrate the TDS design and the proposed estimators. This work can find broad applications in design and analysis of biomarker validation studies.
On a Robust MaxEnt Process Regression Model with Sample-Selection

Directory of Open Access Journals (Sweden)

Hea-Jung Kim

2018-04-01

Full Text Available In a regression analysis, a sample-selection bias arises when a dependent variable is partially observed as a result of the sample selection. This study introduces a Maximum Entropy (MaxEnt process regression model that assumes a MaxEnt prior distribution for its nonparametric regression function and finds that the MaxEnt process regression model includes the well-known Gaussian process regression (GPR model as a special case. Then, this special MaxEnt process regression model, i.e., the GPR model, is generalized to obtain a robust sample-selection Gaussian process regression (RSGPR model that deals with non-normal data in the sample selection. Various properties of the RSGPR model are established, including the stochastic representation, distributional hierarchy, and magnitude of the sample-selection bias. These properties are used in the paper to develop a hierarchical Bayesian methodology to estimate the model. This involves a simple and computationally feasible Markov chain Monte Carlo algorithm that avoids analytical or numerical derivatives of the log-likelihood function of the model. The performance of the RSGPR model in terms of the sample-selection bias correction, robustness to non-normality, and prediction, is demonstrated through results in simulations that attest to its good finite-sample performance.
Low-sampling-rate ultra-wideband channel estimation using equivalent-time sampling

KAUST Repository

Ballal, Tarig

2014-09-01

In this paper, a low-sampling-rate scheme for ultra-wideband channel estimation is proposed. The scheme exploits multiple observations generated by transmitting multiple pulses. In the proposed scheme, P pulses are transmitted to produce channel impulse response estimates at a desired sampling rate, while the ADC samples at a rate that is P times slower. To avoid loss of fidelity, the number of sampling periods (based on the desired rate) in the inter-pulse interval is restricted to be co-prime with P. This condition is affected when clock drift is present and the transmitted pulse locations change. To handle this case, and to achieve an overall good channel estimation performance, without using prior information, we derive an improved estimator based on the bounded data uncertainty (BDU) model. It is shown that this estimator is related to the Bayesian linear minimum mean squared error (LMMSE) estimator. Channel estimation performance of the proposed sub-sampling scheme combined with the new estimator is assessed in simulation. The results show that high reduction in sampling rate can be achieved. The proposed estimator outperforms the least squares estimator in almost all cases, while in the high SNR regime it also outperforms the LMMSE estimator. In addition to channel estimation, a synchronization method is also proposed that utilizes the same pulse sequence used for channel estimation. © 2014 IEEE.
Hybrid nested sampling algorithm for Bayesian model selection applied to inverse subsurface flow problems

KAUST Repository

Elsheikh, Ahmed H.; Wheeler, Mary Fanett; Hoteit, Ibrahim

2014-01-01

A Hybrid Nested Sampling (HNS) algorithm is proposed for efficient Bayesian model calibration and prior model selection. The proposed algorithm combines, Nested Sampling (NS) algorithm, Hybrid Monte Carlo (HMC) sampling and gradient estimation using
An Improved Nested Sampling Algorithm for Model Selection and Assessment

Science.gov (United States)

Zeng, X.; Ye, M.; Wu, J.; WANG, D.

2017-12-01

Multimodel strategy is a general approach for treating model structure uncertainty in recent researches. The unknown groundwater system is represented by several plausible conceptual models. Each alternative conceptual model is attached with a weight which represents the possibility of this model. In Bayesian framework, the posterior model weight is computed as the product of model prior weight and marginal likelihood (or termed as model evidence). As a result, estimating marginal likelihoods is crucial for reliable model selection and assessment in multimodel analysis. Nested sampling estimator (NSE) is a new proposed algorithm for marginal likelihood estimation. The implementation of NSE comprises searching the parameters' space from low likelihood area to high likelihood area gradually, and this evolution is finished iteratively via local sampling procedure. Thus, the efficiency of NSE is dominated by the strength of local sampling procedure. Currently, Metropolis-Hasting (M-H) algorithm and its variants are often used for local sampling in NSE. However, M-H is not an efficient sampling algorithm for high-dimensional or complex likelihood function. For improving the performance of NSE, it could be feasible to integrate more efficient and elaborated sampling algorithm - DREAMzs into the local sampling. In addition, in order to overcome the computation burden problem of large quantity of repeating model executions in marginal likelihood estimation, an adaptive sparse grid stochastic collocation method is used to build the surrogates for original groundwater model.
Observed Characteristics and Teacher Quality: Impacts of Sample Selection on a Value Added Model

Science.gov (United States)

Winters, Marcus A.; Dixon, Bruce L.; Greene, Jay P.

2012-01-01

We measure the impact of observed teacher characteristics on student math and reading proficiency using a rich dataset from Florida. We expand upon prior work by accounting directly for nonrandom attrition of teachers from the classroom in a sample selection framework. We find evidence that sample selection is present in the estimation of the…
Nested sampling algorithm for subsurface flow model selection, uncertainty quantification, and nonlinear calibration

KAUST Repository

Elsheikh, A. H.

2013-12-01

Calibration of subsurface flow models is an essential step for managing ground water aquifers, designing of contaminant remediation plans, and maximizing recovery from hydrocarbon reservoirs. We investigate an efficient sampling algorithm known as nested sampling (NS), which can simultaneously sample the posterior distribution for uncertainty quantification, and estimate the Bayesian evidence for model selection. Model selection statistics, such as the Bayesian evidence, are needed to choose or assign different weights to different models of different levels of complexities. In this work, we report the first successful application of nested sampling for calibration of several nonlinear subsurface flow problems. The estimated Bayesian evidence by the NS algorithm is used to weight different parameterizations of the subsurface flow models (prior model selection). The results of the numerical evaluation implicitly enforced Occam\\'s razor where simpler models with fewer number of parameters are favored over complex models. The proper level of model complexity was automatically determined based on the information content of the calibration data and the data mismatch of the calibrated model.
Estimation of breeding values using selected pedigree records.

Science.gov (United States)

Morton, Richard; Howarth, Jordan M

2005-06-01

Fish bred in tanks or ponds cannot be easily tagged individually. The parentage of any individual may be determined by DNA fingerprinting, but is sufficiently expensive that large numbers cannot be so finger-printed. The measurement of the objective trait can be made on a much larger sample relatively cheaply. This article deals with experimental designs for selecting individuals to be finger-printed and for the estimation of the individual and family breeding values. The general setup provides estimates for both genetic effects regarded as fixed or random and for fixed effects due to known regressors. The family effects can be well estimated when even very small numbers are finger-printed, provided that they are the individuals with the most extreme phenotypes.
Estimating uncertainty in multivariate responses to selection.

Science.gov (United States)

Stinchcombe, John R; Simonsen, Anna K; Blows, Mark W

2014-04-01

Predicting the responses to natural selection is one of the key goals of evolutionary biology. Two of the challenges in fulfilling this goal have been the realization that many estimates of natural selection might be highly biased by environmentally induced covariances between traits and fitness, and that many estimated responses to selection do not incorporate or report uncertainty in the estimates. Here we describe the application of a framework that blends the merits of the Robertson-Price Identity approach and the multivariate breeder's equation to address these challenges. The approach allows genetic covariance matrices, selection differentials, selection gradients, and responses to selection to be estimated without environmentally induced bias, direct and indirect selection and responses to selection to be distinguished, and if implemented in a Bayesian-MCMC framework, statistically robust estimates of uncertainty on all of these parameters to be made. We illustrate our approach with a worked example of previously published data. More generally, we suggest that applying both the Robertson-Price Identity and the multivariate breeder's equation will facilitate hypothesis testing about natural selection, genetic constraints, and evolutionary responses. © 2013 The Author(s). Evolution © 2013 The Society for the Study of Evolution.
Selection of the Sample for Data-Driven $Z \\to \

CERN Document Server

Krauss, Martin

2009-01-01

The topic of this study was to improve the selection of the sample for data-driven Z → ν ν background estimation, which is a major contribution in supersymmetric searches in ̄ a no-lepton search mode. The data is based on Z → + − samples using data created with ATLAS simulation software. This method works if two leptons are reconstructed, but using cuts that are typical for SUSY searches reconstruction efficiency for electrons and muons is rather low. For this reason it was tried to enhance the data sample. Therefore events were considered, where only one electron was reconstructed. In this case the invariant mass for the electron and each jet was computed to select the jet with the best match for the Z boson mass as not reconstructed electron. This way the sample can be extended but significantly looses purity because of also reconstructed background events. To improve this method other variables have to be considered which were not available for this study. Applying a similar method to muons using ...
Comparison of Three Plot Selection Methods for Estimating Change in Temporally Variable, Spatially Clustered Populations.

Energy Technology Data Exchange (ETDEWEB)

Thompson, William L. [Bonneville Power Administration, Portland, OR (US). Environment, Fish and Wildlife

2001-07-01

Monitoring population numbers is important for assessing trends and meeting various legislative mandates. However, sampling across time introduces a temporal aspect to survey design in addition to the spatial one. For instance, a sample that is initially representative may lose this attribute if there is a shift in numbers and/or spatial distribution in the underlying population that is not reflected in later sampled plots. Plot selection methods that account for this temporal variability will produce the best trend estimates. Consequently, I used simulation to compare bias and relative precision of estimates of population change among stratified and unstratified sampling designs based on permanent, temporary, and partial replacement plots under varying levels of spatial clustering, density, and temporal shifting of populations. Permanent plots produced more precise estimates of change than temporary plots across all factors. Further, permanent plots performed better than partial replacement plots except for high density (5 and 10 individuals per plot) and 25% - 50% shifts in the population. Stratified designs always produced less precise estimates of population change for all three plot selection methods, and often produced biased change estimates and greatly inflated variance estimates under sampling with partial replacement. Hence, stratification that remains fixed across time should be avoided when monitoring populations that are likely to exhibit large changes in numbers and/or spatial distribution during the study period. Key words: bias; change estimation; monitoring; permanent plots; relative precision; sampling with partial replacement; temporary plots.
Climate Change and Agricultural Productivity in Sub-Saharan Africa: A Spatial Sample Selection Model

NARCIS (Netherlands)

Ward, P.S.; Florax, R.J.G.M.; Flores-Lagunes, A.

2014-01-01

Using spatially explicit data, we estimate a cereal yield response function using a recently developed estimator for spatial error models when endogenous sample selection is of concern. Our results suggest that yields across Sub-Saharan Africa will decline with projected climatic changes, and that
Design-based estimators for snowball sampling

OpenAIRE

Shafie, Termeh

2010-01-01

Snowball sampling, where existing study subjects recruit further subjects from amongtheir acquaintances, is a popular approach when sampling from hidden populations.Since people with many in-links are more likely to be selected, there will be a selectionbias in the samples obtained. In order to eliminate this bias, the sample data must beweighted. However, the exact selection probabilities are unknown for snowball samplesand need to be approximated in an appropriate way. This paper proposes d...
Principal Stratification in sample selection problems with non normal error terms

DEFF Research Database (Denmark)

Rocci, Roberto; Mellace, Giovanni

The aim of the paper is to relax distributional assumptions on the error terms, often imposed in parametric sample selection models to estimate causal effects, when plausible exclusion restrictions are not available. Within the principal stratification framework, we approximate the true distribut...... an application to the Job Corps training program....

40 CFR 89.507 - Sample selection.

Science.gov (United States)

2010-07-01

... Auditing § 89.507 Sample selection. (a) Engines comprising a test sample will be selected at the location...). However, once the manufacturer ships any test engine, it relinquishes the prerogative to conduct retests...
40 CFR 90.507 - Sample selection.

Science.gov (United States)

2010-07-01

... Auditing § 90.507 Sample selection. (a) Engines comprising a test sample will be selected at the location... manufacturer ships any test engine, it relinquishes the prerogative to conduct retests as provided in § 90.508...
An econometric method for estimating population parameters from non-random samples: An application to clinical case finding.

Science.gov (United States)

Burger, Rulof P; McLaren, Zoë M

2017-09-01

The problem of sample selection complicates the process of drawing inference about populations. Selective sampling arises in many real world situations when agents such as doctors and customs officials search for targets with high values of a characteristic. We propose a new method for estimating population characteristics from these types of selected samples. We develop a model that captures key features of the agent's sampling decision. We use a generalized method of moments with instrumental variables and maximum likelihood to estimate the population prevalence of the characteristic of interest and the agents' accuracy in identifying targets. We apply this method to tuberculosis (TB), which is the leading infectious disease cause of death worldwide. We use a national database of TB test data from South Africa to examine testing for multidrug resistant TB (MDR-TB). Approximately one quarter of MDR-TB cases was undiagnosed between 2004 and 2010. The official estimate of 2.5% is therefore too low, and MDR-TB prevalence is as high as 3.5%. Signal-to-noise ratios are estimated to be between 0.5 and 1. Our approach is widely applicable because of the availability of routinely collected data and abundance of potential instruments. Using routinely collected data to monitor population prevalence can guide evidence-based policy making. Copyright © 2017 John Wiley & Sons, Ltd.
Radiation dose estimates due to air particulate emissions from selected phosphate industry operations

International Nuclear Information System (INIS)

Partridge, J.E.; Horton, T.R.; Sensintaffar, E.L.; Boysen, G.A.

1978-06-01

The EPA Office of Radiation Programs has conducted a series of studies to determine the radiological impact of the phosphate mining and milling industry. This report describes the efforts to estimate the radiation doses due to airborne emissions of particulates from selected phosphate milling operations in Florida. Two wet process phosphoric acid plants and one ore drying facility were selected for this study. The 1976 Annual Operations/Emissions Report, submitted by each facility to the Florida Department of Environmental Regulation, and a field survey trip by EPA personnel to each facility were used to develop data for dose calculations. The field survey trip included sampling for stack emissions and ambient air samples collected in the general vicinity of each plant. Population and individual radiation dose estimates are made based on these sources of data
Heterogeneous Causal Effects and Sample Selection Bias

DEFF Research Database (Denmark)

Breen, Richard; Choi, Seongsoo; Holm, Anders

2015-01-01

The role of education in the process of socioeconomic attainment is a topic of long standing interest to sociologists and economists. Recently there has been growing interest not only in estimating the average causal effect of education on outcomes such as earnings, but also in estimating how...... causal effects might vary over individuals or groups. In this paper we point out one of the under-appreciated hazards of seeking to estimate heterogeneous causal effects: conventional selection bias (that is, selection on baseline differences) can easily be mistaken for heterogeneity of causal effects....... This might lead us to find heterogeneous effects when the true effect is homogenous, or to wrongly estimate not only the magnitude but also the sign of heterogeneous effects. We apply a test for the robustness of heterogeneous causal effects in the face of varying degrees and patterns of selection bias...
Optimal Bandwidth Selection for Kernel Density Functionals Estimation

Directory of Open Access Journals (Sweden)

Su Chen

2015-01-01

Full Text Available The choice of bandwidth is crucial to the kernel density estimation (KDE and kernel based regression. Various bandwidth selection methods for KDE and local least square regression have been developed in the past decade. It has been known that scale and location parameters are proportional to density functionals ∫γ(xf2(xdx with appropriate choice of γ(x and furthermore equality of scale and location tests can be transformed to comparisons of the density functionals among populations. ∫γ(xf2(xdx can be estimated nonparametrically via kernel density functionals estimation (KDFE. However, the optimal bandwidth selection for KDFE of ∫γ(xf2(xdx has not been examined. We propose a method to select the optimal bandwidth for the KDFE. The idea underlying this method is to search for the optimal bandwidth by minimizing the mean square error (MSE of the KDFE. Two main practical bandwidth selection techniques for the KDFE of ∫γ(xf2(xdx are provided: Normal scale bandwidth selection (namely, “Rule of Thumb” and direct plug-in bandwidth selection. Simulation studies display that our proposed bandwidth selection methods are superior to existing density estimation bandwidth selection methods in estimating density functionals.
Total arsenic in selected food samples from Argentina: Estimation of their contribution to inorganic arsenic dietary intake.

Science.gov (United States)

Sigrist, Mirna; Hilbe, Nandi; Brusa, Lucila; Campagnoli, Darío; Beldoménico, Horacio

2016-11-01

An optimized flow injection hydride generation atomic absorption spectroscopy (FI-HGAAS) method was used to determine total arsenic in selected food samples (beef, chicken, fish, milk, cheese, egg, rice, rice-based products, wheat flour, corn flour, oats, breakfast cereals, legumes and potatoes) and to estimate their contributions to inorganic arsenic dietary intake. The limit of detection (LOD) and limit of quantification (LOQ) values obtained were 6μgkg(-)(1) and 18μgkg(-)(1), respectively. The mean recovery range obtained for all food at a fortification level of 200μgkg(-)(1) was 85-110%. Accuracy was evaluated using dogfish liver certified reference material (DOLT-3 NRC) for trace metals. The highest total arsenic concentrations (in μgkg(-)(1)) were found in fish (152-439), rice (87-316) and rice-based products (52-201). The contribution to inorganic arsenic (i-As) intake was calculated from the mean i-As content of each food (calculated by applying conversion factors to total arsenic data) and the mean consumption per day. The primary contributors to inorganic arsenic intake were wheat flour, including its proportion in wheat flour-based products (breads, pasta and cookies), followed by rice; both foods account for close to 53% and 17% of the intake, respectively. The i-As dietary intake, estimated as 10.7μgday(-)(1), was significantly lower than that from drinking water in vast regions of Argentina. Copyright © 2016 Elsevier Ltd. All rights reserved.
Vast Volatility Matrix Estimation using High Frequency Data for Portfolio Selection*

Science.gov (United States)

Fan, Jianqing; Li, Yingying; Yu, Ke

2012-01-01

Portfolio allocation with gross-exposure constraint is an effective method to increase the efficiency and stability of portfolios selection among a vast pool of assets, as demonstrated in Fan et al. (2011). The required high-dimensional volatility matrix can be estimated by using high frequency financial data. This enables us to better adapt to the local volatilities and local correlations among vast number of assets and to increase significantly the sample size for estimating the volatility matrix. This paper studies the volatility matrix estimation using high-dimensional high-frequency data from the perspective of portfolio selection. Specifically, we propose the use of “pairwise-refresh time” and “all-refresh time” methods based on the concept of “refresh time” proposed by Barndorff-Nielsen et al. (2008) for estimation of vast covariance matrix and compare their merits in the portfolio selection. We establish the concentration inequalities of the estimates, which guarantee desirable properties of the estimated volatility matrix in vast asset allocation with gross exposure constraints. Extensive numerical studies are made via carefully designed simulations. Comparing with the methods based on low frequency daily data, our methods can capture the most recent trend of the time varying volatility and correlation, hence provide more accurate guidance for the portfolio allocation in the next time period. The advantage of using high-frequency data is significant in our simulation and empirical studies, which consist of 50 simulated assets and 30 constituent stocks of Dow Jones Industrial Average index. PMID:23264708
Vast Volatility Matrix Estimation using High Frequency Data for Portfolio Selection.

Science.gov (United States)

Fan, Jianqing; Li, Yingying; Yu, Ke

2012-01-01

Portfolio allocation with gross-exposure constraint is an effective method to increase the efficiency and stability of portfolios selection among a vast pool of assets, as demonstrated in Fan et al. (2011). The required high-dimensional volatility matrix can be estimated by using high frequency financial data. This enables us to better adapt to the local volatilities and local correlations among vast number of assets and to increase significantly the sample size for estimating the volatility matrix. This paper studies the volatility matrix estimation using high-dimensional high-frequency data from the perspective of portfolio selection. Specifically, we propose the use of "pairwise-refresh time" and "all-refresh time" methods based on the concept of "refresh time" proposed by Barndorff-Nielsen et al. (2008) for estimation of vast covariance matrix and compare their merits in the portfolio selection. We establish the concentration inequalities of the estimates, which guarantee desirable properties of the estimated volatility matrix in vast asset allocation with gross exposure constraints. Extensive numerical studies are made via carefully designed simulations. Comparing with the methods based on low frequency daily data, our methods can capture the most recent trend of the time varying volatility and correlation, hence provide more accurate guidance for the portfolio allocation in the next time period. The advantage of using high-frequency data is significant in our simulation and empirical studies, which consist of 50 simulated assets and 30 constituent stocks of Dow Jones Industrial Average index.
Population genetics inference for longitudinally-sampled mutants under strong selection.

Science.gov (United States)

Lacerda, Miguel; Seoighe, Cathal

2014-11-01

Longitudinal allele frequency data are becoming increasingly prevalent. Such samples permit statistical inference of the population genetics parameters that influence the fate of mutant variants. To infer these parameters by maximum likelihood, the mutant frequency is often assumed to evolve according to the Wright-Fisher model. For computational reasons, this discrete model is commonly approximated by a diffusion process that requires the assumption that the forces of natural selection and mutation are weak. This assumption is not always appropriate. For example, mutations that impart drug resistance in pathogens may evolve under strong selective pressure. Here, we present an alternative approximation to the mutant-frequency distribution that does not make any assumptions about the magnitude of selection or mutation and is much more computationally efficient than the standard diffusion approximation. Simulation studies are used to compare the performance of our method to that of the Wright-Fisher and Gaussian diffusion approximations. For large populations, our method is found to provide a much better approximation to the mutant-frequency distribution when selection is strong, while all three methods perform comparably when selection is weak. Importantly, maximum-likelihood estimates of the selection coefficient are severely attenuated when selection is strong under the two diffusion models, but not when our method is used. This is further demonstrated with an application to mutant-frequency data from an experimental study of bacteriophage evolution. We therefore recommend our method for estimating the selection coefficient when the effective population size is too large to utilize the discrete Wright-Fisher model. Copyright © 2014 by the Genetics Society of America.
Optimal complex exponentials BEM and channel estimation in doubly selective channel

International Nuclear Information System (INIS)

Song, Lijun; Lei, Xia; Yu, Feng; Jin, Maozhu

2016-01-01

Over doubly selective channel, the optimal complex exponentials BEM (CE-BEM) is required to characterize the transmission in transform domain in order to reducing the huge number of the estimated parameters during directly estimating the impulse response in time domain. This paper proposed an improved CE-BEM to alleviating the high frequency sampling error caused by conventional CE-BEM. On the one hand, exploiting the improved CE-BEM, we achieve the sampling point is in the Doppler spread spectrum and the maximum sampling frequency is equal to the maximum Doppler shift. On the other hand we optimize the function and dimension of basis in CE-BEM respectively ,and obtain the closed solution of the EM based channel estimation differential operator by exploiting the above optimal BEM. Finally, the numerical results and theoretic analysis show that the dimension of basis is mainly depend on the maximum Doppler shift and signal-to-noise ratio (SNR), and if fixing the number of the pilot symbol, the dimension of basis is higher, the modeling error is smaller, while the accuracy of the parameter estimation is reduced, which implies that we need to achieve a tradeoff between the modeling error and the accuracy of the parameter estimation and the basis function influences the accuracy of describing the Doppler spread spectrum after identifying the dimension of the basis.
Estimation of a multivariate mean under model selection uncertainty

Directory of Open Access Journals (Sweden)

Georges Nguefack-Tsague

2014-05-01

Full Text Available Model selection uncertainty would occur if we selected a model based on one data set and subsequently applied it for statistical inferences, because the "correct" model would not be selected with certainty. When the selection and inference are based on the same dataset, some additional problems arise due to the correlation of the two stages (selection and inference. In this paper model selection uncertainty is considered and model averaging is proposed. The proposal is related to the theory of James and Stein of estimating more than three parameters from independent normal observations. We suggest that a model averaging scheme taking into account the selection procedure could be more appropriate than model selection alone. Some properties of this model averaging estimator are investigated; in particular we show using Stein's results that it is a minimax estimator and can outperform Stein-type estimators.
Sample Selection for Training Cascade Detectors.

Science.gov (United States)

Vállez, Noelia; Deniz, Oscar; Bueno, Gloria

2015-01-01

Automatic detection systems usually require large and representative training datasets in order to obtain good detection and false positive rates. Training datasets are such that the positive set has few samples and/or the negative set should represent anything except the object of interest. In this respect, the negative set typically contains orders of magnitude more images than the positive set. However, imbalanced training databases lead to biased classifiers. In this paper, we focus our attention on a negative sample selection method to properly balance the training data for cascade detectors. The method is based on the selection of the most informative false positive samples generated in one stage to feed the next stage. The results show that the proposed cascade detector with sample selection obtains on average better partial AUC and smaller standard deviation than the other compared cascade detectors.
Estimating the encounter rate variance in distance sampling

Science.gov (United States)

Fewster, R.M.; Buckland, S.T.; Burnham, K.P.; Borchers, D.L.; Jupp, P.E.; Laake, J.L.; Thomas, L.

2009-01-01

The dominant source of variance in line transect sampling is usually the encounter rate variance. Systematic survey designs are often used to reduce the true variability among different realizations of the design, but estimating the variance is difficult and estimators typically approximate the variance by treating the design as a simple random sample of lines. We explore the properties of different encounter rate variance estimators under random and systematic designs. We show that a design-based variance estimator improves upon the model-based estimator of Buckland et al. (2001, Introduction to Distance Sampling. Oxford: Oxford University Press, p. 79) when transects are positioned at random. However, if populations exhibit strong spatial trends, both estimators can have substantial positive bias under systematic designs. We show that poststratification is effective in reducing this bias. ?? 2008, The International Biometric Society.
Sample Selection for Training Cascade Detectors.

Directory of Open Access Journals (Sweden)

Noelia Vállez

Full Text Available Automatic detection systems usually require large and representative training datasets in order to obtain good detection and false positive rates. Training datasets are such that the positive set has few samples and/or the negative set should represent anything except the object of interest. In this respect, the negative set typically contains orders of magnitude more images than the positive set. However, imbalanced training databases lead to biased classifiers. In this paper, we focus our attention on a negative sample selection method to properly balance the training data for cascade detectors. The method is based on the selection of the most informative false positive samples generated in one stage to feed the next stage. The results show that the proposed cascade detector with sample selection obtains on average better partial AUC and smaller standard deviation than the other compared cascade detectors.
Sample Size Calculations for Population Size Estimation Studies Using Multiplier Methods With Respondent-Driven Sampling Surveys.

Science.gov (United States)

Fearon, Elizabeth; Chabata, Sungai T; Thompson, Jennifer A; Cowan, Frances M; Hargreaves, James R

2017-09-14

While guidance exists for obtaining population size estimates using multiplier methods with respondent-driven sampling surveys, we lack specific guidance for making sample size decisions. To guide the design of multiplier method population size estimation studies using respondent-driven sampling surveys to reduce the random error around the estimate obtained. The population size estimate is obtained by dividing the number of individuals receiving a service or the number of unique objects distributed (M) by the proportion of individuals in a representative survey who report receipt of the service or object (P). We have developed an approach to sample size calculation, interpreting methods to estimate the variance around estimates obtained using multiplier methods in conjunction with research into design effects and respondent-driven sampling. We describe an application to estimate the number of female sex workers in Harare, Zimbabwe. There is high variance in estimates. Random error around the size estimate reflects uncertainty from M and P, particularly when the estimate of P in the respondent-driven sampling survey is low. As expected, sample size requirements are higher when the design effect of the survey is assumed to be greater. We suggest a method for investigating the effects of sample size on the precision of a population size estimate obtained using multipler methods and respondent-driven sampling. Uncertainty in the size estimate is high, particularly when P is small, so balancing against other potential sources of bias, we advise researchers to consider longer service attendance reference periods and to distribute more unique objects, which is likely to result in a higher estimate of P in the respondent-driven sampling survey. ©Elizabeth Fearon, Sungai T Chabata, Jennifer A Thompson, Frances M Cowan, James R Hargreaves. Originally published in JMIR Public Health and Surveillance (http://publichealth.jmir.org), 14.09.2017.
Estimation of the ancestral effective population sizes of African great apes under different selection regimes.

Science.gov (United States)

Schrago, Carlos G

2014-08-01

Reliable estimates of ancestral effective population sizes are necessary to unveil the population-level phenomena that shaped the phylogeny and molecular evolution of the African great apes. Although several methods have previously been applied to infer ancestral effective population sizes, an analysis of the influence of the selective regime on the estimates of ancestral demography has not been thoroughly conducted. In this study, three independent data sets under different selective regimes were used were composed to tackle this issue. The results showed that selection had a significant impact on the estimates of ancestral effective population sizes of the African great apes. The inference of the ancestral demography of African great apes was affected by the selection regime. The effects, however, were not homogeneous along the ancestral populations of great apes. The effective population size of the ancestor of humans and chimpanzees was more impacted by the selection regime when compared to the same parameter in the ancestor of humans, chimpanzees and gorillas. Because the selection regime influenced the estimates of ancestral effective population size, it is reasonable to assume that a portion of the discrepancy found in previous studies that inferred the ancestral effective population size may be attributable to the differential action of selection on the genes sampled.
Indirect estimation of signal-dependent noise with nonadaptive heterogeneous samples.

Science.gov (United States)

Azzari, Lucio; Foi, Alessandro

2014-08-01

We consider the estimation of signal-dependent noise from a single image. Unlike conventional algorithms that build a scatterplot of local mean-variance pairs from either small or adaptively selected homogeneous data samples, our proposed approach relies on arbitrarily large patches of heterogeneous data extracted at random from the image. We demonstrate the feasibility of our approach through an extensive theoretical analysis based on mixture of Gaussian distributions. A prototype algorithm is also developed in order to validate the approach on simulated data as well as on real camera raw images.
On efficiency of some ratio estimators in double sampling design ...

African Journals Online (AJOL)

In this paper, three sampling ratio estimators in double sampling design were proposed with the intention of finding an alternative double sampling design estimator to the conventional ratio estimator in double sampling design discussed by Cochran (1997), Okafor (2002) , Raj (1972) and Raj and Chandhok (1999).
Estimation of sample size and testing power (Part 4).

Science.gov (United States)

Hu, Liang-ping; Bao, Xiao-lei; Guan, Xue; Zhou, Shi-guo

2012-01-01

Sample size estimation is necessary for any experimental or survey research. An appropriate estimation of sample size based on known information and statistical knowledge is of great significance. This article introduces methods of sample size estimation of difference test for data with the design of one factor with two levels, including sample size estimation formulas and realization based on the formulas and the POWER procedure of SAS software for quantitative data and qualitative data with the design of one factor with two levels. In addition, this article presents examples for analysis, which will play a leading role for researchers to implement the repetition principle during the research design phase.

UNLABELED SELECTED SAMPLES IN FEATURE EXTRACTION FOR CLASSIFICATION OF HYPERSPECTRAL IMAGES WITH LIMITED TRAINING SAMPLES

Directory of Open Access Journals (Sweden)

A. Kianisarkaleh

2015-12-01

Full Text Available Feature extraction plays a key role in hyperspectral images classification. Using unlabeled samples, often unlimitedly available, unsupervised and semisupervised feature extraction methods show better performance when limited number of training samples exists. This paper illustrates the importance of selecting appropriate unlabeled samples that used in feature extraction methods. Also proposes a new method for unlabeled samples selection using spectral and spatial information. The proposed method has four parts including: PCA, prior classification, posterior classification and sample selection. As hyperspectral image passes these parts, selected unlabeled samples can be used in arbitrary feature extraction methods. The effectiveness of the proposed unlabeled selected samples in unsupervised and semisupervised feature extraction is demonstrated using two real hyperspectral datasets. Results show that through selecting appropriate unlabeled samples, the proposed method can improve the performance of feature extraction methods and increase classification accuracy.
Numerically stable algorithm for combining census and sample estimates with the multivariate composite estimator

Science.gov (United States)

R. L. Czaplewski

2009-01-01

The minimum variance multivariate composite estimator is a relatively simple sequential estimator for complex sampling designs (Czaplewski 2009). Such designs combine a probability sample of expensive field data with multiple censuses and/or samples of relatively inexpensive multi-sensor, multi-resolution remotely sensed data. Unfortunately, the multivariate composite...
Estimation of population mean under systematic sampling

Science.gov (United States)

Noor-ul-amin, Muhammad; Javaid, Amjad

2017-11-01

In this study we propose a generalized ratio estimator under non-response for systematic random sampling. We also generate a class of estimators through special cases of generalized estimator using different combinations of coefficients of correlation, kurtosis and variation. The mean square errors and mathematical conditions are also derived to prove the efficiency of proposed estimators. Numerical illustration is included using three populations to support the results.
Network Structure and Biased Variance Estimation in Respondent Driven Sampling.

Science.gov (United States)

Verdery, Ashton M; Mouw, Ted; Bauldry, Shawn; Mucha, Peter J

2015-01-01

This paper explores bias in the estimation of sampling variance in Respondent Driven Sampling (RDS). Prior methodological work on RDS has focused on its problematic assumptions and the biases and inefficiencies of its estimators of the population mean. Nonetheless, researchers have given only slight attention to the topic of estimating sampling variance in RDS, despite the importance of variance estimation for the construction of confidence intervals and hypothesis tests. In this paper, we show that the estimators of RDS sampling variance rely on a critical assumption that the network is First Order Markov (FOM) with respect to the dependent variable of interest. We demonstrate, through intuitive examples, mathematical generalizations, and computational experiments that current RDS variance estimators will always underestimate the population sampling variance of RDS in empirical networks that do not conform to the FOM assumption. Analysis of 215 observed university and school networks from Facebook and Add Health indicates that the FOM assumption is violated in every empirical network we analyze, and that these violations lead to substantially biased RDS estimators of sampling variance. We propose and test two alternative variance estimators that show some promise for reducing biases, but which also illustrate the limits of estimating sampling variance with only partial information on the underlying population social network.
Acrylamide exposure among Turkish toddlers from selected cereal-based baby food samples.

Science.gov (United States)

Cengiz, Mehmet Fatih; Gündüz, Cennet Pelin Boyacı

2013-10-01

In this study, acrylamide exposure from selected cereal-based baby food samples was investigated among toddlers aged 1-3 years in Turkey. The study contained three steps. The first step was collecting food consumption data and toddlers' physical properties, such as gender, age and body weight, using a questionnaire given to parents by a trained interviewer between January and March 2012. The second step was determining the acrylamide levels in food samples that were reported on by the parents in the questionnaire, using a gas chromatography-mass spectrometry (GC-MS) method. The last step was combining the determined acrylamide levels in selected food samples with individual food consumption and body weight data using a deterministic approach to estimate the acrylamide exposure levels. The mean acrylamide levels of baby biscuits, breads, baby bread-rusks, crackers, biscuits, breakfast cereals and powdered cereal-based baby foods were 153, 225, 121, 604, 495, 290 and 36 μg/kg, respectively. The minimum, mean and maximum acrylamide exposures were estimated to be 0.06, 1.43 and 6.41 μg/kg BW per day, respectively. The foods that contributed to acrylamide exposure were aligned from high to low as bread, crackers, biscuits, baby biscuits, powdered cereal-based baby foods, baby bread-rusks and breakfast cereals. Copyright © 2013 Elsevier Ltd. All rights reserved.
An online method for lithium-ion battery remaining useful life estimation using importance sampling and neural networks

International Nuclear Information System (INIS)

Wu, Ji; Zhang, Chenbin; Chen, Zonghai

2016-01-01

Highlights: • An online RUL estimation method for lithium-ion battery is proposed. • RUL is described by the difference among battery terminal voltage curves. • A feed forward neural network is employed for RUL estimation. • Importance sampling is utilized to select feed forward neural network inputs. - Abstract: An accurate battery remaining useful life (RUL) estimation can facilitate the design of a reliable battery system as well as the safety and reliability of actual operation. A reasonable definition and an effective prediction algorithm are indispensable for the achievement of an accurate RUL estimation result. In this paper, the analysis of battery terminal voltage curves under different cycle numbers during charge process is utilized for RUL definition. Moreover, the relationship between RUL and charge curve is simulated by feed forward neural network (FFNN) for its simplicity and effectiveness. Considering the nonlinearity of lithium-ion charge curve, importance sampling (IS) is employed for FFNN input selection. Based on these results, an online approach using FFNN and IS is presented to estimate lithium-ion battery RUL in this paper. Experiments and numerical comparisons are conducted to validate the proposed method. The results show that the FFNN with IS is an accurate estimation method for actual operation.
National HIV prevalence estimates for sub-Saharan Africa: controlling selection bias with Heckman-type selection models

Science.gov (United States)

Hogan, Daniel R; Salomon, Joshua A; Canning, David; Hammitt, James K; Zaslavsky, Alan M; Bärnighausen, Till

2012-01-01

Objectives Population-based HIV testing surveys have become central to deriving estimates of national HIV prevalence in sub-Saharan Africa. However, limited participation in these surveys can lead to selection bias. We control for selection bias in national HIV prevalence estimates using a novel approach, which unlike conventional imputation can account for selection on unobserved factors. Methods For 12 Demographic and Health Surveys conducted from 2001 to 2009 (N=138 300), we predict HIV status among those missing a valid HIV test with Heckman-type selection models, which allow for correlation between infection status and participation in survey HIV testing. We compare these estimates with conventional ones and introduce a simulation procedure that incorporates regression model parameter uncertainty into confidence intervals. Results Selection model point estimates of national HIV prevalence were greater than unadjusted estimates for 10 of 12 surveys for men and 11 of 12 surveys for women, and were also greater than the majority of estimates obtained from conventional imputation, with significantly higher HIV prevalence estimates for men in Cote d'Ivoire 2005, Mali 2006 and Zambia 2007. Accounting for selective non-participation yielded 95% confidence intervals around HIV prevalence estimates that are wider than those obtained with conventional imputation by an average factor of 4.5. Conclusions Our analysis indicates that national HIV prevalence estimates for many countries in sub-Saharan African are more uncertain than previously thought, and may be underestimated in several cases, underscoring the need for increasing participation in HIV surveys. Heckman-type selection models should be included in the set of tools used for routine estimation of HIV prevalence. PMID:23172342
An Improvement to Interval Estimation for Small Samples

Directory of Open Access Journals (Sweden)

SUN Hui-Ling

2017-02-01

Full Text Available Because it is difficult and complex to determine the probability distribution of small samples，it is improper to use traditional probability theory to process parameter estimation for small samples. Bayes Bootstrap method is always used in the project. Although，the Bayes Bootstrap method has its own limitation，In this article an improvement is given to the Bayes Bootstrap method，This method extended the amount of samples by numerical simulation without changing the circumstances in a small sample of the original sample. And the new method can give the accurate interval estimation for the small samples. Finally，by using the Monte Carlo simulation to model simulation to the specific small sample problems. The effectiveness and practicability of the Improved-Bootstrap method was proved.
A geostatistical estimation of zinc grade in bore-core samples

International Nuclear Information System (INIS)

Starzec, A.

1987-01-01

Possibilities and preliminary results of geostatistical interpretation of the XRF determination of zinc in bore-core samples are considered. For the spherical model of the variogram the estimation variance of grade in a disk-shape sample (estimated from the grade on the circumference sample) is calculated. Variograms of zinc grade in core samples are presented and examples of the grade estimation are discussed. 4 refs., 7 figs., 1 tab. (author)
Estimation variance bounds of importance sampling simulations in digital communication systems

Science.gov (United States)

Lu, D.; Yao, K.

1991-01-01

In practical applications of importance sampling (IS) simulation, two basic problems are encountered, that of determining the estimation variance and that of evaluating the proper IS parameters needed in the simulations. The authors derive new upper and lower bounds on the estimation variance which are applicable to IS techniques. The upper bound is simple to evaluate and may be minimized by the proper selection of the IS parameter. Thus, lower and upper bounds on the improvement ratio of various IS techniques relative to the direct Monte Carlo simulation are also available. These bounds are shown to be useful and computationally simple to obtain. Based on the proposed technique, one can readily find practical suboptimum IS parameters. Numerical results indicate that these bounding techniques are useful for IS simulations of linear and nonlinear communication systems with intersymbol interference in which bit error rate and IS estimation variances cannot be obtained readily using prior techniques.
Reliability of different sampling densities for estimating and mapping lichen diversity in biomonitoring studies

International Nuclear Information System (INIS)

Ferretti, M.; Brambilla, E.; Brunialti, G.; Fornasier, F.; Mazzali, C.; Giordani, P.; Nimis, P.L.

2004-01-01

Sampling requirements related to lichen biomonitoring include optimal sampling density for obtaining precise and unbiased estimates of population parameters and maps of known reliability. Two available datasets on a sub-national scale in Italy were used to determine a cost-effective sampling density to be adopted in medium-to-large-scale biomonitoring studies. As expected, the relative error in the mean Lichen Biodiversity (Italian acronym: BL) values and the error associated with the interpolation of BL values for (unmeasured) grid cells increased as the sampling density decreased. However, the increase in size of the error was not linear and even a considerable reduction (up to 50%) in the original sampling effort led to a far smaller increase in errors in the mean estimates (<6%) and in mapping (<18%) as compared with the original sampling densities. A reduction in the sampling effort can result in considerable savings of resources, which can then be used for a more detailed investigation of potentially problematic areas. It is, however, necessary to decide the acceptable level of precision at the design stage of the investigation, so as to select the proper sampling density. - An acceptable level of precision must be decided before determining a sampling design
Poisson sampling - The adjusted and unadjusted estimator revisited

Science.gov (United States)

Michael S. Williams; Hans T. Schreuder; Gerardo H. Terrazas

1998-01-01

The prevailing assumption, that for Poisson sampling the adjusted estimator "Y-hat a" is always substantially more efficient than the unadjusted estimator "Y-hat u" , is shown to be incorrect. Some well known theoretical results are applicable since "Y-hat a" is a ratio-of-means estimator and "Y-hat u" a simple unbiased estimator...
VSRR - Quarterly provisional estimates for selected birth indicators

Data.gov (United States)

U.S. Department of Health & Human Services — Provisional estimates of selected reproductive indicators. Estimates are presented for: general fertility rates, age-specific birth rates, total and low risk...
Detecting spatial structures in throughfall data: The effect of extent, sample size, sampling design, and variogram estimation method

Science.gov (United States)

Voss, Sebastian; Zimmermann, Beate; Zimmermann, Alexander

2016-09-01

In the last decades, an increasing number of studies analyzed spatial patterns in throughfall by means of variograms. The estimation of the variogram from sample data requires an appropriate sampling scheme: most importantly, a large sample and a layout of sampling locations that often has to serve both variogram estimation and geostatistical prediction. While some recommendations on these aspects exist, they focus on Gaussian data and high ratios of the variogram range to the extent of the study area. However, many hydrological data, and throughfall data in particular, do not follow a Gaussian distribution. In this study, we examined the effect of extent, sample size, sampling design, and calculation method on variogram estimation of throughfall data. For our investigation, we first generated non-Gaussian random fields based on throughfall data with large outliers. Subsequently, we sampled the fields with three extents (plots with edge lengths of 25 m, 50 m, and 100 m), four common sampling designs (two grid-based layouts, transect and random sampling) and five sample sizes (50, 100, 150, 200, 400). We then estimated the variogram parameters by method-of-moments (non-robust and robust estimators) and residual maximum likelihood. Our key findings are threefold. First, the choice of the extent has a substantial influence on the estimation of the variogram. A comparatively small ratio of the extent to the correlation length is beneficial for variogram estimation. Second, a combination of a minimum sample size of 150, a design that ensures the sampling of small distances and variogram estimation by residual maximum likelihood offers a good compromise between accuracy and efficiency. Third, studies relying on method-of-moments based variogram estimation may have to employ at least 200 sampling points for reliable variogram estimates. These suggested sample sizes exceed the number recommended by studies dealing with Gaussian data by up to 100 %. Given that most previous
Procedure manual for the estimation of average indoor radon-daughter concentrations using the radon grab-sampling method

International Nuclear Information System (INIS)

George, J.L.

1986-04-01

The US Department of Energy (DOE) Office of Remedial Action and Waste Technology established the Technical Measurements Center to provide standardization, calibration, comparability, verification of data, quality assurance, and cost-effectiveness for the measurement requirements of DOE remedial action programs. One of the remedial-action measurement needs is the estimation of average indoor radon-daughter concentration. One method for accomplishing such estimations in support of DOE remedial action programs is the radon grab-sampling method. This manual describes procedures for radon grab sampling, with the application specifically directed to the estimation of average indoor radon-daughter concentration (RDC) in highly ventilated structures. This particular application of the measurement method is for cases where RDC estimates derived from long-term integrated measurements under occupied conditions are below the standard and where the structure being evaluated is considered to be highly ventilated. The radon grab-sampling method requires that sampling be conducted under standard maximized conditions. Briefly, the procedure for radon grab sampling involves the following steps: selection of sampling and counting equipment; sample acquisition and processing, including data reduction; calibration of equipment, including provisions to correct for pressure effects when sampling at various elevations; and incorporation of quality-control and assurance measures. This manual describes each of the above steps in detail and presents an example of a step-by-step radon grab-sampling procedure using a scintillation cell
Lifetime ultraviolet exposure estimates for selected population groups in south-east Queensland

International Nuclear Information System (INIS)

Parisi, A.V.; Meldrum, L.R.; Wong, J.C.F.; Fleming, R.A.; Aitken, J.

1999-01-01

The lifetime erythemal UV exposures received by selected population groups in south-east Queensland from birth up to an age of 55 years have been quantitatively estimated. A representative sample of teachers and other school workers received (64±22)x10 5 J m -2 to the neck compared with (4.1±1.4)x10 5 Jm -2 to the upper leg. A sample of indoor workers (bank officers, solicitors and psychologists) received approximately 2% less and a sample of outdoor workers (carpenters, tilers, electricians and labourers) received approximately 10% more to the neck than the school workers. These differences in erythemal UV exposures may influence the risk of non-melanoma skin cancer. (author)
Power Spectrum Estimation of Randomly Sampled Signals

DEFF Research Database (Denmark)

Velte, C. M.; Buchhave, P.; K. George, W.

algorithms; sample and-hold and the direct spectral estimator without residence time weighting. The computer generated signal is a Poisson process with a sample rate proportional to velocity magnitude that consist of well-defined frequency content, which makes bias easy to spot. The idea...
Transfer function design based on user selected samples for intuitive multivariate volume exploration

KAUST Repository

Zhou, Liang

2013-02-01

Multivariate volumetric datasets are important to both science and medicine. We propose a transfer function (TF) design approach based on user selected samples in the spatial domain to make multivariate volumetric data visualization more accessible for domain users. Specifically, the user starts the visualization by probing features of interest on slices and the data values are instantly queried by user selection. The queried sample values are then used to automatically and robustly generate high dimensional transfer functions (HDTFs) via kernel density estimation (KDE). Alternatively, 2D Gaussian TFs can be automatically generated in the dimensionality reduced space using these samples. With the extracted features rendered in the volume rendering view, the user can further refine these features using segmentation brushes. Interactivity is achieved in our system and different views are tightly linked. Use cases show that our system has been successfully applied for simulation and complicated seismic data sets. © 2013 IEEE.
Transfer function design based on user selected samples for intuitive multivariate volume exploration

KAUST Repository

Zhou, Liang; Hansen, Charles

2013-01-01

Multivariate volumetric datasets are important to both science and medicine. We propose a transfer function (TF) design approach based on user selected samples in the spatial domain to make multivariate volumetric data visualization more accessible for domain users. Specifically, the user starts the visualization by probing features of interest on slices and the data values are instantly queried by user selection. The queried sample values are then used to automatically and robustly generate high dimensional transfer functions (HDTFs) via kernel density estimation (KDE). Alternatively, 2D Gaussian TFs can be automatically generated in the dimensionality reduced space using these samples. With the extracted features rendered in the volume rendering view, the user can further refine these features using segmentation brushes. Interactivity is achieved in our system and different views are tightly linked. Use cases show that our system has been successfully applied for simulation and complicated seismic data sets. © 2013 IEEE.
Efficient estimation for ergodic diffusions sampled at high frequency

DEFF Research Database (Denmark)

Sørensen, Michael

A general theory of efficient estimation for ergodic diffusions sampled at high fre- quency is presented. High frequency sampling is now possible in many applications, in particular in finance. The theory is formulated in term of approximate martingale estimating functions and covers a large class...

The Impact of Selection, Gene Conversion, and Biased Sampling on the Assessment of Microbial Demography.

Science.gov (United States)

Lapierre, Marguerite; Blin, Camille; Lambert, Amaury; Achaz, Guillaume; Rocha, Eduardo P C

2016-07-01

Recent studies have linked demographic changes and epidemiological patterns in bacterial populations using coalescent-based approaches. We identified 26 studies using skyline plots and found that 21 inferred overall population expansion. This surprising result led us to analyze the impact of natural selection, recombination (gene conversion), and sampling biases on demographic inference using skyline plots and site frequency spectra (SFS). Forward simulations based on biologically relevant parameters from Escherichia coli populations showed that theoretical arguments on the detrimental impact of recombination and especially natural selection on the reconstructed genealogies cannot be ignored in practice. In fact, both processes systematically lead to spurious interpretations of population expansion in skyline plots (and in SFS for selection). Weak purifying selection, and especially positive selection, had important effects on skyline plots, showing patterns akin to those of population expansions. State-of-the-art techniques to remove recombination further amplified these biases. We simulated three common sampling biases in microbiological research: uniform, clustered, and mixed sampling. Alone, or together with recombination and selection, they further mislead demographic inferences producing almost any possible skyline shape or SFS. Interestingly, sampling sub-populations also affected skyline plots and SFS, because the coalescent rates of populations and their sub-populations had different distributions. This study suggests that extreme caution is needed to infer demographic changes solely based on reconstructed genealogies. We suggest that the development of novel sampling strategies and the joint analyzes of diverse population genetic methods are strictly necessary to estimate demographic changes in populations where selection, recombination, and biased sampling are present. © The Author 2016. Published by Oxford University Press on behalf of the Society for
Estimation of sample size and testing power (part 5).

Science.gov (United States)

Hu, Liang-ping; Bao, Xiao-lei; Guan, Xue; Zhou, Shi-guo

2012-02-01

Estimation of sample size and testing power is an important component of research design. This article introduced methods for sample size and testing power estimation of difference test for quantitative and qualitative data with the single-group design, the paired design or the crossover design. To be specific, this article introduced formulas for sample size and testing power estimation of difference test for quantitative and qualitative data with the above three designs, the realization based on the formulas and the POWER procedure of SAS software and elaborated it with examples, which will benefit researchers for implementing the repetition principle.
A model for estimating the minimum number of offspring to sample in studies of reproductive success.

Science.gov (United States)

Anderson, Joseph H; Ward, Eric J; Carlson, Stephanie M

2011-01-01

Molecular parentage permits studies of selection and evolution in fecund species with cryptic mating systems, such as fish, amphibians, and insects. However, there exists no method for estimating the number of offspring that must be assigned parentage to achieve robust estimates of reproductive success when only a fraction of offspring can be sampled. We constructed a 2-stage model that first estimated the mean (μ) and variance (v) in reproductive success from published studies on salmonid fishes and then sampled offspring from reproductive success distributions simulated from the μ and v estimates. Results provided strong support for modeling salmonid reproductive success via the negative binomial distribution and suggested that few offspring samples are needed to reject the null hypothesis of uniform offspring production. However, the sampled reproductive success distributions deviated significantly (χ(2) goodness-of-fit test p value reproductive success distribution at rates often >0.05 and as high as 0.24, even when hundreds of offspring were assigned parentage. In general, reproductive success patterns were less accurate when offspring were sampled from cohorts with larger numbers of parents and greater variance in reproductive success. Our model can be reparameterized with data from other species and will aid researchers in planning reproductive success studies by providing explicit sampling targets required to accurately assess reproductive success.
The effects of dominance, regular inbreeding and sampling design on Q(ST), an estimator of population differentiation for quantitative traits.

Science.gov (United States)

Goudet, Jérôme; Büchi, Lucie

2006-02-01

To test whether quantitative traits are under directional or homogenizing selection, it is common practice to compare population differentiation estimates at molecular markers (F(ST)) and quantitative traits (Q(ST)). If the trait is neutral and its determinism is additive, then theory predicts that Q(ST) = F(ST), while Q(ST) > F(ST) is predicted under directional selection for different local optima, and Q(ST) sampling designs and find that it is always best to sample many populations (>20) with few families (five) rather than few populations with many families. Provided that estimates of Q(ST) are derived from individuals originating from many populations, we conclude that the pattern Q(ST) > F(ST), and hence the inference of directional selection for different local optima, is robust to the effect of nonadditive gene actions.
Comparing interval estimates for small sample ordinal CFA models.

Science.gov (United States)

Natesan, Prathiba

2015-01-01

Robust maximum likelihood (RML) and asymptotically generalized least squares (AGLS) methods have been recommended for fitting ordinal structural equation models. Studies show that some of these methods underestimate standard errors. However, these studies have not investigated the coverage and bias of interval estimates. An estimate with a reasonable standard error could still be severely biased. This can only be known by systematically investigating the interval estimates. The present study compares Bayesian, RML, and AGLS interval estimates of factor correlations in ordinal confirmatory factor analysis models (CFA) for small sample data. Six sample sizes, 3 factor correlations, and 2 factor score distributions (multivariate normal and multivariate mildly skewed) were studied. Two Bayesian prior specifications, informative and relatively less informative were studied. Undercoverage of confidence intervals and underestimation of standard errors was common in non-Bayesian methods. Underestimated standard errors may lead to inflated Type-I error rates. Non-Bayesian intervals were more positive biased than negatively biased, that is, most intervals that did not contain the true value were greater than the true value. Some non-Bayesian methods had non-converging and inadmissible solutions for small samples and non-normal data. Bayesian empirical standard error estimates for informative and relatively less informative priors were closer to the average standard errors of the estimates. The coverage of Bayesian credibility intervals was closer to what was expected with overcoverage in a few cases. Although some Bayesian credibility intervals were wider, they reflected the nature of statistical uncertainty that comes with the data (e.g., small sample). Bayesian point estimates were also more accurate than non-Bayesian estimates. The results illustrate the importance of analyzing coverage and bias of interval estimates, and how ignoring interval estimates can be misleading
Improving the Network Scale-Up Estimator: Incorporating Means of Sums, Recursive Back Estimation, and Sampling Weights.

Directory of Open Access Journals (Sweden)

Patrick Habecker

Full Text Available Researchers interested in studying populations that are difficult to reach through traditional survey methods can now draw on a range of methods to access these populations. Yet many of these methods are more expensive and difficult to implement than studies using conventional sampling frames and trusted sampling methods. The network scale-up method (NSUM provides a middle ground for researchers who wish to estimate the size of a hidden population, but lack the resources to conduct a more specialized hidden population study. Through this method it is possible to generate population estimates for a wide variety of groups that are perhaps unwilling to self-identify as such (for example, users of illegal drugs or other stigmatized populations via traditional survey tools such as telephone or mail surveys--by asking a representative sample to estimate the number of people they know who are members of such a "hidden" subpopulation. The original estimator is formulated to minimize the weight a single scaling variable can exert upon the estimates. We argue that this introduces hidden and difficult to predict biases, and instead propose a series of methodological advances on the traditional scale-up estimation procedure, including a new estimator. Additionally, we formalize the incorporation of sample weights into the network scale-up estimation process, and propose a recursive process of back estimation "trimming" to identify and remove poorly performing predictors from the estimation process. To demonstrate these suggestions we use data from a network scale-up mail survey conducted in Nebraska during 2014. We find that using the new estimator and recursive trimming process provides more accurate estimates, especially when used in conjunction with sampling weights.
A probabilistic method for testing and estimating selection differences between populations.

Science.gov (United States)

He, Yungang; Wang, Minxian; Huang, Xin; Li, Ran; Xu, Hongyang; Xu, Shuhua; Jin, Li

2015-12-01

Human populations around the world encounter various environmental challenges and, consequently, develop genetic adaptations to different selection forces. Identifying the differences in natural selection between populations is critical for understanding the roles of specific genetic variants in evolutionary adaptation. Although numerous methods have been developed to detect genetic loci under recent directional selection, a probabilistic solution for testing and quantifying selection differences between populations is lacking. Here we report the development of a probabilistic method for testing and estimating selection differences between populations. By use of a probabilistic model of genetic drift and selection, we showed that logarithm odds ratios of allele frequencies provide estimates of the differences in selection coefficients between populations. The estimates approximate a normal distribution, and variance can be estimated using genome-wide variants. This allows us to quantify differences in selection coefficients and to determine the confidence intervals of the estimate. Our work also revealed the link between genetic association testing and hypothesis testing of selection differences. It therefore supplies a solution for hypothesis testing of selection differences. This method was applied to a genome-wide data analysis of Han and Tibetan populations. The results confirmed that both the EPAS1 and EGLN1 genes are under statistically different selection in Han and Tibetan populations. We further estimated differences in the selection coefficients for genetic variants involved in melanin formation and determined their confidence intervals between continental population groups. Application of the method to empirical data demonstrated the outstanding capability of this novel approach for testing and quantifying differences in natural selection. © 2015 He et al.; Published by Cold Spring Harbor Laboratory Press.
Estimation of creatinine in Urine sample by Jaffe's method

International Nuclear Information System (INIS)

Wankhede, Sonal; Arunkumar, Suja; Sawant, Pramilla D.; Rao, B.B.

2012-01-01

In-vitro bioassay monitoring is based on the determination of activity concentrations in biological samples excreted from the body and is most suitable for alpha and beta emitters. A truly representative bioassay sample is the one having all the voids collected during a 24-h period however, this being technically difficult, overnight urine samples collected by the workers are analyzed. These overnight urine samples are collected for 10-16 h, however in the absence of any specific information, 12 h duration is assumed and the observed results are then corrected accordingly obtain the daily excretion rate. To reduce the uncertainty due to unknown duration of sample collection, IAEA has recommended two methods viz., measurement of specific gravity and creatinine excretion rate in urine sample. Creatinine is a final metabolic product creatinine phosphate in the body and is excreted at a steady rate for people with normally functioning kidneys. It is, therefore, often used as a normalization factor for estimation of duration of sample collection. The present study reports the chemical procedure standardized and its application for the estimation of creatinine in urine samples collected from occupational workers. Chemical procedure for estimation of creatinine in bioassay samples was standardized and applied successfully for its estimation in bioassay samples collected from the workers. The creatinine excretion rate observed for these workers is lower than observed in literature. Further, work is in progress to generate a data bank of creatinine excretion rate for most of the workers and also to study the variability in creatinine coefficient for the same individual based on the analysis of samples collected for different duration
An unbiased estimator of the variance of simple random sampling using mixed random-systematic sampling

OpenAIRE

Padilla, Alberto

2009-01-01

Systematic sampling is a commonly used technique due to its simplicity and ease of implementation. The drawback of this simplicity is that it is not possible to estimate the design variance without bias. There are several ways to circumvent this problem. One method is to suppose that the variable of interest has a random order in the population, so the sample variance of simple random sampling without replacement is used. By means of a mixed random - systematic sample, an unbiased estimator o...
Comparison of Four Estimators under sampling without Replacement

African Journals Online (AJOL)

The results were obtained using a program written in Microsoft Visual C++ programming language. It was observed that the two-stage sampling under unequal probabilities without replacement is always better than the other three estimators considered. Keywords: Unequal probability sampling, two-stage sampling, ...
40 CFR 205.171-3 - Test motorcycle sample selection.

Science.gov (United States)

2010-07-01

... 40 Protection of Environment 24 2010-07-01 2010-07-01 false Test motorcycle sample selection. 205... ABATEMENT PROGRAMS TRANSPORTATION EQUIPMENT NOISE EMISSION CONTROLS Motorcycle Exhaust Systems § 205.171-3 Test motorcycle sample selection. A test motorcycle to be used for selective enforcement audit testing...
Estimating the sample mean and standard deviation from the sample size, median, range and/or interquartile range.

Science.gov (United States)

Wan, Xiang; Wang, Wenqian; Liu, Jiming; Tong, Tiejun

2014-12-19

In systematic reviews and meta-analysis, researchers often pool the results of the sample mean and standard deviation from a set of similar clinical trials. A number of the trials, however, reported the study using the median, the minimum and maximum values, and/or the first and third quartiles. Hence, in order to combine results, one may have to estimate the sample mean and standard deviation for such trials. In this paper, we propose to improve the existing literature in several directions. First, we show that the sample standard deviation estimation in Hozo et al.'s method (BMC Med Res Methodol 5:13, 2005) has some serious limitations and is always less satisfactory in practice. Inspired by this, we propose a new estimation method by incorporating the sample size. Second, we systematically study the sample mean and standard deviation estimation problem under several other interesting settings where the interquartile range is also available for the trials. We demonstrate the performance of the proposed methods through simulation studies for the three frequently encountered scenarios, respectively. For the first two scenarios, our method greatly improves existing methods and provides a nearly unbiased estimate of the true sample standard deviation for normal data and a slightly biased estimate for skewed data. For the third scenario, our method still performs very well for both normal data and skewed data. Furthermore, we compare the estimators of the sample mean and standard deviation under all three scenarios and present some suggestions on which scenario is preferred in real-world applications. In this paper, we discuss different approximation methods in the estimation of the sample mean and standard deviation and propose some new estimation methods to improve the existing literature. We conclude our work with a summary table (an Excel spread sheet including all formulas) that serves as a comprehensive guidance for performing meta-analysis in different
Selection bias in population-based cancer case-control studies due to incomplete sampling frame coverage.

Science.gov (United States)

Walsh, Matthew C; Trentham-Dietz, Amy; Gangnon, Ronald E; Nieto, F Javier; Newcomb, Polly A; Palta, Mari

2012-06-01

Increasing numbers of individuals are choosing to opt out of population-based sampling frames due to privacy concerns. This is especially a problem in the selection of controls for case-control studies, as the cases often arise from relatively complete population-based registries, whereas control selection requires a sampling frame. If opt out is also related to risk factors, bias can arise. We linked breast cancer cases who reported having a valid driver's license from the 2004-2008 Wisconsin women's health study (N = 2,988) with a master list of licensed drivers from the Wisconsin Department of Transportation (WDOT). This master list excludes Wisconsin drivers that requested their information not be sold by the state. Multivariate-adjusted selection probability ratios (SPR) were calculated to estimate potential bias when using this driver's license sampling frame to select controls. A total of 962 cases (32%) had opted out of the WDOT sampling frame. Cases age <40 (SPR = 0.90), income either unreported (SPR = 0.89) or greater than $50,000 (SPR = 0.94), lower parity (SPR = 0.96 per one-child decrease), and hormone use (SPR = 0.93) were significantly less likely to be covered by the WDOT sampling frame (α = 0.05 level). Our results indicate the potential for selection bias due to differential opt out between various demographic and behavioral subgroups of controls. As selection bias may differ by exposure and study base, the assessment of potential bias needs to be ongoing. SPRs can be used to predict the direction of bias when cases and controls stem from different sampling frames in population-based case-control studies.
Estimation of uranium in bioassay samples of occupational workers by laser fluorimetry

International Nuclear Information System (INIS)

Suja, A.; Prabhu, S.P.; Sawant, P.D.; Sarkar, P.K.; Tiwari, A.K.; Sharma, R.

2012-01-01

A newly established uranium processing facility has been commissioned at BARC, Trombay. Monitoring of occupational workers is essential to assess intake of uranium in this facility. A group of 21 workers was selected for bioassay monitoring to assess the existing urinary excretion levels of uranium before the commencement of actual work. Bioassay samples collected from these workers were analyzed by ion-exchange technique followed by laser fluorimetry. Standard addition method was followed for estimation of uranium concentration in the samples. The minimum detectable activity by this technique is about 0.2 ng. The range of uranium observed in these samples varies from 19 to 132 ng/L. Few of these samples were also analyzed by fission track analysis technique and the results were found to be comparable to those obtained by laser fluorimetry. The urinary excretion rate observed for the individual can be regarded as a 'personal baseline' and will be treated as the existing level of uranium in urine for these workers at the facility. (author)
Estimating fluvial wood discharge from timelapse photography with varying sampling intervals

Science.gov (United States)

Anderson, N. K.

2013-12-01

There is recent focus on calculating wood budgets for streams and rivers to help inform management decisions, ecological studies and carbon/nutrient cycling models. Most work has measured in situ wood in temporary storage along stream banks or estimated wood inputs from banks. Little effort has been employed monitoring and quantifying wood in transport during high flows. This paper outlines a procedure for estimating total seasonal wood loads using non-continuous coarse interval sampling and examines differences in estimation between sampling at 1, 5, 10 and 15 minutes. Analysis is performed on wood transport for the Slave River in Northwest Territories, Canada. Relative to the 1 minute dataset, precision decreased by 23%, 46% and 60% for the 5, 10 and 15 minute datasets, respectively. Five and 10 minute sampling intervals provided unbiased equal variance estimates of 1 minute sampling, whereas 15 minute intervals were biased towards underestimation by 6%. Stratifying estimates by day and by discharge increased precision over non-stratification by 4% and 3%, respectively. Not including wood transported during ice break-up, the total minimum wood load estimated at this site is 3300 × 800$ m3 for the 2012 runoff season. The vast majority of the imprecision in total wood volumes came from variance in estimating average volume per log. Comparison of proportions and variance across sample intervals using bootstrap sampling to achieve equal n. Each trial was sampled for n=100, 10,000 times and averaged. All trials were then averaged to obtain an estimate for each sample interval. Dashed lines represent values from the one minute dataset.
Effects of systematic sampling on satellite estimates of deforestation rates

International Nuclear Information System (INIS)

Steininger, M K; Godoy, F; Harper, G

2009-01-01

Options for satellite monitoring of deforestation rates over large areas include the use of sampling. Sampling may reduce the cost of monitoring but is also a source of error in estimates of areas and rates. A common sampling approach is systematic sampling, in which sample units of a constant size are distributed in some regular manner, such as a grid. The proposed approach for the 2010 Forest Resources Assessment (FRA) of the UN Food and Agriculture Organization (FAO) is a systematic sample of 10 km wide squares at every 1 deg. intersection of latitude and longitude. We assessed the outcome of this and other systematic samples for estimating deforestation at national, sub-national and continental levels. The study is based on digital data on deforestation patterns for the five Amazonian countries outside Brazil plus the Brazilian Amazon. We tested these schemes by varying sample-unit size and frequency. We calculated two estimates of sampling error. First we calculated the standard errors, based on the size, variance and covariance of the samples, and from this calculated the 95% confidence intervals (CI). Second, we calculated the actual errors, based on the difference between the sample-based estimates and the estimates from the full-coverage maps. At the continental level, the 1 deg., 10 km scheme had a CI of 21% and an actual error of 8%. At the national level, this scheme had CIs of 126% for Ecuador and up to 67% for other countries. At this level, increasing sampling density to every 0.25 deg. produced a CI of 32% for Ecuador and CIs of up to 25% for other countries, with only Brazil having a CI of less than 10%. Actual errors were within the limits of the CIs in all but two of the 56 cases. Actual errors were half or less of the CIs in all but eight of these cases. These results indicate that the FRA 2010 should have CIs of smaller than or close to 10% at the continental level. However, systematic sampling at the national level yields large CIs unless the
Fatigue level estimation of monetary bills based on frequency band acoustic signals with feature selection by supervised SOM

Science.gov (United States)

Teranishi, Masaru; Omatu, Sigeru; Kosaka, Toshihisa

Fatigued monetary bills adversely affect the daily operation of automated teller machines (ATMs). In order to make the classification of fatigued bills more efficient, the development of an automatic fatigued monetary bill classification method is desirable. We propose a new method by which to estimate the fatigue level of monetary bills from the feature-selected frequency band acoustic energy pattern of banking machines. By using a supervised self-organizing map (SOM), we effectively estimate the fatigue level using only the feature-selected frequency band acoustic energy pattern. Furthermore, the feature-selected frequency band acoustic energy pattern improves the estimation accuracy of the fatigue level of monetary bills by adding frequency domain information to the acoustic energy pattern. The experimental results with real monetary bill samples reveal the effectiveness of the proposed method.
Channel Selection and Feature Projection for Cognitive Load Estimation Using Ambulatory EEG

Directory of Open Access Journals (Sweden)

Tian Lan

2007-01-01

Full Text Available We present an ambulatory cognitive state classification system to assess the subject's mental load based on EEG measurements. The ambulatory cognitive state estimator is utilized in the context of a real-time augmented cognition (AugCog system that aims to enhance the cognitive performance of a human user through computer-mediated assistance based on assessments of cognitive states using physiological signals including, but not limited to, EEG. This paper focuses particularly on the offline channel selection and feature projection phases of the design and aims to present mutual-information-based techniques that use a simple sample estimator for this quantity. Analyses conducted on data collected from 3 subjects performing 2 tasks (n-back/Larson at 2 difficulty levels (low/high demonstrate that the proposed mutual-information-based dimensionality reduction scheme can achieve up to 94% cognitive load estimation accuracy.
Optimizing Sampling Efficiency for Biomass Estimation Across NEON Domains

Science.gov (United States)

Abercrombie, H. H.; Meier, C. L.; Spencer, J. J.

2013-12-01

Over the course of 30 years, the National Ecological Observatory Network (NEON) will measure plant biomass and productivity across the U.S. to enable an understanding of terrestrial carbon cycle responses to ecosystem change drivers. Over the next several years, prior to operational sampling at a site, NEON will complete construction and characterization phases during which a limited amount of sampling will be done at each site to inform sampling designs, and guide standardization of data collection across all sites. Sampling biomass in 60+ sites distributed among 20 different eco-climatic domains poses major logistical and budgetary challenges. Traditional biomass sampling methods such as clip harvesting and direct measurements of Leaf Area Index (LAI) involve collecting and processing plant samples, and are time and labor intensive. Possible alternatives include using indirect sampling methods for estimating LAI such as digital hemispherical photography (DHP) or using a LI-COR 2200 Plant Canopy Analyzer. These LAI estimations can then be used as a proxy for biomass. The biomass estimates calculated can then inform the clip harvest sampling design during NEON operations, optimizing both sample size and number so that standardized uncertainty limits can be achieved with a minimum amount of sampling effort. In 2011, LAI and clip harvest data were collected from co-located sampling points at the Central Plains Experimental Range located in northern Colorado, a short grass steppe ecosystem that is the NEON Domain 10 core site. LAI was measured with a LI-COR 2200 Plant Canopy Analyzer. The layout of the sampling design included four, 300 meter transects, with clip harvests plots spaced every 50m, and LAI sub-transects spaced every 10m. LAI was measured at four points along 6m sub-transects running perpendicular to the 300m transect. Clip harvest plots were co-located 4m from corresponding LAI transects, and had dimensions of 0.1m by 2m. We conducted regression analyses
Sample selection and taste correlation in discrete choice transport modelling

DEFF Research Database (Denmark)

Mabit, Stefan Lindhard

2008-01-01

explain counterintuitive results in value of travel time estimation. However, the results also point at the difficulty of finding suitable instruments for the selection mechanism. Taste heterogeneity is another important aspect of discrete choice modelling. Mixed logit models are designed to capture...... the question for a broader class of models. It is shown that the original result may be somewhat generalised. Another question investigated is whether mode choice operates as a self-selection mechanism in the estimation of the value of travel time. The results show that self-selection can at least partly...... of taste correlation in willingness-to-pay estimation are presented. The first contribution addresses how to incorporate taste correlation in the estimation of the value of travel time for public transport. Given a limited dataset the approach taken is to use theory on the value of travel time as guidance...

Risk Attitudes, Sample Selection and Attrition in a Longitudinal Field Experiment

DEFF Research Database (Denmark)

Harrison, Glenn W.; Lau, Morten Igel

with respect to risk attitudes. Our design builds in explicit randomization on the incentives for participation. We show that there are significant sample selection effects on inferences about the extent of risk aversion, but that the effects of subsequent sample attrition are minimal. Ignoring sample...... selection leads to inferences that subjects in the population are more risk averse than they actually are. Correcting for sample selection and attrition affects utility curvature, but does not affect inferences about probability weighting. Properly accounting for sample selection and attrition effects leads...... to findings of temporal stability in overall risk aversion. However, that stability is around different levels of risk aversion than one might naively infer without the controls for sample selection and attrition we are able to implement. This evidence of “randomization bias” from sample selection...
Efficiently adapting graphical models for selectivity estimation

DEFF Research Database (Denmark)

Tzoumas, Kostas; Deshpande, Amol; Jensen, Christian S.

2013-01-01

cardinality estimation without making the independence assumption. By carefully using concepts from the field of graphical models, we are able to factor the joint probability distribution over all the attributes in the database into small, usually two-dimensional distributions, without a significant loss...... in estimation accuracy. We show how to efficiently construct such a graphical model from the database using only two-way join queries, and we show how to perform selectivity estimation in a highly efficient manner. We integrate our algorithms into the PostgreSQL DBMS. Experimental results indicate...
Estimating abundance of mountain lions from unstructured spatial sampling

Science.gov (United States)

Russell, Robin E.; Royle, J. Andrew; Desimone, Richard; Schwartz, Michael K.; Edwards, Victoria L.; Pilgrim, Kristy P.; Mckelvey, Kevin S.

2012-01-01

Mountain lions (Puma concolor) are often difficult to monitor because of their low capture probabilities, extensive movements, and large territories. Methods for estimating the abundance of this species are needed to assess population status, determine harvest levels, evaluate the impacts of management actions on populations, and derive conservation and management strategies. Traditional mark–recapture methods do not explicitly account for differences in individual capture probabilities due to the spatial distribution of individuals in relation to survey effort (or trap locations). However, recent advances in the analysis of capture–recapture data have produced methods estimating abundance and density of animals from spatially explicit capture–recapture data that account for heterogeneity in capture probabilities due to the spatial organization of individuals and traps. We adapt recently developed spatial capture–recapture models to estimate density and abundance of mountain lions in western Montana. Volunteers and state agency personnel collected mountain lion DNA samples in portions of the Blackfoot drainage (7,908 km2) in west-central Montana using 2 methods: snow back-tracking mountain lion tracks to collect hair samples and biopsy darting treed mountain lions to obtain tissue samples. Overall, we recorded 72 individual capture events, including captures both with and without tissue sample collection and hair samples resulting in the identification of 50 individual mountain lions (30 females, 19 males, and 1 unknown sex individual). We estimated lion densities from 8 models containing effects of distance, sex, and survey effort on detection probability. Our population density estimates ranged from a minimum of 3.7 mountain lions/100 km2 (95% Cl 2.3–5.7) under the distance only model (including only an effect of distance on detection probability) to 6.7 (95% Cl 3.1–11.0) under the full model (including effects of distance, sex, survey effort, and
Estimation of active pharmaceutical ingredients content using locally weighted partial least squares and statistical wavelength selection.

Science.gov (United States)

Kim, Sanghong; Kano, Manabu; Nakagawa, Hiroshi; Hasebe, Shinji

2011-12-15

Development of quality estimation models using near infrared spectroscopy (NIRS) and multivariate analysis has been accelerated as a process analytical technology (PAT) tool in the pharmaceutical industry. Although linear regression methods such as partial least squares (PLS) are widely used, they cannot always achieve high estimation accuracy because physical and chemical properties of a measuring object have a complex effect on NIR spectra. In this research, locally weighted PLS (LW-PLS) which utilizes a newly defined similarity between samples is proposed to estimate active pharmaceutical ingredient (API) content in granules for tableting. In addition, a statistical wavelength selection method which quantifies the effect of API content and other factors on NIR spectra is proposed. LW-PLS and the proposed wavelength selection method were applied to real process data provided by Daiichi Sankyo Co., Ltd., and the estimation accuracy was improved by 38.6% in root mean square error of prediction (RMSEP) compared to the conventional PLS using wavelengths selected on the basis of variable importance on the projection (VIP). The results clearly show that the proposed calibration modeling technique is useful for API content estimation and is superior to the conventional one. Copyright © 2011 Elsevier B.V. All rights reserved.
Estimation after classification using lot quality assurance sampling: corrections for curtailed sampling with application to evaluating polio vaccination campaigns.

Science.gov (United States)

Olives, Casey; Valadez, Joseph J; Pagano, Marcello

2014-03-01

To assess the bias incurred when curtailment of Lot Quality Assurance Sampling (LQAS) is ignored, to present unbiased estimators, to consider the impact of cluster sampling by simulation and to apply our method to published polio immunization data from Nigeria. We present estimators of coverage when using two kinds of curtailed LQAS strategies: semicurtailed and curtailed. We study the proposed estimators with independent and clustered data using three field-tested LQAS designs for assessing polio vaccination coverage, with samples of size 60 and decision rules of 9, 21 and 33, and compare them to biased maximum likelihood estimators. Lastly, we present estimates of polio vaccination coverage from previously published data in 20 local government authorities (LGAs) from five Nigerian states. Simulations illustrate substantial bias if one ignores the curtailed sampling design. Proposed estimators show no bias. Clustering does not affect the bias of these estimators. Across simulations, standard errors show signs of inflation as clustering increases. Neither sampling strategy nor LQAS design influences estimates of polio vaccination coverage in 20 Nigerian LGAs. When coverage is low, semicurtailed LQAS strategies considerably reduces the sample size required to make a decision. Curtailed LQAS designs further reduce the sample size when coverage is high. Results presented dispel the misconception that curtailed LQAS data are unsuitable for estimation. These findings augment the utility of LQAS as a tool for monitoring vaccination efforts by demonstrating that unbiased estimation using curtailed designs is not only possible but these designs also reduce the sample size. © 2014 John Wiley & Sons Ltd.
The genealogy of samples in models with selection.

Science.gov (United States)

Neuhauser, C; Krone, S M

1997-02-01

We introduce the genealogy of a random sample of genes taken from a large haploid population that evolves according to random reproduction with selection and mutation. Without selection, the genealogy is described by Kingman's well-known coalescent process. In the selective case, the genealogy of the sample is embedded in a graph with a coalescing and branching structure. We describe this graph, called the ancestral selection graph, and point out differences and similarities with Kingman's coalescent. We present simulations for a two-allele model with symmetric mutation in which one of the alleles has a selective advantage over the other. We find that when the allele frequencies in the population are already in equilibrium, then the genealogy does not differ much from the neutral case. This is supported by rigorous results. Furthermore, we describe the ancestral selection graph for other selective models with finitely many selection classes, such as the K-allele models, infinitely-many-alleles models. DNA sequence models, and infinitely-many-sites models, and briefly discuss the diploid case.
An alternative procedure for estimating the population mean in simple random sampling

Directory of Open Access Journals (Sweden)

Housila P. Singh

2012-03-01

Full Text Available This paper deals with the problem of estimating the finite population mean using auxiliary information in simple random sampling. Firstly we have suggested a correction to the mean squared error of the estimator proposed by Gupta and Shabbir [On improvement in estimating the population mean in simple random sampling. Jour. Appl. Statist. 35(5 (2008, pp. 559-566]. Later we have proposed a ratio type estimator and its properties are studied in simple random sampling. Numerically we have shown that the proposed class of estimators is more efficient than different known estimators including Gupta and Shabbir (2008 estimator.
Comparison of sampling techniques for Bayesian parameter estimation

Science.gov (United States)

Allison, Rupert; Dunkley, Joanna

2014-02-01

The posterior probability distribution for a set of model parameters encodes all that the data have to tell us in the context of a given model; it is the fundamental quantity for Bayesian parameter estimation. In order to infer the posterior probability distribution we have to decide how to explore parameter space. Here we compare three prescriptions for how parameter space is navigated, discussing their relative merits. We consider Metropolis-Hasting sampling, nested sampling and affine-invariant ensemble Markov chain Monte Carlo (MCMC) sampling. We focus on their performance on toy-model Gaussian likelihoods and on a real-world cosmological data set. We outline the sampling algorithms themselves and elaborate on performance diagnostics such as convergence time, scope for parallelization, dimensional scaling, requisite tunings and suitability for non-Gaussian distributions. We find that nested sampling delivers high-fidelity estimates for posterior statistics at low computational cost, and should be adopted in favour of Metropolis-Hastings in many cases. Affine-invariant MCMC is competitive when computing clusters can be utilized for massive parallelization. Affine-invariant MCMC and existing extensions to nested sampling naturally probe multimodal and curving distributions.
Sensor Selection for Aircraft Engine Performance Estimation and Gas Path Fault Diagnostics

Science.gov (United States)

Simon, Donald L.; Rinehart, Aidan W.

2016-01-01

This paper presents analytical techniques for aiding system designers in making aircraft engine health management sensor selection decisions. The presented techniques, which are based on linear estimation and probability theory, are tailored for gas turbine engine performance estimation and gas path fault diagnostics applications. They enable quantification of the performance estimation and diagnostic accuracy offered by different candidate sensor suites. For performance estimation, sensor selection metrics are presented for two types of estimators including a Kalman filter and a maximum a posteriori estimator. For each type of performance estimator, sensor selection is based on minimizing the theoretical sum of squared estimation errors in health parameters representing performance deterioration in the major rotating modules of the engine. For gas path fault diagnostics, the sensor selection metric is set up to maximize correct classification rate for a diagnostic strategy that performs fault classification by identifying the fault type that most closely matches the observed measurement signature in a weighted least squares sense. Results from the application of the sensor selection metrics to a linear engine model are presented and discussed. Given a baseline sensor suite and a candidate list of optional sensors, an exhaustive search is performed to determine the optimal sensor suites for performance estimation and fault diagnostics. For any given sensor suite, Monte Carlo simulation results are found to exhibit good agreement with theoretical predictions of estimation and diagnostic accuracies.
Small sample GEE estimation of regression parameters for longitudinal data.

Science.gov (United States)

Paul, Sudhir; Zhang, Xuemao

2014-09-28

Longitudinal (clustered) response data arise in many bio-statistical applications which, in general, cannot be assumed to be independent. Generalized estimating equation (GEE) is a widely used method to estimate marginal regression parameters for correlated responses. The advantage of the GEE is that the estimates of the regression parameters are asymptotically unbiased even if the correlation structure is misspecified, although their small sample properties are not known. In this paper, two bias adjusted GEE estimators of the regression parameters in longitudinal data are obtained when the number of subjects is small. One is based on a bias correction, and the other is based on a bias reduction. Simulations show that the performances of both the bias-corrected methods are similar in terms of bias, efficiency, coverage probability, average coverage length, impact of misspecification of correlation structure, and impact of cluster size on bias correction. Both these methods show superior properties over the GEE estimates for small samples. Further, analysis of data involving a small number of subjects also shows improvement in bias, MSE, standard error, and length of the confidence interval of the estimates by the two bias adjusted methods over the GEE estimates. For small to moderate sample sizes (N ≤50), either of the bias-corrected methods GEEBc and GEEBr can be used. However, the method GEEBc should be preferred over GEEBr, as the former is computationally easier. For large sample sizes, the GEE method can be used. Copyright © 2014 John Wiley & Sons, Ltd.
Bayesian Simultaneous Estimation for Means in k Sample Problems

OpenAIRE

Imai, Ryo; Kubokawa, Tatsuya; Ghosh, Malay

2017-01-01

This paper is concerned with the simultaneous estimation of k population means when one suspects that the k means are nearly equal. As an alternative to the preliminary test estimator based on the test statistics for testing hypothesis of equal means, we derive Bayesian and minimax estimators which shrink individual sample means toward a pooled mean estimator given under the hypothesis. Interestingly, it is shown that both the preliminary test estimator and the Bayesian minimax shrinkage esti...
Turbidity-controlled sampling for suspended sediment load estimation

Science.gov (United States)

Jack Lewis

2003-01-01

Abstract - Automated data collection is essential to effectively measure suspended sediment loads in storm events, particularly in small basins. Continuous turbidity measurements can be used, along with discharge, in an automated system that makes real-time sampling decisions to facilitate sediment load estimation. The Turbidity Threshold Sampling method distributes...
Feasibility and reliability of digital imaging for estimating food selection and consumption from students' packed lunches.

Science.gov (United States)

Taylor, Jennifer C; Sutter, Carolyn; Ontai, Lenna L; Nishina, Adrienne; Zidenberg-Cherr, Sheri

2018-01-01

Although increasing attention is placed on the quality of foods in children's packed lunches, few studies have examined the capacity of observational methods to reliably determine both what is selected and consumed from these lunches. The objective of this project was to assess the feasibility and inter-rater reliability of digital imaging for determining selection and consumption from students' packed lunches, by adapting approaches previously applied to school lunches. Study 1 assessed feasibility and reliability of data collection among a sample of packed lunches (n = 155), while Study 2 further examined reliability in a larger sample of packed (n = 386) as well as school (n = 583) lunches. Based on the results from Study 1, it was feasible to collect and code most items in packed lunch images; missing data were most commonly attributed to packaging that limited visibility of contents. Across both studies, there was satisfactory reliability for determining food types selected, quantities selected, and quantities consumed in the eight food categories examined (weighted kappa coefficients 0.68-0.97 for packed lunches, 0.74-0.97 for school lunches), with lowest reliability for estimating condiments and meats/meat alternatives in packed lunches. In extending methods predominately applied to school lunches, these findings demonstrate the capacity of digital imaging for the objective estimation of selection and consumption from both school and packed lunches. Copyright © 2017 Elsevier Ltd. All rights reserved.
Working covariance model selection for generalized estimating equations.

Science.gov (United States)

Carey, Vincent J; Wang, You-Gan

2011-11-20

We investigate methods for data-based selection of working covariance models in the analysis of correlated data with generalized estimating equations. We study two selection criteria: Gaussian pseudolikelihood and a geodesic distance based on discrepancy between model-sensitive and model-robust regression parameter covariance estimators. The Gaussian pseudolikelihood is found in simulation to be reasonably sensitive for several response distributions and noncanonical mean-variance relations for longitudinal data. Application is also made to a clinical dataset. Assessment of adequacy of both correlation and variance models for longitudinal data should be routine in applications, and we describe open-source software supporting this practice. Copyright © 2011 John Wiley & Sons, Ltd.
Estimating mean change in population salt intake using spot urine samples.

Science.gov (United States)

Petersen, Kristina S; Wu, Jason H Y; Webster, Jacqui; Grimes, Carley; Woodward, Mark; Nowson, Caryl A; Neal, Bruce

2017-10-01

Spot urine samples are easier to collect than 24-h urine samples and have been used with estimating equations to derive the mean daily salt intake of a population. Whether equations using data from spot urine samples can also be used to estimate change in mean daily population salt intake over time is unknown. We compared estimates of change in mean daily population salt intake based upon 24-h urine collections with estimates derived using equations based on spot urine samples. Paired and unpaired 24-h urine samples and spot urine samples were collected from individuals in two Australian populations, in 2011 and 2014. Estimates of change in daily mean population salt intake between 2011 and 2014 were obtained directly from the 24-h urine samples and by applying established estimating equations (Kawasaki, Tanaka, Mage, Toft, INTERSALT) to the data from spot urine samples. Differences between 2011 and 2014 were calculated using mixed models. A total of 1000 participants provided a 24-h urine sample and a spot urine sample in 2011, and 1012 did so in 2014 (paired samples n = 870; unpaired samples n = 1142). The participants were community-dwelling individuals living in the State of Victoria or the town of Lithgow in the State of New South Wales, Australia, with a mean age of 55 years in 2011. The mean (95% confidence interval) difference in population salt intake between 2011 and 2014 determined from the 24-h urine samples was -0.48g/day (-0.74 to -0.21; P spot urine samples was -0.24 g/day (-0.42 to -0.06; P = 0.01) using the Tanaka equation, -0.42 g/day (-0.70 to -0.13; p = 0.004) using the Kawasaki equation, -0.51 g/day (-1.00 to -0.01; P = 0.046) using the Mage equation, -0.26 g/day (-0.42 to -0.10; P = 0.001) using the Toft equation, -0.20 g/day (-0.32 to -0.09; P = 0.001) using the INTERSALT equation and -0.27 g/day (-0.39 to -0.15; P 0.058). Separate analysis of the unpaired and paired data showed that detection of
Estimating waste disposal quantities from raw waste samples

International Nuclear Information System (INIS)

Negin, C.A.; Urland, C.S.; Hitz, C.G.; GPU Nuclear Corp., Middletown, PA)

1985-01-01

Estimating the disposal quantity of waste resulting from stabilization of radioactive sludge is complex because of the many factors relating to sample analysis results, radioactive decay, allowable disposal concentrations, and options for disposal containers. To facilitate this estimation, a microcomputer spread sheet template was created. The spread sheet has saved considerable engineering hours. 1 fig., 3 tabs
Iterative importance sampling algorithms for parameter estimation

OpenAIRE

Morzfeld, Matthias; Day, Marcus S.; Grout, Ray W.; Pau, George Shu Heng; Finsterle, Stefan A.; Bell, John B.

2016-01-01

In parameter estimation problems one computes a posterior distribution over uncertain parameters defined jointly by a prior distribution, a model, and noisy data. Markov Chain Monte Carlo (MCMC) is often used for the numerical solution of such problems. An alternative to MCMC is importance sampling, which can exhibit near perfect scaling with the number of cores on high performance computing systems because samples are drawn independently. However, finding a suitable proposal distribution is ...
40 CFR 205.160-2 - Test sample selection and preparation.

Science.gov (United States)

2010-07-01

... 40 Protection of Environment 24 2010-07-01 2010-07-01 false Test sample selection and preparation... sample selection and preparation. (a) Vehicles comprising the sample which are required to be tested... maintained in any manner unless such preparation, tests, modifications, adjustments or maintenance are part...
Selective parathyroid venous sampling in primary hyperparathyroidism: A systematic review and meta-analysis.

Science.gov (United States)

Ibraheem, Kareem; Toraih, Eman A; Haddad, Antoine B; Farag, Mahmoud; Randolph, Gregory W; Kandil, Emad

2018-05-14

Minimally invasive parathyroidectomy requires accurate preoperative localization techniques. There is considerable controversy about the effectiveness of selective parathyroid venous sampling (sPVS) in primary hyperparathyroidism (PHPT) patients. The aim of this meta-analysis is to examine the diagnostic accuracy of sPVS as a preoperative localization modality in PHPT. Studies evaluating the diagnostic accuracy of sPVS for PHPT were electronically searched in the PubMed, EMBASE, Web of Science, and Cochrane Controlled Trials Register databases. Two independent authors reviewed the studies, and revised quality assessment of diagnostic accuracy study tool was used for the quality assessment. Study heterogeneity and pooled estimates were calculated. Two hundred and two unique studies were identified. Of those, 12 studies were included in the meta-analysis. Pooled sensitivity, specificity, and positive likelihood ratio (PLR) of sPVS were 74%, 41%, and 1.55, respectively. The area-under-the-receiver operating characteristic curve was 0.684, indicating an average discriminatory ability of sPVS. On comparison between sPVS and noninvasive imaging modalities, sensitivity, PLR, and positive posttest probability were significantly higher in sPVS compared to noninvasive imaging modalities. Interestingly, super-selective venous sampling had the highest sensitivity, accuracy, and positive posttest probability compared to other parathyroid venous sampling techniques. This is the first meta-analysis to examine the accuracy of sPVS in PHPT. sPVS had higher pooled sensitivity when compared to noninvasive modalities in revision parathyroid surgery. However, the invasiveness of this technique does not favor its routine use for preoperative localization. Super-selective venous sampling was the most accurate among all other parathyroid venous sampling techniques. Laryngoscope, 2018. © 2018 The American Laryngological, Rhinological and Otological Society, Inc.
Sampling strategies for estimating brook trout effective population size

Science.gov (United States)

Andrew R. Whiteley; Jason A. Coombs; Mark Hudy; Zachary Robinson; Keith H. Nislow; Benjamin H. Letcher

2012-01-01

The influence of sampling strategy on estimates of effective population size (Ne) from single-sample genetic methods has not been rigorously examined, though these methods are increasingly used. For headwater salmonids, spatially close kin association among age-0 individuals suggests that sampling strategy (number of individuals and location from...

Optimal Tuner Selection for Kalman Filter-Based Aircraft Engine Performance Estimation

Science.gov (United States)

Simon, Donald L.; Garg, Sanjay

2010-01-01

A linear point design methodology for minimizing the error in on-line Kalman filter-based aircraft engine performance estimation applications is presented. This technique specifically addresses the underdetermined estimation problem, where there are more unknown parameters than available sensor measurements. A systematic approach is applied to produce a model tuning parameter vector of appropriate dimension to enable estimation by a Kalman filter, while minimizing the estimation error in the parameters of interest. Tuning parameter selection is performed using a multi-variable iterative search routine which seeks to minimize the theoretical mean-squared estimation error. This paper derives theoretical Kalman filter estimation error bias and variance values at steady-state operating conditions, and presents the tuner selection routine applied to minimize these values. Results from the application of the technique to an aircraft engine simulation are presented and compared to the conventional approach of tuner selection. Experimental simulation results are found to be in agreement with theoretical predictions. The new methodology is shown to yield a significant improvement in on-line engine performance estimation accuracy
Creel survey sampling designs for estimating effort in short-duration Chinook salmon fisheries

Science.gov (United States)

McCormick, Joshua L.; Quist, Michael C.; Schill, Daniel J.

2013-01-01

Chinook Salmon Oncorhynchus tshawytscha sport fisheries in the Columbia River basin are commonly monitored using roving creel survey designs and require precise, unbiased catch estimates. The objective of this study was to examine the relative bias and precision of total catch estimates using various sampling designs to estimate angling effort under the assumption that mean catch rate was known. We obtained information on angling populations based on direct visual observations of portions of Chinook Salmon fisheries in three Idaho river systems over a 23-d period. Based on the angling population, Monte Carlo simulations were used to evaluate the properties of effort and catch estimates for each sampling design. All sampling designs evaluated were relatively unbiased. Systematic random sampling (SYS) resulted in the most precise estimates. The SYS and simple random sampling designs had mean square error (MSE) estimates that were generally half of those observed with cluster sampling designs. The SYS design was more efficient (i.e., higher accuracy per unit cost) than a two-cluster design. Increasing the number of clusters available for sampling within a day decreased the MSE of estimates of daily angling effort, but the MSE of total catch estimates was variable depending on the fishery. The results of our simulations provide guidelines on the relative influence of sample sizes and sampling designs on parameters of interest in short-duration Chinook Salmon fisheries.
Estimating population salt intake in India using spot urine samples.

Science.gov (United States)

Petersen, Kristina S; Johnson, Claire; Mohan, Sailesh; Rogers, Kris; Shivashankar, Roopa; Thout, Sudhir Raj; Gupta, Priti; He, Feng J; MacGregor, Graham A; Webster, Jacqui; Santos, Joseph Alvin; Krishnan, Anand; Maulik, Pallab K; Reddy, K Srinath; Gupta, Ruby; Prabhakaran, Dorairaj; Neal, Bruce

2017-11-01

To compare estimates of mean population salt intake in North and South India derived from spot urine samples versus 24-h urine collections. In a cross-sectional survey, participants were sampled from slum, urban and rural communities in North and in South India. Participants provided 24-h urine collections, and random morning spot urine samples. Salt intake was estimated from the spot urine samples using a series of established estimating equations. Salt intake data from the 24-h urine collections and spot urine equations were weighted to provide estimates of salt intake for Delhi and Haryana, and Andhra Pradesh. A total of 957 individuals provided a complete 24-h urine collection and a spot urine sample. Weighted mean salt intake based on the 24-h urine collection, was 8.59 (95% confidence interval 7.73-9.45) and 9.46 g/day (8.95-9.96) in Delhi and Haryana, and Andhra Pradesh, respectively. Corresponding estimates based on the Tanaka equation [9.04 (8.63-9.45) and 9.79 g/day (9.62-9.96) for Delhi and Haryana, and Andhra Pradesh, respectively], the Mage equation [8.80 (7.67-9.94) and 10.19 g/day (95% CI 9.59-10.79)], the INTERSALT equation [7.99 (7.61-8.37) and 8.64 g/day (8.04-9.23)] and the INTERSALT equation with potassium [8.13 (7.74-8.52) and 8.81 g/day (8.16-9.46)] were all within 1 g/day of the estimate based upon 24-h collections. For the Toft equation, estimates were 1-2 g/day higher [9.94 (9.24-10.64) and 10.69 g/day (9.44-11.93)] and for the Kawasaki equation they were 3-4 g/day higher [12.14 (11.30-12.97) and 13.64 g/day (13.15-14.12)]. In urban and rural areas in North and South India, most spot urine-based equations provided reasonable estimates of mean population salt intake. Equations that did not provide good estimates may have failed because specimen collection was not aligned with the original method.
Networked Estimation for Event-Based Sampling Systems with Packet Dropouts

Directory of Open Access Journals (Sweden)

Young Soo Suh

2009-04-01

Full Text Available This paper is concerned with a networked estimation problem in which sensor data are transmitted over the network. In the event-based sampling scheme known as level-crossing or send-on-delta (SOD, sensor data are transmitted to the estimator node if the difference between the current sensor value and the last transmitted one is greater than a given threshold. Event-based sampling has been shown to be more efficient than the time-triggered one in some situations, especially in network bandwidth improvement. However, it cannot detect packet dropout situations because data transmission and reception do not use a periodical time-stamp mechanism as found in time-triggered sampling systems. Motivated by this issue, we propose a modified event-based sampling scheme called modified SOD in which sensor data are sent when either the change of sensor output exceeds a given threshold or the time elapses more than a given interval. Through simulation results, we show that the proposed modified SOD sampling significantly improves estimation performance when packet dropouts happen.
The quasar luminosity function from a variability-selected sample

Science.gov (United States)

Hawkins, M. R. S.; Veron, P.

1993-01-01

A sample of quasars is selected from a 10-yr sequence of 30 UK Schmidt plates. Luminosity functions are derived in several redshift intervals, which in each case show a featureless power-law rise towards low luminosities. There is no sign of the 'break' found in the recent UVX sample of Boyle et al. It is suggested that reasons for the disagreement are connected with biases in the selection of the UVX sample. The question of the nature of quasar evolution appears to be still unresolved.
Comparison of distance sampling estimates to a known population ...

African Journals Online (AJOL)

Line-transect sampling was used to obtain abundance estimates of an Ant-eating Chat Myrmecocichla formicivora population to compare these with the true size of the population. The population size was determined by a long-term banding study, and abundance estimates were obtained by surveying line transects.
A method to combine non-probability sample data with probability sample data in estimating spatial means of environmental variables

NARCIS (Netherlands)

Brus, D.J.; Gruijter, de J.J.

2003-01-01

In estimating spatial means of environmental variables of a region from data collected by convenience or purposive sampling, validity of the results can be ensured by collecting additional data through probability sampling. The precision of the pi estimator that uses the probability sample can be
Estimation of uranium in bioassay samples of occupational workers by laser fluorimetry

International Nuclear Information System (INIS)

Suja, A.; Prabhu, S.P.; Sawant, P.D.; Sarkar, P.K.; Tiwari, A.K.; Sharma, R.

2010-01-01

A newly established uranium processing facility has been commissioned at BARC, Trombay. Monitoring of occupational workers at regulars intervals is essential to assess intake of uranium by the workers in this facility. The design and engineering safety features of the plant are such that there is very low probability of uranium getting air borne during normal operations. However, the leakages from the system during routine maintenance of the plant may result in intake of uranium by workers. As per the new biokinetic model for uranium, 63% of uranium entering the blood stream gets directly excreted in urine. Therefore, bioassay monitoring (urinalysis) was recommended for these workers. A group of 21 workers was selected for bioassay monitoring to assess the existing urinary excretion levels of uranium before the commencement of actual work. For this purpose, sample collection kit along with an instruction slip was provided to the workers. Bioassay samples received were wet ashed with conc. nitric acid and hydrogen peroxide to break down the metabolized complexes of uranium and it was co-precipitated with calcium phosphate. Separation of uranium from the matrix was done using ion exchange technique and final activity quantification in these samples was done using laser fluorimeter (Quantalase, Model No. NFL/02). Calibration of the laser fluorimeter is done using 10 ppb uranium standard (WHO, France Ref. No. 180000). Verification of the system performance is done by measuring concentration of uranium in the standards (1 ppb to 100 ppb). Standard addition method was followed for estimation of uranium concentration in the samples. Uranyl ions present in the sample get excited by pulsed nitrogen laser at 337.1 nm, and on de-excitation emit fluorescence light (540 nm) intensity which is measured by the PMT. To estimate the uranium in the bioassay samples, a known aliquot of the sample was mixed with 5% sodium pyrophosphate and fluorescence intensity was measured
Spatially explicit population estimates for black bears based on cluster sampling

Science.gov (United States)

Humm, J.; McCown, J. Walter; Scheick, B.K.; Clark, Joseph D.

2017-01-01

We estimated abundance and density of the 5 major black bear (Ursus americanus) subpopulations (i.e., Eglin, Apalachicola, Osceola, Ocala-St. Johns, Big Cypress) in Florida, USA with spatially explicit capture-mark-recapture (SCR) by extracting DNA from hair samples collected at barbed-wire hair sampling sites. We employed a clustered sampling configuration with sampling sites arranged in 3 × 3 clusters spaced 2 km apart within each cluster and cluster centers spaced 16 km apart (center to center). We surveyed all 5 subpopulations encompassing 38,960 km2 during 2014 and 2015. Several landscape variables, most associated with forest cover, helped refine density estimates for the 5 subpopulations we sampled. Detection probabilities were affected by site-specific behavioral responses coupled with individual capture heterogeneity associated with sex. Model-averaged bear population estimates ranged from 120 (95% CI = 59–276) bears or a mean 0.025 bears/km2 (95% CI = 0.011–0.44) for the Eglin subpopulation to 1,198 bears (95% CI = 949–1,537) or 0.127 bears/km2 (95% CI = 0.101–0.163) for the Ocala-St. Johns subpopulation. The total population estimate for our 5 study areas was 3,916 bears (95% CI = 2,914–5,451). The clustered sampling method coupled with information on land cover was efficient and allowed us to estimate abundance across extensive areas that would not have been possible otherwise. Clustered sampling combined with spatially explicit capture-recapture methods has the potential to provide rigorous population estimates for a wide array of species that are extensive and heterogeneous in their distribution.
Low-sampling-rate ultra-wideband channel estimation using a bounded-data-uncertainty approach

KAUST Repository

Ballal, Tarig

2014-01-01

This paper proposes a low-sampling-rate scheme for ultra-wideband channel estimation. In the proposed scheme, P pulses are transmitted to produce P observations. These observations are exploited to produce channel impulse response estimates at a desired sampling rate, while the ADC operates at a rate that is P times less. To avoid loss of fidelity, the interpulse interval, given in units of sampling periods of the desired rate, is restricted to be co-prime with P. This condition is affected when clock drift is present and the transmitted pulse locations change. To handle this situation and to achieve good performance without using prior information, we derive an improved estimator based on the bounded data uncertainty (BDU) model. This estimator is shown to be related to the Bayesian linear minimum mean squared error (LMMSE) estimator. The performance of the proposed sub-sampling scheme was tested in conjunction with the new estimator. It is shown that high reduction in sampling rate can be achieved. The proposed estimator outperforms the least squares estimator in most cases; while in the high SNR regime, it also outperforms the LMMSE estimator. © 2014 IEEE.
Optimal sampling designs for estimation of Plasmodium falciparum clearance rates in patients treated with artemisinin derivatives

Science.gov (United States)

2013-01-01

Background The emergence of Plasmodium falciparum resistance to artemisinins in Southeast Asia threatens the control of malaria worldwide. The pharmacodynamic hallmark of artemisinin derivatives is rapid parasite clearance (a short parasite half-life), therefore, the in vivo phenotype of slow clearance defines the reduced susceptibility to the drug. Measurement of parasite counts every six hours during the first three days after treatment have been recommended to measure the parasite clearance half-life, but it remains unclear whether simpler sampling intervals and frequencies might also be sufficient to reliably estimate this parameter. Methods A total of 2,746 parasite density-time profiles were selected from 13 clinical trials in Thailand, Cambodia, Mali, Vietnam, and Kenya. In these studies, parasite densities were measured every six hours until negative after treatment with an artemisinin derivative (alone or in combination with a partner drug). The WWARN Parasite Clearance Estimator (PCE) tool was used to estimate “reference” half-lives from these six-hourly measurements. The effect of four alternative sampling schedules on half-life estimation was investigated, and compared to the reference half-life (time zero, 6, 12, 24 (A1); zero, 6, 18, 24 (A2); zero, 12, 18, 24 (A3) or zero, 12, 24 (A4) hours and then every 12 hours). Statistical bootstrap methods were used to estimate the sampling distribution of half-lives for parasite populations with different geometric mean half-lives. A simulation study was performed to investigate a suite of 16 potential alternative schedules and half-life estimates generated by each of the schedules were compared to the “true” half-life. The candidate schedules in the simulation study included (among others) six-hourly sampling, schedule A1, schedule A4, and a convenience sampling schedule at six, seven, 24, 25, 48 and 49 hours. Results The median (range) parasite half-life for all clinical studies combined was 3.1 (0
The use of Thompson sampling to increase estimation precision

NARCIS (Netherlands)

Kaptein, M.C.

2015-01-01

In this article, we consider a sequential sampling scheme for efficient estimation of the difference between the means of two independent treatments when the population variances are unequal across groups. The sampling scheme proposed is based on a solution to bandit problems called Thompson
Estimation of river and stream temperature trends under haphazard sampling

Science.gov (United States)

Gray, Brian R.; Lyubchich, Vyacheslav; Gel, Yulia R.; Rogala, James T.; Robertson, Dale M.; Wei, Xiaoqiao

2015-01-01

Long-term temporal trends in water temperature in rivers and streams are typically estimated under the assumption of evenly-spaced space-time measurements. However, sampling times and dates associated with historical water temperature datasets and some sampling designs may be haphazard. As a result, trends in temperature may be confounded with trends in time or space of sampling which, in turn, may yield biased trend estimators and thus unreliable conclusions. We address this concern using multilevel (hierarchical) linear models, where time effects are allowed to vary randomly by day and date effects by year. We evaluate the proposed approach by Monte Carlo simulations with imbalance, sparse data and confounding by trend in time and date of sampling. Simulation results indicate unbiased trend estimators while results from a case study of temperature data from the Illinois River, USA conform to river thermal assumptions. We also propose a new nonparametric bootstrap inference on multilevel models that allows for a relatively flexible and distribution-free quantification of uncertainties. The proposed multilevel modeling approach may be elaborated to accommodate nonlinearities within days and years when sampling times or dates typically span temperature extremes.
Obscured AGN at z ~ 1 from the zCOSMOS-Bright Survey. I. Selection and optical properties of a [Ne v]-selected sample

Science.gov (United States)

Mignoli, M.; Vignali, C.; Gilli, R.; Comastri, A.; Zamorani, G.; Bolzonella, M.; Bongiorno, A.; Lamareille, F.; Nair, P.; Pozzetti, L.; Lilly, S. J.; Carollo, C. M.; Contini, T.; Kneib, J.-P.; Le Fèvre, O.; Mainieri, V.; Renzini, A.; Scodeggio, M.; Bardelli, S.; Caputi, K.; Cucciati, O.; de la Torre, S.; de Ravel, L.; Franzetti, P.; Garilli, B.; Iovino, A.; Kampczyk, P.; Knobel, C.; Kovač, K.; Le Borgne, J.-F.; Le Brun, V.; Maier, C.; Pellò, R.; Peng, Y.; Perez Montero, E.; Presotto, V.; Silverman, J. D.; Tanaka, M.; Tasca, L.; Tresse, L.; Vergani, D.; Zucca, E.; Bordoloi, R.; Cappi, A.; Cimatti, A.; Koekemoer, A. M.; McCracken, H. J.; Moresco, M.; Welikala, N.

2013-08-01

Aims: The application of multi-wavelength selection techniques is essential for obtaining a complete and unbiased census of active galactic nuclei (AGN). We present here a method for selecting z ~ 1 obscured AGN from optical spectroscopic surveys. Methods: A sample of 94 narrow-line AGN with 0.65 advantage of the large amount of data available in the COSMOS field, the properties of the [Ne v]-selected type 2 AGN were investigated, focusing on their host galaxies, X-ray emission, and optical line-flux ratios. Finally, a previously developed diagnostic, based on the X-ray-to-[Ne v] luminosity ratio, was exploited to search for the more heavily obscured AGN. Results: We found that [Ne v]-selected narrow-line AGN have Seyfert 2-like optical spectra, although their emission line ratios are diluted by a star-forming component. The ACS morphologies and stellar component in the optical spectra indicate a preference for our type 2 AGN to be hosted in early-type spirals with stellar masses greater than 109.5 - 10 M⊙, on average higher than those of the galaxy parent sample. The fraction of galaxies hosting [Ne v]-selected obscured AGN increases with the stellar mass, reaching a maximum of about 3% at ≈2 × 1011 M⊙. A comparison with other selection techniques at z ~ 1, namely the line-ratio diagnostics and X-ray detections, shows that the detection of the [Ne v] λ3426 line is an effective method for selecting AGN in the optical band, in particular the most heavily obscured ones, but cannot provide a complete census of type 2 AGN by itself. Finally, the high fraction of [Ne v]-selected type 2 AGN not detected in medium-deep (≈100-200 ks) Chandra observations (67%) is suggestive of the inclusion of Compton-thick (i.e., with NH > 1024 cm-2) sources in our sample. The presence of a population of heavily obscured AGN is corroborated by the X-ray-to-[Ne v] ratio; we estimated, by means of an X-ray stacking technique and simulations, that the Compton-thick fraction in our
Estimating Sample Size for Usability Testing

Directory of Open Access Journals (Sweden)

Alex Cazañas

2017-02-01

Full Text Available One strategy used to assure that an interface meets user requirements is to conduct usability testing. When conducting such testing one of the unknowns is sample size. Since extensive testing is costly, minimizing the number of participants can contribute greatly to successful resource management of a project. Even though a significant number of models have been proposed to estimate sample size in usability testing, there is still not consensus on the optimal size. Several studies claim that 3 to 5 users suffice to uncover 80% of problems in a software interface. However, many other studies challenge this assertion. This study analyzed data collected from the user testing of a web application to verify the rule of thumb, commonly known as the “magic number 5”. The outcomes of the analysis showed that the 5-user rule significantly underestimates the required sample size to achieve reasonable levels of problem detection.
Determining Sample Size for Accurate Estimation of the Squared Multiple Correlation Coefficient.

Science.gov (United States)

Algina, James; Olejnik, Stephen

2000-01-01

Discusses determining sample size for estimation of the squared multiple correlation coefficient and presents regression equations that permit determination of the sample size for estimating this parameter for up to 20 predictor variables. (SLD)
Density meter algorithm and system for estimating sampling/mixing uncertainty

International Nuclear Information System (INIS)

Shine, E.P.

1986-01-01

The Laboratories Department at the Savannah River Plant (SRP) has installed a six-place density meter with an automatic sampling device. This paper describes the statistical software developed to analyze the density of uranyl nitrate solutions using this automated system. The purpose of this software is twofold: to estimate the sampling/mixing and measurement uncertainties in the process and to provide a measurement control program for the density meter. Non-uniformities in density are analyzed both analytically and graphically. The mean density and its limit of error are estimated. Quality control standards are analyzed concurrently with process samples and used to control the density meter measurement error. The analyses are corrected for concentration due to evaporation of samples waiting to be analyzed. The results of this program have been successful in identifying sampling/mixing problems and controlling the quality of analyses
Density meter algorithm and system for estimating sampling/mixing uncertainty

International Nuclear Information System (INIS)

Shine, E.P.

1986-01-01

The Laboratories Department at the Savannah River Plant (SRP) has installed a six-place density meter with an automatic sampling device. This paper describes the statisical software developed to analyze the density of uranyl nitrate solutions using this automated system. The purpose of this software is twofold: to estimate the sampling/mixing and measurement uncertainties in the process and to provide a measurement control program for the density meter. Non-uniformities in density are analyzed both analytically and graphically. The mean density and its limit of error are estimated. Quality control standards are analyzed concurrently with process samples and used to control the density meter measurement error. The analyses are corrected for concentration due to evaporation of samples waiting to be analyzed. The results of this program have been successful in identifying sampling/mixing problems and controlling the quality of analyses
Estimates of selection parameters in protein mutants of spring barley

International Nuclear Information System (INIS)

Gaul, H.; Walther, H.; Seibold, K.H.; Brunner, H.; Mikaelsen, K.

1976-01-01

Detailed studies have been made with induced protein mutants regarding a possible genetic advance in selection including the estimation of the genetic variation and heritability coefficients. Estimates were obtained for protein content and protein yield. The variation of mutant lines in different environments was found to be many times as large as the variation of the line means. The detection of improved protein mutants seems therefore possible only in trials with more than one environment. The heritability of protein content and protein yield was estimated in different sets of environments and was found to be low. However, higher values were found with an increasing number of environments. At least four environments seem to be necessary to obtain reliable heritability estimates. The geneticall component of the variation between lines was significant for protein content in all environmental combinations. For protein yield some environmental combinations only showed significant differences. The expected genetic advance with one selection step was small for both protein traits. Genetically significant differences between protein micromutants give, however, a first indication that selection among protein mutants with small differences seems also possible. (author)
Estimating the variation, autocorrelation, and environmental sensitivity of phenotypic selection

NARCIS (Netherlands)

Chevin, Luis-Miguel; Visser, Marcel E.; Tufto, Jarle

2015-01-01

Despite considerable interest in temporal and spatial variation of phenotypic selection, very few methods allow quantifying this variation while correctly accounting for the error variance of each individual estimate. Furthermore, the available methods do not estimate the autocorrelation of

Estimating the variation, autocorrelation, and environmental sensitivity of phenotypic selection

NARCIS (Netherlands)

Chevin, Luis-Miguel; Visser, Marcel E.; Tufto, Jarle

Despite considerable interest in temporal and spatial variation of phenotypic selection, very few methods allow quantifying this variation while correctly accounting for the error variance of each individual estimate. Furthermore, the available methods do not estimate the autocorrelation of
Optimal Tuner Selection for Kalman-Filter-Based Aircraft Engine Performance Estimation

Science.gov (United States)

Simon, Donald L.; Garg, Sanjay

2011-01-01

An emerging approach in the field of aircraft engine controls and system health management is the inclusion of real-time, onboard models for the inflight estimation of engine performance variations. This technology, typically based on Kalman-filter concepts, enables the estimation of unmeasured engine performance parameters that can be directly utilized by controls, prognostics, and health-management applications. A challenge that complicates this practice is the fact that an aircraft engine s performance is affected by its level of degradation, generally described in terms of unmeasurable health parameters such as efficiencies and flow capacities related to each major engine module. Through Kalman-filter-based estimation techniques, the level of engine performance degradation can be estimated, given that there are at least as many sensors as health parameters to be estimated. However, in an aircraft engine, the number of sensors available is typically less than the number of health parameters, presenting an under-determined estimation problem. A common approach to address this shortcoming is to estimate a subset of the health parameters, referred to as model tuning parameters. The problem/objective is to optimally select the model tuning parameters to minimize Kalman-filterbased estimation error. A tuner selection technique has been developed that specifically addresses the under-determined estimation problem, where there are more unknown parameters than available sensor measurements. A systematic approach is applied to produce a model tuning parameter vector of appropriate dimension to enable estimation by a Kalman filter, while minimizing the estimation error in the parameters of interest. Tuning parameter selection is performed using a multi-variable iterative search routine that seeks to minimize the theoretical mean-squared estimation error of the Kalman filter. This approach can significantly reduce the error in onboard aircraft engine parameter estimation
Finite Sample Comparison of Parametric, Semiparametric, and Wavelet Estimators of Fractional Integration

DEFF Research Database (Denmark)

Nielsen, Morten Ø.; Frederiksen, Per Houmann

2005-01-01

In this paper we compare through Monte Carlo simulations the finite sample properties of estimators of the fractional differencing parameter, d. This involves frequency domain, time domain, and wavelet based approaches, and we consider both parametric and semiparametric estimation methods. The es...... the time domain parametric methods, and (4) without sufficient trimming of scales the wavelet-based estimators are heavily biased.......In this paper we compare through Monte Carlo simulations the finite sample properties of estimators of the fractional differencing parameter, d. This involves frequency domain, time domain, and wavelet based approaches, and we consider both parametric and semiparametric estimation methods....... The estimators are briefly introduced and compared, and the criteria adopted for measuring finite sample performance are bias and root mean squared error. Most importantly, the simulations reveal that (1) the frequency domain maximum likelihood procedure is superior to the time domain parametric methods, (2) all...
HOT-DUST-POOR QUASARS IN MID-INFRARED AND OPTICALLY SELECTED SAMPLES

International Nuclear Information System (INIS)

Hao Heng; Elvis, Martin; Civano, Francesca; Lawrence, Andy

2011-01-01

We show that the hot-dust-poor (HDP) quasars, originally found in the X-ray-selected XMM-COSMOS type 1 active galactic nucleus (AGN) sample, are just as common in two samples selected at optical/infrared wavelengths: the Richards et al. Spitzer/SDSS sample (8.7% ± 2.2%) and the Palomar-Green-quasar-dominated sample of Elvis et al. (9.5% ± 5.0%). The properties of the HDP quasars in these two samples are consistent with the XMM-COSMOS sample, except that, at the 99% (∼ 2.5σ) significance, a larger proportion of the HDP quasars in the Spitzer/SDSS sample have weak host galaxy contributions, probably due to the selection criteria used. Either the host dust is destroyed (dynamically or by radiation) or is offset from the central black hole due to recoiling. Alternatively, the universality of HDP quasars in samples with different selection methods and the continuous distribution of dust covering factor in type 1 AGNs suggest that the range of spectral energy distributions could be related to the range of tilts in warped fueling disks, as in the model of Lawrence and Elvis, with HDP quasars having relatively small warps.
The effect of selection on genetic parameter estimates

African Journals Online (AJOL)

Unknown

The South African Journal of Animal Science is available online at ... A simulation study was carried out to investigate the effect of selection on the estimation of genetic ... The model contained a fixed effect, random genetic and random.
HICOSMO - cosmology with a complete sample of galaxy clusters - I. Data analysis, sample selection and luminosity-mass scaling relation

Science.gov (United States)

Schellenberger, G.; Reiprich, T. H.

2017-08-01

The X-ray regime, where the most massive visible component of galaxy clusters, the intracluster medium, is visible, offers directly measured quantities, like the luminosity, and derived quantities, like the total mass, to characterize these objects. The aim of this project is to analyse a complete sample of galaxy clusters in detail and constrain cosmological parameters, like the matter density, Ωm, or the amplitude of initial density fluctuations, σ8. The purely X-ray flux-limited sample (HIFLUGCS) consists of the 64 X-ray brightest galaxy clusters, which are excellent targets to study the systematic effects, that can bias results. We analysed in total 196 Chandra observations of the 64 HIFLUGCS clusters, with a total exposure time of 7.7 Ms. Here, we present our data analysis procedure (including an automated substructure detection and an energy band optimization for surface brightness profile analysis) that gives individually determined, robust total mass estimates. These masses are tested against dynamical and Planck Sunyaev-Zeldovich (SZ) derived masses of the same clusters, where good overall agreement is found with the dynamical masses. The Planck SZ masses seem to show a mass-dependent bias to our hydrostatic masses; possible biases in this mass-mass comparison are discussed including the Planck selection function. Furthermore, we show the results for the (0.1-2.4) keV luminosity versus mass scaling relation. The overall slope of the sample (1.34) is in agreement with expectations and values from literature. Splitting the sample into galaxy groups and clusters reveals, even after a selection bias correction, that galaxy groups exhibit a significantly steeper slope (1.88) compared to clusters (1.06).
Assessment of sampling strategies for estimation of site mean concentrations of stormwater pollutants.

Science.gov (United States)

McCarthy, David T; Zhang, Kefeng; Westerlund, Camilla; Viklander, Maria; Bertrand-Krajewski, Jean-Luc; Fletcher, Tim D; Deletic, Ana

2018-02-01

The estimation of stormwater pollutant concentrations is a primary requirement of integrated urban water management. In order to determine effective sampling strategies for estimating pollutant concentrations, data from extensive field measurements at seven different catchments was used. At all sites, 1-min resolution continuous flow measurements, as well as flow-weighted samples, were taken and analysed for total suspend solids (TSS), total nitrogen (TN) and Escherichia coli (E. coli). For each of these parameters, the data was used to calculate the Event Mean Concentrations (EMCs) for each event. The measured Site Mean Concentrations (SMCs) were taken as the volume-weighted average of these EMCs for each parameter, at each site. 17 different sampling strategies, including random and fixed strategies were tested to estimate SMCs, which were compared with the measured SMCs. The ratios of estimated/measured SMCs were further analysed to determine the most effective sampling strategies. Results indicate that the random sampling strategies were the most promising method in reproducing SMCs for TSS and TN, while some fixed sampling strategies were better for estimating the SMC of E. coli. The differences in taking one, two or three random samples were small (up to 20% for TSS, and 10% for TN and E. coli), indicating that there is little benefit in investing in collection of more than one sample per event if attempting to estimate the SMC through monitoring of multiple events. It was estimated that an average of 27 events across the studied catchments are needed for characterising SMCs of TSS with a 90% confidence interval (CI) width of 1.0, followed by E.coli (average 12 events) and TN (average 11 events). The coefficient of variation of pollutant concentrations was linearly and significantly correlated to the 90% confidence interval ratio of the estimated/measured SMCs (R 2 = 0.49; P sampling frequency needed to accurately estimate SMCs of pollutants. Crown
Optimum sample size to estimate mean parasite abundance in fish parasite surveys

Directory of Open Access Journals (Sweden)

Shvydka S.

2018-03-01

Full Text Available To reach ethically and scientifically valid mean abundance values in parasitological and epidemiological studies this paper considers analytic and simulation approaches for sample size determination. The sample size estimation was carried out by applying mathematical formula with predetermined precision level and parameter of the negative binomial distribution estimated from the empirical data. A simulation approach to optimum sample size determination aimed at the estimation of true value of the mean abundance and its confidence interval (CI was based on the Bag of Little Bootstraps (BLB. The abundance of two species of monogenean parasites Ligophorus cephali and L. mediterraneus from Mugil cephalus across the Azov-Black Seas localities were subjected to the analysis. The dispersion pattern of both helminth species could be characterized as a highly aggregated distribution with the variance being substantially larger than the mean abundance. The holistic approach applied here offers a wide range of appropriate methods in searching for the optimum sample size and the understanding about the expected precision level of the mean. Given the superior performance of the BLB relative to formulae with its few assumptions, the bootstrap procedure is the preferred method. Two important assessments were performed in the present study: i based on CIs width a reasonable precision level for the mean abundance in parasitological surveys of Ligophorus spp. could be chosen between 0.8 and 0.5 with 1.6 and 1x mean of the CIs width, and ii the sample size equal 80 or more host individuals allows accurate and precise estimation of mean abundance. Meanwhile for the host sample size in range between 25 and 40 individuals, the median estimates showed minimal bias but the sampling distribution skewed to the low values; a sample size of 10 host individuals yielded to unreliable estimates.
Gray bootstrap method for estimating frequency-varying random vibration signals with small samples

Directory of Open Access Journals (Sweden)

Wang Yanqing

2014-04-01

Full Text Available During environment testing, the estimation of random vibration signals (RVS is an important technique for the airborne platform safety and reliability. However, the available methods including extreme value envelope method (EVEM, statistical tolerances method (STM and improved statistical tolerance method (ISTM require large samples and typical probability distribution. Moreover, the frequency-varying characteristic of RVS is usually not taken into account. Gray bootstrap method (GBM is proposed to solve the problem of estimating frequency-varying RVS with small samples. Firstly, the estimated indexes are obtained including the estimated interval, the estimated uncertainty, the estimated value, the estimated error and estimated reliability. In addition, GBM is applied to estimating the single flight testing of certain aircraft. At last, in order to evaluate the estimated performance, GBM is compared with bootstrap method (BM and gray method (GM in testing analysis. The result shows that GBM has superiority for estimating dynamic signals with small samples and estimated reliability is proved to be 100% at the given confidence level.
Influence of Sampling Effort on the Estimated Richness of Road-Killed Vertebrate Wildlife

Science.gov (United States)

Bager, Alex; da Rosa, Clarissa A.

2011-05-01

Road-killed mammals, birds, and reptiles were collected weekly from highways in southern Brazil in 2002 and 2005. The objective was to assess variation in estimates of road-kill impacts on species richness produced by different sampling efforts, and to provide information to aid in the experimental design of future sampling. Richness observed in weekly samples was compared with sampling for different periods. In each period, the list of road-killed species was evaluated based on estimates the community structure derived from weekly samplings, and by the presence of the ten species most subject to road mortality, and also of threatened species. Weekly samples were sufficient only for reptiles and mammals, considered separately. Richness estimated from the biweekly samples was equal to that found in the weekly samples, and gave satisfactory results for sampling the most abundant and threatened species. The ten most affected species showed constant road-mortality rates, independent of sampling interval, and also maintained their dominance structure. Birds required greater sampling effort. When the composition of road-killed species varies seasonally, it is necessary to take biweekly samples for a minimum of one year. Weekly or more-frequent sampling for periods longer than two years is necessary to provide a reliable estimate of total species richness.
A novel heterogeneous training sample selection method on space-time adaptive processing

Science.gov (United States)

Wang, Qiang; Zhang, Yongshun; Guo, Yiduo

2018-04-01

The performance of ground target detection about space-time adaptive processing (STAP) decreases when non-homogeneity of clutter power is caused because of training samples contaminated by target-like signals. In order to solve this problem, a novel nonhomogeneous training sample selection method based on sample similarity is proposed, which converts the training sample selection into a convex optimization problem. Firstly, the existing deficiencies on the sample selection using generalized inner product (GIP) are analyzed. Secondly, the similarities of different training samples are obtained by calculating mean-hausdorff distance so as to reject the contaminated training samples. Thirdly, cell under test (CUT) and the residual training samples are projected into the orthogonal subspace of the target in the CUT, and mean-hausdorff distances between the projected CUT and training samples are calculated. Fourthly, the distances are sorted in order of value and the training samples which have the bigger value are selective preference to realize the reduced-dimension. Finally, simulation results with Mountain-Top data verify the effectiveness of the proposed method.
Estimating rare events in biochemical systems using conditional sampling

Science.gov (United States)

Sundar, V. S.

2017-01-01

The paper focuses on development of variance reduction strategies to estimate rare events in biochemical systems. Obtaining this probability using brute force Monte Carlo simulations in conjunction with the stochastic simulation algorithm (Gillespie's method) is computationally prohibitive. To circumvent this, important sampling tools such as the weighted stochastic simulation algorithm and the doubly weighted stochastic simulation algorithm have been proposed. However, these strategies require an additional step of determining the important region to sample from, which is not straightforward for most of the problems. In this paper, we apply the subset simulation method, developed as a variance reduction tool in the context of structural engineering, to the problem of rare event estimation in biochemical systems. The main idea is that the rare event probability is expressed as a product of more frequent conditional probabilities. These conditional probabilities are estimated with high accuracy using Monte Carlo simulations, specifically the Markov chain Monte Carlo method with the modified Metropolis-Hastings algorithm. Generating sample realizations of the state vector using the stochastic simulation algorithm is viewed as mapping the discrete-state continuous-time random process to the standard normal random variable vector. This viewpoint opens up the possibility of applying more sophisticated and efficient sampling schemes developed elsewhere to problems in stochastic chemical kinetics. The results obtained using the subset simulation method are compared with existing variance reduction strategies for a few benchmark problems, and a satisfactory improvement in computational time is demonstrated.
Estimation for small domains in double sampling for stratification ...

African Journals Online (AJOL)

In this article, we investigate the effect of randomness of the size of a small domain on the precision of an estimator of mean for the domain under double sampling for stratification. The result shows that for a small domain that cuts across various strata with unknown weights, the sampling variance depends on the within ...
Sampling strategies for efficient estimation of tree foliage biomass

Science.gov (United States)

Hailemariam Temesgen; Vicente Monleon; Aaron Weiskittel; Duncan Wilson

2011-01-01

Conifer crowns can be highly variable both within and between trees, particularly with respect to foliage biomass and leaf area. A variety of sampling schemes have been used to estimate biomass and leaf area at the individual tree and stand scales. Rarely has the effectiveness of these sampling schemes been compared across stands or even across species. In addition,...
Effects of sample size on estimates of population growth rates calculated with matrix models.

Directory of Open Access Journals (Sweden)

Ian J Fiske

Full Text Available BACKGROUND: Matrix models are widely used to study the dynamics and demography of populations. An important but overlooked issue is how the number of individuals sampled influences estimates of the population growth rate (lambda calculated with matrix models. Even unbiased estimates of vital rates do not ensure unbiased estimates of lambda-Jensen's Inequality implies that even when the estimates of the vital rates are accurate, small sample sizes lead to biased estimates of lambda due to increased sampling variance. We investigated if sampling variability and the distribution of sampling effort among size classes lead to biases in estimates of lambda. METHODOLOGY/PRINCIPAL FINDINGS: Using data from a long-term field study of plant demography, we simulated the effects of sampling variance by drawing vital rates and calculating lambda for increasingly larger populations drawn from a total population of 3842 plants. We then compared these estimates of lambda with those based on the entire population and calculated the resulting bias. Finally, we conducted a review of the literature to determine the sample sizes typically used when parameterizing matrix models used to study plant demography. CONCLUSIONS/SIGNIFICANCE: We found significant bias at small sample sizes when survival was low (survival = 0.5, and that sampling with a more-realistic inverse J-shaped population structure exacerbated this bias. However our simulations also demonstrate that these biases rapidly become negligible with increasing sample sizes or as survival increases. For many of the sample sizes used in demographic studies, matrix models are probably robust to the biases resulting from sampling variance of vital rates. However, this conclusion may depend on the structure of populations or the distribution of sampling effort in ways that are unexplored. We suggest more intensive sampling of populations when individual survival is low and greater sampling of stages with high
Effects of sample size on estimates of population growth rates calculated with matrix models.

Science.gov (United States)

Fiske, Ian J; Bruna, Emilio M; Bolker, Benjamin M

2008-08-28

Matrix models are widely used to study the dynamics and demography of populations. An important but overlooked issue is how the number of individuals sampled influences estimates of the population growth rate (lambda) calculated with matrix models. Even unbiased estimates of vital rates do not ensure unbiased estimates of lambda-Jensen's Inequality implies that even when the estimates of the vital rates are accurate, small sample sizes lead to biased estimates of lambda due to increased sampling variance. We investigated if sampling variability and the distribution of sampling effort among size classes lead to biases in estimates of lambda. Using data from a long-term field study of plant demography, we simulated the effects of sampling variance by drawing vital rates and calculating lambda for increasingly larger populations drawn from a total population of 3842 plants. We then compared these estimates of lambda with those based on the entire population and calculated the resulting bias. Finally, we conducted a review of the literature to determine the sample sizes typically used when parameterizing matrix models used to study plant demography. We found significant bias at small sample sizes when survival was low (survival = 0.5), and that sampling with a more-realistic inverse J-shaped population structure exacerbated this bias. However our simulations also demonstrate that these biases rapidly become negligible with increasing sample sizes or as survival increases. For many of the sample sizes used in demographic studies, matrix models are probably robust to the biases resulting from sampling variance of vital rates. However, this conclusion may depend on the structure of populations or the distribution of sampling effort in ways that are unexplored. We suggest more intensive sampling of populations when individual survival is low and greater sampling of stages with high elasticities.
A method for estimating radioactive cesium concentrations in cattle blood using urine samples.

Science.gov (United States)

Sato, Itaru; Yamagishi, Ryoma; Sasaki, Jun; Satoh, Hiroshi; Miura, Kiyoshi; Kikuchi, Kaoru; Otani, Kumiko; Okada, Keiji

2017-12-01

In the region contaminated by the Fukushima nuclear accident, radioactive contamination of live cattle should be checked before slaughter. In this study, we establish a precise method for estimating radioactive cesium concentrations in cattle blood using urine samples. Blood and urine samples were collected from a total of 71 cattle on two farms in the 'difficult-to-return zone'. Urine 137 Cs, specific gravity, electrical conductivity, pH, sodium, potassium, calcium, and creatinine were measured and various estimation methods for blood 137 Cs were tested. The average error rate of the estimation was 54.2% without correction. Correcting for urine creatinine, specific gravity, electrical conductivity, or potassium improved the precision of the estimation. Correcting for specific gravity using the following formula gave the most precise estimate (average error rate = 16.9%): [blood 137 Cs] = [urinary 137 Cs]/([specific gravity] - 1)/329. Urine samples are faster to measure than blood samples because urine can be obtained in larger quantities and has a higher 137 Cs concentration than blood. These advantages of urine and the estimation precision demonstrated in our study, indicate that estimation of blood 137 Cs using urine samples is a practical means of monitoring radioactive contamination in live cattle. © 2017 Japanese Society of Animal Science.
The importance of estimating selection bias on prevalence estimates shortly after a disaster.

NARCIS (Netherlands)

Grievink, Linda; Velden, Peter G van der; Yzermans, C Joris; Roorda, Jan; Stellato, Rebecca K

2006-01-01

PURPOSE: The aim was to study selective participation and its effect on prevalence estimates in a health survey of affected residents 3 weeks after a man-made disaster in The Netherlands (May 13, 2000). METHODS: All affected adult residents were invited to participate. Survey (questionnaire) data
The importance of estimating selection bias on prevalence estimates, shortly after a disaster.

NARCIS (Netherlands)

Grievink, L.; Velden, P.G. van der; Yzermans, C.J.; Roorda, J.; Stellato, R.K.

2006-01-01

PURPOSE: The aim was to study selective participation and its effect on prevalence estimates in a health survey of affected residents 3 weeks after a man-made disaster in The Netherlands (May 13, 2000). METHODS: All affected adult residents were invited to participate. Survey (questionnaire) data
6. Label-free selective plane illumination microscopy of tissue samples

Directory of Open Access Journals (Sweden)

Muteb Alharbi

2017-10-01

Conclusion: Overall this method meets the demands of the current needs for 3D imaging tissue samples in a label-free manner. Label-free Selective Plane Microscopy directly provides excellent information about the structure of the tissue samples. This work has highlighted the superiority of Label-free Selective Plane Microscopy to current approaches to label-free 3D imaging of tissue.

Maximum likelihood estimation for Cox's regression model under nested case-control sampling

DEFF Research Database (Denmark)

Scheike, Thomas; Juul, Anders

2004-01-01

Nested case-control sampling is designed to reduce the costs of large cohort studies. It is important to estimate the parameters of interest as efficiently as possible. We present a new maximum likelihood estimator (MLE) for nested case-control sampling in the context of Cox's proportional hazard...
Development of a sampling strategy and sample size calculation to estimate the distribution of mammographic breast density in Korean women.

Science.gov (United States)

Jun, Jae Kwan; Kim, Mi Jin; Choi, Kui Son; Suh, Mina; Jung, Kyu-Won

2012-01-01

Mammographic breast density is a known risk factor for breast cancer. To conduct a survey to estimate the distribution of mammographic breast density in Korean women, appropriate sampling strategies for representative and efficient sampling design were evaluated through simulation. Using the target population from the National Cancer Screening Programme (NCSP) for breast cancer in 2009, we verified the distribution estimate by repeating the simulation 1,000 times using stratified random sampling to investigate the distribution of breast density of 1,340,362 women. According to the simulation results, using a sampling design stratifying the nation into three groups (metropolitan, urban, and rural), with a total sample size of 4,000, we estimated the distribution of breast density in Korean women at a level of 0.01% tolerance. Based on the results of our study, a nationwide survey for estimating the distribution of mammographic breast density among Korean women can be conducted efficiently.
The Influence of Mark-Recapture Sampling Effort on Estimates of Rock Lobster Survival.

Directory of Open Access Journals (Sweden)

Ziya Kordjazi

Full Text Available Five annual capture-mark-recapture surveys on Jasus edwardsii were used to evaluate the effect of sample size and fishing effort on the precision of estimated survival probability. Datasets of different numbers of individual lobsters (ranging from 200 to 1,000 lobsters were created by random subsampling from each annual survey. This process of random subsampling was also used to create 12 datasets of different levels of effort based on three levels of the number of traps (15, 30 and 50 traps per day and four levels of the number of sampling-days (2, 4, 6 and 7 days. The most parsimonious Cormack-Jolly-Seber (CJS model for estimating survival probability shifted from a constant model towards sex-dependent models with increasing sample size and effort. A sample of 500 lobsters or 50 traps used on four consecutive sampling-days was required for obtaining precise survival estimations for males and females, separately. Reduced sampling effort of 30 traps over four sampling days was sufficient if a survival estimate for both sexes combined was sufficient for management of the fishery.
The Influence of Mark-Recapture Sampling Effort on Estimates of Rock Lobster Survival

Science.gov (United States)

Kordjazi, Ziya; Frusher, Stewart; Buxton, Colin; Gardner, Caleb; Bird, Tomas

2016-01-01

Five annual capture-mark-recapture surveys on Jasus edwardsii were used to evaluate the effect of sample size and fishing effort on the precision of estimated survival probability. Datasets of different numbers of individual lobsters (ranging from 200 to 1,000 lobsters) were created by random subsampling from each annual survey. This process of random subsampling was also used to create 12 datasets of different levels of effort based on three levels of the number of traps (15, 30 and 50 traps per day) and four levels of the number of sampling-days (2, 4, 6 and 7 days). The most parsimonious Cormack-Jolly-Seber (CJS) model for estimating survival probability shifted from a constant model towards sex-dependent models with increasing sample size and effort. A sample of 500 lobsters or 50 traps used on four consecutive sampling-days was required for obtaining precise survival estimations for males and females, separately. Reduced sampling effort of 30 traps over four sampling days was sufficient if a survival estimate for both sexes combined was sufficient for management of the fishery. PMID:26990561
Replication Variance Estimation under Two-phase Sampling in the Presence of Non-response

Directory of Open Access Journals (Sweden)

Muqaddas Javed

2014-09-01

Full Text Available Kim and Yu (2011 discussed replication variance estimator for two-phase stratified sampling. In this paper estimators for mean have been proposed in two-phase stratified sampling for different situation of existence of non-response at first phase and second phase. The expressions of variances of these estimators have been derived. Furthermore, replication-based jackknife variance estimators of these variances have also been derived. Simulation study has been conducted to investigate the performance of the suggested estimators.
Randomization of grab-sampling strategies for estimating the annual exposure of U miners to Rn daughters.

Science.gov (United States)

Borak, T B

1986-04-01

Periodic grab sampling in combination with time-of-occupancy surveys has been the accepted procedure for estimating the annual exposure of underground U miners to Rn daughters. Temporal variations in the concentration of potential alpha energy in the mine generate uncertainties in this process. A system to randomize the selection of locations for measurement is described which can reduce uncertainties and eliminate systematic biases in the data. In general, a sample frequency of 50 measurements per year is sufficient to satisfy the criteria that the annual exposure be determined in working level months to within +/- 50% of the true value with a 95% level of confidence. Suggestions for implementing this randomization scheme are presented.
Inverse sampled Bernoulli (ISB) procedure for estimating a population proportion, with nuclear material applications

International Nuclear Information System (INIS)

Wright, T.

1982-01-01

A new sampling procedure is introduced for estimating a population proportion. The procedure combines the ideas of inverse binomial sampling and Bernoulli sampling. An unbiased estimator is given with its variance. The procedure can be viewed as a generalization of inverse binomial sampling
Using maximum entropy modeling for optimal selection of sampling sites for monitoring networks

Science.gov (United States)

Stohlgren, Thomas J.; Kumar, Sunil; Barnett, David T.; Evangelista, Paul H.

2011-01-01

Environmental monitoring programs must efficiently describe state shifts. We propose using maximum entropy modeling to select dissimilar sampling sites to capture environmental variability at low cost, and demonstrate a specific application: sample site selection for the Central Plains domain (453,490 km2) of the National Ecological Observatory Network (NEON). We relied on four environmental factors: mean annual temperature and precipitation, elevation, and vegetation type. A “sample site” was defined as a 20 km × 20 km area (equal to NEON’s airborne observation platform [AOP] footprint), within which each 1 km2 cell was evaluated for each environmental factor. After each model run, the most environmentally dissimilar site was selected from all potential sample sites. The iterative selection of eight sites captured approximately 80% of the environmental envelope of the domain, an improvement over stratified random sampling and simple random designs for sample site selection. This approach can be widely used for cost-efficient selection of survey and monitoring sites.
Estimating black bear density in New Mexico using noninvasive genetic sampling coupled with spatially explicit capture-recapture methods

Science.gov (United States)

Gould, Matthew J.; Cain, James W.; Roemer, Gary W.; Gould, William R.

2016-01-01

During the 2004–2005 to 2015–2016 hunting seasons, the New Mexico Department of Game and Fish (NMDGF) estimated black bear abundance (Ursus americanus) across the state by coupling density estimates with the distribution of primary habitat generated by Costello et al. (2001). These estimates have been used to set harvest limits. For example, a density of 17 bears/100 km2 for the Sangre de Cristo and Sacramento Mountains and 13.2 bears/100 km2 for the Sandia Mountains were used to set harvest levels. The advancement and widespread acceptance of non-invasive sampling and mark-recapture methods, prompted the NMDGF to collaborate with the New Mexico Cooperative Fish and Wildlife Research Unit and New Mexico State University to update their density estimates for black bear populations in select mountain ranges across the state.We established 5 study areas in 3 mountain ranges: the northern (NSC; sampled in 2012) and southern Sangre de Cristo Mountains (SSC; sampled in 2013), the Sandia Mountains (Sandias; sampled in 2014), and the northern (NSacs) and southern Sacramento Mountains (SSacs; both sampled in 2014). We collected hair samples from black bears using two concurrent non-invasive sampling methods, hair traps and bear rubs. We used a gender marker and a suite of microsatellite loci to determine the individual identification of hair samples that were suitable for genetic analysis. We used these data to generate mark-recapture encounter histories for each bear and estimated density in a spatially explicit capture-recapture framework (SECR). We constructed a suite of SECR candidate models using sex, elevation, land cover type, and time to model heterogeneity in detection probability and the spatial scale over which detection probability declines. We used Akaike’s Information Criterion corrected for small sample size (AICc) to rank and select the most supported model from which we estimated density.We set 554 hair traps, 117 bear rubs and collected 4,083 hair
Instance Selection for Classifier Performance Estimation in Meta Learning

OpenAIRE

Marcin Blachnik

2017-01-01

Building an accurate prediction model is challenging and requires appropriate model selection. This process is very time consuming but can be accelerated with meta-learning–automatic model recommendation by estimating the performances of given prediction models without training them. Meta-learning utilizes metadata extracted from the dataset to effectively estimate the accuracy of the model in question. To achieve that goal, metadata descriptors must be gathered efficiently and must be inform...
Instance Selection for Classifier Performance Estimation in Meta Learning

Directory of Open Access Journals (Sweden)

Marcin Blachnik

2017-11-01

Full Text Available Building an accurate prediction model is challenging and requires appropriate model selection. This process is very time consuming but can be accelerated with meta-learning–automatic model recommendation by estimating the performances of given prediction models without training them. Meta-learning utilizes metadata extracted from the dataset to effectively estimate the accuracy of the model in question. To achieve that goal, metadata descriptors must be gathered efficiently and must be informative to allow the precise estimation of prediction accuracy. In this paper, a new type of metadata descriptors is analyzed. These descriptors are based on the compression level obtained from the instance selection methods at the data-preprocessing stage. To verify their suitability, two types of experiments on real-world datasets have been conducted. In the first one, 11 instance selection methods were examined in order to validate the compression–accuracy relation for three classifiers: k-nearest neighbors (kNN, support vector machine (SVM, and random forest. From this analysis, two methods are recommended (instance-based learning type 2 (IB2, and edited nearest neighbor (ENN which are then compared with the state-of-the-art metaset descriptors. The obtained results confirm that the two suggested compression-based meta-features help to predict accuracy of the base model much more accurately than the state-of-the-art solution.
Patch-based visual tracking with online representative sample selection

Science.gov (United States)

Ou, Weihua; Yuan, Di; Li, Donghao; Liu, Bin; Xia, Daoxun; Zeng, Wu

2017-05-01

Occlusion is one of the most challenging problems in visual object tracking. Recently, a lot of discriminative methods have been proposed to deal with this problem. For the discriminative methods, it is difficult to select the representative samples for the target template updating. In general, the holistic bounding boxes that contain tracked results are selected as the positive samples. However, when the objects are occluded, this simple strategy easily introduces the noises into the training data set and the target template and then leads the tracker to drift away from the target seriously. To address this problem, we propose a robust patch-based visual tracker with online representative sample selection. Different from previous works, we divide the object and the candidates into several patches uniformly and propose a score function to calculate the score of each patch independently. Then, the average score is adopted to determine the optimal candidate. Finally, we utilize the non-negative least square method to find the representative samples, which are used to update the target template. The experimental results on the object tracking benchmark 2013 and on the 13 challenging sequences show that the proposed method is robust to the occlusion and achieves promising results.
A test of alternative estimators for volume at time 1 from remeasured point samples

Science.gov (United States)

Francis A. Roesch; Edwin J. Green; Charles T. Scott

1993-01-01

Two estimators for volume at time 1 for use with permanent horizontal point samples are evaluated. One estimator, used traditionally, uses only the trees sampled at time 1, while the second estimator, originally presented by Roesch and coauthors (F.A. Roesch, Jr., E.J. Green, and C.T. Scott. 1989. For. Sci. 35(2):281-293). takes advantage of additional sample...
Turbidity-controlled suspended sediment sampling for runoff-event load estimation

Science.gov (United States)

Jack Lewis

1996-01-01

Abstract - For estimating suspended sediment concentration (SSC) in rivers, turbidity is generally a much better predictor than water discharge. Although it is now possible to collect continuous turbidity data even at remote sites, sediment sampling and load estimation are still conventionally based on discharge. With frequent calibration the relation of turbidity to...
Performance of sampling methods to estimate log characteristics for wildlife.

Science.gov (United States)

Lisa J. Bate; Torolf R. Torgersen; Michael J. Wisdom; Edward O. Garton

2004-01-01

Accurate estimation of the characteristics of log resources, or coarse woody debris (CWD), is critical to effective management of wildlife and other forest resources. Despite the importance of logs as wildlife habitat, methods for sampling logs have traditionally focused on silvicultural and fire applications. These applications have emphasized estimates of log volume...
Estimation of Sensitive Proportion by Randomized Response Data in Successive Sampling

Directory of Open Access Journals (Sweden)

Bo Yu

2015-01-01

Full Text Available This paper considers the problem of estimation for binomial proportions of sensitive or stigmatizing attributes in the population of interest. Randomized response techniques are suggested for protecting the privacy of respondents and reducing the response bias while eliciting information on sensitive attributes. In many sensitive question surveys, the same population is often sampled repeatedly on each occasion. In this paper, we apply successive sampling scheme to improve the estimation of the sensitive proportion on current occasion.
Estimating the Effective Sample Size of Tree Topologies from Bayesian Phylogenetic Analyses

Science.gov (United States)

Lanfear, Robert; Hua, Xia; Warren, Dan L.

2016-01-01

Bayesian phylogenetic analyses estimate posterior distributions of phylogenetic tree topologies and other parameters using Markov chain Monte Carlo (MCMC) methods. Before making inferences from these distributions, it is important to assess their adequacy. To this end, the effective sample size (ESS) estimates how many truly independent samples of a given parameter the output of the MCMC represents. The ESS of a parameter is frequently much lower than the number of samples taken from the MCMC because sequential samples from the chain can be non-independent due to autocorrelation. Typically, phylogeneticists use a rule of thumb that the ESS of all parameters should be greater than 200. However, we have no method to calculate an ESS of tree topology samples, despite the fact that the tree topology is often the parameter of primary interest and is almost always central to the estimation of other parameters. That is, we lack a method to determine whether we have adequately sampled one of the most important parameters in our analyses. In this study, we address this problem by developing methods to estimate the ESS for tree topologies. We combine these methods with two new diagnostic plots for assessing posterior samples of tree topologies, and compare their performance on simulated and empirical data sets. Combined, the methods we present provide new ways to assess the mixing and convergence of phylogenetic tree topologies in Bayesian MCMC analyses. PMID:27435794
Sampling designs and methods for estimating fish-impingement losses at cooling-water intakes

International Nuclear Information System (INIS)

Murarka, I.P.; Bodeau, D.J.

1977-01-01

Several systems for estimating fish impingement at power plant cooling-water intakes are compared to determine the most statistically efficient sampling designs and methods. Compared to a simple random sampling scheme the stratified systematic random sampling scheme, the systematic random sampling scheme, and the stratified random sampling scheme yield higher efficiencies and better estimators for the parameters in two models of fish impingement as a time-series process. Mathematical results and illustrative examples of the applications of the sampling schemes to simulated and real data are given. Some sampling designs applicable to fish-impingement studies are presented in appendixes
Distributive estimation of frequency selective channels for massive MIMO systems

KAUST Repository

Zaib, Alam

2015-12-28

We consider frequency selective channel estimation in the uplink of massive MIMO-OFDM systems, where our major concern is complexity. A low complexity distributed LMMSE algorithm is proposed that attains near optimal channel impulse response (CIR) estimates from noisy observations at receive antenna array. In proposed method, every antenna estimates the CIRs of its neighborhood followed by recursive sharing of estimates with immediate neighbors. At each step, every antenna calculates the weighted average of shared estimates which converges to near optimal LMMSE solution. The simulation results validate the near optimal performance of proposed algorithm in terms of mean square error (MSE). © 2015 EURASIP.
A 172 $\\mu$W Compressively Sampled Photoplethysmographic (PPG) Readout ASIC With Heart Rate Estimation Directly From Compressively Sampled Data.

Science.gov (United States)

Pamula, Venkata Rajesh; Valero-Sarmiento, Jose Manuel; Yan, Long; Bozkurt, Alper; Hoof, Chris Van; Helleputte, Nick Van; Yazicioglu, Refet Firat; Verhelst, Marian

2017-06-01

A compressive sampling (CS) photoplethysmographic (PPG) readout with embedded feature extraction to estimate heart rate (HR) directly from compressively sampled data is presented. It integrates a low-power analog front end together with a digital back end to perform feature extraction to estimate the average HR over a 4 s interval directly from compressively sampled PPG data. The application-specified integrated circuit (ASIC) supports uniform sampling mode (1x compression) as well as CS modes with compression ratios of 8x, 10x, and 30x. CS is performed through nonuniformly subsampling the PPG signal, while feature extraction is performed using least square spectral fitting through Lomb-Scargle periodogram. The ASIC consumes 172 μ W of power from a 1.2 V supply while reducing the relative LED driver power consumption by up to 30 times without significant loss of relevant information for accurate HR estimation.

Sample size for estimation of the Pearson correlation coefficient in cherry tomato tests

Directory of Open Access Journals (Sweden)

Bruno Giacomini Sari

2017-09-01

Full Text Available ABSTRACT: The aim of this study was to determine the required sample size for estimation of the Pearson coefficient of correlation between cherry tomato variables. Two uniformity tests were set up in a protected environment in the spring/summer of 2014. The observed variables in each plant were mean fruit length, mean fruit width, mean fruit weight, number of bunches, number of fruits per bunch, number of fruits, and total weight of fruits, with calculation of the Pearson correlation matrix between them. Sixty eight sample sizes were planned for one greenhouse and 48 for another, with the initial sample size of 10 plants, and the others were obtained by adding five plants. For each planned sample size, 3000 estimates of the Pearson correlation coefficient were obtained through bootstrap re-samplings with replacement. The sample size for each correlation coefficient was determined when the 95% confidence interval amplitude value was less than or equal to 0.4. Obtaining estimates of the Pearson correlation coefficient with high precision is difficult for parameters with a weak linear relation. Accordingly, a larger sample size is necessary to estimate them. Linear relations involving variables dealing with size and number of fruits per plant have less precision. To estimate the coefficient of correlation between productivity variables of cherry tomato, with a confidence interval of 95% equal to 0.4, it is necessary to sample 275 plants in a 250m² greenhouse, and 200 plants in a 200m² greenhouse.
Data Quality Objectives For Selecting Waste Samples For Bench-Scale Reformer Treatability Studies

International Nuclear Information System (INIS)

Banning, D.L.

2011-01-01

This document describes the data quality objectives to select archived samples located at the 222-S Laboratory for Bench-Scale Reforming testing. The type, quantity, and quality of the data required to select the samples for Fluid Bed Steam Reformer testing are discussed. In order to maximize the efficiency and minimize the time to treat Hanford tank waste in the Waste Treatment and Immobilization Plant, additional treatment processes may be required. One of the potential treatment processes is the fluidized bed steam reformer. A determination of the adequacy of the fluidized bed steam reformer process to treat Hanford tank waste is required. The initial step in determining the adequacy of the fluidized bed steam reformer process is to select archived waste samples from the 222-S Laboratory that will be used in a bench scale tests. Analyses of the selected samples will be required to confirm the samples meet the shipping requirements and for comparison to the bench scale reformer (BSR) test sample selection requirements.
Fixed-location hydroacoustic monitoring designs for estimating fish passage using stratified random and systematic sampling

International Nuclear Information System (INIS)

Skalski, J.R.; Hoffman, A.; Ransom, B.H.; Steig, T.W.

1993-01-01

Five alternate sampling designs are compared using 15 d of 24-h continuous hydroacoustic data to identify the most favorable approach to fixed-location hydroacoustic monitoring of salmonid outmigrants. Four alternative aproaches to systematic sampling are compared among themselves and with stratified random sampling (STRS). Stratifying systematic sampling (STSYS) on a daily basis is found to reduce sampling error in multiday monitoring studies. Although sampling precision was predictable with varying levels of effort in STRS, neither magnitude nor direction of change in precision was predictable when effort was varied in systematic sampling (SYS). Furthermore, modifying systematic sampling to include replicated (e.g., nested) sampling (RSYS) is further shown to provide unbiased point and variance estimates as does STRS. Numerous short sampling intervals (e.g., 12 samples of 1-min duration per hour) must be monitored hourly using RSYS to provide efficient, unbiased point and interval estimates. For equal levels of effort, STRS outperformed all variations of SYS examined. Parametric approaches to confidence interval estimates are found to be superior to nonparametric interval estimates (i.e., bootstrap and jackknife) in estimating total fish passage. 10 refs., 1 fig., 8 tabs
Nondestructive, stereological estimation of canopy surface area

DEFF Research Database (Denmark)

Wulfsohn, Dvora-Laio; Sciortino, Marco; Aaslyng, Jesper M.

2010-01-01

We describe a stereological procedure to estimate the total leaf surface area of a plant canopy in vivo, and address the problem of how to predict the variance of the corresponding estimator. The procedure involves three nested systematic uniform random sampling stages: (i) selection of plants from...... a canopy using the smooth fractionator, (ii) sampling of leaves from the selected plants using the fractionator, and (iii) area estimation of the sampled leaves using point counting. We apply this procedure to estimate the total area of a chrysanthemum (Chrysanthemum morifolium L.) canopy and evaluate both...... the time required and the precision of the estimator. Furthermore, we compare the precision of point counting for three different grid intensities with that of several standard leaf area measurement techniques. Results showed that the precision of the plant leaf area estimator based on point counting...
Robust online tracking via adaptive samples selection with saliency detection

Science.gov (United States)

Yan, Jia; Chen, Xi; Zhu, QiuPing

2013-12-01

Online tracking has shown to be successful in tracking of previously unknown objects. However, there are two important factors which lead to drift problem of online tracking, the one is how to select the exact labeled samples even when the target locations are inaccurate, and the other is how to handle the confusors which have similar features with the target. In this article, we propose a robust online tracking algorithm with adaptive samples selection based on saliency detection to overcome the drift problem. To deal with the problem of degrading the classifiers using mis-aligned samples, we introduce the saliency detection method to our tracking problem. Saliency maps and the strong classifiers are combined to extract the most correct positive samples. Our approach employs a simple yet saliency detection algorithm based on image spectral residual analysis. Furthermore, instead of using the random patches as the negative samples, we propose a reasonable selection criterion, in which both the saliency confidence and similarity are considered with the benefits that confusors in the surrounding background are incorporated into the classifiers update process before the drift occurs. The tracking task is formulated as a binary classification via online boosting framework. Experiment results in several challenging video sequences demonstrate the accuracy and stability of our tracker.
Accurate Frequency Estimation Based On Three-Parameter Sine-Fitting With Three FFT Samples

Directory of Open Access Journals (Sweden)

Liu Xin

2015-09-01

Full Text Available This paper presents a simple DFT-based golden section searching algorithm (DGSSA for the single tone frequency estimation. Because of truncation and discreteness in signal samples, Fast Fourier Transform (FFT and Discrete Fourier Transform (DFT are inevitable to cause the spectrum leakage and fence effect which lead to a low estimation accuracy. This method can improve the estimation accuracy under conditions of a low signal-to-noise ratio (SNR and a low resolution. This method firstly uses three FFT samples to determine the frequency searching scope, then – besides the frequency – the estimated values of amplitude, phase and dc component are obtained by minimizing the least square (LS fitting error of three-parameter sine fitting. By setting reasonable stop conditions or the number of iterations, the accurate frequency estimation can be realized. The accuracy of this method, when applied to observed single-tone sinusoid samples corrupted by white Gaussian noise, is investigated by different methods with respect to the unbiased Cramer-Rao Low Bound (CRLB. The simulation results show that the root mean square error (RMSE of the frequency estimation curve is consistent with the tendency of CRLB as SNR increases, even in the case of a small number of samples. The average RMSE of the frequency estimation is less than 1.5 times the CRLB with SNR = 20 dB and N = 512.
40 CFR 205.57-2 - Test vehicle sample selection.

Science.gov (United States)

2010-07-01

... pursuant to a test request in accordance with this subpart will be selected in the manner specified in the... then using a table of random numbers to select the number of vehicles as specified in paragraph (c) of... with the desig-nated AQL are contained in Appendix I, -Table II. (c) The appropriate batch sample size...
Estimates and sampling schemes for the instrumentation of accountability systems

International Nuclear Information System (INIS)

Jewell, W.S.; Kwiatkowski, J.W.

1976-10-01

The problem of estimation of a physical quantity from a set of measurements is considered, where the measurements are made on samples with a hierarchical error structure, and where within-groups error variances may vary from group to group at each level of the structure; minimum mean squared-error estimators are developed, and the case where the physical quantity is a random variable with known prior mean and variance is included. Estimators for the error variances are also given, and optimization of experimental design is considered
Increasing fMRI sampling rate improves Granger causality estimates.

Directory of Open Access Journals (Sweden)

Fa-Hsuan Lin

Full Text Available Estimation of causal interactions between brain areas is necessary for elucidating large-scale functional brain networks underlying behavior and cognition. Granger causality analysis of time series data can quantitatively estimate directional information flow between brain regions. Here, we show that such estimates are significantly improved when the temporal sampling rate of functional magnetic resonance imaging (fMRI is increased 20-fold. Specifically, healthy volunteers performed a simple visuomotor task during blood oxygenation level dependent (BOLD contrast based whole-head inverse imaging (InI. Granger causality analysis based on raw InI BOLD data sampled at 100-ms resolution detected the expected causal relations, whereas when the data were downsampled to the temporal resolution of 2 s typically used in echo-planar fMRI, the causality could not be detected. An additional control analysis, in which we SINC interpolated additional data points to the downsampled time series at 0.1-s intervals, confirmed that the improvements achieved with the real InI data were not explainable by the increased time-series length alone. We therefore conclude that the high-temporal resolution of InI improves the Granger causality connectivity analysis of the human brain.
Assessing the joint effect of population stratification and sample selection in studies of gene-gene (environment interactions

Directory of Open Access Journals (Sweden)

Cheng KF

2012-01-01

Full Text Available Abstract Background It is well known that the presence of population stratification (PS may cause the usual test in case-control studies to produce spurious gene-disease associations. However, the impact of the PS and sample selection (SS is less known. In this paper, we provide a systematic study of the joint effect of PS and SS under a more general risk model containing genetic and environmental factors. We provide simulation results to show the magnitude of the bias and its impact on type I error rate of the usual chi-square test under a wide range of PS level and selection bias. Results The biases to the estimation of main and interaction effect are quantified and then their bounds derived. The estimated bounds can be used to compute conservative p-values for the association test. If the conservative p-value is smaller than the significance level, we can safely claim that the association test is significant regardless of the presence of PS or not, or if there is any selection bias. We also identify conditions for the null bias. The bias depends on the allele frequencies, exposure rates, gene-environment odds ratios and disease risks across subpopulations and the sampling of the cases and controls. Conclusion Our results show that the bias cannot be ignored even the case and control data were matched in ethnicity. A real example is given to illustrate application of the conservative p-value. These results are useful to the genetic association studies of main and interaction effects.
Design unbiased estimation in line intersect sampling using segmented transects

Science.gov (United States)

David L.R. Affleck; Timothy G. Gregoire; Harry T. Valentine; Harry T. Valentine

2005-01-01

In many applications of line intersect sampling. transects consist of multiple, connected segments in a prescribed configuration. The relationship between the transect configuration and the selection probability of a population element is illustrated and a consistent sampling protocol, applicable to populations composed of arbitrarily shaped elements, is proposed. It...
Zinc estimates in ore and slag samples and analysis of ash in coal samples

International Nuclear Information System (INIS)

Umamaheswara Rao, K.; Narayana, D.G.S.; Subrahmanyam, Y.

1984-01-01

Zinc estimates in ore and slag samples were made using the radioisotope X-ray fluorescence method. A 10 mCi 238 Pu was employed as the primary source of radiation and a thin crystal NaI(Ti) spectrometer was used to accomplish the detection of the 8.64 keV Zinc K-characteristic X-ray line. The results are reported. Ash content of coal concerning about 100 samples from Ravindra Khani VI and VII mines in Andhra Pradesh were measured using X-ray backscattering method with compensation for varying concentrations of iron in different coal samples through iron-X-ray fluorescent intensity measurements. The ash percent is found to range from 10 to 40. (author)
Lightweight Graphical Models for Selectivity Estimation Without Independence Assumptions

DEFF Research Database (Denmark)

Tzoumas, Kostas; Deshpande, Amol; Jensen, Christian S.

2011-01-01

the attributes in the database into small, usually two-dimensional distributions. We describe several optimizations that can make selectivity estimation highly efficient, and we present a complete implementation inside PostgreSQL’s query optimizer. Experimental results indicate an order of magnitude better...
Estimating U.S. residential demand for fuelwood in the presence of selectivity

Science.gov (United States)

Daly, Ryan Michael

Residential energy consumers have options for home heating. With many applications, appliances, and fuel types, fuelwood used for heating faces stiff competition in modern society from other fuels. This study estimates demand for domestic fuelwood. It also examines whether evidence of bias exists from residential homes choosing to use fuelwood. The use of OLS as an estimator will yield biased results if such selectivity exists. Selectivity is addressed with a Heckman (1979) two-step procedure; bias in fuelwood demand estimation using OLS is reduced. Non-wood energy prices and income are major determinants of fuelwood demand. Geographical regions and urbanization confirm results from prior studies.
Estimation of sample size and testing power (Part 3).

Science.gov (United States)

Hu, Liang-ping; Bao, Xiao-lei; Guan, Xue; Zhou, Shi-guo

2011-12-01

This article introduces the definition and sample size estimation of three special tests (namely, non-inferiority test, equivalence test and superiority test) for qualitative data with the design of one factor with two levels having a binary response variable. Non-inferiority test refers to the research design of which the objective is to verify that the efficacy of the experimental drug is not clinically inferior to that of the positive control drug. Equivalence test refers to the research design of which the objective is to verify that the experimental drug and the control drug have clinically equivalent efficacy. Superiority test refers to the research design of which the objective is to verify that the efficacy of the experimental drug is clinically superior to that of the control drug. By specific examples, this article introduces formulas of sample size estimation for the three special tests, and their SAS realization in detail.
Estimated ventricle size using Evans index: reference values from a population-based sample.

Science.gov (United States)

Jaraj, D; Rabiei, K; Marlow, T; Jensen, C; Skoog, I; Wikkelsø, C

2017-03-01

Evans index is an estimate of ventricular size used in the diagnosis of idiopathic normal-pressure hydrocephalus (iNPH). Values >0.3 are considered pathological and are required by guidelines for the diagnosis of iNPH. However, there are no previous epidemiological studies on Evans index, and normal values in adults are thus not precisely known. We examined a representative sample to obtain reference values and descriptive data on Evans index. A population-based sample (n = 1235) of men and women aged ≥70 years was examined. The sample comprised people living in private households and residential care, systematically selected from the Swedish population register. Neuropsychiatric examinations, including head computed tomography, were performed between 1986 and 2000. Evans index ranged from 0.11 to 0.46. The mean value in the total sample was 0.28 (SD, 0.04) and 20.6% (n = 255) had values >0.3. Among men aged ≥80 years, the mean value of Evans index was 0.3 (SD, 0.03). Individuals with dementia had a mean value of Evans index of 0.31 (SD, 0.05) and those with radiological signs of iNPH had a mean value of 0.36 (SD, 0.04). A substantial number of subjects had ventricular enlargement according to current criteria. Clinicians and researchers need to be aware of the range of values among older individuals. © 2017 EAN.
Seasonal and temporal variation in release of antibiotics in hospital wastewater: estimation using continuous and grab sampling.

Science.gov (United States)

Diwan, Vishal; Stålsby Lundborg, Cecilia; Tamhankar, Ashok J

2013-01-01

The presence of antibiotics in the environment and their subsequent impact on resistance development has raised concerns globally. Hospitals are a major source of antibiotics released into the environment. To reduce these residues, research to improve knowledge of the dynamics of antibiotic release from hospitals is essential. Therefore, we undertook a study to estimate seasonal and temporal variation in antibiotic release from two hospitals in India over a period of two years. For this, 6 sampling sessions of 24 hours each were conducted in the three prominent seasons of India, at all wastewater outlets of the two hospitals, using continuous and grab sampling methods. An in-house wastewater sampler was designed for continuous sampling. Eight antibiotics from four major antibiotic groups were selected for the study. To understand the temporal pattern of antibiotic release, each of the 24-hour sessions were divided in three sub-sampling sessions of 8 hours each. Solid phase extraction followed by liquid chromatography/tandem mass spectrometry (LC-MS/MS) was used to determine the antibiotic residues. Six of the eight antibiotics studied were detected in the wastewater samples. Both continuous and grab sampling methods indicated that the highest quantities of fluoroquinolones were released in winter followed by the rainy season and the summer. No temporal pattern in antibiotic release was detected. In general, in a common timeframe, continuous sampling showed less concentration of antibiotics in wastewater as compared to grab sampling. It is suggested that continuous sampling should be the method of choice as grab sampling gives erroneous results, it being indicative of the quantities of antibiotics present in wastewater only at the time of sampling. Based on our studies, calculations indicate that from hospitals in India, an estimated 89, 1 and 25 ng/L/day of fluroquinolones, metronidazole and sulfamethoxazole respectively, might be getting released into the
Graph Sampling for Covariance Estimation

KAUST Repository

Chepuri, Sundeep Prabhakar

2017-04-25

In this paper the focus is on subsampling as well as reconstructing the second-order statistics of signals residing on nodes of arbitrary undirected graphs. Second-order stationary graph signals may be obtained by graph filtering zero-mean white noise and they admit a well-defined power spectrum whose shape is determined by the frequency response of the graph filter. Estimating the graph power spectrum forms an important component of stationary graph signal processing and related inference tasks such as Wiener prediction or inpainting on graphs. The central result of this paper is that by sampling a significantly smaller subset of vertices and using simple least squares, we can reconstruct the second-order statistics of the graph signal from the subsampled observations, and more importantly, without any spectral priors. To this end, both a nonparametric approach as well as parametric approaches including moving average and autoregressive models for the graph power spectrum are considered. The results specialize for undirected circulant graphs in that the graph nodes leading to the best compression rates are given by the so-called minimal sparse rulers. A near-optimal greedy algorithm is developed to design the subsampling scheme for the non-parametric and the moving average models, whereas a particular subsampling scheme that allows linear estimation for the autoregressive model is proposed. Numerical experiments on synthetic as well as real datasets related to climatology and processing handwritten digits are provided to demonstrate the developed theory.
Systematic sampling for suspended sediment

Science.gov (United States)

Robert B. Thomas

1991-01-01

Abstract - Because of high costs or complex logistics, scientific populations cannot be measured entirely and must be sampled. Accepted scientific practice holds that sample selection be based on statistical principles to assure objectivity when estimating totals and variances. Probability sampling--obtaining samples with known probabilities--is the only method that...
A probabilistic sampling method (PSM for estimating geographic distance to health services when only the region of residence is known

Directory of Open Access Journals (Sweden)

Peek-Asa Corinne

2011-01-01

Full Text Available Abstract Background The need to estimate the distance from an individual to a service provider is common in public health research. However, estimated distances are often imprecise and, we suspect, biased due to a lack of specific residential location data. In many cases, to protect subject confidentiality, data sets contain only a ZIP Code or a county. Results This paper describes an algorithm, known as "the probabilistic sampling method" (PSM, which was used to create a distribution of estimated distances to a health facility for a person whose region of residence was known, but for which demographic details and centroids were known for smaller areas within the region. From this distribution, the median distance is the most likely distance to the facility. The algorithm, using Monte Carlo sampling methods, drew a probabilistic sample of all the smaller areas (Census blocks within each participant's reported region (ZIP Code, weighting these areas by the number of residents in the same age group as the participant. To test the PSM, we used data from a large cross-sectional study that screened women at a clinic for intimate partner violence (IPV. We had data on each woman's age and ZIP Code, but no precise residential address. We used the PSM to select a sample of census blocks, then calculated network distances from each census block's centroid to the closest IPV facility, resulting in a distribution of distances from these locations to the geocoded locations of known IPV services. We selected the median distance as the most likely distance traveled and computed confidence intervals that describe the shortest and longest distance within which any given percent of the distance estimates lie. We compared our results to those obtained using two other geocoding approaches. We show that one method overestimated the most likely distance and the other underestimated it. Neither of the alternative methods produced confidence intervals for the distance

A probabilistic sampling method (PSM) for estimating geographic distance to health services when only the region of residence is known

Science.gov (United States)

2011-01-01

Background The need to estimate the distance from an individual to a service provider is common in public health research. However, estimated distances are often imprecise and, we suspect, biased due to a lack of specific residential location data. In many cases, to protect subject confidentiality, data sets contain only a ZIP Code or a county. Results This paper describes an algorithm, known as "the probabilistic sampling method" (PSM), which was used to create a distribution of estimated distances to a health facility for a person whose region of residence was known, but for which demographic details and centroids were known for smaller areas within the region. From this distribution, the median distance is the most likely distance to the facility. The algorithm, using Monte Carlo sampling methods, drew a probabilistic sample of all the smaller areas (Census blocks) within each participant's reported region (ZIP Code), weighting these areas by the number of residents in the same age group as the participant. To test the PSM, we used data from a large cross-sectional study that screened women at a clinic for intimate partner violence (IPV). We had data on each woman's age and ZIP Code, but no precise residential address. We used the PSM to select a sample of census blocks, then calculated network distances from each census block's centroid to the closest IPV facility, resulting in a distribution of distances from these locations to the geocoded locations of known IPV services. We selected the median distance as the most likely distance traveled and computed confidence intervals that describe the shortest and longest distance within which any given percent of the distance estimates lie. We compared our results to those obtained using two other geocoding approaches. We show that one method overestimated the most likely distance and the other underestimated it. Neither of the alternative methods produced confidence intervals for the distance estimates. The algorithm
Infusion and sampling site effects on two-pool model estimates of leucine metabolism

International Nuclear Information System (INIS)

Helland, S.J.; Grisdale-Helland, B.; Nissen, S.

1988-01-01

To assess the effect of site of isotope infusion on estimates of leucine metabolism infusions of alpha-[4,5-3H]ketoisocaproate (KIC) and [U- 14 C]leucine were made into the left or right ventricles of sheep and pigs. Blood was sampled from the opposite ventricle. In both species, left ventricular infusions resulted in significantly lower specific radioactivities (SA) of [ 14 C]leucine and [ 3 H]KIC. [ 14 C]KIC SA was found to be insensitive to infusion and sampling sites. [ 14 C]KIC was in addition found to be equal to the SA of [ 14 C]leucine only during the left heart infusions. Therefore, [ 14 C]KIC SA was used as the only estimate for [ 14 C]SA in the equations for the two-pool model. This model eliminated the influence of site of infusion and blood sampling on the estimates for leucine entry and reduced the impact on the estimates for proteolysis and oxidation. This two-pool model could not compensate for the underestimation of transamination reactions occurring during the traditional venous isotope infusion and arterial blood sampling
40 CFR 205.171-2 - Test exhaust system sample selection and preparation.

Science.gov (United States)

2010-07-01

... Systems § 205.171-2 Test exhaust system sample selection and preparation. (a)(1) Exhaust systems... 40 Protection of Environment 24 2010-07-01 2010-07-01 false Test exhaust system sample selection and preparation. 205.171-2 Section 205.171-2 Protection of Environment ENVIRONMENTAL PROTECTION AGENCY...
Integration of sampling based battery state of health estimation method in electric vehicles

International Nuclear Information System (INIS)

Ozkurt, Celil; Camci, Fatih; Atamuradov, Vepa; Odorry, Christopher

2016-01-01

Highlights: • Presentation of a prototype system with full charge discharge cycling capability. • Presentation of SoH estimation results for systems degraded in the lab. • Discussion of integration alternatives of the presented method in EVs. • Simulation model based on presented SoH estimation for a real EV battery system. • Optimization of number of battery cells to be selected for SoH test. - Abstract: Battery cost is one of the crucial parameters affecting high deployment of Electric Vehicles (EVs) negatively. Accurate State of Health (SoH) estimation plays an important role in reducing the total ownership cost, availability, and safety of the battery avoiding early disposal of the batteries and decreasing unexpected failures. A circuit design for SoH estimation in a battery system that bases on selected battery cells and its integration to EVs are presented in this paper. A prototype microcontroller has been developed and used for accelerated aging tests for a battery system. The data collected in the lab tests have been utilized to simulate a real EV battery system. Results of accelerated aging tests and simulation have been presented in the paper. The paper also discusses identification of the best number of battery cells to be selected for SoH estimation test. In addition, different application options of the presented approach for EV batteries have been discussed in the paper.
Selective information sampling

Directory of Open Access Journals (Sweden)

Peter A. F. Fraser-Mackenzie

2009-06-01

Full Text Available This study investigates the amount and valence of information selected during single item evaluation. One hundred and thirty-five participants evaluated a cell phone by reading hypothetical customers reports. Some participants were first asked to provide a preliminary rating based on a picture of the phone and some technical specifications. The participants who were given the customer reports only after they made a preliminary rating exhibited valence bias in their selection of customers reports. In contrast, the participants that did not make an initial rating sought subsequent information in a more balanced, albeit still selective, manner. The preliminary raters used the least amount of information in their final decision, resulting in faster decision times. The study appears to support the notion that selective exposure is utilized in order to develop cognitive coherence.
Novel joint selection methods can reduce sample size for rheumatoid arthritis clinical trials with ultrasound endpoints.

Science.gov (United States)

Allen, John C; Thumboo, Julian; Lye, Weng Kit; Conaghan, Philip G; Chew, Li-Ching; Tan, York Kiat

2018-03-01

To determine whether novel methods of selecting joints through (i) ultrasonography (individualized-ultrasound [IUS] method), or (ii) ultrasonography and clinical examination (individualized-composite-ultrasound [ICUS] method) translate into smaller rheumatoid arthritis (RA) clinical trial sample sizes when compared to existing methods utilizing predetermined joint sites for ultrasonography. Cohen's effect size (ES) was estimated (ES^) and a 95% CI (ES^L, ES^U) calculated on a mean change in 3-month total inflammatory score for each method. Corresponding 95% CIs [nL(ES^U), nU(ES^L)] were obtained on a post hoc sample size reflecting the uncertainty in ES^. Sample size calculations were based on a one-sample t-test as the patient numbers needed to provide 80% power at α = 0.05 to reject a null hypothesis H 0 : ES = 0 versus alternative hypotheses H 1 : ES = ES^, ES = ES^L and ES = ES^U. We aimed to provide point and interval estimates on projected sample sizes for future studies reflecting the uncertainty in our study ES^S. Twenty-four treated RA patients were followed up for 3 months. Utilizing the 12-joint approach and existing methods, the post hoc sample size (95% CI) was 22 (10-245). Corresponding sample sizes using ICUS and IUS were 11 (7-40) and 11 (6-38), respectively. Utilizing a seven-joint approach, the corresponding sample sizes using ICUS and IUS methods were nine (6-24) and 11 (6-35), respectively. Our pilot study suggests that sample size for RA clinical trials with ultrasound endpoints may be reduced using the novel methods, providing justification for larger studies to confirm these observations. © 2017 Asia Pacific League of Associations for Rheumatology and John Wiley & Sons Australia, Ltd.
Perpendicular distance sampling: an alternative method for sampling downed coarse woody debris

Science.gov (United States)

Michael S. Williams; Jeffrey H. Gove

2003-01-01

Coarse woody debris (CWD) plays an important role in many forest ecosystem processes. In recent years, a number of new methods have been proposed to sample CWD. These methods select individual logs into the sample using some form of unequal probability sampling. One concern with most of these methods is the difficulty in estimating the volume of each log. A new method...
An improved selective sampling method

International Nuclear Information System (INIS)

Miyahara, Hiroshi; Iida, Nobuyuki; Watanabe, Tamaki

1986-01-01

The coincidence methods which are currently used for the accurate activity standardisation of radio-nuclides, require dead time and resolving time corrections which tend to become increasingly uncertain as countrates exceed about 10 K. To reduce the dependence on such corrections, Muller, in 1981, proposed the selective sampling method using a fast multichannel analyser (50 ns ch -1 ) for measuring the countrates. It is, in many ways, more convenient and possibly potentially more reliable to replace the MCA with scalers and a circuit is described employing five scalers; two of them serving to measure the background correction. Results of comparisons using our new method and the coincidence method for measuring the activity of 60 Co sources yielded agree-ment within statistical uncertainties. (author)
Estimating group size: effects of category membership, differential construal and selective exposure

NARCIS (Netherlands)

Bosveld, W.; Koomen, W.; van der Pligt, J.

1996-01-01

Examined the role of category membership, differential construal, and selective exposure in consensus estimation concerning the social categorization of religion. 54 involved and less involved Christians and 40 non-believers were asked to estimate the percentage of Christians in the Netherlands
The impact of fecal sample processing on prevalence estimates for antibiotic-resistant Escherichia coli.

Science.gov (United States)

Omulo, Sylvia; Lofgren, Eric T; Mugoh, Maina; Alando, Moshe; Obiya, Joshua; Kipyegon, Korir; Kikwai, Gilbert; Gumbi, Wilson; Kariuki, Samuel; Call, Douglas R

2017-05-01

Investigators often rely on studies of Escherichia coli to characterize the burden of antibiotic resistance in a clinical or community setting. To determine if prevalence estimates for antibiotic resistance are sensitive to sample handling and interpretive criteria, we collected presumptive E. coli isolates (24 or 95 per stool sample) from a community in an urban informal settlement in Kenya. Isolates were tested for susceptibility to nine antibiotics using agar breakpoint assays and results were analyzed using generalized linear mixed models. We observed a 0.1). Prevalence estimates did not differ for five distinct E. coli colony morphologies on MacConkey agar plates (P>0.2). Successive re-plating of samples for up to five consecutive days had little to no impact on prevalence estimates. Finally, culturing E. coli under different conditions (with 5% CO 2 or micro-aerobic) did not affect estimates of prevalence. For the conditions tested in these experiments, minor modifications in sample processing protocols are unlikely to bias estimates of the prevalence of antibiotic-resistance for fecal E. coli. Copyright © 2017 Elsevier B.V. All rights reserved.
Exact run length distribution of the double sampling x-bar chart with estimated process parameters

Directory of Open Access Journals (Sweden)

Teoh, W. L.

2016-05-01

Full Text Available Since the run length distribution is generally highly skewed, a significant concern about focusing too much on the average run length (ARL criterion is that we may miss some crucial information about a control chart’s performance. Thus it is important to investigate the entire run length distribution of a control chart for an in-depth understanding before implementing the chart in process monitoring. In this paper, the percentiles of the run length distribution for the double sampling (DS X chart with estimated process parameters are computed. Knowledge of the percentiles of the run length distribution provides a more comprehensive understanding of the expected behaviour of the run length. This additional information includes the early false alarm, the skewness of the run length distribution, and the median run length (MRL. A comparison of the run length distribution between the optimal ARL-based and MRL-based DS X chart with estimated process parameters is presented in this paper. Examples of applications are given to aid practitioners to select the best design scheme of the DS X chart with estimated process parameters, based on their specific purpose.
Conditional estimation of exponential random graph models from snowball sampling designs

NARCIS (Netherlands)

Pattison, Philippa E.; Robins, Garry L.; Snijders, Tom A. B.; Wang, Peng

2013-01-01

A complete survey of a network in a large population may be prohibitively difficult and costly. So it is important to estimate models for networks using data from various network sampling designs, such as link-tracing designs. We focus here on snowball sampling designs, designs in which the members
Estimates of laboratory accuracy and precision on Hanford waste tank samples

International Nuclear Information System (INIS)

Dodd, D.A.

1995-01-01

A review was performed on three sets of analyses generated in Battelle, Pacific Northwest Laboratories and three sets generated by Westinghouse Hanford Company, 222-S Analytical Laboratory. Laboratory accuracy and precision was estimated by analyte and is reported in tables. The sources used to generate this estimate is of limited size but does include the physical forms, liquid and solid, which are representative of samples from tanks to be characterized. This estimate was published as an aid to programs developing data quality objectives in which specified limits are established. Data resulting from routine analyses of waste matrices can be expected to be bounded by the precision and accuracy estimates of the tables. These tables do not preclude or discourage direct negotiations between program and laboratory personnel while establishing bounding conditions. Programmatic requirements different than those listed may be reliably met on specific measurements and matrices. It should be recognized, however, that these are specific to waste tank matrices and may not be indicative of performance on samples from other sources
Mutation-selection models of codon substitution and their use to estimate selective strengths on codon usage

DEFF Research Database (Denmark)

Yang, Ziheng; Nielsen, Rasmus

2008-01-01

Current models of codon substitution are formulated at the levels of nucleotide substitution and do not explicitly consider the separate effects of mutation and selection. They are thus incapable of inferring whether mutation or selection is responsible for evolution at silent sites. Here we impl...... codon usage in mammals. Estimates of selection coefficients nevertheless suggest that selection on codon usage is weak and most mutations are nearly neutral. The sensitivity of the analysis on the assumed mutation model is discussed.......Current models of codon substitution are formulated at the levels of nucleotide substitution and do not explicitly consider the separate effects of mutation and selection. They are thus incapable of inferring whether mutation or selection is responsible for evolution at silent sites. Here we...... implement a few population genetics models of codon substitution that explicitly consider mutation bias and natural selection at the DNA level. Selection on codon usage is modeled by introducing codon-fitness parameters, which together with mutation-bias parameters, predict optimal codon frequencies...
A Probabilistic Mass Estimation Algorithm for a Novel 7- Channel Capacitive Sample Verification Sensor

Science.gov (United States)

Wolf, Michael

2012-01-01

A document describes an algorithm created to estimate the mass placed on a sample verification sensor (SVS) designed for lunar or planetary robotic sample return missions. A novel SVS measures the capacitance between a rigid bottom plate and an elastic top membrane in seven locations. As additional sample material (soil and/or small rocks) is placed on the top membrane, the deformation of the membrane increases the capacitance. The mass estimation algorithm addresses both the calibration of each SVS channel, and also addresses how to combine the capacitances read from each of the seven channels into a single mass estimate. The probabilistic approach combines the channels according to the variance observed during the training phase, and provides not only the mass estimate, but also a value for the certainty of the estimate. SVS capacitance data is collected for known masses under a wide variety of possible loading scenarios, though in all cases, the distribution of sample within the canister is expected to be approximately uniform. A capacitance-vs-mass curve is fitted to this data, and is subsequently used to determine the mass estimate for the single channel s capacitance reading during the measurement phase. This results in seven different mass estimates, one for each SVS channel. Moreover, the variance of the calibration data is used to place a Gaussian probability distribution function (pdf) around this mass estimate. To blend these seven estimates, the seven pdfs are combined into a single Gaussian distribution function, providing the final mean and variance of the estimate. This blending technique essentially takes the final estimate as an average of the estimates of the seven channels, weighted by the inverse of the channel s variance.
Estimation of tritium activity in bioassay samples having chemiluminescence

International Nuclear Information System (INIS)

Dwivedi, R.K.; Manu, Kumar; Kumar, Vinay; Soni, Ashish; Kaushik, A.K.; Tiwari, S.K.; Gupta, Ashok

2008-01-01

Tritium is recognized as major internal dose contributor in PHWR type of reactors. Estimation of internal dose due to tritium is carried out by analyzing urine samples in liquid scintillation analyzer (LSA). Presence of residual biochemical species in urine samples of some individuals under medical administration shows significant amount of chemiluminescence. If appropriate care is not taken the results obtained by liquid scintillation counter may be mistaken as genuine uptake of tritium. The distillation method described in this paper is used at RAPS-3 and 4 to assess correct tritium uptake. (author)
Sampling of systematic errors to estimate likelihood weights in nuclear data uncertainty propagation

International Nuclear Information System (INIS)

Helgesson, P.; Sjöstrand, H.; Koning, A.J.; Rydén, J.; Rochman, D.; Alhassan, E.; Pomp, S.

2016-01-01

In methodologies for nuclear data (ND) uncertainty assessment and propagation based on random sampling, likelihood weights can be used to infer experimental information into the distributions for the ND. As the included number of correlated experimental points grows large, the computational time for the matrix inversion involved in obtaining the likelihood can become a practical problem. There are also other problems related to the conventional computation of the likelihood, e.g., the assumption that all experimental uncertainties are Gaussian. In this study, a way to estimate the likelihood which avoids matrix inversion is investigated; instead, the experimental correlations are included by sampling of systematic errors. It is shown that the model underlying the sampling methodology (using univariate normal distributions for random and systematic errors) implies a multivariate Gaussian for the experimental points (i.e., the conventional model). It is also shown that the likelihood estimates obtained through sampling of systematic errors approach the likelihood obtained with matrix inversion as the sample size for the systematic errors grows large. In studied practical cases, it is seen that the estimates for the likelihood weights converge impractically slowly with the sample size, compared to matrix inversion. The computational time is estimated to be greater than for matrix inversion in cases with more experimental points, too. Hence, the sampling of systematic errors has little potential to compete with matrix inversion in cases where the latter is applicable. Nevertheless, the underlying model and the likelihood estimates can be easier to intuitively interpret than the conventional model and the likelihood function involving the inverted covariance matrix. Therefore, this work can both have pedagogical value and be used to help motivating the conventional assumption of a multivariate Gaussian for experimental data. The sampling of systematic errors could also
Identification and estimation of nonlinear models using two samples with nonclassical measurement errors

KAUST Repository

Carroll, Raymond J.

2010-05-01

This paper considers identification and estimation of a general nonlinear Errors-in-Variables (EIV) model using two samples. Both samples consist of a dependent variable, some error-free covariates, and an error-prone covariate, for which the measurement error has unknown distribution and could be arbitrarily correlated with the latent true values; and neither sample contains an accurate measurement of the corresponding true variable. We assume that the regression model of interest - the conditional distribution of the dependent variable given the latent true covariate and the error-free covariates - is the same in both samples, but the distributions of the latent true covariates vary with observed error-free discrete covariates. We first show that the general latent nonlinear model is nonparametrically identified using the two samples when both could have nonclassical errors, without either instrumental variables or independence between the two samples. When the two samples are independent and the nonlinear regression model is parameterized, we propose sieve Quasi Maximum Likelihood Estimation (Q-MLE) for the parameter of interest, and establish its root-n consistency and asymptotic normality under possible misspecification, and its semiparametric efficiency under correct specification, with easily estimated standard errors. A Monte Carlo simulation and a data application are presented to show the power of the approach.
PARAMETER ESTIMATION AND MODEL SELECTION FOR INDOOR ENVIRONMENTS BASED ON SPARSE OBSERVATIONS

Directory of Open Access Journals (Sweden)

Y. Dehbi

2017-09-01

Full Text Available This paper presents a novel method for the parameter estimation and model selection for the reconstruction of indoor environments based on sparse observations. While most approaches for the reconstruction of indoor models rely on dense observations, we predict scenes of the interior with high accuracy in the absence of indoor measurements. We use a model-based top-down approach and incorporate strong but profound prior knowledge. The latter includes probability density functions for model parameters and sparse observations such as room areas and the building footprint. The floorplan model is characterized by linear and bi-linear relations with discrete and continuous parameters. We focus on the stochastic estimation of model parameters based on a topological model derived by combinatorial reasoning in a first step. A Gauss-Markov model is applied for estimation and simulation of the model parameters. Symmetries are represented and exploited during the estimation process. Background knowledge as well as observations are incorporated in a maximum likelihood estimation and model selection is performed with AIC/BIC. The likelihood is also used for the detection and correction of potential errors in the topological model. Estimation results are presented and discussed.
Parameter Estimation and Model Selection for Indoor Environments Based on Sparse Observations

Science.gov (United States)

Dehbi, Y.; Loch-Dehbi, S.; Plümer, L.

2017-09-01

This paper presents a novel method for the parameter estimation and model selection for the reconstruction of indoor environments based on sparse observations. While most approaches for the reconstruction of indoor models rely on dense observations, we predict scenes of the interior with high accuracy in the absence of indoor measurements. We use a model-based top-down approach and incorporate strong but profound prior knowledge. The latter includes probability density functions for model parameters and sparse observations such as room areas and the building footprint. The floorplan model is characterized by linear and bi-linear relations with discrete and continuous parameters. We focus on the stochastic estimation of model parameters based on a topological model derived by combinatorial reasoning in a first step. A Gauss-Markov model is applied for estimation and simulation of the model parameters. Symmetries are represented and exploited during the estimation process. Background knowledge as well as observations are incorporated in a maximum likelihood estimation and model selection is performed with AIC/BIC. The likelihood is also used for the detection and correction of potential errors in the topological model. Estimation results are presented and discussed.

A Multiwavelength Study of a Sample of 70 μm Selected Galaxies in the COSMOS Field : I. Spectral Energy Distributions and Luminosities

NARCIS (Netherlands)

Kartaltepe, Jeyhan S.; Sanders, D. B.; Le Floc'h, E.; Frayer, D. T.; Aussel, H.; Arnouts, S.; Ilbert, O.; Salvato, M.; Scoville, N. Z.; Surace, J.; Yan, L.; Brusa, M.; Capak, P.; Caputi, K.; Carollo, C. M.; Civano, F.; Elvis, M.; Faure, C.; Hasinger, G.; Koekemoer, A. M.; Lee, N.; Lilly, S.; Liu, C. T.; McCracken, H. J.; Schinnerer, E.; Smolčić, V.; Taniguchi, Y.; Thompson, D. J.; Trump, J.

We present a large robust sample of 1503 reliable and unconfused 70 μm selected sources from the multiwavelength data set of the Cosmic Evolution Survey. Using the Spitzer IRAC and MIPS photometry, we estimate the total infrared (IR) luminosity, L IR (8-1000 μm), by finding the best-fit template
Mineral Composition of Selected Serbian Propolis Samples

Directory of Open Access Journals (Sweden)

Tosic Snezana

2017-06-01

Full Text Available The aim of this work was to determine the content of 22 macro- and microelements in ten raw Serbian propolis samples which differ in geographical and botanical origin as well as in polluted agent contents by atomic emission spectrometry with inductively coupled plasma (ICP-OES. The macroelements were more common and present Ca content was the highest while Na content the lowest. Among the studied essential trace elements Fe was the most common element. The levels of toxic elements (Pb, Cd, As and Hg were also analyzed, since they were possible environmental contaminants that could be transferred into propolis products for human consumption. As and Hg were not detected in any of the analyzed samples but a high level of Pb (2.0-9.7 mg/kg was detected and only selected portions of raw propolis could be used to produce natural medicines and dietary supplements for humans. Obtained results were statistically analyzed, and the examined samples showed a wide range of element content.
A Heckman Selection- t Model

KAUST Repository

Marchenko, Yulia V.

2012-03-01

Sample selection arises often in practice as a result of the partial observability of the outcome of interest in a study. In the presence of sample selection, the observed data do not represent a random sample from the population, even after controlling for explanatory variables. That is, data are missing not at random. Thus, standard analysis using only complete cases will lead to biased results. Heckman introduced a sample selection model to analyze such data and proposed a full maximum likelihood estimation method under the assumption of normality. The method was criticized in the literature because of its sensitivity to the normality assumption. In practice, data, such as income or expenditure data, often violate the normality assumption because of heavier tails. We first establish a new link between sample selection models and recently studied families of extended skew-elliptical distributions. Then, this allows us to introduce a selection-t (SLt) model, which models the error distribution using a Student\\'s t distribution. We study its properties and investigate the finite-sample performance of the maximum likelihood estimators for this model. We compare the performance of the SLt model to the conventional Heckman selection-normal (SLN) model and apply it to analyze ambulatory expenditures. Unlike the SLNmodel, our analysis using the SLt model provides statistical evidence for the existence of sample selection bias in these data. We also investigate the performance of the test for sample selection bias based on the SLt model and compare it with the performances of several tests used with the SLN model. Our findings indicate that the latter tests can be misleading in the presence of heavy-tailed data. © 2012 American Statistical Association.
Estimation and variable selection for generalized additive partial linear models

KAUST Repository

Wang, Li

2011-08-01

We study generalized additive partial linear models, proposing the use of polynomial spline smoothing for estimation of nonparametric functions, and deriving quasi-likelihood based estimators for the linear parameters. We establish asymptotic normality for the estimators of the parametric components. The procedure avoids solving large systems of equations as in kernel-based procedures and thus results in gains in computational simplicity. We further develop a class of variable selection procedures for the linear parameters by employing a nonconcave penalized quasi-likelihood, which is shown to have an asymptotic oracle property. Monte Carlo simulations and an empirical example are presented for illustration. © Institute of Mathematical Statistics, 2011.
Reliability estimation system: its application to the nuclear geophysical sampling of ore deposits

International Nuclear Information System (INIS)

Khaykovich, I.M.; Savosin, S.I.

1992-01-01

The reliability estimation system accepted in the Soviet Union for sampling data in nuclear geophysics is based on unique requirements in metrology and methodology. It involves estimating characteristic errors in calibration, as well as errors in measurement and interpretation. This paper describes the methods of estimating the levels of systematic and random errors at each stage of the problem. The data of nuclear geophysics sampling are considered to be reliable if there are no statistically significant, systematic differences between ore intervals determined by this method and by geological control, or by other methods of sampling; the reliability of the latter having been verified. The difference between the random errors is statistically insignificant. The system allows one to obtain information on the parameters of ore intervals with a guaranteed random error and without systematic errors. (Author)
Does self-selection affect samples' representativeness in online surveys? An investigation in online video game research.

Science.gov (United States)

Khazaal, Yasser; van Singer, Mathias; Chatton, Anne; Achab, Sophia; Zullino, Daniele; Rothen, Stephane; Khan, Riaz; Billieux, Joel; Thorens, Gabriel

2014-07-07

The number of medical studies performed through online surveys has increased dramatically in recent years. Despite their numerous advantages (eg, sample size, facilitated access to individuals presenting stigmatizing issues), selection bias may exist in online surveys. However, evidence on the representativeness of self-selected samples in online studies is patchy. Our objective was to explore the representativeness of a self-selected sample of online gamers using online players' virtual characters (avatars). All avatars belonged to individuals playing World of Warcraft (WoW), currently the most widely used online game. Avatars' characteristics were defined using various games' scores, reported on the WoW's official website, and two self-selected samples from previous studies were compared with a randomly selected sample of avatars. We used scores linked to 1240 avatars (762 from the self-selected samples and 478 from the random sample). The two self-selected samples of avatars had higher scores on most of the assessed variables (except for guild membership and exploration). Furthermore, some guilds were overrepresented in the self-selected samples. Our results suggest that more proficient players or players more involved in the game may be more likely to participate in online surveys. Caution is needed in the interpretation of studies based on online surveys that used a self-selection recruitment procedure. Epidemiological evidence on the reduced representativeness of sample of online surveys is warranted.
Estimating time to pregnancy from current durations in a cross-sectional sample

DEFF Research Database (Denmark)

Keiding, Niels; Kvist, Kajsa; Hartvig, Helle

2002-01-01

A new design for estimating the distribution of time to pregnancy is proposed and investigated. The design is based on recording current durations in a cross-sectional sample of women, leading to statistical problems similar to estimating renewal time distributions from backward recurrence times....
Out-of-pocket costs, primary care frequent attendance and sample selection: Estimates from a longitudinal cohort design.

Science.gov (United States)

Pymont, Carly; McNamee, Paul; Butterworth, Peter

2018-03-20

This paper examines the effect of out-of-pocket costs on subsequent frequent attendance in primary care using data from the Personality and Total Health (PATH) Through Life Project, a representative community cohort study from Canberra, Australia. The analysis sample comprised 1197 respondents with two or more GP consultations, and uses survey data linked to administrative health service use (Medicare) data which provides data on the number of consultations and out-of-pocket costs. Respondents identified in the highest decile of GP use in a year were defined as Frequent Attenders (FAs). Logistic regression models that did not account for potential selection effects showed that out-of-pocket costs incurred during respondents' prior two consultations were significantly associated with subsequent FA status. Respondents who incurred higher costs ($15-$35; or >$35) were less likely to become FAs than those who incurred no or low (attenders. Copyright © 2018. Published by Elsevier B.V.
Statistical Methods and Sampling Design for Estimating Step Trends in Surface-Water Quality

Science.gov (United States)

Hirsch, Robert M.

1988-01-01

This paper addresses two components of the problem of estimating the magnitude of step trends in surface water quality. The first is finding a robust estimator appropriate to the data characteristics expected in water-quality time series. The J. L. Hodges-E. L. Lehmann class of estimators is found to be robust in comparison to other nonparametric and moment-based estimators. A seasonal Hodges-Lehmann estimator is developed and shown to have desirable properties. Second, the effectiveness of various sampling strategies is examined using Monte Carlo simulation coupled with application of this estimator. The simulation is based on a large set of total phosphorus data from the Potomac River. To assure that the simulated records have realistic properties, the data are modeled in a multiplicative fashion incorporating flow, hysteresis, seasonal, and noise components. The results demonstrate the importance of balancing the length of the two sampling periods and balancing the number of data values between the two periods.
Is a 'convenience' sample useful for estimating immunization coverage in a small population?

Science.gov (United States)

Weir, Jean E; Jones, Carrie

2008-01-01

Rapid survey methodologies are widely used for assessing immunization coverage in developing countries, approximating true stratified random sampling. Non-random ('convenience') sampling is not considered appropriate for estimating immunization coverage rates but has the advantages of low cost and expediency. We assessed the validity of a convenience sample of children presenting to a travelling clinic by comparing the coverage rate in the convenience sample to the true coverage established by surveying each child in three villages in rural Papua New Guinea. The rate of DTF immunization coverage as estimated by the convenience sample was within 10% of the true coverage when the proportion of children in the sample was two-thirds or when only children over the age of one year were counted, but differed by 11% when the sample included only 53% of the children and when all eligible children were included. The convenience sample may be sufficiently accurate for reporting purposes and is useful for identifying areas of low coverage.
COPS model estimates of LLEA availability near selected reactor sites

International Nuclear Information System (INIS)

Berkbigler, K.P.

1979-11-01

The COPS computer model has been used to estimate local law enforcement agency (LLEA) officer availability in the neighborhood of selected nuclear reactor sites. The results of these analyses are presented both in graphic and tabular form in this report
A proposed selection index for feedlot profitability based on estimated breeding values.

Science.gov (United States)

van der Westhuizen, R R; van der Westhuizen, J

2009-04-22

It is generally accepted that feed intake and growth (gain) are the most important economic components when calculating profitability in a growth test or feedlot. We developed a single post-weaning growth (feedlot) index based on the economic values of different components. Variance components, heritabilities and genetic correlations for and between initial weight (IW), final weight (FW), feed intake (FI), and shoulder height (SHD) were estimated by multitrait restricted maximum likelihood procedures. The estimated breeding values (EBVs) and the economic values for IW, FW and FI were used in a selection index to estimate a post-weaning or feedlot profitability value. Heritabilities for IW, FW, FI, and SHD were 0.41, 0.40, 0.33, and 0.51, respectively. The highest genetic correlations were 0.78 (between IW and FW) and 0.70 (between FI and FW). EBVs were used in a selection index to calculate a single economical value for each animal. This economic value is an indication of the gross profitability value or the gross test value (GTV) of the animal in a post-weaning growth test. GTVs varied between -R192.17 and R231.38 with an average of R9.31 and a standard deviation of R39.96. The Pearson correlations between EBVs (for production and efficiency traits) and GTV ranged from -0.51 to 0.68. The lowest correlation (closest to zero) was 0.26 between the Kleiber ratio and GTV. Correlations of 0.68 and -0.51 were estimated between average daily gain and GTV and feed conversion ratio and GTV, respectively. These results showed that it is possible to select for GTV. The selection index can benefit feedlotting in selecting offspring of bulls with high GTVs to maximize profitability.
Evaluating the performance of species richness estimators: sensitivity to sample grain size

DEFF Research Database (Denmark)

Hortal, Joaquín; Borges, Paulo A. V.; Gaspar, Clara

2006-01-01

and several recent estimators [proposed by Rosenzweig et al. (Conservation Biology, 2003, 17, 864-874), and Ugland et al. (Journal of Animal Ecology, 2003, 72, 888-897)] performed poorly. 3. Estimations developed using the smaller grain sizes (pair of traps, traps, records and individuals) presented similar....... Data obtained with standardized sampling of 78 transects in natural forest remnants of five islands were aggregated in seven different grains (i.e. ways of defining a single sample): islands, natural areas, transects, pairs of traps, traps, database records and individuals to assess the effect of using...
An empirical comparison of isolate-based and sample-based definitions of antimicrobial resistance and their effect on estimates of prevalence.

Science.gov (United States)

Humphry, R W; Evans, J; Webster, C; Tongue, S C; Innocent, G T; Gunn, G J

2018-02-01

Antimicrobial resistance is primarily a problem in human medicine but there are unquantified links of transmission in both directions between animal and human populations. Quantitative assessment of the costs and benefits of reduced antimicrobial usage in livestock requires robust quantification of transmission of resistance between animals, the environment and the human population. This in turn requires appropriate measurement of resistance. To tackle this we selected two different methods for determining whether a sample is resistant - one based on screening a sample, the other on testing individual isolates. Our overall objective was to explore the differences arising from choice of measurement. A literature search demonstrated the widespread use of testing of individual isolates. The first aim of this study was to compare, quantitatively, sample level and isolate level screening. Cattle or sheep faecal samples (n=41) submitted for routine parasitology were tested for antimicrobial resistance in two ways: (1) "streak" direct culture onto plates containing the antimicrobial of interest; (2) determination of minimum inhibitory concentration (MIC) of 8-10 isolates per sample compared to published MIC thresholds. Two antibiotics (ampicillin and nalidixic acid) were tested. With ampicillin, direct culture resulted in more than double the number of resistant samples than the MIC method based on eight individual isolates. The second aim of this study was to demonstrate the utility of the observed relationship between these two measures of antimicrobial resistance to re-estimate the prevalence of antimicrobial resistance from a previous study, in which we had used "streak" cultures. Boot-strap methods were used to estimate the proportion of samples that would have tested resistant in the historic study, had we used the isolate-based MIC method instead. Our boot-strap results indicate that our estimates of prevalence of antimicrobial resistance would have been
Bayesian Estimation of Fish Disease Prevalence from Pooled Samples Incorporating Sensitivity and Specificity

Science.gov (United States)

Williams, Christopher J.; Moffitt, Christine M.

2003-03-01

An important emerging issue in fisheries biology is the health of free-ranging populations of fish, particularly with respect to the prevalence of certain pathogens. For many years, pathologists focused on captive populations and interest was in the presence or absence of certain pathogens, so it was economically attractive to test pooled samples of fish. Recently, investigators have begun to study individual fish prevalence from pooled samples. Estimation of disease prevalence from pooled samples is straightforward when assay sensitivity and specificity are perfect, but this assumption is unrealistic. Here we illustrate the use of a Bayesian approach for estimating disease prevalence from pooled samples when sensitivity and specificity are not perfect. We also focus on diagnostic plots to monitor the convergence of the Gibbs-sampling-based Bayesian analysis. The methods are illustrated with a sample data set.
Dynamic frame selection for in vivo ultrasound temperature estimation during radiofrequency ablation

International Nuclear Information System (INIS)

Daniels, Matthew J; Varghese, Tomy

2010-01-01

Minimally invasive therapies such as radiofrequency ablation have been developed to treat cancers of the liver, prostate and kidney without invasive surgery. Prior work has demonstrated that ultrasound echo shifts due to temperature changes can be utilized to track the temperature distribution in real time. In this paper, a motion compensation algorithm is evaluated to reduce the impact of cardiac and respiratory motion on ultrasound-based temperature tracking methods. The algorithm dynamically selects the next suitable frame given a start frame (selected during the exhale or expiration phase where extraneous motion is reduced), enabling optimization of the computational time in addition to reducing displacement noise artifacts incurred with the estimation of smaller frame-to-frame displacements at the full frame rate. A region of interest that does not undergo ablation is selected in the first frame and the algorithm searches through subsequent frames to find a similarly located region of interest in subsequent frames, with a high value of the mean normalized cross-correlation coefficient value. In conjunction with dynamic frame selection, two different two-dimensional displacement estimation algorithms namely a block matching and multilevel cross-correlation are compared. The multi-level cross-correlation method incorporates tracking of the lateral tissue expansion in addition to the axial deformation to improve the estimation performance. Our results demonstrate the ability of the proposed motion compensation using dynamic frame selection in conjunction with the two-dimensional multilevel cross-correlation to track the temperature distribution.
Porosity estimation by semi-supervised learning with sparsely available labeled samples

Science.gov (United States)

Lima, Luiz Alberto; Görnitz, Nico; Varella, Luiz Eduardo; Vellasco, Marley; Müller, Klaus-Robert; Nakajima, Shinichi

2017-09-01

This paper addresses the porosity estimation problem from seismic impedance volumes and porosity samples located in a small group of exploratory wells. Regression methods, trained on the impedance as inputs and the porosity as output labels, generally suffer from extremely expensive (and hence sparsely available) porosity samples. To optimally make use of the valuable porosity data, a semi-supervised machine learning method was proposed, Transductive Conditional Random Field Regression (TCRFR), showing good performance (Görnitz et al., 2017). TCRFR, however, still requires more labeled data than those usually available, which creates a gap when applying the method to the porosity estimation problem in realistic situations. In this paper, we aim to fill this gap by introducing two graph-based preprocessing techniques, which adapt the original TCRFR for extremely weakly supervised scenarios. Our new method outperforms the previous automatic estimation methods on synthetic data and provides a comparable result to the manual labored, time-consuming geostatistics approach on real data, proving its potential as a practical industrial tool.
Simulation of range imaging-based estimation of respiratory lung motion. Influence of noise, signal dimensionality and sampling patterns.

Science.gov (United States)

Wilms, M; Werner, R; Blendowski, M; Ortmüller, J; Handels, H

2014-01-01

A major problem associated with the irradiation of thoracic and abdominal tumors is respiratory motion. In clinical practice, motion compensation approaches are frequently steered by low-dimensional breathing signals (e.g., spirometry) and patient-specific correspondence models, which are used to estimate the sought internal motion given a signal measurement. Recently, the use of multidimensional signals derived from range images of the moving skin surface has been proposed to better account for complex motion patterns. In this work, a simulation study is carried out to investigate the motion estimation accuracy of such multidimensional signals and the influence of noise, the signal dimensionality, and different sampling patterns (points, lines, regions). A diffeomorphic correspondence modeling framework is employed to relate multidimensional breathing signals derived from simulated range images to internal motion patterns represented by diffeomorphic non-linear transformations. Furthermore, an automatic approach for the selection of optimal signal combinations/patterns within this framework is presented. This simulation study focuses on lung motion estimation and is based on 28 4D CT data sets. The results show that the use of multidimensional signals instead of one-dimensional signals significantly improves the motion estimation accuracy, which is, however, highly affected by noise. Only small differences exist between different multidimensional sampling patterns (lines and regions). Automatically determined optimal combinations of points and lines do not lead to accuracy improvements compared to results obtained by using all points or lines. Our results show the potential of multidimensional breathing signals derived from range images for the model-based estimation of respiratory motion in radiation therapy.
Comparison of chlorzoxazone one-sample methods to estimate CYP2E1 activity in humans

DEFF Research Database (Denmark)

Kramer, Iza; Dalhoff, Kim; Clemmesen, Jens O

2003-01-01

OBJECTIVE: Comparison of a one-sample with a multi-sample method (the metabolic fractional clearance) to estimate CYP2E1 activity in humans. METHODS: Healthy, male Caucasians ( n=19) were included. The multi-sample fractional clearance (Cl(fe)) of chlorzoxazone was compared with one...... estimates, Cl(est) at 3 h or 6 h, and MR at 3 h, can serve as reliable markers of CYP2E1 activity. The one-sample clearance method is an accurate, renal function-independent measure of the intrinsic activity; it is simple to use and easily applicable to humans.......-time-point clearance estimation (Cl(est)) at 3, 4, 5 and 6 h. Furthermore, the metabolite/drug ratios (MRs) estimated from one-time-point samples at 1, 2, 3, 4, 5 and 6 h were compared with Cl(fe). RESULTS: The concordance between Cl(est) and Cl(fe) was highest at 6 h. The minimal mean prediction error (MPE) of Cl...
Estimation of Uncertainty in Aerosol Concentration Measured by Aerosol Sampling System

Energy Technology Data Exchange (ETDEWEB)

Lee, Jong Chan; Song, Yong Jae; Jung, Woo Young; Lee, Hyun Chul; Kim, Gyu Tae; Lee, Doo Yong [FNC Technology Co., Yongin (Korea, Republic of)

2016-10-15

FNC Technology Co., Ltd has been developed test facilities for the aerosol generation, mixing, sampling and measurement under high pressure and high temperature conditions. The aerosol generation system is connected to the aerosol mixing system which injects SiO{sub 2}/ethanol mixture. In the sampling system, glass fiber membrane filter has been used to measure average mass concentration. Based on the experimental results using main carrier gas of steam and air mixture, the uncertainty estimation of the sampled aerosol concentration was performed by applying Gaussian error propagation law. FNC Technology Co., Ltd. has been developed the experimental facilities for the aerosol measurement under high pressure and high temperature. The purpose of the tests is to develop commercial test module for aerosol generation, mixing and sampling system applicable to environmental industry and safety related system in nuclear power plant. For the uncertainty calculation of aerosol concentration, the value of the sampled aerosol concentration is not measured directly, but must be calculated from other quantities. The uncertainty of the sampled aerosol concentration is a function of flow rates of air and steam, sampled mass, sampling time, condensed steam mass and its absolute errors. These variables propagate to the combination of variables in the function. Using operating parameters and its single errors from the aerosol test cases performed at FNC, the uncertainty of aerosol concentration evaluated by Gaussian error propagation law is less than 1%. The results of uncertainty estimation in the aerosol sampling system will be utilized for the system performance data.

Towards the production of reliable quantitative microbiological data for risk assessment: Direct quantification of Campylobacter in naturally infected chicken fecal samples using selective culture and real-time PCR

DEFF Research Database (Denmark)

Garcia Clavero, Ana Belén; Vigre, Håkan; Josefsen, Mathilde Hasseldam

2015-01-01

of Campylobacter by real-time PCR was performed using standard curves designed for two different DNA extraction methods: Easy-DNA™ Kit from Invitrogen (Easy-DNA) and NucliSENS® MiniMAG® from bioMérieux (MiniMAG). Results indicated that the estimation of the numbers of Campylobacter present in chicken fecal samples...... and for the evaluation of control strategies implemented in poultry production. The aim of this study was to compare estimates of the numbers of Campylobacter spp. in naturally infected chicken fecal samples obtained using direct quantification by selective culture and by real-time PCR. Absolute quantification....... Although there were differences in terms of estimates of Campylobacter numbers between the methods and samples, the differences between culture and real-time PCR were not statistically significant for most of the samples used in this study....
Sales Tax Compliance and Audit Selection

OpenAIRE

Murray, Matthew N.

1995-01-01

Uses sample selection estimation techniques to identify systematic audit selection rules and determinants of sales tax underreporting. Though based on data from only one state (Tennessee), outcomes are useful in developing and evaluating audit selection results.
A covariance correction that accounts for correlation estimation to improve finite-sample inference with generalized estimating equations: A study on its applicability with structured correlation matrices.

Science.gov (United States)

Westgate, Philip M

2016-01-01

When generalized estimating equations (GEE) incorporate an unstructured working correlation matrix, the variances of regression parameter estimates can inflate due to the estimation of the correlation parameters. In previous work, an approximation for this inflation that results in a corrected version of the sandwich formula for the covariance matrix of regression parameter estimates was derived. Use of this correction for correlation structure selection also reduces the over-selection of the unstructured working correlation matrix. In this manuscript, we conduct a simulation study to demonstrate that an increase in variances of regression parameter estimates can occur when GEE incorporates structured working correlation matrices as well. Correspondingly, we show the ability of the corrected version of the sandwich formula to improve the validity of inference and correlation structure selection. We also study the relative influences of two popular corrections to a different source of bias in the empirical sandwich covariance estimator.
Model to Estimate Monthly Time Horizons for Application of DEA in Selection of Stock Portfolio and for Maintenance of the Selected Portfolio

Directory of Open Access Journals (Sweden)

José Claudio Isaias

2015-01-01

Full Text Available In the selecting of stock portfolios, one type of analysis that has shown good results is Data Envelopment Analysis (DEA. It, however, has been shown to have gaps regarding its estimates of monthly time horizons of data collection for the selection of stock portfolios and of monthly time horizons for the maintenance of a selected portfolio. To better estimate these horizons, this study proposes a model of mathematical programming binary of minimization of square errors. This model is the paper’s main contribution. The model’s results are validated by simulating the estimated annual return indexes of a portfolio that uses both horizons estimated and of other portfolios that do not use these horizons. The simulation shows that portfolios with both horizons estimated have higher indexes, on average 6.99% per year. The hypothesis tests confirm the statistically significant superiority of the results of the proposed mathematical model’s indexes. The model’s indexes are also compared with portfolios that use just one of the horizons estimated; here the indexes of the dual-horizon portfolios outperform the single-horizon portfolios, though with a decrease in percentage of statistically significant superiority.
A novel approach to non-biased systematic random sampling: a stereologic estimate of Purkinje cells in the human cerebellum.

Science.gov (United States)

Agashiwala, Rajiv M; Louis, Elan D; Hof, Patrick R; Perl, Daniel P

2008-10-21

Non-biased systematic sampling using the principles of stereology provides accurate quantitative estimates of objects within neuroanatomic structures. However, the basic principles of stereology are not optimally suited for counting objects that selectively exist within a limited but complex and convoluted portion of the sample, such as occurs when counting cerebellar Purkinje cells. In an effort to quantify Purkinje cells in association with certain neurodegenerative disorders, we developed a new method for stereologic sampling of the cerebellar cortex, involving calculating the volume of the cerebellar tissues, identifying and isolating the Purkinje cell layer and using this information to extrapolate non-biased systematic sampling data to estimate the total number of Purkinje cells in the tissues. Using this approach, we counted Purkinje cells in the right cerebella of four human male control specimens, aged 41, 67, 70 and 84 years, and estimated the total Purkinje cell number for the four entire cerebella to be 27.03, 19.74, 20.44 and 22.03 million cells, respectively. The precision of the method is seen when comparing the density of the cells within the tissue: 266,274, 173,166, 167,603 and 183,575 cells/cm3, respectively. Prior literature documents Purkinje cell counts ranging from 14.8 to 30.5 million cells. These data demonstrate the accuracy of our approach. Our novel approach, which offers an improvement over previous methodologies, is of value for quantitative work of this nature. This approach could be applied to morphometric studies of other similarly complex tissues as well.
inverse gaussian model for small area estimation via gibbs sampling

African Journals Online (AJOL)

ADMIN

For example, MacGibbon and Tomberlin. (1989) have considered estimating small area rates and binomial parameters using empirical Bayes methods. Stroud (1991) used hierarchical Bayes approach for univariate natural exponential families with quadratic variance functions in sample survey applications, while Chaubey ...
Limited sampling hampers "big data" estimation of species richness in a tropical biodiversity hotspot.

Science.gov (United States)

Engemann, Kristine; Enquist, Brian J; Sandel, Brody; Boyle, Brad; Jørgensen, Peter M; Morueta-Holme, Naia; Peet, Robert K; Violle, Cyrille; Svenning, Jens-Christian

2015-02-01

Macro-scale species richness studies often use museum specimens as their main source of information. However, such datasets are often strongly biased due to variation in sampling effort in space and time. These biases may strongly affect diversity estimates and may, thereby, obstruct solid inference on the underlying diversity drivers, as well as mislead conservation prioritization. In recent years, this has resulted in an increased focus on developing methods to correct for sampling bias. In this study, we use sample-size-correcting methods to examine patterns of tropical plant diversity in Ecuador, one of the most species-rich and climatically heterogeneous biodiversity hotspots. Species richness estimates were calculated based on 205,735 georeferenced specimens of 15,788 species using the Margalef diversity index, the Chao estimator, the second-order Jackknife and Bootstrapping resampling methods, and Hill numbers and rarefaction. Species richness was heavily correlated with sampling effort, and only rarefaction was able to remove this effect, and we recommend this method for estimation of species richness with "big data" collections.
Estimating Loan-to-value Distributions

DEFF Research Database (Denmark)

Korteweg, Arthur; Sørensen, Morten

2016-01-01

We estimate a model of house prices, combined loan-to-value ratios (CLTVs) and trade and foreclosure behavior. House prices are only observed for traded properties and trades are endogenous, creating sample-selection problems for existing approaches to estimating CLTVs. We use a Bayesian filtering...
Sample size methodology

CERN Document Server

Desu, M M

2012-01-01

One of the most important problems in designing an experiment or a survey is sample size determination and this book presents the currently available methodology. It includes both random sampling from standard probability distributions and from finite populations. Also discussed is sample size determination for estimating parameters in a Bayesian setting by considering the posterior distribution of the parameter and specifying the necessary requirements. The determination of the sample size is considered for ranking and selection problems as well as for the design of clinical trials. Appropria
Beamforming using subspace estimation from a diagonally averaged sample covariance.

Science.gov (United States)

Quijano, Jorge E; Zurk, Lisa M

2017-08-01

The potential benefit of a large-aperture sonar array for high resolution target localization is often challenged by the lack of sufficient data required for adaptive beamforming. This paper introduces a Toeplitz-constrained estimator of the clairvoyant signal covariance matrix corresponding to multiple far-field targets embedded in background isotropic noise. The estimator is obtained by averaging along subdiagonals of the sample covariance matrix, followed by covariance extrapolation using the method of maximum entropy. The sample covariance is computed from limited data snapshots, a situation commonly encountered with large-aperture arrays in environments characterized by short periods of local stationarity. Eigenvectors computed from the Toeplitz-constrained covariance are used to construct signal-subspace projector matrices, which are shown to reduce background noise and improve detection of closely spaced targets when applied to subspace beamforming. Monte Carlo simulations corresponding to increasing array aperture suggest convergence of the proposed projector to the clairvoyant signal projector, thereby outperforming the classic projector obtained from the sample eigenvectors. Beamforming performance of the proposed method is analyzed using simulated data, as well as experimental data from the Shallow Water Array Performance experiment.
The effects of parameter estimation on minimizing the in-control average sample size for the double sampling X bar chart

Directory of Open Access Journals (Sweden)

Michael B.C. Khoo

2013-11-01

Full Text Available The double sampling (DS X bar chart, one of the most widely-used charting methods, is superior for detecting small and moderate shifts in the process mean. In a right skewed run length distribution, the median run length (MRL provides a more credible representation of the central tendency than the average run length (ARL, as the mean is greater than the median. In this paper, therefore, MRL is used as the performance criterion instead of the traditional ARL. Generally, the performance of the DS X bar chart is investigated under the assumption of known process parameters. In practice, these parameters are usually estimated from an in-control reference Phase-I dataset. Since the performance of the DS X bar chart is significantly affected by estimation errors, we study the effects of parameter estimation on the MRL-based DS X bar chart when the in-control average sample size is minimised. This study reveals that more than 80 samples are required for the MRL-based DS X bar chart with estimated parameters to perform more favourably than the corresponding chart with known parameters.
Effects of sampling conditions on DNA-based estimates of American black bear abundance

Science.gov (United States)

Laufenberg, Jared S.; Van Manen, Frank T.; Clark, Joseph D.

2013-01-01

DNA-based capture-mark-recapture techniques are commonly used to estimate American black bear (Ursus americanus) population abundance (N). Although the technique is well established, many questions remain regarding study design. In particular, relationships among N, capture probability of heterogeneity mixtures A and B (pA and pB, respectively, or p, collectively), the proportion of each mixture (π), number of capture occasions (k), and probability of obtaining reliable estimates of N are not fully understood. We investigated these relationships using 1) an empirical dataset of DNA samples for which true N was unknown and 2) simulated datasets with known properties that represented a broader array of sampling conditions. For the empirical data analysis, we used the full closed population with heterogeneity data type in Program MARK to estimate N for a black bear population in Great Smoky Mountains National Park, Tennessee. We systematically reduced the number of those samples used in the analysis to evaluate the effect that changes in capture probabilities may have on parameter estimates. Model-averaged N for females and males were 161 (95% CI = 114–272) and 100 (95% CI = 74–167), respectively (pooled N = 261, 95% CI = 192–419), and the average weekly p was 0.09 for females and 0.12 for males. When we reduced the number of samples of the empirical data, support for heterogeneity models decreased. For the simulation analysis, we generated capture data with individual heterogeneity covering a range of sampling conditions commonly encountered in DNA-based capture-mark-recapture studies and examined the relationships between those conditions and accuracy (i.e., probability of obtaining an estimated N that is within 20% of true N), coverage (i.e., probability that 95% confidence interval includes true N), and precision (i.e., probability of obtaining a coefficient of variation ≤20%) of estimates using logistic regression. The capture probability
40 CFR 761.247 - Sample site selection for pipe segment removal.

Science.gov (United States)

2010-07-01

... end of the pipe segment. (3) If the pipe segment is cut with a saw or other mechanical device, take..., take samples from a total of seven segments. (A) Sample the first and last segments removed. (B) Select... total length for purposes of disposal, take samples of each segment that is 1/2 mile distant from the...
Estimation of Finite Population Mean in Multivariate Stratified Sampling under Cost Function Using Goal Programming

Directory of Open Access Journals (Sweden)

Atta Ullah

2014-01-01

Full Text Available In practical utilization of stratified random sampling scheme, the investigator meets a problem to select a sample that maximizes the precision of a finite population mean under cost constraint. An allocation of sample size becomes complicated when more than one characteristic is observed from each selected unit in a sample. In many real life situations, a linear cost function of a sample size nh is not a good approximation to actual cost of sample survey when traveling cost between selected units in a stratum is significant. In this paper, sample allocation problem in multivariate stratified random sampling with proposed cost function is formulated in integer nonlinear multiobjective mathematical programming. A solution procedure is proposed using extended lexicographic goal programming approach. A numerical example is presented to illustrate the computational details and to compare the efficiency of proposed compromise allocation.
Selecting the best stable isotope mixing model to estimate grizzly bear diets in the Greater Yellowstone Ecosystem.

Science.gov (United States)

Hopkins, John B; Ferguson, Jake M; Tyers, Daniel B; Kurle, Carolyn M

2017-01-01

Past research indicates that whitebark pine seeds are a critical food source for Threatened grizzly bears (Ursus arctos) in the Greater Yellowstone Ecosystem (GYE). In recent decades, whitebark pine forests have declined markedly due to pine beetle infestation, invasive blister rust, and landscape-level fires. To date, no study has reliably estimated the contribution of whitebark pine seeds to the diets of grizzlies through time. We used stable isotope ratios (expressed as δ13C, δ15N, and δ34S values) measured in grizzly bear hair and their major food sources to estimate the diets of grizzlies sampled in Cooke City Basin, Montana. We found that stable isotope mixing models that included different combinations of stable isotope values for bears and their foods generated similar proportional dietary contributions. Estimates generated by our top model suggest that whitebark pine seeds (35±10%) and other plant foods (56±10%) were more important than meat (9±8%) to grizzly bears sampled in the study area. Stable isotope values measured in bear hair collected elsewhere in the GYE and North America support our conclusions about plant-based foraging. We recommend that researchers consider model selection when estimating the diets of animals using stable isotope mixing models. We also urge researchers to use the new statistical framework described here to estimate the dietary responses of grizzlies to declines in whitebark pine seeds and other important food sources through time in the GYE (e.g., cutthroat trout), as such information could be useful in predicting how the population will adapt to future environmental change.
A random sampling approach for robust estimation of tissue-to-plasma ratio from extremely sparse data.

Science.gov (United States)

Chu, Hui-May; Ette, Ene I

2005-09-02

his study was performed to develop a new nonparametric approach for the estimation of robust tissue-to-plasma ratio from extremely sparsely sampled paired data (ie, one sample each from plasma and tissue per subject). Tissue-to-plasma ratio was estimated from paired/unpaired experimental data using independent time points approach, area under the curve (AUC) values calculated with the naïve data averaging approach, and AUC values calculated using sampling based approaches (eg, the pseudoprofile-based bootstrap [PpbB] approach and the random sampling approach [our proposed approach]). The random sampling approach involves the use of a 2-phase algorithm. The convergence of the sampling/resampling approaches was investigated, as well as the robustness of the estimates produced by different approaches. To evaluate the latter, new data sets were generated by introducing outlier(s) into the real data set. One to 2 concentration values were inflated by 10% to 40% from their original values to produce the outliers. Tissue-to-plasma ratios computed using the independent time points approach varied between 0 and 50 across time points. The ratio obtained from AUC values acquired using the naive data averaging approach was not associated with any measure of uncertainty or variability. Calculating the ratio without regard to pairing yielded poorer estimates. The random sampling and pseudoprofile-based bootstrap approaches yielded tissue-to-plasma ratios with uncertainty and variability. However, the random sampling approach, because of the 2-phase nature of its algorithm, yielded more robust estimates and required fewer replications. Therefore, a 2-phase random sampling approach is proposed for the robust estimation of tissue-to-plasma ratio from extremely sparsely sampled data.
Estimating fish swimming metrics and metabolic rates with accelerometers: the influence of sampling frequency.

Science.gov (United States)

Brownscombe, J W; Lennox, R J; Danylchuk, A J; Cooke, S J

2018-06-21

Accelerometry is growing in popularity for remotely measuring fish swimming metrics, but appropriate sampling frequencies for accurately measuring these metrics are not well studied. This research examined the influence of sampling frequency (1-25 Hz) with tri-axial accelerometer biologgers on estimates of overall dynamic body acceleration (ODBA), tail-beat frequency, swimming speed and metabolic rate of bonefish Albula vulpes in a swim-tunnel respirometer and free-swimming in a wetland mesocosm. In the swim tunnel, sampling frequencies of ≥ 5 Hz were sufficient to establish strong relationships between ODBA, swimming speed and metabolic rate. However, in free-swimming bonefish, estimates of metabolic rate were more variable below 10 Hz. Sampling frequencies should be at least twice the maximum tail-beat frequency to estimate this metric effectively, which is generally higher than those required to estimate ODBA, swimming speed and metabolic rate. While optimal sampling frequency probably varies among species due to tail-beat frequency and swimming style, this study provides a reference point with a medium body-sized sub-carangiform teleost fish, enabling researchers to measure these metrics effectively and maximize study duration. This article is protected by copyright. All rights reserved. This article is protected by copyright. All rights reserved.
Estimation of plant sampling uncertainty: an example based on chemical analysis of moss samples.

Science.gov (United States)

Dołęgowska, Sabina

2016-11-01

In order to estimate the level of uncertainty arising from sampling, 54 samples (primary and duplicate) of the moss species Pleurozium schreberi (Brid.) Mitt. were collected within three forested areas (Wierna Rzeka, Piaski, Posłowice Range) in the Holy Cross Mountains (south-central Poland). During the fieldwork, each primary sample composed of 8 to 10 increments (subsamples) was taken over an area of 10 m 2 whereas duplicate samples were collected in the same way at a distance of 1-2 m. Subsequently, all samples were triple rinsed with deionized water, dried, milled, and digested (8 mL HNO 3 (1:1) + 1 mL 30 % H 2 O 2 ) in a closed microwave system Multiwave 3000. The prepared solutions were analyzed twice for Cu, Fe, Mn, and Zn using FAAS and GFAAS techniques. All datasets were checked for normality and for normally distributed elements (Cu from Piaski, Zn from Posłowice, Fe, Zn from Wierna Rzeka). The sampling uncertainty was computed with (i) classical ANOVA, (ii) classical RANOVA, (iii) modified RANOVA, and (iv) range statistics. For the remaining elements, the sampling uncertainty was calculated with traditional and/or modified RANOVA (if the amount of outliers did not exceed 10 %) or classical ANOVA after Box-Cox transformation (if the amount of outliers exceeded 10 %). The highest concentrations of all elements were found in moss samples from Piaski, whereas the sampling uncertainty calculated with different statistical methods ranged from 4.1 to 22 %.
Critical point relascope sampling for unbiased volume estimation of downed coarse woody debris

Science.gov (United States)

Jeffrey H. Gove; Michael S. Williams; Mark J. Ducey; Mark J. Ducey

2005-01-01

Critical point relascope sampling is developed and shown to be design-unbiased for the estimation of log volume when used with point relascope sampling for downed coarse woody debris. The method is closely related to critical height sampling for standing trees when trees are first sampled with a wedge prism. Three alternative protocols for determining the critical...
Critical length sampling: a method to estimate the volume of downed coarse woody debris

Science.gov (United States)

G& #246; ran St& #229; hl; Jeffrey H. Gove; Michael S. Williams; Mark J. Ducey

2010-01-01

In this paper, critical length sampling for estimating the volume of downed coarse woody debris is presented. Using this method, the volume of downed wood in a stand can be estimated by summing the critical lengths of down logs included in a sample obtained using a relascope or wedge prism; typically, the instrument should be tilted 90° from its usual...

Impact of sampling strategy on stream load estimates in till landscape of the Midwest

Science.gov (United States)

Vidon, P.; Hubbard, L.E.; Soyeux, E.

2009-01-01

Accurately estimating various solute loads in streams during storms is critical to accurately determine maximum daily loads for regulatory purposes. This study investigates the impact of sampling strategy on solute load estimates in streams in the US Midwest. Three different solute types (nitrate, magnesium, and dissolved organic carbon (DOC)) and three sampling strategies are assessed. Regardless of the method, the average error on nitrate loads is higher than for magnesium or DOC loads, and all three methods generally underestimate DOC loads and overestimate magnesium loads. Increasing sampling frequency only slightly improves the accuracy of solute load estimates but generally improves the precision of load calculations. This type of investigation is critical for water management and environmental assessment so error on solute load calculations can be taken into account by landscape managers, and sampling strategies optimized as a function of monitoring objectives. ?? 2008 Springer Science+Business Media B.V.
Comparison of prevalence estimation of Mycobacterium avium subsp. paratuberculosis infection by sampling slaughtered cattle with macroscopic lesions vs. systematic sampling.

Science.gov (United States)

Elze, J; Liebler-Tenorio, E; Ziller, M; Köhler, H

2013-07-01

The objective of this study was to identify the most reliable approach for prevalence estimation of Mycobacterium avium ssp. paratuberculosis (MAP) infection in clinically healthy slaughtered cattle. Sampling of macroscopically suspect tissue was compared to systematic sampling. Specimens of ileum, jejunum, mesenteric and caecal lymph nodes were examined for MAP infection using bacterial microscopy, culture, histopathology and immunohistochemistry. MAP was found most frequently in caecal lymph nodes, but sampling more tissues optimized the detection rate. Examination by culture was most efficient while combination with histopathology increased the detection rate slightly. MAP was detected in 49/50 animals with macroscopic lesions representing 1.35% of the slaughtered cattle examined. Of 150 systematically sampled macroscopically non-suspect cows, 28.7% were infected with MAP. This indicates that the majority of MAP-positive cattle are slaughtered without evidence of macroscopic lesions and before clinical signs occur. For reliable prevalence estimation of MAP infection in slaughtered cattle, systematic random sampling is essential.
Estimation of Branch Topology Errors in Power Networks by WLAN State Estimation

Energy Technology Data Exchange (ETDEWEB)

Kim, Hong Rae [Soonchunhyang University(Korea); Song, Kyung Bin [Kei Myoung University(Korea)

2000-06-01

The purpose of this paper is to detect and identify topological errors in order to maintain a reliable database for the state estimator. In this paper, a two stage estimation procedure is used to identify the topology errors. At the first stage, the WLAV state estimator which has characteristics to remove bad data during the estimation procedure is run for finding out the suspected branches at which topology errors take place. The resulting residuals are normalized and the measurements with significant normalized residuals are selected. A set of suspected branches is formed based on these selected measurements; if the selected measurement if a line flow, the corresponding branch is suspected; if it is an injection, then all the branches connecting the injection bus to its immediate neighbors are suspected. A new WLAV state estimator adding the branch flow errors in the state vector is developed to identify the branch topology errors. Sample cases of single topology error and topology error with a measurement error are applied to IEEE 14 bus test system. (author). 24 refs., 1 fig., 9 tabs.
Quantum tomography via compressed sensing: error bounds, sample complexity and efficient estimators

International Nuclear Information System (INIS)

Flammia, Steven T; Gross, David; Liu, Yi-Kai; Eisert, Jens

2012-01-01

Intuitively, if a density operator has small rank, then it should be easier to estimate from experimental data, since in this case only a few eigenvectors need to be learned. We prove two complementary results that confirm this intuition. Firstly, we show that a low-rank density matrix can be estimated using fewer copies of the state, i.e. the sample complexity of tomography decreases with the rank. Secondly, we show that unknown low-rank states can be reconstructed from an incomplete set of measurements, using techniques from compressed sensing and matrix completion. These techniques use simple Pauli measurements, and their output can be certified without making any assumptions about the unknown state. In this paper, we present a new theoretical analysis of compressed tomography, based on the restricted isometry property for low-rank matrices. Using these tools, we obtain near-optimal error bounds for the realistic situation where the data contain noise due to finite statistics, and the density matrix is full-rank with decaying eigenvalues. We also obtain upper bounds on the sample complexity of compressed tomography, and almost-matching lower bounds on the sample complexity of any procedure using adaptive sequences of Pauli measurements. Using numerical simulations, we compare the performance of two compressed sensing estimators—the matrix Dantzig selector and the matrix Lasso—with standard maximum-likelihood estimation (MLE). We find that, given comparable experimental resources, the compressed sensing estimators consistently produce higher fidelity state reconstructions than MLE. In addition, the use of an incomplete set of measurements leads to faster classical processing with no loss of accuracy. Finally, we show how to certify the accuracy of a low-rank estimate using direct fidelity estimation, and describe a method for compressed quantum process tomography that works for processes with small Kraus rank and requires only Pauli eigenstate preparations
Application of a stratified random sampling technique to the estimation and minimization of respirable quartz exposure to underground miners

International Nuclear Information System (INIS)

Makepeace, C.E.; Horvath, F.J.; Stocker, H.

1981-11-01

The aim of a stratified random sampling plan is to provide the best estimate (in the absence of full-shift personal gravimetric sampling) of personal exposure to respirable quartz among underground miners. One also gains information of the exposure distribution of all the miners at the same time. Three variables (or strata) are considered in the present scheme: locations, occupations and times of sampling. Random sampling within each stratum ensures that each location, occupation and time of sampling has equal opportunity of being selected without bias. Following implementation of the plan and analysis of collected data, one can determine the individual exposures and the mean. This information can then be used to identify those groups whose exposure contributes significantly to the collective exposure. In turn, this identification, along with other considerations, allows the mine operator to carry out a cost-benefit optimization and eventual implementation of engineering controls for these groups. This optimization and engineering control procedure, together with the random sampling plan, can then be used in an iterative manner to minimize the mean value of the distribution and collective exposures
42 CFR 431.814 - Sampling plan and procedures.

Science.gov (United States)

2010-10-01

... reliability of the reduced sample. (4) The sample selection procedure. Systematic random sampling is... sampling, and yield estimates with the same or better precision than achieved in systematic random sampling... 42 Public Health 4 2010-10-01 2010-10-01 false Sampling plan and procedures. 431.814 Section 431...
Measurement of radioactivity in the environment - Soil - Part 2: Guidance for the selection of the sampling strategy, sampling and pre-treatment of samples

International Nuclear Information System (INIS)

2007-01-01

This part of ISO 18589 specifies the general requirements, based on ISO 11074 and ISO/IEC 17025, for all steps in the planning (desk study and area reconnaissance) of the sampling and the preparation of samples for testing. It includes the selection of the sampling strategy, the outline of the sampling plan, the presentation of general sampling methods and equipment, as well as the methodology of the pre-treatment of samples adapted to the measurements of the activity of radionuclides in soil. This part of ISO 18589 is addressed to the people responsible for determining the radioactivity present in soil for the purpose of radiation protection. It is applicable to soil from gardens, farmland, urban or industrial sites, as well as soil not affected by human activities. This part of ISO 18589 is applicable to all laboratories regardless of the number of personnel or the range of the testing performed. When a laboratory does not undertake one or more of the activities covered by this part of ISO 18589, such as planning, sampling or testing, the corresponding requirements do not apply. Information is provided on scope, normative references, terms and definitions and symbols, principle, sampling strategy, sampling plan, sampling process, pre-treatment of samples and recorded information. Five annexes inform about selection of the sampling strategy according to the objectives and the radiological characterization of the site and sampling areas, diagram of the evolution of the sample characteristics from the sampling site to the laboratory, example of sampling plan for a site divided in three sampling areas, example of a sampling record for a single/composite sample and example for a sample record for a soil profile with soil description. A bibliography is provided
The use of importance sampling in a trial assessment to obtain converged estimates of radiological risk

International Nuclear Information System (INIS)

Johnson, K.; Lucas, R.

1986-12-01

In developing a methodology for assessing potential sites for the disposal of radioactive wastes, the Department of the Environment has conducted a series of trial assessment exercises. In order to produce converged estimates of radiological risk using the SYVAC A/C simulation system an efficient sampling procedure is required. Previous work has demonstrated that importance sampling can substantially increase sampling efficiency. This study used importance sampling to produce converged estimates of risk for the first DoE trial assessment. Four major nuclide chains were analysed. In each case importance sampling produced converged risk estimates with between 10 and 170 times fewer runs of the SYVAC A/C model. This increase in sampling efficiency can reduce the total elapsed time required to obtain a converged estimate of risk from one nuclide chain by a factor of 20. The results of this study suggests that the use of importance sampling could reduce the elapsed time required to perform a risk assessment of a potential site by a factor of ten. (author)
An efficient modularized sample-based method to estimate the first-order Sobol' index

International Nuclear Information System (INIS)

Li, Chenzhao; Mahadevan, Sankaran

2016-01-01

Sobol' index is a prominent methodology in global sensitivity analysis. This paper aims to directly estimate the Sobol' index based only on available input–output samples, even if the underlying model is unavailable. For this purpose, a new method to calculate the first-order Sobol' index is proposed. The innovation is that the conditional variance and mean in the formula of the first-order index are calculated at an unknown but existing location of model inputs, instead of an explicit user-defined location. The proposed method is modularized in two aspects: 1) index calculations for different model inputs are separate and use the same set of samples; and 2) model input sampling, model evaluation, and index calculation are separate. Due to this modularization, the proposed method is capable to compute the first-order index if only input–output samples are available but the underlying model is unavailable, and its computational cost is not proportional to the dimension of the model inputs. In addition, the proposed method can also estimate the first-order index with correlated model inputs. Considering that the first-order index is a desired metric to rank model inputs but current methods can only handle independent model inputs, the proposed method contributes to fill this gap. - Highlights: • An efficient method to estimate the first-order Sobol' index. • Estimate the index from input–output samples directly. • Computational cost is not proportional to the number of model inputs. • Handle both uncorrelated and correlated model inputs.
Optimal time points sampling in pathway modelling.

Science.gov (United States)

Hu, Shiyan

2004-01-01

Modelling cellular dynamics based on experimental data is at the heart of system biology. Considerable progress has been made to dynamic pathway modelling as well as the related parameter estimation. However, few of them gives consideration for the issue of optimal sampling time selection for parameter estimation. Time course experiments in molecular biology rarely produce large and accurate data sets and the experiments involved are usually time consuming and expensive. Therefore, to approximate parameters for models with only few available sampling data is of significant practical value. For signal transduction, the sampling intervals are usually not evenly distributed and are based on heuristics. In the paper, we investigate an approach to guide the process of selecting time points in an optimal way to minimize the variance of parameter estimates. In the method, we first formulate the problem to a nonlinear constrained optimization problem by maximum likelihood estimation. We then modify and apply a quantum-inspired evolutionary algorithm, which combines the advantages of both quantum computing and evolutionary computing, to solve the optimization problem. The new algorithm does not suffer from the morass of selecting good initial values and being stuck into local optimum as usually accompanied with the conventional numerical optimization techniques. The simulation results indicate the soundness of the new method.
Consistency in Estimation and Model Selection of Dynamic Panel Data Models with Fixed Effects

Directory of Open Access Journals (Sweden)

Guangjie Li

2015-07-01

Full Text Available We examine the relationship between consistent parameter estimation and model selection for autoregressive panel data models with fixed effects. We find that the transformation of fixed effects proposed by Lancaster (2002 does not necessarily lead to consistent estimation of common parameters when some true exogenous regressors are excluded. We propose a data dependent way to specify the prior of the autoregressive coefficient and argue for comparing different model specifications before parameter estimation. Model selection properties of Bayes factors and Bayesian information criterion (BIC are investigated. When model uncertainty is substantial, we recommend the use of Bayesian Model Averaging to obtain point estimators with lower root mean squared errors (RMSE. We also study the implications of different levels of inclusion probabilities by simulations.
A new unbiased stochastic derivative estimator for discontinuous sample performances with structural parameters

NARCIS (Netherlands)

Peng, Yijie; Fu, Michael C.; Hu, Jian Qiang; Heidergott, Bernd

In this paper, we propose a new unbiased stochastic derivative estimator in a framework that can handle discontinuous sample performances with structural parameters. This work extends the three most popular unbiased stochastic derivative estimators: (1) infinitesimal perturbation analysis (IPA), (2)
Estimating species – area relationships by modeling abundance and frequency subject to incomplete sampling

Science.gov (United States)

Yamaura, Yuichi; Connor, Edward F.; Royle, Andy; Itoh, Katsuo; Sato, Kiyoshi; Taki, Hisatomo; Mishima, Yoshio

2016-01-01

Models and data used to describe species–area relationships confound sampling with ecological process as they fail to acknowledge that estimates of species richness arise due to sampling. This compromises our ability to make ecological inferences from and about species–area relationships. We develop and illustrate hierarchical community models of abundance and frequency to estimate species richness. The models we propose separate sampling from ecological processes by explicitly accounting for the fact that sampled patches are seldom completely covered by sampling plots and that individuals present in the sampling plots are imperfectly detected. We propose a multispecies abundance model in which community assembly is treated as the summation of an ensemble of species-level Poisson processes and estimate patch-level species richness as a derived parameter. We use sampling process models appropriate for specific survey methods. We propose a multispecies frequency model that treats the number of plots in which a species occurs as a binomial process. We illustrate these models using data collected in surveys of early-successional bird species and plants in young forest plantation patches. Results indicate that only mature forest plant species deviated from the constant density hypothesis, but the null model suggested that the deviations were too small to alter the form of species–area relationships. Nevertheless, results from simulations clearly show that the aggregate pattern of individual species density–area relationships and occurrence probability–area relationships can alter the form of species–area relationships. The plant community model estimated that only half of the species present in the regional species pool were encountered during the survey. The modeling framework we propose explicitly accounts for sampling processes so that ecological processes can be examined free of sampling artefacts. Our modeling approach is extensible and could be applied
Optimal heavy tail estimation – Part 1: Order selection

Directory of Open Access Journals (Sweden)

M. Mudelsee

2017-12-01

Full Text Available The tail probability, P, of the distribution of a variable is important for risk analysis of extremes. Many variables in complex geophysical systems show heavy tails, where P decreases with the value, x, of a variable as a power law with a characteristic exponent, α. Accurate estimation of α on the basis of data is currently hindered by the problem of the selection of the order, that is, the number of largest x values to utilize for the estimation. This paper presents a new, widely applicable, data-adaptive order selector, which is based on computer simulations and brute force search. It is the first in a set of papers on optimal heavy tail estimation. The new selector outperforms competitors in a Monte Carlo experiment, where simulated data are generated from stable distributions and AR(1 serial dependence. We calculate error bars for the estimated α by means of simulations. We illustrate the method on an artificial time series. We apply it to an observed, hydrological time series from the River Elbe and find an estimated characteristic exponent of 1.48 ± 0.13. This result indicates finite mean but infinite variance of the statistical distribution of river runoff.
Sensitivity of postplanning target and OAR coverage estimates to dosimetric margin distribution sampling parameters.

Science.gov (United States)

Xu, Huijun; Gordon, J James; Siebers, Jeffrey V

2011-02-01

A dosimetric margin (DM) is the margin in a specified direction between a structure and a specified isodose surface, corresponding to a prescription or tolerance dose. The dosimetric margin distribution (DMD) is the distribution of DMs over all directions. Given a geometric uncertainty model, representing inter- or intrafraction setup uncertainties or internal organ motion, the DMD can be used to calculate coverage Q, which is the probability that a realized target or organ-at-risk (OAR) dose metric D, exceeds the corresponding prescription or tolerance dose. Postplanning coverage evaluation quantifies the percentage of uncertainties for which target and OAR structures meet their intended dose constraints. The goal of the present work is to evaluate coverage probabilities for 28 prostate treatment plans to determine DMD sampling parameters that ensure adequate accuracy for postplanning coverage estimates. Normally distributed interfraction setup uncertainties were applied to 28 plans for localized prostate cancer, with prescribed dose of 79.2 Gy and 10 mm clinical target volume to planning target volume (CTV-to-PTV) margins. Using angular or isotropic sampling techniques, dosimetric margins were determined for the CTV, bladder and rectum, assuming shift invariance of the dose distribution. For angular sampling, DMDs were sampled at fixed angular intervals w (e.g., w = 1 degree, 2 degrees, 5 degrees, 10 degrees, 20 degrees). Isotropic samples were uniformly distributed on the unit sphere resulting in variable angular increments, but were calculated for the same number of sampling directions as angular DMDs, and accordingly characterized by the effective angular increment omega eff. In each direction, the DM was calculated by moving the structure in radial steps of size delta (=0.1, 0.2, 0.5, 1 mm) until the specified isodose was crossed. Coverage estimation accuracy deltaQ was quantified as a function of the sampling parameters omega or omega eff and delta. The
Field-based random sampling without a sampling frame: control selection for a case-control study in rural Africa.

Science.gov (United States)

Crampin, A C; Mwinuka, V; Malema, S S; Glynn, J R; Fine, P E

2001-01-01

Selection bias, particularly of controls, is common in case-control studies and may materially affect the results. Methods of control selection should be tailored both for the risk factors and disease under investigation and for the population being studied. We present here a control selection method devised for a case-control study of tuberculosis in rural Africa (Karonga, northern Malawi) that selects an age/sex frequency-matched random sample of the population, with a geographical distribution in proportion to the population density. We also present an audit of the selection process, and discuss the potential of this method in other settings.
Honest Importance Sampling with Multiple Markov Chains.

Science.gov (United States)

Tan, Aixin; Doss, Hani; Hobert, James P

2015-01-01

Importance sampling is a classical Monte Carlo technique in which a random sample from one probability density, π 1 , is used to estimate an expectation with respect to another, π . The importance sampling estimator is strongly consistent and, as long as two simple moment conditions are satisfied, it obeys a central limit theorem (CLT). Moreover, there is a simple consistent estimator for the asymptotic variance in the CLT, which makes for routine computation of standard errors. Importance sampling can also be used in the Markov chain Monte Carlo (MCMC) context. Indeed, if the random sample from π 1 is replaced by a Harris ergodic Markov chain with invariant density π 1 , then the resulting estimator remains strongly consistent. There is a price to be paid however, as the computation of standard errors becomes more complicated. First, the two simple moment conditions that guarantee a CLT in the iid case are not enough in the MCMC context. Second, even when a CLT does hold, the asymptotic variance has a complex form and is difficult to estimate consistently. In this paper, we explain how to use regenerative simulation to overcome these problems. Actually, we consider a more general set up, where we assume that Markov chain samples from several probability densities, π 1 , …, π k , are available. We construct multiple-chain importance sampling estimators for which we obtain a CLT based on regeneration. We show that if the Markov chains converge to their respective target distributions at a geometric rate, then under moment conditions similar to those required in the iid case, the MCMC-based importance sampling estimator obeys a CLT. Furthermore, because the CLT is based on a regenerative process, there is a simple consistent estimator of the asymptotic variance. We illustrate the method with two applications in Bayesian sensitivity analysis. The first concerns one-way random effects models under different priors. The second involves Bayesian variable
Bridging the gaps between non-invasive genetic sampling and population parameter estimation

Science.gov (United States)

Francesca Marucco; Luigi Boitani; Daniel H. Pletscher; Michael K. Schwartz

2011-01-01

Reliable estimates of population parameters are necessary for effective management and conservation actions. The use of genetic data for captureÂrecapture (CR) analyses has become an important tool to estimate population parameters for elusive species. Strong emphasis has been placed on the genetic analysis of non-invasive samples, or on the CR analysis; however,...
Estimating an appropriate sampling frequency for monitoring ground water well contamination

International Nuclear Information System (INIS)

Tuckfield, R.C.

1994-01-01

Nearly 1,500 ground water wells at the Savannah River Site (SRS) are sampled quarterly to monitor contamination by radionuclides and other hazardous constituents from nearby waste sites. Some 10,000 water samples were collected in 1993 at a laboratory analysis cost of $10,000,000. No widely accepted statistical method has been developed, to date, for estimating a technically defensible ground water sampling frequency consistent and compliant with federal regulations. Such a method is presented here based on the concept of statistical independence among successively measured contaminant concentrations in time
Dried blood spot measurement: application in tacrolimus monitoring using limited sampling strategy and abbreviated AUC estimation.

Science.gov (United States)

Cheung, Chi Yuen; van der Heijden, Jaques; Hoogtanders, Karin; Christiaans, Maarten; Liu, Yan Lun; Chan, Yiu Han; Choi, Koon Shing; van de Plas, Afke; Shek, Chi Chung; Chau, Ka Foon; Li, Chun Sang; van Hooff, Johannes; Stolk, Leo

2008-02-01

Dried blood spot (DBS) sampling and high-performance liquid chromatography tandem-mass spectrometry have been developed in monitoring tacrolimus levels. Our center favors the use of limited sampling strategy and abbreviated formula to estimate the area under concentration-time curve (AUC(0-12)). However, it is inconvenient for patients because they have to wait in the center for blood sampling. We investigated the application of DBS method in tacrolimus level monitoring using limited sampling strategy and abbreviated AUC estimation approach. Duplicate venous samples were obtained at each time point (C(0), C(2), and C(4)). To determine the stability of blood samples, one venous sample was sent to our laboratory immediately. The other duplicate venous samples, together with simultaneous fingerprick blood samples, were sent to the University of Maastricht in the Netherlands. Thirty six patients were recruited and 108 sets of blood samples were collected. There was a highly significant relationship between AUC(0-12), estimated from venous blood samples, and fingerprick blood samples (r(2) = 0.96, P AUC(0-12) strategy as drug monitoring.

Baysian estimation of P(X > x) from a small sample of Gaussian data

DEFF Research Database (Denmark)

Ditlevsen, Ove Dalager

2017-01-01

The classical statistical uncertainty problem of estimation of upper tail probabilities on the basis of a small sample of observations of a Gaussian random variable is considered. Predictive posterior estimation is discussed, adopting the standard statistical model with diffuse priors of the two...
Maximum likelihood estimation for Cox's regression model under nested case-control sampling

DEFF Research Database (Denmark)

Scheike, Thomas Harder; Juul, Anders

2004-01-01

-like growth factor I was associated with ischemic heart disease. The study was based on a population of 3784 Danes and 231 cases of ischemic heart disease where controls were matched on age and gender. We illustrate the use of the MLE for these data and show how the maximum likelihood framework can be used......Nested case-control sampling is designed to reduce the costs of large cohort studies. It is important to estimate the parameters of interest as efficiently as possible. We present a new maximum likelihood estimator (MLE) for nested case-control sampling in the context of Cox's proportional hazards...... model. The MLE is computed by the EM-algorithm, which is easy to implement in the proportional hazards setting. Standard errors are estimated by a numerical profile likelihood approach based on EM aided differentiation. The work was motivated by a nested case-control study that hypothesized that insulin...
Proposal for selecting an ore sample from mining shaft under Kvanefjeld

International Nuclear Information System (INIS)

Lund Clausen, F.

1979-02-01

Uranium ore recovered from the tunnel under Kvanefjeld (Greenland) will be processed in a pilot plant. Selection of a fully representative ore sample for both the whole area and single local sites is discussed. A FORTRAN program for ore distribution is presented, in order to enable correct sampling. (EG)
A Convenient Method for Estimation of the Isotopic Abundance in Uranium Bearing Samples

International Nuclear Information System (INIS)

AI -Saleh, F.S.; AI-Mukren, Alj.H.; Farouk, M.A.

2008-01-01

A convenient and simple method for estimation of the isotopic abundance in some uranium bearing samples using gamma-ray spectrometry is developed using a hyper pure germanium spectrometer and a standard uranium sample with known isotopic abundance
A rapid method for estimation of Pu-isotopes in urine samples using high volume centrifuge.

Science.gov (United States)

Kumar, Ranjeet; Rao, D D; Dubla, Rupali; Yadav, J R

2017-07-01

The conventional radio-analytical technique used for estimation of Pu-isotopes in urine samples involves anion exchange/TEVA column separation followed by alpha spectrometry. This sequence of analysis consumes nearly 3-4 days for completion. Many a times excreta analysis results are required urgently, particularly under repeat and incidental/emergency situations. Therefore, there is need to reduce the analysis time for the estimation of Pu-isotopes in bioassay samples. This paper gives the details of standardization of a rapid method for estimation of Pu-isotopes in urine samples using multi-purpose centrifuge, TEVA resin followed by alpha spectrometry. The rapid method involves oxidation of urine samples, co-precipitation of plutonium along with calcium phosphate followed by sample preparation using high volume centrifuge and separation of Pu using TEVA resin. Pu-fraction was electrodeposited and activity estimated using 236 Pu tracer recovery by alpha spectrometry. Ten routine urine samples of radiation workers were analyzed and consistent radiochemical tracer recovery was obtained in the range 47-88% with a mean and standard deviation of 64.4% and 11.3% respectively. With this newly standardized technique, the whole analytical procedure is completed within 9h (one working day hour). Copyright © 2017 Elsevier Ltd. All rights reserved.
N-mix for fish: estimating riverine salmonid habitat selection via N-mixture models

Science.gov (United States)

Som, Nicholas A.; Perry, Russell W.; Jones, Edward C.; De Juilio, Kyle; Petros, Paul; Pinnix, William D.; Rupert, Derek L.

2018-01-01

Models that formulate mathematical linkages between fish use and habitat characteristics are applied for many purposes. For riverine fish, these linkages are often cast as resource selection functions with variables including depth and velocity of water and distance to nearest cover. Ecologists are now recognizing the role that detection plays in observing organisms, and failure to account for imperfect detection can lead to spurious inference. Herein, we present a flexible N-mixture model to associate habitat characteristics with the abundance of riverine salmonids that simultaneously estimates detection probability. Our formulation has the added benefits of accounting for demographics variation and can generate probabilistic statements regarding intensity of habitat use. In addition to the conceptual benefits, model application to data from the Trinity River, California, yields interesting results. Detection was estimated to vary among surveyors, but there was little spatial or temporal variation. Additionally, a weaker effect of water depth on resource selection is estimated than that reported by previous studies not accounting for detection probability. N-mixture models show great promise for applications to riverine resource selection.
A structured sparse regression method for estimating isoform expression level from multi-sample RNA-seq data.

Science.gov (United States)

Zhang, L; Liu, X J

2016-06-03

With the rapid development of next-generation high-throughput sequencing technology, RNA-seq has become a standard and important technique for transcriptome analysis. For multi-sample RNA-seq data, the existing expression estimation methods usually deal with each single-RNA-seq sample, and ignore that the read distributions are consistent across multiple samples. In the current study, we propose a structured sparse regression method, SSRSeq, to estimate isoform expression using multi-sample RNA-seq data. SSRSeq uses a non-parameter model to capture the general tendency of non-uniformity read distribution for all genes across multiple samples. Additionally, our method adds a structured sparse regularization, which not only incorporates the sparse specificity between a gene and its corresponding isoform expression levels, but also reduces the effects of noisy reads, especially for lowly expressed genes and isoforms. Four real datasets were used to evaluate our method on isoform expression estimation. Compared with other popular methods, SSRSeq reduced the variance between multiple samples, and produced more accurate isoform expression estimations, and thus more meaningful biological interpretations.
Selecting the best stable isotope mixing model to estimate grizzly bear diets in the Greater Yellowstone Ecosystem.

Directory of Open Access Journals (Sweden)

John B Hopkins

Full Text Available Past research indicates that whitebark pine seeds are a critical food source for Threatened grizzly bears (Ursus arctos in the Greater Yellowstone Ecosystem (GYE. In recent decades, whitebark pine forests have declined markedly due to pine beetle infestation, invasive blister rust, and landscape-level fires. To date, no study has reliably estimated the contribution of whitebark pine seeds to the diets of grizzlies through time. We used stable isotope ratios (expressed as δ13C, δ15N, and δ34S values measured in grizzly bear hair and their major food sources to estimate the diets of grizzlies sampled in Cooke City Basin, Montana. We found that stable isotope mixing models that included different combinations of stable isotope values for bears and their foods generated similar proportional dietary contributions. Estimates generated by our top model suggest that whitebark pine seeds (35±10% and other plant foods (56±10% were more important than meat (9±8% to grizzly bears sampled in the study area. Stable isotope values measured in bear hair collected elsewhere in the GYE and North America support our conclusions about plant-based foraging. We recommend that researchers consider model selection when estimating the diets of animals using stable isotope mixing models. We also urge researchers to use the new statistical framework described here to estimate the dietary responses of grizzlies to declines in whitebark pine seeds and other important food sources through time in the GYE (e.g., cutthroat trout, as such information could be useful in predicting how the population will adapt to future environmental change.
Estimation of uranium isotope in urine samples using extraction chromatography resin

International Nuclear Information System (INIS)

Thakur, Smita S.; Yadav, J.R.; Rao, D.D.

2012-01-01

Internal exposure monitoring for alpha emitting radionuclides is carried out by bioassay samples analysis. For occupational radiation workers handling uranium in reprocessing or fuel fabrication facilities, there exists a possibility of internal exposure and urine assay is the preferred method for monitoring such exposure. Estimation of lower concentration of uranium at mBq level by alpha spectrometry requires preconcentration and its separation from large volume of urine sample. For this purpose, urine samples collected from non radiation workers were spiked with 232 U tracer at mBq level to estimate the chemical yield. Uranium in urine sample was pre-concentrated by calcium phosphate coprecipitation and separated by extraction chromatography resin U/TEVA. In this resin extractant was DAAP (Diamylamylphosphonate) supported on inert Amberlite XAD-7 support material. After co-precipitation, precipitate was centrifuged and dissolved in 10 ml of 1M Al(NO 3 ) 3 prepared in 3M HNO 3 . The sample thus prepared was loaded on extraction chromatography resin, pre-conditioned with 10 ml of 3M HNO 3 . Column was washed with 10 ml of 3M HNO 3 . Column was again rinsed with 5 ml of 9M HCl followed by 20 ml of 0.05 M oxalic acid prepared in 5M HCl to remove interference due to Th and Np if present in the sample. Uranium was eluted from U/TEVA column with 15 ml of 0.01M HCl. The eluted uranium fraction was electrodeposited on stainless steel planchet and counted by alpha spectrometry for 360000 sec. Approximate analysis time involved from sample loading to stripping is 2 hours when compared with the time involved of 3.5 hours by conventional ion exchange method. Seven urine samples from non radiation worker were radio chemically analyzed by this technique and the radiochemical yield was found in the range of 69-91 %. Efficacy of this method against conventional anion exchange technique earlier standardized at this laboratory is also being highlighted. Minimum detectable activity
Estimating the sample mean and standard deviation from the sample size, median, range and/or interquartile range

OpenAIRE

Wan, Xiang; Wang, Wenqian; Liu, Jiming; Tong, Tiejun

2014-01-01

Background In systematic reviews and meta-analysis, researchers often pool the results of the sample mean and standard deviation from a set of similar clinical trials. A number of the trials, however, reported the study using the median, the minimum and maximum values, and/or the first and third quartiles. Hence, in order to combine results, one may have to estimate the sample mean and standard deviation for such trials. Methods In this paper, we propose to improve the existing literature in ...
Evaluation of pump pulsation in respirable size-selective sampling: part II. Changes in sampling efficiency.

Science.gov (United States)

Lee, Eun Gyung; Lee, Taekhee; Kim, Seung Won; Lee, Larry; Flemmer, Michael M; Harper, Martin

2014-01-01

This second, and concluding, part of this study evaluated changes in sampling efficiency of respirable size-selective samplers due to air pulsations generated by the selected personal sampling pumps characterized in Part I (Lee E, Lee L, Möhlmann C et al. Evaluation of pump pulsation in respirable size-selective sampling: Part I. Pulsation measurements. Ann Occup Hyg 2013). Nine particle sizes of monodisperse ammonium fluorescein (from 1 to 9 μm mass median aerodynamic diameter) were generated individually by a vibrating orifice aerosol generator from dilute solutions of fluorescein in aqueous ammonia and then injected into an environmental chamber. To collect these particles, 10-mm nylon cyclones, also known as Dorr-Oliver (DO) cyclones, were used with five medium volumetric flow rate pumps. Those were the Apex IS, HFS513, GilAir5, Elite5, and Basic5 pumps, which were found in Part I to generate pulsations of 5% (the lowest), 25%, 30%, 56%, and 70% (the highest), respectively. GK2.69 cyclones were used with the Legacy [pump pulsation (PP) = 15%] and Elite12 (PP = 41%) pumps for collection at high flows. The DO cyclone was also used to evaluate changes in sampling efficiency due to pulse shape. The HFS513 pump, which generates a more complex pulse shape, was compared to a single sine wave fluctuation generated by a piston. The luminescent intensity of the fluorescein extracted from each sample was measured with a luminescence spectrometer. Sampling efficiencies were obtained by dividing the intensity of the fluorescein extracted from the filter placed in a cyclone with the intensity obtained from the filter used with a sharp-edged reference sampler. Then, sampling efficiency curves were generated using a sigmoid function with three parameters and each sampling efficiency curve was compared to that of the reference cyclone by constructing bias maps. In general, no change in sampling efficiency (bias under ±10%) was observed until pulsations exceeded 25% for the
A quick method based on SIMPLISMA-KPLS for simultaneously selecting outlier samples and informative samples for model standardization in near infrared spectroscopy

Science.gov (United States)

Li, Li-Na; Ma, Chang-Ming; Chang, Ming; Zhang, Ren-Cheng

2017-12-01

A novel method based on SIMPLe-to-use Interactive Self-modeling Mixture Analysis (SIMPLISMA) and Kernel Partial Least Square (KPLS), named as SIMPLISMA-KPLS, is proposed in this paper for selection of outlier samples and informative samples simultaneously. It is a quick algorithm used to model standardization (or named as model transfer) in near infrared (NIR) spectroscopy. The NIR experiment data of the corn for analysis of the protein content is introduced to evaluate the proposed method. Piecewise direct standardization (PDS) is employed in model transfer. And the comparison of SIMPLISMA-PDS-KPLS and KS-PDS-KPLS is given in this research by discussion of the prediction accuracy of protein content and calculation speed of each algorithm. The conclusions include that SIMPLISMA-KPLS can be utilized as an alternative sample selection method for model transfer. Although it has similar accuracy to Kennard-Stone (KS), it is different from KS as it employs concentration information in selection program. This means that it ensures analyte information is involved in analysis, and the spectra (X) of the selected samples is interrelated with concentration (y). And it can be used for outlier sample elimination simultaneously by validation of calibration. According to the statistical data results of running time, it is clear that the sample selection process is more rapid when using KPLS. The quick algorithm of SIMPLISMA-KPLS is beneficial to improve the speed of online measurement using NIR spectroscopy.
Model Selection and Risk Estimation with Applications to Nonlinear Ordinary Differential Equation Systems

DEFF Research Database (Denmark)

Mikkelsen, Frederik Vissing

eective computational tools for estimating unknown structures in dynamical systems, such as gene regulatory networks, which may be used to predict downstream eects of interventions in the system. A recommended algorithm based on the computational tools is presented and thoroughly tested in various......Broadly speaking, this thesis is devoted to model selection applied to ordinary dierential equations and risk estimation under model selection. A model selection framework was developed for modelling time course data by ordinary dierential equations. The framework is accompanied by the R software...... package, episode. This package incorporates a collection of sparsity inducing penalties into two types of loss functions: a squared loss function relying on numerically solving the equations and an approximate loss function based on inverse collocation methods. The goal of this framework is to provide...
Near-native protein loop sampling using nonparametric density estimation accommodating sparcity.

Science.gov (United States)

Joo, Hyun; Chavan, Archana G; Day, Ryan; Lennox, Kristin P; Sukhanov, Paul; Dahl, David B; Vannucci, Marina; Tsai, Jerry

2011-10-01

Unlike the core structural elements of a protein like regular secondary structure, template based modeling (TBM) has difficulty with loop regions due to their variability in sequence and structure as well as the sparse sampling from a limited number of homologous templates. We present a novel, knowledge-based method for loop sampling that leverages homologous torsion angle information to estimate a continuous joint backbone dihedral angle density at each loop position. The φ,ψ distributions are estimated via a Dirichlet process mixture of hidden Markov models (DPM-HMM). Models are quickly generated based on samples from these distributions and were enriched using an end-to-end distance filter. The performance of the DPM-HMM method was evaluated against a diverse test set in a leave-one-out approach. Candidates as low as 0.45 Å RMSD and with a worst case of 3.66 Å were produced. For the canonical loops like the immunoglobulin complementarity-determining regions (mean RMSD 7.0 Å), this sampling method produces a population of loop structures to around 3.66 Å for loops up to 17 residues. In a direct test of sampling to the Loopy algorithm, our method demonstrates the ability to sample nearer native structures for both the canonical CDRH1 and non-canonical CDRH3 loops. Lastly, in the realistic test conditions of the CASP9 experiment, successful application of DPM-HMM for 90 loops from 45 TBM targets shows the general applicability of our sampling method in loop modeling problem. These results demonstrate that our DPM-HMM produces an advantage by consistently sampling near native loop structure. The software used in this analysis is available for download at http://www.stat.tamu.edu/~dahl/software/cortorgles/.
Near-native protein loop sampling using nonparametric density estimation accommodating sparcity.

Directory of Open Access Journals (Sweden)

Hyun Joo

2011-10-01

Full Text Available Unlike the core structural elements of a protein like regular secondary structure, template based modeling (TBM has difficulty with loop regions due to their variability in sequence and structure as well as the sparse sampling from a limited number of homologous templates. We present a novel, knowledge-based method for loop sampling that leverages homologous torsion angle information to estimate a continuous joint backbone dihedral angle density at each loop position. The φ,ψ distributions are estimated via a Dirichlet process mixture of hidden Markov models (DPM-HMM. Models are quickly generated based on samples from these distributions and were enriched using an end-to-end distance filter. The performance of the DPM-HMM method was evaluated against a diverse test set in a leave-one-out approach. Candidates as low as 0.45 Å RMSD and with a worst case of 3.66 Å were produced. For the canonical loops like the immunoglobulin complementarity-determining regions (mean RMSD 7.0 Å, this sampling method produces a population of loop structures to around 3.66 Å for loops up to 17 residues. In a direct test of sampling to the Loopy algorithm, our method demonstrates the ability to sample nearer native structures for both the canonical CDRH1 and non-canonical CDRH3 loops. Lastly, in the realistic test conditions of the CASP9 experiment, successful application of DPM-HMM for 90 loops from 45 TBM targets shows the general applicability of our sampling method in loop modeling problem. These results demonstrate that our DPM-HMM produces an advantage by consistently sampling near native loop structure. The software used in this analysis is available for download at http://www.stat.tamu.edu/~dahl/software/cortorgles/.
Near-Native Protein Loop Sampling Using Nonparametric Density Estimation Accommodating Sparcity

Science.gov (United States)

Day, Ryan; Lennox, Kristin P.; Sukhanov, Paul; Dahl, David B.; Vannucci, Marina; Tsai, Jerry

2011-01-01

Unlike the core structural elements of a protein like regular secondary structure, template based modeling (TBM) has difficulty with loop regions due to their variability in sequence and structure as well as the sparse sampling from a limited number of homologous templates. We present a novel, knowledge-based method for loop sampling that leverages homologous torsion angle information to estimate a continuous joint backbone dihedral angle density at each loop position. The φ,ψ distributions are estimated via a Dirichlet process mixture of hidden Markov models (DPM-HMM). Models are quickly generated based on samples from these distributions and were enriched using an end-to-end distance filter. The performance of the DPM-HMM method was evaluated against a diverse test set in a leave-one-out approach. Candidates as low as 0.45 Å RMSD and with a worst case of 3.66 Å were produced. For the canonical loops like the immunoglobulin complementarity-determining regions (mean RMSD 7.0 Å), this sampling method produces a population of loop structures to around 3.66 Å for loops up to 17 residues. In a direct test of sampling to the Loopy algorithm, our method demonstrates the ability to sample nearer native structures for both the canonical CDRH1 and non-canonical CDRH3 loops. Lastly, in the realistic test conditions of the CASP9 experiment, successful application of DPM-HMM for 90 loops from 45 TBM targets shows the general applicability of our sampling method in loop modeling problem. These results demonstrate that our DPM-HMM produces an advantage by consistently sampling near native loop structure. The software used in this analysis is available for download at http://www.stat.tamu.edu/~dahl/software/cortorgles/. PMID:22028638
Sampling Error in Relation to Cyst Nematode Population Density Estimation in Small Field Plots.

Science.gov (United States)

Župunski, Vesna; Jevtić, Radivoje; Jokić, Vesna Spasić; Župunski, Ljubica; Lalošević, Mirjana; Ćirić, Mihajlo; Ćurčić, Živko

2017-06-01

Cyst nematodes are serious plant-parasitic pests which could cause severe yield losses and extensive damage. Since there is still very little information about error of population density estimation in small field plots, this study contributes to the broad issue of population density assessment. It was shown that there was no significant difference between cyst counts of five or seven bulk samples taken per each 1-m 2 plot, if average cyst count per examined plot exceeds 75 cysts per 100 g of soil. Goodness of fit of data to probability distribution tested with χ 2 test confirmed a negative binomial distribution of cyst counts for 21 out of 23 plots. The recommended measure of sampling precision of 17% expressed through coefficient of variation ( cv ) was achieved if the plots of 1 m 2 contaminated with more than 90 cysts per 100 g of soil were sampled with 10-core bulk samples taken in five repetitions. If plots were contaminated with less than 75 cysts per 100 g of soil, 10-core bulk samples taken in seven repetitions gave cv higher than 23%. This study indicates that more attention should be paid on estimation of sampling error in experimental field plots to ensure more reliable estimation of population density of cyst nematodes.
Estimation of functional failure probability of passive systems based on adaptive importance sampling method

International Nuclear Information System (INIS)

Wang Baosheng; Wang Dongqing; Zhang Jianmin; Jiang Jing

2012-01-01

In order to estimate the functional failure probability of passive systems, an innovative adaptive importance sampling methodology is presented. In the proposed methodology, information of variables is extracted with some pre-sampling of points in the failure region. An important sampling density is then constructed from the sample distribution in the failure region. Taking the AP1000 passive residual heat removal system as an example, the uncertainties related to the model of a passive system and the numerical values of its input parameters are considered in this paper. And then the probability of functional failure is estimated with the combination of the response surface method and adaptive importance sampling method. The numerical results demonstrate the high computed efficiency and excellent computed accuracy of the methodology compared with traditional probability analysis methods. (authors)
Evaluation of design flood estimates with respect to sample size

Science.gov (United States)

Kobierska, Florian; Engeland, Kolbjorn

2016-04-01

Estimation of design floods forms the basis for hazard management related to flood risk and is a legal obligation when building infrastructure such as dams, bridges and roads close to water bodies. Flood inundation maps used for land use planning are also produced based on design flood estimates. In Norway, the current guidelines for design flood estimates give recommendations on which data, probability distribution, and method to use dependent on length of the local record. If less than 30 years of local data is available, an index flood approach is recommended where the local observations are used for estimating the index flood and regional data are used for estimating the growth curve. For 30-50 years of data, a 2 parameter distribution is recommended, and for more than 50 years of data, a 3 parameter distribution should be used. Many countries have national guidelines for flood frequency estimation, and recommended distributions include the log Pearson II, generalized logistic and generalized extreme value distributions. For estimating distribution parameters, ordinary and linear moments, maximum likelihood and Bayesian methods are used. The aim of this study is to r-evaluate the guidelines for local flood frequency estimation. In particular, we wanted to answer the following questions: (i) Which distribution gives the best fit to the data? (ii) Which estimation method provides the best fit to the data? (iii) Does the answer to (i) and (ii) depend on local data availability? To answer these questions we set up a test bench for local flood frequency analysis using data based cross-validation methods. The criteria were based on indices describing stability and reliability of design flood estimates. Stability is used as a criterion since design flood estimates should not excessively depend on the data sample. The reliability indices describe to which degree design flood predictions can be trusted.
Estimation and model selection of semiparametric multivariate survival functions under general censorship.

Science.gov (United States)

Chen, Xiaohong; Fan, Yanqin; Pouzo, Demian; Ying, Zhiliang

2010-07-01

We study estimation and model selection of semiparametric models of multivariate survival functions for censored data, which are characterized by possibly misspecified parametric copulas and nonparametric marginal survivals. We obtain the consistency and root- n asymptotic normality of a two-step copula estimator to the pseudo-true copula parameter value according to KLIC, and provide a simple consistent estimator of its asymptotic variance, allowing for a first-step nonparametric estimation of the marginal survivals. We establish the asymptotic distribution of the penalized pseudo-likelihood ratio statistic for comparing multiple semiparametric multivariate survival functions subject to copula misspecification and general censorship. An empirical application is provided.

Estimation of sample size and testing power (part 6).

Science.gov (United States)

Hu, Liang-ping; Bao, Xiao-lei; Guan, Xue; Zhou, Shi-guo

2012-03-01

The design of one factor with k levels (k ≥ 3) refers to the research that only involves one experimental factor with k levels (k ≥ 3), and there is no arrangement for other important non-experimental factors. This paper introduces the estimation of sample size and testing power for quantitative data and qualitative data having a binary response variable with the design of one factor with k levels (k ≥ 3).
A Simple Sampling Method for Estimating the Accuracy of Large Scale Record Linkage Projects.

Science.gov (United States)

Boyd, James H; Guiver, Tenniel; Randall, Sean M; Ferrante, Anna M; Semmens, James B; Anderson, Phil; Dickinson, Teresa

2016-05-17

Record linkage techniques allow different data collections to be brought together to provide a wider picture of the health status of individuals. Ensuring high linkage quality is important to guarantee the quality and integrity of research. Current methods for measuring linkage quality typically focus on precision (the proportion of incorrect links), given the difficulty of measuring the proportion of false negatives. The aim of this work is to introduce and evaluate a sampling based method to estimate both precision and recall following record linkage. In the sampling based method, record-pairs from each threshold (including those below the identified cut-off for acceptance) are sampled and clerically reviewed. These results are then applied to the entire set of record-pairs, providing estimates of false positives and false negatives. This method was evaluated on a synthetically generated dataset, where the true match status (which records belonged to the same person) was known. The sampled estimates of linkage quality were relatively close to actual linkage quality metrics calculated for the whole synthetic dataset. The precision and recall measures for seven reviewers were very consistent with little variation in the clerical assessment results (overall agreement using the Fleiss Kappa statistics was 0.601). This method presents as a possible means of accurately estimating matching quality and refining linkages in population level linkage studies. The sampling approach is especially important for large project linkages where the number of record pairs produced may be very large often running into millions.
Selection Component Analysis of Natural Polymorphisms using Population Samples Including Mother-Offspring Combinations, II

DEFF Research Database (Denmark)

Jarmer, Hanne Østergaard; Christiansen, Freddy Bugge

1981-01-01

Population samples including mother-offspring combinations provide information on the selection components: zygotic selection, sexual selection, gametic seletion and fecundity selection, on the mating pattern, and on the deviation from linkage equilibrium among the loci studied. The theory...
Privacy problems in the small sample selection

Directory of Open Access Journals (Sweden)

Loredana Cerbara

2013-05-01

Full Text Available The side of social research that uses small samples for the production of micro data, today finds some operating difficulties due to the privacy law. The privacy code is a really important and necessary law because it guarantees the Italian citizen’s rights, as already happens in other Countries of the world. However it does not seem appropriate to limit once more the possibilities of the data production of the national centres of research. That possibilities are already moreover compromised due to insufficient founds is a common problem becoming more and more frequent in the research field. It would be necessary, therefore, to include in the law the possibility to use telephonic lists to select samples useful for activities directly of interest and importance to the citizen, such as the collection of the data carried out on the basis of opinion polls by the centres of research of the Italian CNR and some universities.
Reliability of sampling strategies for measuring dairy cattle welfare on commercial farms.

Science.gov (United States)

Van Os, Jennifer M C; Winckler, Christoph; Trieb, Julia; Matarazzo, Soraia V; Lehenbauer, Terry W; Champagne, John D; Tucker, Cassandra B

2018-02-01

Our objective was to evaluate how the proportion of high-producing lactating cows sampled on each farm and the selection method affect prevalence estimates for animal-based measures. We assessed the entire high-producing pen (days in milk size calculations from the Welfare Quality Protocol; and (4) selecting the first, middle, or final third of cows exiting the milking parlor. Estimates were compared with true values using regression analysis and were considered accurate if they met 3 criteria: the coefficient of determination was ≥0.9 and the slope and intercept did not differ significantly from 1 and 0, respectively. All estimates met the slope and intercept criteria, whereas the coefficient of determination increased when more cows were sampled. All estimates were accurate for neck alterations, ocular discharge (22.2 ± 27.4%), and carpal joint hair loss (14.1 ± 17.4%). Selecting a third of the milking order or using the Welfare Quality sample size calculations failed to accurately estimate all measures simultaneously. However, all estimates were accurate when selecting at least 2 of every 3 cows locked at the feed bunk. Using restraint position at the feed bunk did not differ systematically from computer-selecting the same proportion of cows randomly, and the former may be a simpler approach for welfare assessments. Copyright © 2018 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Data Quality Objectives For Selecting Waste Samples To Test The Fluid Bed Steam Reformer Test

International Nuclear Information System (INIS)

Banning, D.L.

2010-01-01

This document describes the data quality objectives to select archived samples located at the 222-S Laboratory for Fluid Bed Steam Reformer testing. The type, quantity and quality of the data required to select the samples for Fluid Bed Steam Reformer testing are discussed. In order to maximize the efficiency and minimize the time to treat Hanford tank waste in the Waste Treatment and Immobilization Plant, additional treatment processes may be required. One of the potential treatment processes is the fluid bed steam reformer (FBSR). A determination of the adequacy of the FBSR process to treat Hanford tank waste is required. The initial step in determining the adequacy of the FBSR process is to select archived waste samples from the 222-S Laboratory that will be used to test the FBSR process. Analyses of the selected samples will be required to confirm the samples meet the testing criteria.
Approaches to sampling and case selection in qualitative research: examples in the geography of health.

Science.gov (United States)

Curtis, S; Gesler, W; Smith, G; Washburn, S

2000-04-01

This paper focuses on the question of sampling (or selection of cases) in qualitative research. Although the literature includes some very useful discussions of qualitative sampling strategies, the question of sampling often seems to receive less attention in methodological discussion than questions of how data is collected or is analysed. Decisions about sampling are likely to be important in many qualitative studies (although it may not be an issue in some research). There are varying accounts of the principles applicable to sampling or case selection. Those who espouse 'theoretical sampling', based on a 'grounded theory' approach, are in some ways opposed to those who promote forms of 'purposive sampling' suitable for research informed by an existing body of social theory. Diversity also results from the many different methods for drawing purposive samples which are applicable to qualitative research. We explore the value of a framework suggested by Miles and Huberman [Miles, M., Huberman,, A., 1994. Qualitative Data Analysis, Sage, London.], to evaluate the sampling strategies employed in three examples of research by the authors. Our examples comprise three studies which respectively involve selection of: 'healing places'; rural places which incorporated national anti-malarial policies; young male interviewees, identified as either chronically ill or disabled. The examples are used to show how in these three studies the (sometimes conflicting) requirements of the different criteria were resolved, as well as the potential and constraints placed on the research by the selection decisions which were made. We also consider how far the criteria Miles and Huberman suggest seem helpful for planning 'sample' selection in qualitative research.
Estimation of uranium in different types of water and sand samples by adsorptive stripping voltammetry

International Nuclear Information System (INIS)

Bhalke, Sunil; Raghunath, Radha; Mishra, Suchismita; Suseela, B.; Tripathi, R.M.; Pandit, G.G.; Shukla, V.K.; Puranik, V.D.

2005-01-01

A method is standardized for the estimation of uranium by adsorptive stripping voltammetry using chloranilic acid (CAA) as complexing agent. The optimum parameters to get best sensitivity and good reproducibility for uranium were 60s adsorption time, pH 1.8, chloranilic acid (2x10 -4 M) and 0.002M EDTA. The peak potential under this condition was found to be -0.03 V. With these optimum parameters a sensitivity of 1.19 nA/nM uranium was observed. Detection limit for this optimum parameter was found to be 0.55 nM. This can be further improved by increasing adsorption time. Using this method, uranium was estimated in different type of water samples such as seawater, synthetic seawater, stream water, tap water, well water, bore well water and process water. This method has also been used for estimation of uranium in sand, organic solvent used for extraction of uranium from phosphoric acid and its raffinate. Sample digestion procedures used for estimation of uranium in various matrices are discussed. It has been observed from the analysis that the uranium peak potentials changes with matrix of the sample, hence, standard addition method is the best method to get reliable and accurate results. Quality assurance of the standardized method is verified by analyzing certified reference water sample from USDOE, participating intercomparison exercises and also by estimating uranium content in water samples both by differential pulse adsorptive stripping voltammetric and laser fluorimetric techniques. (author)
Estimation of the sensitivity of various environmental sampling methods for detection of Salmonella in duck flocks.

Science.gov (United States)

Arnold, Mark E; Mueller-Doblies, Doris; Gosling, Rebecca J; Martelli, Francesca; Davies, Robert H

2015-01-01

Reports of Salmonella in ducks in the UK currently rely upon voluntary submissions from the industry, and as there is no harmonized statutory monitoring and control programme, it is difficult to compare data from different years in order to evaluate any trends in Salmonella prevalence in relation to sampling methodology. Therefore, the aim of this project was to assess the sensitivity of a selection of environmental sampling methods, including the sampling of faeces, dust and water troughs or bowls for the detection of Salmonella in duck flocks, and a range of sampling methods were applied to 67 duck flocks. Bayesian methods in the absence of a gold standard were used to provide estimates of the sensitivity of each of the sampling methods relative to the within-flock prevalence. There was a large influence of the within-flock prevalence on the sensitivity of all sample types, with sensitivity reducing as the within-flock prevalence reduced. Boot swabs (individual and pool of four), swabs of faecally contaminated areas and whole house hand-held fabric swabs showed the overall highest sensitivity for low-prevalence flocks and are recommended for use to detect Salmonella in duck flocks. The sample type with the highest proportion positive was a pool of four hair nets used as boot swabs, but this was not the most sensitive sample for low-prevalence flocks. All the environmental sampling types (faeces swabs, litter pinches, drag swabs, water trough samples and dust) had higher sensitivity than individual faeces sampling. None of the methods consistently identified all the positive flocks, and at least 10 samples would be required for even the most sensitive method (pool of four boot swabs) to detect a 5% prevalence. The sampling of dust had a low sensitivity and is not recommended for ducks.
Continuous sampling from distributed streams

DEFF Research Database (Denmark)

Graham, Cormode; Muthukrishnan, S.; Yi, Ke

2012-01-01

A fundamental problem in data management is to draw and maintain a sample of a large data set, for approximate query answering, selectivity estimation, and query planning. With large, streaming data sets, this problem becomes particularly difficult when the data is shared across multiple distribu......A fundamental problem in data management is to draw and maintain a sample of a large data set, for approximate query answering, selectivity estimation, and query planning. With large, streaming data sets, this problem becomes particularly difficult when the data is shared across multiple...... distributed sites. The main challenge is to ensure that a sample is drawn uniformly across the union of the data while minimizing the communication needed to run the protocol on the evolving data. At the same time, it is also necessary to make the protocol lightweight, by keeping the space and time costs low...... for each participant. In this article, we present communication-efficient protocols for continuously maintaining a sample (both with and without replacement) from k distributed streams. These apply to the case when we want a sample from the full streams, and to the sliding window cases of only the W most...
Per tree estimates with n-tree distance sampling: an application to increment core data

Science.gov (United States)

Thomas B. Lynch; Robert F. Wittwer

2002-01-01

Per tree estimates using the n trees nearest a point can be obtained by using a ratio of per unit area estimates from n-tree distance sampling. This ratio was used to estimate average age by d.b.h. classes for cottonwood trees (Populus deltoides Bartr. ex Marsh.) on the Cimarron National Grassland. Increment...
Determination of Selected Polycyclic Aromatic Compounds in Particulate Matter Samples with Low Mass Loading: An Approach to Test Method Accuracy

Directory of Open Access Journals (Sweden)

Susana García-Alonso

2017-01-01

Full Text Available A miniaturized analytical procedure to determine selected polycyclic aromatic compounds (PACs in low mass loadings (<10 mg of particulate matter (PM is evaluated. The proposed method is based on a simple sonication/agitation method using small amounts of solvent for extraction. The use of a reduced sample size of particulate matter is often limiting for allowing the quantification of analytes. This also leads to the need for changing analytical procedures and evaluating its performance. The trueness and precision of the proposed method were tested using ambient air samples. Analytical results from the proposed method were compared with those of pressurized liquid and microwave extractions. Selected PACs (polycyclic aromatic hydrocarbons (PAHs and nitro polycyclic aromatic hydrocarbons (NPAHs were determined by liquid chromatography with fluorescence detection (HPLC/FD. Taking results from pressurized liquid extractions as reference values, recovery rates of sonication/agitation method were over 80% for the most abundant PAHs. Recovery rates of selected NPAHs were lower. Enhanced rates were obtained when methanol was used as a modifier. Intermediate precision was estimated by data comparison from two mathematical approaches: normalized difference data and pooled relative deviations. Intermediate precision was in the range of 10–20%. The effectiveness of the proposed method was evaluated in PM aerosol samples collected with very low mass loadings (<0.2 mg during characterization studies from turbofan engine exhausts.
Directional selection in temporally replicated studies is remarkably consistent.

Science.gov (United States)

Morrissey, Michael B; Hadfield, Jarrod D

2012-02-01

Temporal variation in selection is a fundamental determinant of evolutionary outcomes. A recent paper presented a synthetic analysis of temporal variation in selection in natural populations. The authors concluded that there is substantial variation in the strength and direction of selection over time, but acknowledged that sampling error would result in estimates of selection that were more variable than the true values. We reanalyze their dataset using techniques that account for the necessary effect of sampling error to inflate apparent levels of variation and show that directional selection is remarkably constant over time, both in magnitude and direction. Thus we cannot claim that the available data support the existence of substantial temporal heterogeneity in selection. Nonetheless, we conject that temporal variation in selection could be important, but that there are good reasons why it may not appear in the available data. These new analyses highlight the importance of applying techniques that estimate parameters of the distribution of selection, rather than parameters of the distribution of estimated selection (which will reflect both sampling error and "real" variation in selection); indeed, despite availability of methods for the former, focus on the latter has been common in synthetic reviews of the aspects of selection in nature, and can lead to serious misinterpretations. © 2011 The Author(s). Evolution© 2011 The Society for the Study of Evolution.
Counting Cats: Spatially Explicit Population Estimates of Cheetah (Acinonyx jubatus Using Unstructured Sampling Data.

Directory of Open Access Journals (Sweden)

Femke Broekhuis

Full Text Available Many ecological theories and species conservation programmes rely on accurate estimates of population density. Accurate density estimation, especially for species facing rapid declines, requires the application of rigorous field and analytical methods. However, obtaining accurate density estimates of carnivores can be challenging as carnivores naturally exist at relatively low densities and are often elusive and wide-ranging. In this study, we employ an unstructured spatial sampling field design along with a Bayesian sex-specific spatially explicit capture-recapture (SECR analysis, to provide the first rigorous population density estimates of cheetahs (Acinonyx jubatus in the Maasai Mara, Kenya. We estimate adult cheetah density to be between 1.28 ± 0.315 and 1.34 ± 0.337 individuals/100km2 across four candidate models specified in our analysis. Our spatially explicit approach revealed 'hotspots' of cheetah density, highlighting that cheetah are distributed heterogeneously across the landscape. The SECR models incorporated a movement range parameter which indicated that male cheetah moved four times as much as females, possibly because female movement was restricted by their reproductive status and/or the spatial distribution of prey. We show that SECR can be used for spatially unstructured data to successfully characterise the spatial distribution of a low density species and also estimate population density when sample size is small. Our sampling and modelling framework will help determine spatial and temporal variation in cheetah densities, providing a foundation for their conservation and management. Based on our results we encourage other researchers to adopt a similar approach in estimating densities of individually recognisable species.
Counting Cats: Spatially Explicit Population Estimates of Cheetah (Acinonyx jubatus) Using Unstructured Sampling Data.

Science.gov (United States)

Broekhuis, Femke; Gopalaswamy, Arjun M

2016-01-01

Many ecological theories and species conservation programmes rely on accurate estimates of population density. Accurate density estimation, especially for species facing rapid declines, requires the application of rigorous field and analytical methods. However, obtaining accurate density estimates of carnivores can be challenging as carnivores naturally exist at relatively low densities and are often elusive and wide-ranging. In this study, we employ an unstructured spatial sampling field design along with a Bayesian sex-specific spatially explicit capture-recapture (SECR) analysis, to provide the first rigorous population density estimates of cheetahs (Acinonyx jubatus) in the Maasai Mara, Kenya. We estimate adult cheetah density to be between 1.28 ± 0.315 and 1.34 ± 0.337 individuals/100km2 across four candidate models specified in our analysis. Our spatially explicit approach revealed 'hotspots' of cheetah density, highlighting that cheetah are distributed heterogeneously across the landscape. The SECR models incorporated a movement range parameter which indicated that male cheetah moved four times as much as females, possibly because female movement was restricted by their reproductive status and/or the spatial distribution of prey. We show that SECR can be used for spatially unstructured data to successfully characterise the spatial distribution of a low density species and also estimate population density when sample size is small. Our sampling and modelling framework will help determine spatial and temporal variation in cheetah densities, providing a foundation for their conservation and management. Based on our results we encourage other researchers to adopt a similar approach in estimating densities of individually recognisable species.
Evaluation of species richness estimators based on quantitative performance measures and sensitivity to patchiness and sample grain size

Science.gov (United States)

Willie, Jacob; Petre, Charles-Albert; Tagg, Nikki; Lens, Luc

2012-11-01

Data from forest herbaceous plants in a site of known species richness in Cameroon were used to test the performance of rarefaction and eight species richness estimators (ACE, ICE, Chao1, Chao2, Jack1, Jack2, Bootstrap and MM). Bias, accuracy, precision and sensitivity to patchiness and sample grain size were the evaluation criteria. An evaluation of the effects of sampling effort and patchiness on diversity estimation is also provided. Stems were identified and counted in linear series of 1-m2 contiguous square plots distributed in six habitat types. Initially, 500 plots were sampled in each habitat type. The sampling process was monitored using rarefaction and a set of richness estimator curves. Curves from the first dataset suggested adequate sampling in riparian forest only. Additional plots ranging from 523 to 2143 were subsequently added in the undersampled habitats until most of the curves stabilized. Jack1 and ICE, the non-parametric richness estimators, performed better, being more accurate and less sensitive to patchiness and sample grain size, and significantly reducing biases that could not be detected by rarefaction and other estimators. This study confirms the usefulness of non-parametric incidence-based estimators, and recommends Jack1 or ICE alongside rarefaction while describing taxon richness and comparing results across areas sampled using similar or different grain sizes. As patchiness varied across habitat types, accurate estimations of diversity did not require the same number of plots. The number of samples needed to fully capture diversity is not necessarily the same across habitats, and can only be known when taxon sampling curves have indicated adequate sampling. Differences in observed species richness between habitats were generally due to differences in patchiness, except between two habitats where they resulted from differences in abundance. We suggest that communities should first be sampled thoroughly using appropriate taxon sampling
Molecularly imprinted layer-coated silica nanoparticles for selective solid-phase extraction of bisphenol A from chemical cleansing and cosmetics samples

International Nuclear Information System (INIS)

Zhu Rong; Zhao Wenhui; Zhai Meijuan; Wei Fangdi; Cai Zheng; Sheng Na; Hu Qin

2010-01-01

Highly selective molecularly imprinted layer-coated silica nanoparticles for bisphenol A (BPA) were synthesized by molecular imprinting technique with a sol-gel process on the supporter of silica nanoparticles. The BPA-imprinted silica nanoparticles were characterized by fourier transform infrared spectrometer, transmission electron microscope, dynamic adsorption and static adsorption tests. The equilibrium association constant, K a , and the apparent maximum number of binding sites, Q max , were estimated to be 1.25 x 10 5 mL μmol -1 and 16.4 μmol g -1 , respectively. The BPA-imprinted silica nanoparticles solid-phase extraction (SPE) column had higher selectivity for BPA than the commercial C18-SPE column. The results of the study indicated that the prepared BPA-imprinted silica nanoparticles exhibited high adsorption capacity and selectivity, and offered a fast kinetics for the rebinding of BPA. The BPA-imprinted silica nanoparticles were successfully used in SPE to selectively enrich and determine BPA from shampoo, bath lotion and cosmetic cream samples.
Contributions to sampling statistics

CERN Document Server

Conti, Pier; Ranalli, Maria

2014-01-01

This book contains a selection of the papers presented at the ITACOSM 2013 Conference, held in Milan in June 2013. ITACOSM is the bi-annual meeting of the Survey Sampling Group S2G of the Italian Statistical Society, intended as an international forum of scientific discussion on the developments of theory and application of survey sampling methodologies and applications in human and natural sciences. The book gathers research papers carefully selected from both invited and contributed sessions of the conference. The whole book appears to be a relevant contribution to various key aspects of sampling methodology and techniques; it deals with some hot topics in sampling theory, such as calibration, quantile-regression and multiple frame surveys, and with innovative methodologies in important topics of both sampling theory and applications. Contributions cut across current sampling methodologies such as interval estimation for complex samples, randomized responses, bootstrap, weighting, modeling, imputati...
SU-E-I-46: Sample-Size Dependence of Model Observers for Estimating Low-Contrast Detection Performance From CT Images

International Nuclear Information System (INIS)

Reiser, I; Lu, Z

2014-01-01

Purpose: Recently, task-based assessment of diagnostic CT systems has attracted much attention. Detection task performance can be estimated using human observers, or mathematical observer models. While most models are well established, considerable bias can be introduced when performance is estimated from a limited number of image samples. Thus, the purpose of this work was to assess the effect of sample size on bias and uncertainty of two channelized Hotelling observers and a template-matching observer. Methods: The image data used for this study consisted of 100 signal-present and 100 signal-absent regions-of-interest, which were extracted from CT slices. The experimental conditions included two signal sizes and five different x-ray beam current settings (mAs). Human observer performance for these images was determined in 2-alternative forced choice experiments. These data were provided by the Mayo clinic in Rochester, MN. Detection performance was estimated from three observer models, including channelized Hotelling observers (CHO) with Gabor or Laguerre-Gauss (LG) channels, and a template-matching observer (TM). Different sample sizes were generated by randomly selecting a subset of image pairs, (N=20,40,60,80). Observer performance was quantified as proportion of correct responses (PC). Bias was quantified as the relative difference of PC for 20 and 80 image pairs. Results: For n=100, all observer models predicted human performance across mAs and signal sizes. Bias was 23% for CHO (Gabor), 7% for CHO (LG), and 3% for TM. The relative standard deviation, σ(PC)/PC at N=20 was highest for the TM observer (11%) and lowest for the CHO (Gabor) observer (5%). Conclusion: In order to make image quality assessment feasible in the clinical practice, a statistically efficient observer model, that can predict performance from few samples, is needed. Our results identified two observer models that may be suited for this task
Choosing a suitable sample size in descriptive sampling

International Nuclear Information System (INIS)

Lee, Yong Kyun; Choi, Dong Hoon; Cha, Kyung Joon

2010-01-01

Descriptive sampling (DS) is an alternative to crude Monte Carlo sampling (CMCS) in finding solutions to structural reliability problems. It is known to be an effective sampling method in approximating the distribution of a random variable because it uses the deterministic selection of sample values and their random permutation,. However, because this method is difficult to apply to complex simulations, the sample size is occasionally determined without thorough consideration. Input sample variability may cause the sample size to change between runs, leading to poor simulation results. This paper proposes a numerical method for choosing a suitable sample size for use in DS. Using this method, one can estimate a more accurate probability of failure in a reliability problem while running a minimal number of simulations. The method is then applied to several examples and compared with CMCS and conventional DS to validate its usefulness and efficiency

The Swift Gamma-Ray Burst Host Galaxy Legacy Survey. I. Sample Selection and Redshift Distribution

Science.gov (United States)

Perley, D. A.; Kruhler, T.; Schulze, S.; Postigo, A. De Ugarte; Hjorth, J.; Berger, E.; Cenko, S. B.; Chary, R.; Cucchiara, A.; Ellis, R.;

2016-01-01

We introduce the Swift Gamma-Ray Burst Host Galaxy Legacy Survey (SHOALS), a multi-observatory high redshift galaxy survey targeting the largest unbiased sample of long-duration gamma-ray burst (GRB) hosts yet assembled (119 in total). We describe the motivations of the survey and the development of our selection criteria, including an assessment of the impact of various observability metrics on the success rate of afterglow-based redshift measurement. We briefly outline our host galaxy observational program, consisting of deep Spitzer/IRAC imaging of every field supplemented by similarly deep, multicolor optical/near-IR photometry, plus spectroscopy of events without preexisting redshifts. Our optimized selection cuts combined with host galaxy follow-up have so far enabled redshift measurements for 110 targets (92%) and placed upper limits on all but one of the remainder. About 20% of GRBs in the sample are heavily dust obscured, and at most 2% originate from z > 5.5. Using this sample, we estimate the redshift-dependent GRB rate density, showing it to peak at z approx. 2.5 and fall by at least an order of magnitude toward low (z = 0) redshift, while declining more gradually toward high (z approx. 7) redshift. This behavior is consistent with a progenitor whose formation efficiency varies modestly over cosmic history. Our survey will permit the most detailed examination to date of the connection between the GRB host population and general star-forming galaxies, directly measure evolution in the host population over cosmic time and discern its causes, and provide new constraints on the fraction of cosmic star formation occurring in undetectable galaxies at all redshifts.

Estimation of the sugar cane cultivated area from LANDSAT images using the two phase sampling method

Science.gov (United States)

Parada, N. D. J. (Principal Investigator); Cappelletti, C. A.; Mendonca, F. J.; Lee, D. C. L.; Shimabukuro, Y. E.

1982-01-01

A two phase sampling method and the optimal sampling segment dimensions for the estimation of sugar cane cultivated area were developed. This technique employs visual interpretations of LANDSAT images and panchromatic aerial photographs considered as the ground truth. The estimates, as a mean value of 100 simulated samples, represent 99.3% of the true value with a CV of approximately 1%; the relative efficiency of the two phase design was 157% when compared with a one phase aerial photographs sample.
New sorbent materials for selective extraction of cocaine and benzoylecgonine from human urine samples.

Science.gov (United States)

Bujak, Renata; Gadzała-Kopciuch, Renata; Nowaczyk, Alicja; Raczak-Gutknecht, Joanna; Kordalewska, Marta; Struck-Lewicka, Wiktoria; Waszczuk-Jankowska, Małgorzata; Tomczak, Ewa; Kaliszan, Michał; Buszewski, Bogusław; Markuszewski, Michał J

2016-02-20

An increase in cocaine consumption has been observed in Europe during the last decade. Benzoylecgonine, as a main urinary metabolite of cocaine in human, is so far the most reliable marker of cocaine consumption. Determination of cocaine and its metabolite in complex biological samples as urine or blood, requires efficient and selective sample pretreatment. In this preliminary study, the newly synthesized sorbent materials were proposed for selective extraction of cocaine and benzoylecgonine from urine samples. Application of these sorbent media allowed to determine cocaine and benzoylecgonine in urine samples at the concentration level of 100ng/ml with good recovery values as 81.7%±6.6 and 73.8%±4.2, respectively. The newly synthesized materials provided efficient, inexpensive and selective extraction of both cocaine and benzoylecgonine from urine samples, which can consequently lead to an increase of the sensitivity of the current available screening diagnostic tests. Copyright © 2015 Elsevier B.V. All rights reserved.
Soft X-Ray Observations of a Complete Sample of X-Ray--selected BL Lacertae Objects

Science.gov (United States)

Perlman, Eric S.; Stocke, John T.; Wang, Q. Daniel; Morris, Simon L.

1996-01-01

We present the results of ROSAT PSPC observations of the X-ray selected BL Lacertae objects (XBLs) in the complete Einstein Extended Medium Sensitivity Survey (EM MS) sample. None of the objects is resolved in their respective PSPC images, but all are easily detected. All BL Lac objects in this sample are well-fitted by single power laws. Their X-ray spectra exhibit a variety of spectral slopes, with best-fit energy power-law spectral indices between α = 0.5-2.3. The PSPC spectra of this sample are slightly steeper than those typical of flat ratio-spectrum quasars. Because almost all of the individual PSPC spectral indices are equal to or slightly steeper than the overall optical to X-ray spectral indices for these same objects, we infer that BL Lac soft X-ray continua are dominated by steep-spectrum synchrotron radiation from a broad X-ray jet, rather than flat-spectrum inverse Compton radiation linked to the narrower radio/millimeter jet. The softness of the X-ray spectra of these XBLs revives the possibility proposed by Guilbert, Fabian, & McCray (1983) that BL Lac objects are lineless because the circumnuclear gas cannot be heated sufficiently to permit two stable gas phases, the cooler of which would comprise the broad emission-line clouds. Because unified schemes predict that hard self-Compton radiation is beamed only into a small solid angle in BL Lac objects, the steep-spectrum synchrotron tail controls the temperature of the circumnuclear gas at r ≤ 1018 cm and prevents broad-line cloud formation. We use these new ROSAT data to recalculate the X-ray luminosity function and cosmological evolution of the complete EMSS sample by determining accurate K-corrections for the sample and estimating the effects of variability and the possibility of incompleteness in the sample. Our analysis confirms that XBLs are evolving "negatively," opposite in sense to quasars, with Ve/Va = 0.331±0.060. The statistically significant difference between the values for X
World Health Organization Estimates of the Relative Contributions of Food to the Burden of Disease Due to Selected Foodborne Hazards: A Structured Expert Elicitation

DEFF Research Database (Denmark)

Hald, Tine; Aspinall, Willy; Devleesschauwer, Brecht

2016-01-01

transmission routes. These findings are essential for global burden of FBD estimates. While gaps exist, we believe the estimates presented here are the best current source of guidance to support decision makers when allocating resources for control and intervention, and for future research initiatives......., seven other infectious diseases and one chemical (lead). Experts were identified through international networks followed by social network sampling. Final selection of experts was based on their experience including international working experience. Enrolled experts were scored on their ability to judge...
Density Estimation in Several Populations With Uncertain Population Membership

KAUST Repository

Ma, Yanyuan; Hart, Jeffrey D.; Carroll, Raymond J.

2011-01-01

sampled from any given population can be calculated. We develop general estimation procedures and bandwidth selection methods for our setting. We establish large-sample properties and study finite-sample performance using simulation studies. We illustrate
Method for estimating modulation transfer function from sample images.

Science.gov (United States)

Saiga, Rino; Takeuchi, Akihisa; Uesugi, Kentaro; Terada, Yasuko; Suzuki, Yoshio; Mizutani, Ryuta

2018-02-01

The modulation transfer function (MTF) represents the frequency domain response of imaging modalities. Here, we report a method for estimating the MTF from sample images. Test images were generated from a number of images, including those taken with an electron microscope and with an observation satellite. These original images were convolved with point spread functions (PSFs) including those of circular apertures. The resultant test images were subjected to a Fourier transformation. The logarithm of the squared norm of the Fourier transform was plotted against the squared distance from the origin. Linear correlations were observed in the logarithmic plots, indicating that the PSF of the test images can be approximated with a Gaussian. The MTF was then calculated from the Gaussian-approximated PSF. The obtained MTF closely coincided with the MTF predicted from the original PSF. The MTF of an x-ray microtomographic section of a fly brain was also estimated with this method. The obtained MTF showed good agreement with the MTF determined from an edge profile of an aluminum test object. We suggest that this approach is an alternative way of estimating the MTF, independently of the image type. Copyright © 2017 Elsevier Ltd. All rights reserved.
Dynamic selection of ship responses for estimation of on-site directional wave spectra

DEFF Research Database (Denmark)

Andersen, Ingrid Marie Vincent; Storhaug, Gaute

2012-01-01

-estimate of the wave spectrum is suggested. The selection method needs to be robust for what reason a parameterised uni-directional, two-parameter wave spectrum is treated. The parameters included are the zero up-crossing period, the significant wave height and the main wave direction relative to the ship’s heading...... with the best overall agreement are selected for the actual estimation of the directional wave spectrum. The transfer functions for the ship responses can be determined using different computational methods such as striptheory, 3D panel codes, closed form expressions or model tests. The uncertainty associated......Knowledge of the wave environment in which a ship is operating is crucial for most on-board decision support systems. Previous research has shown that the directional wave spectrum can be estimated by the use of measured global ship responses and a set of transfer functions determined...
A novel recursive Fourier transform for nonuniform sampled signals: application to heart rate variability spectrum estimation.

Science.gov (United States)

Holland, Alexander; Aboy, Mateo

2009-07-01

We present a novel method to iteratively calculate discrete Fourier transforms for discrete time signals with sample time intervals that may be widely nonuniform. The proposed recursive Fourier transform (RFT) does not require interpolation of the samples to uniform time intervals, and each iterative transform update of N frequencies has computational order N. Because of the inherent non-uniformity in the time between successive heart beats, an application particularly well suited for this transform is power spectral density (PSD) estimation for heart rate variability. We compare RFT based spectrum estimation with Lomb-Scargle Transform (LST) based estimation. PSD estimation based on the LST also does not require uniform time samples, but the LST has a computational order greater than Nlog(N). We conducted an assessment study involving the analysis of quasi-stationary signals with various levels of randomly missing heart beats. Our results indicate that the RFT leads to comparable estimation performance to the LST with significantly less computational overhead and complexity for applications requiring iterative spectrum estimations.
PERIOD ESTIMATION FOR SPARSELY SAMPLED QUASI-PERIODIC LIGHT CURVES APPLIED TO MIRAS

Energy Technology Data Exchange (ETDEWEB)

He, Shiyuan; Huang, Jianhua Z.; Long, James [Department of Statistics, Texas A and M University, College Station, TX (United States); Yuan, Wenlong; Macri, Lucas M., E-mail: lmacri@tamu.edu [George P. and Cynthia W. Mitchell Institute for Fundamental Physics and Astronomy, Department of Physics and Astronomy, Texas A and M University, College Station, TX (United States)

2016-12-01

We develop a nonlinear semi-parametric Gaussian process model to estimate periods of Miras with sparsely sampled light curves. The model uses a sinusoidal basis for the periodic variation and a Gaussian process for the stochastic changes. We use maximum likelihood to estimate the period and the parameters of the Gaussian process, while integrating out the effects of other nuisance parameters in the model with respect to a suitable prior distribution obtained from earlier studies. Since the likelihood is highly multimodal for period, we implement a hybrid method that applies the quasi-Newton algorithm for Gaussian process parameters and search the period/frequency parameter space over a dense grid. A large-scale, high-fidelity simulation is conducted to mimic the sampling quality of Mira light curves obtained by the M33 Synoptic Stellar Survey. The simulated data set is publicly available and can serve as a testbed for future evaluation of different period estimation methods. The semi-parametric model outperforms an existing algorithm on this simulated test data set as measured by period recovery rate and quality of the resulting period–luminosity relations.
Estimation of Handgrip Force from SEMG Based on Wavelet Scale Selection.

Science.gov (United States)

Wang, Kai; Zhang, Xianmin; Ota, Jun; Huang, Yanjiang

2018-02-24

This paper proposes a nonlinear correlation-based wavelet scale selection technology to select the effective wavelet scales for the estimation of handgrip force from surface electromyograms (SEMG). The SEMG signal corresponding to gripping force was collected from extensor and flexor forearm muscles during the force-varying analysis task. We performed a computational sensitivity analysis on the initial nonlinear SEMG-handgrip force model. To explore the nonlinear correlation between ten wavelet scales and handgrip force, a large-scale iteration based on the Monte Carlo simulation was conducted. To choose a suitable combination of scales, we proposed a rule to combine wavelet scales based on the sensitivity of each scale and selected the appropriate combination of wavelet scales based on sequence combination analysis (SCA). The results of SCA indicated that the scale combination VI is suitable for estimating force from the extensors and the combination V is suitable for the flexors. The proposed method was compared to two former methods through prolonged static and force-varying contraction tasks. The experiment results showed that the root mean square errors derived by the proposed method for both static and force-varying contraction tasks were less than 20%. The accuracy and robustness of the handgrip force derived by the proposed method is better than that obtained by the former methods.
Estimation of salt intake from spot urine samples in patients with chronic kidney disease

Directory of Open Access Journals (Sweden)

Ogura Makoto

2012-06-01

Full Text Available Abstract Background High salt intake in patients with chronic kidney disease (CKD may cause high blood pressure and increased albuminuria. Although, the estimation of salt intake is essential, there are no easy methods to estimate real salt intake. Methods Salt intake was assessed by determining urinary sodium excretion from the collected urine samples. Estimation of salt intake by spot urine was calculated by Tanaka’s formula. The correlation between estimated and measured sodium excretion was evaluated by Pearson´s correlation coefficients. Performance of equation was estimated by median bias, interquartile range (IQR, proportion of estimates within 30% deviation of measured sodium excretion (P30 and root mean square error (RMSE.The sensitivity and specificity of estimated against measured sodium excretion were separately assessed by receiver-operating characteristic (ROC curves. Results A total of 334 urine samples from 96 patients were examined. Mean age was 58 ± 16 years, and estimated glomerular filtration rate (eGFR was 53 ± 27 mL/min. Among these patients, 35 had CKD stage 1 or 2, 39 had stage 3, and 22 had stage 4 or 5. Estimated sodium excretion significantly correlated with measured sodium excretion (R = 0.52, P 170 mEq/day (AUC 0.835. Conclusions The present study demonstrated that spot urine can be used to estimate sodium excretion, especially in patients with low eGFR.
Optimal Wavelength Selection in Ultraviolet Spectroscopy for the Estimation of Toxin Reduction Ratio during Hemodialysis

Directory of Open Access Journals (Sweden)

Amir Ghanifar

2016-06-01

Full Text Available Introduction The concentration of substances, including urea, creatinine, and uric acid, can be used as an index to measure toxic uremic solutes in the blood during dialysis and interdialytic intervals. The on-line monitoring of toxin concentration allows for the clearance measurement of some low-molecular-weight solutes at any time during hemodialysis.The aim of this study was to determine the optimal wavelength for estimating the changes in urea, creatinine, and uric acid in dialysate, using ultraviolet (UV spectroscopy. Materials and Methods In this study, nine uremic patients were investigated, using on-line spectrophotometry. The on-line absorption measurements (UV radiation were performed with a spectrophotometer module, connected to the fluid outlet of the dialysis machine. Dialysate samples were obtained and analyzed, using standard biochemical methods. Optimal wavelengths for both creatinine and uric acid were selected by using a combination of genetic algorithms (GAs, i.e., GA-partial least squares (GA-PLS and interval partial least squares (iPLS. Results The Artifitial Neural Network (ANN sensitivity analysis determined the wavelengths of the UV band most suitable for estimating the concentration of creatinine and uric acid. The two optimal wavelengths were 242 and 252 nm for creatinine and 295 and 298 nm for uric acid. Conclusion It can be concluded that the reduction ratio of creatinine and uric acid (dialysis efficiency could be continuously monitored during hemodialysis by UV spectroscopy.Compared to the conventional method, which is particularly sensitive to the sampling technique and involves post-dialysis blood sampling, iterative measurements throughout the dialysis session can yield more reliable data.
Feature selection for portfolio optimization

DEFF Research Database (Denmark)

Bjerring, Thomas Trier; Ross, Omri; Weissensteiner, Alex

2016-01-01

Most portfolio selection rules based on the sample mean and covariance matrix perform poorly out-of-sample. Moreover, there is a growing body of evidence that such optimization rules are not able to beat simple rules of thumb, such as 1/N. Parameter uncertainty has been identified as one major....... While most of the diversification benefits are preserved, the parameter estimation problem is alleviated. We conduct out-of-sample back-tests to show that in most cases different well-established portfolio selection rules applied on the reduced asset universe are able to improve alpha relative...
Automated CBED processing: Sample thickness estimation based on analysis of zone-axis CBED pattern

Energy Technology Data Exchange (ETDEWEB)

Klinger, M., E-mail: klinger@post.cz; Němec, M.; Polívka, L.; Gärtnerová, V.; Jäger, A.

2015-03-15

An automated processing of convergent beam electron diffraction (CBED) patterns is presented. The proposed methods are used in an automated tool for estimating the thickness of transmission electron microscopy (TEM) samples by matching an experimental zone-axis CBED pattern with a series of patterns simulated for known thicknesses. The proposed tool detects CBED disks, localizes a pattern in detected disks and unifies the coordinate system of the experimental pattern with the simulated one. The experimental pattern is then compared disk-by-disk with a series of simulated patterns each corresponding to different known thicknesses. The thickness of the most similar simulated pattern is then taken as the thickness estimate. The tool was tested on [0 1 1] Si, [0 1 0] α-Ti and [0 1 1] α-Ti samples prepared using different techniques. Results of the presented approach were compared with thickness estimates based on analysis of CBED patterns in two beam conditions. The mean difference between these two methods was 4.1% for the FIB-prepared silicon samples, 5.2% for the electro-chemically polished titanium and 7.9% for Ar{sup +} ion-polished titanium. The proposed techniques can also be employed in other established CBED analyses. Apart from the thickness estimation, it can potentially be used to quantify lattice deformation, structure factors, symmetry, defects or extinction distance. - Highlights: • Automated TEM sample thickness estimation using zone-axis CBED is presented. • Computer vision and artificial intelligence are employed in CBED processing. • This approach reduces operator effort, analysis time and increases repeatability. • Individual parts can be employed in other analyses of CBED/diffraction pattern.
Temporally stratified sampling programs for estimation of fish impingement

International Nuclear Information System (INIS)

Kumar, K.D.; Griffith, J.S.

1977-01-01

Impingement monitoring programs often expend valuable and limited resources and fail to provide a dependable estimate of either total annual impingement or those biological and physicochemical factors affecting impingement. In situations where initial monitoring has identified ''problem'' fish species and the periodicity of their impingement, intensive sampling during periods of high impingement will maximize information obtained. We use data gathered at two nuclear generating facilities in the southeastern United States to discuss techniques of designing such temporally stratified monitoring programs and their benefits and drawbacks. Of the possible temporal patterns in environmental factors within a calendar year, differences among seasons are most influential in the impingement of freshwater fishes in the Southeast. Data on the threadfin shad (Dorosoma petenense) and the role of seasonal temperature changes are utilized as an example to demonstrate ways of most efficiently and accurately estimating impingement of the species
Reliable Quantification of the Potential for Equations Based on Spot Urine Samples to Estimate Population Salt Intake

DEFF Research Database (Denmark)

Huang, Liping; Crino, Michelle; Wu, Jason Hy

2016-01-01

to a standard format. Individual participant records will be compiled and a series of analyses will be completed to: (1) compare existing equations for estimating 24-hour salt intake from spot urine samples with 24-hour urine samples, and assess the degree of bias according to key demographic and clinical......BACKGROUND: Methods based on spot urine samples (a single sample at one time-point) have been identified as a possible alternative approach to 24-hour urine samples for determining mean population salt intake. OBJECTIVE: The aim of this study is to identify a reliable method for estimating mean...... population salt intake from spot urine samples. This will be done by comparing the performance of existing equations against one other and against estimates derived from 24-hour urine samples. The effects of factors such as ethnicity, sex, age, body mass index, antihypertensive drug use, health status...
B-graph sampling to estimate the size of a hidden population

NARCIS (Netherlands)

Spreen, M.; Bogaerts, S.

2015-01-01

Link-tracing designs are often used to estimate the size of hidden populations by utilizing the relational links between their members. A major problem in studies of hidden populations is the lack of a convenient sampling frame. The most frequently applied design in studies of hidden populations is
The NuSTAR Extragalactic Surveys: X-Ray Spectroscopic Analysis of the Bright Hard-band Selected Sample

Science.gov (United States)

Zappacosta, L.; Comastri, A.; Civano, F.; Puccetti, S.; Fiore, F.; Aird, J.; Del Moro, A.; Lansbury, G. B.; Lanzuisi, G.; Goulding, A.; Mullaney, J. R.; Stern, D.; Ajello, M.; Alexander, D. M.; Ballantyne, D. R.; Bauer, F. E.; Brandt, W. N.; Chen, C.-T. J.; Farrah, D.; Harrison, F. A.; Gandhi, P.; Lanz, L.; Masini, A.; Marchesi, S.; Ricci, C.; Treister, E.

2018-02-01

We discuss the spectral analysis of a sample of 63 active galactic nuclei (AGN) detected above a limiting flux of S(8{--}24 {keV})=7× {10}-14 {erg} {{{s}}}-1 {{cm}}-2 in the multi-tiered NuSTAR extragalactic survey program. The sources span a redshift range z=0{--}2.1 (median =0.58). The spectral analysis is performed over the broad 0.5–24 keV energy range, combining NuSTAR with Chandra and/or XMM-Newton data and employing empirical and physically motivated models. This constitutes the largest sample of AGN selected at > 10 {keV} to be homogeneously spectrally analyzed at these flux levels. We study the distribution of spectral parameters such as photon index, column density ({N}{{H}}), reflection parameter ({\\boldsymbol{R}}), and 10–40 keV luminosity ({L}{{X}}). Heavily obscured ({log}[{N}{{H}}/{{cm}}-2]≥slant 23) and Compton-thick (CT; {log}[{N}{{H}}/{{cm}}-2]≥slant 24) AGN constitute ∼25% (15–17 sources) and ∼2–3% (1–2 sources) of the sample, respectively. The observed {N}{{H}} distribution agrees fairly well with predictions of cosmic X-ray background population-synthesis models (CXBPSM). We estimate the intrinsic fraction of AGN as a function of {N}{{H}}, accounting for the bias against obscured AGN in a flux-selected sample. The fraction of CT AGN relative to {log}[{N}{{H}}/{{cm}}-2]=20{--}24 AGN is poorly constrained, formally in the range 2–56% (90% upper limit of 66%). We derived a fraction (f abs) of obscured AGN ({log}[{N}{{H}}/{{cm}}-2]=22{--}24) as a function of {L}{{X}} in agreement with CXBPSM and previous zvalues.
Automatic Feature Selection and Weighting for the Formation of Homogeneous Groups for Regional Intensity-Duration-Frequency (IDF) Curve Estimation

Science.gov (United States)

Yang, Z.; Burn, D. H.

2017-12-01

Extreme rainfall events can have devastating impacts on society. To quantify the associated risk, the IDF curve has been used to provide the essential rainfall-related information for urban planning. However, the recent changes in the rainfall climatology caused by climate change and urbanization have made the estimates provided by the traditional regional IDF approach increasingly inaccurate. This inaccuracy is mainly caused by two problems: 1) The ineffective choice of similarity indicators for the formation of a homogeneous group at different regions; and 2) An inadequate number of stations in the pooling group that does not adequately reflect the optimal balance between group size and group homogeneity or achieve the lowest uncertainty in the rainfall quantiles estimates. For the first issue, to consider the temporal difference among different meteorological and topographic indicators, a three-layer design is proposed based on three stages in the extreme rainfall formation: cloud formation, rainfall generation and change of rainfall intensity above urban surface. During the process, the impacts from climate change and urbanization are considered through the inclusion of potential relevant features at each layer. Then to consider spatial difference of similarity indicators for the homogeneous group formation at various regions, an automatic feature selection and weighting algorithm, specifically the hybrid searching algorithm of Tabu search, Lagrange Multiplier and Fuzzy C-means Clustering, is used to select the optimal combination of features for the potential optimal homogenous groups formation at a specific region. For the second issue, to compare the uncertainty of rainfall quantile estimates among potential groups, the two sample Kolmogorov-Smirnov test-based sample ranking process is used. During the process, linear programming is used to rank these groups based on the confidence intervals of the quantile estimates. The proposed methodology fills the gap

Effects of Spatial Distribution of Trees on Density Estimation by Nearest Individual Sampling Method: Case Studies in Zagros Wild Pistachio Woodlands and Simulated Stands

Directory of Open Access Journals (Sweden)

Y. Erfanifard

2014-06-01

Full Text Available Distance methods and their estimators of density may have biased measurements unless the studied stand of trees has a random spatial pattern. This study aimed at assessing the effect of spatial arrangement of wild pistachio trees on the results of density estimation by using the nearest individual sampling method in Zagros woodlands, Iran, and applying a correction factor based on the spatial pattern of trees. A 45 ha clumped stand of wild pistachio trees was selected in Zagros woodlands and two random and dispersed stands with similar density and area were simulated. Distances from the nearest individual and neighbour at 40 sample points in a 100 × 100 m grid were measured in the three stands. The results showed that the nearest individual method with Batcheler estimator could not calculate density correctly in all stands. However, applying the correction factor based on the spatial pattern of the trees, density was measured with no significant difference in terms of the real density of the stands. This study showed that considering the spatial arrangement of trees can improve the results of the nearest individual method with Batcheler estimator in density measurement.
Effective traffic features selection algorithm for cyber-attacks samples

Science.gov (United States)

Li, Yihong; Liu, Fangzheng; Du, Zhenyu

2018-05-01

By studying the defense scheme of Network attacks, this paper propose an effective traffic features selection algorithm based on k-means++ clustering to deal with the problem of high dimensionality of traffic features which extracted from cyber-attacks samples. Firstly, this algorithm divide the original feature set into attack traffic feature set and background traffic feature set by the clustering. Then, we calculates the variation of clustering performance after removing a certain feature. Finally, evaluating the degree of distinctiveness of the feature vector according to the result. Among them, the effective feature vector is whose degree of distinctiveness exceeds the set threshold. The purpose of this paper is to select out the effective features from the extracted original feature set. In this way, it can reduce the dimensionality of the features so as to reduce the space-time overhead of subsequent detection. The experimental results show that the proposed algorithm is feasible and it has some advantages over other selection algorithms.
Automatic sampling for unbiased and efficient stereological estimation using the proportionator in biological studies

DEFF Research Database (Denmark)

Gardi, Jonathan Eyal; Nyengaard, Jens Randel; Gundersen, Hans Jørgen Gottlieb

2008-01-01

Quantification of tissue properties is improved using the general proportionator sampling and estimation procedure: automatic image analysis and non-uniform sampling with probability proportional to size (PPS). The complete region of interest is partitioned into fields of view, and every field...... of view is given a weight (the size) proportional to the total amount of requested image analysis features in it. The fields of view sampled with known probabilities proportional to individual weight are the only ones seen by the observer who provides the correct count. Even though the image analysis...... cerebellum, total number of orexin positive neurons in transgenic mice brain, and estimating the absolute area and the areal fraction of β islet cells in dog pancreas. The proportionator was at least eight times more efficient (precision and time combined) than traditional computer controlled sampling....
Sample size methods for estimating HIV incidence from cross-sectional surveys.

Science.gov (United States)

Konikoff, Jacob; Brookmeyer, Ron

2015-12-01

Understanding HIV incidence, the rate at which new infections occur in populations, is critical for tracking and surveillance of the epidemic. In this article, we derive methods for determining sample sizes for cross-sectional surveys to estimate incidence with sufficient precision. We further show how to specify sample sizes for two successive cross-sectional surveys to detect changes in incidence with adequate power. In these surveys biomarkers such as CD4 cell count, viral load, and recently developed serological assays are used to determine which individuals are in an early disease stage of infection. The total number of individuals in this stage, divided by the number of people who are uninfected, is used to approximate the incidence rate. Our methods account for uncertainty in the durations of time spent in the biomarker defined early disease stage. We find that failure to account for this uncertainty when designing surveys can lead to imprecise estimates of incidence and underpowered studies. We evaluated our sample size methods in simulations and found that they performed well in a variety of underlying epidemics. Code for implementing our methods in R is available with this article at the Biometrics website on Wiley Online Library. © 2015, The International Biometric Society.
Effects of sample size on estimation of rainfall extremes at high temperatures

Science.gov (United States)

Boessenkool, Berry; Bürger, Gerd; Heistermann, Maik

2017-09-01

High precipitation quantiles tend to rise with temperature, following the so-called Clausius-Clapeyron (CC) scaling. It is often reported that the CC-scaling relation breaks down and even reverts for very high temperatures. In our study, we investigate this reversal using observational climate data from 142 stations across Germany. One of the suggested meteorological explanations for the breakdown is limited moisture supply. Here we argue that, instead, it could simply originate from undersampling. As rainfall frequency generally decreases with higher temperatures, rainfall intensities as dictated by CC scaling are less likely to be recorded than for moderate temperatures. Empirical quantiles are conventionally estimated from order statistics via various forms of plotting position formulas. They have in common that their largest representable return period is given by the sample size. In small samples, high quantiles are underestimated accordingly. The small-sample effect is weaker, or disappears completely, when using parametric quantile estimates from a generalized Pareto distribution (GPD) fitted with L moments. For those, we obtain quantiles of rainfall intensities that continue to rise with temperature.
Effects of sample size on estimation of rainfall extremes at high temperatures

Directory of Open Access Journals (Sweden)

B. Boessenkool

2017-09-01

Full Text Available High precipitation quantiles tend to rise with temperature, following the so-called Clausius–Clapeyron (CC scaling. It is often reported that the CC-scaling relation breaks down and even reverts for very high temperatures. In our study, we investigate this reversal using observational climate data from 142 stations across Germany. One of the suggested meteorological explanations for the breakdown is limited moisture supply. Here we argue that, instead, it could simply originate from undersampling. As rainfall frequency generally decreases with higher temperatures, rainfall intensities as dictated by CC scaling are less likely to be recorded than for moderate temperatures. Empirical quantiles are conventionally estimated from order statistics via various forms of plotting position formulas. They have in common that their largest representable return period is given by the sample size. In small samples, high quantiles are underestimated accordingly. The small-sample effect is weaker, or disappears completely, when using parametric quantile estimates from a generalized Pareto distribution (GPD fitted with L moments. For those, we obtain quantiles of rainfall intensities that continue to rise with temperature.
Random selection of items. Selection of n1 samples among N items composing a stratum

International Nuclear Information System (INIS)

Jaech, J.L.; Lemaire, R.J.

1987-02-01

STR-224 provides generalized procedures to determine required sample sizes, for instance in the course of a Physical Inventory Verification at Bulk Handling Facilities. The present report describes procedures to generate random numbers and select groups of items to be verified in a given stratum through each of the measurement methods involved in the verification. (author). 3 refs
Evaluating the accuracy of sampling to estimate central line-days: simplification of the National Healthcare Safety Network surveillance methods.

Science.gov (United States)

Thompson, Nicola D; Edwards, Jonathan R; Bamberg, Wendy; Beldavs, Zintars G; Dumyati, Ghinwa; Godine, Deborah; Maloney, Meghan; Kainer, Marion; Ray, Susan; Thompson, Deborah; Wilson, Lucy; Magill, Shelley S

2013-03-01

To evaluate the accuracy of weekly sampling of central line-associated bloodstream infection (CLABSI) denominator data to estimate central line-days (CLDs). Obtained CLABSI denominator logs showing daily counts of patient-days and CLD for 6-12 consecutive months from participants and CLABSI numerators and facility and location characteristics from the National Healthcare Safety Network (NHSN). Convenience sample of 119 inpatient locations in 63 acute care facilities within 9 states participating in the Emerging Infections Program. Actual CLD and estimated CLD obtained from sampling denominator data on all single-day and 2-day (day-pair) samples were compared by assessing the distributions of the CLD percentage error. Facility and location characteristics associated with increased precision of estimated CLD were assessed. The impact of using estimated CLD to calculate CLABSI rates was evaluated by measuring the change in CLABSI decile ranking. The distribution of CLD percentage error varied by the day and number of days sampled. On average, day-pair samples provided more accurate estimates than did single-day samples. For several day-pair samples, approximately 90% of locations had CLD percentage error of less than or equal to ±5%. A lower number of CLD per month was most significantly associated with poor precision in estimated CLD. Most locations experienced no change in CLABSI decile ranking, and no location's CLABSI ranking changed by more than 2 deciles. Sampling to obtain estimated CLD is a valid alternative to daily data collection for a large proportion of locations. Development of a sampling guideline for NHSN users is underway.
A Sample-Based Forest Monitoring Strategy Using Landsat, AVHRR and MODIS Data to Estimate Gross Forest Cover Loss in Malaysia between 1990 and 2005

Directory of Open Access Journals (Sweden)

Peter Potapov

2013-04-01

Full Text Available Insular Southeast Asia is a hotspot of humid tropical forest cover loss. A sample-based monitoring approach quantifying forest cover loss from Landsat imagery was implemented to estimate gross forest cover loss for two eras, 1990–2000 and 2000–2005. For each time interval, a probability sample of 18.5 km × 18.5 km blocks was selected, and pairs of Landsat images acquired per sample block were interpreted to quantify forest cover area and gross forest cover loss. Stratified random sampling was implemented for 2000–2005 with MODIS-derived forest cover loss used to define the strata. A probability proportional to x (πpx design was implemented for 1990–2000 with AVHRR-derived forest cover loss used as the x variable to increase the likelihood of including forest loss area in the sample. The estimated annual gross forest cover loss for Malaysia was 0.43 Mha/yr (SE = 0.04 during 1990–2000 and 0.64 Mha/yr (SE = 0.055 during 2000–2005. Our use of the πpx sampling design represents a first practical trial of this design for sampling satellite imagery. Although the design performed adequately in this study, a thorough comparative investigation of the πpx design relative to other sampling strategies is needed before general design recommendations can be put forth.
Estimates of Inequality Indices Based on Simple Random, Ranked Set, and Systematic Sampling

OpenAIRE

Bansal, Pooja; Arora, Sangeeta; Mahajan, Kalpana K.

2013-01-01

Gini index, Bonferroni index, and Absolute Lorenz index are some popular indices of inequality showing different features of inequality measurement. In general simple random sampling procedure is commonly used to estimate the inequality indices and their related inference. The key condition that the samples must be drawn via simple random sampling procedure though makes calculations much simpler but this assumption is often violated in practice as the data does not always yield simple random ...
Time delay estimation in a reverberant environment by low rate sampling of impulsive acoustic sources

KAUST Repository

Omer, Muhammad

2012-07-01

This paper presents a new method of time delay estimation (TDE) using low sample rates of an impulsive acoustic source in a room environment. The proposed method finds the time delay from the room impulse response (RIR) which makes it robust against room reverberations. The RIR is considered a sparse phenomenon and a recently proposed sparse signal reconstruction technique called orthogonal clustering (OC) is utilized for its estimation from the low rate sampled received signal. The arrival time of the direct path signal at a pair of microphones is identified from the estimated RIR and their difference yields the desired time delay. Low sampling rates reduce the hardware and computational complexity and decrease the communication between the microphones and the centralized location. The performance of the proposed technique is demonstrated by numerical simulations and experimental results. © 2012 IEEE.
Clinical usefulness of limited sampling strategies for estimating AUC of proton pump inhibitors.

Science.gov (United States)

Niioka, Takenori

2011-03-01

Cytochrome P450 (CYP) 2C19 (CYP2C19) genotype is regarded as a useful tool to predict area under the blood concentration-time curve (AUC) of proton pump inhibitors (PPIs). In our results, however, CYP2C19 genotypes had no influence on AUC of all PPIs during fluvoxamine treatment. These findings suggest that CYP2C19 genotyping is not always a good indicator for estimating AUC of PPIs. Limited sampling strategies (LSS) were developed to estimate AUC simply and accurately. It is important to minimize the number of blood samples because of patient's acceptance. This article reviewed the usefulness of LSS for estimating AUC of three PPIs (omeprazole: OPZ, lansoprazole: LPZ and rabeprazole: RPZ). The best prediction formulas in each PPI were AUC(OPZ)=9.24 x C(6h)+2638.03, AUC(LPZ)=12.32 x C(6h)+3276.09 and AUC(RPZ)=1.39 x C(3h)+7.17 x C(6h)+344.14, respectively. In order to optimize the sampling strategy of LPZ, we tried to establish LSS for LPZ using a time point within 3 hours through the property of pharmacokinetics of its enantiomers. The best prediction formula using the fewest sampling points (one point) was AUC(racemic LPZ)=6.5 x C(3h) of (R)-LPZ+13.7 x C(3h) of (S)-LPZ-9917.3 x G1-14387.2×G2+7103.6 (G1: homozygous extensive metabolizer is 1 and the other genotypes are 0; G2: heterozygous extensive metabolizer is 1 and the other genotypes are 0). Those strategies, plasma concentration monitoring at one or two time-points, might be more suitable for AUC estimation than reference to CYP2C19 genotypes, particularly in the case of coadministration of CYP mediators.
40 CFR 761.308 - Sample selection by random number generation on any two-dimensional square grid.

Science.gov (United States)

2010-07-01

... 40 Protection of Environment 30 2010-07-01 2010-07-01 false Sample selection by random number... Â§ 761.79(b)(3) § 761.308 Sample selection by random number generation on any two-dimensional square... area created in accordance with paragraph (a) of this section, select two random numbers: one each for...
Frequency-Selective Signal Sensing with Sub-Nyquist Uniform Sampling Scheme

DEFF Research Database (Denmark)

Pierzchlewski, Jacek; Arildsen, Thomas

2015-01-01

In this paper the authors discuss a problem of acquisition and reconstruction of a signal polluted by adjacent- channel interference. The authors propose a method to find a sub-Nyquist uniform sampling pattern which allows for correct reconstruction of selected frequencies. The method is inspired...... by the Restricted Isometry Property, which is known from the field of compressed sensing. Then, compressed sensing is used to successfully reconstruct a wanted signal even if some of the uniform samples were randomly lost, e. g. due to ADC saturation. An experiment which tests the proposed method in practice...
Non-parametric adaptive importance sampling for the probability estimation of a launcher impact position

International Nuclear Information System (INIS)

Morio, Jerome

2011-01-01

Importance sampling (IS) is a useful simulation technique to estimate critical probability with a better accuracy than Monte Carlo methods. It consists in generating random weighted samples from an auxiliary distribution rather than the distribution of interest. The crucial part of this algorithm is the choice of an efficient auxiliary PDF that has to be able to simulate more rare random events. The optimisation of this auxiliary distribution is often in practice very difficult. In this article, we propose to approach the IS optimal auxiliary density with non-parametric adaptive importance sampling (NAIS). We apply this technique for the probability estimation of spatial launcher impact position since it has currently become a more and more important issue in the field of aeronautics.
Reef-associated crustacean fauna: biodiversity estimates using semi-quantitative sampling and DNA barcoding

Science.gov (United States)

Plaisance, L.; Knowlton, N.; Paulay, G.; Meyer, C.

2009-12-01

The cryptofauna associated with coral reefs accounts for a major part of the biodiversity in these ecosystems but has been largely overlooked in biodiversity estimates because the organisms are hard to collect and identify. We combine a semi-quantitative sampling design and a DNA barcoding approach to provide metrics for the diversity of reef-associated crustacean. Twenty-two similar-sized dead heads of Pocillopora were sampled at 10 m depth from five central Pacific Ocean localities (four atolls in the Northern Line Islands and in Moorea, French Polynesia). All crustaceans were removed, and partial cytochrome oxidase subunit I was sequenced from 403 individuals, yielding 135 distinct taxa using a species-level criterion of 5% similarity. Most crustacean species were rare; 44% of the OTUs were represented by a single individual, and an additional 33% were represented by several specimens found only in one of the five localities. The Northern Line Islands and Moorea shared only 11 OTUs. Total numbers estimated by species richness statistics (Chao1 and ACE) suggest at least 90 species of crustaceans in Moorea and 150 in the Northern Line Islands for this habitat type. However, rarefaction curves for each region failed to approach an asymptote, and Chao1 and ACE estimators did not stabilize after sampling eight heads in Moorea, so even these diversity figures are underestimates. Nevertheless, even this modest sampling effort from a very limited habitat resulted in surprisingly high species numbers.
Multiple sensitive estimation and optimal sample size allocation in the item sum technique.

Science.gov (United States)

Perri, Pier Francesco; Rueda García, María Del Mar; Cobo Rodríguez, Beatriz

2018-01-01

For surveys of sensitive issues in life sciences, statistical procedures can be used to reduce nonresponse and social desirability response bias. Both of these phenomena provoke nonsampling errors that are difficult to deal with and can seriously flaw the validity of the analyses. The item sum technique (IST) is a very recent indirect questioning method derived from the item count technique that seeks to procure more reliable responses on quantitative items than direct questioning while preserving respondents' anonymity. This article addresses two important questions concerning the IST: (i) its implementation when two or more sensitive variables are investigated and efficient estimates of their unknown population means are required; (ii) the determination of the optimal sample size to achieve minimum variance estimates. These aspects are of great relevance for survey practitioners engaged in sensitive research and, to the best of our knowledge, were not studied so far. In this article, theoretical results for multiple estimation and optimal allocation are obtained under a generic sampling design and then particularized to simple random sampling and stratified sampling designs. Theoretical considerations are integrated with a number of simulation studies based on data from two real surveys and conducted to ascertain the efficiency gain derived from optimal allocation in different situations. One of the surveys concerns cannabis consumption among university students. Our findings highlight some methodological advances that can be obtained in life sciences IST surveys when optimal allocation is achieved. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Variable selection for confounder control, flexible modeling and Collaborative Targeted Minimum Loss-based Estimation in causal inference

Science.gov (United States)

Schnitzer, Mireille E.; Lok, Judith J.; Gruber, Susan

2015-01-01

This paper investigates the appropriateness of the integration of flexible propensity score modeling (nonparametric or machine learning approaches) in semiparametric models for the estimation of a causal quantity, such as the mean outcome under treatment. We begin with an overview of some of the issues involved in knowledge-based and statistical variable selection in causal inference and the potential pitfalls of automated selection based on the fit of the propensity score. Using a simple example, we directly show the consequences of adjusting for pure causes of the exposure when using inverse probability of treatment weighting (IPTW). Such variables are likely to be selected when using a naive approach to model selection for the propensity score. We describe how the method of Collaborative Targeted minimum loss-based estimation (C-TMLE; van der Laan and Gruber, 2010) capitalizes on the collaborative double robustness property of semiparametric efficient estimators to select covariates for the propensity score based on the error in the conditional outcome model. Finally, we compare several approaches to automated variable selection in low-and high-dimensional settings through a simulation study. From this simulation study, we conclude that using IPTW with flexible prediction for the propensity score can result in inferior estimation, while Targeted minimum loss-based estimation and C-TMLE may benefit from flexible prediction and remain robust to the presence of variables that are highly correlated with treatment. However, in our study, standard influence function-based methods for the variance underestimated the standard errors, resulting in poor coverage under certain data-generating scenarios. PMID:26226129
Point and Fixed Plot Sampling Inventory Estimates at the Savannah River Site, South Carolina.

Energy Technology Data Exchange (ETDEWEB)

Parresol, Bernard, R.

2004-02-01

This report provides calculation of systematic point sampling volume estimates for trees greater than or equal to 5 inches diameter breast height (dbh) and fixed radius plot volume estimates for trees < 5 inches dbh at the Savannah River Site (SRS), Aiken County, South Carolina. The inventory of 622 plots was started in March 1999 and completed in January 2002 (Figure 1). Estimates are given in cubic foot volume. The analyses are presented in a series of Tables and Figures. In addition, a preliminary analysis of fuel levels on the SRS is given, based on depth measurements of the duff and litter layers on the 622 inventory plots plus line transect samples of down coarse woody material. Potential standing live fuels are also included. The fuels analyses are presented in a series of tables.
Selection and estimation of the heritability of sunflower (Helianthus annuus) pollen collection behavior in Apis mellifera colonies.

Science.gov (United States)

Basualdo, M; Rodríguez, E M; Bedascarrasbure, E; De Jong, D

2007-06-20

We selected honey bee colonies (Apis mellifera L.) with a high tendency to collect sunflower pollen and estimated the heritability of this trait. The percentage of sunflower pollen collected by 74 colonies was evaluated. Five colonies that collected the highest percentages of sunflower pollen were selected. Nineteen colonies headed by daughters of these selected queens were evaluated for this characteristic in comparison with 20 control (unselected) colonies. The variation for the proportion of sunflower pollen was greater among colonies of the control group than among these selected daughter colonies. The estimated heritability was 0.26 +/- 0.23, demonstrating that selection to increase sunflower pollen collection is feasible. Such selected colonies could be used to improve sunflower pollination in commercial fields.

Mineralogy, petrology and whole-rock chemistry data compilation for selected samples of Yucca Mountain tuffs

International Nuclear Information System (INIS)

Connolly, J.R.

1991-12-01

Petrologic, bulk chemical, and mineralogic data are presented for 49 samples of tuffaceous rocks from core holes USW G-1 and UE-25a number-sign 1 at Yucca Mountain, Nevada. Included, in descending stratigraphic order, are 11 samples from the Topopah Spring Member of the Paintbrush Tuff, 12 samples from the Tuffaceous Beds of Calico Hills, 3 samples from the Prow Pass Member of the Crater Flat Tuff, 20 samples from the Bullfrog Member of the Crater Flat Tuff and 3 samples from the Tram Member of the Crater Flat Tuff. The suite of samples contains a wide variety of petrologic types, including zeolitized, glassy, and devitrified tuffs. Data vary considerably between groups of samples, and include thin section descriptions (some with modal analyses for which uncertainties are estimated), electron microprobe analyses of mineral phases and matrix, mineral identifications by X-ray diffraction, and major element analyses with uncertainty estimates
Estimating cross-validatory predictive p-values with integrated importance sampling for disease mapping models.

Science.gov (United States)

Li, Longhai; Feng, Cindy X; Qiu, Shi

2017-06-30

An important statistical task in disease mapping problems is to identify divergent regions with unusually high or low risk of disease. Leave-one-out cross-validatory (LOOCV) model assessment is the gold standard for estimating predictive p-values that can flag such divergent regions. However, actual LOOCV is time-consuming because one needs to rerun a Markov chain Monte Carlo analysis for each posterior distribution in which an observation is held out as a test case. This paper introduces a new method, called integrated importance sampling (iIS), for estimating LOOCV predictive p-values with only Markov chain samples drawn from the posterior based on a full data set. The key step in iIS is that we integrate away the latent variables associated the test observation with respect to their conditional distribution without reference to the actual observation. By following the general theory for importance sampling, the formula used by iIS can be proved to be equivalent to the LOOCV predictive p-value. We compare iIS and other three existing methods in the literature with two disease mapping datasets. Our empirical results show that the predictive p-values estimated with iIS are almost identical to the predictive p-values estimated with actual LOOCV and outperform those given by the existing three methods, namely, the posterior predictive checking, the ordinary importance sampling, and the ghosting method by Marshall and Spiegelhalter (2003). Copyright © 2017 John Wiley & Sons, Ltd. Copyright © 2017 John Wiley & Sons, Ltd.
Molecularly imprinted layer-coated silica nanoparticles for selective solid-phase extraction of bisphenol A from chemical cleansing and cosmetics samples

Energy Technology Data Exchange (ETDEWEB)

Zhu Rong; Zhao Wenhui; Zhai Meijuan; Wei Fangdi; Cai Zheng; Sheng Na [School of Pharmacy, Nanjing Medical University, Hanzhong Road 140, Nanjing, Jiangsu 210029 (China); Hu Qin, E-mail: huqin@njmu.edu.cn [School of Pharmacy, Nanjing Medical University, Hanzhong Road 140, Nanjing, Jiangsu 210029 (China)

2010-01-25

Highly selective molecularly imprinted layer-coated silica nanoparticles for bisphenol A (BPA) were synthesized by molecular imprinting technique with a sol-gel process on the supporter of silica nanoparticles. The BPA-imprinted silica nanoparticles were characterized by fourier transform infrared spectrometer, transmission electron microscope, dynamic adsorption and static adsorption tests. The equilibrium association constant, K{sub a}, and the apparent maximum number of binding sites, Q{sub max}, were estimated to be 1.25 x 10{sup 5} mL {mu}mol{sup -1} and 16.4 {mu}mol g{sup -1}, respectively. The BPA-imprinted silica nanoparticles solid-phase extraction (SPE) column had higher selectivity for BPA than the commercial C18-SPE column. The results of the study indicated that the prepared BPA-imprinted silica nanoparticles exhibited high adsorption capacity and selectivity, and offered a fast kinetics for the rebinding of BPA. The BPA-imprinted silica nanoparticles were successfully used in SPE to selectively enrich and determine BPA from shampoo, bath lotion and cosmetic cream samples.
Density Estimation in Several Populations With Uncertain Population Membership

KAUST Repository

Ma, Yanyuan

2011-09-01

We devise methods to estimate probability density functions of several populations using observations with uncertain population membership, meaning from which population an observation comes is unknown. The probability of an observation being sampled from any given population can be calculated. We develop general estimation procedures and bandwidth selection methods for our setting. We establish large-sample properties and study finite-sample performance using simulation studies. We illustrate our methods with data from a nutrition study.
Improved sampling for airborne surveys to estimate wildlife population parameters in the African Savannah

NARCIS (Netherlands)

Khaemba, W.; Stein, A.

2002-01-01

Parameter estimates, obtained from airborne surveys of wildlife populations, often have large bias and large standard errors. Sampling error is one of the major causes of this imprecision and the occurrence of many animals in herds violates the common assumptions in traditional sampling designs like
Adult health study reference papers. Selection of the sample. Characteristics of the sample

Energy Technology Data Exchange (ETDEWEB)

Beebe, G W; Fujisawa, Hideo; Yamasaki, Mitsuru

1960-12-14

The characteristics and selection of the clinical sample have been described in some detail to provide information on the comparability of the exposure groups with respect to factors excluded from the matching criteria and to provide basic descriptive information potentially relevant to individual studies that may be done within the framework of the Adult Health Study. The characteristics under review here are age, sex, many different aspects of residence, marital status, occupation and industry, details of location and shielding ATB, acute radiation signs and symptoms, and prior ABCC medical or pathology examinations. 5 references, 57 tables.
Effects of LiDAR point density, sampling size and height threshold on estimation accuracy of crop biophysical parameters.

Science.gov (United States)

Luo, Shezhou; Chen, Jing M; Wang, Cheng; Xi, Xiaohuan; Zeng, Hongcheng; Peng, Dailiang; Li, Dong

2016-05-30

Vegetation leaf area index (LAI), height, and aboveground biomass are key biophysical parameters. Corn is an important and globally distributed crop, and reliable estimations of these parameters are essential for corn yield forecasting, health monitoring and ecosystem modeling. Light Detection and Ranging (LiDAR) is considered an effective technology for estimating vegetation biophysical parameters. However, the estimation accuracies of these parameters are affected by multiple factors. In this study, we first estimated corn LAI, height and biomass (R2 = 0.80, 0.874 and 0.838, respectively) using the original LiDAR data (7.32 points/m2), and the results showed that LiDAR data could accurately estimate these biophysical parameters. Second, comprehensive research was conducted on the effects of LiDAR point density, sampling size and height threshold on the estimation accuracy of LAI, height and biomass. Our findings indicated that LiDAR point density had an important effect on the estimation accuracy for vegetation biophysical parameters, however, high point density did not always produce highly accurate estimates, and reduced point density could deliver reasonable estimation results. Furthermore, the results showed that sampling size and height threshold were additional key factors that affect the estimation accuracy of biophysical parameters. Therefore, the optimal sampling size and the height threshold should be determined to improve the estimation accuracy of biophysical parameters. Our results also implied that a higher LiDAR point density, larger sampling size and height threshold were required to obtain accurate corn LAI estimation when compared with height and biomass estimations. In general, our results provide valuable guidance for LiDAR data acquisition and estimation of vegetation biophysical parameters using LiDAR data.
Power and sample-size estimation for microbiome studies using pairwise distances and PERMANOVA.

Science.gov (United States)

Kelly, Brendan J; Gross, Robert; Bittinger, Kyle; Sherrill-Mix, Scott; Lewis, James D; Collman, Ronald G; Bushman, Frederic D; Li, Hongzhe

2015-08-01

The variation in community composition between microbiome samples, termed beta diversity, can be measured by pairwise distance based on either presence-absence or quantitative species abundance data. PERMANOVA, a permutation-based extension of multivariate analysis of variance to a matrix of pairwise distances, partitions within-group and between-group distances to permit assessment of the effect of an exposure or intervention (grouping factor) upon the sampled microbiome. Within-group distance and exposure/intervention effect size must be accurately modeled to estimate statistical power for a microbiome study that will be analyzed with pairwise distances and PERMANOVA. We present a framework for PERMANOVA power estimation tailored to marker-gene microbiome studies that will be analyzed by pairwise distances, which includes: (i) a novel method for distance matrix simulation that permits modeling of within-group pairwise distances according to pre-specified population parameters; (ii) a method to incorporate effects of different sizes within the simulated distance matrix; (iii) a simulation-based method for estimating PERMANOVA power from simulated distance matrices; and (iv) an R statistical software package that implements the above. Matrices of pairwise distances can be efficiently simulated to satisfy the triangle inequality and incorporate group-level effects, which are quantified by the adjusted coefficient of determination, omega-squared (ω2). From simulated distance matrices, available PERMANOVA power or necessary sample size can be estimated for a planned microbiome study. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Estimation of the biserial correlation and its sampling variance for use in meta-analysis.

Science.gov (United States)

Jacobs, Perke; Viechtbauer, Wolfgang

2017-06-01

Meta-analyses are often used to synthesize the findings of studies examining the correlational relationship between two continuous variables. When only dichotomous measurements are available for one of the two variables, the biserial correlation coefficient can be used to estimate the product-moment correlation between the two underlying continuous variables. Unlike the point-biserial correlation coefficient, biserial correlation coefficients can therefore be integrated with product-moment correlation coefficients in the same meta-analysis. The present article describes the estimation of the biserial correlation coefficient for meta-analytic purposes and reports simulation results comparing different methods for estimating the coefficient's sampling variance. The findings indicate that commonly employed methods yield inconsistent estimates of the sampling variance across a broad range of research situations. In contrast, consistent estimates can be obtained using two methods that appear to be unknown in the meta-analytic literature. A variance-stabilizing transformation for the biserial correlation coefficient is described that allows for the construction of confidence intervals for individual coefficients with close to nominal coverage probabilities in most of the examined conditions. Copyright © 2016 John Wiley & Sons, Ltd. Copyright © 2016 John Wiley & Sons, Ltd.
Investigation of Bicycle Travel Time Estimation Using Bluetooth Sensors for Low Sampling Rates

Directory of Open Access Journals (Sweden)

Zhenyu Mei

2014-10-01

Full Text Available Filtering the data for bicycle travel time using Bluetooth sensors is crucial to the estimation of link travel times on a corridor. The current paper describes an adaptive filtering algorithm for estimating bicycle travel times using Bluetooth data, with consideration of low sampling rates. The data for bicycle travel time using Bluetooth sensors has two characteristics. First, the bicycle flow contains stable and unstable conditions. Second, the collected data have low sampling rates (less than 1%. To avoid erroneous inference, filters are introduced to “purify” multiple time series. The valid data are identified within a dynamically varying validity window with the use of a robust data-filtering procedure. The size of the validity window varies based on the number of preceding sampling intervals without a Bluetooth record. Applications of the proposed algorithm to the dataset from Genshan East Road and Moganshan Road in Hangzhou demonstrate its ability to track typical variations in bicycle travel time efficiently, while suppressing high frequency noise signals.
How can we estimate natural selection on endocrine traits? Lessons from evolutionary biology.

Science.gov (United States)

Bonier, Frances; Martin, Paul R

2016-11-30

An evolutionary perspective can enrich almost any endeavour in biology, providing a deeper understanding of the variation we see in nature. To this end, evolutionary endocrinologists seek to describe the fitness consequences of variation in endocrine traits. Much of the recent work in our field, however, follows a flawed approach to the study of how selection shapes endocrine traits. Briefly, this approach relies on among-individual correlations between endocrine phenotypes (often circulating hormone levels) and fitness metrics to estimate selection on those endocrine traits. Adaptive plasticity in both endocrine and fitness-related traits can drive these correlations, generating patterns that do not accurately reflect natural selection. We illustrate why this approach to studying selection on endocrine traits is problematic, referring to work from evolutionary biologists who, decades ago, described this problem as it relates to a variety of other plastic traits. We extend these arguments to evolutionary endocrinology, where the likelihood that this flaw generates bias in estimates of selection is unusually high due to the exceptional responsiveness of hormones to environmental conditions, and their function to induce adaptive life-history responses to environmental variation. We end with a review of productive approaches for investigating the fitness consequences of variation in endocrine traits that we expect will generate exciting advances in our understanding of endocrine system evolution. © 2016 The Author(s).
Acceptance sampling using judgmental and randomly selected samples

Energy Technology Data Exchange (ETDEWEB)

Sego, Landon H.; Shulman, Stanley A.; Anderson, Kevin K.; Wilson, John E.; Pulsipher, Brent A.; Sieber, W. Karl

2010-09-01

We present a Bayesian model for acceptance sampling where the population consists of two groups, each with different levels of risk of containing unacceptable items. Expert opinion, or judgment, may be required to distinguish between the high and low-risk groups. Hence, high-risk items are likely to be identifed (and sampled) using expert judgment, while the remaining low-risk items are sampled randomly. We focus on the situation where all observed samples must be acceptable. Consequently, the objective of the statistical inference is to quantify the probability that a large percentage of the unsampled items in the population are also acceptable. We demonstrate that traditional (frequentist) acceptance sampling and simpler Bayesian formulations of the problem are essentially special cases of the proposed model. We explore the properties of the model in detail, and discuss the conditions necessary to ensure that required samples sizes are non-decreasing function of the population size. The method is applicable to a variety of acceptance sampling problems, and, in particular, to environmental sampling where the objective is to demonstrate the safety of reoccupying a remediated facility that has been contaminated with a lethal agent.
How many dinosaur species were there? Fossil bias and true richness estimated using a Poisson sampling model.

Science.gov (United States)

Starrfelt, Jostein; Liow, Lee Hsiang

2016-04-05

The fossil record is a rich source of information about biological diversity in the past. However, the fossil record is not only incomplete but has also inherent biases due to geological, physical, chemical and biological factors. Our knowledge of past life is also biased because of differences in academic and amateur interests and sampling efforts. As a result, not all individuals or species that lived in the past are equally likely to be discovered at any point in time or space. To reconstruct temporal dynamics of diversity using the fossil record, biased sampling must be explicitly taken into account. Here, we introduce an approach that uses the variation in the number of times each species is observed in the fossil record to estimate both sampling bias and true richness. We term our technique TRiPS (True Richness estimated using a Poisson Sampling model) and explore its robustness to violation of its assumptions via simulations. We then venture to estimate sampling bias and absolute species richness of dinosaurs in the geological stages of the Mesozoic. Using TRiPS, we estimate that 1936 (1543-2468) species of dinosaurs roamed the Earth during the Mesozoic. We also present improved estimates of species richness trajectories of the three major dinosaur clades: the sauropodomorphs, ornithischians and theropods, casting doubt on the Jurassic-Cretaceous extinction event and demonstrating that all dinosaur groups are subject to considerable sampling bias throughout the Mesozoic. © 2016 The Authors.
Remote Sensing Based Two-Stage Sampling for Accuracy Assessment and Area Estimation of Land Cover Changes

Directory of Open Access Journals (Sweden)

Heinz Gallaun

2015-09-01

Full Text Available Land cover change processes are accelerating at the regional to global level. The remote sensing community has developed reliable and robust methods for wall-to-wall mapping of land cover changes; however, land cover changes often occur at rates below the mapping errors. In the current publication, we propose a cost-effective approach to complement wall-to-wall land cover change maps with a sampling approach, which is used for accuracy assessment and accurate estimation of areas undergoing land cover changes, including provision of confidence intervals. We propose a two-stage sampling approach in order to keep accuracy, efficiency, and effort of the estimations in balance. Stratification is applied in both stages in order to gain control over the sample size allocated to rare land cover change classes on the one hand and the cost constraints for very high resolution reference imagery on the other. Bootstrapping is used to complement the accuracy measures and the area estimates with confidence intervals. The area estimates and verification estimations rely on a high quality visual interpretation of the sampling units based on time series of satellite imagery. To demonstrate the cost-effective operational applicability of the approach we applied it for assessment of deforestation in an area characterized by frequent cloud cover and very low change rate in the Republic of Congo, which makes accurate deforestation monitoring particularly challenging.
Estimation of technetium 99m mercaptoacetyltriglycine plasma clearance by use of one single plasma sample

International Nuclear Information System (INIS)

Mueller-Suur, R.; Magnusson, G.; Karolinska Inst., Stockholm; Bois-Svensson, I.; Jansson, B.

1991-01-01

Recent studies have shown that technetium 99m mercaptoacetyltriglycine (MAG-3) is a suitable replacement for iodine 131 or 123 hippurate in gamma-camera renography. Also, the determination of its clearance is of value, since it correlates well with that of hippurate and thus may be an indirect measure of renal plasma flow. In order to simplify the clearance method we developed formulas for the estimation of plasma clearance of MAG-3 based on a single plasma sample and compared them with the multiple sample method based on 7 plasma samples. The correlation to effective renal plasma flow (ERPF) (according to Tauxe's method, using iodine 123 hippurate), which ranged from 75 to 654 ml/min per 1.73 m 2 , was determined in these patients. Using the developed regression equations the error of estimate for the simplified clearance method was acceptably low (18-14 ml/min), when the single plasma sample was taken 44-64 min post-injection. Formulas for different sampling times at 44, 48, 52, 56, 60 and 64 min are given, and we recommend 60 min as optimal, with an error of estimate of 15.5 ml/min. The correlation between the MAG-3 clearances and ERPF was high (r=0.90). Since normal values for MAG-3 clearance are not yet available, transformation to estimated ERPF values by the regression equation (ERPF=1.86xC MAG-3 +4.6) could be of clinical value in order to compare it with the normal values for ERPF given in the literature. (orig.)
An expert system for estimating production rates and costs for hardwood group-selection harvests

Science.gov (United States)

Chris B. LeDoux; B. Gopalakrishnan; R. S. Pabba

2003-01-01

As forest managers shift their focus from stands to entire ecosystems alternative harvesting methods such as group selection are being used increasingly. Results of several field time and motion studies and simulation runs were incorporated into an expert system for estimating production rates and costs associated with harvests of group-selection units of various size...
Automatic smoothing parameter selection in GAMLSS with an application to centile estimation.

Science.gov (United States)

Rigby, Robert A; Stasinopoulos, Dimitrios M

2014-08-01

A method for automatic selection of the smoothing parameters in a generalised additive model for location, scale and shape (GAMLSS) model is introduced. The method uses a P-spline representation of the smoothing terms to express them as random effect terms with an internal (or local) maximum likelihood estimation on the predictor scale of each distribution parameter to estimate its smoothing parameters. This provides a fast method for estimating multiple smoothing parameters. The method is applied to centile estimation where all four parameters of a distribution for the response variable are modelled as smooth functions of a transformed explanatory variable x This allows smooth modelling of the location, scale, skewness and kurtosis parameters of the response variable distribution as functions of x. © The Author(s) 2013 Reprints and permissions: sagepub.co.uk/journalsPermissions.nav.
Are We Under-Estimating the Association between Autism Symptoms?: The Importance of Considering Simultaneous Selection When Using Samples of Individuals Who Meet Diagnostic Criteria for an Autism Spectrum Disorder

Science.gov (United States)

Murray, Aja Louise; McKenzie, Karen; Kuenssberg, Renate; O'Donnell, Michael

2014-01-01

The magnitude of symptom inter-correlations in diagnosed individuals has contributed to the evidence that autism spectrum disorders (ASD) is a fractionable disorder. Such correlations may substantially under-estimate the population correlations among symptoms due to simultaneous selection on the areas of deficit required for diagnosis. Using…
Differences in Movement Pattern and Detectability between Males and Females Influence How Common Sampling Methods Estimate Sex Ratio.

Directory of Open Access Journals (Sweden)

João Fabrício Mota Rodrigues

Full Text Available Sampling the biodiversity is an essential step for conservation, and understanding the efficiency of sampling methods allows us to estimate the quality of our biodiversity data. Sex ratio is an important population characteristic, but until now, no study has evaluated how efficient are the sampling methods commonly used in biodiversity surveys in estimating the sex ratio of populations. We used a virtual ecologist approach to investigate whether active and passive capture methods are able to accurately sample a population's sex ratio and whether differences in movement pattern and detectability between males and females produce biased estimates of sex-ratios when using these methods. Our simulation allowed the recognition of individuals, similar to mark-recapture studies. We found that differences in both movement patterns and detectability between males and females produce biased estimates of sex ratios. However, increasing the sampling effort or the number of sampling days improves the ability of passive or active capture methods to properly sample sex ratio. Thus, prior knowledge regarding movement patterns and detectability for species is important information to guide field studies aiming to understand sex ratio related patterns.
Differences in Movement Pattern and Detectability between Males and Females Influence How Common Sampling Methods Estimate Sex Ratio.

Science.gov (United States)

Rodrigues, João Fabrício Mota; Coelho, Marco Túlio Pacheco

2016-01-01

Sampling the biodiversity is an essential step for conservation, and understanding the efficiency of sampling methods allows us to estimate the quality of our biodiversity data. Sex ratio is an important population characteristic, but until now, no study has evaluated how efficient are the sampling methods commonly used in biodiversity surveys in estimating the sex ratio of populations. We used a virtual ecologist approach to investigate whether active and passive capture methods are able to accurately sample a population's sex ratio and whether differences in movement pattern and detectability between males and females produce biased estimates of sex-ratios when using these methods. Our simulation allowed the recognition of individuals, similar to mark-recapture studies. We found that differences in both movement patterns and detectability between males and females produce biased estimates of sex ratios. However, increasing the sampling effort or the number of sampling days improves the ability of passive or active capture methods to properly sample sex ratio. Thus, prior knowledge regarding movement patterns and detectability for species is important information to guide field studies aiming to understand sex ratio related patterns.

Statistical sampling strategies for survey of soil contamination

NARCIS (Netherlands)

Brus, D.J.

2011-01-01

This chapter reviews methods for selecting sampling locations in contaminated soils for three situations. In the first situation a global estimate of the soil contamination in an area is required. The result of the surey is a number or a series of numbers per contaminant, e.g. the estimated mean
Methods for estimating concentrations and loads of selected constituents in tributaries to Lake Houston near Houston, Texas

Science.gov (United States)

Lee, Michael T.

2012-01-01

Since December 2005, the U.S. Geological Survey, in cooperation with the City of Houston, Texas, has been assessing the quality of the water flowing into Lake Houston. Continuous in-stream water-quality monitors measured streamflow and other physical water quality properties at stations in Spring Creek near Spring, Tex., and East Fork San Jacinto River near New Caney, Tex. Additionally, discrete water-quality samples were periodically collected on these tributaries and analyzed for selected constituents of concern. Data from the discrete water-quality samples collected during 2005-9, in conjunction with the real-time streamflow data and data from the continuous in-stream water-quality monitors, provided the basis for developing regression equations for the estimation of concentrations of water-quality constituents of these source watersheds to Lake Houston. The output of the regression equations are available through the interactive National Real-Time Water Quality Web site (http://nrtwq.usgs.gov).
Variance estimation for generalized Cavalieri estimators

OpenAIRE

Johanna Ziegel; Eva B. Vedel Jensen; Karl-Anton Dorph-Petersen

2011-01-01

The precision of stereological estimators based on systematic sampling is of great practical importance. This paper presents methods of data-based variance estimation for generalized Cavalieri estimators where errors in sampling positions may occur. Variance estimators are derived under perturbed systematic sampling, systematic sampling with cumulative errors and systematic sampling with random dropouts. Copyright 2011, Oxford University Press.
Bayesian stratified sampling to assess corpus utility

Energy Technology Data Exchange (ETDEWEB)

Hochberg, J.; Scovel, C.; Thomas, T.; Hall, S.

1998-12-01

This paper describes a method for asking statistical questions about a large text corpus. The authors exemplify the method by addressing the question, ``What percentage of Federal Register documents are real documents, of possible interest to a text researcher or analyst?`` They estimate an answer to this question by evaluating 200 documents selected from a corpus of 45,820 Federal Register documents. Bayesian analysis and stratified sampling are used to reduce the sampling uncertainty of the estimate from over 3,100 documents to fewer than 1,000. A possible application of the method is to establish baseline statistics used to estimate recall rates for information retrieval systems.
An optimized Line Sampling method for the estimation of the failure probability of nuclear passive systems

International Nuclear Information System (INIS)

Zio, E.; Pedroni, N.

2010-01-01

The quantitative reliability assessment of a thermal-hydraulic (T-H) passive safety system of a nuclear power plant can be obtained by (i) Monte Carlo (MC) sampling the uncertainties of the system model and parameters, (ii) computing, for each sample, the system response by a mechanistic T-H code and (iii) comparing the system response with pre-established safety thresholds, which define the success or failure of the safety function. The computational effort involved can be prohibitive because of the large number of (typically long) T-H code simulations that must be performed (one for each sample) for the statistical estimation of the probability of success or failure. In this work, Line Sampling (LS) is adopted for efficient MC sampling. In the LS method, an 'important direction' pointing towards the failure domain of interest is determined and a number of conditional one-dimensional problems are solved along such direction; this allows for a significant reduction of the variance of the failure probability estimator, with respect, for example, to standard random sampling. Two issues are still open with respect to LS: first, the method relies on the determination of the 'important direction', which requires additional runs of the T-H code; second, although the method has been shown to improve the computational efficiency by reducing the variance of the failure probability estimator, no evidence has been given yet that accurate and precise failure probability estimates can be obtained with a number of samples reduced to below a few hundreds, which may be required in case of long-running models. The work presented in this paper addresses the first issue by (i) quantitatively comparing the efficiency of the methods proposed in the literature to determine the LS important direction; (ii) employing artificial neural network (ANN) regression models as fast-running surrogates of the original, long-running T-H code to reduce the computational cost associated to the
Uncertainty Estimation of Neutron Activation Analysis in Zinc Elemental Determination in Food Samples

International Nuclear Information System (INIS)

Endah Damastuti; Muhayatun; Diah Dwiana L

2009-01-01

Beside to complished the requirements of international standard of ISO/IEC 17025:2005, uncertainty estimation should be done to increase quality and confidence of analysis results and also to establish traceability of the analysis results to SI unit. Neutron activation analysis is a major technique used by Radiometry technique analysis laboratory and is included as scope of accreditation under ISO/IEC 17025:2005, therefore uncertainty estimation of neutron activation analysis is needed to be carried out. Sample and standard preparation as well as, irradiation and measurement using gamma spectrometry were the main activities which could give contribution to uncertainty. The components of uncertainty sources were specifically explained. The result of expanded uncertainty was 4,0 mg/kg with level of confidence 95% (coverage factor=2) and Zn concentration was 25,1 mg/kg. Counting statistic of cuplikan and standard were the major contribution of combined uncertainty. The uncertainty estimation was expected to increase the quality of the analysis results and could be applied further to other kind of samples. (author)
A novel method of selective removal of human DNA improves PCR sensitivity for detection of Salmonella Typhi in blood samples.

Science.gov (United States)

Zhou, Liqing; Pollard, Andrew J

2012-07-27

Enteric fever is a major public health problem, causing an estimated 21million new cases and 216,000 or more deaths every year. Current diagnosis of the disease is inadequate. Blood culture only identifies 45 to 70% of the cases and is time-consuming. Serological tests have very low sensitivity and specificity. Clinical samples obtained for diagnosis of enteric fever in the field generally have blood, so that even PCR-based methods, widely used for detection of other infectious diseases, are not a straightforward option in typhoid diagnosis. We developed a novel method to enrich target bacterial DNA by selective removal of human DNA from blood samples, enhancing the sensitivity of PCR tests. This method offers the possibility of improving PCR assays directly using clinical specimens for diagnosis of this globally important infectious disease. Blood samples were mixed with ox bile for selective lysis of human blood cells and the released human DNA was then digested with addition of bile resistant micrococcal nuclease. The intact Salmonella Typhi bacteria were collected from the specimen by centrifugation and the DNA extracted with QIAamp DNA mini kit. The presence of Salmonella Typhi bacteria in blood samples was detected by PCR with the fliC-d gene of Salmonella Typhi as the target. Micrococcal nuclease retained activity against human blood DNA in the presence of up to 9% ox bile. Background human DNA was dramatically removed from blood samples through the use of ox bile lysis and micrococcal nuclease for removal of mammalian DNA. Consequently target Salmonella Typhi DNA was enriched in DNA preparations and the PCR sensitivity for detection of Salmonella Typhi in spiked blood samples was enhanced by 1,000 fold. Use of a combination of selective ox-bile blood cell lysis and removal of human DNA with micrococcal nuclease significantly improves PCR sensitivity and offers a better option for improved typhoid PCR assays directly using clinical specimens in diagnosis of
Some remarks on estimating a covariance structure model from a sample correlation matrix

OpenAIRE

Maydeu Olivares, Alberto; Hernández Estrada, Adolfo

2000-01-01

A popular model in structural equation modeling involves a multivariate normal density with a structured covariance matrix that has been categorized according to a set of thresholds. In this setup one may estimate the covariance structure parameters from the sample tetrachoricl polychoric correlations but only if the covariance structure is scale invariant. Doing so when the covariance structure is not scale invariant results in estimating a more restricted covariance structure than the one i...
New competitive dendrimer-based and highly selective immunosensor for determination of atrazine in environmental, feed and food samples: the importance of antibody selectivity for discrimination among related triazinic metabolites.

Science.gov (United States)

Giannetto, Marco; Umiltà, Eleonora; Careri, Maria

2014-01-02

A new voltammetric competitive immunosensor selective for atrazine, based on the immobilization of a conjugate atrazine-bovine serum albumine on a nanostructured gold substrate previously functionalized with poliamidoaminic dendrimers, was realized, characterized, and validated in different real samples of environmental and food concern. Response of the sensor was reliable, highly selective and suitable for the detection and quantification of atrazine at trace levels in complex matrices such as territorial waters, corn-cultivated soils, corn-containing poultry and bovine feeds and corn flakes for human use. Selectivity studies were focused on desethylatrazine, the principal metabolite generated by long-term microbiological degradation of atrazine, terbutylazine-2-hydroxy and simazine as potential interferents. The response of the developed immunosensor for atrazine was explored over the 10(-2)-10(3) ng mL(-1) range. Good sensitivity was proved, as limit of detection and limit of quantitation of 1.2 and 5 ng mL(-1), respectively, were estimated for atrazine. RSD values <5% over the entire explored range attested a good precision of the device. Copyright © 2013 Elsevier B.V. All rights reserved.
Correcting for Systematic Bias in Sample Estimates of Population Variances: Why Do We Divide by n-1?

Science.gov (United States)

Mittag, Kathleen Cage

An important topic presented in introductory statistics courses is the estimation of population parameters using samples. Students learn that when estimating population variances using sample data, we always get an underestimate of the population variance if we divide by n rather than n-1. One implication of this correction is that the degree of…
The Gender Wage Gap and Sample Selection via Risk Attitudes

OpenAIRE

Jung , Seeun

2014-01-01

This paper investigates a new way to estimate the gender wage gap with the introduction of individual risk attitudes using representative Korean data. We es- timate the wage gap with correction for the selection bias, which latter results in the overestimation of this wage gap. Female workers are more risk averse. They hence prefer working in the public sector, where wages are generally lower than in the private sector. It goes on to explain the reduced gender wage gap by develop- ing an appr...
Genotyping faecal samples of Bengal tiger Panthera tigris tigris for population estimation: A pilot study

Directory of Open Access Journals (Sweden)

Singh Lalji

2006-10-01

Full Text Available Abstract Background Bengal tiger Panthera tigris tigris the National Animal of India, is an endangered species. Estimating populations for such species is the main objective for designing conservation measures and for evaluating those that are already in place. Due to the tiger's cryptic and secretive behaviour, it is not possible to enumerate and monitor its populations through direct observations; instead indirect methods have always been used for studying tigers in the wild. DNA methods based on non-invasive sampling have not been attempted so far for tiger population studies in India. We describe here a pilot study using DNA extracted from faecal samples of tigers for the purpose of population estimation. Results In this study, PCR primers were developed based on tiger-specific variations in the mitochondrial cytochrome b for reliably identifying tiger faecal samples from those of sympatric carnivores. Microsatellite markers were developed for the identification of individual tigers with a sibling Probability of Identity of 0.005 that can distinguish even closely related individuals with 99.9% certainty. The effectiveness of using field-collected tiger faecal samples for DNA analysis was evaluated by sampling, identification and subsequently genotyping samples from two protected areas in southern India. Conclusion Our results demonstrate the feasibility of using tiger faecal matter as a potential source of DNA for population estimation of tigers in protected areas in India in addition to the methods currently in use.
Application of marker selection to enhance estimation of genetic effects and gene interaction in cattle

Science.gov (United States)

Selection on important genetic markers can improve estimates of additive and dominance association effects. A composite population of beef cattle was selected for intermediate frequencies of myostatin (GDF8) F94L and µ-calpain (CAPN1) polymorphisms. Important additive associations of the GDF8 locu...
Comparison of sampling designs for estimating deforestation from landsat TM and MODIS imagery: a case study in Mato Grosso, Brazil.

Science.gov (United States)

Zhu, Shanyou; Zhang, Hailong; Liu, Ronggao; Cao, Yun; Zhang, Guixin

2014-01-01

Sampling designs are commonly used to estimate deforestation over large areas, but comparisons between different sampling strategies are required. Using PRODES deforestation data as a reference, deforestation in the state of Mato Grosso in Brazil from 2005 to 2006 is evaluated using Landsat imagery and a nearly synchronous MODIS dataset. The MODIS-derived deforestation is used to assist in sampling and extrapolation. Three sampling designs are compared according to the estimated deforestation of the entire study area based on simple extrapolation and linear regression models. The results show that stratified sampling for strata construction and sample allocation using the MODIS-derived deforestation hotspots provided more precise estimations than simple random and systematic sampling. Moreover, the relationship between the MODIS-derived and TM-derived deforestation provides a precise estimate of the total deforestation area as well as the distribution of deforestation in each block.
Estimating sample size for landscape-scale mark-recapture studies of North American migratory tree bats

Science.gov (United States)

Ellison, Laura E.; Lukacs, Paul M.

2014-01-01

Concern for migratory tree-roosting bats in North America has grown because of possible population declines from wind energy development. This concern has driven interest in estimating population-level changes. Mark-recapture methodology is one possible analytical framework for assessing bat population changes, but sample size requirements to produce reliable estimates have not been estimated. To illustrate the sample sizes necessary for a mark-recapture-based monitoring program we conducted power analyses using a statistical model that allows reencounters of live and dead marked individuals. We ran 1,000 simulations for each of five broad sample size categories in a Burnham joint model, and then compared the proportion of simulations in which 95% confidence intervals overlapped between and among years for a 4-year study. Additionally, we conducted sensitivity analyses of sample size to various capture probabilities and recovery probabilities. More than 50,000 individuals per year would need to be captured and released to accurately determine 10% and 15% declines in annual survival. To detect more dramatic declines of 33% or 50% survival over four years, then sample sizes of 25,000 or 10,000 per year, respectively, would be sufficient. Sensitivity analyses reveal that increasing recovery of dead marked individuals may be more valuable than increasing capture probability of marked individuals. Because of the extraordinary effort that would be required, we advise caution should such a mark-recapture effort be initiated because of the difficulty in attaining reliable estimates. We make recommendations for what techniques show the most promise for mark-recapture studies of bats because some techniques violate the assumptions of mark-recapture methodology when used to mark bats.
Sex Estimation From Modern American Humeri and Femora, Accounting for Sample Variance Structure

DEFF Research Database (Denmark)

Boldsen, J. L.; Milner, G. R.; Boldsen, S. K.

2015-01-01

several decades. Results: For measurements individually and collectively, the probabilities of being one sex or the other were generated for samples with an equal distribution of males and females, taking into account the variance structure of the original measurements. The combination providing the best......Objectives: A new procedure for skeletal sex estimation based on humeral and femoral dimensions is presented, based on skeletons from the United States. The approach specifically addresses the problem that arises from a lack of variance homogeneity between the sexes, taking into account prior...... information about the sample's sex ratio, if known. Material and methods: Three measurements useful for estimating the sex of adult skeletons, the humeral and femoral head diameters and the humeral epicondylar breadth, were collected from 258 Americans born between 1893 and 1980 who died within the past...
Sampling procedure in a willow plantation for estimation of moisture content

DEFF Research Database (Denmark)

Nielsen, Henrik Kofoed; Lærke, Poul Erik; Liu, Na

2015-01-01

Heating value and fuel quality of wood is closely connected to moisture content. In this work the variation of moisture content (MC) of short rotation coppice (SRC) willow shoots is described for five clones during one harvesting season. Subsequently an appropriate sampling procedure minimising...... labour costs and sampling uncertainty is proposed, where the MC of a single stem section with the length of 10–50 cm corresponds to the mean shoot moisture content (MSMC) with a bias of maximum 11 g kg−1. This bias can be reduced by selecting the stem section according to the particular clone...
Bomb survivor selection and consequences for estimates of population cancer risks

International Nuclear Information System (INIS)

Little, M.P.; Charles, M.W.

1990-01-01

Health records of the Japanese bomb survivor population [with the 1965 (T65D) and 1986 (DS86) dosimetry systems] have been analyzed and some evidence found for the selection effect hypothesized by Stewart and Kneale. This is found to be significant in only the first of the periods examined (1950-1958), and the effect diminishes in magnitude thereafter. There are indications that the effect might be an artifact of the T65D dosimetry, in which it is observed more strongly than in the DS86 data. There is no evidence to suggest that selection on this basis might confer correspondingly reduced susceptibility to radiation-induced cancer. If, however, one makes this assumption, as suggested by Stewart and Kneale, then current estimates of population cancer risks might need to be inflated by between 5% and 35% (for excess cancer deaths, Gy-1) or between 8% and 40% (for years of life lost, Gy-1) to account for this. It is likely that these figures, even assuming them not to be simply an artifact of the T65D dosimetry, overestimate the degree of adjustment required to the risk estimates
Detailed RIF decomposition with selection : the gender pay gap in Italy

OpenAIRE

Töpfer, Marina

2017-01-01

In this paper, we estimate the gender pay gap along the wage distribution using a detailed decomposition approach based on unconditional quantile regressions. Non-randomness of the sample leads to biased and inconsistent estimates of the wage equation as well as of the components of the wage gap. Therefore, the method is extended to account for sample selection problems. The decomposition is conducted by using Italian microdata. Accounting for labor market selection may be particularly rele...
Matrix algebra and sampling theory : The case of the Horvitz-Thompson estimator

NARCIS (Netherlands)

Dol, W.; Steerneman, A.G.M.; Wansbeek, T.J.

Matrix algebra is a tool not commonly employed in sampling theory. The intention of this paper is to help change this situation by showing, in the context of the Horvitz-Thompson (HT) estimator, the convenience of the use of a number of matrix-algebra results. Sufficient conditions for the

Evaluation of Stress Loaded Steel Samples Using Selected Electromagnetic Methods

International Nuclear Information System (INIS)

Chady, T.

2004-01-01

In this paper the magnetic leakage flux and eddy current method were used to evaluate changes of materials' properties caused by stress. Seven samples made of ferromagnetic material with different level of applied stress were prepared. First, the leakage magnetic fields were measured by scanning the surface of the specimens with GMR gradiometer. Next, the same samples were evaluated using an eddy current sensor. A comparison between results obtained from both methods was carried out. Finally, selected parameters of the measured signal were calculated and utilized to evaluate level of the applied stress. A strong coincidence between amount of the applied stress and the maximum amplitude of the derivative was confirmed
Random Walks on Directed Networks: Inference and Respondent-Driven Sampling

Directory of Open Access Journals (Sweden)

Malmros Jens

2016-06-01

Full Text Available Respondent-driven sampling (RDS is often used to estimate population properties (e.g., sexual risk behavior in hard-to-reach populations. In RDS, already sampled individuals recruit population members to the sample from their social contacts in an efficient snowball-like sampling procedure. By assuming a Markov model for the recruitment of individuals, asymptotically unbiased estimates of population characteristics can be obtained. Current RDS estimation methodology assumes that the social network is undirected, that is, all edges are reciprocal. However, empirical social networks in general also include a substantial number of nonreciprocal edges. In this article, we develop an estimation method for RDS in populations connected by social networks that include reciprocal and nonreciprocal edges. We derive estimators of the selection probabilities of individuals as a function of the number of outgoing edges of sampled individuals. The proposed estimators are evaluated on artificial and empirical networks and are shown to generally perform better than existing estimators. This is the case in particular when the fraction of directed edges in the network is large.
Long-term response to genomic selection: effects of estimation method and reference population structure for different genetic architectures.

Science.gov (United States)

Bastiaansen, John W M; Coster, Albart; Calus, Mario P L; van Arendonk, Johan A M; Bovenhuis, Henk

2012-01-24

Genomic selection has become an important tool in the genetic improvement of animals and plants. The objective of this study was to investigate the impacts of breeding value estimation method, reference population structure, and trait genetic architecture, on long-term response to genomic selection without updating marker effects. Three methods were used to estimate genomic breeding values: a BLUP method with relationships estimated from genome-wide markers (GBLUP), a Bayesian method, and a partial least squares regression method (PLSR). A shallow (individuals from one generation) or deep reference population (individuals from five generations) was used with each method. The effects of the different selection approaches were compared under four different genetic architectures for the trait under selection. Selection was based on one of the three genomic breeding values, on pedigree BLUP breeding values, or performed at random. Selection continued for ten generations. Differences in long-term selection response were small. For a genetic architecture with a very small number of three to four quantitative trait loci (QTL), the Bayesian method achieved a response that was 0.05 to 0.1 genetic standard deviation higher than other methods in generation 10. For genetic architectures with approximately 30 to 300 QTL, PLSR (shallow reference) or GBLUP (deep reference) had an average advantage of 0.2 genetic standard deviation over the Bayesian method in generation 10. GBLUP resulted in 0.6% and 0.9% less inbreeding than PLSR and BM and on average a one third smaller reduction of genetic variance. Responses in early generations were greater with the shallow reference population while long-term response was not affected by reference population structure. The ranking of estimation methods was different with than without selection. Under selection, applying GBLUP led to lower inbreeding and a smaller reduction of genetic variance while a similar response to selection was
Generalized Likelihood Uncertainty Estimation (GLUE) Using Multi-Optimization Algorithm as Sampling Method

Science.gov (United States)

Wang, Z.

2015-12-01

For decades, distributed and lumped hydrological models have furthered our understanding of hydrological system. The development of hydrological simulation in large scale and high precision elaborated the spatial descriptions and hydrological behaviors. Meanwhile, the new trend is also followed by the increment of model complexity and number of parameters, which brings new challenges of uncertainty quantification. Generalized Likelihood Uncertainty Estimation (GLUE) has been widely used in uncertainty analysis for hydrological models referring to Monte Carlo method coupled with Bayesian estimation. However, the stochastic sampling method of prior parameters adopted by GLUE appears inefficient, especially in high dimensional parameter space. The heuristic optimization algorithms utilizing iterative evolution show better convergence speed and optimality-searching performance. In light of the features of heuristic optimization algorithms, this study adopted genetic algorithm, differential evolution, shuffled complex evolving algorithm to search the parameter space and obtain the parameter sets of large likelihoods. Based on the multi-algorithm sampling, hydrological model uncertainty analysis is conducted by the typical GLUE framework. To demonstrate the superiority of the new method, two hydrological models of different complexity are examined. The results shows the adaptive method tends to be efficient in sampling and effective in uncertainty analysis, providing an alternative path for uncertainty quantilization.
Magnetically separable polymer (Mag-MIP) for selective analysis of biotin in food samples.

Science.gov (United States)

Uzuriaga-Sánchez, Rosario Josefina; Khan, Sabir; Wong, Ademar; Picasso, Gino; Pividori, Maria Isabel; Sotomayor, Maria Del Pilar Taboada

2016-01-01

This work presents an efficient method for the preparation of magnetic nanoparticles modified with molecularly imprinted polymers (Mag-MIP) through core-shell method for the determination of biotin in milk food samples. The functional monomer acrylic acid was selected from molecular modeling, EGDMA was used as cross-linking monomer and AIBN as radical initiator. The Mag-MIP and Mag-NIP were characterized by FTIR, magnetic hysteresis, XRD, SEM and N2-sorption measurements. The capacity of Mag-MIP for biotin adsorption, its kinetics and selectivity were studied in detail. The adsorption data was well described by Freundlich isotherm model with adsorption equilibrium constant (KF) of 1.46 mL g(-1). The selectivity experiments revealed that prepared Mag-MIP had higher selectivity toward biotin compared to other molecules with different chemical structure. The material was successfully applied for the determination of biotin in diverse milk samples using HPLC for quantification of the analyte, obtaining the mean value of 87.4% recovery. Copyright © 2015 Elsevier Ltd. All rights reserved.
Probability Sampling - A Guideline for Quantitative Health Care ...

African Journals Online (AJOL)

A more direct definition is the method used for selecting a given ... description of the chosen population, the sampling procedure giving ... target population, precision, and stratification. The ... survey estimates, it is recommended that researchers first analyze a .... The optimum sample size has a relation to the type of planned ...
Comparison of Sampling Designs for Estimating Deforestation from Landsat TM and MODIS Imagery: A Case Study in Mato Grosso, Brazil

Directory of Open Access Journals (Sweden)

Shanyou Zhu

2014-01-01

Full Text Available Sampling designs are commonly used to estimate deforestation over large areas, but comparisons between different sampling strategies are required. Using PRODES deforestation data as a reference, deforestation in the state of Mato Grosso in Brazil from 2005 to 2006 is evaluated using Landsat imagery and a nearly synchronous MODIS dataset. The MODIS-derived deforestation is used to assist in sampling and extrapolation. Three sampling designs are compared according to the estimated deforestation of the entire study area based on simple extrapolation and linear regression models. The results show that stratified sampling for strata construction and sample allocation using the MODIS-derived deforestation hotspots provided more precise estimations than simple random and systematic sampling. Moreover, the relationship between the MODIS-derived and TM-derived deforestation provides a precise estimate of the total deforestation area as well as the distribution of deforestation in each block.
Selective Segmentation for Global Optimization of Depth Estimation in Complex Scenes

Directory of Open Access Journals (Sweden)

Sheng Liu

2013-01-01

Full Text Available This paper proposes a segmentation-based global optimization method for depth estimation. Firstly, for obtaining accurate matching cost, the original local stereo matching approach based on self-adapting matching window is integrated with two matching cost optimization strategies aiming at handling both borders and occlusion regions. Secondly, we employ a comprehensive smooth term to satisfy diverse smoothness request in real scene. Thirdly, a selective segmentation term is used for enforcing the plane trend constraints selectively on the corresponding segments to further improve the accuracy of depth results from object level. Experiments on the Middlebury image pairs show that the proposed global optimization approach is considerably competitive with other state-of-the-art matching approaches.
Selection of Sampling Pumps Used for Groundwater Monitoring at the Hanford Site

Energy Technology Data Exchange (ETDEWEB)

Schalla, Ronald; Webber, William D.; Smith, Ronald M.

2001-11-05

The variable frequency drive centrifugal submersible pump, Redi-Flo2a made by Grundfosa, was selected for universal application for Hanford Site groundwater monitoring. Specifications for the selected pump and five other pumps were evaluated against current and future Hanford groundwater monitoring performance requirements, and the Redi-Flo2 was selected as the most versatile and applicable for the range of monitoring conditions. The Redi-Flo2 pump distinguished itself from the other pumps considered because of its wide range in output flow rate and its comparatively moderate maintenance and low capital costs. The Redi-Flo2 pump is able to purge a well at a high flow rate and then supply water for sampling at a low flow rate. Groundwater sampling using a low-volume-purging technique (e.g., low flow, minimal purge, no purge, or micropurgea) is planned in the future, eliminating the need for the pump to supply a high-output flow rate. Under those conditions, the Well Wizard bladder pump, manufactured by QED Environmental Systems, Inc., may be the preferred pump because of the lower capital cost.
Passive sampling of selected endocrine disrupting compounds using polar organic chemical integrative samplers

International Nuclear Information System (INIS)

Arditsoglou, Anastasia; Voutsa, Dimitra

2008-01-01

Two types of polar organic chemical integrative samplers (pharmaceutical POCIS and pesticide POCIS) were examined for their sampling efficiency of selected endocrine disrupting compounds (EDCs). Laboratory-based calibration of POCISs was conducted by exposing them at high and low concentrations of 14 EDCs (4-alkyl-phenols, their ethoxylate oligomers, bisphenol A, selected estrogens and synthetic steroids) for different time periods. The kinetic studies showed an integrative uptake up to 28 days. The sampling rates for the individual compounds were obtained. The use of POCISs could result in an integrative approach to the quality status of the aquatic systems especially in the case of high variation of water concentrations of EDCs. The sampling efficiency of POCISs under various field conditions was assessed after their deployment in different aquatic environments. - Calibration and field performance of polar organic integrative samplers for monitoring EDCs in aquatic environments
Design and cost estimate for the SRL integrated hot off gas facility using selective adsorption

International Nuclear Information System (INIS)

Pence, D.T.; Kirstein, B.E.

1981-07-01

Based on the results of an engineering-scale demonstration program, a design and cost estimate were performed for a 25-m 3 /h (15-ft 3 /min) capacity pilot plant demonstration system using selective adsorption technology for installation at the Integrated Hot Off Gas Facility at the Savannah River Plant. The design includes provisions for the destruction of NO/sub x/ and the concentration and removal of radioisotopes of ruthenium, iodine-129, tritiated water vapor, carbon-14 contaminated carbon dioxide, and krypton-85. The nobel gases are separated by the use of selective adsorption on mordenite-type zeolites. The theory of noble gas adsorption on zeolites is essentially the same as that for the adsorption of noble gases on activated charcoals. Considerable detail is provided regarding the application of the theory to adsorbent bed designs and operation. The design is based on a comprehensive material balance and appropriate heat transfer calculations. Details are provided on techniques and procedures used for heating, cooling, and desorbing the adsorbent columns. Analyses are also given regarding component and arrangement selection and includes discussions on alternative arrangements. The estimated equipment costs for the described treatment system is about $1,400,000. The cost estimate includes a detailed equipment list of all the major component items in the design. Related technical issues and estimated system performance are also discussed
Design and cost estimate for the SRL integrated hot off gas facility using selective adsorption

Energy Technology Data Exchange (ETDEWEB)

Pence, D T; Kirstein, B E

1981-07-01

Based on the results of an engineering-scale demonstration program, a design and cost estimate were performed for a 25-m/sup 3//h (15-ft/sup 3//min) capacity pilot plant demonstration system using selective adsorption technology for installation at the Integrated Hot Off Gas Facility at the Savannah River Plant. The design includes provisions for the destruction of NO/sub x/ and the concentration and removal of radioisotopes of ruthenium, iodine-129, tritiated water vapor, carbon-14 contaminated carbon dioxide, and krypton-85. The nobel gases are separated by the use of selective adsorption on mordenite-type zeolites. The theory of noble gas adsorption on zeolites is essentially the same as that for the adsorption of noble gases on activated charcoals. Considerable detail is provided regarding the application of the theory to adsorbent bed designs and operation. The design is based on a comprehensive material balance and appropriate heat transfer calculations. Details are provided on techniques and procedures used for heating, cooling, and desorbing the adsorbent columns. Analyses are also given regarding component and arrangement selection and includes discussions on alternative arrangements. The estimated equipment costs for the described treatment system is about $1,400,000. The cost estimate includes a detailed equipment list of all the major component items in the design. Related technical issues and estimated system performance are also discussed.
Optimal difference-based estimation for partially linear models

KAUST Repository

Zhou, Yuejin; Cheng, Yebin; Dai, Wenlin; Tong, Tiejun

2017-01-01

Difference-based methods have attracted increasing attention for analyzing partially linear models in the recent literature. In this paper, we first propose to solve the optimal sequence selection problem in difference-based estimation for the linear component. To achieve the goal, a family of new sequences and a cross-validation method for selecting the adaptive sequence are proposed. We demonstrate that the existing sequences are only extreme cases in the proposed family. Secondly, we propose a new estimator for the residual variance by fitting a linear regression method to some difference-based estimators. Our proposed estimator achieves the asymptotic optimal rate of mean squared error. Simulation studies also demonstrate that our proposed estimator performs better than the existing estimator, especially when the sample size is small and the nonparametric function is rough.
Optimal difference-based estimation for partially linear models

KAUST Repository

Zhou, Yuejin

2017-12-16

Difference-based methods have attracted increasing attention for analyzing partially linear models in the recent literature. In this paper, we first propose to solve the optimal sequence selection problem in difference-based estimation for the linear component. To achieve the goal, a family of new sequences and a cross-validation method for selecting the adaptive sequence are proposed. We demonstrate that the existing sequences are only extreme cases in the proposed family. Secondly, we propose a new estimator for the residual variance by fitting a linear regression method to some difference-based estimators. Our proposed estimator achieves the asymptotic optimal rate of mean squared error. Simulation studies also demonstrate that our proposed estimator performs better than the existing estimator, especially when the sample size is small and the nonparametric function is rough.
Using the Violence Risk Scale-Sexual Offense version in sexual violence risk assessments: Updated risk categories and recidivism estimates from a multisite sample of treated sexual offenders.

Science.gov (United States)

Olver, Mark E; Mundt, James C; Thornton, David; Beggs Christofferson, Sarah M; Kingston, Drew A; Sowden, Justina N; Nicholaichuk, Terry P; Gordon, Audrey; Wong, Stephen C P

2018-04-30

The present study sought to develop updated risk categories and recidivism estimates for the Violence Risk Scale-Sexual Offense version (VRS-SO; Wong, Olver, Nicholaichuk, & Gordon, 2003-2017), a sexual offender risk assessment and treatment planning tool. The overarching purpose was to increase the clarity and accuracy of communicating risk assessment information that includes a systematic incorporation of new information (i.e., change) to modify risk estimates. Four treated samples of sexual offenders with VRS-SO pretreatment, posttreatment, and Static-99R ratings were combined with a minimum follow-up period of 10-years postrelease (N = 913). Logistic regression was used to model 5- and 10-year sexual and violent (including sexual) recidivism estimates across 6 different regression models employing specific risk and change score information from the VRS-SO and/or Static-99R. A rationale is presented for clinical applications of select models and the necessity of controlling for baseline risk when utilizing change information across repeated assessments. Information concerning relative risk (percentiles) and absolute risk (recidivism estimates) is integrated with common risk assessment language guidelines to generate new risk categories for the VRS-SO. Guidelines for model selection and forensic clinical application of the risk estimates are discussed. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
Reserves' potential of sedimentary basin: modeling and estimation; Potentiel de reserves d'un bassin petrolier: modelisation et estimation

Energy Technology Data Exchange (ETDEWEB)

Lepez, V

2002-12-01

The aim of this thesis is to build a statistical model of oil and gas fields' sizes distribution in a given sedimentary basin, for both the fields that exist in:the subsoil and those which have already been discovered. The estimation of all the parameters of the model via estimation of the density of the observations by model selection of piecewise polynomials by penalized maximum likelihood techniques enables to provide estimates of the total number of fields which are yet to be discovered, by class of size. We assume that the set of underground fields' sizes is an i.i.d. sample of unknown population with Levy-Pareto law with unknown parameter. The set of already discovered fields is a sub-sample without replacement from the previous which is 'size-biased'. The associated inclusion probabilities are to be estimated. We prove that the probability density of the observations is the product of the underlying density and of an unknown weighting function representing the sampling bias. An arbitrary partition of the sizes' interval being set (called a model), the analytical solutions of likelihood maximization enables to estimate both the parameter of the underlying Levy-Pareto law and the weighting function, which is assumed to be piecewise constant and based upon the partition. We shall add a monotonousness constraint over the latter, taking into account the fact that the bigger a field, the higher its probability of being discovered. Horvitz-Thompson-like estimators finally give the conclusion. We then allow our partitions to vary inside several classes of models and prove a model selection theorem which aims at selecting the best partition within a class, in terms of both Kuilback and Hellinger risk of the associated estimator. We conclude by simulations and various applications to real data from sedimentary basins of four continents, in order to illustrate theoretical as well as practical aspects of our model. (author)
Estimating instream constituent loads using replicate synoptic sampling, Peru Creek, Colorado

Science.gov (United States)

Runkel, Robert L.; Walton-Day, Katherine; Kimball, Briant A.; Verplanck, Philip L.; Nimick, David A.

2013-01-01

The synoptic mass balance approach is often used to evaluate constituent mass loading in streams affected by mine drainage. Spatial profiles of constituent mass load are used to identify sources of contamination and prioritize sites for remedial action. This paper presents a field scale study in which replicate synoptic sampling campaigns are used to quantify the aggregate uncertainty in constituent load that arises from (1) laboratory analyses of constituent and tracer concentrations, (2) field sampling error, and (3) temporal variation in concentration from diel constituent cycles and/or source variation. Consideration of these factors represents an advance in the application of the synoptic mass balance approach by placing error bars on estimates of constituent load and by allowing all sources of uncertainty to be quantified in aggregate; previous applications of the approach have provided only point estimates of constituent load and considered only a subset of the possible errors. Given estimates of aggregate uncertainty, site specific data and expert judgement may be used to qualitatively assess the contributions of individual factors to uncertainty. This assessment can be used to guide the collection of additional data to reduce uncertainty. Further, error bars provided by the replicate approach can aid the investigator in the interpretation of spatial loading profiles and the subsequent identification of constituent source areas within the watershed.The replicate sampling approach is applied to Peru Creek, a stream receiving acidic, metal-rich effluent from the Pennsylvania Mine. Other sources of acidity and metals within the study reach include a wetland area adjacent to the mine and tributary inflow from Cinnamon Gulch. Analysis of data collected under low-flow conditions indicates that concentrations of Al, Cd, Cu, Fe, Mn, Pb, and Zn in Peru Creek exceed aquatic life standards. Constituent loading within the study reach is dominated by effluent from the
Estimating instream constituent loads using replicate synoptic sampling, Peru Creek, Colorado

Science.gov (United States)

Runkel, Robert L.; Walton-Day, Katherine; Kimball, Briant A.; Verplanck, Philip L.; Nimick, David A.

2013-05-01

SummaryThe synoptic mass balance approach is often used to evaluate constituent mass loading in streams affected by mine drainage. Spatial profiles of constituent mass load are used to identify sources of contamination and prioritize sites for remedial action. This paper presents a field scale study in which replicate synoptic sampling campaigns are used to quantify the aggregate uncertainty in constituent load that arises from (1) laboratory analyses of constituent and tracer concentrations, (2) field sampling error, and (3) temporal variation in concentration from diel constituent cycles and/or source variation. Consideration of these factors represents an advance in the application of the synoptic mass balance approach by placing error bars on estimates of constituent load and by allowing all sources of uncertainty to be quantified in aggregate; previous applications of the approach have provided only point estimates of constituent load and considered only a subset of the possible errors. Given estimates of aggregate uncertainty, site specific data and expert judgement may be used to qualitatively assess the contributions of individual factors to uncertainty. This assessment can be used to guide the collection of additional data to reduce uncertainty. Further, error bars provided by the replicate approach can aid the investigator in the interpretation of spatial loading profiles and the subsequent identification of constituent source areas within the watershed. The replicate sampling approach is applied to Peru Creek, a stream receiving acidic, metal-rich effluent from the Pennsylvania Mine. Other sources of acidity and metals within the study reach include a wetland area adjacent to the mine and tributary inflow from Cinnamon Gulch. Analysis of data collected under low-flow conditions indicates that concentrations of Al, Cd, Cu, Fe, Mn, Pb, and Zn in Peru Creek exceed aquatic life standards. Constituent loading within the study reach is dominated by effluent
Size selective isocyanate aerosols personal air sampling using porous plastic foams

International Nuclear Information System (INIS)

Cong Khanh Huynh; Trinh Vu Duc

2009-01-01

As part of a European project (SMT4-CT96-2137), various European institutions specialized in occupational hygiene (BGIA, HSL, IOM, INRS, IST, Ambiente e Lavoro) have established a program of scientific collaboration to develop one or more prototypes of European personal samplers for the collection of simultaneous three dust fractions: inhalable, thoracic and respirable. These samplers based on existing sampling heads (IOM, GSP and cassettes) use Polyurethane Plastic Foam (PUF) according to their porosity to support sampling and separator size of the particles. In this study, the authors present an original application of size selective personal air sampling using chemical impregnated PUF to perform isocyanate aerosols capturing and derivatizing in industrial spray-painting shops.
Final Report: Sampling-Based Algorithms for Estimating Structure in Big Data.

Energy Technology Data Exchange (ETDEWEB)

Matulef, Kevin Michael [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

2017-02-01

The purpose of this project was to develop sampling-based algorithms to discover hidden struc- ture in massive data sets. Inferring structure in large data sets is an increasingly common task in many critical national security applications. These data sets come from myriad sources, such as network traffic, sensor data, and data generated by large-scale simulations. They are often so large that traditional data mining techniques are time consuming or even infeasible. To address this problem, we focus on a class of algorithms that do not compute an exact answer, but instead use sampling to compute an approximate answer using fewer resources. The particular class of algorithms that we focus on are streaming algorithms , so called because they are designed to handle high-throughput streams of data. Streaming algorithms have only a small amount of working storage - much less than the size of the full data stream - so they must necessarily use sampling to approximate the correct answer. We present two results: * A streaming algorithm called HyperHeadTail , that estimates the degree distribution of a graph (i.e., the distribution of the number of connections for each node in a network). The degree distribution is a fundamental graph property, but prior work on estimating the degree distribution in a streaming setting was impractical for many real-world application. We improve upon prior work by developing an algorithm that can handle streams with repeated edges, and graph structures that evolve over time. * An algorithm for the task of maintaining a weighted subsample of items in a stream, when the items must be sampled according to their weight, and the weights are dynamically changing. To our knowledge, this is the first such algorithm designed for dynamically evolving weights. We expect it may be useful as a building block for other streaming algorithms on dynamic data sets.

A new method of hybrid frequency hopping signals selection and blind parameter estimation

Science.gov (United States)

Zeng, Xiaoyu; Jiao, Wencheng; Sun, Huixian

2018-04-01

Frequency hopping communication is widely used in military communications at home and abroad. In the case of single-channel reception, it is scarce to process multiple frequency hopping signals both effectively and simultaneously. A method of hybrid FH signals selection and blind parameter estimation is proposed. The method makes use of spectral transformation, spectral entropy calculation and PRI transformation basic theory to realize the sorting and parameter estimation of the components in the hybrid frequency hopping signal. The simulation results show that this method can correctly classify the frequency hopping component signal, and the estimated error of the frequency hopping period is about 5% and the estimated error of the frequency hopping frequency is less than 1% when the SNR is 10dB. However, the performance of this method deteriorates seriously at low SNR.
Maximum likelihood estimation and EM algorithm of Copas-like selection model for publication bias correction.

Science.gov (United States)

Ning, Jing; Chen, Yong; Piao, Jin

2017-07-01

Publication bias occurs when the published research results are systematically unrepresentative of the population of studies that have been conducted, and is a potential threat to meaningful meta-analysis. The Copas selection model provides a flexible framework for correcting estimates and offers considerable insight into the publication bias. However, maximizing the observed likelihood under the Copas selection model is challenging because the observed data contain very little information on the latent variable. In this article, we study a Copas-like selection model and propose an expectation-maximization (EM) algorithm for estimation based on the full likelihood. Empirical simulation studies show that the EM algorithm and its associated inferential procedure performs well and avoids the non-convergence problem when maximizing the observed likelihood. © The Author 2017. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
On Estimating Quantiles Using Auxiliary Information

Directory of Open Access Journals (Sweden)

Berger Yves G.

2015-03-01

Full Text Available We propose a transformation-based approach for estimating quantiles using auxiliary information. The proposed estimators can be easily implemented using a regression estimator. We show that the proposed estimators are consistent and asymptotically unbiased. The main advantage of the proposed estimators is their simplicity. Despite the fact the proposed estimators are not necessarily more efficient than their competitors, they offer a good compromise between accuracy and simplicity. They can be used under single and multistage sampling designs with unequal selection probabilities. A simulation study supports our finding and shows that the proposed estimators are robust and of an acceptable accuracy compared to alternative estimators, which can be more computationally intensive.
Sample selection based on kernel-subclustering for the signal reconstruction of multifunctional sensors

International Nuclear Information System (INIS)

Wang, Xin; Wei, Guo; Sun, Jinwei

2013-01-01

The signal reconstruction methods based on inverse modeling for the signal reconstruction of multifunctional sensors have been widely studied in recent years. To improve the accuracy, the reconstruction methods have become more and more complicated because of the increase in the model parameters and sample points. However, there is another factor that affects the reconstruction accuracy, the position of the sample points, which has not been studied. A reasonable selection of the sample points could improve the signal reconstruction quality in at least two ways: improved accuracy with the same number of sample points or the same accuracy obtained with a smaller number of sample points. Both ways are valuable for improving the accuracy and decreasing the workload, especially for large batches of multifunctional sensors. In this paper, we propose a sample selection method based on kernel-subclustering distill groupings of the sample data and produce the representation of the data set for inverse modeling. The method calculates the distance between two data points based on the kernel-induced distance instead of the conventional distance. The kernel function is a generalization of the distance metric by mapping the data that are non-separable in the original space into homogeneous groups in the high-dimensional space. The method obtained the best results compared with the other three methods in the simulation. (paper)
Selective solid-phase extraction of Ni(II) by an ion-imprinted polymer from water samples

International Nuclear Information System (INIS)

Saraji, Mohammad; Yousefi, Hamideh

2009-01-01

A new ion-imprinted polymer (IIP) material was synthesized by copolymerization of 4-vinylpyridine as monomer, ethyleneglycoldimethacrylate as crosslinking agent and 2,2'-azobis-sobutyronitrile as initiator in the presence of Ni-dithizone complex. The IIP was used as sorbent in a solid-phase extraction column. The effects of sampling volume, elution conditions, sample pH and sample flow rate on the extraction of Ni ions form water samples were studied. The maximum adsorption capacity and the relative selectivity coefficients of imprinted polymer for Ni(II)/Co(II), Ni(II)/Cu(II) and Ni(II)/Cd(II) were calculated. Compared with non-imprinted polymer particles, the IIP had higher selectivity for Ni(II). The relative selectivity factor (α r ) values of Ni(II)/Co(II), Ni(II)/Cu(II) and Ni(II)/Cd(II) were 21.6, 54.3, and 22.7, respectively, which are greater than 1. The relative standard deviation of the five replicate determinations of Ni(II) was 3.4%. The detection limit for 150 mL of sample was 1.6 μg L -1 using flame atomic absorption spectrometry. The developed method was successfully applied to the determination of trace nickel in water samples with satisfactory results.
40 CFR 761.306 - Sampling 1 meter square surfaces by random selection of halves.

Science.gov (United States)

2010-07-01

... 40 Protection of Environment 30 2010-07-01 2010-07-01 false Sampling 1 meter square surfaces by...(b)(3) § 761.306 Sampling 1 meter square surfaces by random selection of halves. (a) Divide each 1 meter square portion where it is necessary to collect a surface wipe test sample into two equal (or as...
Estimation of sampling error uncertainties in observed surface air temperature change in China

Science.gov (United States)

Hua, Wei; Shen, Samuel S. P.; Weithmann, Alexander; Wang, Huijun

2017-08-01

This study examines the sampling error uncertainties in the monthly surface air temperature (SAT) change in China over recent decades, focusing on the uncertainties of gridded data, national averages, and linear trends. Results indicate that large sampling error variances appear at the station-sparse area of northern and western China with the maximum value exceeding 2.0 K2 while small sampling error variances are found at the station-dense area of southern and eastern China with most grid values being less than 0.05 K2. In general, the negative temperature existed in each month prior to the 1980s, and a warming in temperature began thereafter, which accelerated in the early and mid-1990s. The increasing trend in the SAT series was observed for each month of the year with the largest temperature increase and highest uncertainty of 0.51 ± 0.29 K (10 year)-1 occurring in February and the weakest trend and smallest uncertainty of 0.13 ± 0.07 K (10 year)-1 in August. The sampling error uncertainties in the national average annual mean SAT series are not sufficiently large to alter the conclusion of the persistent warming in China. In addition, the sampling error uncertainties in the SAT series show a clear variation compared with other uncertainty estimation methods, which is a plausible reason for the inconsistent variations between our estimate and other studies during this period.
Selecting a sampling method to aid in vegetation management decisions in loblolly pine plantations

Science.gov (United States)

David R. Weise; Glenn R. Glover

1993-01-01

Objective methods to evaluate hardwood competition in young loblolly pine (Pinustaeda L.) plantations are not widely used in the southeastern United States. Ability of common sampling rules to accurately estimate hardwood rootstock attributes at low sampling intensities and across varying rootstock spatial distributions is unknown. Fixed area plot...
The finite sample performance of estimators for mediation analysis under sequential conditional independence

DEFF Research Database (Denmark)

Huber, Martin; Lechner, Michael; Mellace, Giovanni

Using a comprehensive simulation study based on empirical data, this paper investigates the finite sample properties of different classes of parametric and semi-parametric estimators of (natural) direct and indirect causal effects used in mediation analysis under sequential conditional independence...
Coordination of Conditional Poisson Samples

Directory of Open Access Journals (Sweden)

Grafström Anton

2015-12-01

Full Text Available Sample coordination seeks to maximize or to minimize the overlap of two or more samples. The former is known as positive coordination, and the latter as negative coordination. Positive coordination is mainly used for estimation purposes and to reduce data collection costs. Negative coordination is mainly performed to diminish the response burden of the sampled units. Poisson sampling design with permanent random numbers provides an optimum coordination degree of two or more samples. The size of a Poisson sample is, however, random. Conditional Poisson (CP sampling is a modification of the classical Poisson sampling that produces a fixed-size πps sample. We introduce two methods to coordinate Conditional Poisson samples over time or simultaneously. The first one uses permanent random numbers and the list-sequential implementation of CP sampling. The second method uses a CP sample in the first selection and provides an approximate one in the second selection because the prescribed inclusion probabilities are not respected exactly. The methods are evaluated using the size of the expected sample overlap, and are compared with their competitors using Monte Carlo simulation. The new methods provide a good coordination degree of two samples, close to the performance of Poisson sampling with permanent random numbers.
Estimates of diet selection in cattle grazing cornstalk residues by measurement of chemical composition and near infrared reflectance spectroscopy of diet samples collected by ruminal evacuation.

Science.gov (United States)

Petzel, Emily A; Smart, Alexander J; St-Pierre, Benoit; Selman, Susan L; Bailey, Eric A; Beck, Erin E; Walker, Julie A; Wright, Cody L; Held, Jeffrey E; Brake, Derek W

2018-05-04

Six ruminally cannulated cows (570 ± 73 kg) fed corn residues were placed in a 6 × 6 Latin square to evaluate predictions of diet composition from ruminally collected diet samples. After complete ruminal evacuation, cows were fed 1-kg meals (dry matter [DM]-basis) containing different combinations of cornstalk and leaf and husk (LH) residues in ratios of 0:100, 20:80, 40:60, 60:40, 80:20, and 100:0. Diet samples from each meal were collected by removal of ruminal contents after 1-h and were either unrinsed, hand-rinsed or machine-rinsed to evaluate effects of endogenous compounds on predictions of diet composition. Diet samples were analyzed for neutral (NDF) and acid (ADF) detergent fiber, acid detergent insoluble ash (ADIA), acid detergent lignin (ADL), crude protein (CP), and near infrared reflectance spectroscopy (NIRS) to calculate diet composition. Rinsing type increased NDF and ADF content and decreased ADIA and CP content of diet samples (P content of diet samples. Differences in concentration between cornstalk and LH residues within each chemical component were standardized by calculating a coefficient of variation (CV). Accuracy and precision of estimates of diet composition were analyzed by regressing predicted diet composition and known diet composition. Predictions of diet composition were improved by increasing differences in concentration of chemical components between cornstalk and LH residues up to a CV of 22.6 ± 5.4%. Predictions of diet composition from unrinsed ADIA and machine-rinsed NIRS had the greatest accuracy (slope = 0.98 and 0.95, respectively) and large coefficients of determination (r2 = 0.86 and 0.74, respectively). Subsequently, a field study (Exp. 2) was performed to evaluate predictions of diet composition in cattle (646 ± 89 kg) grazing corn residue. Five cows were placed in 1 of 10 paddocks and allowed to graze continuously or to strip-graze corn residues. Predictions of diet composition from ADIA, ADL, and NIRS did not
Sample size planning for composite reliability coefficients: accuracy in parameter estimation via narrow confidence intervals.

Science.gov (United States)

Terry, Leann; Kelley, Ken

2012-11-01

Composite measures play an important role in psychology and related disciplines. Composite measures almost always have error. Correspondingly, it is important to understand the reliability of the scores from any particular composite measure. However, the point estimates of the reliability of composite measures are fallible and thus all such point estimates should be accompanied by a confidence interval. When confidence intervals are wide, there is much uncertainty in the population value of the reliability coefficient. Given the importance of reporting confidence intervals for estimates of reliability, coupled with the undesirability of wide confidence intervals, we develop methods that allow researchers to plan sample size in order to obtain narrow confidence intervals for population reliability coefficients. We first discuss composite reliability coefficients and then provide a discussion on confidence interval formation for the corresponding population value. Using the accuracy in parameter estimation approach, we develop two methods to obtain accurate estimates of reliability by planning sample size. The first method provides a way to plan sample size so that the expected confidence interval width for the population reliability coefficient is sufficiently narrow. The second method ensures that the confidence interval width will be sufficiently narrow with some desired degree of assurance (e.g., 99% assurance that the 95% confidence interval for the population reliability coefficient will be less than W units wide). The effectiveness of our methods was verified with Monte Carlo simulation studies. We demonstrate how to easily implement the methods with easy-to-use and freely available software. ©2011 The British Psychological Society.
Assessment of the effect of population and diary sampling methods on estimation of school-age children exposure to fine particles.

Science.gov (United States)

Che, W W; Frey, H Christopher; Lau, Alexis K H

2014-12-01

Population and diary sampling methods are employed in exposure models to sample simulated individuals and their daily activity on each simulation day. Different sampling methods may lead to variations in estimated human exposure. In this study, two population sampling methods (stratified-random and random-random) and three diary sampling methods (random resampling, diversity and autocorrelation, and Markov-chain cluster [MCC]) are evaluated. Their impacts on estimated children's exposure to ambient fine particulate matter (PM2.5 ) are quantified via case studies for children in Wake County, NC for July 2002. The estimated mean daily average exposure is 12.9 μg/m(3) for simulated children using the stratified population sampling method, and 12.2 μg/m(3) using the random sampling method. These minor differences are caused by the random sampling among ages within census tracts. Among the three diary sampling methods, there are differences in the estimated number of individuals with multiple days of exposures exceeding a benchmark of concern of 25 μg/m(3) due to differences in how multiday longitudinal diaries are estimated. The MCC method is relatively more conservative. In case studies evaluated here, the MCC method led to 10% higher estimation of the number of individuals with repeated exposures exceeding the benchmark. The comparisons help to identify and contrast the capabilities of each method and to offer insight regarding implications of method choice. Exposure simulation results are robust to the two population sampling methods evaluated, and are sensitive to the choice of method for simulating longitudinal diaries, particularly when analyzing results for specific microenvironments or for exposures exceeding a benchmark of concern. © 2014 Society for Risk Analysis.
ITOUGH2 sample problems

International Nuclear Information System (INIS)

Finsterle, S.

1997-11-01

This report contains a collection of ITOUGH2 sample problems. It complements the ITOUGH2 User's Guide [Finsterle, 1997a], and the ITOUGH2 Command Reference [Finsterle, 1997b]. ITOUGH2 is a program for parameter estimation, sensitivity analysis, and uncertainty propagation analysis. It is based on the TOUGH2 simulator for non-isothermal multiphase flow in fractured and porous media [Preuss, 1987, 1991a]. The report ITOUGH2 User's Guide [Finsterle, 1997a] describes the inverse modeling framework and provides the theoretical background. The report ITOUGH2 Command Reference [Finsterle, 1997b] contains the syntax of all ITOUGH2 commands. This report describes a variety of sample problems solved by ITOUGH2. Table 1.1 contains a short description of the seven sample problems discussed in this report. The TOUGH2 equation-of-state (EOS) module that needs to be linked to ITOUGH2 is also indicated. Each sample problem focuses on a few selected issues shown in Table 1.2. ITOUGH2 input features and the usage of program options are described. Furthermore, interpretations of selected inverse modeling results are given. Problem 1 is a multipart tutorial, describing basic ITOUGH2 input files for the main ITOUGH2 application modes; no interpretation of results is given. Problem 2 focuses on non-uniqueness, residual analysis, and correlation structure. Problem 3 illustrates a variety of parameter and observation types, and describes parameter selection strategies. Problem 4 compares the performance of minimization algorithms and discusses model identification. Problem 5 explains how to set up a combined inversion of steady-state and transient data. Problem 6 provides a detailed residual and error analysis. Finally, Problem 7 illustrates how the estimation of model-related parameters may help compensate for errors in that model
Performance and separation occurrence of binary probit regression estimator using maximum likelihood method and Firths approach under different sample size

Science.gov (United States)

Lusiana, Evellin Dewi

2017-12-01

The parameters of binary probit regression model are commonly estimated by using Maximum Likelihood Estimation (MLE) method. However, MLE method has limitation if the binary data contains separation. Separation is the condition where there are one or several independent variables that exactly grouped the categories in binary response. It will result the estimators of MLE method become non-convergent, so that they cannot be used in modeling. One of the effort to resolve the separation is using Firths approach instead. This research has two aims. First, to identify the chance of separation occurrence in binary probit regression model between MLE method and Firths approach. Second, to compare the performance of binary probit regression model estimator that obtained by MLE method and Firths approach using RMSE criteria. Those are performed using simulation method and under different sample size. The results showed that the chance of separation occurrence in MLE method for small sample size is higher than Firths approach. On the other hand, for larger sample size, the probability decreased and relatively identic between MLE method and Firths approach. Meanwhile, Firths estimators have smaller RMSE than MLEs especially for smaller sample sizes. But for larger sample sizes, the RMSEs are not much different. It means that Firths estimators outperformed MLE estimator.
Background estimation in short-wave region during determination of total sample composition by x-ray fluorescence method

International Nuclear Information System (INIS)

Simakov, V.A.; Kordyukov, S.V.; Petrov, E.N.

1988-01-01

Method of background estimation in short-wave spectral region during determination of total sample composition by X-ray fluorescence method is described. 13 types of different rocks with considerable variations of base composition and Zr, Nb, Th, U content below 7x10 -3 % are investigated. The suggested method of background accounting provides for a less statistical error of the background estimation than direct isolated measurement and reliability of its determination in a short-wave region independent on the sample base. Possibilities of suggested method for artificial mixtures conforming by the content of main component to technological concemtrates - niobium, zirconium, tantalum are estimated
A Class of Estimators for Finite Population Mean in Double Sampling under Nonresponse Using Fractional Raw Moments

Directory of Open Access Journals (Sweden)

Manzoor Khan

2014-01-01

Full Text Available This paper presents new classes of estimators in estimating the finite population mean under double sampling in the presence of nonresponse when using information on fractional raw moments. The expressions for mean square error of the proposed classes of estimators are derived up to the first degree of approximation. It is shown that a proposed class of estimators performs better than the usual mean estimator, ratio type estimators, and Singh and Kumar (2009 estimator. An empirical study is carried out to demonstrate the performance of a proposed class of estimators.
Spacecraft Trajectory Estimation Using a Sampled-Data Extended Kalman Filter with Range-Only Measurements

National Research Council Canada - National Science Library

Erwin, R. S; Bernstein, Dennis S

2005-01-01

.... In this paper we use a sampled-data extended Kalman Filter to estimate the trajectory or a target satellite when only range measurements are available from a constellation or orbiting spacecraft...
Variance of discharge estimates sampled using acoustic Doppler current profilers from moving boats

Science.gov (United States)

Garcia, Carlos M.; Tarrab, Leticia; Oberg, Kevin; Szupiany, Ricardo; Cantero, Mariano I.

2012-01-01

This paper presents a model for quantifying the random errors (i.e., variance) of acoustic Doppler current profiler (ADCP) discharge measurements from moving boats for different sampling times. The model focuses on the random processes in the sampled flow field and has been developed using statistical methods currently available for uncertainty analysis of velocity time series. Analysis of field data collected using ADCP from moving boats from three natural rivers of varying sizes and flow conditions shows that, even though the estimate of the integral time scale of the actual turbulent flow field is larger than the sampling interval, the integral time scale of the sampled flow field is on the order of the sampling interval. Thus, an equation for computing the variance error in discharge measurements associated with different sampling times, assuming uncorrelated flow fields is appropriate. The approach is used to help define optimal sampling strategies by choosing the exposure time required for ADCPs to accurately measure flow discharge.
Selection of anchor values for human error probability estimation

International Nuclear Information System (INIS)

Buffardi, L.C.; Fleishman, E.A.; Allen, J.A.

1989-01-01

There is a need for more dependable information to assist in the prediction of human errors in nuclear power environments. The major objective of the current project is to establish guidelines for using error probabilities from other task settings to estimate errors in the nuclear environment. This involves: (1) identifying critical nuclear tasks, (2) discovering similar tasks in non-nuclear environments, (3) finding error data for non-nuclear tasks, and (4) establishing error-rate values for the nuclear tasks based on the non-nuclear data. A key feature is the application of a classification system to nuclear and non-nuclear tasks to evaluate their similarities and differences in order to provide a basis for generalizing human error estimates across tasks. During the first eight months of the project, several classification systems have been applied to a sample of nuclear tasks. They are discussed in terms of their potential for establishing task equivalence and transferability of human error rates across situations

Optimization of the sampling scheme for maps of physical and chemical properties estimated by kriging

Directory of Open Access Journals (Sweden)

Gener Tadeu Pereira

2013-10-01

Full Text Available The sampling scheme is essential in the investigation of the spatial variability of soil properties in Soil Science studies. The high costs of sampling schemes optimized with additional sampling points for each physical and chemical soil property, prevent their use in precision agriculture. The purpose of this study was to obtain an optimal sampling scheme for physical and chemical property sets and investigate its effect on the quality of soil sampling. Soil was sampled on a 42-ha area, with 206 geo-referenced points arranged in a regular grid spaced 50 m from each other, in a depth range of 0.00-0.20 m. In order to obtain an optimal sampling scheme for every physical and chemical property, a sample grid, a medium-scale variogram and the extended Spatial Simulated Annealing (SSA method were used to minimize kriging variance. The optimization procedure was validated by constructing maps of relative improvement comparing the sample configuration before and after the process. A greater concentration of recommended points in specific areas (NW-SE direction was observed, which also reflects a greater estimate variance at these locations. The addition of optimal samples, for specific regions, increased the accuracy up to 2 % for chemical and 1 % for physical properties. The use of a sample grid and medium-scale variogram, as previous information for the conception of additional sampling schemes, was very promising to determine the locations of these additional points for all physical and chemical soil properties, enhancing the accuracy of kriging estimates of the physical-chemical properties.
Impact of selective genotyping in the training population on accuracy and bias of genomic selection.

Science.gov (United States)

Zhao, Yusheng; Gowda, Manje; Longin, Friedrich H; Würschum, Tobias; Ranc, Nicolas; Reif, Jochen C

2012-08-01

Estimating marker effects based on routinely generated phenotypic data of breeding programs is a cost-effective strategy to implement genomic selection. Truncation selection in breeding populations, however, could have a strong impact on the accuracy to predict genomic breeding values. The main objective of our study was to investigate the influence of phenotypic selection on the accuracy and bias of genomic selection. We used experimental data of 788 testcross progenies from an elite maize breeding program. The testcross progenies were evaluated in unreplicated field trials in ten environments and fingerprinted with 857 SNP markers. Random regression best linear unbiased prediction method was used in combination with fivefold cross-validation based on genotypic sampling. We observed a substantial loss in the accuracy to predict genomic breeding values in unidirectional selected populations. In contrast, estimating marker effects based on bidirectional selected populations led to only a marginal decrease in the prediction accuracy of genomic breeding values. We concluded that bidirectional selection is a valuable approach to efficiently implement genomic selection in applied plant breeding programs.
Efficient Monte Carlo Estimation of the Expected Value of Sample Information Using Moment Matching.

Science.gov (United States)

Heath, Anna; Manolopoulou, Ioanna; Baio, Gianluca

2018-02-01

The Expected Value of Sample Information (EVSI) is used to calculate the economic value of a new research strategy. Although this value would be important to both researchers and funders, there are very few practical applications of the EVSI. This is due to computational difficulties associated with calculating the EVSI in practical health economic models using nested simulations. We present an approximation method for the EVSI that is framed in a Bayesian setting and is based on estimating the distribution of the posterior mean of the incremental net benefit across all possible future samples, known as the distribution of the preposterior mean. Specifically, this distribution is estimated using moment matching coupled with simulations that are available for probabilistic sensitivity analysis, which is typically mandatory in health economic evaluations. This novel approximation method is applied to a health economic model that has previously been used to assess the performance of other EVSI estimators and accurately estimates the EVSI. The computational time for this method is competitive with other methods. We have developed a new calculation method for the EVSI which is computationally efficient and accurate. This novel method relies on some additional simulation so can be expensive in models with a large computational cost.
Evaluation of single and two-stage adaptive sampling designs for estimation of density and abundance of freshwater mussels in a large river

Science.gov (United States)

Smith, D.R.; Rogala, J.T.; Gray, B.R.; Zigler, S.J.; Newton, T.J.

2011-01-01

Reliable estimates of abundance are needed to assess consequences of proposed habitat restoration and enhancement projects on freshwater mussels in the Upper Mississippi River (UMR). Although there is general guidance on sampling techniques for population assessment of freshwater mussels, the actual performance of sampling designs can depend critically on the population density and spatial distribution at the project site. To evaluate various sampling designs, we simulated sampling of populations, which varied in density and degree of spatial clustering. Because of logistics and costs of large river sampling and spatial clustering of freshwater mussels, we focused on adaptive and non-adaptive versions of single and two-stage sampling. The candidate designs performed similarly in terms of precision (CV) and probability of species detection for fixed sample size. Both CV and species detection were determined largely by density, spatial distribution and sample size. However, designs did differ in the rate that occupied quadrats were encountered. Occupied units had a higher probability of selection using adaptive designs than conventional designs. We used two measures of cost: sample size (i.e. number of quadrats) and distance travelled between the quadrats. Adaptive and two-stage designs tended to reduce distance between sampling units, and thus performed better when distance travelled was considered. Based on the comparisons, we provide general recommendations on the sampling designs for the freshwater mussels in the UMR, and presumably other large rivers.
Small-vessel Survey and Auction Sampling to Estimate Growth and Maturity of Eteline Snappers

Data.gov (United States)

National Oceanic and Atmospheric Administration, Department of Commerce — Small-vessel Survey and Auction Sampling to Estimate Growth and Maturity of Eteline Snappers and Improve Data-Limited Stock Assessments. This biosampling project...
Empirically Driven Variable Selection for the Estimation of Causal Effects with Observational Data

Science.gov (United States)

Keller, Bryan; Chen, Jianshen

2016-01-01

Observational studies are common in educational research, where subjects self-select or are otherwise non-randomly assigned to different interventions (e.g., educational programs, grade retention, special education). Unbiased estimation of a causal effect with observational data depends crucially on the assumption of ignorability, which specifies…
A method for the estimation of the significance of cross-correlations in unevenly sampled red-noise time series

Science.gov (United States)

Max-Moerbeck, W.; Richards, J. L.; Hovatta, T.; Pavlidou, V.; Pearson, T. J.; Readhead, A. C. S.

2014-11-01

We present a practical implementation of a Monte Carlo method to estimate the significance of cross-correlations in unevenly sampled time series of data, whose statistical properties are modelled with a simple power-law power spectral density. This implementation builds on published methods; we introduce a number of improvements in the normalization of the cross-correlation function estimate and a bootstrap method for estimating the significance of the cross-correlations. A closely related matter is the estimation of a model for the light curves, which is critical for the significance estimates. We present a graphical and quantitative demonstration that uses simulations to show how common it is to get high cross-correlations for unrelated light curves with steep power spectral densities. This demonstration highlights the dangers of interpreting them as signs of a physical connection. We show that by using interpolation and the Hanning sampling window function we are able to reduce the effects of red-noise leakage and to recover steep simple power-law power spectral densities. We also introduce the use of a Neyman construction for the estimation of the errors in the power-law index of the power spectral density. This method provides a consistent way to estimate the significance of cross-correlations in unevenly sampled time series of data.
Interval estimation methods of the mean in small sample situation and the results' comparison

International Nuclear Information System (INIS)

Wu Changli; Guo Chunying; Jiang Meng; Lin Yuangen

2009-01-01

The methods of the sample mean's interval estimation, namely the classical method, the Bootstrap method, the Bayesian Bootstrap method, the Jackknife method and the spread method of the Empirical Characteristic distribution function are described. Numerical calculation on the samples' mean intervals is carried out where the numbers of the samples are 4, 5, 6 respectively. The results indicate the Bootstrap method and the Bayesian Bootstrap method are much more appropriate than others in small sample situation. (authors)
A NEW METHOD FOR NON DESTRUCTIVE ESTIMATION OF Jc IN YBaCuO CERAMIC SAMPLES

Directory of Open Access Journals (Sweden)

Giancarlo Cordeiro Costa

2014-12-01

Full Text Available This work presents a new method for estimation of Jc as a bulk characteristic of YBCO blocks. The experimental magnetic interaction force between a SmCo permanent magnet and a YBCO block was compared to finite element method (FEM simulations results, allowing us to search a best fitting value to the critical current of the superconducting sample. As FEM simulations were based on Bean model , the critical current density was taken as an unknown parameter. This is a non destructive estimation method. since there is no need of breaking even a little piece of the sample for analysis.
An Improved Estimation of Regional Fractional Woody/Herbaceous Cover Using Combined Satellite Data and High-Quality Training Samples

Directory of Open Access Journals (Sweden)

Xu Liu

2017-01-01

Full Text Available Mapping vegetation cover is critical for understanding and monitoring ecosystem functions in semi-arid biomes. As existing estimates tend to underestimate the woody cover in areas with dry deciduous shrubland and woodland, we present an approach to improve the regional estimation of woody and herbaceous fractional cover in the East Asia steppe. This developed approach uses Random Forest models by combining multiple remote sensing data—training samples derived from high-resolution image in a tailored spatial sampling and model inputs composed of specific metrics from MODIS sensor and ancillary variables including topographic, bioclimatic, and land surface information. We emphasize that effective spatial sampling, high-quality classification, and adequate geospatial information are important prerequisites of establishing appropriate model inputs and achieving high-quality training samples. This study suggests that the optimal models improve estimation accuracy (NMSE 0.47 for woody and 0.64 for herbaceous plants and show a consistent agreement with field observations. Compared with existing woody estimate product, the proposed woody cover estimation can delineate regions with subshrubs and shrubs, showing an improved capability of capturing spatialized detail of vegetation signals. This approach can be applicable over sizable semi-arid areas such as temperate steppes, savannas, and prairies.
SNP calling, genotype calling, and sample allele frequency estimation from new-generation sequencing data

DEFF Research Database (Denmark)

Nielsen, Rasmus; Korneliussen, Thorfinn Sand; Albrechtsen, Anders

2012-01-01

We present a statistical framework for estimation and application of sample allele frequency spectra from New-Generation Sequencing (NGS) data. In this method, we first estimate the allele frequency spectrum using maximum likelihood. In contrast to previous methods, the likelihood function is cal...... be extended to various other cases including cases with deviations from Hardy-Weinberg equilibrium. We evaluate the statistical properties of the methods using simulations and by application to a real data set....
Cost-effective sampling of 137Cs-derived net soil redistribution: part 1 – estimating the spatial mean across scales of variation

International Nuclear Information System (INIS)

Li, Y.; Chappell, A.; Nyamdavaa, B.; Yu, H.; Davaasuren, D.; Zoljargal, K.

2015-01-01

The 137 Cs technique for estimating net time-integrated soil redistribution is valuable for understanding the factors controlling soil redistribution by all processes. The literature on this technique is dominated by studies of individual fields and describes its typically time-consuming nature. We contend that the community making these studies has inappropriately assumed that many 137 Cs measurements are required and hence estimates of net soil redistribution can only be made at the field scale. Here, we support future studies of 137 Cs-derived net soil redistribution to apply their often limited resources across scales of variation (field, catchment, region etc.) without compromising the quality of the estimates at any scale. We describe a hybrid, design-based and model-based, stratified random sampling design with composites to estimate the sampling variance and a cost model for fieldwork and laboratory measurements. Geostatistical mapping of net (1954–2012) soil redistribution as a case study on the Chinese Loess Plateau is compared with estimates for several other sampling designs popular in the literature. We demonstrate the cost-effectiveness of the hybrid design for spatial estimation of net soil redistribution. To demonstrate the limitations of current sampling approaches to cut across scales of variation, we extrapolate our estimate of net soil redistribution across the region, show that for the same resources, estimates from many fields could have been provided and would elucidate the cause of differences within and between regional estimates. We recommend that future studies evaluate carefully the sampling design to consider the opportunity to investigate 137 Cs-derived net soil redistribution across scales of variation. - Highlights: • The 137 Cs technique estimates net time-integrated soil redistribution by all processes. • It is time-consuming and dominated by studies of individual fields. • We use limited resources to estimate soil
Improving Estimation of Betweenness Centrality for Scale-Free Graphs

Energy Technology Data Exchange (ETDEWEB)

Bromberger, Seth A. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Klymko, Christine F. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Henderson, Keith A. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Pearce, Roger [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Sanders, Geoff [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)

2017-11-07

Betweenness centrality is a graph statistic used to nd vertices that are participants in a large number of shortest paths in a graph. This centrality measure is commonly used in path and network interdiction problems and its complete form requires the calculation of all-pairs shortest paths for each vertex. This leads to a time complexity of O(jV jjEj), which is impractical for large graphs. Estimation of betweenness centrality has focused on performing shortest-path calculations on a subset of randomly- selected vertices. This reduces the complexity of the centrality estimation to O(jSjjEj); jSj < jV j, which can be scaled appropriately based on the computing resources available. An estimation strategy that uses random selection of vertices for seed selection is fast and simple to implement, but may not provide optimal estimation of betweenness centrality when the number of samples is constrained. Our experimentation has identi ed a number of alternate seed-selection strategies that provide lower error than random selection in common scale-free graphs. These strategies are discussed and experimental results are presented.
Three-dimensional reconstruction of highly complex microscopic samples using scanning electron microscopy and optical flow estimation.

Directory of Open Access Journals (Sweden)

Ahmadreza Baghaie

Full Text Available Scanning Electron Microscope (SEM as one of the major research and industrial equipment for imaging of micro-scale samples and surfaces has gained extensive attention from its emerge. However, the acquired micrographs still remain two-dimensional (2D. In the current work a novel and highly accurate approach is proposed to recover the hidden third-dimension by use of multi-view image acquisition of the microscopic samples combined with pre/post-processing steps including sparse feature-based stereo rectification, nonlocal-based optical flow estimation for dense matching and finally depth estimation. Employing the proposed approach, three-dimensional (3D reconstructions of highly complex microscopic samples were achieved to facilitate the interpretation of topology and geometry of surface/shape attributes of the samples. As a byproduct of the proposed approach, high-definition 3D printed models of the samples can be generated as a tangible means of physical understanding. Extensive comparisons with the state-of-the-art reveal the strength and superiority of the proposed method in uncovering the details of the highly complex microscopic samples.
Channel and delay estimation for base-station–based cooperative communications in frequency-selective fading channels

Directory of Open Access Journals (Sweden)

Hongjun Xu

2011-07-01

Full Text Available A channel and delay estimation algorithm for both positive and negative delay, based on the distributed Alamouti scheme, has been recently discussed for base-station–based asynchronous cooperative systems in frequency-flat fading channels. This paper extends the algorithm, the maximum likelihood estimator, to work in frequency-selective fading channels. The minimum mean square error (MMSE performance of channel estimation for both packet schemes and normal schemes is discussed in this paper. The symbol error rate (SER performance of equalisation and detection for both time-reversal space-time block code (STBC and single-carrier STBC is also discussed in this paper. The MMSE simulation results demonstrated the superior performance of the packet scheme over the normal scheme with an improvement in performance of up to 6 dB when feedback was used in the frequency-selective channel at a MSE of 3 x 10^–2. The SER simulation results showed that, although both the normal and packet schemes achieved similar diversity orders, the packet scheme demonstrated a 1 dB coding gain over the normal scheme at a SER of 10^–5. Finally, the SER simulations showed that the frequency-selective fading system outperformed the frequency-flat fading system.
Selective removal of phosphate for analysis of organic acids in complex samples.

Science.gov (United States)

Deshmukh, Sandeep; Frolov, Andrej; Marcillo, Andrea; Birkemeyer, Claudia

2015-04-03

Accurate quantitation of compounds in samples of biological origin is often hampered by matrix interferences one of which occurs in GC-MS analysis from the presence of highly abundant phosphate. Consequently, high concentrations of phosphate need to be removed before sample analysis. Within this context, we screened 17 anion exchange solid-phase extraction (SPE) materials for selective phosphate removal using different protocols to meet the challenge of simultaneous recovery of six common organic acids in aqueous samples prior to derivatization for GC-MS analysis. Up to 75% recovery was achieved for the most organic acids, only the low pKa tartaric and citric acids were badly recovered. Compared to the traditional approach of phosphate removal by precipitation, SPE had a broader compatibility with common detection methods and performed more selectively among the organic acids under investigation. Based on the results of this study, it is recommended that phosphate removal strategies during the analysis of biologically relevant small molecular weight organic acids consider the respective pKa of the anticipated analytes and the detection method of choice. Copyright © 2015 Elsevier B.V. All rights reserved.
Ultra-small time-delay estimation via a weak measurement technique with post-selection

International Nuclear Information System (INIS)

Fang, Chen; Huang, Jing-Zheng; Yu, Yang; Li, Qinzheng; Zeng, Guihua

2016-01-01

Weak measurement is a novel technique for parameter estimation with higher precision. In this paper we develop a general theory for the parameter estimation based on a weak measurement technique with arbitrary post-selection. The weak-value amplification model and the joint weak measurement model are two special cases in our theory. Applying the developed theory, time-delay estimation is investigated in both theory and experiments. The experimental results show that when the time delay is ultra-small, the joint weak measurement scheme outperforms the weak-value amplification scheme, and is robust against not only misalignment errors but also the wavelength dependence of the optical components. These results are consistent with theoretical predictions that have not been previously verified by any experiment. (paper)
TH-EF-BRA-08: A Novel Technique for Estimating Volumetric Cine MRI (VC-MRI) From Multi-Slice Sparsely Sampled Cine Images Using Motion Modeling and Free Form Deformation

International Nuclear Information System (INIS)

Harris, W; Yin, F; Wang, C; Chang, Z; Cai, J; Zhang, Y; Ren, L

2016-01-01

Purpose: To develop a technique to estimate on-board VC-MRI using multi-slice sparsely-sampled cine images, patient prior 4D-MRI, motion-modeling and free-form deformation for real-time 3D target verification of lung radiotherapy. Methods: A previous method has been developed to generate on-board VC-MRI by deforming prior MRI images based on a motion model(MM) extracted from prior 4D-MRI and a single-slice on-board 2D-cine image. In this study, free-form deformation(FD) was introduced to correct for errors in the MM when large anatomical changes exist. Multiple-slice sparsely-sampled on-board 2D-cine images located within the target are used to improve both the estimation accuracy and temporal resolution of VC-MRI. The on-board 2D-cine MRIs are acquired at 20–30frames/s by sampling only 10% of the k-space on Cartesian grid, with 85% of that taken at the central k-space. The method was evaluated using XCAT(computerized patient model) simulation of lung cancer patients with various anatomical and respirational changes from prior 4D-MRI to onboard volume. The accuracy was evaluated using Volume-Percent-Difference(VPD) and Center-of-Mass-Shift(COMS) of the estimated tumor volume. Effects of region-of-interest(ROI) selection, 2D-cine slice orientation, slice number and slice location on the estimation accuracy were evaluated. Results: VCMRI estimated using 10 sparsely-sampled sagittal 2D-cine MRIs achieved VPD/COMS of 9.07±3.54%/0.45±0.53mm among all scenarios based on estimation with ROI_MM-ROI_FD. The FD optimization improved estimation significantly for scenarios with anatomical changes. Using ROI-FD achieved better estimation than global-FD. Changing the multi-slice orientation to axial, coronal, and axial/sagittal orthogonal reduced the accuracy of VCMRI to VPD/COMS of 19.47±15.74%/1.57±2.54mm, 20.70±9.97%/2.34±0.92mm, and 16.02±13.79%/0.60±0.82mm, respectively. Reducing the number of cines to 8 enhanced temporal resolution of VC-MRI by 25% while
TH-EF-BRA-08: A Novel Technique for Estimating Volumetric Cine MRI (VC-MRI) From Multi-Slice Sparsely Sampled Cine Images Using Motion Modeling and Free Form Deformation

Energy Technology Data Exchange (ETDEWEB)

Harris, W; Yin, F; Wang, C; Chang, Z; Cai, J; Zhang, Y; Ren, L [Duke University Medical Center, Durham, NC (United States)

2016-06-15

Purpose: To develop a technique to estimate on-board VC-MRI using multi-slice sparsely-sampled cine images, patient prior 4D-MRI, motion-modeling and free-form deformation for real-time 3D target verification of lung radiotherapy. Methods: A previous method has been developed to generate on-board VC-MRI by deforming prior MRI images based on a motion model(MM) extracted from prior 4D-MRI and a single-slice on-board 2D-cine image. In this study, free-form deformation(FD) was introduced to correct for errors in the MM when large anatomical changes exist. Multiple-slice sparsely-sampled on-board 2D-cine images located within the target are used to improve both the estimation accuracy and temporal resolution of VC-MRI. The on-board 2D-cine MRIs are acquired at 20–30frames/s by sampling only 10% of the k-space on Cartesian grid, with 85% of that taken at the central k-space. The method was evaluated using XCAT(computerized patient model) simulation of lung cancer patients with various anatomical and respirational changes from prior 4D-MRI to onboard volume. The accuracy was evaluated using Volume-Percent-Difference(VPD) and Center-of-Mass-Shift(COMS) of the estimated tumor volume. Effects of region-of-interest(ROI) selection, 2D-cine slice orientation, slice number and slice location on the estimation accuracy were evaluated. Results: VCMRI estimated using 10 sparsely-sampled sagittal 2D-cine MRIs achieved VPD/COMS of 9.07±3.54%/0.45±0.53mm among all scenarios based on estimation with ROI-MM-ROI-FD. The FD optimization improved estimation significantly for scenarios with anatomical changes. Using ROI-FD achieved better estimation than global-FD. Changing the multi-slice orientation to axial, coronal, and axial/sagittal orthogonal reduced the accuracy of VCMRI to VPD/COMS of 19.47±15.74%/1.57±2.54mm, 20.70±9.97%/2.34±0.92mm, and 16.02±13.79%/0.60±0.82mm, respectively. Reducing the number of cines to 8 enhanced temporal resolution of VC-MRI by 25% while
Variations among animals when estimating the undegradable fraction of fiber in forage samples

Directory of Open Access Journals (Sweden)

Cláudia Batista Sampaio

2014-10-01

Full Text Available The objective of this study was to assess the variability among animals regarding the critical time to estimate the undegradable fraction of fiber (ct using an in situ incubation procedure. Five rumenfistulated Nellore steers were used to estimate the degradation profile of fiber. Animals were fed a standard diet with an 80:20 forage:concentrate ratio. Sugarcane, signal grass hay, corn silage and fresh elephant grass samples were assessed. Samples were put in F57 Ankom® bags and were incubated in the rumens of the animals for 0, 6, 12, 18, 24, 48, 72, 96, 120, 144, 168, 192, 216, 240 and 312 hours. The degradation profiles were interpreted using a mixed non-linear model in which a random effect was associated with the degradation rate. For sugarcane, signal grass hay and corn silage, there were no significant variations among animals regarding the fractional degradation rate of neutral and acid detergent fiber; consequently, the ct required to estimate the undegradable fiber fraction did not vary among animals for those forages. However, a significant variability among animals was found for the fresh elephant grass. The results seem to suggest that the variability among animals regarding the degradation rate of fibrous components can be significant.

THE zCOSMOS-SINFONI PROJECT. I. SAMPLE SELECTION AND NATURAL-SEEING OBSERVATIONS

Energy Technology Data Exchange (ETDEWEB)

Mancini, C.; Renzini, A. [INAF-OAPD, Osservatorio Astronomico di Padova, Vicolo Osservatorio 5, I-35122 Padova (Italy); Foerster Schreiber, N. M.; Hicks, E. K. S.; Genzel, R.; Tacconi, L.; Davies, R. [Max-Planck-Institut fuer Extraterrestrische Physik, Giessenbachstrasse, D-85748 Garching (Germany); Cresci, G. [Osservatorio Astrofisico di Arcetri (OAF), INAF-Firenze, Largo E. Fermi 5, I-50125 Firenze (Italy); Peng, Y.; Lilly, S.; Carollo, M.; Oesch, P. [Institute of Astronomy, Department of Physics, Eidgenossische Technische Hochschule, ETH Zurich CH-8093 (Switzerland); Vergani, D.; Pozzetti, L.; Zamorani, G. [INAF-Bologna, Via Ranzani, I-40127 Bologna (Italy); Daddi, E. [CEA-Saclay, DSM/DAPNIA/Service d' Astrophysique, F-91191 Gif-Sur Yvette Cedex (France); Maraston, C. [Institute of Cosmology and Gravitation, University of Portsmouth, Dennis Sciama Building, Burnaby Road, PO1 3HE Portsmouth (United Kingdom); McCracken, H. J. [IAP, 98bis bd Arago, F-75014 Paris (France); Bouche, N. [Department of Physics, University of California, Santa Barbara, CA 93106 (United States); Shapiro, K. [Aerospace Research Laboratories, Northrop Grumman Aerospace Systems, Redondo Beach, CA 90278 (United States); and others

2011-12-10

The zCOSMOS-SINFONI project is aimed at studying the physical and kinematical properties of a sample of massive z {approx} 1.4-2.5 star-forming galaxies, through SINFONI near-infrared integral field spectroscopy (IFS), combined with the multiwavelength information from the zCOSMOS (COSMOS) survey. The project is based on one hour of natural-seeing observations per target, and adaptive optics (AO) follow-up for a major part of the sample, which includes 30 galaxies selected from the zCOSMOS/VIMOS spectroscopic survey. This first paper presents the sample selection, and the global physical characterization of the target galaxies from multicolor photometry, i.e., star formation rate (SFR), stellar mass, age, etc. The H{alpha} integrated properties, such as, flux, velocity dispersion, and size, are derived from the natural-seeing observations, while the follow-up AO observations will be presented in the next paper of this series. Our sample appears to be well representative of star-forming galaxies at z {approx} 2, covering a wide range in mass and SFR. The H{alpha} integrated properties of the 25 H{alpha} detected galaxies are similar to those of other IFS samples at the same redshifts. Good agreement is found among the SFRs derived from H{alpha} luminosity and other diagnostic methods, provided the extinction affecting the H{alpha} luminosity is about twice that affecting the continuum. A preliminary kinematic analysis, based on the maximum observed velocity difference across the source and on the integrated velocity dispersion, indicates that the sample splits nearly 50-50 into rotation-dominated and velocity-dispersion-dominated galaxies, in good agreement with previous surveys.
Limited sampling strategy models for estimating the AUC of gliclazide in Chinese healthy volunteers.

Science.gov (United States)

Huang, Ji-Han; Wang, Kun; Huang, Xiao-Hui; He, Ying-Chun; Li, Lu-Jin; Sheng, Yu-Cheng; Yang, Juan; Zheng, Qing-Shan

2013-06-01

The aim of this work is to reduce the cost of required sampling for the estimation of the area under the gliclazide plasma concentration versus time curve within 60 h (AUC0-60t ). The limited sampling strategy (LSS) models were established and validated by the multiple regression model within 4 or fewer gliclazide concentration values. Absolute prediction error (APE), root of mean square error (RMSE) and visual prediction check were used as criterion. The results of Jack-Knife validation showed that 10 (25.0 %) of the 40 LSS based on the regression analysis were not within an APE of 15 % using one concentration-time point. 90.2, 91.5 and 92.4 % of the 40 LSS models were capable of prediction using 2, 3 and 4 points, respectively. Limited sampling strategies were developed and validated for estimating AUC0-60t of gliclazide. This study indicates that the implementation of an 80 mg dosage regimen enabled accurate predictions of AUC0-60t by the LSS model. This study shows that 12, 6, 4, 2 h after administration are the key sampling times. The combination of (12, 2 h), (12, 8, 2 h) or (12, 8, 4, 2 h) can be chosen as sampling hours for predicting AUC0-60t in practical application according to requirement.
Reserves' potential of sedimentary basin: modeling and estimation; Potentiel de reserves d'un bassin petrolier: modelisation et estimation

Energy Technology Data Exchange (ETDEWEB)

Lepez, V.

2002-12-01

The aim of this thesis is to build a statistical model of oil and gas fields' sizes distribution in a given sedimentary basin, for both the fields that exist in:the subsoil and those which have already been discovered. The estimation of all the parameters of the model via estimation of the density of the observations by model selection of piecewise polynomials by penalized maximum likelihood techniques enables to provide estimates of the total number of fields which are yet to be discovered, by class of size. We assume that the set of underground fields' sizes is an i.i.d. sample of unknown population with Levy-Pareto law with unknown parameter. The set of already discovered fields is a sub-sample without replacement from the previous which is 'size-biased'. The associated inclusion probabilities are to be estimated. We prove that the probability density of the observations is the product of the underlying density and of an unknown weighting function representing the sampling bias. An arbitrary partition of the sizes' interval being set (called a model), the analytical solutions of likelihood maximization enables to estimate both the parameter of the underlying Levy-Pareto law and the weighting function, which is assumed to be piecewise constant and based upon the partition. We shall add a monotonousness constraint over the latter, taking into account the fact that the bigger a field, the higher its probability of being discovered. Horvitz-Thompson-like estimators finally give the conclusion. We then allow our partitions to vary inside several classes of models and prove a model selection theorem which aims at selecting the best partition within a class, in terms of both Kuilback and Hellinger risk of the associated estimator. We conclude by simulations and various applications to real data from sedimentary basins of four continents, in order to illustrate theoretical as well as practical aspects of our model. (author)
40 CFR Appendix A to Subpart G of... - Sampling Plans for Selective Enforcement Auditing of Marine Engines

Science.gov (United States)

2010-07-01

... Enforcement Auditing of Marine Engines A Appendix A to Subpart G of Part 91 Protection of Environment...-IGNITION ENGINES Selective Enforcement Auditing Regulations Pt. 91, Subpt. G, App. A Appendix A to Subpart G of Part 91—Sampling Plans for Selective Enforcement Auditing of Marine Engines Table 1—Sampling...
Using step and path selection functions for estimating resistance to movement: Pumas as a case study

Science.gov (United States)

Katherine A. Zeller; Kevin McGarigal; Samuel A. Cushman; Paul Beier; T. Winston Vickers; Walter M. Boyce

2015-01-01

GPS telemetry collars and their ability to acquire accurate and consistently frequent locations have increased the use of step selection functions (SSFs) and path selection functions (PathSFs) for studying animal movement and estimating resistance. However, previously published SSFs and PathSFs often do not accommodate multiple scales or multiscale modeling....
Using local multiplicity to improve effect estimation from a hypothesis-generating pharmacogenetics study.

Science.gov (United States)

Zou, W; Ouyang, H

2016-02-01

We propose a multiple estimation adjustment (MEA) method to correct effect overestimation due to selection bias from a hypothesis-generating study (HGS) in pharmacogenetics. MEA uses a hierarchical Bayesian approach to model individual effect estimates from maximal likelihood estimation (MLE) in a region jointly and shrinks them toward the regional effect. Unlike many methods that model a fixed selection scheme, MEA capitalizes on local multiplicity independent of selection. We compared mean square errors (MSEs) in simulated HGSs from naive MLE, MEA and a conditional likelihood adjustment (CLA) method that model threshold selection bias. We observed that MEA effectively reduced MSE from MLE on null effects with or without selection, and had a clear advantage over CLA on extreme MLE estimates from null effects under lenient threshold selection in small samples, which are common among 'top' associations from a pharmacogenetics HGS.
Estimating the price elasticity of beer: meta-analysis of data with heterogeneity, dependence, and publication bias.

Science.gov (United States)

Nelson, Jon P

2014-01-01

Precise estimates of price elasticities are important for alcohol tax policy. Using meta-analysis, this paper corrects average beer elasticities for heterogeneity, dependence, and publication selection bias. A sample of 191 estimates is obtained from 114 primary studies. Simple and weighted means are reported. Dependence is addressed by restricting number of estimates per study, author-restricted samples, and author-specific variables. Publication bias is addressed using funnel graph, trim-and-fill, and Egger's intercept model. Heterogeneity and selection bias are examined jointly in meta-regressions containing moderator variables for econometric methodology, primary data, and precision of estimates. Results for fixed- and random-effects regressions are reported. Country-specific effects and sample time periods are unimportant, but several methodology variables help explain the dispersion of estimates. In models that correct for selection bias and heterogeneity, the average beer price elasticity is about -0.20, which is less elastic by 50% compared to values commonly used in alcohol tax policy simulations. Copyright © 2013 Elsevier B.V. All rights reserved.
40 CFR Appendix A to Subpart F of... - Sampling Plans for Selective Enforcement Auditing of Nonroad Engines

Science.gov (United States)

2010-07-01

... Enforcement Auditing of Nonroad Engines A Appendix A to Subpart F of Part 89 Protection of Environment... NONROAD COMPRESSION-IGNITION ENGINES Selective Enforcement Auditing Pt. 89, Subpt. F, App. A Appendix A to Subpart F of Part 89—Sampling Plans for Selective Enforcement Auditing of Nonroad Engines Table 1—Sampling...
Electromembrane extraction as a rapid and selective miniaturized sample preparation technique for biological fluids

DEFF Research Database (Denmark)

Gjelstad, Astrid; Pedersen-Bjergaard, Stig; Seip, Knut Fredrik

2015-01-01

This special report discusses the sample preparation method electromembrane extraction, which was introduced in 2006 as a rapid and selective miniaturized extraction method. The extraction principle is based on isolation of charged analytes extracted from an aqueous sample, across a thin film....... Technical aspects of electromembrane extraction, important extraction parameters as well as a handful of examples of applications from different biological samples and bioanalytical areas are discussed in the paper....
Estimation of time-delayed mutual information and bias for irregularly and sparsely sampled time-series

International Nuclear Information System (INIS)

Albers, D.J.; Hripcsak, George

2012-01-01

Highlights: ► Time-delayed mutual information for irregularly sampled time-series. ► Estimation bias for the time-delayed mutual information calculation. ► Fast, simple, PDF estimator independent, time-delayed mutual information bias estimate. ► Quantification of data-set-size limits of the time-delayed mutual calculation. - Abstract: A method to estimate the time-dependent correlation via an empirical bias estimate of the time-delayed mutual information for a time-series is proposed. In particular, the bias of the time-delayed mutual information is shown to often be equivalent to the mutual information between two distributions of points from the same system separated by infinite time. Thus intuitively, estimation of the bias is reduced to estimation of the mutual information between distributions of data points separated by large time intervals. The proposed bias estimation techniques are shown to work for Lorenz equations data and glucose time series data of three patients from the Columbia University Medical Center database.
A genetic algorithm-based framework for wavelength selection on sample categorization.

Science.gov (United States)

Anzanello, Michel J; Yamashita, Gabrielli; Marcelo, Marcelo; Fogliatto, Flávio S; Ortiz, Rafael S; Mariotti, Kristiane; Ferrão, Marco F

2017-08-01

In forensic and pharmaceutical scenarios, the application of chemometrics and optimization techniques has unveiled common and peculiar features of seized medicine and drug samples, helping investigative forces to track illegal operations. This paper proposes a novel framework aimed at identifying relevant subsets of attenuated total reflectance Fourier transform infrared (ATR-FTIR) wavelengths for classifying samples into two classes, for example authentic or forged categories in case of medicines, or salt or base form in cocaine analysis. In the first step of the framework, the ATR-FTIR spectra were partitioned into equidistant intervals and the k-nearest neighbour (KNN) classification technique was applied to each interval to insert samples into proper classes. In the next step, selected intervals were refined through the genetic algorithm (GA) by identifying a limited number of wavelengths from the intervals previously selected aimed at maximizing classification accuracy. When applied to Cialis®, Viagra®, and cocaine ATR-FTIR datasets, the proposed method substantially decreased the number of wavelengths needed to categorize, and increased the classification accuracy. From a practical perspective, the proposed method provides investigative forces with valuable information towards monitoring illegal production of drugs and medicines. In addition, focusing on a reduced subset of wavelengths allows the development of portable devices capable of testing the authenticity of samples during police checking events, avoiding the need for later laboratorial analyses and reducing equipment expenses. Theoretically, the proposed GA-based approach yields more refined solutions than the current methods relying on interval approaches, which tend to insert irrelevant wavelengths in the retained intervals. Copyright © 2016 John Wiley & Sons, Ltd. Copyright © 2016 John Wiley & Sons, Ltd.
THE ATACAMA COSMOLOGY TELESCOPE: DYNAMICAL MASSES AND SCALING RELATIONS FOR A SAMPLE OF MASSIVE SUNYAEV-ZEL'DOVICH EFFECT SELECTED GALAXY CLUSTERS ,

International Nuclear Information System (INIS)

Sifón, Cristóbal; Barrientos, L. Felipe; González, Jorge; Infante, Leopoldo; Dünner, Rolando; Menanteau, Felipe; Hughes, John P.; Baker, Andrew J.; Hasselfield, Matthew; Marriage, Tobias A.; Crichton, Devin; Gralla, Megan B.; Addison, Graeme E.; Dunkley, Joanna; Battaglia, Nick; Bond, J. Richard; Hajian, Amir; Das, Sudeep; Devlin, Mark J.; Hilton, Matt

2013-01-01

We present the first dynamical mass estimates and scaling relations for a sample of Sunyaev-Zel'dovich effect (SZE) selected galaxy clusters. The sample consists of 16 massive clusters detected with the Atacama Cosmology Telescope (ACT) over a 455 deg 2 area of the southern sky. Deep multi-object spectroscopic observations were taken to secure intermediate-resolution (R ∼ 700-800) spectra and redshifts for ≈60 member galaxies on average per cluster. The dynamical masses M 200c of the clusters have been calculated using simulation-based scaling relations between velocity dispersion and mass. The sample has a median redshift z = 0.50 and a median mass M 200c ≅12×10 14 h 70 -1 M sun with a lower limit M 200c ≅6×10 14 h 70 -1 M sun , consistent with the expectations for the ACT southern sky survey. These masses are compared to the ACT SZE properties of the sample, specifically, the match-filtered central SZE amplitude y 0 -tilde, the central Compton parameter y 0 , and the integrated Compton signal Y 200c , which we use to derive SZE-mass scaling relations. All SZE estimators correlate with dynamical mass with low intrinsic scatter (∼< 20%), in agreement with numerical simulations. We explore the effects of various systematic effects on these scaling relations, including the correlation between observables and the influence of dynamically disturbed clusters. Using the three-dimensional information available, we divide the sample into relaxed and disturbed clusters and find that ∼50% of the clusters are disturbed. There are hints that disturbed systems might bias the scaling relations, but given the current sample sizes, these differences are not significant; further studies including more clusters are required to assess the impact of these clusters on the scaling relations
Estimating the probability that the sample mean is within a desired fraction of the standard deviation of the true mean.

Science.gov (United States)

Schillaci, Michael A; Schillaci, Mario E

2009-02-01

The use of small sample sizes in human and primate evolutionary research is commonplace. Estimating how well small samples represent the underlying population, however, is not commonplace. Because the accuracy of determinations of taxonomy, phylogeny, and evolutionary process are dependant upon how well the study sample represents the population of interest, characterizing the uncertainty, or potential error, associated with analyses of small sample sizes is essential. We present a method for estimating the probability that the sample mean is within a desired fraction of the standard deviation of the true mean using small (nresearchers to determine post hoc the probability that their sample is a meaningful approximation of the population parameter. We tested the method using a large craniometric data set commonly used by researchers in the field. Given our results, we suggest that sample estimates of the population mean can be reasonable and meaningful even when based on small, and perhaps even very small, sample sizes.
Estimation of sediment sources using selected chemical tracers in the Perry lake basin, Kansas, USA

Science.gov (United States)

Juracek, K.E.; Ziegler, A.C.

2009-01-01

The ability to achieve meaningful decreases in sediment loads to reservoirs requires a determination of the relative importance of sediment sources within the contributing basins. In an investigation of sources of fine-grained sediment (clay and silt) within the Perry Lake Basin in northeast Kansas, representative samples of channel-bank sources, surface-soil sources (cropland and grassland), and reservoir bottom sediment were collected, chemically analyzed, and compared. The samples were sieved to isolate the TOC), and 137Cs were selected for use in the estimation of sediment sources. To further account for differences in particle-size composition between the sources and the reservoir bottom sediment, constituent ratio and clay-normalization techniques were used. Computed ratios included TOC to TN, TOC to TP, and TN to TP. Constituent concentrations (TN, TP, TOC) and activities (137Cs) were normalized by dividing by the percentage of clay. Thus, the sediment-source estimations involved the use of seven sediment-source indicators. Within the Perry Lake Basin, the consensus of the seven indicators was that both channel-bank and surface-soil sources were important in the Atchison County Lake and Banner Creek Reservoir subbasins, whereas channel-bank sources were dominant in the Mission Lake subbasin. On the sole basis of 137Cs activity, surface-soil sources contributed the most fine-grained sediment to Atchison County Lake, and channel-bank sources contributed the most fine-grained sediment to Banner Creek Reservoir and Mission Lake. Both the seven-indicator consensus and 137Cs indicated that channel-bank sources were dominant for Perry Lake and that channel-bank sources increased in importance with distance downstream in the basin. ?? 2009 International Research and Training Centre on Erosion and Sedimentation and the World Association for Sedimentation and Erosion Research.
Designing a monitoring program to estimate estuarine survival of anadromous salmon smolts: simulating the effect of sample design on inference

Science.gov (United States)

Romer, Jeremy D.; Gitelman, Alix I.; Clements, Shaun; Schreck, Carl B.

2015-01-01

A number of researchers have attempted to estimate salmonid smolt survival during outmigration through an estuary. However, it is currently unclear how the design of such studies influences the accuracy and precision of survival estimates. In this simulation study we consider four patterns of smolt survival probability in the estuary, and test the performance of several different sampling strategies for estimating estuarine survival assuming perfect detection. The four survival probability patterns each incorporate a systematic component (constant, linearly increasing, increasing and then decreasing, and two pulses) and a random component to reflect daily fluctuations in survival probability. Generally, spreading sampling effort (tagging) across the season resulted in more accurate estimates of survival. All sampling designs in this simulation tended to under-estimate the variation in the survival estimates because seasonal and daily variation in survival probability are not incorporated in the estimation procedure. This under-estimation results in poorer performance of estimates from larger samples. Thus, tagging more fish may not result in better estimates of survival if important components of variation are not accounted for. The results of our simulation incorporate survival probabilities and run distribution data from previous studies to help illustrate the tradeoffs among sampling strategies in terms of the number of tags needed and distribution of tagging effort. This information will assist researchers in developing improved monitoring programs and encourage discussion regarding issues that should be addressed prior to implementation of any telemetry-based monitoring plan. We believe implementation of an effective estuary survival monitoring program will strengthen the robustness of life cycle models used in recovery plans by providing missing data on where and how much mortality occurs in the riverine and estuarine portions of smolt migration. These data
Estimation of the deoxynivalenol and moisture contents of bulk wheat grain samples by FT-NIR spectroscopy

Science.gov (United States)

Deoxynivalenol (DON) levels in harvested grain samples are used to evaluate the Fusarium head blight (FHB) resistance of wheat cultivars and breeding lines. Fourier transform near-infrared (FT-NIR) calibrations were developed to estimate the DON and moisture content (MC) of bulk wheat grain samples ...
Sample Size Calculation for Estimating or Testing a Nonzero Squared Multiple Correlation Coefficient

Science.gov (United States)

Krishnamoorthy, K.; Xia, Yanping

2008-01-01

The problems of hypothesis testing and interval estimation of the squared multiple correlation coefficient of a multivariate normal distribution are considered. It is shown that available one-sided tests are uniformly most powerful, and the one-sided confidence intervals are uniformly most accurate. An exact method of calculating sample size to…
Asymptotic analysis of the role of spatial sampling for covariance parameter estimation of Gaussian processes

International Nuclear Information System (INIS)

Bachoc, Francois

2014-01-01

Covariance parameter estimation of Gaussian processes is analyzed in an asymptotic framework. The spatial sampling is a randomly perturbed regular grid and its deviation from the perfect regular grid is controlled by a single scalar regularity parameter. Consistency and asymptotic normality are proved for the Maximum Likelihood and Cross Validation estimators of the covariance parameters. The asymptotic covariance matrices of the covariance parameter estimators are deterministic functions of the regularity parameter. By means of an exhaustive study of the asymptotic covariance matrices, it is shown that the estimation is improved when the regular grid is strongly perturbed. Hence, an asymptotic confirmation is given to the commonly admitted fact that using groups of observation points with small spacing is beneficial to covariance function estimation. Finally, the prediction error, using a consistent estimator of the covariance parameters, is analyzed in detail. (authors)
The influence of selection on the evolutionary distance estimated from the base changes observed between homologous nucleotide sequences.

Science.gov (United States)

Otsuka, J; Kawai, Y; Sugaya, N

2001-11-21

In most studies of molecular evolution, the nucleotide base at a site is assumed to change with the apparent rate under functional constraint, and the comparison of base changes between homologous genes is thought to yield the evolutionary distance corresponding to the site-average change rate multiplied by the divergence time. However, this view is not sufficiently successful in estimating the divergence time of species, but mostly results in the construction of tree topology without a time-scale. In the present paper, this problem is investigated theoretically by considering that observed base changes are the results of comparing the survivals through selection of mutated bases. In the case of weak selection, the time course of base changes due to mutation and selection can be obtained analytically, leading to a theoretical equation showing how the selection has influence on the evolutionary distance estimated from the enumeration of base changes. This result provides a new method for estimating the divergence time more accurately from the observed base changes by evaluating both the strength of selection and the mutation rate. The validity of this method is verified by analysing the base changes observed at the third codon positions of amino acid residues with four-fold codon degeneracy in the protein genes of mammalian mitochondria; i.e. the ratios of estimated divergence times are fairly well consistent with a series of fossil records of mammals. Throughout this analysis, it is also suggested that the mutation rates in mitochondrial genomes are almost the same in different lineages of mammals and that the lineage-specific base-change rates indicated previously are due to the selection probably arising from the preference of transfer RNAs to codons.
Perils of parsimony: properties of reduced-rank estimates of genetic covariance matrices.

Science.gov (United States)

Meyer, Karin; Kirkpatrick, Mark

2008-10-01

Eigenvalues and eigenvectors of covariance matrices are important statistics for multivariate problems in many applications, including quantitative genetics. Estimates of these quantities are subject to different types of bias. This article reviews and extends the existing theory on these biases, considering a balanced one-way classification and restricted maximum-likelihood estimation. Biases are due to the spread of sample roots and arise from ignoring selected principal components when imposing constraints on the parameter space, to ensure positive semidefinite estimates or to estimate covariance matrices of chosen, reduced rank. In addition, it is shown that reduced-rank estimators that consider only the leading eigenvalues and -vectors of the "between-group" covariance matrix may be biased due to selecting the wrong subset of principal components. In a genetic context, with groups representing families, this bias is inverse proportional to the degree of genetic relationship among family members, but is independent of sample size. Theoretical results are supplemented by a simulation study, demonstrating close agreement between predicted and observed bias for large samples. It is emphasized that the rank of the genetic covariance matrix should be chosen sufficiently large to accommodate all important genetic principal components, even though, paradoxically, this may require including a number of components with negligible eigenvalues. A strategy for rank selection in practical analyses is outlined.

Soybean yield modeling using bootstrap methods for small samples

Energy Technology Data Exchange (ETDEWEB)

Dalposso, G.A.; Uribe-Opazo, M.A.; Johann, J.A.

2016-11-01

One of the problems that occur when working with regression models is regarding the sample size; once the statistical methods used in inferential analyzes are asymptotic if the sample is small the analysis may be compromised because the estimates will be biased. An alternative is to use the bootstrap methodology, which in its non-parametric version does not need to guess or know the probability distribution that generated the original sample. In this work we used a set of soybean yield data and physical and chemical soil properties formed with fewer samples to determine a multiple linear regression model. Bootstrap methods were used for variable selection, identification of influential points and for determination of confidence intervals of the model parameters. The results showed that the bootstrap methods enabled us to select the physical and chemical soil properties, which were significant in the construction of the soybean yield regression model, construct the confidence intervals of the parameters and identify the points that had great influence on the estimated parameters. (Author)
Blind CP-OFDM and ZP-OFDM Parameter Estimation in Frequency Selective Channels

Directory of Open Access Journals (Sweden)

Vincent Le Nir

2009-01-01

Full Text Available A cognitive radio system needs accurate knowledge of the radio spectrum it operates in. Blind modulation recognition techniques have been proposed to discriminate between single-carrier and multicarrier modulations and to estimate their parameters. Some powerful techniques use autocorrelation- and cyclic autocorrelation-based features of the transmitted signal applying to OFDM signals using a Cyclic Prefix time guard interval (CP-OFDM. In this paper, we propose a blind parameter estimation technique based on a power autocorrelation feature applying to OFDM signals using a Zero Padding time guard interval (ZP-OFDM which in particular excludes the use of the autocorrelation- and cyclic autocorrelation-based techniques. The proposed technique leads to an efficient estimation of the symbol duration and zero padding duration in frequency selective channels, and is insensitive to receiver phase and frequency offsets. Simulation results are given for WiMAX and WiMedia signals using realistic Stanford University Interim (SUI and Ultra-Wideband (UWB IEEE 802.15.4a channel models, respectively.
Assessing NIR & MIR Spectral Analysis as a Method for Soil C Estimation Across a Network of Sampling Sites

Science.gov (United States)

Spencer, S.; Ogle, S.; Borch, T.; Rock, B.

2008-12-01

Monitoring soil C stocks is critical to assess the impact of future climate and land use change on carbon sinks and sources in agricultural lands. A benchmark network for soil carbon monitoring of stock changes is being designed for US agricultural lands with 3000-5000 sites anticipated and re-sampling on a 5- to10-year basis. Approximately 1000 sites would be sampled per year producing around 15,000 soil samples to be processed for total, organic, and inorganic carbon, as well as bulk density and nitrogen. Laboratory processing of soil samples is cost and time intensive, therefore we are testing the efficacy of using near-infrared (NIR) and mid-infrared (MIR) spectral methods for estimating soil carbon. As part of an initial implementation of national soil carbon monitoring, we collected over 1800 soil samples from 45 cropland sites in the mid-continental region of the U.S. Samples were processed using standard laboratory methods to determine the variables above. Carbon and nitrogen were determined by dry combustion and inorganic carbon was estimated with an acid-pressure test. 600 samples are being scanned using a bench- top NIR reflectance spectrometer (30 g of 2 mm oven-dried soil and 30 g of 8 mm air-dried soil) and 500 samples using a MIR Fourier-Transform Infrared Spectrometer (FTIR) with a DRIFT reflectance accessory (0.2 g oven-dried ground soil). Lab-measured carbon will be compared to spectrally-estimated carbon contents using Partial Least Squares (PLS) multivariate statistical approach. PLS attempts to develop a soil C predictive model that can then be used to estimate C in soil samples not lab-processed. The spectral analysis of soil samples either whole or partially processed can potentially save both funding resources and time to process samples. This is particularly relevant for the implementation of a national monitoring network for soil carbon. This poster will discuss our methods, initial results and potential for using NIR and MIR spectral
The use of laser-induced fluorescence or ultraviolet detectors for sensitive and selective analysis of tobramycin or erythropoietin in complex samples

Science.gov (United States)

Ahmed, Hytham M.; Ebeid, Wael B.

2015-05-01

Complex samples analysis is a challenge in pharmaceutical and biopharmaceutical analysis. In this work, tobramycin (TOB) analysis in human urine samples and recombinant human erythropoietin (rhEPO) analysis in the presence of similar protein were selected as representative examples of such samples analysis. Assays of TOB in urine samples are difficult because of poor detectability. Therefore laser induced fluorescence detector (LIF) was combined with a separation technique, micellar electrokinetic chromatography (MEKC), to determine TOB through derivatization with fluorescein isothiocyanate (FITC). Borate was used as background electrolyte (BGE) with negative-charged mixed micelles as additive. The method was successively applied to urine samples. The LOD and LOQ for Tobramycin in urine were 90 and 200 ng/ml respectively and recovery was >98% (n = 5). All urine samples were analyzed by direct injection without sample pre-treatment. Another use of hyphenated analytical technique, capillary zone electrophoresis (CZE) connected to ultraviolet (UV) detector was also used for sensitive analysis of rhEPO at low levels (2000 IU) in the presence of large amount of human serum albumin (HSA). Analysis of rhEPO was achieved by the use of the electrokinetic injection (EI) with discontinuous buffers. Phosphate buffer was used as BGE with metal ions as additive. The proposed method can be used for the estimation of large number of quality control rhEPO samples in a short period.
Sensitivity of landscape resistance estimates based on point selection functions to scale and behavioral state: Pumas as a case study

Science.gov (United States)

Katherine A. Zeller; Kevin McGarigal; Paul Beier; Samuel A. Cushman; T. Winston Vickers; Walter M. Boyce

2014-01-01

Estimating landscape resistance to animal movement is the foundation for connectivity modeling, and resource selection functions based on point data are commonly used to empirically estimate resistance. In this study, we used GPS data points acquired at 5-min intervals from radiocollared pumas in southern California to model context-dependent point selection...
Sampling in Developmental Science: Situations, Shortcomings, Solutions, and Standards.

Science.gov (United States)

Bornstein, Marc H; Jager, Justin; Putnick, Diane L

2013-12-01

Sampling is a key feature of every study in developmental science. Although sampling has far-reaching implications, too little attention is paid to sampling. Here, we describe, discuss, and evaluate four prominent sampling strategies in developmental science: population-based probability sampling, convenience sampling, quota sampling, and homogeneous sampling. We then judge these sampling strategies by five criteria: whether they yield representative and generalizable estimates of a study's target population, whether they yield representative and generalizable estimates of subsamples within a study's target population, the recruitment efforts and costs they entail, whether they yield sufficient power to detect subsample differences, and whether they introduce "noise" related to variation in subsamples and whether that "noise" can be accounted for statistically. We use sample composition of gender, ethnicity, and socioeconomic status to illustrate and assess the four sampling strategies. Finally, we tally the use of the four sampling strategies in five prominent developmental science journals and make recommendations about best practices for sample selection and reporting.
Statistical properties of mean stand biomass estimators in a LIDAR-based double sampling forest survey design.

Science.gov (United States)

H.E. Anderson; J. Breidenbach

2007-01-01

Airborne laser scanning (LIDAR) can be a valuable tool in double-sampling forest survey designs. LIDAR-derived forest structure metrics are often highly correlated with important forest inventory variables, such as mean stand biomass, and LIDAR-based synthetic regression estimators have the potential to be highly efficient compared to single-stage estimators, which...
Varying Coefficient Panel Data Model in the Presence of Endogenous Selectivity and Fixed Effects

OpenAIRE

Malikov, Emir; Kumbhakar, Subal C.; Sun, Yiguo

2013-01-01

This paper considers a flexible panel data sample selection model in which (i) the outcome equation is permitted to take a semiparametric, varying coefficient form to capture potential parameter heterogeneity in the relationship of interest, (ii) both the outcome and (parametric) selection equations contain unobserved fixed effects and (iii) selection is generalized to a polychotomous case. We propose a two-stage estimator. Given consistent parameter estimates from the selection equation obta...
Decomposing the Gender Wage Gap in the Netherlands with Sample Selection Adjustments

NARCIS (Netherlands)

Albrecht, James; Vuuren, van Aico; Vroman, Susan

2004-01-01

In this paper, we use quantile regression decomposition methods to analyzethe gender gap between men and women who work full time in the Nether-lands. Because the fraction of women working full time in the Netherlands isquite low, sample selection is a serious issue. In addition to shedding light
OPTIMAL WAVELENGTH SELECTION ON HYPERSPECTRAL DATA WITH FUSED LASSO FOR BIOMASS ESTIMATION OF TROPICAL RAIN FOREST

Directory of Open Access Journals (Sweden)

T. Takayama

2016-06-01

Full Text Available Above-ground biomass prediction of tropical rain forest using remote sensing data is of paramount importance to continuous large-area forest monitoring. Hyperspectral data can provide rich spectral information for the biomass prediction; however, the prediction accuracy is affected by a small-sample-size problem, which widely exists as overfitting in using high dimensional data where the number of training samples is smaller than the dimensionality of the samples due to limitation of require time, cost, and human resources for field surveys. A common approach to addressing this problem is reducing the dimensionality of dataset. Also, acquired hyperspectral data usually have low signal-to-noise ratio due to a narrow bandwidth and local or global shifts of peaks due to instrumental instability or small differences in considering practical measurement conditions. In this work, we propose a methodology based on fused lasso regression that select optimal bands for the biomass prediction model with encouraging sparsity and grouping, which solves the small-sample-size problem by the dimensionality reduction from the sparsity and the noise and peak shift problem by the grouping. The prediction model provided higher accuracy with root-mean-square error (RMSE of 66.16 t/ha in the cross-validation than other methods; multiple linear analysis, partial least squares regression, and lasso regression. Furthermore, fusion of spectral and spatial information derived from texture index increased the prediction accuracy with RMSE of 62.62 t/ha. This analysis proves efficiency of fused lasso and image texture in biomass estimation of tropical forests.
Evaluation of a segment-based LANDSAT full-frame approach to corp area estimation

Science.gov (United States)

Bauer, M. E. (Principal Investigator); Hixson, M. M.; Davis, S. M.

1981-01-01

As the registration of LANDSAT full frames enters the realm of current technology, sampling methods should be examined which utilize other than the segment data used for LACIE. The effect of separating the functions of sampling for training and sampling for area estimation. The frame selected for analysis was acquired over north central Iowa on August 9, 1978. A stratification of he full-frame was defined. Training data came from segments within the frame. Two classification and estimation procedures were compared: statistics developed on one segment were used to classify that segment, and pooled statistics from the segments were used to classify a systematic sample of pixels. Comparisons to USDA/ESCS estimates illustrate that the full-frame sampling approach can provide accurate and precise area estimates.
SnagPRO: snag and tree sampling and analysis methods for wildlife

Science.gov (United States)

Lisa J. Bate; Michael J. Wisdom; Edward O. Garton; Shawn C. Clabough

2008-01-01

We describe sampling methods and provide software to accurately and efficiently estimate snag and tree densities at desired scales to meet a variety of research and management objectives. The methods optimize sampling effort by choosing a plot size appropriate for the specified forest conditions and sampling goals. Plot selection and data analyses are supported by...
Software documentation and user's manual for fish-impingement sampling design and estimation method computer programs

International Nuclear Information System (INIS)

Murarka, I.P.; Bodeau, D.J.

1977-11-01

This report contains a description of three computer programs that implement the theory of sampling designs and the methods for estimating fish-impingement at the cooling-water intakes of nuclear power plants as described in companion report ANL/ES-60. Complete FORTRAN listings of these programs, named SAMPLE, ESTIMA, and SIZECO, are given and augmented with examples of how they are used
Application of the Sampling Selection Technique in Approaching Financial Audit

Directory of Open Access Journals (Sweden)

Victor Munteanu

2018-03-01

Full Text Available In his professional approach, the financial auditor has a wide range of working techniques, including selection techniques. They are applied depending on the nature of the information available to the financial auditor, the manner in which they are presented - paper or electronic format, and, last but not least, the time available. Several techniques are applied, successively or in parallel, to increase the safety of the expressed opinion and to provide the audit report with a solid basis of information. Sampling is used in the phase of control or clarification of the identified error. The main purpose is to corroborate or measure the degree of risk detected following a pertinent analysis. Since the auditor does not have time or means to thoroughly rebuild the information, the sampling technique can provide an effective response to the need for valorization.
Adaptive sampling program support for expedited site characterization

International Nuclear Information System (INIS)

Johnson, R.

1993-01-01

Expedited site characterizations offer substantial savings in time and money when assessing hazardous waste sites. Key to some of these savings is the ability to adapt a sampling program to the ''real-time'' data generated by an expedited site characterization. This paper presents a two-prong approach to supporting adaptive sampling programs: a specialized object-oriented database/geographical information system for data fusion, management and display; and combined Bayesian/geostatistical methods for contamination extent estimation and sample location selection
Correlations fo Sc, rare earths and other elements in selected rock samples from Arrua-i

Energy Technology Data Exchange (ETDEWEB)

Facetti, J F; Prats, M [Asuncion Nacional Univ. (Paraguay). Inst. de Ciencias

1972-01-01

The Sc and Eu contents in selected rocks samples from the stock of Arrua-i have been determined and correlations established with other elements and with the relative amount of some rare earths. These correlations suggest metasomatic phenomena for the formation of the rock samples.
Correlations fo Sc, rare earths and other elements in selected rock samples from Arrua-i

International Nuclear Information System (INIS)

Facetti, J.F.; Prats, M.

1972-01-01

The Sc and Eu contents in selected rocks samples from the stock of Arrua-i have been determined and correlations established with other elements and with the relative amount of some rare earths. These correlations suggest metasomatic phenomena for the formation of the rock samples
Ground Receiving Station Reference Pair Selection Technique for a Minimum Configuration 3D Emitter Position Estimation Multilateration System

Directory of Open Access Journals (Sweden)

Abdulmalik Shehu Yaro

2017-01-01

Full Text Available Multilateration estimates aircraft position using the Time Difference Of Arrival (TDOA with a lateration algorithm. The Position Estimation (PE accuracy of the lateration algorithm depends on several factors which are the TDOA estimation error, the lateration algorithm approach, the number of deployed GRSs and the selection of the GRS reference used for the PE process. Using the minimum number of GRSs for 3D emitter PE, a technique based on the condition number calculation is proposed to select the suitable GRS reference pair for improving the accuracy of the PE using the lateration algorithm. Validation of the proposed technique was performed with the GRSs in the square and triangular GRS configuration. For the selected emitter positions, the result shows that the proposed technique can be used to select the suitable GRS reference pair for the PE process. A unity condition number is achieved for GRS pair most suitable for the PE process. Monte Carlo simulation result, in comparison with the fixed GRS reference pair lateration algorithm, shows a reduction in PE error of at least 70% for both GRS in the square and triangular configuration.
Hierarchical feature selection for erythema severity estimation

Science.gov (United States)

Wang, Li; Shi, Chenbo; Shu, Chang

2014-10-01

At present PASI system of scoring is used for evaluating erythema severity, which can help doctors to diagnose psoriasis [1-3]. The system relies on the subjective judge of doctors, where the accuracy and stability cannot be guaranteed [4]. This paper proposes a stable and precise algorithm for erythema severity estimation. Our contributions are twofold. On one hand, in order to extract the multi-scale redness of erythema, we design the hierarchical feature. Different from traditional methods, we not only utilize the color statistical features, but also divide the detect window into small window and extract hierarchical features. Further, a feature re-ranking step is introduced, which can guarantee that extracted features are irrelevant to each other. On the other hand, an adaptive boosting classifier is applied for further feature selection. During the step of training, the classifier will seek out the most valuable feature for evaluating erythema severity, due to its strong learning ability. Experimental results demonstrate the high precision and robustness of our algorithm. The accuracy is 80.1% on the dataset which comprise 116 patients' images with various kinds of erythema. Now our system has been applied for erythema medical efficacy evaluation in Union Hosp, China.
Sampling procedures for inventory of commercial volume tree species in Amazon Forest.

Science.gov (United States)

Netto, Sylvio P; Pelissari, Allan L; Cysneiros, Vinicius C; Bonazza, Marcelo; Sanquetta, Carlos R

2017-01-01

The spatial distribution of tropical tree species can affect the consistency of the estimators in commercial forest inventories, therefore, appropriate sampling procedures are required to survey species with different spatial patterns in the Amazon Forest. For this, the present study aims to evaluate the conventional sampling procedures and introduce the adaptive cluster sampling for volumetric inventories of Amazonian tree species, considering the hypotheses that the density, the spatial distribution and the zero-plots affect the consistency of the estimators, and that the adaptive cluster sampling allows to obtain more accurate volumetric estimation. We use data from a census carried out in Jamari National Forest, Brazil, where trees with diameters equal to or higher than 40 cm were measured in 1,355 plots. Species with different spatial patterns were selected and sampled with simple random sampling, systematic sampling, linear cluster sampling and adaptive cluster sampling, whereby the accuracy of the volumetric estimation and presence of zero-plots were evaluated. The sampling procedures applied to species were affected by the low density of trees and the large number of zero-plots, wherein the adaptive clusters allowed concentrating the sampling effort in plots with trees and, thus, agglutinating more representative samples to estimate the commercial volume.

Some links on this page may take you to non-federal websites. Their policies may differ from this site.