Estimating the exceedance probability of rain rate by logistic regression
Chiu, Long S.; Kedem, Benjamin
1990-01-01
Recent studies have shown that the fraction of an area with rain intensity above a fixed threshold is highly correlated with the area-averaged rain rate. To estimate the fractional rainy area, a logistic regression model, which estimates the conditional probability that rain rate over an area exceeds a fixed threshold given the values of related covariates, is developed. The problem of dependency in the data in the estimation procedure is bypassed by the method of partial likelihood. Analyses of simulated scanning multichannel microwave radiometer and observed electrically scanning microwave radiometer data during the Global Atlantic Tropical Experiment period show that the use of logistic regression in pixel classification is superior to multiple regression in predicting whether rain rate at each pixel exceeds a given threshold, even in the presence of noisy data. The potential of the logistic regression technique in satellite rain rate estimation is discussed.
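The pixel-classification idea in this abstract can be sketched as a plain logistic regression on synthetic data (a minimal illustration only, not the authors' partial-likelihood procedure; the covariates and threshold labels here are simulated stand-ins for radiometer channels):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "pixels": two covariates (stand-ins for radiometer channels) and a
# binary label for whether rain rate exceeds a fixed threshold.
n = 500
X = rng.normal(size=(n, 2))
true_logit = 1.5 * X[:, 0] - 1.0 * X[:, 1]
y = (rng.random(n) < 1 / (1 + np.exp(-true_logit))).astype(float)

def fit_logistic(X, y, lr=0.1, n_iter=2000):
    """Fit P(y=1|x) = sigmoid(b0 + x.b) by gradient ascent on the log-likelihood."""
    Xb = np.column_stack([np.ones(len(X)), X])   # prepend intercept column
    beta = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = 1 / (1 + np.exp(-Xb @ beta))         # current exceedance probabilities
        beta += lr * Xb.T @ (y - p) / len(y)     # score (gradient) step
    return beta

beta = fit_logistic(X, y)
p_hat = 1 / (1 + np.exp(-np.column_stack([np.ones(n), X]) @ beta))
accuracy = np.mean((p_hat > 0.5) == (y == 1))
```

Thresholding the fitted probability at 0.5 classifies each pixel, and averaging `p_hat` over an area gives an estimate of the fractional rainy area.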
Estimating monotonic rates from biological data using local linear regression.
Olito, Colin; White, Craig R; Marshall, Dustin J; Barneche, Diego R
2017-03-01
Accessing many fundamental questions in biology begins with empirical estimation of simple monotonic rates of underlying biological processes. Across a variety of disciplines, ranging from physiology to biogeochemistry, these rates are routinely estimated from non-linear and noisy time series data using linear regression and ad hoc manual truncation of non-linearities. Here, we introduce the R package LoLinR, a flexible toolkit to implement local linear regression techniques to objectively and reproducibly estimate monotonic biological rates from non-linear time series data, and demonstrate possible applications using metabolic rate data. LoLinR provides methods to easily and reliably estimate monotonic rates from time series data in a way that is statistically robust, facilitates reproducible research and is applicable to a wide variety of research disciplines in the biological sciences. © 2017. Published by The Company of Biologists Ltd.
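The core of local linear rate estimation of the kind implemented in LoLinR can be sketched as a kernel-weighted straight-line fit (an illustrative re-implementation, not LoLinR's actual code; the tricube weights and bandwidth choice are assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)

# Noisy time series whose trend is roughly linear (e.g. oxygen depletion over time).
t = np.linspace(0, 10, 200)
y = 2.0 - 0.3 * t + 0.1 * np.sin(t) + rng.normal(scale=0.05, size=t.size)

def local_linear_slope(t, y, t0, bandwidth):
    """Slope of a kernel-weighted straight-line fit centred at t0 (tricube weights)."""
    u = (t - t0) / bandwidth
    w = np.where(np.abs(u) < 1, (1 - np.abs(u) ** 3) ** 3, 0.0)
    X = np.column_stack([np.ones_like(t), t - t0])
    WX = X * w[:, None]
    coef = np.linalg.solve(X.T @ WX, WX.T @ y)
    return coef[1]   # local slope = estimated rate at t0

rate = local_linear_slope(t, y, t0=5.0, bandwidth=3.0)
```

Scanning `t0` across the series and reporting the most stable local slope is one objective alternative to manual truncation of non-linear segments.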
Optimized support vector regression for drilling rate of penetration estimation
Bodaghi, Asadollah; Ansari, Hamid Reza; Gholami, Mahsa
2015-12-01
In the petroleum industry, drilling optimization involves the selection of operating conditions for achieving the desired depth with the minimum expenditure while requirements of personnel safety, environment protection, adequate information on penetrated formations and productivity are fulfilled. Since drilling optimization is highly dependent on the rate of penetration (ROP), estimation of this parameter is of great importance during well planning. In this research, a novel approach called 'optimized support vector regression' is employed to build a formulation between the input variables and ROP. The algorithms used for optimizing the support vector regression are the genetic algorithm (GA) and the cuckoo search algorithm (CS). Optimization improved the support vector regression performance by selecting proper values for its parameters. In order to evaluate the ability of the optimization algorithms to enhance SVR performance, their results were compared to the hybrid of pattern search and grid search (HPG), which is conventionally employed for optimizing SVR. The results demonstrated that the CS algorithm achieved greater improvement in the prediction accuracy of SVR than both the GA and HPG. Moreover, a predictive model derived from a back propagation neural network (BPNN), the traditional approach for estimating ROP, is selected for comparison with CSSVR. The comparative results revealed the superiority of CSSVR. This study concludes that CSSVR is a viable option for precise estimation of ROP.
Convergence Rates of Wavelet Estimators in Semiparametric Regression Models Under NA Samples
Institute of Scientific and Technical Information of China (English)
Hongchang HU; Li WU
2012-01-01
Consider the following heteroscedastic semiparametric regression model: y_i = X_i^T β + g(t_i) + σ_i e_i, 1 ≤ i ≤ n, where {X_i, 1 ≤ i ≤ n} are random design points, the errors {e_i, 1 ≤ i ≤ n} are negatively associated (NA) random variables, σ_i^2 = h(u_i), and {u_i} and {t_i} are two nonrandom sequences on [0,1]. Some wavelet estimators of the parametric component β, the nonparametric component g(t) and the variance function h(u) are given. Under some general conditions, the strong convergence rate of these wavelet estimators is O(n^{-1/3} log n). Hence our results are extensions of those results obtained under independent random error settings.
Spatter Rate Estimation of GMAW-S based on Partial Least Square Regression
Institute of Scientific and Technical Information of China (English)
CAI Yan; WANG Guang-wei; YANG Hai-lan; HUA Xue-ming; WU Yi-xiong
2008-01-01
This paper analyzes the drop transfer process in gas metal arc welding in short-circuit transfer mode (GMAW-S) in order to develop an optimized spatter rate model that can be used on-line. According to thermodynamic characteristics and practical behavior, a complete arcing process is divided into three sub-processes: arc re-ignition, energy output and shorting preparation. The shorting process is then divided into drop spread, bridge sustention and bridge destabilization. Nine process variables and their distributions are analyzed based on welding experiments with high-speed photos and synchronous current and voltage signals. The method of variation coefficient is used to reflect process consistency and to design characteristic parameters. Partial least squares regression (PLSR) is used to set up the spatter rate model because of severe correlation among the above characteristic parameters. PLSR is a multivariate statistical analysis method in which regression modeling, data simplification and correlation analysis are combined in a single algorithm. Experimental results show that the regression equation based on PLSR is effective for on-line prediction of the spatter rate under the corresponding welding conditions.
Directory of Open Access Journals (Sweden)
Miguel Angel Luque-Fernandez
2016-10-01
Background: In population-based cancer research, piecewise exponential regression models are used to derive adjusted estimates of excess mortality due to cancer using the Poisson generalized linear modelling framework. However, the assumption that the conditional mean and variance of the rate parameter given the set of covariates x_i are equal is strong and may fail to account for overdispersion, given the variability of the rate parameter (the variance exceeds the mean). Using an empirical example, we aimed to describe simple methods to test and correct for overdispersion. Methods: We used a regression-based score test for overdispersion under the relative survival framework and proposed different approaches to correct for overdispersion, including quasi-likelihood, robust standard errors estimation, negative binomial regression and flexible piecewise modelling. Results: All piecewise exponential regression models showed the presence of significant inherent overdispersion (p-value < 0.001). However, the flexible piecewise exponential model showed the smallest overdispersion parameter (3.2 versus 21.3 for the non-flexible piecewise exponential models). Conclusion: We showed that there were no major differences between methods. However, flexible piecewise regression modelling, with either quasi-likelihood or robust standard errors, was the best approach as it deals with both overdispersion due to model misspecification and true or inherent overdispersion.
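A regression-based score test of the kind described can be sketched with an auxiliary regression in the style of Cameron and Trivedi (a simplified intercept-only illustration on simulated counts, not the paper's relative-survival implementation):

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated event counts with extra-Poisson variation (negative binomial draws),
# fitted here with a plain intercept-only Poisson model for illustration.
n = 2000
mu_true = 5.0
counts = rng.negative_binomial(n=2, p=2 / (2 + mu_true), size=n)  # mean 5, variance 17.5

mu_hat = counts.mean()   # Poisson MLE of the rate in the intercept-only model

# Cameron-Trivedi style auxiliary statistic: z_i = ((y_i - mu)^2 - y_i) / mu.
# Under the Poisson null E[z] = 0; a clearly positive mean signals overdispersion.
z = ((counts - mu_hat) ** 2 - counts) / mu_hat
alpha_hat = z.mean() / mu_hat                       # estimated overdispersion slope
t_stat = z.mean() / (z.std(ddof=1) / np.sqrt(n))    # score-test style t-statistic
```

A large positive `t_stat` rejects the equidispersion assumption, after which quasi-likelihood, robust standard errors or a negative binomial model can be substituted.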
Modified Nonparametric Kernel Estimates of a Regression Function and their Consistencies with Rates.
1985-04-01
Modified nonparametric kernel estimates of a regression function are studied. In each case the speed of convergence is examined, and an explicit bound for the mean square error, lacking to date in the literature for these estimates, is obtained, together with uniform bounds on sup_{x∈B} |ĝ(x) − g(x)|, both almost surely and in probability, from which the uniform weak consistency of the regression estimators is deduced.
Risser, Dennis W.; Thompson, Ronald E.; Stuckey, Marla H.
2008-01-01
A method was developed for making estimates of long-term, mean annual ground-water recharge from streamflow data at 80 streamflow-gaging stations in Pennsylvania. The method relates mean annual base-flow yield derived from the streamflow data (as a proxy for recharge) to the climatic, geologic, hydrologic, and physiographic characteristics of the basins (basin characteristics) by use of a regression equation. Base-flow yield is the base flow of a stream divided by the drainage area of the basin, expressed in inches of water basinwide. Mean annual base-flow yield was computed for the period of available streamflow record at continuous streamflow-gaging stations by use of the computer program PART, which separates base flow from direct runoff on the streamflow hydrograph. Base flow provides a reasonable estimate of recharge for basins where streamflow is mostly unaffected by upstream regulation, diversion, or mining. Twenty-eight basin characteristics were included in the exploratory regression analysis as possible predictors of base-flow yield. Basin characteristics found to be statistically significant predictors of mean annual base-flow yield during 1971-2000 at the 95-percent confidence level were (1) mean annual precipitation, (2) average maximum daily temperature, (3) percentage of sand in the soil, (4) percentage of carbonate bedrock in the basin, and (5) stream channel slope. The equation for predicting recharge was developed using ordinary least-squares regression. The standard error of prediction for the equation on log-transformed data was 9.7 percent, and the coefficient of determination was 0.80. The equation can be used to predict long-term, mean annual recharge rates for ungaged basins, providing that the explanatory basin characteristics can be determined and that the underlying assumption is accepted that base-flow yield derived from PART is a reasonable estimate of ground-water recharge rates. For example, application of the equation for 370
Tang, Yongqiang
2015-01-01
A sample size formula is derived for negative binomial regression for the analysis of recurrent events, in which subjects can have unequal follow-up time. We obtain sharp lower and upper bounds on the required size, which are easy to compute. The upper bound is generally only slightly larger than the required size, and hence can be used to approximate the sample size. The lower and upper size bounds can be decomposed into two terms. The first term relies on the mean number of events in each group, and the second term depends on two factors that measure, respectively, the extent of between-subject variability in event rates and in follow-up time. Simulation studies are conducted to assess the performance of the proposed method. An application of our formulae to a multiple sclerosis trial is provided.
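A closed-form calculation in this spirit can be sketched with a textbook-style approximation (assuming equal group sizes and equal follow-up, with Var(log rate ratio) ≈ (1/μ0 + 1/μ1 + 2k)/n for shared dispersion k; this is an illustration, not the sharp bounds derived in the paper):

```python
from math import ceil, log
from statistics import NormalDist

def nb_sample_size(mu0, mu1, dispersion, alpha=0.05, power=0.9):
    """Per-group size to detect rate ratio mu1/mu0 under a negative binomial model.

    mu0, mu1: expected event counts per subject in each group;
    dispersion: shared NB dispersion parameter k (k = 0 reduces to Poisson).
    """
    z_a = NormalDist().inv_cdf(1 - alpha / 2)    # two-sided critical value
    z_b = NormalDist().inv_cdf(power)            # power quantile
    var_unit = 1 / mu0 + 1 / mu1 + 2 * dispersion
    return ceil((z_a + z_b) ** 2 * var_unit / log(mu1 / mu0) ** 2)

n_per_group = nb_sample_size(mu0=1.0, mu1=0.7, dispersion=0.5)
```

Setting `dispersion=0` recovers the smaller Poisson-based size, which shows how between-subject variability inflates the requirement.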
Asymptotic theory of nonparametric regression estimates with censored data
Institute of Scientific and Technical Information of China (English)
施沛德; 王海燕; 张利华
2000-01-01
For regression analysis, some useful information may have been lost when the responses are right censored. To estimate nonparametric functions, several estimates based on censored data have been proposed and their consistency and convergence rates have been studied in the literature, but the optimal rates of global convergence have not been obtained yet. Because of the possible information loss, one may think that it is impossible for an estimate based on censored data to achieve the optimal rates of global convergence for nonparametric regression, which were established by Stone based on complete data. This paper constructs a regression spline estimate of a general nonparametric regression function based on right-censored response data, and proves, under some regularity conditions, that this estimate achieves the optimal rates of global convergence for nonparametric regression. Since the parameters for the nonparametric regression estimate have to be chosen based on a data-driven criterion, we also obtain the asymptotic optimality of the AIC, AICC, GCV, Cp and FPE criteria in the process of selecting the parameters.
Parametric Regression Models Using Reversed Hazard Rates
Directory of Open Access Journals (Sweden)
Asokan Mulayath Variyath
2014-01-01
Proportional hazard regression models are widely used in survival analysis to understand and exploit the relationship between survival time and covariates. For left-censored survival times, reversed hazard rate functions are more appropriate. In this paper, we develop a parametric proportional hazard rates model using an inverted Weibull distribution. The estimation and construction of confidence intervals for the parameters are discussed. We assess the performance of the proposed procedure based on a large number of Monte Carlo simulations. We illustrate the proposed method using a real case example.
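As a concrete illustration, under one common parametrization of the inverted Weibull distribution (the paper's parametrization may differ), the reversed hazard rate has a simple closed form:

```latex
% Inverted (inverse) Weibull with scale \lambda > 0 and shape \beta > 0:
F(t) = \exp\!\left(-\lambda t^{-\beta}\right), \qquad
f(t) = \lambda \beta\, t^{-(\beta+1)} \exp\!\left(-\lambda t^{-\beta}\right), \qquad t > 0,

% so the reversed hazard rate is
\tilde{r}(t) = \frac{f(t)}{F(t)} = \frac{d}{dt}\,\log F(t) = \lambda \beta\, t^{-(\beta+1)}.
```

A proportional reversed hazards regression then scales this baseline by a covariate factor such as \(\exp(x^{\top}\theta)\), which is what makes the model tractable for left-censored data.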
Nonlinear wavelet estimation of regression function with random design
Institute of Scientific and Technical Information of China (English)
张双林; 郑忠国
1999-01-01
The nonlinear wavelet estimator of a regression function with random design is constructed. The optimal uniform convergence rate of the estimator in a ball of the Besov space B^s_{p,q} is proved under quite general assumptions. The adaptive nonlinear wavelet estimator with near-optimal convergence rate in a wide range of smoothness function classes is also constructed. The properties of the nonlinear wavelet estimator given for random design regression, requiring only a bounded third-order moment of the error, can be compared with those of the nonlinear wavelet estimator given in the literature for equal-spaced fixed design regression with i.i.d. Gaussian error.
Principal component regression for crop yield estimation
Suryanarayana, T M V
2016-01-01
This book highlights the estimation of crop yield in Central Gujarat, especially with regard to the development of multiple regression models and principal component regression (PCR) models using climatological parameters as independent variables and crop yield as the dependent variable. It subsequently compares the multiple linear regression (MLR) and PCR results, and discusses the significance of PCR for crop yield estimation. In this context, the book also covers principal component analysis (PCA), a statistical procedure used to reduce a number of correlated variables into a smaller number of uncorrelated variables called principal components (PCs). This book will be helpful to students and researchers starting their work on climate and agriculture, mainly focussing on estimation models. The flow of chapters guides readers along a smooth path, from understanding climate and weather and the impact of climate change, gradually proceeding towards downscaling techniques and finally towards development of ...
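The PCR construction described here can be sketched in a few lines (an illustrative sketch with simulated stand-in data, not the book's models or data; the number of retained components is an assumption):

```python
import numpy as np

rng = np.random.default_rng(3)

# Illustrative stand-in for correlated climate covariates and a yield response.
n, p = 100, 5
Z = rng.normal(size=(n, 3))
X = Z @ rng.normal(size=(3, p)) + 0.1 * rng.normal(size=(n, p))  # collinear predictors
y = X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.5, size=n)

def pcr_fit(X, y, n_components):
    """Principal component regression: regress y on the leading PCs of centred X."""
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt[:n_components].T               # PC scores
    gamma, *_ = np.linalg.lstsq(scores, y - y.mean(), rcond=None)
    beta = Vt[:n_components].T @ gamma              # map back to predictor space
    return beta, y.mean() - X.mean(axis=0) @ beta

beta, intercept = pcr_fit(X, y, n_components=3)
y_hat = X @ beta + intercept
r2 = 1 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)
```

Dropping the trailing components discards the directions most inflated by collinearity, which is the sense in which PCR stabilizes the MLR fit.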
Adaptive Rank Penalized Estimators in Multivariate Regression
Bunea, Florentina; Wegkamp, Marten
2010-01-01
We introduce a new criterion, the Rank Selection Criterion (RSC), for selecting the optimal reduced rank estimator of the coefficient matrix in multivariate response regression models. The corresponding RSC estimator minimizes the Frobenius norm of the fit plus a regularization term proportional to the number of parameters in the reduced rank model. The rank of the RSC estimator provides a consistent estimator of the rank of the coefficient matrix. The consistency results are valid not only in the classic asymptotic regime, when the number of responses $n$ and predictors $p$ stays bounded, and the number of observations $m$ grows, but also when either, or both, $n$ and $p$ grow, possibly much faster than $m$. Our finite sample prediction and estimation performance bounds show that the RSC estimator achieves the optimal balance between the approximation error and the penalty term. Furthermore, our procedure has very low computational complexity, linear in the number of candidate models, making it particularly ...
Change-point estimation for censored regression model
Institute of Scientific and Technical Information of China (English)
Zhan-feng WANG; Yao-hua WU; Lin-cheng ZHAO
2007-01-01
In this paper, we consider the change-point estimation in the censored regression model assuming that there exists one change point. A nonparametric estimate of the change-point is proposed and is shown to be strongly consistent. Furthermore, its convergence rate is also obtained.
ASYMPTOTIC EFFICIENT ESTIMATION IN SEMIPARAMETRIC NONLINEAR REGRESSION MODELS
Institute of Scientific and Technical Information of China (English)
Zhu Zhongyi; Wei Bocheng
1999-01-01
In this paper, the estimation method based on the "generalized profile likelihood" for the conditionally parametric models in the paper by Severini and Wong (1992) is extended to fixed design semiparametric nonlinear regression models. For these models, the resulting estimator of the parametric component is shown to be asymptotically efficient, and the strong convergence rate of the nonparametric component is investigated. Many results (for example, Chen (1988), Gao & Zhao (1993), Rice (1986), et al.) are extended to fixed design semiparametric nonlinear regression models.
Remaining Phosphorus Estimate Through Multiple Regression Analysis
Institute of Scientific and Technical Information of China (English)
M. E. ALVES; A. LAVORENTI
2006-01-01
The remaining phosphorus (Prem), the P concentration that remains in solution after shaking soil with 0.01 mol L-1 CaCl2 containing 60 μg mL-1 P, is a very useful index for studies related to the chemistry of variable charge soils. Although the Prem determination is a simple procedure, the possibility of estimating accurate values of this index from easily and/or routinely determined soil properties can be very useful for practical purposes. The present research evaluated Prem estimation through multiple regression analysis in which routinely determined soil chemical data, soil clay content and soil pH measured in 1 mol L-1 NaF (pHNaF) figured as Prem predictor variables. Prem can be estimated with acceptable accuracy using the above-mentioned approach, and pHNaF not only substitutes for clay content as a predictor variable but also confers more accuracy on the Prem estimates.
Estimation of pyrethroid pesticide intake using regression ...
Population-based estimates of pesticide intake are needed to characterize exposure for particular demographic groups based on their dietary behaviors. Regression modeling performed on measurements of selected pesticides in composited duplicate diet samples allowed (1) estimation of pesticide intakes for a defined demographic community, and (2) comparison of dietary pesticide intakes between the composite and individual samples. Extant databases were useful for assigning individual samples to composites, but they could not provide the breadth of information needed to facilitate measurable levels in every composite. Composite sample measurements were found to be good predictors of pyrethroid pesticide levels in their individual sample constituents where sufficient measurements are available above the method detection limit. Statistical inference shows little evidence of differences between individual and composite measurements and suggests that regression modeling of food groups based on composite dietary samples may provide an effective tool for estimating dietary pesticide intake for a defined population. The research presented in the journal article will improve the community's ability to determine exposures through the dietary route with a less burdensome and costly method.
Spectral Experts for Estimating Mixtures of Linear Regressions
Chaganty, Arun Tejasvi; Liang, Percy
2013-01-01
Discriminative latent-variable models are typically learned using EM or gradient-based optimization, which suffer from local optima. In this paper, we develop a new computationally efficient and provably consistent estimator for a mixture of linear regressions, a simple instance of a discriminative latent-variable model. Our approach relies on a low-rank linear regression to recover a symmetric tensor, which can be factorized into the parameters using a tensor power method. We prove rates of ...
Projection-type estimation for varying coefficient regression models
Lee, Young K; Park, Byeong U; 10.3150/10-BEJ331
2012-01-01
In this paper we introduce new estimators of the coefficient functions in the varying coefficient regression model. The proposed estimators are obtained by projecting the vector of the full-dimensional kernel-weighted local polynomial estimators of the coefficient functions onto a Hilbert space with a suitable norm. We provide a backfitting algorithm to compute the estimators. We show that the algorithm converges at a geometric rate under weak conditions. We derive the asymptotic distributions of the estimators and show that the estimators have the oracle properties. This is done for the general order of local polynomial fitting and for the estimation of the derivatives of the coefficient functions, as well as the coefficient functions themselves. The estimators turn out to have several theoretical and numerical advantages over the marginal integration estimators studied by Yang, Park, Xue and Härdle [J. Amer. Statist. Assoc. 101 (2006) 1212–1227].
Estimates on compressed neural networks regression.
Zhang, Yongquan; Li, Youmei; Sun, Jianyong; Ji, Jiabing
2015-03-01
When the number of neural elements n of a neural network is larger than the sample size m, the overfitting problem arises since there are more parameters than actual data (more variables than constraints). In order to overcome the overfitting problem, we propose to reduce the number of neural elements by using a compressed projection A which does not need to satisfy the Restricted Isometry Property (RIP). By applying probability inequalities and approximation properties of feedforward neural networks (FNNs), we prove that solving the FNN regression learning algorithm in the compressed domain instead of the original domain reduces the sample error at the price of an increased (but controlled) approximation error, where covering number theory is used to estimate the excess error, and an upper bound of the excess error is given.
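The compressed-domain idea can be sketched with random features and a Gaussian projection (an illustrative sketch only; the network, projection scaling and data here are assumptions, not the paper's construction or bounds):

```python
import numpy as np

rng = np.random.default_rng(6)

# Random-feature "network": n hidden units, far more than the m samples.
m, n, k = 50, 500, 20
x = rng.uniform(-1, 1, size=(m, 1))
y = np.cos(2 * x[:, 0]) + 0.05 * rng.normal(size=m)

W = rng.normal(size=(1, n))
b = rng.uniform(-1, 1, n)
H = np.tanh(x @ W + b)                      # m x n hidden-layer outputs (n >> m)

A = rng.normal(size=(n, k)) / np.sqrt(k)    # Gaussian compressed projection (no RIP check)
Hc = H @ A                                  # m x k compressed features

coef, *_ = np.linalg.lstsq(Hc, y, rcond=None)   # regression in the compressed domain
y_hat = Hc @ coef
mse = np.mean((y - y_hat) ** 2)
```

With k well below m, the compressed least-squares problem is overdetermined again, which is the mechanism by which the projection curbs overfitting.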
Ridge regression estimator: combining unbiased and ordinary ridge regression methods of estimation
Directory of Open Access Journals (Sweden)
Sharad Damodar Gore
2009-10-01
Statistical literature has several methods for coping with multicollinearity. This paper introduces a new shrinkage estimator, called modified unbiased ridge (MUR). This estimator is obtained from unbiased ridge regression (URR) in the same way that ordinary ridge regression (ORR) is obtained from ordinary least squares (OLS). Properties of MUR are derived. Results on its matrix mean squared error (MMSE) are obtained. MUR is compared with ORR and URR in terms of MMSE. These results are illustrated with an example based on data generated by Hoerl and Kennard (1975).
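The ORR building block referred to above can be sketched directly from its normal-equations form (illustrative only; the MUR and URR constructions follow the cited papers, not this sketch, and the near-collinear data are simulated):

```python
import numpy as np

rng = np.random.default_rng(4)

# Nearly collinear design: the setting where ridge-type shrinkage helps.
n = 60
x1 = rng.normal(size=n)
X = np.column_stack([x1, x1 + 0.01 * rng.normal(size=n)])
y = X @ np.array([1.0, 1.0]) + rng.normal(scale=0.1, size=n)

def ridge(X, y, k):
    """Ordinary ridge regression: (X'X + kI)^{-1} X'y (k = 0 gives OLS)."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y)

beta_ols = ridge(X, y, k=0.0)   # unstable under near-collinearity
beta_orr = ridge(X, y, k=1.0)   # shrunk, stable estimate
```

Writing both estimators in the eigenbasis of X'X shows each OLS component is multiplied by λ/(λ+k), so ridge always has the smaller coefficient norm.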
Regression Discontinuity Designs with Multiple Rating-Score Variables
Reardon, Sean F.; Robinson, Joseph P.
2012-01-01
In the absence of a randomized control trial, regression discontinuity (RD) designs can produce plausible estimates of the treatment effect on an outcome for individuals near a cutoff score. In the standard RD design, individuals with rating scores higher than some exogenously determined cutoff score are assigned to one treatment condition; those…
Nonparametric Regression Estimation for Multivariate Null Recurrent Processes
Directory of Open Access Journals (Sweden)
Biqing Cai
2015-04-01
This paper discusses nonparametric kernel regression with the regressor being a d-dimensional β-null recurrent process in the presence of conditional heteroscedasticity. We show that the mean function estimator is consistent with convergence rate √(n(T)h^d), where n(T) is the number of regenerations for a β-null recurrent process, and that the limiting distribution (with proper normalization) is normal. Furthermore, we show that the two-step estimator for the volatility function is consistent. The finite sample performance of the estimate is quite reasonable when the leave-one-out cross-validation method is used for bandwidth selection. We apply the proposed method to study the relationship of the Federal funds rate with 3-month and 5-year T-bill rates and discover the existence of nonlinearity in the relationship. Furthermore, the in-sample and out-of-sample performance of the nonparametric model is far better than that of the linear model.
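For reference, the basic kernel mean-function estimator underlying this analysis can be sketched as a Nadaraya-Watson (local constant) smoother, here on i.i.d. stand-in data rather than a null recurrent process:

```python
import numpy as np

rng = np.random.default_rng(5)

# Stand-in data: nonlinear mean function with heteroscedastic noise.
x = rng.uniform(-2, 2, size=400)
y = np.sin(x) + (0.2 + 0.1 * np.abs(x)) * rng.normal(size=x.size)

def nadaraya_watson(x, y, grid, h):
    """Local constant (Nadaraya-Watson) estimate of E[y | x] with a Gaussian kernel."""
    w = np.exp(-0.5 * ((grid[:, None] - x[None, :]) / h) ** 2)
    return (w @ y) / w.sum(axis=1)

grid = np.array([-1.0, 0.0, 1.0])
m_hat = nadaraya_watson(x, y, grid, h=0.3)
```

A second smoothing pass on the squared residuals gives the two-step volatility estimator mentioned in the abstract.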
A SAS-macro for estimation of the cumulative incidence using Poisson regression
DEFF Research Database (Denmark)
Waltoft, Berit Lindum
2009-01-01
The cumulative incidence of an event is a function of the hazard rates, and the hazard rates are often estimated by the Cox regression. This procedure may not be suitable for large studies due to limited computer resources. Instead one uses Poisson regression, which approximates the Cox regression. Rosthøj et al. presented a SAS-macro for the estimation of the cumulative incidences based on the Cox regression. I present the functional form of the probabilities and variances when using piecewise constant hazard rates, and a SAS-macro for the estimation using Poisson regression. The use of the macro is demonstrated through examples and compared to the macro presented by Rosthøj et al.
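The piecewise-constant-hazard functional form mentioned here can be sketched numerically: with constant cause-specific rates on each interval, the cumulative incidence integrates in closed form interval by interval (illustrative rates and breakpoints, not the macro's SAS implementation):

```python
import numpy as np

# Piecewise constant cause-specific hazards on intervals [0,1), [1,2), [2,3).
breaks = np.array([0.0, 1.0, 2.0, 3.0])
lam_cancer = np.array([0.10, 0.15, 0.20])   # rate for the event of interest
lam_other = np.array([0.05, 0.05, 0.10])    # competing mortality rate

def cumulative_incidence(breaks, lam1, lam2):
    """CIF_1 at each breakpoint: sum over intervals of int lam1 * S(u) du,
    with S piecewise exponential in the total hazard lam1 + lam2."""
    widths = np.diff(breaks)
    total = lam1 + lam2
    surv, cif, out = 1.0, 0.0, [0.0]
    for w, l1, lt in zip(widths, lam1, total):
        # int_0^w l1 * surv * exp(-lt*u) du = surv * l1/lt * (1 - exp(-lt*w))
        cif += surv * l1 / lt * (1 - np.exp(-lt * w))
        surv *= np.exp(-lt * w)
        out.append(cif)
    return np.array(out)

cif = cumulative_incidence(breaks, lam_cancer, lam_other)
```

The cause-specific incidences telescope: summed over causes they equal one minus the overall survival, a useful check on any implementation.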
Improved estimation in a non-Gaussian parametric regression
Pchelintsev, Evgeny
2011-01-01
The paper considers the problem of estimating the parameters in a continuous time regression model with a non-Gaussian noise of pulse type. The noise is specified by the Ornstein-Uhlenbeck process driven by the mixture of a Brownian motion and a compound Poisson process. Improved estimates for the unknown regression parameters, based on a special modification of the James-Stein procedure with smaller quadratic risk than the usual least squares estimates, are proposed. The developed estimation scheme is applied for the improved parameter estimation in the discrete time regression with the autoregressive noise depending on unknown nuisance parameters.
Regression on manifolds: Estimation of the exterior derivative
Aswani, Anil; Tomlin, Claire; 10.1214/10-AOS823
2011-01-01
Collinearity and near-collinearity of predictors cause difficulties when doing regression. In these cases, variable selection becomes untenable because of mathematical issues concerning the existence and numerical stability of the regression coefficients, and interpretation of the coefficients is ambiguous because gradients are not defined. Using a differential geometric interpretation, in which the regression coefficients are interpreted as estimates of the exterior derivative of a function, we develop a new method to do regression in the presence of collinearities. Our regularization scheme can improve estimation error, and it can be easily modified to include lasso-type regularization. These estimators also have simple extensions to the "large $p$, small $n$" context.
A logistic regression estimating function for spatial Gibbs point processes
DEFF Research Database (Denmark)
Baddeley, Adrian; Coeurjolly, Jean-François; Rubak, Ege
We propose a computationally efficient logistic regression estimating function for spatial Gibbs point processes. The sample points for the logistic regression consist of the observed point pattern together with a random pattern of dummy points. The estimating function is closely related...
Abnormal behavior of the least squares estimate of multiple regression
Institute of Scientific and Technical Information of China (English)
陈希孺; 安鸿志
1997-01-01
An example is given to reveal the abnormal behavior of the least squares estimate in multiple regression. It is shown that the least squares estimate of the multiple linear regression may be "improved" in the sense of weak consistency when nuisance parameters are introduced into the model. A discussion of the implications of this finding is given.
Generalized and synthetic regression estimators for randomized branch sampling
David L. R. Affleck; Timothy G. Gregoire
2015-01-01
In felled-tree studies, ratio and regression estimators are commonly used to convert more readily measured branch characteristics to dry crown mass estimates. In some cases, data from multiple trees are pooled to form these estimates. This research evaluates the utility of both tactics in the estimation of crown biomass following randomized branch sampling (...
Nonparametric Least Squares Estimation of a Multivariate Convex Regression Function
Seijo, Emilio
2010-01-01
This paper deals with the consistency of the least squares estimator of a convex regression function when the predictor is multidimensional. We characterize and discuss the computation of such an estimator via the solution of certain quadratic and linear programs. Mild sufficient conditions for the consistency of this estimator and its subdifferentials in fixed and stochastic design regression settings are provided. We also consider a regression function which is known to be convex and componentwise nonincreasing and discuss the characterization, computation and consistency of its least squares estimator.
Remodeling and Estimation for Sparse Partially Linear Regression Models
Directory of Open Access Journals (Sweden)
Yunhui Zeng
2013-01-01
When the dimension of covariates in the regression model is high, one usually uses a submodel as a working model that contains significant variables. But it may be highly biased, and the resulting estimator of the parameter of interest may be very poor when the coefficients of removed variables are not exactly zero. In this paper, based on the selected submodel, we introduce a two-stage remodeling method to get a consistent estimator for the parameter of interest. More precisely, in the first stage, by a multistep adjustment, we reconstruct an unbiased model based on the correlation information between the covariates; in the second stage, we further reduce the adjusted model by a semiparametric variable selection method and get a new estimator of the parameter of interest simultaneously. Its convergence rate and asymptotic normality are also obtained. The simulation results further illustrate that the new estimator outperforms those obtained by the submodel and the full model in the sense of mean square errors of point estimation and mean square prediction errors of model prediction.
Regression Models and Fuzzy Logic Prediction of TBM Penetration Rate
Minh, Vu Trieu; Katushin, Dmitri; Antonov, Maksim; Veinthal, Renno
2017-03-01
This paper presents statistical analyses of rock engineering properties and the measured penetration rate of tunnel boring machine (TBM) based on the data of an actual project. The aim of this study is to analyze the influence of rock engineering properties including uniaxial compressive strength (UCS), Brazilian tensile strength (BTS), rock brittleness index (BI), the distance between planes of weakness (DPW), and the alpha angle (Alpha) between the tunnel axis and the planes of weakness on the TBM rate of penetration (ROP). Four (4) statistical regression models (two linear and two nonlinear) are built to predict the ROP of TBM. Finally a fuzzy logic model is developed as an alternative method and compared to the four statistical regression models. Results show that the fuzzy logic model provides better estimations and can be applied to predict the TBM performance. The R-squared value (R2) of the fuzzy logic model scores the highest value of 0.714 over the second runner-up of 0.667 from the multiple variables nonlinear regression model.
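The linear side of the comparison above can be sketched in a few lines: an ordinary least squares fit with an R-squared computation. The covariate names follow the abstract, but the coefficients and data below are invented, and the paper's fuzzy logic model is not reproduced.

```python
import numpy as np

def fit_linear_model(X, y):
    """Ordinary least squares with intercept; returns (coefficients, R^2)."""
    A = np.column_stack([np.ones(len(X)), X])     # prepend intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return beta, 1.0 - ss_res / ss_tot

# Synthetic stand-in for the five rock properties (UCS, BTS, BI, DPW, Alpha);
# these coefficients are made up, not the paper's fitted values.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 5))
y = X @ np.array([0.5, -0.3, 0.2, 0.1, -0.4]) + 2.0 + rng.normal(scale=0.1, size=40)
beta, r2 = fit_linear_model(X, y)
```

The same R-squared computed for each candidate model is what the abstract's 0.714 vs. 0.667 comparison rests on.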
PARAMETER ESTIMATION IN LINEAR REGRESSION MODELS FOR LONGITUDINAL CONTAMINATED DATA
Institute of Scientific and Technical Information of China (English)
Qian Weimin; Li Yumei
2005-01-01
The parameter estimation and the coefficient of contamination for regression models with repeated measures are studied when the response variables are contaminated by another random variable sequence. Under suitable conditions it is proved that the estimators established in the paper are strongly consistent.
Parameters Estimation of Geographically Weighted Ordinal Logistic Regression (GWOLR) Model
Zuhdi, Shaifudin; Retno Sari Saputro, Dewi; Widyaningsih, Purnami
2017-06-01
A regression model represents the relationship between independent and dependent variables. In a logistic regression model the dependent variable is categorical; when the dependent variable has ordered levels, the model is an ordinal logistic regression. The GWOLR model is an ordinal logistic regression model influenced by the geographical location of the observation site. Parameter estimation is needed to determine population values from a sample. The purpose of this research is to estimate the parameters of a GWOLR model using R software. The estimation uses data on the number of dengue fever patients in Semarang City, with 144 villages in Semarang City as observation units. The research yields a local GWOLR model for each village and the probabilities of the dengue fever patient-count categories.
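The geographic weighting that distinguishes GWOLR from ordinary ordinal logistic regression can be illustrated in isolation. A minimal sketch of Gaussian kernel weights, assuming a hypothetical bandwidth and toy coordinates rather than the Semarang village data:

```python
import numpy as np

def gwr_weights(coords, site, bandwidth):
    """Gaussian kernel weights for a geographically weighted fit centred on
    `site`; observations farther away contribute less. The bandwidth is a
    tuning constant (the value below is arbitrary)."""
    d = np.linalg.norm(coords - site, axis=1)
    return np.exp(-0.5 * (d / bandwidth) ** 2)

# Toy coordinates standing in for village locations (not the Semarang data).
coords = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 4.0]])
w = gwr_weights(coords, site=np.array([0.0, 0.0]), bandwidth=2.0)
```

In GWOLR these weights multiply each observation's contribution to the local log-likelihood, so every site gets its own fitted coefficient vector.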
Sparse reduced-rank regression with covariance estimation
Chen, Lisha
2014-12-08
Improving the predicting performance of the multiple response regression compared with separate linear regressions is a challenging question. On the one hand, it is desirable to seek model parsimony when facing a large number of parameters. On the other hand, for certain applications it is necessary to take into account the general covariance structure for the errors of the regression model. We assume a reduced-rank regression model and work with the likelihood function with general error covariance to achieve both objectives. In addition we propose to select relevant variables for reduced-rank regression by using a sparsity-inducing penalty, and to estimate the error covariance matrix simultaneously by using a similar penalty on the precision matrix. We develop a numerical algorithm to solve the penalized regression problem. In a simulation study and real data analysis, the new method is compared with two recent methods for multivariate regression and exhibits competitive performance in prediction and variable selection.
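Without the sparsity and covariance penalties, the reduced-rank backbone of such a model can be sketched as a projection of the OLS solution; this is the classical rank-constrained estimator, not the authors' penalized procedure.

```python
import numpy as np

def reduced_rank_regression(X, Y, rank):
    """Rank-constrained least squares (identity error covariance, no sparsity
    penalty): project the OLS coefficient matrix onto the leading
    right-singular directions of the fitted values."""
    B_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)
    _, _, Vt = np.linalg.svd(X @ B_ols, full_matrices=False)
    P = Vt[:rank].T @ Vt[:rank]                 # projector onto top directions
    return B_ols @ P

rng = np.random.default_rng(1)
X = rng.normal(size=(60, 5))
B_true = np.outer(rng.normal(size=5), rng.normal(size=4))   # rank-1 signal
Y = X @ B_true + 0.01 * rng.normal(size=(60, 4))
B_hat = reduced_rank_regression(X, Y, rank=1)
```

The paper's contribution is to add a sparsity-inducing penalty on the coefficients and a matching penalty on the error precision matrix on top of this rank constraint.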
Trimmed Likelihood-based Estimation in Binary Regression Models
Cizek, P.
2005-01-01
The binary-choice regression models such as probit and logit are typically estimated by the maximum likelihood method. To improve its robustness, various M-estimation based procedures were proposed, which however require bias corrections to achieve consistency and their resistance to outliers is rela
FUNCTIONAL-COEFFICIENT REGRESSION MODEL AND ITS ESTIMATION
Institute of Scientific and Technical Information of China (English)
Anonymous
2001-01-01
In this paper, a class of functional-coefficient regression models is proposed and an estimation procedure based on locally weighted least squares is suggested. This class of models, with the proposed estimation method, is a powerful means for exploratory data analysis.
Institute of Scientific and Technical Information of China (English)
Anonymous
2007-01-01
Detecting plant health conditions plays a key role in farm pest management and crop protection. In this study, measurement of hyperspectral leaf reflectance in rice crop (Oryza sativa L.) was conducted on groups of healthy leaves and leaves infected by the fungus Bipolaris oryzae (Helminthosporium oryzae Breda de Haan) over the wavelength range from 350 to 2500 nm. The percentage of leaf surface lesions was estimated and defined as the disease severity. Statistical methods like multiple stepwise regression, principal component analysis and partial least-square regression were utilized to calculate and estimate the disease severity of rice brown spot at the leaf level. Our results revealed that multiple stepwise linear regression could efficiently estimate disease severity with three wavebands in seven steps. The root mean square errors (RMSEs) for the training (n=210) and testing (n=53) datasets were 6.5% and 5.8%, respectively. Principal component analysis showed that the first principal component could explain approximately 80% of the variance of the original hyperspectral reflectance. The regression model with the first two principal components predicted disease severity with RMSEs of 16.3% and 13.9% for the training and testing datasets, respectively. Partial least-square regression with seven extracted factors could most effectively predict disease severity compared with the other statistical methods, with RMSEs of 4.1% and 2.0% for the training and testing datasets, respectively. Our research demonstrates that it is feasible to estimate the disease severity of rice brown spot using hyperspectral reflectance data at the leaf level.
Estimating the effects of Exchange and Interest Rates on Stock ...
African Journals Online (AJOL)
Estimating the effects of Exchange and Interest Rates on Stock Market in ... The need to empirically determine the predictive power of exchange rate and ... Keywords: Exchange rate, interest rate, All-share index, multiple regression models
Software Effort Estimation with Ridge Regression and Evolutionary Attribute Selection
Papatheocharous, Efi; Andreou, Andreas S
2010-01-01
Software cost estimation is one of the prerequisite managerial activities carried out at the software development initiation stages and also repeated throughout the whole software life-cycle so that amendments to the total cost can be made. Typically, in software cost estimation, a selection of project attributes is employed to produce effort estimates of the expected human resources needed to deliver a software product. However, choosing the appropriate project cost drivers in each case requires a lot of experience and knowledge on the part of the project manager, which can only be obtained through years of software engineering practice. A number of studies indicate that popular methods applied in the literature for software cost estimation, such as linear regression, are not robust enough and do not yield accurate predictions. Recently the dual variables Ridge Regression (RR) technique has been used for effort estimation, yielding promising results. In this work we show that results may be further improved if an AI meth...
Parameter Estimation for Improving Association Indicators in Binary Logistic Regression
Directory of Open Access Journals (Sweden)
Mahdi Bashiri
2012-02-01
The aim of this paper is the estimation of binary logistic regression parameters by maximizing the log-likelihood function with improved association indicators. The parameter estimation steps are explained, measures of association are introduced, and their calculations are analyzed. Moreover, new association indicators based on membership degree levels are presented. Association measures reflect the number of success responses occurring against failures in a certain number of independent Bernoulli experiments. During parameter estimation, the values of existing indicators are not sensitive to the parameter values, whereas the proposed indicators are sensitive to the estimated parameters throughout the iterative procedure. The innovation of this study is therefore a new association indicator for binary logistic regression that is more sensitive to the estimated parameters while maximizing the log-likelihood in the iterative procedure.
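The underlying maximum-likelihood step can be sketched as a plain Newton-Raphson (IRLS) iteration; the paper's membership-degree association indicators are not reproduced here, and the data are synthetic.

```python
import numpy as np

def logistic_newton(X, y, iters=25):
    """Maximise the Bernoulli log-likelihood by Newton-Raphson (IRLS).
    Plain MLE only; no association indicators are computed here."""
    A = np.column_stack([np.ones(len(X)), X])   # design with intercept
    beta = np.zeros(A.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-A @ beta))     # fitted probabilities
        W = p * (1.0 - p)                       # Fisher information weights
        H = A.T @ (A * W[:, None])              # information matrix
        beta = beta + np.linalg.solve(H, A.T @ (y - p))
    return beta

rng = np.random.default_rng(2)
X = rng.normal(size=(400, 2))
true_beta = np.array([0.5, 1.0, -2.0])          # intercept and two slopes
p = 1.0 / (1.0 + np.exp(-(true_beta[0] + X @ true_beta[1:])))
y = (rng.random(400) < p).astype(float)
beta_hat = logistic_newton(X, y)
```

An association indicator of the kind the paper proposes would be recomputed from `p` at each iteration of this loop.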
A regression model to estimate regional ground water recharge.
Lorenz, David L; Delin, Geoffrey N
2007-01-01
A regional regression model was developed to estimate the spatial distribution of ground water recharge in subhumid regions. The regional regression recharge (RRR) model was based on a regression of basin-wide estimates of recharge from surface water drainage basins, precipitation, growing degree days (GDD), and average basin specific yield (SY). Decadal average recharge, precipitation, and GDD were used in the RRR model. The RRR estimates were derived from analysis of stream base flow using a computer program that was based on the Rorabaugh method. As expected, there was a strong correlation between recharge and precipitation. The model was applied to statewide data in Minnesota. Where precipitation was least in the western and northwestern parts of the state (50 to 65 cm/year), recharge computed by the RRR model also was lowest (0 to 5 cm/year). A strong correlation also exists between recharge and SY. SY was least in areas where glacial lake clay occurs, primarily in the northwest part of the state; recharge estimates in these areas were in the 0- to 5-cm/year range. In sand-plain areas where SY is greatest, recharge estimates were in the 15- to 29-cm/year range on the basis of the RRR model. Recharge estimates that were based on the RRR model compared favorably with estimates made on the basis of other methods. The RRR model can be applied in other subhumid regions where region wide data sets of precipitation, streamflow, GDD, and soils data are available.
Institute of Scientific and Technical Information of China (English)
Tao Hu; Heng-jian Cui; Xing-wei Tong
2009-01-01
This article considers a semiparametric varying-coefficient partially linear regression model with current status data. The model is a generalization of both the partially linear regression model and the varying-coefficient regression model, and allows one to explore the possibly nonlinear effect of a certain covariate on the response variable. A sieve maximum likelihood estimation method is proposed and the asymptotic properties of the proposed estimators are discussed. Under some mild conditions, the estimators are shown to be strongly consistent. The convergence rate of the estimator for the unknown smooth function is obtained, and the estimator for the unknown parameter is shown to be asymptotically efficient and normally distributed. Simulation studies are conducted to examine the small-sample properties of the proposed estimates, and a real dataset is used to illustrate our approach.
CONSERVATIVE ESTIMATING FUNCTION IN THE NONLINEAR REGRESSION MODEL WITH AGGREGATED DATA
Institute of Scientific and Technical Information of China (English)
Anonymous
2000-01-01
The purpose of this paper is to study the theory of conservative estimating functions in the nonlinear regression model with aggregated data. In this model, a quasi-score function with aggregated data is defined. When this function happens to be conservative, it is the projection of the true score function onto a class of estimating functions. By construction, the potential function for the projected score with aggregated data is obtained, which has some properties of a log-likelihood function.
Robust Bayesian Regularized Estimation Based on t Regression Model
Directory of Open Access Journals (Sweden)
Zean Li
2015-01-01
The t distribution is a useful extension of the normal distribution, which can be used for statistical modeling of data sets with heavy tails, and provides robust estimation. In this paper, in view of the advantages of Bayesian analysis, we propose a new robust coefficient estimation and variable selection method based on Bayesian adaptive Lasso t regression. A Gibbs sampler is developed based on the Bayesian hierarchical model framework, where we treat the t distribution as a mixture of normal and gamma distributions and put different penalization parameters on different regression coefficients. We also consider Bayesian t regression with an adaptive group Lasso and obtain the Gibbs sampler from the posterior distributions. Both simulation studies and a real data example show that our method performs well compared with other existing methods when the error distribution has heavy tails and/or outliers.
ESTIMATING PROJECT DEVELOPMENT EFFORT USING CLUSTERED REGRESSION APPROACH
Directory of Open Access Journals (Sweden)
Geeta Nagpal
2013-02-01
Due to the intangible nature of "software", accurate and reliable software effort estimation is a challenge in the software industry. Very accurate estimates of software development effort are unlikely because of the inherent uncertainty in software development projects and the complex and dynamic interaction of the factors that impact software development. Heterogeneity exists in software engineering datasets because data are made available from diverse sources. It can be reduced by defining certain relationships between the data values by classifying them into different clusters. This study focuses on how the combination of clustering and regression techniques can reduce the potential problems in predictive efficiency due to heterogeneity of the data. Using a clustered approach creates subsets of data having a degree of homogeneity that enhances prediction accuracy. It was also observed in this study that ridge regression performs better than the other regression techniques used in the analysis.
Geometric effects of fuel regression rate in hybrid rocket motors
Institute of Scientific and Technical Information of China (English)
CAI GuoBiao; ZHANG YuanJun; WANG PengFei; HUI Tian; ZHAO Sheng; YU NanJia
2016-01-01
The geometric configuration of the solid fuel is a key parameter affecting the fuel regression rate in hybrid rocket motors. In this paper, a semi-empirical regression rate model is developed to investigate the geometric effect on the fuel regression rate by incorporating the hydraulic diameter into the classical model. The semi-empirical model indicates that the fuel regression rate decreases with increasing hydraulic diameter and is proportional to d_h^(-0.2) when convective heat transfer is dominant. Then a numerical model considering turbulence, combustion, solid fuel pyrolysis, and a solid-gas coupling model is established to further investigate the geometric effect. Eight motors with different solid fuel grains are simulated, and four methods of scaling the regression rate between different solid fuel grains are compared. The results indicate that the solid fuel regression rates are approximately the same when the hydraulic diameters are equal. The numerical results verify the accuracy of the semi-empirical model.
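The hydraulic-diameter dependence stated above can be written as a one-line power law; the constants below are illustrative only, not the paper's fitted values.

```python
def regression_rate(a, G_ox, d_h, n=0.8, m=0.2):
    """Convective regression-rate law of the form r = a * G_ox**n / d_h**m.
    The d_h**-0.2 dependence follows the abstract; the constant a and the
    numbers used below are hypothetical, not the paper's fit."""
    return a * G_ox ** n / d_h ** m

r_small = regression_rate(a=0.1, G_ox=100.0, d_h=0.05)   # smaller port
r_large = regression_rate(a=0.1, G_ox=100.0, d_h=0.10)   # doubled diameter
```

Doubling the hydraulic diameter at fixed oxidizer flux reduces the rate by the factor 2^0.2, consistent with the stated exponent.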
Estimate of error bounds in the improved support vector regression
Institute of Scientific and Technical Information of China (English)
SUN Yanfeng; LIANG Yanchun; WU Chunguo; YANG Xiaowei; LEE Heow Pueh; LIN Wu Zhong
2004-01-01
An estimate of a generalization error bound of the improved support vector regression (SVR) is provided based on our previous work. The boundedness of the error of the improved SVR is proved when the algorithm is applied to function approximation.
An Affine Invariant $k$-Nearest Neighbor Regression Estimate
Biau, Gérard; Dujmovic, Vida; Krzyzak, Adam
2012-01-01
We design a data-dependent metric in $\mathbb{R}^d$ and use it to define the $k$-nearest neighbors of a given point. Our metric is invariant under all affine transformations. We show that, with this metric, the standard $k$-nearest neighbor regression estimate is asymptotically consistent under the usual conditions on $k$, and minimal requirements on the input data.
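The construction can be sketched with the sample-covariance Mahalanobis metric, one natural data-dependent metric that yields exact affine invariance (an illustration, not necessarily the authors' metric).

```python
import numpy as np

def knn_regress(X, y, query, k):
    """k-NN regression under the Mahalanobis metric of the sample covariance;
    the induced neighbourhoods are invariant to affine transformations."""
    S_inv = np.linalg.inv(np.cov(X, rowvar=False))
    diff = X - query
    d2 = np.einsum('ij,jk,ik->i', diff, S_inv, diff)   # squared distances
    idx = np.argsort(d2)[:k]
    return y[idx].mean()

rng = np.random.default_rng(3)
X = rng.normal(size=(200, 2))
y = X[:, 0] + 0.1 * rng.normal(size=200)
pred = knn_regress(X, y, query=np.array([0.0, 0.0]), k=10)

# Transforming data and query by the same invertible matrix leaves the
# estimate unchanged, since the covariance transforms along with the data.
A = np.array([[2.0, 0.5], [0.0, 1.0]])
pred_t = knn_regress(X @ A.T, y, query=A @ np.array([0.0, 0.0]), k=10)
```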
Two biased estimation techniques in linear regression: Application to aircraft
Klein, Vladislav
1988-01-01
Several ways for detection and assessment of collinearity in measured data are discussed. Because data collinearity usually results in poor least squares estimates, two estimation techniques which can limit a damaging effect of collinearity are presented. These two techniques, the principal components regression and mixed estimation, belong to a class of biased estimation techniques. Detection and assessment of data collinearity and the two biased estimation techniques are demonstrated in two examples using flight test data from longitudinal maneuvers of an experimental aircraft. The eigensystem analysis and parameter variance decomposition appeared to be a promising tool for collinearity evaluation. The biased estimators had far better accuracy than the results from the ordinary least squares technique.
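Of the two techniques, principal components regression is the easier to sketch; here is a minimal version that regresses on the leading components of a collinear design (synthetic data, not the flight-test example).

```python
import numpy as np

def pcr(X, y, n_comp):
    """Principal components regression: regress y on the leading principal
    components of X, then map the coefficients back to the original variables."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    Z = Xc @ Vt[:n_comp].T                      # scores on leading components
    gamma, *_ = np.linalg.lstsq(Z, y - y.mean(), rcond=None)
    return Vt[:n_comp].T @ gamma                # coefficients for centred X

rng = np.random.default_rng(4)
t = rng.normal(size=100)
X = np.column_stack([t, t + 1e-3 * rng.normal(size=100),  # nearly collinear pair
                     rng.normal(size=100)])
y = X[:, 0] + X[:, 1] + 0.5 * X[:, 2]
beta = pcr(X, y, n_comp=2)
resid = y - ((X - X.mean(axis=0)) @ beta + y.mean())
```

Dropping the smallest-variance direction (the ill-determined difference of the collinear pair) is exactly what limits the damaging variance inflation that collinearity causes in ordinary least squares.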
Posture Estimation by Using High Frequency Markers and Kernel Regressions
Ono, Yuya; Iwai, Yoshio; Ishiguro, Hiroshi
Recently, the research fields of augmented reality and robot navigation have been actively investigated. Estimating the relative posture between an object and a camera is an important task in these fields. In this paper, we propose a novel method for posture estimation using high frequency markers and kernel regressions. The markers are embedded in an object's texture in the high frequency domain. We observe the change in spatial frequency of the object's texture to estimate the current posture of the object. We conduct experiments to show the effectiveness of our method.
Kernel regression estimates of time delays between gravitationally lensed fluxes
Otaibi, Sultanah AL; Cuevas-Tello, Juan C; Mandel, Ilya; Raychaudhury, Somak
2015-01-01
Strongly lensed variable quasars can serve as precise cosmological probes, provided that time delays between the image fluxes can be accurately measured. A number of methods have been proposed to address this problem. In this paper, we explore in detail a new approach based on kernel regression estimates, which is able to estimate a single time delay given several datasets for the same quasar. We develop realistic artificial data sets in order to carry out controlled experiments to test the performance of this new approach. We also test our method on real data from the strongly lensed quasar Q0957+561 and compare our estimates against existing results.
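The idea of smoothing irregularly sampled fluxes with kernel regression and scanning candidate delays can be sketched as follows; the light curves, noise level, and bandwidth are all invented for illustration and this is not the authors' estimator.

```python
import numpy as np

def nw_regress(x_train, y_train, x, h):
    """Nadaraya-Watson kernel regression with a Gaussian kernel, bandwidth h."""
    w = np.exp(-0.5 * ((x[:, None] - x_train[None, :]) / h) ** 2)
    return (w @ y_train) / w.sum(axis=1)

# Two irregularly sampled, noisy copies of one light curve, one delayed.
rng = np.random.default_rng(7)
true_delay = 2.0
t1 = np.sort(rng.uniform(0.0, 20.0, 80))
t2 = np.sort(rng.uniform(0.0, 20.0, 80))
y1 = np.sin(0.5 * t1) + 0.05 * rng.normal(size=80)
y2 = np.sin(0.5 * (t2 - true_delay)) + 0.05 * rng.normal(size=80)

def delay_cost(d, h=1.0):
    """Shift image B by a candidate delay and compare the smoothed curves."""
    grid = np.linspace(4.0, 16.0, 60)           # interior grid, avoids edges
    return float(np.mean((nw_regress(t1, y1, grid, h)
                          - nw_regress(t2 - d, y2, grid, h)) ** 2))

delays = np.arange(0.0, 4.01, 0.1)
best_delay = delays[np.argmin([delay_cost(d) for d in delays])]
```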
Hybrid rocket fuel combustion and regression rate study
Strand, L. D.; Ray, R. L.; Anderson, F. A.; Cohen, N. S.
1992-01-01
The objectives of this study are to develop hybrid fuels (1) with higher regression rates and reduced dependence on fuel grain geometry and (2) that maximize potential specific impulse using low-cost materials. A hybrid slab window motor system was developed to screen candidate fuels - their combustion behavior and regression rate. Combustion behavior diagnostics consisted of video and high-speed motion picture coverage. The mean fuel regression rates were determined by before-and-after measurements of the fuel slabs. The fuel for this initial investigation consisted of hydroxyl-terminated polybutadiene binder with coal and aluminum fillers. At low oxidizer flux levels (and corresponding fuel regression rates) the filled-binder fuels burn in a layered fashion, forming an aluminum-containing binder/coal surface melt that, in turn, forms into filigrees or flakes that are stripped off by the crossflow. This melt process appears to diminish with increasing oxidizer flux level. Heat transfer by radiation is a significant contributor, producing the desired increase in magnitude and reduction in flow dependency (power law exponent) of the fuel regression rate.
Robust Regression for Slope Estimation in Curriculum-Based Measurement Progress Monitoring
Mercer, Sterett H.; Lyons, Alina F.; Johnston, Lauren E.; Millhoff, Courtney L.
2015-01-01
Although ordinary least-squares (OLS) regression has been identified as a preferred method to calculate rates of improvement for individual students during curriculum-based measurement (CBM) progress monitoring, OLS slope estimates are sensitive to the presence of extreme values. Robust estimators have been developed that are less biased by…
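One robust alternative of the kind alluded to is the Theil-Sen estimator (the median of all pairwise slopes); the sketch below shows how a single extreme probe score distorts the OLS slope but not the robust one. This is an illustration, not necessarily the estimator studied in the paper.

```python
import numpy as np
from itertools import combinations

def theil_sen_slope(t, y):
    """Median of all pairwise slopes: a robust alternative to the OLS slope
    for short progress-monitoring series."""
    slopes = [(y[j] - y[i]) / (t[j] - t[i])
              for i, j in combinations(range(len(t)), 2)]
    return float(np.median(slopes))

t = np.arange(10, dtype=float)                  # ten weekly probes
y = 2.0 * t + 1.0                               # true growth of 2 points/week
y[9] = 100.0                                    # one aberrant score
ols_slope = np.polyfit(t, y, 1)[0]              # badly inflated by the outlier
robust_slope = theil_sen_slope(t, y)            # recovers the true slope
```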
An Analysis of the Indicator Saturation Estimator as a Robust Regression Estimator
DEFF Research Database (Denmark)
Johansen, Søren; Nielsen, Bent
An algorithm suggested by Hendry (1999) for estimation in a regression with more regressors than observations is analyzed with the purpose of finding an estimator that is robust to outliers and structural breaks. This estimator is an example of a one-step M-estimator based on Huber's skip functi...
Approaches to Low Fuel Regression Rate in Hybrid Rocket Engines
Directory of Open Access Journals (Sweden)
Dario Pastrone
2012-01-01
Hybrid rocket engines are promising propulsion systems which present appealing features such as safety, low cost, and environmental friendliness. On the other hand, certain issues hamper their hoped-for development. The present paper discusses approaches addressing improvements to one of the most important of these issues: the low fuel regression rate. To highlight the consequences of this issue and to better understand the concepts proposed, fundamentals are summarized. Two approaches are presented (multiport grain and high mixture ratio) which aim at reducing negative effects without enhancing the regression rate. Furthermore, fuel material changes and nonconventional geometries of grain and/or injector are presented as methods to increase the fuel regression rate. Although most of these approaches are still at the laboratory or concept scale, many of them are promising.
Directory of Open Access Journals (Sweden)
Qiutong Jin
2016-06-01
Estimating the spatial distribution of precipitation is an important and challenging task in hydrology, climatology, ecology, and environmental science. In order to generate a highly accurate distribution map of average annual precipitation for the Loess Plateau in China, multiple linear regression Kriging (MLRK) and geographically weighted regression Kriging (GWRK) methods were employed using precipitation data from the period 1980-2010 from 435 meteorological stations. The predictors in regression Kriging were selected by stepwise regression analysis from many auxiliary environmental factors, such as elevation (DEM), normalized difference vegetation index (NDVI), solar radiation, slope, and aspect. All predictor distribution maps had a 500 m spatial resolution. Validation precipitation data from 130 hydrometeorological stations were used to assess the prediction accuracies of the MLRK and GWRK approaches. Results showed that both prediction maps with a 500 m spatial resolution interpolated by MLRK and GWRK had a high accuracy and captured detailed spatial distribution data; however, MLRK produced a lower prediction error and a higher variance explanation than GWRK, although the differences were small, in contrast to conclusions from similar studies.
Regression Equations for Stature Estimation among Medical Students of Ghaziabad
Rakhee Verma, Syed Esam Mahmood
2015-01-01
Introduction: Ossification and maturation in the foot occurs earlier than in the long bones and therefore, during adolescence, height could be more accurately predicted from foot measurement as compared to that from long bones. This study was undertaken to find out the correlation between foot length and height of an individual and to derive regression formulae to estimate the height from the foot length in the study population. Materials & Method: This cross-sectional study was cond...
Efficient robust nonparametric estimation in a semimartingale regression model
Konev, Victor
2010-01-01
The paper considers the problem of robustly estimating a periodic function in a continuous time regression model with dependent disturbances given by a general square integrable semimartingale with unknown distribution. An example of such a noise is the non-Gaussian Ornstein-Uhlenbeck process with the Lévy process subordinator, which is used to model financial Black-Scholes type markets with jumps. An adaptive model selection procedure, based on the weighted least squares estimates, is proposed. Under general moment conditions on the noise distribution, sharp non-asymptotic oracle inequalities for the robust risks have been derived and the robust efficiency of the model selection procedure has been shown.
Robust Nonlinear Regression in Enzyme Kinetic Parameters Estimation
Directory of Open Access Journals (Sweden)
Maja Marasović
2017-01-01
Accurate estimation of essential enzyme kinetic parameters, such as Km and Vmax, is very important in modern biology. To date, linearization of kinetic equations is still widely established practice for determining these parameters in chemical and enzyme catalysis. Although the simplicity of linear optimization is alluring, these methods have certain pitfalls due to which they more often than not result in misleading estimation of enzyme parameters. In order to obtain more accurate predictions of parameter values, the use of nonlinear least-squares fitting techniques is recommended. However, when there are outliers present in the data, these techniques become unreliable. This paper proposes the use of a robust nonlinear regression estimator based on a modified Tukey biweight function that can provide more resilient results in the presence of outliers and/or influential observations. Real and synthetic kinetic data have been used to test our approach. Monte Carlo simulations are performed to illustrate the efficacy and the robustness of the biweight estimator in comparison with the standard linearization methods and ordinary least-squares nonlinear regression. We then apply this method to experimental data for the tyrosinase enzyme (EC 1.14.18.1) extracted from Solanum tuberosum, Agaricus bisporus, and Pleurotus ostreatus. The results on both artificial and experimental data clearly show that the proposed robust estimator can be successfully employed to determine accurate values of Km and Vmax.
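The biweight reweighting at the heart of such an estimator can be sketched in a few lines. For brevity the IRLS loop below fits a straight line rather than the Michaelis-Menten curve; the weighting idea is the same, applied here to a simpler model.

```python
import numpy as np

def tukey_biweight_weights(r, c=4.685):
    """Tukey biweight: weights decay smoothly and hit zero for scaled
    residuals with |r| > c, so gross outliers are ignored entirely."""
    u = np.clip(np.abs(r) / c, 0.0, 1.0)
    return (1.0 - u ** 2) ** 2

def irls_line(x, y, iters=20):
    """Biweight IRLS for a straight line; a robust nonlinear fit applies the
    same reweighting inside a Gauss-Newton step for v = Vmax*S/(Km+S)."""
    A = np.column_stack([np.ones_like(x), x])
    beta = np.linalg.lstsq(A, y, rcond=None)[0]
    for _ in range(iters):
        r = y - A @ beta
        # Robust scale via the median absolute deviation (MAD).
        scale = 1.4826 * np.median(np.abs(r - np.median(r))) + 1e-12
        w = tukey_biweight_weights(r / scale)
        sw = np.sqrt(w)
        beta = np.linalg.lstsq(sw[:, None] * A, sw * y, rcond=None)[0]
    return beta

x = np.arange(10, dtype=float)
y = 1.0 + 0.5 * x
y[0] = 25.0                                     # gross outlier
b0, b1 = irls_line(x, y)                        # intercept and slope recovered
```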
Regression Rate Study in HTPB/GOX Hybrid Rocket Motors.
Directory of Open Access Journals (Sweden)
Philmon George
1996-12-01
The theoretical and experimental studies on hybrid rocket motor combustion research are briefly reviewed and the need for a clear understanding of the hybrid rocket fuel regression rate mechanism is brought out. A test facility established at the Indian Institute of Technology, Madras, for hybrid rocket motor research is described. The results of an experimental study on a hydroxyl-terminated polybutadiene and gaseous oxygen hybrid rocket motor are presented. Fuel grains with ammonium perchlorate "additive" have shown enhanced oxidizer mass flux dependence. Smaller grains have higher regression rates than the larger ones.
Semi-coherent time of arrival estimation using regression.
Apartsin, Alexander; Cooper, Leon N; Intrator, Nathan
2012-08-01
Time of arrival (ToA) estimation is essential for many types of remote sensing applications including radar, sonar, and underground exploration. The standard method for ToA estimation employs a matched filter for computing the maximum likelihood estimator (MLE) for ToA. The accuracy of the MLE decreases rapidly whenever the amount of noise in a received signal rises above a certain threshold. This well-known threshold effect is unavoidable in several important applications due to various limitations on the power and the spectrum of a narrowband source pulse. A measurement performed in the presence of the threshold effect employs a receiver which operates in the semi-coherent state. Therefore, the conventional methods assuming a coherent state receiver should be adapted to the semi-coherent case. In this paper, a biosonar-inspired method for semi-coherent ToA estimation is described. The method abandons the exploration of an echo signal by a single matched filter in favor of analysis by multiple phase-shifted unmatched filters. Each phase-shifted unmatched filter gives rise to a biased ToA estimator. The described method uses regression for combining these estimators into a single unbiased ToA estimator that outperforms the MLE in the presence of the threshold effect.
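The coherent baseline, the matched-filter MLE, can be sketched as a cross-correlation peak search; the semi-coherent multi-filter regression itself is not reproduced here, and all signal parameters below are invented.

```python
import numpy as np

def matched_filter_toa(rx, pulse, fs):
    """Coherent ML ToA under white noise: the peak of the cross-correlation
    between the received signal and the pulse template."""
    corr = np.correlate(rx, pulse, mode='valid')
    return np.argmax(corr) / fs

fs = 1000.0                                     # sample rate, Hz (illustrative)
t = np.arange(0, 0.05, 1 / fs)                  # 50 ms narrowband pulse
pulse = np.sin(2 * np.pi * 200 * t) * np.hanning(len(t))
rx = np.zeros(500)
delay_samples = 120                             # true arrival at 120 ms
rx[delay_samples:delay_samples + len(pulse)] += pulse
rng = np.random.default_rng(8)
rx += 0.05 * rng.normal(size=500)               # additive white noise
toa = matched_filter_toa(rx, pulse, fs)
```

At higher noise levels the correlation peak can jump to a sidelobe of the narrowband pulse, which is precisely the threshold effect the paper's regression over phase-shifted filters is designed to mitigate.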
Wavelet Estimators in Nonparametric Regression: A Comparative Simulation Study
Directory of Open Access Journals (Sweden)
Anestis Antoniadis
2001-06-01
Wavelet analysis has been found to be a powerful tool for the nonparametric estimation of spatially-variable objects. We discuss in detail wavelet methods in nonparametric regression, where the data are modelled as observations of a signal contaminated with additive Gaussian noise, and provide an extensive review of the vast literature of wavelet shrinkage and wavelet thresholding estimators developed to denoise such data. These estimators arise from a wide range of classical and empirical Bayes methods treating either individual or blocks of wavelet coefficients. We compare various estimators in an extensive simulation study on a variety of sample sizes, test functions, signal-to-noise ratios and wavelet filters. Because there is no single criterion that can adequately summarise the behaviour of an estimator, we use various criteria to measure performance in finite sample situations. Insight into the performance of these estimators is obtained from graphical outputs and numerical tables. In order to provide some hints of how these estimators should be used to analyse real data sets, a detailed practical step-by-step illustration of a wavelet denoising analysis on electricity consumption data is provided. Matlab codes are provided so that all figures and tables in this paper can be reproduced.
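The basic shrink-in-the-wavelet-domain recipe reviewed above can be sketched with a one-level Haar transform and soft thresholding (a minimal illustration, not any specific estimator from the study):

```python
import numpy as np

def haar_denoise_one_level(x, lam):
    """One-level orthonormal Haar decomposition, soft-threshold the detail
    coefficients, reconstruct. Multi-level transforms and the other shrinkage
    rules surveyed follow the same pattern."""
    s = (x[0::2] + x[1::2]) / np.sqrt(2.0)      # smooth coefficients
    d = (x[0::2] - x[1::2]) / np.sqrt(2.0)      # detail coefficients
    d = np.sign(d) * np.maximum(np.abs(d) - lam, 0.0)   # soft thresholding
    out = np.empty_like(x)                      # inverse one-level Haar
    out[0::2] = (s + d) / np.sqrt(2.0)
    out[1::2] = (s - d) / np.sqrt(2.0)
    return out

rng = np.random.default_rng(5)
signal = np.repeat([0.0, 5.0], 16)              # piecewise-constant test signal
noisy = signal + 0.5 * rng.normal(size=32)
# Universal threshold sigma * sqrt(2 log n), here with known sigma = 0.5.
denoised = haar_denoise_one_level(noisy, lam=0.5 * np.sqrt(2.0 * np.log(32)))
```

Because the transform is orthonormal, shrinking noise-dominated detail coefficients toward zero directly reduces mean squared error in the signal domain.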
Energy Technology Data Exchange (ETDEWEB)
Glascoe, E
2008-08-11
It is estimated that PBXN-110 will burn laminarly with a burn function of B = (0.6-1.3) * P^1.0, where B is the burn rate in mm/s and P is the pressure in MPa. This paper provides a brief discussion of how this burn behavior was estimated.
K factor estimation in distribution transformers using linear regression models
Directory of Open Access Journals (Sweden)
Juan Miguel Astorga Gómez
2016-06-01
Background: Due to the massive incorporation of electronic equipment into distribution systems, distribution transformers are subject to operating conditions other than the design ones because of the circulation of harmonic currents. It is necessary to quantify the effect produced by these harmonic currents to determine the capacity of the transformer to withstand the new operating conditions. The K-factor is an indicator that estimates the ability of a transformer to withstand the thermal effects caused by harmonic currents. This article presents a linear regression model to estimate the value of the K-factor from the total current harmonic content obtained with low-cost equipment. Method: Two distribution transformers that feed different loads are studied; the variables current total harmonic distortion (THDi) and K-factor are recorded, and the regression model that best fits the field data is determined. To select the regression model, the coefficient of determination R2 and the Akaike Information Criterion (AIC) are used. With the selected model, the K-factor is estimated under actual operating conditions. Results: Once the model was determined, it was found that for both the agricultural and the industrial mining loads, the present harmonic content (THDi) exceeds the values that these transformers can handle (average of 12.54% and minimum of 8.90% in the agricultural case, and average of 18.53% and minimum of 6.80% in the industrial mining case). Conclusions: When estimating the K-factor using polynomial models, it was determined that the studied transformers cannot withstand the current total harmonic distortion of their loads. The appropriate K-factor for the studied transformers should be 4; this would allow the transformers to support the current total harmonic distortion of their respective loads.
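Selecting among polynomial models of the K-factor by AIC can be sketched as follows, using hypothetical (THDi, K-factor) pairs rather than the paper's field measurements.

```python
import numpy as np

def fit_poly_aic(x, y, degree):
    """Fit a polynomial of the given degree; return (coefficients, AIC) under
    a Gaussian likelihood (up to an additive constant). Lower AIC is better."""
    coeffs = np.polyfit(x, y, degree)
    resid = y - np.polyval(coeffs, x)
    n, k = len(y), degree + 1                   # k parameters penalised by AIC
    aic = n * np.log(np.mean(resid ** 2)) + 2 * k
    return coeffs, aic

# Hypothetical (THDi %, K-factor) pairs, not the paper's field data.
thdi = np.array([2.0, 5.0, 8.0, 12.0, 15.0, 20.0, 25.0])
k_factor = np.array([1.1, 1.4, 2.0, 2.9, 3.6, 5.1, 6.9])
c1, aic1 = fit_poly_aic(thdi, k_factor, 1)      # linear candidate
c2, aic2 = fit_poly_aic(thdi, k_factor, 2)      # quadratic candidate
```

AIC trades residual fit against the number of parameters, so the quadratic model is preferred only when its error reduction outweighs the extra coefficient.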
Controlling the Type I Error Rate in Stepwise Regression Analysis.
Pohlmann, John T.
Three procedures used to control Type I error rate in stepwise regression analysis are forward selection, backward elimination, and true stepwise. In the forward selection method, a model of the dependent variable is formed by choosing the single best predictor; then the second predictor which makes the strongest contribution to the prediction of…
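The forward-selection step described above can be sketched as a greedy loop. This simplified version picks, at each step, the predictor most correlated with the current residuals; a full stepwise procedure would also apply an F-to-enter test at each step, which is what actually controls the Type I error rate. All data are synthetic.

```python
# Sketch of forward selection by residual correlation (no F-test shown).
import random

def slope_intercept(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = (sum((u - mx) * (v - my) for u, v in zip(x, y)) /
         sum((u - mx) ** 2 for u in x))
    return my - b * mx, b

def corr(u, v):
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    su = sum((a - mu) ** 2 for a in u) ** 0.5
    sv = sum((a - mv) ** 2 for a in v) ** 0.5
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (su * sv)

random.seed(1)
n = 200
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [random.gauss(0, 1) for _ in range(n)]
x3 = [random.gauss(0, 1) for _ in range(n)]   # pure noise, should not enter
y = [2 * u + 0.5 * v + random.gauss(0, 1) for u, v in zip(x1, x2)]

candidates = {"x1": x1, "x2": x2, "x3": x3}
residual = y[:]
order = []
for _ in range(2):
    name = max(candidates, key=lambda k: abs(corr(candidates[k], residual)))
    xb = candidates.pop(name)
    a0, b0 = slope_intercept(xb, residual)
    residual = [r - (a0 + b0 * xi) for r, xi in zip(residual, xb)]
    order.append(name)
```

With a strong predictor (x1), a weak one (x2), and pure noise (x3), the greedy loop recovers the entry order the forward method would produce.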
Institute of Scientific and Technical Information of China (English)
Li Wen XU; Song Gui WANG
2007-01-01
In this paper, the authors address the problem of the minimax estimator of linear combinations of stochastic regression coefficients and parameters in the general normal linear model with random effects. Under a quadratic loss function, the minimax property of linear estimators is investigated. In the class of all estimators, the minimax estimator of estimable functions, which is unique with probability 1, is obtained under a multivariate normal distribution.
Kiviet, J.F.; Phillips, G.D.A.
2014-01-01
In dynamic regression models conditional maximum likelihood (least-squares) coefficient and variance estimators are biased. Using expansion techniques an approximation is obtained to the bias in variance estimation yielding a bias corrected variance estimator. This is achieved for both the standard
Construction of the flow rate nomogram using polynomial regression.
Hosmane, B; Maurath, C; McConnell, M
1993-04-01
The urinary flow rates of normal individuals depend on the initial bladder volume in a non-linear fashion (J. Urol. 109 (1973) 874). A flow rate nomogram was developed by Siroky, Olsson and Krane (J. Urol. 122 (1979) 665), taking the non-linear relationship into account, as an aid in the interpretation of urinary flow rate data. The flow rate nomogram serves to differentiate normal from obstructed individuals and is useful in the post-operative follow-up of urinary outflow obstruction. It has been shown (J. Urol. 123 (1980) 123) that the flow rate nomogram is an objective measure of the efficacy of medical or surgical therapy. Instead of manually reading nomogram values from the flow rate nomogram, an algorithm is developed using polynomial regression to fit the flow rate nomograms and hence compute nomogram values directly from the fitted nomogram equations.
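Fitting a nomogram curve by polynomial regression reduces to solving the normal equations of a monomial basis. The sketch below does this with Gaussian elimination; the volume-flow pairs and the quadratic degree are invented for illustration and are not the published nomogram data.

```python
# Sketch: least-squares polynomial fit via the normal equations.

def polyfit(x, y, degree):
    m = degree + 1
    # Normal equations A c = b for the monomial basis 1, x, x^2, ...
    A = [[sum(xi ** (i + j) for xi in x) for j in range(m)] for i in range(m)]
    b = [sum((xi ** i) * yi for xi, yi in zip(x, y)) for i in range(m)]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        piv = max(range(col, m), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, m):
            f = A[r][col] / A[col][col]
            for c in range(col, m):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    coef = [0.0] * m
    for i in range(m - 1, -1, -1):
        coef[i] = (b[i] - sum(A[i][j] * coef[j]
                              for j in range(i + 1, m))) / A[i][i]
    return coef

# Invented bladder volume (ml) vs. flow (ml/s) pairs, saturating like a nomogram.
vol = [50, 100, 150, 200, 250, 300, 350, 400]
flow = [8.0, 13.0, 16.5, 19.0, 20.8, 22.0, 22.8, 23.2]
coef = polyfit(vol, flow, 2)

def predict(v):
    return sum(c * v ** i for i, c in enumerate(coef))
```

Once the coefficients are stored, nomogram values can be computed directly from `predict` instead of being read off a chart, which is the article's point.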
Costs of sea dikes - regressions and uncertainty estimates
Lenk, Stephan; Rybski, Diego; Heidrich, Oliver; Dawson, Richard J.; Kropp, Jürgen P.
2017-05-01
Failure to consider the costs of adaptation strategies can be seen by decision makers as a barrier to implementing coastal protection measures. In order to validate adaptation strategies to sea-level rise in the form of coastal protection, a consistent and repeatable assessment of the costs is necessary. This paper significantly extends current knowledge on cost estimates by developing - and implementing using real coastal dike data - probabilistic functions of dike costs. Data from Canada and the Netherlands are analysed and related to published studies from the US, UK, and Vietnam in order to provide a reproducible estimate of typical sea dike costs and their uncertainty. We plot the costs divided by dike length as a function of height and test four different regression models. Our analysis shows that a linear function without intercept is sufficient to model the costs, i.e. fixed costs and higher-order contributions such as that due to the volume of core fill material are less significant. We also characterise the spread around the regression models which represents an uncertainty stemming from factors beyond dike length and height. Drawing an analogy with project cost overruns, we employ log-normal distributions and calculate that the range between 3x and x/3 contains 95 % of the data, where x represents the corresponding regression value. We compare our estimates with previously published unit costs for other countries. We note that the unit costs depend not only on the country and land use (urban/non-urban) of the sites where the dikes are being constructed but also on characteristics included in the costs, e.g. property acquisition, utility relocation, and project management. This paper gives decision makers an order of magnitude on the protection costs, which can help to remove potential barriers to developing adaptation strategies. Although the focus of this research is sea dikes, our approach is applicable and transferable to other adaptation measures.
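The paper's preferred cost model, unit cost linear in dike height with no intercept, has the closed-form slope sum(h·c)/sum(h²). The sketch below uses invented cost figures (not the Canadian or Dutch data) and also checks the factor-3 log-normal band the authors describe.

```python
# Sketch of a zero-intercept regression of unit cost on dike height,
# with a log-normal spread check. All cost figures are invented.
import math

heights = [1.0, 2.0, 3.0, 4.0, 5.0]   # dike height (m)
costs = [2.1, 3.8, 6.5, 8.0, 10.4]    # cost per unit length (illustrative)

# Least-squares slope through the origin: sum(h*c) / sum(h^2).
slope = sum(h * c for h, c in zip(heights, costs)) / sum(h * h for h in heights)
fitted = [slope * h for h in heights]

# Log-normal spread: log-ratios cost/fit should mostly fall within
# [-log 3, log 3], the paper's "between 3x and x/3" band.
log_ratios = [math.log(c / f) for c, f in zip(costs, fitted)]
within_band = sum(abs(r) <= math.log(3) for r in log_ratios)
```

With real dike data the interesting question is whether the intercept and higher-order (volume) terms improve the fit enough to justify them; the paper concludes they do not.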
Estimating life expectancies for US small areas: a regression framework
Congdon, Peter
2014-01-01
Analysis of area mortality variations and estimation of area life tables raise methodological questions relevant to assessing spatial clustering, and socioeconomic inequalities in mortality. Existing small area analyses of US life expectancy variation generally adopt ad hoc amalgamations of counties to alleviate potential instability of mortality rates involved in deriving life tables, and use conventional life table analysis which takes no account of correlated mortality for adjacent areas or ages. The alternative strategy here uses structured random effects methods that recognize correlations between adjacent ages and areas, and allows retention of the original county boundaries. This strategy generalizes to include effects of area category (e.g. poverty status, ethnic mix), allowing estimation of life tables according to area category, and providing additional stabilization of estimated life table functions. This approach is used here to estimate stabilized mortality rates, derive life expectancies in US counties, and assess trends in clustering and in inequality according to county poverty category.
Improving gravitational-wave parameter estimation using Gaussian process regression
Moore, Christopher J; Chua, Alvin J K; Gair, Jonathan R
2015-01-01
Folding uncertainty in theoretical models into Bayesian parameter estimation is necessary in order to make reliable inferences. A general means of achieving this is by marginalising over model uncertainty using a prior distribution constructed using Gaussian process regression (GPR). Here, we apply this technique to (simulated) gravitational-wave signals from binary black holes that could be observed using advanced-era gravitational-wave detectors. Unless properly accounted for, uncertainty in the gravitational-wave templates could be the dominant source of error in studies of these systems. We explain our approach in detail and provide proofs of various features of the method, including the limiting behaviour for high signal-to-noise, where systematic model uncertainties dominate over noise errors. We find that the marginalised likelihood constructed via GPR offers a significant improvement in parameter estimation over the standard, uncorrected likelihood. We also examine the dependence of the method on the ...
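A minimal sketch of the GPR machinery the paper builds on: a squared-exponential kernel, a linear solve for the weights, and a posterior-mean prediction. The training function, length scale, and jitter below are invented for illustration and have nothing to do with gravitational-wave templates.

```python
# Minimal Gaussian process regression: K alpha = y, mean(x*) = k(x*)·alpha.
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    A = [row[:] for row in A]
    b = b[:]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (b[i] - sum(A[i][j] * x[j] for j in range(i + 1, n))) / A[i][i]
    return x

def rbf(a, b, ell=1.0):
    """Squared-exponential (RBF) kernel."""
    return math.exp(-0.5 * ((a - b) / ell) ** 2)

xs = [0.0, 1.0, 2.0, 3.0]
ys = [math.sin(x) for x in xs]        # toy target function
noise = 1e-6                          # jitter for numerical stability

K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
     for i, a in enumerate(xs)]
alpha = solve(K, ys)

def gp_mean(x_star):
    """Posterior mean at a test point."""
    return sum(rbf(x_star, xi) * ai for xi, ai in zip(xs, alpha))
```

The paper goes further: it uses the GP not to interpolate data but to build a prior over waveform-model error, which is then marginalised inside the likelihood.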
DEFF Research Database (Denmark)
Sharifzadeh, Sara; Skytte, Jacob Lercke; Nielsen, Otto Højager Attermann;
2012-01-01
Statistical solutions find widespread use in food and medicine quality control. We investigate the effect of different regression and sparse regression methods on a viscosity estimation problem, using the spectro-temporal features from a new Sub-Surface Laser Scattering (SLS) vision system. From this investigation, we propose the optimal solution for regression estimation in the case of noisy and inconsistent optical measurements, which is the case in many practical measurement systems. The principal component regression (PCR), partial least squares (PLS) and least angle regression (LAR) methods are compared...
Genomic breeding value estimation using nonparametric additive regression models
Directory of Open Access Journals (Sweden)
Solberg Trygve
2009-01-01
Genomic selection refers to the use of genomewide dense markers for breeding value estimation and subsequently for selection. The main challenge of genomic breeding value estimation is the estimation of many effects from a limited number of observations. Bayesian methods have been proposed to successfully cope with these challenges. As an alternative class of models, non- and semiparametric models were recently introduced. The present study investigated the ability of nonparametric additive regression models to predict genomic breeding values. The genotypes were modelled for each marker or pair of flanking markers (i.e. the predictors) separately. The nonparametric functions for the predictors were estimated simultaneously using additive model theory, applying a binomial kernel. The optimal degree of smoothing was determined by bootstrapping. A mutation-drift-balance simulation was carried out. The breeding values of the last generation (genotyped) were predicted using data from the next-to-last generation (genotyped and phenotyped). The results show moderate to high accuracies of the predicted breeding values. A determination of a predictor-specific degree of smoothing increased the accuracy.
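The kernel-smoothing ingredient of such nonparametric regression can be sketched with a Nadaraya-Watson estimator. The study applies a binomial kernel over marker genotypes; the continuous Gaussian-kernel version below is a stand-in that just illustrates the smoothing step, on invented data.

```python
# Nadaraya-Watson kernel regression sketch (Gaussian kernel, toy data).
import math

def nw(x0, xs, ys, h=0.5):
    """Kernel-weighted average of ys around x0 with bandwidth h."""
    w = [math.exp(-0.5 * ((x0 - xi) / h) ** 2) for xi in xs]
    return sum(wi * yi for wi, yi in zip(w, ys)) / sum(w)

xs = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
ys = [math.sin(x) for x in xs]   # toy smooth signal
est = nw(1.2, xs, ys)
```

The bandwidth h plays the role of the "degree of smoothing" the abstract says was tuned by bootstrapping; too large a value biases the estimate toward the global mean.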
Regression Rate Study in HTPB/GOX Hybrid Rocket Motors.
Philmon George; Krishnan, S; Lalitha Ramachandran; P. M. Varkey; Raveendran, M.
1996-01-01
The theoretical and experimental studies on hybrid rocket motor combustion research are briefly reviewed, and the need for a clear understanding of the hybrid rocket fuel regression rate mechanism is brought out. A test facility established at the Indian Institute of Technology, Madras, for hybrid rocket motor research is described. The results of an experimental study on a hydroxyl-terminated polybutadiene and gaseous oxygen hybrid rocket motor are presented. Fuel grains with ammonium perchlor...
Approaches to Low Fuel Regression Rate in Hybrid Rocket Engines
Dario Pastrone
2012-01-01
Hybrid rocket engines are promising propulsion systems which present appealing features such as safety, low cost, and environmental friendliness. On the other hand, certain issues hamper their hoped-for development. The present paper discusses approaches addressing improvements to one of the most important of these issues: low fuel regression rate. To highlight the consequences of this issue and to better understand the concepts proposed, fundamentals are summarized. Two approaches are pre...
Logistic regression in estimates of femoral neck fracture by fall
Directory of Open Access Journals (Sweden)
Jaroslava Wendlová
2010-04-01
Jaroslava Wendlová, Derer's University Hospital and Policlinic, Osteological Unit, Bratislava, Slovakia. Abstract: The latest methods in estimating the probability (absolute risk) of osteoporotic fractures include several logistic regression models, based on qualitative risk factors plus bone mineral density (BMD), and the probability estimate of fracture in the future. The Slovak logistic regression model, in contrast to other models, is created from quantitative variables of the proximal femur (in International System of Units) and estimates the probability of fracture by fall. Objectives: The first objective of this study was to order selected independent variables according to the intensity of their influence (statistical significance) upon the occurrence of values of the dependent variable: femur strength index (FSI). The second objective was to determine, using logistic regression, whether the odds of FSI acquiring a pathological value (femoral neck fracture by fall) increased or declined if the value of the variables (T-score total hip, BMI, alpha angle, theta angle and HAL) were raised by one unit. Patients and methods: Bone densitometer measurements using dual energy X-ray absorptiometry (DXA; Prodigy, Primo, GE, USA) of the left proximal femur were obtained from 3 216 East Slovak women with primary or secondary osteoporosis or osteopenia, aged 20–89 years (mean age 58.9; 95% CI: 58.42–59.38). The following variables were measured: FSI, T-score total hip BMD, body mass index (BMI), and the geometrical variables of the proximal femur: alpha angle (α), theta angle (θ), and hip axis length (HAL). Statistical analysis: Logistic regression was used to measure the influence of the independent variables (T-score total hip, alpha angle, theta angle, HAL, BMI) upon the dependent variable (FSI). Results: The order of independent variables according to the intensity of their influence (greatest to least) upon the occurrence of values of the
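The per-unit odds interpretation the study relies on comes straight from the logistic model P(event | x) = 1/(1 + exp(-(a + b·x))): raising x by one unit multiplies the odds by exp(b). A minimal fit by gradient ascent on the log-likelihood, with a synthetic covariate standing in for the DXA-derived variables:

```python
# One-predictor logistic regression fitted by gradient ascent (synthetic data).
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

random.seed(0)
xs, ys = [], []
for _ in range(500):
    x = random.gauss(0, 1)
    p = sigmoid(-0.5 + 1.5 * x)      # "true" model used only to simulate
    xs.append(x)
    ys.append(1 if random.random() < p else 0)

a, b = 0.0, 0.0
lr = 0.5
for _ in range(2000):
    ga = sum(y - sigmoid(a + b * x) for x, y in zip(xs, ys))
    gb = sum((y - sigmoid(a + b * x)) * x for x, y in zip(xs, ys))
    a += lr * ga / len(xs)
    b += lr * gb / len(xs)

odds_ratio = math.exp(b)   # multiplicative change in odds per unit of x
```

In the study, a fitted coefficient with exp(b) > 1 would mean the odds of a pathological FSI value rise as that covariate increases by one unit, which is exactly the second stated objective.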
Sieve M-estimation for semiparametric varying-coefficient partially linear regression model
Institute of Scientific and Technical Information of China (English)
[No author listed]
2010-01-01
This article considers a semiparametric varying-coefficient partially linear regression model. This model, a generalization of both the partially linear regression model and the varying-coefficient regression model, allows one to explore the possibly nonlinear effect of a certain covariate on the response variable. A sieve M-estimation method is proposed and the asymptotic properties of the proposed estimators are discussed. Our main object is to estimate the nonparametric component and the unknown parameters simultaneously. The method is easier to compute, and the required computational burden is much less than that of the existing two-stage estimation method. Furthermore, the sieve M-estimation is robust in the presence of outliers if we choose an appropriate ρ(·). Under some mild conditions, the estimators are shown to be strongly consistent; the convergence rate of the estimator for the unknown nonparametric component is obtained, and the estimator for the unknown parameter is shown to be asymptotically normally distributed. Numerical experiments are carried out to investigate the performance of the proposed method.
Robust estimation for homoscedastic regression in the secondary analysis of case-control data
Wei, Jiawei
2012-12-04
Primary analysis of case-control studies focuses on the relationship between disease D and a set of covariates of interest (Y, X). A secondary application of the case-control study, which is often invoked in modern genetic epidemiologic association studies, is to investigate the interrelationship between the covariates themselves. The task is complicated owing to the case-control sampling, where the regression of Y on X is different from what it is in the population. Previous work has assumed a parametric distribution for Y given X and derived semiparametric efficient estimation and inference without any distributional assumptions about X. We take up the issue of estimation of a regression function when Y given X follows a homoscedastic regression model, but otherwise the distribution of Y is unspecified. The semiparametric efficient approaches can be used to construct semiparametric efficient estimates, but they suffer from a lack of robustness to the assumed model for Y given X. We take an entirely different approach. We show how to estimate the regression parameters consistently even if the assumed model for Y given X is incorrect, and thus the estimates are model robust. For this we make the assumption that the disease rate is known or well estimated. The assumption can be dropped when the disease is rare, which is typically so for most case-control studies, and the estimation algorithm simplifies. Simulations and empirical examples are used to illustrate the approach.
Regression and kriging analysis for grid power factor estimation
Directory of Open Access Journals (Sweden)
Rajesh Guntaka
2014-12-01
The measurement of power factor (PF) in electrical utility grids is a mainstay of load balancing and is also a critical element of transmission and distribution efficiency. The measurement of PF dates back to the earliest periods of electrical power distribution to public grids. In the wide-area distribution grid, measurement of current waveforms is trivial and may be accomplished at any point in the grid using a current tap transformer. However, voltage measurement requires a reference to ground and so is more problematic, and measurements are normally constrained to points that have ready and easy access to a ground source. We present two mathematical analysis methods, based on kriging and on linear least squares estimation (LLSE, i.e. regression), to derive PF at nodes with unknown voltages that lie within a perimeter of sample nodes with ground reference across a selected power grid. Our results indicate an average error of 1.884%, which is within acceptable tolerances for PF measurements used in load balancing tasks.
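The spatial-interpolation idea, filling in PF at an unmetered node from surrounding sample nodes, can be sketched with inverse-distance weighting. True kriging fits a variogram and solves for optimal weights; IDW is a simplified stand-in that conveys the same weighted-averaging step. Coordinates and PF readings below are invented.

```python
# Inverse-distance-weighted interpolation of power factor (toy grid).

samples = [((0.0, 0.0), 0.92), ((1.0, 0.0), 0.95),
           ((0.0, 1.0), 0.90), ((1.0, 1.0), 0.97)]

def idw(px, py, power=2.0):
    """Estimate PF at (px, py) as an inverse-distance weighted average."""
    num = den = 0.0
    for (sx, sy), pf in samples:
        d2 = (px - sx) ** 2 + (py - sy) ** 2
        if d2 == 0.0:
            return pf               # exactly at a sample node
        w = d2 ** (-power / 2.0)
        num += w * pf
        den += w
    return num / den

pf_mid = idw(0.5, 0.5)   # node equidistant from all four samples
```

At the equidistant centre the estimate collapses to the plain average of the four readings, which is a useful sanity check for any spatial interpolator.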
Institute of Scientific and Technical Information of China (English)
LING Neng-xiang; DU Xue-qiao
2005-01-01
In this paper, we study the strong consistency of the partitioning estimation of a regression function under samples that are φ-mixing sequences with identical distribution. Key words: nonparametric regression function; partitioning estimation; strong convergence; φ-mixing sequences.
Xu, Xiaohong; Chen, Yu; Jia, Haiwei
2009-07-01
The paper studies the relationship between the interest rate and the inflation rate. We use the stepwise regression method to build a mathematical model of the relation between the interest rate and the inflation rate. The model has passed the significance test, and we use it to discuss the influence on the social economy of adjusting the deposit rate, so as to provide theoretical support for the government in drawing up policy.
Institute of Scientific and Technical Information of China (English)
[No author listed]
2001-01-01
The partly linear regression model is useful in practice, but little has been done in the literature to adapt it to real data which are dependent and conditionally heteroscedastic. In this paper, the estimators of the regression components are constructed via local polynomial fitting and their large sample properties are explored. Under certain mild regularity conditions, conditions are obtained to ensure that the estimators of the nonparametric component and its derivatives are consistent up to the convergence rates which are optimal in the i.i.d. case, and that the estimator of the parametric component is root-n consistent with the same rate as for the parametric model. The technique adopted in the proof differs from that used by Hamilton and Truong under i.i.d. samples, and corrects the errors in their reference.
Sparse Volterra and Polynomial Regression Models: Recoverability and Estimation
Kekatos, Vassilis
2011-01-01
Volterra and polynomial regression models play a major role in nonlinear system identification and inference tasks. Exciting applications ranging from neuroscience to genome-wide association analysis build on these models with the additional requirement of parsimony. This requirement has high interpretative value, but unfortunately cannot be met by least-squares based or kernel regression methods. To this end, compressed sampling (CS) approaches, already successful in linear regression settings, can offer a viable alternative. The viability of CS for sparse Volterra and polynomial models is the core theme of this work. A common sparse regression task is initially posed for the two models. Building on (weighted) Lasso-based schemes, an adaptive RLS-type algorithm is developed for sparse polynomial regressions. The identifiability of polynomial models is critically challenged by dimensionality. However, following the CS principle, when these models are sparse, they could be recovered by far fewer measurements. ...
Bulcock, J. W.; And Others
Multicollinearity refers to the presence of highly intercorrelated independent variables in structural equation models, that is, models estimated by using techniques such as least squares regression and maximum likelihood. There is a problem of multicollinearity in both the natural and social sciences where theory formulation and estimation is in…
Ariffin, Syaiba Balqish; Midi, Habshah
2014-06-01
This article is concerned with the performance of the logistic ridge regression estimation technique in the presence of multicollinearity and high leverage points. In logistic regression, multicollinearity may exist among predictors, showing up in the information matrix. The maximum likelihood estimator suffers a huge setback in the presence of multicollinearity, which causes regression estimates to have unduly large standard errors. To remedy this problem, a logistic ridge regression estimator is put forward. It is evident that the logistic ridge regression estimator outperforms the maximum likelihood approach for handling multicollinearity. The effect of high leverage points on the performance of the logistic ridge regression estimator is then investigated through a real data set and a simulation study. The findings signify that the logistic ridge regression estimator fails to provide better parameter estimates in the presence of both high leverage points and multicollinearity.
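The shrinkage mechanism behind ridge estimators is easiest to see in the one-dimensional linear case: adding the penalty λ to the Gram term deflates the estimate and caps the variance inflation that multicollinearity causes. The article's setting is logistic ridge, where the same λ is added to the information matrix inside the iterative fit; the linear sketch below, on invented data, shows only the shrinkage step.

```python
# One-dimensional ridge vs. OLS shrinkage sketch (synthetic data).
import random

random.seed(2)
n = 100
x = [random.gauss(0, 1) for _ in range(n)]
y = [2.0 * a + random.gauss(0, 1) for a in x]

sxx = sum(a * a for a in x)
sxy = sum(a * b for a, b in zip(x, y))

beta_ols = sxy / sxx
beta_ridge = sxy / (sxx + 10.0)   # lambda = 10 shrinks the estimate
```

The ridge estimate is always smaller in magnitude than OLS; the bias it introduces is the price paid for the large variance reduction when predictors are collinear.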
Khoshravesh, Mojtaba; Sefidkouhi, Mohammad Ali Gholami; Valipour, Mohammad
2017-07-01
The proper evaluation of evapotranspiration is essential in food security investigation, farm management, pollution detection, irrigation scheduling, nutrient flows, carbon balance, and hydrologic modeling, especially in arid environments. To achieve sustainable development and to ensure water supply, especially in arid environments, irrigation experts need tools to estimate reference evapotranspiration on a large scale. In this study, the monthly reference evapotranspiration was estimated by three different regression models, the multivariate fractional polynomial (MFP), robust regression, and Bayesian regression, in Ardestan, Esfahan, and Kashan. The results were compared with the Food and Agriculture Organization (FAO) Penman-Monteith method (FAO-PM) to select the best model. The results show that at a monthly scale, all models provided close agreement with the FAO-PM values (R² > 0.95 and RMSE < 12.07 mm month⁻¹). However, the MFP model gives better estimates than the other two models for estimating reference evapotranspiration at all stations.
Institute of Scientific and Technical Information of China (English)
CHEN Min; WU Guo-fu; QI Quan-yue
2001-01-01
In this paper, we consider a multiple regression model in the presence of serial correlation and heteroscedasticity. We establish the convergence rate of an efficient estimation of autoregressive coefficients suggested by Harvey and Robison (1988). We propose a method to identify order of serial correlation data and prove that it is of strong consistency. The simulation reports show that the method of identifying order is available.
Post-L1-Penalized Estimators in High-Dimensional Linear Regression Models
Belloni, Alexandre
2010-01-01
In this paper we study the post-penalized estimator which applies ordinary, unpenalized linear regression to the model selected by the first-step penalized estimators, typically the LASSO. We show that post-LASSO can perform as well or nearly as well as the LASSO in terms of the rate of convergence. We show that this performance occurs even if the LASSO-based model selection "fails", in the sense of missing some components of the "true" regression model. Furthermore, post-LASSO can perform strictly better than LASSO, in the sense of a strictly faster rate of convergence, if the LASSO-based model selection correctly includes all components of the "true" model as a subset and enough sparsity is obtained. Of course, in the extreme case, when LASSO perfectly selects the true model, the post-LASSO estimator becomes the oracle estimator. We show that the results hold in both parametric and non-parametric models; and by the "true" model we mean the best $s$-dimensional approximation to the true regression model, whe...
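Post-LASSO is simplest to see in a nearly orthogonal design, where the LASSO reduces to soft-thresholding the per-coordinate OLS estimates and the post-LASSO refit simply restores the unshrunk OLS value on the selected support. The design, coefficients, and threshold below are invented for the sketch.

```python
# Post-LASSO sketch: soft-threshold to select, then unpenalized refit.
import random

def soft(v, lam):
    """Soft-thresholding operator."""
    if v > lam:
        return v - lam
    if v < -lam:
        return v + lam
    return 0.0

random.seed(7)
n, lam = 400, 0.3
true_beta = [1.0, 0.0, -0.7, 0.0]
X = [[random.gauss(0, 1) for _ in range(len(true_beta))] for _ in range(n)]
y = [sum(b * xij for b, xij in zip(true_beta, row)) + random.gauss(0, 0.5)
     for row in X]

# Per-coordinate OLS (a valid shortcut here because the random columns
# are nearly orthogonal; a general design needs the full LASSO).
ols = [sum(row[j] * yi for row, yi in zip(X, y)) /
       sum(row[j] ** 2 for row in X) for j in range(len(true_beta))]

lasso = [soft(b, lam) for b in ols]                  # shrunk estimates
support = [j for j, b in enumerate(lasso) if b != 0.0]
post_lasso = [ols[j] if j in support else 0.0 for j in range(len(ols))]
```

The refit removes the shrinkage bias on the selected coordinates, which is exactly the mechanism behind the faster convergence rate the paper proves.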
Pac-bayesian bounds for sparse regression estimation with exponential weights
Alquier, Pierre
2010-01-01
We consider the sparse regression model where the number of parameters $p$ is larger than the sample size $n$. The difficulty when considering high-dimensional problems is to propose estimators achieving a good compromise between statistical and computational performances. The BIC estimator, for instance, performs well from the statistical point of view [BTW07] but can only be computed for values of $p$ of at most a few tens. The Lasso estimator is the solution of a convex minimization problem, hence computable for large values of $p$. However, stringent conditions on the design are required to establish fast rates of convergence for this estimator. Dalalyan and Tsybakov [arnak] propose a method achieving a good compromise between the statistical and computational aspects of the problem. Their estimator can be computed for reasonably large $p$ and satisfies nice statistical properties under weak assumptions on the design. However, [arnak] proposes sparsity oracle inequalities in expectation for the emp...
LSTA, Rawane Samb
2010-01-01
This thesis deals with the nonparametric estimation of the density f of the regression error term E in the model Y = m(X) + E, assuming its independence from the covariate X. The difficulty in this study is the fact that the regression error E is not observed. In such a setup, it would be unwise, for estimating f, to use a conditional approach based upon the probability distribution function of Y given X. Indeed, this approach is affected by the curse of dimensionality, so that the resulting estimator of the residual term E would have a considerably slow rate of convergence if the dimension of X is very high. Two approaches are proposed in this thesis to avoid the curse of dimensionality. The first approach uses the estimated residuals, while the second integrates a nonparametric conditional density estimator of Y given X. If proceeding so can circumvent the curse of dimensionality, a challenging issue is to evaluate the impact of the estimated residuals on the final estimator of the density f. We will also at...
A Stochastic Restricted Principal Components Regression Estimator in the Linear Model
Directory of Open Access Journals (Sweden)
Daojiang He
2014-01-01
We propose a new estimator to combat multicollinearity in the linear model when there are stochastic linear restrictions on the regression coefficients. The new estimator is constructed by combining the ordinary mixed estimator (OME) and the principal components regression (PCR) estimator, and is called the stochastic restricted principal components (SRPC) regression estimator. Necessary and sufficient conditions for the superiority of the SRPC estimator over the OME and the PCR estimator are derived in the sense of the mean squared error matrix criterion. Finally, we give a numerical example and a Monte Carlo study to illustrate the performance of the proposed estimator.
Institute of Scientific and Technical Information of China (English)
LU; Zudi
2001-01-01
Hierarchical Matching and Regression with Application to Photometric Redshift Estimation
Murtagh, Fionn
2017-06-01
This work emphasizes that heterogeneity, diversity, discontinuity, and discreteness in data is to be exploited in classification and regression problems. A global a priori model may not be desirable. For data analytics in cosmology, this is motivated by the variety of cosmological objects such as elliptical, spiral, active, and merging galaxies at a wide range of redshifts. Our aim is matching and similarity-based analytics that takes account of discrete relationships in the data. The information structure of the data is represented by a hierarchy or tree where the branch structure, rather than just the proximity, is important. The representation is related to p-adic number theory. The clustering or binning of the data values, related to the precision of the measurements, has a central role in this methodology. If used for regression, our approach is a method of cluster-wise regression, generalizing nearest neighbour regression. Both to exemplify this analytics approach, and to demonstrate computational benefits, we address the well-known photometric redshift or `photo-z' problem, seeking to match Sloan Digital Sky Survey (SDSS) spectroscopic and photometric redshifts.
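The nearest-neighbour regression baseline that the paper's cluster-wise approach generalizes can be sketched in a few lines: predict a redshift as the average over the k closest training objects in feature space. The (colour, redshift) training pairs below are invented, not SDSS data.

```python
# k-nearest-neighbour regression sketch for a photo-z-style lookup.

# Invented (colour index, spectroscopic redshift) training pairs.
train = [(0.1, 0.05), (0.2, 0.08), (0.4, 0.15), (0.6, 0.22), (0.8, 0.30)]

def knn_predict(colour, k=2):
    """Average redshift of the k training objects nearest in colour."""
    nearest = sorted(train, key=lambda t: abs(t[0] - colour))[:k]
    return sum(z for _, z in nearest) / k

z_hat = knn_predict(0.5)
```

The paper's hierarchical matching replaces the flat distance sort with a tree traversal, so the "neighbours" are defined by shared branches rather than raw proximity.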
CONSISTENCY OF LS ESTIMATOR IN SIMPLE LINEAR EV REGRESSION MODELS
Institute of Scientific and Technical Information of China (English)
Liu Jixue; Chen Xiru
2005-01-01
The consistency of the LS estimate of the simple linear EV model is studied. It is shown that, under some common assumptions of the model, weak and strong consistency of the estimate are equivalent, but this is not so for quadratic-mean consistency.
Theoretical prediction of regression rates in swirl-injection hybrid rocket engines
Ozawa, K.; Shimada, T.
2016-07-01
The authors theoretically and analytically predict by what factor the regression rates of swirl-injection hybrid rocket engines exceed those of axial-injection engines, by estimating the heat flux from boundary layer combustion to the fuel port. The engine configuration is assumed to be one whose oxidizer is injected from the side opposite the nozzle, such as that proposed by Yuasa et al. To simplify the estimation, hypotheses such as three-dimensional (3D) axisymmetric flow have been assumed. The results of this prediction method are largely consistent with Yuasa's experimental data in the range of high swirl numbers.
A Robbins-Monro procedure for estimation in semiparametric regression models
Bercu, Bernard
2011-01-01
This paper is devoted to the parametric estimation of a shift together with the nonparametric estimation of a regression function in a semiparametric regression model. We implement a Robbins-Monro procedure that is very efficient and easy to handle. On the one hand, we propose a stochastic algorithm similar to that of Robbins-Monro in order to estimate the shift parameter. A preliminary evaluation of the regression function is not necessary for estimating the shift parameter. On the other hand, we make use of a recursive Nadaraya-Watson estimator for the estimation of the regression function. This kernel estimator takes into account the previous estimation of the shift parameter. We establish the almost sure convergence of both the Robbins-Monro and Nadaraya-Watson estimators. The asymptotic normality of our estimates is also provided.
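The Robbins-Monro scheme the paper builds on is a stochastic recursion theta_{n+1} = theta_n + a_n * g(theta_n, Y_n) with step sizes a_n shrinking like 1/n. Its simplest instance, sketched below on invented data, estimates the mean of a noisy observation stream; the paper applies the same recursion to a shift parameter instead.

```python
# Bare-bones Robbins-Monro recursion converging to the mean of Y.
import random

random.seed(3)
theta = 0.0
for n in range(1, 20001):
    y = random.gauss(2.5, 1.0)   # noisy observations, unknown mean 2.5
    theta += (y - theta) / n     # step sizes a_n = 1/n
```

With a_n = 1/n the recursion is exactly the running average, which makes the almost-sure convergence claimed in the abstract easy to see in this special case.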
On asymptotics of t-type regression estimation in multiple linear model
Institute of Scientific and Technical Information of China (English)
Anonymous
2004-01-01
We consider a robust estimator (the t-type regression estimator) of the multiple linear regression model, obtained by maximizing the marginal likelihood of a scaled t-type error distribution. The marginal likelihood can also be applied to the de-correlated response when the within-subject correlation can be consistently estimated from an initial estimate of the model based on the independent working assumption. This paper shows that such a t-type estimator is consistent.
Directory of Open Access Journals (Sweden)
Pudji Ismartini
2010-08-01
One of the major problems facing data modelling in the social field is multicollinearity. Multicollinearity can have a significant impact on the quality and stability of the fitted regression model. The common classical regression technique using the least squares estimate is highly sensitive to the multicollinearity problem. In such problem areas, partial least squares regression (PLSR) is a useful and flexible tool for statistical model building; however, PLSR yields only point estimates. This paper constructs interval estimates for the PLSR regression parameters by applying the jackknife technique to poverty data. A SAS macro programme is developed to obtain the jackknife interval estimator for PLSR.
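The jackknife idea behind such interval estimators can be illustrated on a plain least squares slope (a simple stand-in for the PLSR coefficients, since a full PLSR fit is much longer): leave out one observation at a time, form pseudo-values, and build a normal-approximation interval from their variance. All data and names here are illustrative.

```python
import math
import random

def ols_slope(xs, ys):
    """Least squares slope of y on x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

def jackknife_interval(xs, ys, z=1.96):
    """Normal-approximation interval from leave-one-out pseudo-values."""
    n = len(xs)
    full = ols_slope(xs, ys)
    loo = [ols_slope(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:]) for i in range(n)]
    pseudo = [n * full - (n - 1) * t for t in loo]
    centre = sum(pseudo) / n
    se = math.sqrt(sum((p - centre) ** 2 for p in pseudo) / (n * (n - 1)))
    return centre - z * se, centre + z * se

rng = random.Random(0)
xs = [i / 10 for i in range(50)]
ys = [1.0 + 2.0 * x + rng.gauss(0.0, 0.5) for x in xs]   # true slope = 2
lo, hi = jackknife_interval(xs, ys)
```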
Allelic drop-out probabilities estimated by logistic regression
DEFF Research Database (Denmark)
Tvedebrink, Torben; Eriksen, Poul Svante; Asplund, Maria
2012-01-01
We discuss the model for estimating drop-out probabilities presented by Tvedebrink et al. [7] and the concerns that have been raised. The criticism has demonstrated that the model is not perfect. However, the model is very useful for advanced forensic genetic work, where allelic drop-out ...
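A minimal sketch of the kind of model discussed: logistic regression of a binary drop-out indicator on a covariate (here a simulated stand-in for signal strength), fitted by plain gradient ascent. The data, coefficients, and covariate are hypothetical, not those of Tvedebrink et al.

```python
import math
import random

def fit_logistic(xs, ys, lr=0.1, n_iter=1500):
    """Fit P(drop-out | x) = 1 / (1 + exp(-(b0 + b1 * x))) by gradient ascent."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(n_iter):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            g0 += y - p
            g1 += (y - p) * x
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1

rng = random.Random(42)
# Hypothetical covariate (e.g. a log signal height); drop-out less likely for large x
xs = [rng.uniform(0.0, 6.0) for _ in range(300)]
ys = [1 if rng.random() < 1.0 / (1.0 + math.exp(-(2.0 - 1.0 * x))) else 0 for x in xs]
b0, b1 = fit_logistic(xs, ys)
```

The fitted slope should come out negative, reflecting that stronger signals drop out less often in the simulated data.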
Directory of Open Access Journals (Sweden)
Pasi A. Karjalainen
2007-01-01
Ventricular repolarization duration (VRD) is affected by heart rate and autonomic control, and thus VRD varies in time in a similar way to heart rate. VRD variability is commonly assessed by determining the time differences between successive R- and T-waves, that is, RT intervals. Traditional methods for RT interval detection necessitate the detection of either T-wave apexes or offsets. In this paper, we propose a principal component regression (PCR) based method for estimating RT variability. The main benefit of the method is that it does not necessitate T-wave detection. The proposed method is compared with traditional RT interval measures, and as a result, it is observed to estimate RT variability accurately and to be less sensitive to noise than the traditional methods. As a specific application, the method is applied to exercise electrocardiogram (ECG) recordings.
Area-to-point parameter estimation with geographically weighted regression
Murakami, Daisuke; Tsutsumi, Morito
2015-07-01
The modifiable areal unit problem (MAUP) is a problem whereby the units into which data are aggregated influence the results of spatial data analysis. Standard geographically weighted regression (GWR), which ignores aggregation mechanisms, cannot be considered an efficient countermeasure against the MAUP. Accordingly, this study proposes a type of GWR with aggregation mechanisms, termed area-to-point (ATP) GWR herein. ATP GWR, which is closely related to geostatistical approaches, estimates the disaggregate-level local trend parameters by using aggregated variables. We examine the effectiveness of ATP GWR in mitigating the MAUP through a simulation study and an empirical study. The simulation study indicates that the proposed method is robust to the MAUP when the spatial scales of aggregation are not too global compared with the scale of the underlying spatial variations. The empirical studies demonstrate that the method provides intuitively consistent estimates.
Regularized Regression and Density Estimation based on Optimal Transport
Burger, M.
2012-03-11
The aim of this paper is to investigate a novel nonparametric approach for estimating and smoothing density functions as well as probability densities from discrete samples based on a variational regularization method with the Wasserstein metric as a data fidelity. The approach allows a unified treatment of discrete and continuous probability measures and is hence attractive for various tasks. In particular, the variational model for special regularization functionals yields a natural method for estimating densities and for preserving edges in the case of total variation regularization. In order to compute solutions of the variational problems, a regularized optimal transport problem needs to be solved, for which we discuss several formulations and provide a detailed analysis. Moreover, we compute special self-similar solutions for standard regularization functionals and we discuss several computational approaches and results. © 2012 The Author(s).
Population-based estimates of pesticide intake are needed to characterize exposure for particular demographic groups based on their dietary behaviors. Regression modeling performed on measurements of selected pesticides in composited duplicate diet samples allowed (1) estimation ...
Recursive bias estimation for high dimensional regression smoothers
Energy Technology Data Exchange (ETDEWEB)
Hengartner, Nicolas W. [Los Alamos National Laboratory]; Cornillon, Pierre-Andre [AGROSUP, France]; Matzner-Lober, Eric [University of Rennes, France]
2009-01-01
In multivariate nonparametric analysis, sparseness of the covariates, also called the curse of dimensionality, forces one to use large smoothing parameters. This leads to a biased smoother. Instead of focusing on optimally selecting the smoothing parameter, we fix it at some reasonably large value to ensure an over-smoothing of the data. The resulting smoother has a small variance but a substantial bias. In this paper, we propose to iteratively correct the bias of the initial estimator by an estimate of the bias obtained by smoothing the residuals. We examine in detail the convergence of the iterated procedure for classical smoothers and relate our procedure to L2-boosting. For the multivariate thin plate spline smoother, we prove that our procedure adapts to the correct and unknown order of smoothness for estimating an unknown function m belonging to the Sobolev space H(ν), where ν should be bigger than d/2. We apply our method to simulated and real data and show that our method compares favorably with existing procedures.
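The iterative bias correction can be sketched with a deliberately over-smoothed Nadaraya-Watson smoother: smooth the residuals, add them back to the fit, and repeat. This is the L2-boosting connection in miniature; the bandwidth, data, and iteration count below are illustrative choices, not the paper's.

```python
import math
import random

def nw_smooth(xs, ys, h):
    """Nadaraya-Watson smoother with a Gaussian kernel (h deliberately large)."""
    fit = []
    for x0 in xs:
        w = [math.exp(-0.5 * ((x0 - x) / h) ** 2) for x in xs]
        sw = sum(w)
        fit.append(sum(wi * yi for wi, yi in zip(w, ys)) / sw)
    return fit

def bias_corrected(xs, ys, h, n_iter=10):
    """Iteratively add back the smoothed residuals (an L2-boosting-style correction)."""
    fit = nw_smooth(xs, ys, h)
    for _ in range(n_iter):
        resid = [y - f for y, f in zip(ys, fit)]
        fit = [f + r for f, r in zip(fit, nw_smooth(xs, resid, h))]
    return fit

rng = random.Random(3)
xs = [i / 50 for i in range(100)]
truth = [math.sin(3.0 * x) for x in xs]
ys = [t + rng.gauss(0.0, 0.1) for t in truth]
fit0 = nw_smooth(xs, ys, 1.0)              # over-smoothed: low variance, large bias
fit = bias_corrected(xs, ys, 1.0)          # bias reduced by residual smoothing
mse0 = sum((f - t) ** 2 for f, t in zip(fit0, truth)) / len(xs)
mse = sum((f - t) ** 2 for f, t in zip(fit, truth)) / len(xs)
```

Because the base smoother is heavily over-smoothed, a handful of correction steps cuts the bias sharply while adding little variance, so the corrected fit has a much smaller error against the true curve.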
Tightness of M-estimators for multiple linear regression in time series
DEFF Research Database (Denmark)
Johansen, Søren; Nielsen, Bent
We show tightness of a general M-estimator for multiple linear regression in time series. The positive criterion function for the M-estimator is assumed lower semi-continuous and sufficiently large for large arguments; particular cases are the Huber-skip and quantile regression. Tightness requires...
Regression-based estimates of observed functional status in centenarians.
Mitchell, Meghan B; Miller, L Stephen; Woodard, John L; Davey, Adam; Martin, Peter; Burgess, Molly; Poon, Leonard W
2011-04-01
There is a lack of consensus on the best method of functional assessment, and there is a paucity of studies on daily functioning in centenarians. We sought to compare associations between performance-based, self-report, and proxy report of functional status in centenarians. We expected the strongest relationships between proxy reports and observed performance of basic activities of daily living (BADLs) and instrumental activities of daily living (IADLs). We hypothesized that the discrepancy between self-report and observed daily functioning would be modified by cognitive status. We additionally sought to provide clinicians with estimates of centenarians' observed daily functioning based on their mental status in combination with subjective measures of activities of daily living (ADLs). Two hundred and forty-four centenarians from the Georgia Centenarian Study were included in this cross-sectional population-based study. Measures included the Direct Assessment of Functional Status, self-report and proxy report of functional status, and the Mini-Mental State Examination (MMSE). Associations between observed and proxy reports were stronger than between observed and self-report across BADL and IADL measures. A significant MMSE by type of report interaction was found, indicating that lower MMSE performance is associated with a greater discrepancy between subjective and objective ADL measures. Results demonstrate associations between 3 methods of assessing functional status and suggest that proxy reports are generally more accurate than self-report measures. Cognitive status accounted for some of the discrepancy between observed and self-reports, and we provide clinicians with tables to estimate centenarians' performance on observed functional measures based on MMSE and subjective report of functional status.
Swets, Marije; Dekker, Jack; van Emmerik-van Oortmerssen, Katelijne; Smid, Geert E.; Smit, Filip; de Haan, Lieuwe; Schoevers, Robert A.
Aims: The aims of this study were to conduct a meta-analysis and meta-regression to estimate the prevalence rates for obsessive compulsive symptoms (OCS) and obsessive compulsive disorder (OCD) in schizophrenia, and to investigate what influences these prevalence rates. Method: Studies were
Difficulties with Regression Analysis of Age-Adjusted Rates.
1982-09-01
Sidik, S. M.
1975-01-01
Ridge, Marquardt's generalized inverse, shrunken, and principal components estimators are discussed in terms of the objectives of point estimation of parameters, estimation of the predictive regression function, and hypothesis testing. It is found that as the normal equations approach singularity, more consideration must be given to estimable functions of the parameters as opposed to estimation of the full parameter vector; that biased estimators all introduce constraints on the parameter space; that adoption of mean squared error as a criterion of goodness should be independent of the degree of singularity; and that ordinary least-squares subset regression is the best overall method.
Efficient Quantile Estimation for Functional-Coefficient Partially Linear Regression Models
Institute of Scientific and Technical Information of China (English)
Zhangong ZHOU; Rong JIANG; Weimin QIAN
2011-01-01
The quantile estimation methods are proposed for the functional-coefficient partially linear regression (FCPLR) model by combining nonparametric and functional-coefficient regression (FCR) models. The local linear scheme and the integrated method are used to obtain local quantile estimators of all unknown functions in the FCPLR model. These resulting estimators are asymptotically normal, but each of them has a large variance. To reduce the variances of these quantile estimators, the one-step backfitting technique is used to obtain efficient quantile estimators of all unknown functions, and their asymptotic normality is derived. Two simulated examples are carried out to illustrate the proposed estimation methodology.
Zahari, Siti Meriam; Ramli, Norazan Mohamed; Moktar, Balkiah; Zainol, Mohammad Said
2014-09-01
In the presence of multicollinearity and multiple outliers, statistical inference for the linear regression model using ordinary least squares (OLS) estimators would be severely affected and produce misleading results. To overcome this, many approaches have been investigated, including robust methods, which are reported to be less sensitive to the presence of outliers. In addition, the ridge regression technique has been employed to tackle the multicollinearity problem. In order to mitigate both problems, a combination of ridge regression and robust methods is discussed in this study. The superiority of this approach is examined under the simultaneous presence of multicollinearity and multiple outliers in multiple linear regression. This study aimed to examine the performance of several well-known robust estimators in such a situation: M, MM, RIDGE, and the robust ridge regression estimators, namely the weighted ridge M-estimator (WRM), weighted ridge MM (WRMM), and ridge MM (RMM). Results of the study showed that in the presence of simultaneous multicollinearity and multiple outliers (in both the x- and y-directions), the RMM and RIDGE estimators are more or less similar in their superiority over the other estimators, regardless of the number of observations, level of collinearity, and percentage of outliers used. However, when outliers occurred in only a single direction (the y-direction), the WRMM estimator is the most superior among the robust ridge regression estimators, producing the least variance. In conclusion, robust ridge regression is the best alternative to robust and conventional least squares estimators when dealing with the simultaneous presence of multicollinearity and outliers.
Bowlby, Heather D; Gibson, A Jamie F
2015-08-01
Describing how population-level survival rates are influenced by environmental change becomes necessary during recovery planning to identify threats that should be the focus of future remediation efforts. However, the way data are analyzed has the potential to change our ecological understanding and thus subsequent recommendations for remedial actions to address threats. In regression, distributional assumptions underlying short time series of survival estimates cannot be investigated a priori, and the data likely contain points that do not follow the general trend (outliers) as well as additional variation relative to an assumed distribution (overdispersion). Using juvenile survival data from three endangered Atlantic salmon (Salmo salar L.) populations in response to hydrological variation, four distributions for the response were compared using lognormal and generalized linear models (GLM). The influence of outliers as well as overdispersion was investigated by comparing conclusions from robust regressions with these lognormal models and GLMs. The analyses strongly supported the use of a lognormal distribution for survival estimates (i.e., modeling the instantaneous rate of mortality as the response) and would have led to ambiguity in the identification of significant hydrological predictors, as well as low overall confidence in the predicted relationships, if only GLMs had been considered. However, using robust regression to evaluate the effect of additional variation and outliers in the data relative to regression assumptions resulted in a better understanding of the relationships between hydrological variables and survival that could be used for population-specific recovery planning. This manuscript highlights how a systematic analysis that explicitly considers what monitoring data represent and where variation is likely to come from is required in order to draw meaningful conclusions when analyzing changes in survival relative to environmental change.
Stahel-Donoho kernel estimation for fixed design nonparametric regression models
Institute of Scientific and Technical Information of China (English)
LIN Lu
2006-01-01
This paper reports a robust kernel estimation for fixed design nonparametric regression models. A Stahel-Donoho kernel estimation is introduced, in which the weight functions depend on both the depths of the data and the distances between the design points and the estimation points. Based on a local approximation, a computational technique is given to approximate the incomputable depths of the errors. As a result, the new estimator is computationally efficient. The proposed estimator attains a high breakdown point and has good asymptotic behavior, such as asymptotic normality and convergence in mean squared error. Unlike the depth-weighted estimator for parametric regression models, this depth-weighted nonparametric estimator has a simple variance structure, so we can compare its efficiency with that of the original estimator. Some simulations show that the new method can smooth the regression estimation and achieve some desirable balances between robustness and efficiency.
truncSP: An R Package for Estimation of Semi-Parametric Truncated Linear Regression Models
Directory of Open Access Journals (Sweden)
Maria Karlsson
2014-05-01
Problems with truncated data occur in many areas, complicating estimation and inference. For linear regression models, the ordinary least squares estimator is inconsistent and biased for these types of data and is therefore unsuitable for use. Alternative estimators, designed for the estimation of truncated regression models, have been developed. This paper presents the R package truncSP. The package contains functions for the estimation of semi-parametric truncated linear regression models using three different estimators: the symmetrically trimmed least squares, quadratic mode, and left truncated estimators, all of which have been shown to have good asymptotic and finite sample properties. The package also provides functions for the analysis of the estimated models. Data from the environmental sciences are used to illustrate the functions in the package.
WAVELET-BASED ESTIMATORS OF MEAN REGRESSION FUNCTION WITH LONG MEMORY DATA
Institute of Scientific and Technical Information of China (English)
LI Lin-yuan; XIAO Yi-min
2006-01-01
This paper provides an asymptotic expansion for the mean integrated squared error (MISE) of nonlinear wavelet-based mean regression function estimators with long memory data. This MISE expansion, when the underlying mean regression function is only piecewise smooth, is the same as the analogous expansion for the kernel estimators. However, for the kernel estimators, this MISE expansion generally fails if the additional smoothness assumption is absent.
Estimation of social discount rate for Lithuania
Directory of Open Access Journals (Sweden)
Vilma Kazlauskiene
2016-09-01
Purpose of the article: The paper analyses the problematics of estimating the social discount rate (SDR). The SDR is a critical parameter of cost-benefit analysis, which allows calculating the present value of the costs and benefits of public-sector investment projects. An incorrect choice of the SDR can lead to the realisation of an ineffective public project or, conversely, to the rejection of a cost-effective one. The relevance of this problem is determined by ongoing discussions and differing viewpoints among scientists on the most appropriate approach for determining the SDR, and by the absence of a methodically grounded SDR at the national level in Lithuania. Methodology/methods: The research is performed through analysis and systematization of the scientific and methodical literature, and through time series and regression analysis. Scientific aim: The aim of the article is to calculate the SDR based on the statistical data of Lithuania. Findings: The analysis of methods of SDR determination, as well as of the research performed by foreign researchers, supports the conclusion that the social rate of time preference (SRTP) approach is the most appropriate. The SDR calculated by the SRTP approach best reflects the main purpose of public investment projects, i.e. to enhance the social benefit for society. The analysis of SDR determination practice in foreign countries shows that the SDR should not be universal across states; each country should calculate the SDR from its own data and apply it in the assessment of public projects. Conclusions: The SDR calculated for Lithuania using the SRTP approach varies between 3.5 % and 4.3 %. Although this is lower than the 5 % suggested by the European Commission, the rate is based on the statistical data of Lithuania and should be used for the assessment of national public projects. Applying a well-grounded SDR yields a more accurate and reliable cost-benefit analysis of public projects.
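The SRTP approach rests on the Ramsey formula SDR = ρ + μ·g, combining the rate of pure time preference ρ, the elasticity of the marginal utility of consumption μ, and the growth rate of per-capita consumption g. The numbers below are illustrative placeholders, not the paper's estimates for Lithuania:

```python
# Ramsey formula: SDR = rho + mu * g (illustrative inputs, not the paper's estimates)
rho = 0.015   # rate of pure time preference
mu = 1.2      # elasticity of the marginal utility of consumption
g = 0.022     # long-run growth rate of per-capita consumption

sdr = rho + mu * g
print(f"SRTP-based social discount rate: {sdr:.2%}")
```

With these placeholder inputs the formula gives a rate of about 4.1 %, which happens to sit inside the 3.5-4.3 % band reported for Lithuania.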
The efficiency of modified jackknife and ridge type regression estimators: a comparison
Directory of Open Access Journals (Sweden)
Sharad Damodar Gore
2008-09-01
A common problem in multiple regression models is multicollinearity, which produces undesirable effects on the least squares estimator. To circumvent this problem, two well-known estimation procedures are often suggested in the literature: generalized ridge regression (GRR) estimation, suggested by Hoerl and Kennard, and jackknifed ridge regression (JRR) estimation, suggested by Singh et al. The GRR estimation leads to a reduction in the sampling variance, whereas JRR leads to a reduction in the bias. In this paper, we propose a new estimator, namely the modified jackknife ridge regression (MJR) estimator. It is based on a criterion that combines the ideas underlying both the GRR and JRR estimators. We have investigated the standard properties of this new estimator. From a simulation study, we find that the new estimator often outperforms the LASSO, and that it is superior to both the GRR and JRR estimators under the mean squared error criterion. The conditions under which the MJR estimator is better than the other two competing estimators have been investigated.
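The instability that ridge-type shrinkage targets can be seen in a closed-form two-predictor ridge estimate, (X'X + kI)⁻¹X'y: under near-perfect collinearity the OLS solution (k = 0) is wildly unstable in the difference direction, while a small ridge constant stabilizes it. The simulated data and the ridge constant below are illustrative only, not the proposed MJR estimator itself.

```python
import random

def ridge_2d(data, y, k):
    """Closed-form ridge estimate (X'X + kI)^(-1) X'y for two centred predictors."""
    s11 = sum(x1 * x1 for x1, _ in data) + k
    s22 = sum(x2 * x2 for _, x2 in data) + k
    s12 = sum(x1 * x2 for x1, x2 in data)
    t1 = sum(x1 * yi for (x1, _), yi in zip(data, y))
    t2 = sum(x2 * yi for (_, x2), yi in zip(data, y))
    det = s11 * s22 - s12 * s12
    return ((s22 * t1 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det)

rng = random.Random(7)
data, y = [], []
for _ in range(100):
    x1 = rng.gauss(0.0, 1.0)
    x2 = x1 + rng.gauss(0.0, 0.01)        # x2 is almost a copy of x1
    data.append((x1, x2))
    y.append(1.0 * x1 + 1.0 * x2 + rng.gauss(0.0, 1.0))
b_ols = ridge_2d(data, y, 0.0)            # unstable under near-collinearity
b_ridge = ridge_2d(data, y, 1.0)          # shrunken and stabilized
```

The well-identified sum of coefficients stays near its true value of 2, while the poorly identified difference is shrunk toward zero by the penalty.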
Directory of Open Access Journals (Sweden)
M. Srinivasan
2012-01-01
Problem statement: This study presents a novel method for determining the average winding temperature rise of transformers under predetermined field operating conditions. The rise in winding temperature was determined from the estimated values of winding resistance during the heat run test conducted as per the IEC standard. Approach: The estimation of hot resistance was modeled using Multiple Variable Regression (MVR), Multiple Polynomial Regression (MPR), and soft computing techniques such as the Artificial Neural Network (ANN) and the Adaptive Neuro Fuzzy Inference System (ANFIS). The modeled hot resistance helps to find the load losses in any load situation without a complicated measurement set-up. Results: These techniques were applied to hot resistance estimation for a dry-type transformer using the input variables cold resistance, ambient temperature, and temperature rise. The results are compared and show good agreement between measured and computed values. Conclusion: The proposed methods are verified using experimental results obtained from a temperature rise test performed on a 55 kVA dry-type transformer.
A note on constrained M-estimation and its recursive analog in multivariate linear regression models
Institute of Scientific and Technical Information of China (English)
Rao, Calyampudi R.
2009-01-01
In this paper, the constrained M-estimation of the regression coefficients and scatter parameters in a general multivariate linear regression model is considered. Since the constrained M-estimation is not easy to compute, an updating recursion procedure is proposed to simplify the computation of the estimators when a new observation is obtained. We show that, under mild conditions, the recursion estimates are strongly consistent. In addition, the asymptotic normality of the recursive constrained M-estimators of the regression coefficients is established. A Monte Carlo simulation study of the recursion estimates is also provided. Besides, the robustness and asymptotic behavior of constrained M-estimators are briefly discussed.
Malaria transmission rates estimated from serological data.
Burattini, M. N.; Massad, E; Coutinho, F. A.
1993-01-01
A mathematical model was used to estimate malaria transmission rates based on serological data. The model is minimally stochastic and assumes an age-dependent force of infection for malaria. The estimated transmission rates were applied to a simple compartmental model in order to mimic malaria transmission. The model has shown a good capacity to retrieve serological and parasite prevalence data.
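A common way to connect serological data to a transmission rate is the catalytic model, in which seroprevalence at age a under a constant force of infection λ is p(a) = 1 − exp(−λa). The sketch below inverts that relation for hypothetical age-stratified data (the values were generated with λ = 0.05 and are not from the paper, which uses an age-dependent force of infection):

```python
import math

# Hypothetical age-stratified seroprevalence (generated from p(a) = 1 - exp(-0.05 * a))
ages = [1, 2, 5, 10, 20, 40]
prev = [0.048, 0.095, 0.221, 0.393, 0.632, 0.865]

# Invert the catalytic model: each age group gives -log(1 - p) / a as an estimate of lambda
estimates = [-math.log(1.0 - p) / a for p, a in zip(prev, ages)]
lam = sum(estimates) / len(estimates)
print(f"estimated force of infection: {lam:.3f} per year")
```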
Lee, Wonyul; Liu, Yufeng
2012-10-01
Multivariate regression is a common statistical tool for practical problems. Many multivariate regression techniques are designed for univariate response cases. For problems with multiple response variables, one common approach is to apply the univariate response regression technique separately to each response variable. Although simple and popular, the univariate response approach ignores the joint information among the response variables. In this paper, we propose three new methods for utilizing joint information among response variables. All methods are in a penalized likelihood framework with weighted L(1) regularization. The proposed methods provide sparse estimators of the conditional inverse covariance matrix of the response vector given the explanatory variables, as well as sparse estimators of the regression parameters. Our first approach estimates the regression coefficients with plug-in estimated inverse covariance matrices, and our second approach estimates the inverse covariance matrix with plug-in estimated regression parameters. Our third approach estimates both simultaneously. Asymptotic properties of these methods are explored. Our numerical examples demonstrate that the proposed methods perform competitively in terms of prediction and variable selection, as well as inverse covariance matrix estimation.
Estimating the Impact of Urbanization on Air Quality in China Using Spatial Regression Models
Directory of Open Access Journals (Sweden)
Chuanglin Fang
2015-11-01
Urban air pollution is one of the most visible environmental problems to have accompanied China's rapid urbanization. Based on emission inventory data from 2014, gathered from 289 cities, we used Global and Local Moran's I to measure the spatial autocorrelation of Air Quality Index (AQI) values at the city level, and employed Ordinary Least Squares (OLS), Spatial Lag Model (SAR), and Geographically Weighted Regression (GWR) models to quantitatively estimate the comprehensive impact and spatial variations of China's urbanization process on air quality. The results show that significant spatial dependence and heterogeneity exist in AQI values. The regression models revealed that urbanization has played an important negative role in determining air quality in Chinese cities. Population, urbanization rate, automobile density, and the proportion of secondary industry were all found to have a significant influence on air quality. Per capita Gross Domestic Product (GDP) and the scale of urban land use, however, failed the significance test at the 10% level. The GWR model performed better than the global models, and the GWR results show that the relationship between urbanization and air quality is not constant in space. Further, the local parameter estimates suggest significant spatial variation in the impacts of the various urbanization factors on air quality.
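Global Moran's I, used above to measure spatial autocorrelation of AQI values, is straightforward to compute from a spatial weight matrix. The four-city toy example below uses rook contiguity on a line; the values and weights are illustrative, not the study's 289-city data.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and a symmetric 0/1 weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    num = sum(weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    w_sum = sum(sum(row) for row in weights)
    return (n / w_sum) * (num / den)

# Toy example: four cities on a line, neighbours share a border (rook contiguity)
aqi = [150.0, 140.0, 60.0, 50.0]                      # spatially clustered values
W = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
moran = morans_i(aqi, W)                              # positive => clustering
```

Because high and low AQI values sit next to each other, the statistic comes out clearly positive, indicating spatial clustering.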
A note on the maximum likelihood estimator in the gamma regression model
Directory of Open Access Journals (Sweden)
Jerzy P. Rydlewski
2009-01-01
This paper considers a nonlinear regression model in which the dependent variable has the gamma distribution. A model is considered in which the shape parameter of the random variable is the sum of continuous and algebraically independent functions. The paper proves that there is exactly one maximum likelihood estimator for the gamma regression model.
Regression techniques for estimating soil organic carbon contents from VIS/NIR reflectance spectra
Schwanghart, W.; Jarmer, T.; Bayer, A.; Hoffmann, U.; Hunziker, M.; Kuhn, N. J.; Ehlers, M.
2012-04-01
Soil reflectance spectroscopy is regarded as a promising approach to efficiently obtain densely sampled soil organic carbon (SOC) estimates at various spatial scales. The estimates are usually based on a statistical modeling approach, since physical models are mostly not applicable owing to the manifold influences of different soil constituents and properties on soil spectra. Different multivariate statistical methods exist to estimate SOC concentrations in soil samples using visible and near-infrared (VIS/NIR) reflectance spectra. All these techniques face the challenge of generating accurate predictive models with a disproportionately large number of variables compared to the number of observations in such datasets and, in addition, highly correlated independent variables. This often results in overfitting and may at the same time reduce the predictive power of such models. In this study, we conduct a rigorous assessment of the predictive ability of different regression techniques (stepwise regression, robust regression with feature selection, lasso, ridge regression, elastic net, principal component (PC) regression, and partial least squares (PLS) regression). We apply datasets from different environments to include a wide variety of soils and to investigate the effects of different SOC variances and concentrations on model performance. Our hypothesis is that the predictive ability of regression techniques can be significantly improved by using more advanced techniques such as PLS regression. We discuss our findings with respect to the applicability of SOC estimation from VIS/NIR reflectance spectra in different environments.
Institute of Scientific and Technical Information of China (English)
Anonymous
2008-01-01
A class of estimators of the mean survival time with interval-censored data is studied by the unbiased transformation method. The estimators are constructed based on the observations to ensure unbiasedness, in the sense that the estimators in a certain class have the same expectation as the mean survival time. The estimators have good properties such as strong consistency (with the rate O(n^(-1/2) (log log n)^(1/2))) and asymptotic normality. The application to linear regression is considered and simulation reports are given.
Yoonseok Shin
2015-01-01
Among the recent data mining techniques available, the boosting approach has attracted a great deal of attention because of its effective learning algorithm and strong bounds on its generalization performance. However, the boosting approach has yet to be used in regression problems within the construction domain, including cost estimation, despite being actively utilized in other domains. Therefore, a boosting regression tree (BRT) is applied to cost estimation at the early stage...
Asymptotic Normality of LS Estimate in Simple Linear EV Regression Model
Institute of Scientific and Technical Information of China (English)
Jixue LIU
2006-01-01
Though the EV model is theoretically more appropriate for applications in which measurement errors exist, people are still more inclined to use ordinary regression models and the traditional LS method owing to the difficulties of statistical inference and computation. It is therefore meaningful to study the performance of the LS estimate in the EV model. In this article we obtain general conditions guaranteeing the asymptotic normality of the estimates of the regression coefficients in the linear EV model. Notably, the result differs in some respects from the corresponding result in the ordinary regression model.
Directory of Open Access Journals (Sweden)
KAYODE AYINDE
2012-11-01
The performance of estimators of the linear regression model with autocorrelated error terms has been attributed to the nature and specification of the explanatory variables. Violation of the assumption of independence of the explanatory variables is not uncommon, especially in business, economics, and the social sciences, and has led to the development of many estimators. Moreover, prediction is one of the main purposes of regression analysis. This work therefore examines the parameter estimates of the Ordinary Least Squares estimator (OLS), the Cochrane-Orcutt estimator (COR), the Maximum Likelihood estimator (ML), and estimators based on Principal Component analysis (PC) for prediction in the linear regression model with autocorrelated error terms under violation of the assumption of independent regressors (multicollinearity), using a Monte Carlo experiment approach. With uniform variables as regressors, it further identifies the best estimator for prediction purposes by averaging the adjusted coefficient of determination of each estimator over the number of trials. Results reveal that the performances of the COR and ML estimators at each level of multicollinearity over the levels of autocorrelation are convex-like, while those of the OLS and PC estimators are concave; and that as the level of multicollinearity increases, the estimators perform much better at all levels of autocorrelation. Except when the sample size is small (n = 10), the performances of the COR and ML estimators are generally best and asymptotically the same. When the sample size is small, the COR estimator is still best except when the autocorrelation level is low; in these instances, the PC estimator is either best or competes with the best estimator. Moreover, at low levels of autocorrelation in all sample sizes, the OLS estimator competes with the best estimator at all levels of multicollinearity.
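The COR estimator referred to above can be sketched as follows: fit OLS, estimate the AR(1) coefficient ρ from the residuals, quasi-difference the data, and re-fit until convergence. The data-generating values below (ρ = 0.7, slope 2) are illustrative, not the study's Monte Carlo design.

```python
import random

def ols(xs, ys):
    """Simple linear regression: returns (intercept, slope)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b

def cochrane_orcutt(xs, ys, n_iter=20):
    """Iterate: OLS fit -> AR(1) rho from residuals -> quasi-difference -> re-fit."""
    a, b = ols(xs, ys)
    rho = 0.0
    for _ in range(n_iter):
        e = [y - a - b * x for x, y in zip(xs, ys)]
        rho = sum(e[t] * e[t - 1] for t in range(1, len(e))) / sum(v * v for v in e[:-1])
        xs_s = [xs[t] - rho * xs[t - 1] for t in range(1, len(xs))]
        ys_s = [ys[t] - rho * ys[t - 1] for t in range(1, len(ys))]
        a_s, b = ols(xs_s, ys_s)
        a = a_s / (1.0 - rho)            # transform the intercept back
    return a, b, rho

rng = random.Random(5)
n = 300
xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
u = [0.0]
for _ in range(n - 1):
    u.append(0.7 * u[-1] + rng.gauss(0.0, 0.5))   # AR(1) errors with rho = 0.7
ys = [1.0 + 2.0 * x + e for x, e in zip(xs, u)]
a, b, rho = cochrane_orcutt(xs, ys)
```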
Kalton, G.
1983-01-01
A number of surveys have been conducted to study the relationship between the level of aircraft or traffic noise exposure experienced by people living in a particular area and their annoyance with it. These surveys generally employ a clustered sample design, which affects the precision of the survey estimates. Regression analysis of annoyance on noise measures and other variables is often an important component of the survey analysis. Formulae are presented for estimating the standard errors of regression coefficients and of ratios of regression coefficients that are applicable with a two- or three-stage clustered sample design. Using a simple cost function, the optimum allocation of the sample across the stages of the design for estimating a regression coefficient is also determined.
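The precision loss from clustering can be sketched numerically (an illustration of the general point, not the paper's formulae): with a cluster-level covariate and a shared cluster error component, the naive OLS standard error of the slope badly understates the cluster-robust (sandwich) standard error.

```python
import numpy as np

rng = np.random.default_rng(2)
G, m = 200, 10                         # 200 areas (clusters) of 10 respondents
cluster = np.repeat(np.arange(G), m)
x = rng.normal(size=G)[cluster]        # noise exposure varies by area, not person
e = rng.normal(size=G)[cluster] + rng.normal(scale=0.5, size=G * m)
y = 1.0 + 0.5 * x + e                  # "annoyance" with a shared cluster effect

X = np.column_stack([np.ones_like(x), x])
beta = np.linalg.lstsq(X, y, rcond=None)[0]
r = y - X @ beta
XtX_inv = np.linalg.inv(X.T @ X)

# Naive OLS standard error ignores the clustering: s^2 (X'X)^-1
s2 = r @ r / (len(y) - 2)
se_naive = np.sqrt(s2 * XtX_inv[1, 1])

# Cluster-robust (sandwich) standard error sums the scores within clusters
meat = np.zeros((2, 2))
for g in range(G):
    s = X[cluster == g].T @ r[cluster == g]
    meat += np.outer(s, s)
se_cluster = np.sqrt((XtX_inv @ meat @ XtX_inv)[1, 1])
print(se_naive, se_cluster)            # the naive SE is badly optimistic
```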
Variance Estimation Using Refitted Cross-validation in Ultrahigh Dimensional Regression
Fan, Jianqing; Hao, Ning
2010-01-01
Variance estimation is a fundamental problem in statistical modeling. In ultrahigh dimensional linear regression, where the dimensionality is much larger than the sample size, traditional variance estimation techniques are not applicable. Recent advances in variable selection for ultrahigh dimensional linear regression make this problem accessible. One of the major problems in ultrahigh dimensional regression is the high spurious correlation between the unobserved realized noise and some of the predictors. As a result, the realized noise is actually predicted when extra irrelevant variables are selected, leading to a serious underestimate of the noise level. In this paper, we propose a two-stage refitted procedure via a data splitting technique, called refitted cross-validation (RCV), to attenuate the influence of irrelevant variables with high spurious correlations. Our asymptotic results show that the resulting procedure performs as well as the oracle estimator, which knows in advance the mean regression functi...
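A toy version of the RCV idea (my own sketch; the paper pairs RCV with a generic selection step, for which simple marginal correlation screening stands in here): select variables on one half of the data, refit by OLS on the other half so that spuriously selected variables lose their predictive power, and average the two noise estimates.

```python
import numpy as np

def rcv_variance(X, y, k=10, seed=0):
    """Refitted cross-validation: screen variables on one half,
    refit OLS on the other half, and average the two noise estimates."""
    rng = np.random.default_rng(seed)
    n = len(y)
    idx = rng.permutation(n)
    halves = (idx[: n // 2], idx[n // 2 :])
    s2 = []
    for a, b in (halves, halves[::-1]):
        # Screen on half a: keep the k predictors most correlated with y
        corr = np.abs((X[a] - X[a].mean(0)).T @ (y[a] - y[a].mean()))
        sel = np.argsort(corr)[-k:]
        # Refit on half b using only the selected predictors
        Xb = np.column_stack([np.ones(len(b)), X[b][:, sel]])
        r = y[b] - Xb @ np.linalg.lstsq(Xb, y[b], rcond=None)[0]
        s2.append(r @ r / (len(b) - k - 1))
    return float(np.mean(s2))

# p >> n with only 3 truly relevant predictors; true noise variance is 1
rng = np.random.default_rng(3)
n, p = 400, 2000
X = rng.normal(size=(n, p))
y = X[:, 0] + X[:, 1] - X[:, 2] + rng.normal(size=n)
sigma2 = rcv_variance(X, y)
print(sigma2)   # close to 1
```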
EFFICIENT ESTIMATION OF FUNCTIONAL-COEFFICIENT REGRESSION MODELS WITH DIFFERENT SMOOTHING VARIABLES
Institute of Scientific and Technical Information of China (English)
Zhang Riquan; Li Guoying
2008-01-01
In this article, a procedure is defined for estimating the coefficient functions in functional-coefficient regression models with different smoothing variables in different coefficient functions. In the first step, initial estimates of the coefficient functions are obtained by the local linear technique and the averaged method. In the second step, efficient estimates of the coefficient functions are obtained from the initial estimates by a one-step back-fitting procedure. The efficient estimators share the same asymptotic normality as the local linear estimators for functional-coefficient models with a single smoothing variable in different functions. Two simulated examples show that the procedure is effective.
Enders, Craig K.
2001-01-01
Examined the performance of a recently available full information maximum likelihood (FIML) estimator in a multiple regression model with missing data using Monte Carlo simulation and considering the effects of four independent variables. Results indicate that FIML estimation was superior to that of three ad hoc techniques, with less bias and less…
A Simple Introduction to Moving Least Squares and Local Regression Estimation
Energy Technology Data Exchange (ETDEWEB)
Garimella, Rao Veerabhadra [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
2017-06-22
In this brief note, a highly simplified introduction to estimating functions over a set of particles is presented. The note starts from Global Least Squares fitting, going on to Moving Least Squares estimation (MLS) and finally, Local Regression Estimation (LRE).
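As a rough companion to the note (my own minimal sketch, not the note's code), a local regression estimate can be built from weighted least squares fits centered at each evaluation point; here a Gaussian weight with an assumed bandwidth plays the MLS weight function.

```python
import numpy as np

def local_linear(x0, x, y, h=0.05):
    """Moving least squares: weighted linear fit around x0, evaluated at x0."""
    w = np.exp(-0.5 * ((x - x0) / h) ** 2)          # Gaussian weights
    X = np.column_stack([np.ones_like(x), x - x0])  # local linear basis
    W = np.diag(w)
    beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)
    return beta[0]                                   # fitted value at x0

rng = np.random.default_rng(4)
x = np.sort(rng.uniform(0, 1, 300))
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.1, size=300)
fit = np.array([local_linear(x0, x, y) for x0 in x])
err = np.max(np.abs(fit - np.sin(2 * np.pi * x)))
print(err)   # small: the noisy curve is recovered pointwise
```

The bandwidth h controls the bias-variance tradeoff; a global least squares line would miss the sine shape entirely.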
Ahearn, Elizabeth A.
2010-01-01
Multiple linear regression equations for determining flow-duration statistics were developed to estimate select flow exceedances ranging from 25- to 99-percent for six 'bioperiods'-Salmonid Spawning (November), Overwinter (December-February), Habitat Forming (March-April), Clupeid Spawning (May), Resident Spawning (June), and Rearing and Growth (July-October)-in Connecticut. Regression equations also were developed to estimate the 25- and 99-percent flow exceedances without reference to a bioperiod. In total, 32 equations were developed. The predictive equations were based on regression analyses relating flow statistics from streamgages to GIS-determined basin and climatic characteristics for the drainage areas of those streamgages. Thirty-nine streamgages (and an additional 6 short-term streamgages and 28 partial-record sites for the non-bioperiod 99-percent exceedance) in Connecticut and adjacent areas of neighboring States were used in the regression analysis. Weighted least squares regression analysis was used to determine the predictive equations; weights were assigned based on record length. The basin characteristics-drainage area, percentage of area with coarse-grained stratified deposits, percentage of area with wetlands, mean monthly precipitation (November), mean seasonal precipitation (December, January, and February), and mean basin elevation-are used as explanatory variables in the equations. Standard errors of estimate of the 32 equations ranged from 10.7 to 156 percent with medians of 19.2 and 55.4 percent to predict the 25- and 99-percent exceedances, respectively. Regression equations to estimate high and median flows (25- to 75-percent exceedances) are better predictors (smaller variability of the residual values around the regression line) than the equations to estimate low flows (less than 75-percent exceedance). The Habitat Forming (March-April) bioperiod had the smallest standard errors of estimate, ranging from 10.7 to 20.9 percent. In
Data Fusion for Improved Respiration Rate Estimation
Directory of Open Access Journals (Sweden)
Gari D. Clifford
2010-01-01
Full Text Available We present an application of a modified Kalman-Filter (KF framework for data fusion to the estimation of respiratory rate from multiple physiological sources which is robust to background noise. A novel index of the underlying signal quality of respiratory signals is presented and then used to modify the noise covariance matrix of the KF which discounts the effect of noisy data. The signal quality index, together with the KF innovation sequence, is also used to weight multiple independent estimates of the respiratory rate from independent KFs. The approach is evaluated both on a realistic artificial ECG model (with real additive noise and on real data taken from 30 subjects with overnight polysomnograms, containing ECG, respiration, and peripheral tonometry waveforms from which respiration rates were estimated. Results indicate that our automated voting system can out-perform any individual respiration rate estimation technique at all levels of noise and respiration rates exhibited in our data. We also demonstrate that even the addition of a noisier extra signal leads to an improved estimate using our framework. Moreover, our simulations demonstrate that different ECG respiration extraction techniques have different error profiles with respect to the respiration rate, and therefore a respiration rate-related modification of any fusion algorithm may be appropriate.
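The full Kalman-filter framework is beyond a short sketch, but its core fusion step, weighting independent rate estimates by the inverse of their (signal-quality-derived) variances, can be illustrated as follows; the rates and noise levels are invented for the example, not taken from the paper.

```python
import numpy as np

def fuse(estimates, variances):
    """Inverse-variance weighted fusion of independent rate estimates."""
    w = 1.0 / np.asarray(variances, dtype=float)
    return float(np.sum(w * estimates) / np.sum(w))

rng = np.random.default_rng(5)
true_rate = 15.0                       # breaths per minute
sds = np.array([0.5, 1.0, 3.0])        # three sources, one quite noisy
trials = np.array([
    fuse(true_rate + sds * rng.normal(size=3), sds ** 2) for _ in range(2000)
])
print(trials.mean(), trials.std())
```

The fused estimate's spread falls below that of the best single source (about 0.44 vs 0.5 here), mirroring the paper's observation that even adding a noisier extra signal improves the fused estimate.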
Transient simulation of regression rate on thrust regulation process in hybrid rocket motor
Directory of Open Access Journals (Sweden)
Tian Hui
2014-12-01
Full Text Available The main goal of this paper is to study the characteristics of the regression rate of the solid grain during the thrust regulation process. For this purpose, an unsteady numerical model of the regression rate is established. Gas-solid coupling is considered between the solid grain surface and the combustion gas, and a dynamic mesh is used to simulate the regression of the solid fuel surface. Based on this model, numerical simulations of a H2O2/HTPB (hydroxyl-terminated polybutadiene) hybrid motor have been performed for the flow control process. The simulation results show that under a step change of the oxidizer mass flow rate, the regression rate cannot reach a stable value instantly because the flow field requires a short time period to adjust. The regression rate increases with a linear gain of the oxidizer mass flow rate, and has a higher slope than the relative inlet function of the oxidizer flow rate. A shorter regulation time causes a higher regression rate during the regulation process. The results also show that transient calculation can better simulate the instantaneous regression rate during operation.
Su, Liyun; Zhao, Yanyong; Yan, Tianshun; Li, Fenglan
2012-01-01
Multivariate local polynomial fitting is applied to the multivariate linear heteroscedastic regression model. First, local polynomial fitting is applied to estimate the heteroscedastic function, and the coefficients of the regression model are then obtained by the generalized least squares method. One noteworthy feature of our approach is that we avoid testing for heteroscedasticity by improving the traditional two-stage method. Owing to the nonparametric technique of local polynomial estimation, it is unnecessary to know the form of the heteroscedastic function, so the estimation precision can be improved when the heteroscedastic function is unknown. Furthermore, we verify that the regression coefficients are asymptotically normal based on numerical simulations and normal Q-Q plots of residuals. Finally, the simulation results and the local polynomial estimation of real data indicate that our approach is effective in finite-sample situations.
Convex weighting criteria for speaking rate estimation
Jiao, Yishan; Berisha, Visar; Tu, Ming; Liss, Julie
2015-01-01
Speaking rate estimation directly from the speech waveform is a long-standing problem in speech signal processing. In this paper, we pose speaking rate estimation as the problem of estimating a temporal density function whose integral over a given interval yields the speaking rate within that interval. In contrast to many existing methods, we avoid the more difficult task of detecting individual phonemes within the speech signal, and we avoid heuristics such as thresholding the temporal envelope to estimate the number of vowels. Rather, the proposed method aims to learn an optimal weighting function that can be applied directly to time-frequency features of a speech signal to yield a temporal density function. We propose two convex cost functions for learning the weighting functions and an adaptation strategy to customize the approach to a particular speaker using minimal training. The algorithms are evaluated on the TIMIT corpus, on a dysarthric speech corpus, and on the ICSI Switchboard spontaneous speech corpus. Results show that the proposed methods outperform three competing methods on both healthy and dysarthric speech. In addition, for spontaneous speech rate estimation, the results show a high correlation between the estimated speaking rate and ground truth values. PMID:26167516
Directory of Open Access Journals (Sweden)
Anwar Fitrianto
2014-01-01
Full Text Available When independent variables have high linear correlation in a multiple linear regression model, the resulting analysis can be misleading. This happens when the analysis is based on the common Ordinary Least Squares (OLS) method. In this situation, the ridge regression estimator is recommended instead. We conduct a simulation study to compare the performance of the ridge regression estimator with that of OLS, and find that the Hoerl and Kennard ridge regression estimation method performs better than the other approaches.
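A minimal sketch of the comparison (synthetic data, my own illustration): with nearly collinear regressors the OLS coefficients explode in variance, while the ridge estimator with the Hoerl-Kennard data-driven constant k = p·s²/(b'b) shrinks them back toward stability.

```python
import numpy as np

def ridge(X, y, k):
    """Ridge estimator (X'X + kI)^-1 X'y; k = 0 gives OLS."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y)

rng = np.random.default_rng(6)
n = 100
z = rng.normal(size=n)
# Two nearly collinear regressors
X = np.column_stack([z + 0.01 * rng.normal(size=n),
                     z + 0.01 * rng.normal(size=n)])
y = X @ np.array([1.0, 1.0]) + rng.normal(size=n)

b_ols = ridge(X, y, 0.0)
s2 = np.sum((y - X @ b_ols) ** 2) / (n - 2)
k_hk = 2 * s2 / (b_ols @ b_ols)     # Hoerl-Kennard choice, p = 2 here
b_ridge = ridge(X, y, k_hk)
print(b_ols, b_ridge)               # ridge has strictly smaller norm
```

The sum b1 + b2 is well identified in both fits; only the unstable difference direction is shrunk by the ridge penalty.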
Saadah, Nicholas H; van Hout, Fabienne M A; Schipperus, Martin R; le Cessie, Saskia; Middelburg, Rutger A; Wiersum-Osselton, Johanna C; van der Bom, Johanna G
2017-09-01
We estimated rates for common plasma-associated transfusion reactions and compared reported rates for various plasma types. We performed a systematic review and meta-analysis of peer-reviewed articles that reported plasma transfusion reaction rates. Random-effects pooled rates were calculated and compared between plasma types. Meta-regression was used to compare various plasma types with regard to their reported plasma transfusion reaction rates. Forty-eight studies reported transfusion reaction rates for fresh-frozen plasma (FFP; mixed-sex and male-only), amotosalen INTERCEPT FFP, methylene blue-treated FFP, and solvent/detergent-treated pooled plasma. Random-effects pooled average rates for FFP were: allergic reactions, 92/10^5 units transfused (95% confidence interval [CI], 46-184/10^5 units transfused); febrile nonhemolytic transfusion reactions (FNHTRs), 12/10^5 units transfused (95% CI, 7-22/10^5 units transfused); transfusion-associated circulatory overload (TACO), 6/10^5 units transfused (95% CI, 1-30/10^5 units transfused); transfusion-related acute lung injury (TRALI), 1.8/10^5 units transfused (95% CI, 1.2-2.7/10^5 units transfused); and anaphylactic reactions, 0.8/10^5 units transfused (95% CI, 0-45.7/10^5 units transfused). Risk differences between plasma types were not significant for allergic reactions, TACO, or anaphylactic reactions. Methylene blue-treated FFP led to fewer FNHTRs than FFP (risk difference = -15.3 FNHTRs/10^5 units transfused; 95% CI, -24.7 to -7.1 reactions/10^5 units transfused), and male-only FFP led to fewer cases of TRALI than mixed-sex FFP (risk difference = -0.74 TRALI/10^5 units transfused; 95% CI, -2.42 to -0.42 injuries/10^5 units transfused). Meta-regression demonstrates that the rate of FNHTRs is lower for methylene blue-treated FFP than for untreated FFP, and the rate of TRALI is lower for male-only than for mixed-sex FFP, whereas no significant differences are observed between plasma types for allergic
Adaptive Algorithm for Chirp-Rate Estimation
Directory of Open Access Journals (Sweden)
Igor Djurović
2009-01-01
Full Text Available Chirp-rate, as a second derivative of signal phase, is an important feature of nonstationary signals in numerous applications such as radar, sonar, and communications. In this paper, an adaptive algorithm for the chirp-rate estimation is proposed. It is based on the confidence intervals rule and the cubic-phase function. The window width is adaptively selected to achieve good tradeoff between bias and variance of the chirp-rate estimate. The proposed algorithm is verified by simulations and the results show that it outperforms the standard algorithm with fixed window width.
Resting heart rate estimation using PIR sensors
Kapu, Hemanth; Saraswat, Kavisha; Ozturk, Yusuf; Cetin, A. Enis
2017-09-01
In this paper, we describe a non-invasive and non-contact system for estimating resting heart rate (RHR) using a pyroelectric infrared (PIR) sensor. This infrared system monitors and records the chest motion of a subject using the analog output signal of the PIR sensor. The analog output represents the composite motion due to the inhale-exhale process, with magnitude much larger than the minute vibrations of the heartbeat. Since the acceleration of the heart activity is much faster than that of breathing, the second derivative of the PIR sensor signal monitoring the chest of the subject is used to estimate the resting heart rate. Experimental results indicate that this ambient sensor can measure resting heart rate with a chi-square significance level of α = 0.05 compared to an industry-standard PPG sensor. This new system provides a low-cost and effective way to estimate the resting heart rate, which is an important biological marker.
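A minimal sketch of the signal chain described above, on synthetic data (the sampling rate, amplitudes and frequencies are assumptions, not the authors' settings): the second difference of the chest signal suppresses the slow, large respiration component relative to the faster heartbeat, whose rate is then read off an FFT peak.

```python
import numpy as np

fs, T = 50.0, 60.0                            # 50 Hz sampling, 60 s window
t = np.arange(0, T, 1 / fs)
chest = 10.0 * np.sin(2 * np.pi * 0.25 * t)   # breathing: large, slow
heart = 1.0 * np.sin(2 * np.pi * 1.2 * t)     # heartbeat: small, 72 bpm
sig = chest + heart

# Second difference ~ second derivative: gain grows with frequency,
# so the faster heartbeat now dominates the slower breathing
d2 = np.diff(sig, n=2)
spec = np.abs(np.fft.rfft(d2 * np.hanning(len(d2))))
freqs = np.fft.rfftfreq(len(d2), d=1 / fs)
peak = freqs[1:][np.argmax(spec[1:])]          # skip the DC bin
bpm = 60.0 * peak
print(bpm)                                     # about 72 beats per minute
```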
Dai, Wenlin
2017-09-01
Difference-based methods do not require estimating the mean function in nonparametric regression and are therefore popular in practice. In this paper, we propose a unified framework for variance estimation that systematically combines the linear regression method with higher-order difference estimators. The unified framework greatly enriches the existing literature on variance estimation and includes most existing estimators as special cases. More importantly, it also provides a principled way to solve the challenging difference sequence selection problem, a long-standing controversial issue in nonparametric regression. Using both theory and simulations, we recommend using the ordinary difference sequence in the unified framework, whether the sample size is small or the signal-to-noise ratio is large. Finally, to meet practical demands, we have developed a unified R package, named VarED, that integrates the existing difference-based estimators and the unified estimators in nonparametric regression; it is freely available at http://cran.r-project.org/web/packages/.
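The simplest member of the difference-based family uses the ordinary first-order difference sequence (the data below are synthetic; VarED's estimators are more general). Because adjacent values of a smooth mean nearly cancel, the differences carry almost pure noise, giving sigma^2 ≈ sum (y_{i+1} - y_i)^2 / (2(n-1)) without ever fitting the mean.

```python
import numpy as np

def diff_variance(y):
    """First-order difference-based noise variance estimator:
    sum of squared adjacent differences divided by 2(n-1)."""
    d = np.diff(y)
    return float(d @ d / (2 * (len(y) - 1)))

rng = np.random.default_rng(7)
x = np.linspace(0, 1, 2000)
y = np.sin(3 * x) + rng.normal(scale=0.3, size=x.size)  # smooth mean + noise
sigma2_hat = diff_variance(y)
print(sigma2_hat)   # close to 0.3**2 = 0.09
```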
Quasi-estimation as a Basis for Two-stage Solving of Regression Problem
Gordinsky, Anatoly
2010-01-01
An effective two-stage method for estimating the parameters of a linear regression is considered. For this purpose we introduce a certain quasi-estimator that, in contrast to the usual estimator, produces two alternative estimates. It is proved that, in comparison to the least squares estimate, one alternative has a significantly smaller quadratic risk while retaining unbiasedness and consistency. These properties hold for one-dimensional, multi-dimensional, orthogonal and non-orthogonal problems. Moreover, a Monte Carlo simulation confirms the high robustness of the quasi-estimator to violations of the initial assumptions. At the first stage of the estimation we therefore calculate the two alternative estimates mentioned above. At the second stage we choose the better of these alternatives, using additional information, including, but not limited to, information of an a priori nature. In the case of two alternatives the volume of such information should be minimal. Furthermore, the additional ...
Learning rates of least-square regularized regression with polynomial kernels
Institute of Scientific and Technical Information of China (English)
无
2009-01-01
This paper presents learning rates for least-square regularized regression algorithms with polynomial kernels. The target is the error analysis for the regression problem in learning theory. A regularization scheme is given which yields sharp learning rates. The rates depend on the dimension of the polynomial space and on the polynomial reproducing kernel Hilbert space, measured by covering numbers. Meanwhile, we also establish the direct approximation theorem by Bernstein-Durrmeyer operators in L^2_{ρ_X}, where ρ_X is a Borel probability measure.
Experimental investigation of fuel regression rate in a HTPB based lab-scale hybrid rocket motor
Li, Xintian; Tian, Hui; Yu, Nanjia; Cai, Guobiao
2014-12-01
The fuel regression rate is an important parameter in the design process of the hybrid rocket motor. Additives in the solid fuel may have influences on the fuel regression rate, which will affect the internal ballistics of the motor. A series of firing experiments have been conducted on lab-scale hybrid rocket motors with 98% hydrogen peroxide (H2O2) oxidizer and hydroxyl terminated polybutadiene (HTPB) based fuels in this paper. An innovative fuel regression rate analysis method is established to diminish the errors caused by start and tailing stages in a short time firing test. The effects of the metal Mg, Al, aromatic hydrocarbon anthracene (C14H10), and carbon black (C) on the fuel regression rate are investigated. The fuel regression rate formulas of different fuel components are fitted according to the experiment data. The results indicate that the influence of C14H10 on the fuel regression rate of HTPB is not evident. However, the metal additives in the HTPB fuel can increase the fuel regression rate significantly.
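Fitted regression rate formulas of the kind mentioned above conventionally take the power-law form r = a·G_ox^n in the oxidizer mass flux. As an illustration with made-up numbers (a, n and the flux values are hypothetical, not the paper's data), both constants can be recovered by ordinary least squares in log-log space:

```python
import numpy as np

# Hypothetical regression-rate data generated from r = a * Gox^n
Gox = np.array([50.0, 80.0, 120.0, 180.0, 260.0])  # oxidizer flux, kg/(m^2 s)
r = 0.04 * Gox ** 0.6                               # regression rate, mm/s

# log r = log a + n log Gox: fit a straight line in log-log space
A = np.column_stack([np.ones_like(Gox), np.log(Gox)])
coef = np.linalg.lstsq(A, np.log(r), rcond=None)[0]
a, n = np.exp(coef[0]), coef[1]
print(a, n)   # recovers a = 0.04 and n = 0.6
```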
Improved Estimation of Earth Rotation Parameters Using the Adaptive Ridge Regression
Huang, Chengli; Jin, Wenjing
1998-05-01
The multicollinearity among regression variables is a common phenomenon in the reduction of astronomical data. The phenomenon of multicollinearity and its diagnostic factors are introduced first. As a remedy, a new method called adaptive ridge regression (ARR), which improves the choice of the departure constant θ in ridge regression, is suggested and applied to a case in which the Earth orientation parameters (EOP) are determined by lunar laser ranging (LLR). A diagnosis based on the variance inflation factors (VIFs) shows that serious multicollinearity exists among the regression variables. It is shown that the ARR method is effective in reducing the multicollinearity and makes the regression coefficients more stable than ordinary least squares (LS) estimation, especially when the multicollinearity is serious.
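The VIF diagnosis mentioned above is easy to reproduce on synthetic data (the sketch below is illustrative, not the LLR analysis): VIF_j = 1/(1 - R_j^2), where R_j^2 comes from regressing variable j on the remaining regressors.

```python
import numpy as np

def vif(X):
    """Variance inflation factor of each column of X."""
    out = []
    for j in range(X.shape[1]):
        y = X[:, j]
        Z = np.column_stack([np.ones(len(y)), np.delete(X, j, axis=1)])
        r = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
        r2 = 1 - (r @ r) / np.sum((y - y.mean()) ** 2)
        out.append(1.0 / (1.0 - r2))
    return np.array(out)

rng = np.random.default_rng(8)
z = rng.normal(size=500)
X = np.column_stack([z + 0.1 * rng.normal(size=500),   # nearly collinear pair
                     z + 0.1 * rng.normal(size=500),
                     rng.normal(size=500)])            # independent column
print(vif(X))   # first two far above the common rule-of-thumb cutoff of 10
```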
Estimation of biomass in wheat using random forest regression algorithm and remote sensing data
Institute of Scientific and Technical Information of China (English)
Li'ai Wang; Xudong Zhou; Xinkai Zhu; Zhaodi Dong; Wenshan Guo
2016-01-01
Wheat biomass can be estimated using appropriate spectral vegetation indices. However, the accuracy of estimation should be further improved for on-farm crop management. Previous studies focused on developing vegetation indices, however limited research exists on modeling algorithms. The emerging Random Forest (RF) machine-learning algorithm is regarded as one of the most precise prediction methods for regression modeling. The objectives of this study were to (1) investigate the applicability of the RF regression algorithm for remotely estimating wheat biomass, (2) test the performance of the RF regression model, and (3) compare the performance of the RF algorithm with support vector regression (SVR) and artificial neural network (ANN) machine-learning algorithms for wheat biomass estimation. Single HJ-CCD images of wheat from test sites in Jiangsu province were obtained during the jointing, booting, and anthesis stages of growth. Fifteen vegetation indices were calculated based on these images. In-situ wheat above-ground dry biomass was measured during the HJ-CCD data acquisition. The results showed that the RF model produced more accurate estimates of wheat biomass than the SVR and ANN models at each stage, and its robustness is as good as SVR but better than ANN. The RF algorithm provides a useful exploratory and predictive tool for estimating wheat biomass on a large scale in Southern China.
Simultaneous estimation and variable selection in median regression using Lasso-type penalty.
Xu, Jinfeng; Ying, Zhiliang
2010-06-01
We consider median regression with a LASSO-type penalty term for variable selection. With a fixed number of variables in the regression model, a two-stage method is proposed for simultaneous estimation and variable selection in which the degree of penalty is adaptively chosen. A Bayesian information criterion type approach is proposed and used to obtain a data-driven procedure that is proved to automatically select asymptotically optimal tuning parameters. It is shown that the resultant estimator achieves the so-called oracle property. The combination of median regression and the LASSO penalty is computationally easy to implement via standard linear programming. A random perturbation scheme can be used to obtain a simple estimator of the standard error. Simulation studies are conducted to assess the finite-sample performance of the proposed method. We illustrate the methodology with a real example.
Kovalchik, Stephanie A; Varadhan, Ravi; Fetterman, Barbara; Poitras, Nancy E; Wacholder, Sholom; Katki, Hormuzd A
2013-02-28
Estimates of absolute risks and risk differences are necessary for evaluating the clinical and population impact of biomedical research findings. We have developed a linear-expit regression model (LEXPIT) to incorporate linear and nonlinear risk effects to estimate absolute risk from studies of a binary outcome. The LEXPIT is a generalization of both the binomial linear and logistic regression models. The coefficients of the LEXPIT linear terms estimate adjusted risk differences, whereas the exponentiated nonlinear terms estimate residual odds ratios. The LEXPIT could be particularly useful for epidemiological studies of risk association, where adjustment for multiple confounding variables is common. We present a constrained maximum likelihood estimation algorithm that ensures the feasibility of risk estimates of the LEXPIT model and describe procedures for defining the feasible region of the parameter space, judging convergence, and evaluating boundary cases. Simulations demonstrate that the methodology is computationally robust and yields feasible, consistent estimators. We applied the LEXPIT model to estimate the absolute 5-year risk of cervical precancer or cancer associated with different Pap and human papillomavirus test results in 167,171 women undergoing screening at Kaiser Permanente Northern California. The LEXPIT model found an increased risk due to an abnormal Pap test in human papillomavirus-negative women that was not detected with logistic regression. Our R package blm provides free and easy-to-use software for fitting the LEXPIT model.
Institute of Scientific and Technical Information of China (English)
Ge-mai Chen; Jin-hong You
2005-01-01
Consider a repeated measurement partially linear regression model with an unknown vector parameter β. Based on the semiparametric generalized least squares estimator (SGLSE) of β, we propose an iterative weighted semiparametric least squares estimator (IWSLSE) and show that it improves upon the SGLSE in terms of the asymptotic covariance matrix. An adaptive procedure is given to determine the number of iterations. We also show that when the number of replicates is less than or equal to two, the IWSLSE cannot improve upon the SGLSE. These results are generalizations of those in [2] to the case of semiparametric regressions.
Regressions by leaps and bounds and biased estimation techniques in yield modeling
Marquina, N. E. (Principal Investigator)
1979-01-01
The author has identified the following significant results. OLS was observed to be inadequate as an estimation procedure when the independent (regressor) variables were involved in multicollinearities, which was shown to produce small eigenvalues of the extended correlation matrix A'A. It was demonstrated that biased estimation techniques and all-possible-subset regression could help in finding a suitable model for predicting yield. Latent root regression was an excellent tool for determining how many predictive and nonpredictive multicollinearities were present.
Thrust estimator design based on least squares support vector regression machine
Institute of Scientific and Technical Information of China (English)
ZHAO Yong-ping; SUN Jian-guo
2010-01-01
In order to realize direct thrust control instead of traditional sensor-based control for aero-engines, it is indispensable to design a thrust estimator with high accuracy, so a scheme for thrust estimator design based on the least squares support vector regression machine is proposed to solve this problem. Numerical simulations confirm the effectiveness of the presented scheme. During the estimator design process, a wrapper criterion that can not only reduce the computational complexity but also enhance the generalization performance is proposed to select the input variables for the estimator.
Estimation of Panel Data Regression Models with Two-Sided Censoring or Truncation
DEFF Research Database (Denmark)
Alan, Sule; Honore, Bo E.; Hu, Luojia;
2014-01-01
This paper constructs estimators for panel data regression models with individual-specific heterogeneity and two-sided censoring and truncation. Following Powell (1986), the estimation strategy is based on moment conditions constructed from re-censored or re-truncated residuals. While these moment conditions do not identify the parameter of interest, they can be used to motivate objective functions that do. We apply one of the estimators to study the effect of a Danish tax reform on household portfolio choice. The idea behind the estimators can also be used in a cross-sectional setting.
Estimating Children's Soil/Dust Ingestion Rates through ...
Background: Soil/dust ingestion rates are important variables in assessing children’s health risks in contaminated environments. Current estimates are based largely on soil tracer methodology, which is limited by analytical uncertainty, small sample size, and short study duration. Objectives: The objective was to estimate site-specific soil/dust ingestion rates through reevaluation of the lead absorption dose–response relationship using new bioavailability data from the Bunker Hill Mining and Metallurgical Complex Superfund Site (BHSS) in Idaho, USA. Methods: The U.S. Environmental Protection Agency (EPA) in vitro bioavailability methodology was applied to archived BHSS soil and dust samples. Using age-specific biokinetic slope factors, we related bioavailable lead from these sources to children’s blood lead levels (BLLs) monitored during cleanup from 1988 through 2002. Quantitative regression analyses and exposure assessment guidance were used to develop candidate soil/dust source partition scenarios estimating lead intake, allowing estimation of age-specific soil/dust ingestion rates. These ingestion rate and bioavailability estimates were simultaneously applied to the U.S. EPA Integrated Exposure Uptake Biokinetic Model for Lead in Children to determine those combinations best approximating observed BLLs. Results: Absolute soil and house dust bioavailability averaged 33% (SD ± 4%) and 28% (SD ± 6%), respectively. Estimated BHSS age-specific soil/du
Bayes and empirical Bayes iteration estimators in two seemingly unrelated regression equations
Institute of Scientific and Technical Information of China (English)
WANG; Lichun
2005-01-01
For a system of two seemingly unrelated regression equations given by {y1 = X1β + ε1, y2 = X2γ + ε2} (y1 is an m × 1 vector and y2 is an n × 1 vector, m ≠ n), employing the covariance-adjusted technique, we propose parametric Bayes and empirical Bayes iteration estimator sequences for the regression coefficients. We prove that both covariance matrices converge monotonically and that the Bayes iteration estimator sequence is consistent. Based on the mean square error (MSE) criterion, we elaborate the superiority of the empirical Bayes iteration estimator over the single-equation Bayes estimator when the covariance matrix of errors is unknown. The results obtained in this paper further show the power of the covariance-adjusted approach.
Institute of Scientific and Technical Information of China (English)
LI; XinTian; TIAN; Hui; CAI; GuoBiao
2013-01-01
This paper presents three-dimensional numerical simulations of the hybrid rocket motor with the hydrogen peroxide (HP) and hydroxyl-terminated polybutadiene (HTPB) propellant combination and investigates the fuel regression rate distribution characteristics of different fuel types. The numerical models are established to couple the Navier-Stokes equations with turbulence, chemical reactions, solid fuel pyrolysis, and solid-gas interfacial boundary conditions. Simulation results including the temperature contours and fuel regression rate distributions are presented for the tube, star, and wagon wheel grains. The results demonstrate that the changing trends of the regression rate along the axis are similar for all fuel types: the rates decrease sharply near the leading edges of the fuels and then gradually increase with increasing axial location. The regression rates of the star and wagon wheel grains show apparent three-dimensional characteristics, and they are higher in the regions of fuel surfaces near the central core oxidizer flow. The average regression rates increase as the oxidizer mass fluxes rise for all of the fuel types. However, under the same oxidizer mass flux, the average regression rates of the star and wagon wheel grains are much larger than that of the tube grain due to their lower hydraulic diameters.
Gu, Fei; Preacher, Kristopher J; Wu, Wei; Yung, Yiu-Fai
2014-01-01
Although the state space approach for estimating multilevel regression models has been well established for decades in the time series literature, it does not receive much attention from educational and psychological researchers. In this article, we (a) introduce the state space approach for estimating multilevel regression models and (b) extend the state space approach for estimating multilevel factor models. A brief outline of the state space formulation is provided and then state space forms for univariate and multivariate multilevel regression models, and a multilevel confirmatory factor model, are illustrated. The utility of the state space approach is demonstrated with either a simulated or real example for each multilevel model. It is concluded that the results from the state space approach are essentially identical to those from specialized multilevel regression modeling and structural equation modeling software. More importantly, the state space approach offers researchers a computationally more efficient alternative to fit multilevel regression models with a large number of Level 1 units within each Level 2 unit or a large number of observations on each subject in a longitudinal study.
Estimating Strain Changes in Concrete during Curing Using Regression and Artificial Neural Network
Kaveh Ahangari; Zahra Najafi; Seyed Jamal Sheikh Zakariaee; Alireza Arab
2013-01-01
Due to the cement hydration heat, concrete deforms during curing. These deformations may lead to cracks in the concrete. Therefore, a method that estimates the strain during curing is very valuable. In this research, two methods, multivariable regression and neural network, were studied with the aim of estimating strain changes in concrete. For this purpose, laboratory cylindrical specimens were first prepared under controlled conditions, and then vibrating wire strain gauges equipped with...
Direct Multitype Cardiac Indices Estimation via Joint Representation and Regression Learning.
Xue, Wufeng; Islam, Ali; Bhaduri, Mousumi; Li, Shuo
2017-05-26
Cardiac indices estimation is of great importance for the identification and diagnosis of cardiac disease in clinical routine. However, estimation of multitype cardiac indices with consistently reliable and high accuracy is still a great challenge due to the high variability of cardiac structures and the complexity of temporal dynamics in cardiac MR sequences. While efforts have been devoted to cardiac volume estimation through feature engineering followed by an independent regression model, these methods suffer from vulnerable feature representations and incompatible regression models. In this paper, we propose a semi-automated method for multitype cardiac indices estimation. After manual labelling of two landmarks for ROI cropping, an integrated deep neural network, Indices-Net, is designed to jointly learn the representation and regression models. It comprises two tightly coupled networks: a deep convolution autoencoder (DCAE) for cardiac image representation, and a multiple-output convolution neural network (CNN) for indices regression. Joint learning of the two networks effectively enhances the expressiveness of the image representation with respect to cardiac indices, and the compatibility between image representation and indices regression, thus leading to accurate and reliable estimates of all the cardiac indices. When applied with five-fold cross validation on MR images of 145 subjects, Indices-Net achieves consistently low estimation error for LV wall thicknesses (1.44 ± 0.71 mm) and areas of cavity and myocardium (204 ± 133 mm²). It outperforms, with significant error reductions, a segmentation method (55.1% and 17.4%) and two-phase direct volume-only methods (12.7% and 14.6%) for wall thicknesses and areas, respectively. These advantages give the proposed method great potential in clinical cardiac function assessment.
Rank Set Sampling in Improving the Estimates of Simple Regression Model
Directory of Open Access Journals (Sweden)
M Iqbal Jeelani
2015-04-01
Full Text Available In this paper rank set sampling (RSS) is introduced with a view to increasing the efficiency of estimates of the simple regression model. The regression model is considered with respect to samples taken from sampling techniques such as simple random sampling (SRS), systematic sampling (SYS), and rank set sampling (RSS). It is found that the R2 and adjusted R2 obtained from the regression model based on the rank set sample are higher than those from the other two sampling schemes. Similarly, the root mean square error, p-values, and coefficient of variation are much lower in the rank-set-based regression model; also, under a validation technique (jackknifing) there is consistency in the measures of R2, adjusted R2, and RMSE in the case of RSS as compared to SRS and SYS. Results are supported by an empirical study involving a real data set of Pinus wallichiana taken from the Langate block of Kupwara district.
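The efficiency gain from ranked set sampling can be sketched in a few lines of Python. This is an illustration only, using a synthetic normal population with perfect judgment ranking assumed, not the Pinus wallichiana data:

```python
import random
import statistics

def srs_sample(population, n, rng):
    """Simple random sample of size n."""
    return rng.sample(population, n)

def rss_sample(population, m, cycles, rng):
    """Ranked set sample: in each cycle, for rank i = 1..m, draw m units,
    rank them, and keep the unit with rank i (perfect ranking assumed)."""
    sample = []
    for _ in range(cycles):
        for i in range(m):
            ranked = sorted(rng.sample(population, m))
            sample.append(ranked[i])
    return sample

rng = random.Random(42)
population = [rng.gauss(50, 10) for _ in range(10_000)]

# Compare the variance of the sample mean under SRS and RSS
# at the same total sample size m * cycles.
m, cycles, reps = 4, 5, 2000
srs_means = [statistics.mean(srs_sample(population, m * cycles, rng)) for _ in range(reps)]
rss_means = [statistics.mean(rss_sample(population, m, cycles, rng)) for _ in range(reps)]

print(statistics.variance(srs_means))  # larger
print(statistics.variance(rss_means))  # smaller: RSS is more efficient
```

The same mechanism, a more evenly spread sample, is what drives the higher R2 and lower RMSE the study reports for RSS-based regression.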
Estimating sufficient reductions of the predictors in abundant high-dimensional regressions
Cook, R Dennis; Rothman, Adam J; 10.1214/11-AOS962
2012-01-01
We study the asymptotic behavior of a class of methods for sufficient dimension reduction in high-dimensional regressions, as the sample size and number of predictors grow in various alignments. It is demonstrated that these methods are consistent in a variety of settings, particularly in abundant regressions, where most predictors contribute some information on the response, and that oracle rates are possible. Simulation results are presented to support the theoretical conclusions.
Early cost estimating for road construction projects using multiple regression techniques
Directory of Open Access Journals (Sweden)
Ibrahim Mahamid
2011-12-01
Full Text Available The objective of this study is to develop early cost estimating models for road construction projects using multiple regression techniques, based on 131 sets of data collected in the West Bank in Palestine. As cost estimates are required at the early stages of a project, consideration was given to the fact that the input data for the required regression model should be easily extracted from sketches or the scope definition of the project. Eleven regression models are developed to estimate the total cost of a road construction project in US dollars; 5 of them include bid quantities as input variables and 6 include road length and road width. The coefficient of determination r2 for the developed models ranges from 0.92 to 0.98, which indicates that the values predicted by the forecast models fit the real-life data well. The values of the mean absolute percentage error (MAPE) of the developed regression models range from 13% to 31%; these results compare favorably with past research, which has shown that estimate accuracy in the early stages of a project is between ±25% and ±50%.
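The fit-then-score-with-MAPE workflow behind such models can be sketched as follows. The road lengths and costs below are invented for illustration, not the West Bank data set:

```python
# Hypothetical one-variable cost model: cost (thousand USD) vs road length (km),
# fitted by ordinary least squares and scored with MAPE.
lengths = [1.2, 2.5, 3.1, 4.0, 5.5, 6.3, 7.8, 9.0]
costs   = [140, 260, 330, 410, 560, 640, 800, 905]

n = len(lengths)
mx = sum(lengths) / n
my = sum(costs) / n
b = sum((x - mx) * (y - my) for x, y in zip(lengths, costs)) / \
    sum((x - mx) ** 2 for x in lengths)
a = my - b * mx

pred = [a + b * x for x in lengths]
mape = 100 * sum(abs(y - p) / y for y, p in zip(costs, pred)) / n
print(round(b, 1), round(mape, 2))  # slope near 100 here; MAPE small for this clean data
```

Real early-stage data are far noisier, which is why the study's MAPE values of 13% to 31% are considered favorable.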
Melguizo, Tatiana; Bos, Johannes M.; Ngo, Federick; Mills, Nicholas; Prather, George
2016-01-01
This study evaluates the effectiveness of math placement policies for entering community college students on these students' academic success in math. We estimate the impact of placement decisions by using a discrete-time survival model within a regression discontinuity framework. The primary conclusion that emerges is that initial placement in a…
Adding a Parameter Increases the Variance of an Estimated Regression Function
Withers, Christopher S.; Nadarajah, Saralees
2011-01-01
The linear regression model is one of the most popular models in statistics. It is also one of the simplest models in statistics. It has received applications in almost every area of science, engineering and medicine. In this article, the authors show that adding a predictor to a linear model increases the variance of the estimated regression…
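The claim can be checked exactly rather than by simulation: since Var(ŷᵢ) = σ² hᵢᵢ, where hᵢᵢ is the i-th diagonal of the hat matrix X (XᵀX)⁻¹ Xᵀ, it suffices to show each hᵢᵢ grows (weakly) when a column is added. A minimal sketch with an arbitrary made-up design:

```python
def solve(A, b):
    """Solve A v = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    v = [0.0] * n
    for r in range(n - 1, -1, -1):
        v[r] = (M[r][n] - sum(M[r][k] * v[k] for k in range(r + 1, n))) / M[r][r]
    return v

def hat_diagonal(X):
    """Diagonal of X (X^T X)^{-1} X^T, i.e. h_ii = x_i^T (X^T X)^{-1} x_i."""
    n, p = len(X), len(X[0])
    XtX = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(p)] for i in range(p)]
    h = []
    for row in X:
        v = solve(XtX, row)
        h.append(sum(a * b for a, b in zip(row, v)))
    return h

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
z = [0.3, 1.9, 0.7, 2.8, 1.1, 2.2]   # an extra, arbitrary predictor
X1 = [[1.0, xi] for xi in x]
X2 = [[1.0, xi, zi] for xi, zi in zip(x, z)]

h1, h2 = hat_diagonal(X1), hat_diagonal(X2)
print(all(b >= a - 1e-9 for a, b in zip(h1, h2)))  # True: variance never decreases
```

This is exactly the projection argument: the column space of X2 contains that of X1, so the difference of the hat matrices is itself an orthogonal projection and has a nonnegative diagonal.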
Point Estimates and Confidence Intervals for Variable Importance in Multiple Linear Regression
Thomas, D. Roland; Zhu, PengCheng; Decady, Yves J.
2007-01-01
The topic of variable importance in linear regression is reviewed, and a measure first justified theoretically by Pratt (1987) is examined in detail. Asymptotic variance estimates are used to construct individual and simultaneous confidence intervals for these importance measures. A simulation study of their coverage properties is reported, and an…
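For two standardized predictors, Pratt's measure has a closed form that can be verified directly. The correlations below are made up for illustration; the key identity is that the importances d_j = β_j r_yj sum to R²:

```python
# Pratt's (1987) importance measure for two standardized predictors:
# d_j = beta_j * r_yj, where beta_j are standardized OLS coefficients.
def pratt_two_predictors(r_y1, r_y2, r_12):
    det = 1.0 - r_12 ** 2
    beta1 = (r_y1 - r_12 * r_y2) / det   # standardized coefficients
    beta2 = (r_y2 - r_12 * r_y1) / det
    d1, d2 = beta1 * r_y1, beta2 * r_y2  # Pratt importances
    return d1, d2, d1 + d2               # the importances sum to R-squared

d1, d2, r2 = pratt_two_predictors(r_y1=0.6, r_y2=0.5, r_12=0.3)
r2_direct = (0.6 ** 2 + 0.5 ** 2 - 2 * 0.6 * 0.5 * 0.3) / (1 - 0.3 ** 2)
print(round(d1, 3), round(d2, 3), round(r2, 3))  # d1 + d2 equals R-squared
```

The asymptotic confidence intervals studied in the article are built around exactly these d_j quantities.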
Testing and Modeling Fuel Regression Rate in a Miniature Hybrid Burner
Directory of Open Access Journals (Sweden)
Luciano Fanton
2012-01-01
Full Text Available Ballistic characterization of an extended group of innovative HTPB-based solid fuel formulations for hybrid rocket propulsion was performed in a lab-scale burner. An optical time-resolved technique was used to assess the quasi-steady regression history of single-perforation, cylindrical samples. The effects of metalized additives and radiant heat transfer on the regression rate of such formulations were assessed. Under the investigated operating conditions and based on phenomenological models from the literature, analyses of the collected experimental data show an appreciable influence of the radiant heat flux from burnt gases and soot for both unloaded and loaded fuel formulations. Pure HTPB regression rate data are satisfactorily reproduced, while the impressive initial regression rates of metalized formulations require further assessment.
Energy Technology Data Exchange (ETDEWEB)
Elliott Campbell, J. [Center for Global and Regional Environmental Research, University of Iowa, Iowa City, IA 52242 (United States)], E-mail: elliott-campbell@uiowa.edu; Moen, Jeremie C. [Center for Global and Regional Environmental Research, University of Iowa, Iowa City, IA 52242 (United States); Ney, Richard A. [Sebesta Blomberg and Associates Inc., North Liberty, IA 52317 (United States); Schnoor, Jerald L. [Center for Global and Regional Environmental Research, University of Iowa, Iowa City, IA 52242 (United States)
2008-03-15
Estimates of forest soil organic carbon (SOC) have applications in carbon science, soil quality studies, carbon sequestration technologies, and carbon trading. Forest SOC has been modeled using a regression coefficient methodology that applies mean SOC densities (mass/area) to broad forest regions. A higher resolution model is based on an approach that employs a geographic information system (GIS) with soil databases and satellite-derived landcover images. Despite this advancement, the regression approach remains the basis of current state and federal level greenhouse gas inventories. Both approaches are analyzed in detail for Wisconsin forest soils from 1983 to 2001, applying rigorous error-fixing algorithms to soil databases. Resulting SOC stock estimates are 20% larger when determined using the GIS method rather than the regression approach. Average annual rates of increase in SOC stocks are 3.6 and 1.0 million metric tons of carbon per year for the GIS and regression approaches respectively. - Large differences in estimates of soil organic carbon stocks and annual changes in stocks for Wisconsin forestlands indicate a need for validation from forthcoming forest surveys.
Shen, Xueqin; Yan, Hui; Yan, Weili; Guo, Lei
2007-01-01
In this paper, we introduce multidimensional support vector regression (MSVR) with an iteratively re-weighted least squares (IRWLS) based procedure to estimate the regional conductivity in a 2D disc head model. The results show that the method is capable of determining the regional location of the disturbed conductivity in the 2D disc head model with a single tissue, and of estimating the tissue conductivities in the 2D disc head model with four kinds of tissue. The estimation errors are all within a few percent.
DEFF Research Database (Denmark)
Petersen, Jørgen Holm
2016-01-01
This paper describes a new approach to estimation in a logistic regression model with two crossed random effects where special interest is in estimating the variance of one of the effects while not making distributional assumptions about the other effect. A composite likelihood is studied. For each term in the composite likelihood, a conditional likelihood is used that eliminates the influence of the random effects, which results in a composite conditional likelihood consisting of only one-dimensional integrals that may be solved numerically. Good properties of the resulting estimator...
An, Lihua; Fung, Karen Y; Krewski, Daniel
2010-09-01
Spontaneous adverse event reporting systems are widely used to identify adverse reactions to drugs following their introduction into the marketplace. In this article, a James-Stein type shrinkage estimation strategy was developed in a Bayesian logistic regression model to analyze pharmacovigilance data. This method is effective in detecting signals as it combines information and borrows strength across medically related adverse events. Computer simulation demonstrated that the shrinkage estimator is uniformly better than the maximum likelihood estimator in terms of mean squared error. This method was used to investigate the possible association of a series of diabetic drugs and the risk of cardiovascular events using data from the Canada Vigilance Online Database.
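The borrowing-of-strength idea can be sketched with a simple James-Stein-type estimator that pulls related log-odds estimates toward their common mean. The effects and noise below are synthetic, and the sketch does not reproduce the paper's full Bayesian logistic model:

```python
import random
import statistics

rng = random.Random(7)
K, sigma, reps = 12, 0.5, 2000
# "Medically related" adverse events: similar true log-odds.
true = [0.4 + rng.gauss(0, 0.1) for _ in range(K)]

def shrink(est, sigma):
    """Positive-part James-Stein shrinkage of est toward its mean."""
    m = statistics.mean(est)
    s2 = sum((e - m) ** 2 for e in est)
    factor = max(0.0, 1.0 - (len(est) - 3) * sigma ** 2 / s2)
    return [m + factor * (e - m) for e in est]

mse_mle = mse_js = 0.0
for _ in range(reps):
    est = [t + rng.gauss(0, sigma) for t in true]   # per-event "MLE"
    js = shrink(est, sigma)
    mse_mle += sum((e - t) ** 2 for e, t in zip(est, true))
    mse_js += sum((e - t) ** 2 for e, t in zip(js, true))

print(mse_js < mse_mle)  # True: shrinkage lowers total MSE in this simulation
```

When the true effects really are similar, as assumed here, shrinkage wins by a wide margin, which mirrors the paper's simulation finding that the shrinkage estimator uniformly dominates the MLE in MSE.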
Shin, Yoonseok
2015-01-01
Among the recent data mining techniques available, the boosting approach has attracted a great deal of attention because of its effective learning algorithm and strong bounds on its generalization performance. However, the boosting approach has not yet been used for regression problems within the construction domain, including cost estimation, although it has been actively utilized in other domains. Therefore, a boosting regression tree (BRT) is applied to cost estimation at the early stage of a construction project to examine the applicability of the boosting approach to a regression problem within the construction domain. To evaluate the performance of the BRT model, it was compared with a neural network (NN) model, which has been proven to perform well in cost estimation domains. The BRT model showed results similar to those of the NN model using 234 actual cost datasets of a building construction project. In addition, the BRT model can provide additional information, such as the importance plot and structure model, which can support estimators in comprehending the decision-making process. Consequently, the boosting approach has potential applicability to preliminary cost estimation in a building construction project.
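The BRT idea can be sketched with a minimal gradient-boosting loop over regression stumps under squared-error loss. The data below are invented, not the study's 234 cost records:

```python
def fit_stump(x, r):
    """Best single split on x minimizing the SSE of residuals r."""
    best = None
    for s in sorted(set(x)):
        left = [ri for xi, ri in zip(x, r) if xi <= s]
        right = [ri for xi, ri in zip(x, r) if xi > s]
        if not left or not right:
            continue
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        sse = sum((ri - ml) ** 2 for ri in left) + sum((ri - mr) ** 2 for ri in right)
        if best is None or sse < best[0]:
            best = (sse, s, ml, mr)
    _, s, ml, mr = best
    return lambda xi: ml if xi <= s else mr

def boost(x, y, n_trees=50, lr=0.3):
    """Gradient boosting for squared error: fit stumps to residuals."""
    base = sum(y) / len(y)
    pred = [base] * len(x)
    stumps = []
    for _ in range(n_trees):
        resid = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(x, resid)
        stumps.append(stump)
        pred = [pi + lr * stump(xi) for pi, xi in zip(pred, x)]
    return lambda xi: base + lr * sum(st(xi) for st in stumps)

x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [12, 15, 30, 34, 50, 55, 81, 88]          # nonlinear "cost" curve
model = boost(x, y)
sse = sum((yi - model(xi)) ** 2 for xi, yi in zip(x, y))
baseline = sum((yi - sum(y) / len(y)) ** 2 for yi in y)
print(sse < baseline / 10)  # True: the ensemble captures the nonlinearity
```

Production BRT implementations add multi-feature trees, subsampling, and regularization, but the residual-fitting loop above is the core of the approach.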
High regression rate hybrid rocket fuel grains with helical port structures
Walker, Sean D.
Hybrid rockets are popular in the aerospace industry due to their storage safety, simplicity, and controllability during the motor burn. However, they produce fuel regression rates typically 25% lower than solid fuel motors of the same thrust level. These lowered regression rates produce unacceptably high oxidizer-to-fuel (O/F) ratios that create a potential for motor instability, nozzle erosion, and reduced motor duty cycles. To achieve O/F ratios that produce acceptable combustion characteristics, traditional cylindrical fuel ports are fabricated with very long length-to-diameter ratios to increase the total burning area. These high aspect ratios further reduce fuel regression rates and thrust levels, give poor volumetric efficiency, and create a potential for lateral structural loading issues during high-thrust burns. In place of traditional cylindrical fuel ports, it is proposed that, by researching the effects of centrifugal flow patterns introduced by embedded helical fuel port structures, a significant increase in fuel regression rates can be observed. The benefits of increasing volumetric efficiency by lengthening the internal flow path will also be observed. The mechanisms of this increased fuel regression rate are driven by enhanced surface skin friction and a reduced boundary layer "blowing" effect, which enhance convective heat transfer to the fuel surface. Preliminary results using additive manufacturing to fabricate hybrid rocket fuel grains from acrylonitrile-butadiene-styrene (ABS) with embedded helical fuel port structures have been obtained, with burn-rate amplifications up to 3.0 times that of cylindrical fuel ports.
Aulenbach, Brent T.
2013-10-01
A regression-model-based approach is a commonly used, efficient method for estimating streamwater constituent load when there is a relationship between streamwater constituent concentration and continuous variables such as streamwater discharge, season, and time. A subsetting experiment using a 30-year dataset of daily suspended sediment observations from the Mississippi River at Thebes, Illinois, was performed to determine the optimal sampling frequency, model calibration period length, and regression model methodology, as well as to determine the effect of serial correlation of model residuals on load estimate precision. Two regression-based methods were used to estimate streamwater loads: the Adjusted Maximum Likelihood Estimator (AMLE) and the composite method, a hybrid load estimation approach. While both methods accurately and precisely estimated loads at the model's calibration-period time scale, precision was progressively worse at shorter reporting periods, from annual to monthly. Serial correlation in model residuals caused observed AMLE precision to be significantly worse than the model-calculated standard errors of prediction. The composite method effectively improved upon AMLE loads for shorter reporting periods, but required a sampling interval of 15 days or shorter when the serial correlations in the observed load residuals were greater than 0.15. AMLE precision was better at shorter sampling intervals and when using the shortest model calibration periods, such that the regression models better fit the temporal changes in the concentration-discharge relationship. The models with the largest errors typically had poor high-flow sampling coverage, resulting in unrepresentative models. Increasing sampling frequency and/or targeted high-flow sampling are more efficient approaches to ensure sufficient sampling and to avoid poorly performing models than increasing calibration period length.
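The rating-curve regression at the core of such load estimators can be sketched as follows. The concentration-discharge data are synthetic; AMLE's censoring-aware bias correction and the composite method's residual adjustment are beyond this sketch:

```python
import math
import random

# Synthetic daily discharge q and concentrations following C = 2 * Q^0.4
# with multiplicative lognormal noise on the sampled observations.
rng = random.Random(3)
q = [math.exp(rng.gauss(3.0, 0.8)) for _ in range(200)]
c_true = [2.0 * qi ** 0.4 for qi in q]
c_obs = [ci * math.exp(rng.gauss(0, 0.2)) for ci in c_true]

# OLS of ln(C) on ln(Q): the classic log-log rating curve.
lx = [math.log(qi) for qi in q]
ly = [math.log(ci) for ci in c_obs]
n = len(lx)
mx, my = sum(lx) / n, sum(ly) / n
b = sum((x - mx) * (y - my) for x, y in zip(lx, ly)) / sum((x - mx) ** 2 for x in lx)
a = my - b * mx

# Load = sum of predicted concentration times discharge over the period.
load_est = sum(math.exp(a + b * x) * qi for x, qi in zip(lx, q))
load_true = sum(ci * qi for ci, qi in zip(c_true, q))
ratio = load_est / load_true
print(round(ratio, 2))  # close to 1 (retransformation bias aside)
```

AMLE and the composite method refine exactly this estimate, correcting the log-space retransformation bias and exploiting serial correlation in the residuals, respectively.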
Yu, Hwa-Lung; Wang, Chih-Hsih; Liu, Ming-Che; Kuo, Yi-Ming
2011-06-01
Fine airborne particulate matter (PM2.5) has adverse effects on human health. Assessing the long-term effects of PM2.5 exposure on human health and ecology is often limited by a lack of reliable PM2.5 measurements. In Taipei, PM2.5 levels were not systematically measured until August 2005. Due to the popularity of geographic information systems (GIS), the landuse regression method has been widely used in the spatial estimation of PM concentrations. This method accounts for the potential contributing factors of the local environment, such as traffic volume. Geostatistical methods, on the other hand, account for the spatiotemporal dependence among the observations of ambient pollutants. This study assesses the performance of the landuse regression model for the spatiotemporal estimation of PM2.5 in the Taipei area. Specifically, this study integrates the landuse regression model with the geostatistical approach within the framework of the Bayesian maximum entropy (BME) method. The resulting epistemic framework can assimilate knowledge bases including: (a) empirical spatial trends of PM concentration based on landuse regression, (b) the spatiotemporal dependence among PM observations, and (c) site-specific PM observations. The proposed approach performs the spatiotemporal estimation of PM2.5 levels in the Taipei area (Taiwan) from 2005 to 2007.
General model selection estimation of a periodic regression with a Gaussian noise
Konev, Victor; 10.1007/s10463-008-0193-1
2010-01-01
This paper considers the problem of estimating a periodic function in a continuous-time regression model with an additive stationary Gaussian noise having an unknown correlation function. A general model selection procedure based on arbitrary projective estimates, which does not require knowledge of the noise correlation function, is proposed. A non-asymptotic upper bound for the quadratic risk (an oracle inequality) is derived under mild conditions on the noise. For Ornstein-Uhlenbeck noise, the risk upper bound is shown to be uniform in the nuisance parameter. In the case of Gaussian white noise, the constructed procedure has some advantages over the procedure based on least squares estimates (LSE). The asymptotic minimaxity of the estimates is proved. The proposed model selection scheme is also extended to the estimation problem based on discrete data, applicable to situations in which high-frequency sampling cannot be provided.
Multi-view space object recognition and pose estimation based on kernel regression
Institute of Scientific and Technical Information of China (English)
Zhang Haopeng; Jiang Zhiguo
2014-01-01
The application of high-performance imaging sensors in space-based space surveillance systems makes it possible to recognize space objects and estimate their poses using vision-based methods. In this paper, we propose a kernel regression-based method for joint multi-view space object recognition and pose estimation. We built a new simulated satellite image dataset named BUAA-SID 1.5 to test our method using different image representations. We evaluated our method on recognition-only tasks, pose estimation-only tasks, and joint recognition and pose estimation tasks. Experimental results show that our method outperforms state-of-the-art methods in space object recognition, and can recognize space objects and estimate their poses effectively and robustly against noise and varying lighting conditions.
Estimation of Water Quality Parameters Using the Regression Model with Fuzzy K-Means Clustering
Directory of Open Access Journals (Sweden)
Muntadher A. SHAREEF
2014-07-01
Full Text Available The traditional methods in remote sensing used for monitoring and estimating pollutants generally rely on the spectral response or scattering reflected from water. In this work, a new method is proposed to find contaminants and determine water quality parameters (WQPs) based on theories of texture analysis. Empirical statistical models have been developed to estimate and classify contaminants in the water. The gray level co-occurrence matrix (GLCM) is used to estimate six texture parameters: contrast, correlation, energy, homogeneity, entropy, and variance. These parameters are used to estimate the regression model with three WQPs. Finally, fuzzy K-means clustering is used to generalize the water quality estimation over the whole segmented image. Using in situ measurements and IKONOS data, the obtained results show that texture parameters and high-resolution remote sensing are able to monitor and predict the distribution of WQPs in large rivers.
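Several of the GLCM texture parameters the abstract lists can be computed directly. A minimal sketch for a single pixel offset on a toy gray-level image; the IKONOS preprocessing and the regression step are omitted:

```python
import math

def glcm_features(image, levels):
    """GLCM for the horizontal neighbor offset (0, 1) plus a few
    of the standard Haralick-style texture features."""
    counts = [[0] * levels for _ in range(levels)]
    for row in image:
        for a, b in zip(row, row[1:]):
            counts[a][b] += 1
    total = sum(map(sum, counts))
    P = [[c / total for c in row] for row in counts]
    contrast = sum(P[i][j] * (i - j) ** 2 for i in range(levels) for j in range(levels))
    energy = sum(p ** 2 for row in P for p in row)
    homogeneity = sum(P[i][j] / (1 + abs(i - j)) for i in range(levels) for j in range(levels))
    entropy = -sum(p * math.log(p) for row in P for p in row if p > 0)
    return {"contrast": contrast, "energy": energy,
            "homogeneity": homogeneity, "entropy": entropy}

image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 2, 2, 2],
         [2, 2, 3, 3]]   # toy 4-level "image"
feats = glcm_features(image, levels=4)
print({k: round(v, 3) for k, v in feats.items()})
```

In the study's pipeline, features like these become the covariates of the regression model for the water quality parameters.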
Graphical evaluation of the ridge-type robust regression estimators in mixture experiments.
Erkoc, Ali; Emiroglu, Esra; Akay, Kadri Ulas
2014-01-01
In mixture experiments, estimation of the parameters is generally based on ordinary least squares (OLS). However, in the presence of multicollinearity and outliers, OLS can result in very poor estimates. In this case, effects due to the combined outlier-multicollinearity problem can be reduced to a certain extent by using alternative approaches. One of these approaches is to use biased-robust regression techniques for the estimation of parameters. In this paper, we evaluate various ridge-type robust estimators in cases where there are multicollinearity and outliers during the analysis of mixture experiments. Also, for selection of the biasing parameter, we use fraction of design space plots to evaluate the effect of the ridge-type robust estimators with respect to the scaled mean squared error of prediction. The suggested graphical approach is illustrated on the Hald cement data set.
Regression rate behaviors of HTPB-based propellant combinations for hybrid rocket motor
Sun, Xingliang; Tian, Hui; Li, Yuelong; Yu, Nanjia; Cai, Guobiao
2016-02-01
The purpose of this paper is to characterize the regression rate behavior of hybrid rocket motor propellant combinations, using hydrogen peroxide (HP), gaseous oxygen (GOX), and nitrous oxide (N2O) as the oxidizer and hydroxyl-terminated polybutadiene (HTPB) as the base fuel. To complete this research by experiment and simulation, a hybrid rocket motor test system and a numerical simulation model are established. Series of hybrid rocket motor firing tests are conducted burning different propellant combinations, and several of those are used as references for numerical simulations. The numerical simulation model is developed by combining the Navier-Stokes equations with a turbulence model, a one-step global reaction model, and a solid-gas coupling model. The distribution of the regression rate along the axis is determined by applying the simulation model to predict the combustion process and heat transfer inside the hybrid rocket motor. The time-space-averaged regression rates show good agreement between the numerical values and experimental data. The results indicate that the N2O/HTPB and GOX/HTPB propellant combinations have higher regression rates, and the enhancement effect for the latter is significant due to its higher flame temperature. Furthermore, including aluminum (Al) and/or ammonium perchlorate (AP) in the grain does enhance the regression rate, mainly due to the greater energy released inside the chamber and the heat feedback to the grain surface from aluminum combustion.
Cason, Gerald J.; Cason, Carolyn L.
A more familiar and efficient method for estimating the parameters of Cason and Cason's model was examined. Using a two-step analysis based on linear regression, rather than the direct-search iterative procedure, gave about equally good results while providing a 33-to-1 computer processing time advantage, across 14 cohorts of junior medical…
Evolving Software Effort Estimation Models Using Multigene Symbolic Regression Genetic Programming
Directory of Open Access Journals (Sweden)
Sultan Aljahdali
2013-12-01
Full Text Available Software has played an essential role in engineering, economic development, stock market growth, and military applications. A mature software industry counts on highly predictive software effort estimation models. Correct estimation of software effort leads to correct estimation of budget and development time. It also allows companies to develop an appropriate time plan for a marketing campaign. Nowadays it has become a great challenge to obtain these estimates due to the increasing number of attributes that affect the software development life cycle. Software cost estimation models should be able to provide sufficient confidence in their prediction capabilities. Recently, computational intelligence (CI) paradigms were explored to handle the software effort estimation problem with promising results. In this paper we evolve two new models for software effort estimation using multigene symbolic regression genetic programming (GP). One model utilizes the source lines of code (SLOC) as an input variable to estimate the effort (E), while the second model utilizes the inputs, outputs, files, and user inquiries to estimate the function points (FP). The proposed GP models show better estimation capabilities compared to other models reported in the literature. The validation results are acceptable based on the Albrecht data set.
Directory of Open Access Journals (Sweden)
Suzi Alves Camey
2014-01-01
Full Text Available Recent studies have emphasized that there is no justification for using the odds ratio (OR) as an approximation of the relative risk (RR) or prevalence ratio (PR). Erroneous interpretations of the OR as RR or PR must be avoided, as several studies have shown that the OR is not a good approximation for these measures when the outcome is common (> 10%). For multinomial outcomes it is usual to use multinomial logistic regression. In this context, there are no studies showing the impact of the OR approximation on the estimates of RR or PR. This study aimed to present and discuss alternative methods to multinomial logistic regression based upon robust Poisson regression and the log-binomial model. The approaches were compared by simulating various possible scenarios. The results showed that the proposed models give more precise and accurate estimates of the RR or PR than multinomial logistic regression, as in the case of the binary outcome. Thus, for multinomial outcomes as well, the OR must not be used as an approximation of the RR or PR, since this may lead to incorrect conclusions.
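The distinction the abstract draws can be seen with a small numerical sketch (all counts hypothetical): for a common outcome, the odds ratio computed from a 2x2 table drifts away from the relative risk that robust Poisson or log-binomial models target.

```python
# Illustration with hypothetical counts: when the outcome is common (>10%),
# the odds ratio (OR) overstates the relative risk (RR).
def risk_measures(a, b, c, d):
    """2x2 table: exposed group (a events, b non-events), unexposed (c, d)."""
    risk_exp = a / (a + b)
    risk_unexp = c / (c + d)
    rr = risk_exp / risk_unexp        # relative risk
    or_ = (a / b) / (c / d)           # odds ratio
    return rr, or_

# Common outcome: 40% risk in exposed vs 20% in unexposed
rr, or_ = risk_measures(40, 60, 20, 80)
print(rr, or_)  # RR = 2.0, OR ≈ 2.67: the OR is a poor proxy for the RR here
```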
Roscoe, K. L.; Weerts, A. H.
2012-04-01
Water level predictions in rivers are used by operational managers to make water management decisions. Such decisions can concern water routing in times of drought, operation of weirs, and actions for flood protection, such as evacuation. Understanding the uncertainty in the predictions can help managers make better-informed decisions. Conditional quantile regression is a method that can be used to determine the uncertainty in forecasted water levels by providing an estimate of the probability density function of the error in the prediction, conditional on the forecasted water level. To derive this relationship, a series of forecasts and the errors in those forecasts (residuals) are required. Thus, conditional quantile regressions can be derived for locations where both observations and forecasts are available. However, the 1D hydraulic models used for operational forecasting produce forecasts at intermediate points where no measurements are available but for which predictive uncertainty estimates are also desired for decision making. The objective of our study is to test whether interpolation methods can be used to adequately estimate conditional quantile regressions at these in-between locations. For this purpose, five years of hindcasts were used at seven stations along the IJssel River in the Netherlands. Residuals in water level hindcasts were interpolated at the five in-between stations. The interpolation was based solely on distance, and the interpolated residuals were compared to the measured residuals at the in-between locations. The resulting interpolated residuals estimated the measured residuals well, especially for longer lead times. Quantile regression was then carried out using the series of forecasts and interpolated residuals at the in-between stations. The interpolated quantile regressions were compared with regressions calibrated using the actual residuals at the in-between stations. Results show that even a simple interpolation based…
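A minimal sketch of the idea, under simplifying assumptions (synthetic Gaussian residuals, plain empirical quantiles, and linear distance weighting standing in for the paper's full conditional quantile regression):

```python
import random

def empirical_quantile(xs, q):
    """Simple empirical quantile (nearest-rank)."""
    s = sorted(xs)
    idx = min(len(s) - 1, int(q * len(s)))
    return s[idx]

# Hypothetical residual series (forecast errors, in metres) at two gauged stations
random.seed(1)
res_up = [random.gauss(0.0, 0.10) for _ in range(1000)]    # upstream station
res_down = [random.gauss(0.0, 0.20) for _ in range(1000)]  # downstream station

def interp_quantile(q, f):
    """Distance-based linear interpolation of a residual quantile at an
    in-between location, a fraction f of the way downstream."""
    qu = empirical_quantile(res_up, q)
    qd = empirical_quantile(res_down, q)
    return (1 - f) * qu + f * qd

print(interp_quantile(0.9, 0.5))  # 90% error bound halfway between the stations
```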
Directory of Open Access Journals (Sweden)
Hiroyuki Nakamoto
2014-01-01
Full Text Available The human body is covered with soft skin that contains tactile receptors. The skin deforms along a contact surface, and the tactile receptors detect this mechanical deformation; such detection is essential for the tactile sensation. We propose a magnetic-type tactile sensor which has a soft surface and eight magnetoresistive elements. The soft surface has a permanent magnet inside, and the magnetoresistive elements under the soft surface measure the magnetic flux density of the magnet. The tactile sensor estimates the displacement and the rotation on the surface based on the change in the magnetic flux density. Determining an estimation equation is difficult because the displacement and the rotation are not geometrically determined by the magnetic flux density. In this paper, a stepwise regression analysis determines the estimation equation. The outputs of the magnetoresistive elements are used as explanatory variables, and the three-axis displacement and the two-axis rotation are the response variables in the regression analysis. We confirm through simulation and experiment that the regression analysis is effective for determining the estimation equations. The results show that the tactile sensor measures both the displacement and the rotation generated on the surface by using the determined equations.
Higher-order Multivariable Polynomial Regression to Estimate Human Affective States
Wei, Jie; Chen, Tong; Liu, Guangyuan; Yang, Jiemin
2016-03-01
From direct observations and facial, vocal, gestural, physiological, and central nervous signals, computational models such as multivariate linear-regression analysis, support vector regression, and artificial neural networks have been proposed over the past decade for estimating human affective states. Among these models, linear models generally lack precision because they ignore the intrinsic nonlinearities of complex psychophysiological processes, while nonlinear models commonly require complicated algorithms. To improve accuracy and simplify the model, we introduce a new computational modeling method, named higher-order multivariable polynomial regression, to estimate human affective states. The study employs standardized pictures from the International Affective Picture System to induce thirty subjects' affective states, and obtains pure affective patterns of skin conductance as input variables to the higher-order multivariable polynomial model for predicting affective valence and arousal. Experimental results show that our method obtains correlation coefficients of 0.98 and 0.96 for estimation of affective valence and arousal, respectively. Moreover, the method may provide indirect evidence that valence and arousal originate in the brain's motivational circuits. Thus, the proposed method can serve as a novel, efficient way of estimating human affective states.
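A higher-order polynomial fit of the kind described reduces to least squares on powers of the input. A minimal sketch with hypothetical, noise-free data (the normal-equations approach shown here is adequate only for low degrees):

```python
def solve(A, b):
    """Gaussian elimination with partial pivoting for small linear systems."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def polyfit(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations."""
    m = degree + 1
    A = [[sum(x ** (i + j) for x in xs) for j in range(m)] for i in range(m)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(m)]
    return solve(A, b)  # coefficients c0 + c1*x + c2*x^2 + ...

# Hypothetical arousal-vs-skin-conductance curve: quadratic, noise-free here
xs = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
ys = [1.0 + 0.5 * x - 0.2 * x * x for x in xs]
print(polyfit(xs, ys, 2))  # ≈ [1.0, 0.5, -0.2]
```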
Wall parameters estimation based onsupport vector regression for through wall radar sensing
Chen, Xi; Chen, Weidong
2015-12-01
In through-wall radar sensing, the wall parameter estimation (WPE) problem has attracted much attention, since the wall parameters, i.e., the permittivity and the thickness, are of crucial importance for locating targets and producing a well-focused image, but are usually unknown in practice. To solve this problem, support vector regression (SVR), a powerful tool for regression analysis, is introduced in this paper, and its performance on WPE, provided it is used in the regular way, is investigated. Unfortunately, it is shown that the regular use of SVR cannot provide satisfactory estimation results, since the sample data used in SVR, namely the received echoes from the walls, are seriously contaminated by echoes from targets located near the walls. In view of this limitation, a novel SVR-based WPE approach consisting of three stages is proposed. In the first stage, three regression functions are trained by SVR, one of which outputs the estimate of the permittivity in the second stage, while the others are designed to output two instrumental variables for estimating the thickness. In the third stage, the estimate of the thickness is obtained by minimizing a predefined cost function involving the estimated permittivity and the outputted instrumental variables. The better robustness and higher estimation accuracy of the proposed approach compared to the regular use of SVR are validated by numerical experiments using finite-difference time-domain simulations.
[Hyperspectral Estimation of Apple Tree Canopy LAI Based on SVM and RF Regression].
Han, Zhao-ying; Zhu, Xi-cun; Fang, Xian-yi; Wang, Zhuo-yuan; Wang, Ling; Zhao, Geng-Xing; Jiang, Yuan-mao
2016-03-01
Leaf area index (LAI) is a dynamic index of crop population size. Hyperspectral technology can be used to estimate apple canopy LAI rapidly and nondestructively, providing a reference for monitoring tree growth and estimating yield. Red Fuji apple trees in the full fruit-bearing stage were the research objects. The canopy spectral reflectance and LAI values of ninety apple trees were measured with an ASD FieldSpec3 spectrometer and an LAI-2200 in thirty orchards over two consecutive years in the Qixia research area of Shandong Province. The optimal vegetation indices were selected by correlation analysis of the original spectral reflectance and vegetation indices. Models for predicting LAI were built with the multivariate regression methods of support vector machine (SVM) and random forest (RF). The new vegetation indices GNDVI527, NDVI676, RVI682, FDNVI656 and GRVI517, and the two main previously reported vegetation indices, NDVI670 and NDVI705, are in accordance with LAI. For the RF regression model, the calibration set decision coefficient C-R2 of 0.920 and validation set decision coefficient V-R2 of 0.889 are higher than those of the SVM regression model by 0.045 and 0.033, respectively. The calibration set root mean square error C-RMSE of 0.249 and the validation set root mean square error V-RMSE of 0.236 are lower than those of the SVM regression model by 0.054 and 0.058, respectively. The relative predictive deviations of the calibration set (C-RPD) and validation set (V-RPD) reached 3.363 and 2.520, higher than those of the SVM regression model by 0.598 and 0.262, respectively. The slopes of the measured-versus-predicted scatterplot trend lines for the calibration and validation sets, C-S and V-S, are close to 1. The estimation result of the RF regression model is thus better than that of the SVM model, and the RF regression model can be used to estimate the LAI of Red Fuji apple trees in the full fruit period.
Modified Regression Rate Formula of PMMA Combustion by a Single Plane Impinging Jet
Directory of Open Access Journals (Sweden)
Tsuneyoshi Matsuoka
2017-01-01
Full Text Available A modified regression rate formula for the uppermost stage of a CAMUI-type hybrid rocket motor is proposed in this study. Assuming quasi-steady, one-dimensional combustion, an energy balance over a control volume near the fuel surface is considered. Accordingly, a regression rate formula is derived that calculates the local regression rate from the quenching distance between the flame and the regression surface. An experimental setup simulating the combustion phenomenon in the uppermost stage of a CAMUI-type hybrid rocket motor was constructed, and burning tests with various flow velocities and impinging distances were performed. A PMMA slab of 20 mm height, 60 mm width, and 20 mm thickness was chosen as the sample specimen, and pure oxygen and an O2/N2 mixture (50/50 vol.%) were employed as the oxidizers. The time-averaged regression rate along the fuel surface was measured by a laser displacement sensor. The quenching distance during the combustion event was also identified from observation. The comparison between the experimental and calculated values showed good agreement, although a large systematic error was expected due to the difficulty of accurately identifying the quenching distance.
Dufrenois, F; Noyer, J C
2013-02-01
Linear discriminant analysis, such as Fisher's criterion, is a statistical learning tool traditionally devoted to separating a training dataset into two or more classes by way of linear decision boundaries. In this paper, we show that this tool can formalize the robust linear regression problem as a robust estimator would. More precisely, we develop a one-class Fisher's criterion whose maximization provides both the regression parameters and the separation of the data into two classes: typical data, and atypical data or outliers. This new criterion is built on the statistical properties of the subspace decomposition of the hat matrix. From this angle, we improve the discriminative properties of the hat matrix, which is traditionally used as an outlier diagnostic measure in linear regression. Naturally, we call this new approach the discriminative hat matrix. The proposed algorithm is fully unsupervised and needs only the initialization of one parameter. Synthetic and real datasets are used to study the performance of the proposed approach in terms of both regression and classification. We also illustrate its potential application to image recognition and fundamental matrix estimation in computer vision.
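The hat matrix's role as an outlier diagnostic can be illustrated in the simple-regression special case, where its diagonal (the leverage) has a closed form; the data below are hypothetical:

```python
# Leverage (diagonal of the hat matrix H = X (X'X)^{-1} X') for simple linear
# regression has the closed form h_i = 1/n + (x_i - xbar)^2 / Sxx.
def leverages(xs):
    n = len(xs)
    xbar = sum(xs) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    return [1 / n + (x - xbar) ** 2 / sxx for x in xs]

xs = [1.0, 2.0, 3.0, 4.0, 5.0, 15.0]  # last point sits far from the rest
h = leverages(xs)
print(h)                               # the outlying x gets the largest leverage
assert abs(sum(h) - 2) < 1e-9          # trace(H) = number of parameters = 2
```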
A regression-kriging model for estimation of rainfall in the Laohahe basin
Wang, Hong; Ren, Li L.; Liu, Gao H.
2009-10-01
This paper presents a multivariate geostatistical algorithm called regression-kriging (RK) for predicting the spatial distribution of rainfall by incorporating five topographic/geographic factors: latitude, longitude, altitude, slope and aspect. The technique is illustrated using rainfall data collected at 52 rain gauges in the Laohahe basin in northeast China during 1986-2005. Rainfall data from 44 stations were selected for modeling and the remaining 8 stations were used for model validation. To eliminate multicollinearity, the five explanatory factors were first transformed using factor analysis, with three principal components (PCs) extracted. The rainfall data were then fitted using step-wise regression and the residuals interpolated using simple kriging (SK). The regression coefficients were estimated by generalized least squares (GLS), which takes the spatial heteroskedasticity between rainfall and the PCs into account. Finally, the rainfall prediction based on RK was compared with that from ordinary kriging (OK) and ordinary least squares (OLS) multiple regression (MR). Because correlated topographic factors are taken into account, RK improves the efficiency of the predictions. RK achieved a lower relative root mean square error (RMSE) (44.67%) than MR (49.23%) and OK (73.60%), and a lower bias than MR and OK (23.82 versus 30.89 and 32.15 mm) for annual rainfall. It is much more effective for the wet season than for the dry season. RK is suitable for estimating rainfall in areas where there are no nearby stations and where topography has a major influence on rainfall.
Burns, Douglas A.; Smith, Martyn J.; Freehafer, Douglas A.
2015-12-31
A new Web-based application, titled “Application of Flood Regressions and Climate Change Scenarios To Explore Estimates of Future Peak Flows”, has been developed by the U.S. Geological Survey, in cooperation with the New York State Department of Transportation, that allows a user to apply a set of regression equations to estimate the magnitude of future floods for any stream or river in New York State (exclusive of Long Island) and the Lake Champlain Basin in Vermont. The regression equations that are the basis of the current application were developed in previous investigations by the U.S. Geological Survey (USGS) and are described at the USGS StreamStats Web sites for New York (http://water.usgs.gov/osw/streamstats/new_york.html) and Vermont (http://water.usgs.gov/osw/streamstats/Vermont.html). These regression equations include several fixed landscape metrics that quantify aspects of watershed geomorphology, basin size, and land cover as well as a climate variable—either annual precipitation or annual runoff.
Least Square Regression Method for Estimating Gas Concentration in an Electronic Nose System
Directory of Open Access Journals (Sweden)
Walaa Khalaf
2009-03-01
Full Text Available We describe an Electronic Nose (ENose) system which is able to identify the type of analyte and to estimate its concentration. The system consists of seven sensors, five of them being gas sensors (supplied with different heater voltage values), the remainder being a temperature and a humidity sensor, respectively. To identify a new analyte sample and then estimate its concentration, we use both machine learning techniques and the least square regression principle. In fact, we apply two different training models; the first one is based on the Support Vector Machine (SVM) approach and is aimed at teaching the system how to discriminate among different gases, while the second one uses the least squares regression approach to predict the concentration of each type of analyte.
Haben, Stephen
2016-01-01
We present a model for generating probabilistic forecasts by combining kernel density estimation (KDE) and quantile regression techniques, as part of the probabilistic load forecasting track of the Global Energy Forecasting Competition 2014. The KDE method is initially implemented with a time-decay parameter. We later improve this method by conditioning on the temperature or the period-of-the-week variables to provide more accurate forecasts. We also develop a simple but effective quantile regression forecast. The novel aspects of our methodology are twofold. First, we introduce symmetry into the time-decay parameter of the kernel density estimation based forecast. Second, we combine three probabilistic forecasts with different weights for different periods of the month.
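A time-decayed KDE of the kind described can be sketched as a weighted Gaussian KDE (hypothetical loads and decay parameter):

```python
import math

def weighted_kde(samples, weights, x, bandwidth):
    """Gaussian kernel density estimate with per-observation weights."""
    wsum = sum(weights)
    dens = 0.0
    for s, w in zip(samples, weights):
        u = (x - s) / bandwidth
        dens += w * math.exp(-0.5 * u * u) / (bandwidth * math.sqrt(2 * math.pi))
    return dens / wsum

# Hypothetical historical loads (MW) for one half-hour of the week,
# geometrically down-weighting older weeks (time-decay parameter lam)
loads = [310, 305, 330, 340, 355, 360]  # oldest ... newest
lam = 0.9
weights = [lam ** (len(loads) - 1 - i) for i in range(len(loads))]

density_at_350 = weighted_kde(loads, weights, 350.0, bandwidth=10.0)
print(density_at_350)
```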
The limiting behavior of the estimated parameters in a misspecified random field regression model
DEFF Research Database (Denmark)
Dahl, Christian Møller; Qin, Yu
This paper examines the limiting properties of the estimated parameters in the random field regression model recently proposed by Hamilton (Econometrica, 2001). Though the model is parametric, it enjoys the flexibility of the nonparametric approach, since it can approximate a large collection of nonlinear functions and has the added advantage that there is no "curse of dimensionality." Contrary to the existing literature on the asymptotic properties of the estimated parameters in random field models, our results do not require that the explanatory variables are sampled on a grid. They do, however, rely on convenient new uniform convergence results that we propose. This theory may have applications beyond those presented here. Our results indicate that classical statistical inference techniques, in general, work very well for random field regression models in finite samples and that these models successfully…
Directory of Open Access Journals (Sweden)
Menon Carlo
2011-09-01
Full Text Available Abstract Background Several regression models have been proposed for estimating isometric joint torque from surface electromyography (SEMG) signals. Common issues in torque estimation models are degradation of model accuracy with the passage of time, electrode displacement, and alteration of limb posture. This work compares the performance of the most commonly used regression models under these circumstances, in order to assist researchers in identifying the most appropriate model for a specific biomedical application. Methods Eleven healthy volunteers participated in this study. A custom-built rig, equipped with a torque sensor, was used to measure isometric torque as each volunteer flexed and extended his wrist. SEMG signals from eight forearm muscles, in addition to wrist joint torque data, were gathered during the experiment. Additional data were gathered one hour and twenty-four hours after the completion of the first data-gathering session, for the purpose of evaluating the effects of the passage of time and electrode displacement on model accuracy. Acquired SEMG signals were filtered, rectified, normalized and then fed to the models for training. Results It was shown that the mean adjusted coefficient of determination (Ra2) decreased by 20%-35% for the different models after one hour, while altering arm posture decreased mean Ra2 values by 64% to 74%. Conclusions Model estimation accuracy drops significantly with the passage of time, electrode displacement, and alteration of limb posture. Therefore model retraining is crucial for preserving estimation accuracy. Data resampling can significantly reduce model training time without losing estimation accuracy. Among the models compared, the ordinary least squares linear regression model (OLS) was shown to have high isometric torque estimation accuracy combined with very short training times.
Tiberi, Lara; Costa, Giovanni
2017-04-01
The possibility of directly associating damage with ground motion parameters is always a great challenge, in particular for civil protection. Indeed, a ground motion parameter estimated in near real time that can express the damage occurring after an earthquake is fundamental for arranging first assistance after an event. The aim of this work is to contribute to the estimation of the ground motion parameter that best describes the observed intensity immediately after an event. This can be done by calculating, for each ground motion parameter estimated in near real time, a regression law which correlates that parameter to the observed macroseismic intensity. This estimation is done by collecting high-quality accelerometric data in the near field and filtering them at different frequency steps. The regression laws are calculated using two different techniques: the nonlinear least-squares (NLLS) Marquardt-Levenberg algorithm and the orthogonal distance regression (ODR) methodology. The limits of the first methodology are the need for initial values of the parameters a and b (set to 1.0 in this study) and the constraint that the independent variable must be known with greater accuracy than the dependent variable. The second algorithm, by contrast, is based on errors measured perpendicular to the fitted line rather than just vertically: vertical errors account only for the dependent variable, whereas perpendicular errors take both variables, dependent and independent, into account. This also makes it possible to invert the relation directly, so the a and b values can be used to express the GMPs as a function of I. For each law, the standard deviation and R2 value are estimated in order to test the quality and reliability of the relation found. The Amatrice earthquake of 24th August 2016 is used as a case study to test the goodness of the calculated regression laws.
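The contrast between vertical-error and perpendicular-error fitting can be sketched in the linear case, where the orthogonal (total least squares) slope has a closed form (synthetic data; the paper's actual laws are fit with NLLS and full ODR):

```python
import math, random

def ols_slope(xs, ys):
    """Minimises vertical errors only (as in ordinary NLLS fitting of y = a + b*x)."""
    n = len(xs)
    xb, yb = sum(xs) / n, sum(ys) / n
    sxy = sum((x - xb) * (y - yb) for x, y in zip(xs, ys))
    sxx = sum((x - xb) ** 2 for x in xs)
    return sxy / sxx

def orthogonal_slope(xs, ys):
    """Minimises perpendicular distances: the linear special case of ODR."""
    n = len(xs)
    xb, yb = sum(xs) / n, sum(ys) / n
    sxx = sum((x - xb) ** 2 for x in xs)
    syy = sum((y - yb) ** 2 for y in ys)
    sxy = sum((x - xb) * (y - yb) for x, y in zip(xs, ys))
    return (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4 * sxy ** 2)) / (2 * sxy)

# Synthetic data with errors in BOTH variables: OLS attenuates the slope,
# while the orthogonal fit treats x and y symmetrically (and so is invertible)
random.seed(3)
true = [(t, 1.0 * t) for t in range(40)]
xs = [x + random.gauss(0, 3) for x, _ in true]
ys = [y + random.gauss(0, 3) for _, y in true]
print(ols_slope(xs, ys), orthogonal_slope(xs, ys))
```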
Estimating Glomerular Filtration Rate in Older People
Directory of Open Access Journals (Sweden)
Sabrina Garasto
2014-01-01
Full Text Available We aimed at reviewing age-related changes in kidney structure and function, methods for estimating kidney function, the impact of reduced kidney function on geriatric outcomes, and the reliability and applicability of equations for estimating glomerular filtration rate (eGFR) in older patients. CKD is associated with various comorbidities and adverse outcomes, such as disability and premature death, in older populations. Creatinine clearance and other methods for estimating kidney function are not easy to apply in older subjects. Thus, an accurate and reliable method for calculating eGFR would be highly desirable for early detection and management of CKD in this vulnerable population. Equations based on serum creatinine, age, race, and gender have been widely used. However, these equations have their own limitations, and no equation seems better than the others in older people. New equations specifically developed for use in older populations, especially those based on serum cystatin C, hold promise. However, further studies are needed before they can be definitively accepted as the reference method for estimating kidney function in older patients in the clinical setting.
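As one concrete example of a creatinine-based estimate discussed in this literature, the classical Cockcroft-Gault formula is easy to state (shown for illustration only; it is not the review's recommended method for older patients):

```python
def cockcroft_gault(age_years, weight_kg, serum_creatinine_mg_dl, female):
    """Cockcroft-Gault creatinine clearance estimate (mL/min).
    A classical formula; newer equations (e.g. cystatin C based ones)
    may suit older patients better, as the review discusses."""
    crcl = (140 - age_years) * weight_kg / (72 * serum_creatinine_mg_dl)
    return crcl * 0.85 if female else crcl

print(cockcroft_gault(80, 70, 1.2, female=False))  # ≈ 48.6 mL/min
```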
Bayesian Regression and Neuro-Fuzzy Methods Reliability Assessment for Estimating Streamflow
Directory of Open Access Journals (Sweden)
Yaseen A. Hamaamin
2016-07-01
Full Text Available Accurate and efficient estimation of streamflow in a watershed's tributaries is a prerequisite for viable water resources management. This study couples process-driven and data-driven methods of streamflow forecasting as a more efficient and cost-effective approach to water resources planning and management. Two data-driven methods, Bayesian regression and the adaptive neuro-fuzzy inference system (ANFIS), were tested separately as faster alternatives to a calibrated and validated Soil and Water Assessment Tool (SWAT) model for predicting streamflow in the Saginaw River Watershed of Michigan. For the data-driven modeling process, four structures were assumed and tested: general, temporal, spatial, and spatiotemporal. Results showed that both Bayesian regression and ANFIS can replicate global (watershed) and local (subbasin) results similar to a calibrated SWAT model. At the global level, Bayesian regression and ANFIS model performance were satisfactory, with Nash-Sutcliffe efficiencies of 0.99 and 0.97, respectively. At the subbasin level, the Bayesian regression and ANFIS models were satisfactory for 155 and 151 out of 155 subbasins, respectively. Overall, the most accurate method was a spatiotemporal Bayesian regression model that outperformed the other models at global and local scales. However, all ANFIS models performed satisfactorily at both scales.
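The Nash-Sutcliffe efficiency used to judge the models is straightforward to compute (streamflow values hypothetical):

```python
def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect match, 0 means no better
    than predicting the observed mean, and negative is worse than the mean."""
    mean_obs = sum(observed) / len(observed)
    num = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    den = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - num / den

# Hypothetical daily streamflows (m^3/s)
obs = [12.0, 15.0, 30.0, 22.0, 18.0, 14.0]
sim = [11.0, 16.0, 28.0, 23.0, 17.0, 15.0]
print(nse(obs, sim))
```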
Messier, Kyle P.; Akita, Yasuyuki; Serre, Marc L.
2012-01-01
Geographic Information Systems (GIS) based techniques are cost-effective and efficient methods used by state agencies and epidemiology researchers for estimating concentration and exposure. However, budget limitations have made statewide assessments of contamination difficult, especially in groundwater media. Many studies have implemented address geocoding, land use regression, and geostatistics independently, but this is the first to examine the benefits of integrating these GIS techniques t...
DEFF Research Database (Denmark)
Fauser, Patrik; Thomsen, Marianne; Pistocchi, Alberto
2010-01-01
…for an in-depth risk assessment. Uncertainty measures are not available for the RAR data; however, uncertainties for the applied regression models are given in the paper. Evaluation of the methods reveals that between 79% and 93% of all emission and PEC estimates are within one order of magnitude of the reported RAR values. Bearing in mind that the domain of the method comprises organic industrial high-production volume chemicals, four chemicals, prioritized in the Water Framework Directive and the Stockholm Convention on Persistent Organic Pollutants, were used to test the method for estimated emissions…
In search of a corrected prescription drug elasticity estimate: a meta-regression approach.
Gemmill, Marin C; Costa-Font, Joan; McGuire, Alistair
2007-06-01
An understanding of the relationship between cost sharing and drug consumption depends on consistent and unbiased price elasticity estimates. However, there is wide heterogeneity among studies, which constrains the applicability of elasticity estimates for empirical purposes and policy simulation. This paper attempts to provide a corrected measure of the drug price elasticity by employing meta-regression analysis (MRA). The results indicate that the elasticity estimates are significantly different from zero, and the corrected elasticity is -0.209 when the results are made robust to heteroskedasticity and clustering of observations. Elasticity values are higher when the study was published in an economic journal, when the study employed a greater number of observations, and when the study used aggregate data. Elasticity estimates are lower when the institutional setting was a tax-based health insurance system.
Energy Technology Data Exchange (ETDEWEB)
Akkaya, Ali Volkan [Department of Mechanical Engineering, Yildiz Technical University, 34349 Besiktas, Istanbul (Turkey)
2009-02-15
In this paper, multiple nonlinear regression models for estimating the higher heating value (HHV) of coals are developed using proximate analysis data obtained mainly from low-rank coal samples on an as-received basis. In this modeling study, three main model structures, depending on the number of proximate analysis parameters used as independent variables (moisture, ash, volatile matter and fixed carbon), are first categorized. Secondly, sub-model structures with different arrangements of the independent variables are considered. Each sub-model structure is analyzed with a number of model equations in order to find the best-fitting model using the multiple nonlinear regression method. Based on the results of the nonlinear regression analysis, the best model for each sub-structure is determined. Among them, the models giving the highest correlation for the three main structures are selected. Although all three selected models predict HHV rather accurately, the model involving four independent variables provides the most accurate estimation of HHV. Additionally, when the chosen four-variable model and a literature model are tested with additional proximate analysis data, the model developed in this study gives more accurate predictions of the HHV of coals. It can be concluded that the developed model is an effective tool for HHV estimation of low-rank coals. (author)
A Bayesian Approach for Graph-constrained Estimation for High-dimensional Regression.
Sun, Hokeun; Li, Hongzhe
Many different biological processes are represented by network graphs such as regulatory networks, metabolic pathways, and protein-protein interaction networks. Since genes that are linked on the networks usually have biologically similar functions, the linked genes form molecular modules to affect the clinical phenotypes/outcomes. Similarly, in large-scale genetic association studies, many SNPs are in high linkage disequilibrium (LD), which can also be summarized as a LD graph. In order to incorporate the graph information into regression analysis with high dimensional genomic data as predictors, we introduce a Bayesian approach for graph-constrained estimation (Bayesian GRACE) and regularization, which controls the amount of regularization for sparsity and smoothness of the regression coefficients. The Bayesian estimation with their posterior distributions can provide credible intervals for the estimates of the regression coefficients along with standard errors. The deviance information criterion (DIC) is applied for model assessment and tuning parameter selection. The performance of the proposed Bayesian approach is evaluated through simulation studies and is compared with Bayesian Lasso and Bayesian Elastic-net procedures. We demonstrate our method in an analysis of data from a case-control genome-wide association study of neuroblastoma using a weighted LD graph.
MCKissick, Burnell T. (Technical Monitor); Plassman, Gerald E.; Mall, Gerald H.; Quagliano, John R.
2005-01-01
Linear multivariable regression models for predicting day and night Eddy Dissipation Rate (EDR) from available meteorological data sources are defined and validated. Model definition is based on a combination of 1997-2000 Dallas/Fort Worth (DFW) data sources, EDR from Aircraft Vortex Spacing System (AVOSS) deployment data, and regression variables primarily from corresponding Automated Surface Observation System (ASOS) data. Model validation is accomplished through EDR predictions on a similar combination of 1994-1995 Memphis (MEM) AVOSS and ASOS data. Model forms include an intercept plus a single term of fixed optimal power for each of the regression variables: 30-minute forward-averaged mean and variance of near-surface wind speed and temperature, variance of wind direction, and a discrete cloud cover metric. Distinct day and night models, regressing on EDR and the natural log of EDR respectively, yield the best performance and avoid model discontinuity over day/night data boundaries.
Estimating stress heterogeneity from aftershock rate
Helmstetter, A; Helmstetter, Agnes; Shaw, Bruce E.
2005-01-01
We estimate the rate of aftershocks triggered by a heterogeneous stress change, using the rate-and-state model of Dieterich [1994]. We show that an exponential stress distribution P(\\tau)~ exp(-\\tau/\\tau_0) gives an Omori law decay of aftershocks with time ~1/t^p, with an exponent p=1-A\\sigma_n/\\tau_0, where A is a parameter of the rate-and-state friction law, and \\sigma_n the normal stress. The Omori exponent p thus decreases if the stress "heterogeneity" \\tau_0 decreases. We also invert the stress distribution P(\\tau) from the seismicity rate R(t), assuming that the stress does not change with time. We apply this method to a synthetic stress map, using the (modified) scale-invariant "k^2" slip model [Herrero and Bernard, 1994]. We generate synthetic aftershock catalogs from this stress change. The seismicity rate on the rupture area shows a huge increase at short times, even if the stress decreases on average. This stochastic slip model gives a Gaussian stress distribution, but nevertheless produces an aftersho...
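The scaling between the Omori exponent and the stress heterogeneity can be illustrated numerically. A minimal sketch (parameter values are illustrative, not taken from the paper):

```python
# Omori exponent under the rate-and-state model with an exponential
# stress distribution: p = 1 - A*sigma_n / tau_0.
# Parameter values below are illustrative, not from the paper.

def omori_exponent(a_sigma_n, tau_0):
    """p = 1 - A*sigma_n/tau_0 for an exponential stress distribution."""
    return 1.0 - a_sigma_n / tau_0

a_sigma_n = 0.1  # MPa: product of friction parameter A and normal stress
for tau_0 in (1.0, 0.5, 0.2):  # MPa: stress "heterogeneity" scale
    p = omori_exponent(a_sigma_n, tau_0)
    print(f"tau_0 = {tau_0:.1f} MPa -> p = {p:.2f}")
```

As the sketch shows, shrinking the heterogeneity scale tau_0 pushes p further below 1, matching the paper's statement that p decreases with decreasing stress heterogeneity.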
Estimation of Electrically-Evoked Knee Torque from Mechanomyography Using Support Vector Regression.
Ibitoye, Morufu Olusola; Hamzaid, Nur Azah; Abdul Wahab, Ahmad Khairi; Hasnan, Nazirah; Olatunji, Sunday Olusanya; Davis, Glen M
2016-07-19
The difficulty of real-time muscle force or joint torque estimation during neuromuscular electrical stimulation (NMES) in physical therapy and exercise science has motivated recent research interest in torque estimation from other muscle characteristics. This study investigated the accuracy of a computational intelligence technique for estimating NMES-evoked knee extension torque based on the mechanomyographic signals (MMG) of contracting muscles recorded from eight healthy males. Knee torque was modelled via Support Vector Regression (SVR) due to its good generalization ability in related fields. Inputs to the proposed model were MMG amplitude characteristics, the level of electrical stimulation or contraction intensity, and knee angle. A Gaussian kernel function and its optimal parameters were identified with the best performance measure and applied as the SVR kernel function to build an effective knee torque estimation model. To train and test the model, the data were partitioned into training (70%) and testing (30%) subsets. The SVR estimation accuracy, based on the coefficient of determination (R²) between the actual and the estimated torque values, was up to 94% and 89% during the training and testing cases, with root mean square errors (RMSE) of 9.48 and 12.95, respectively. The knee torque estimates obtained using SVR modelling agreed well with the experimental data from an isokinetic dynamometer. These findings support the realization of a closed-loop NMES system for functional tasks using MMG as the feedback signal source and an SVR algorithm for joint torque estimation.
Weidmann, C; Schneider, S; Litaker, D; Weck, E; Klüter, H
2012-01-01
Previous studies have shown substantial geographical variation in blood donation within developed countries. To understand this issue better, we identified community characteristics associated with blood donor rates in German municipalities in an ecological analysis. We calculated an aggregated rate of voluntary blood donors from each of 1533 municipalities in south-west Germany in 2007 from a database of the German Red Cross Blood Service. A multiple linear regression model estimated the association between the municipality-specific donor rate and several community characteristics. Finally, a spatial lag regression model was used to control for spatial autocorrelation that occurs when neighbouring units are related to each other. The spatial lag regression model showed that a relatively larger population, a higher percentage of inhabitants older than 30 years, a higher percentage of non-German citizens and a higher percentage of unemployed persons were associated with lower municipality-specific donor rates. Conversely, a higher donor rate was correlated with higher voter turnout, a higher percentage of inhabitants between 18 and 24 years and more frequent mobile donation sites. Blood donation appears to be a highly clustered regional phenomenon, suggesting the need for regionally targeted recruiting efforts and careful consideration of the value of mobile donation sites. Our model further suggests that municipalities with a decreasing percentage of 18- to 24-year-olds and an increasing percentage of older inhabitants may experience substantial declines in future blood donations. © 2011 The Author(s). Vox Sanguinis © 2011 International Society of Blood Transfusion.
Study on Thermal Degradation Characteristics and Regression Rate Measurement of Paraffin-Based Fuel
Directory of Open Access Journals (Sweden)
Songqi Hu
2015-09-01
Paraffin fuel has been found to have a regression rate that is higher than that of conventional HTPB (hydroxyl-terminated polybutadiene) fuel and, thus, presents itself as an ideal energy source for a hybrid rocket engine. The energy characteristics of paraffin-based fuel and HTPB fuel have been calculated by the method of minimum free energy. The thermal degradation characteristics were measured for paraffin, pretreated paraffin, HTPB and paraffin-based fuel in different working conditions using differential scanning calorimetry (DSC) and a thermogravimetric analyzer (TGA). The regression rates of paraffin-based fuel and HTPB fuel were tested in a rectangular solid-gas hybrid engine. The research findings showed that: the specific impulse of paraffin-based fuel is almost the same as that of HTPB fuel; the decomposition temperature of pretreated paraffin is higher than that of the unprocessed paraffin, but lower than that of HTPB; with the increase of paraffin content, the initial reaction exothermic peak of paraffin-based fuel is reached in advance, and the initial reaction heat release also increases; the regression rate of paraffin-based fuel is higher than that of common HTPB fuel under the same conditions; and with the increase of oxidizer mass flow rate, the regression rate of solid fuel increases accordingly for the same fuel formulation.
Carroll, Raymond J.
2011-03-01
In many applications we can expect that, or are interested to know if, a density function or a regression curve satisfies some specific shape constraints. For example, when the explanatory variable, X, represents the value taken by a treatment or dosage, the conditional mean of the response, Y , is often anticipated to be a monotone function of X. Indeed, if this regression mean is not monotone (in the appropriate direction) then the medical or commercial value of the treatment is likely to be significantly curtailed, at least for values of X that lie beyond the point at which monotonicity fails. In the case of a density, common shape constraints include log-concavity and unimodality. If we can correctly guess the shape of a curve, then nonparametric estimators can be improved by taking this information into account. Addressing such problems requires a method for testing the hypothesis that the curve of interest satisfies a shape constraint, and, if the conclusion of the test is positive, a technique for estimating the curve subject to the constraint. Nonparametric methodology for solving these problems already exists, but only in cases where the covariates are observed precisely. However in many problems, data can only be observed with measurement errors, and the methods employed in the error-free case typically do not carry over to this error context. In this paper we develop a novel approach to hypothesis testing and function estimation under shape constraints, which is valid in the context of measurement errors. Our method is based on tilting an estimator of the density or the regression mean until it satisfies the shape constraint, and we take as our test statistic the distance through which it is tilted. Bootstrap methods are used to calibrate the test. The constrained curve estimators that we develop are also based on tilting, and in that context our work has points of contact with methodology in the error-free case.
A robust background regression based score estimation algorithm for hyperspectral anomaly detection
Zhao, Rui; Du, Bo; Zhang, Liangpei; Zhang, Lefei
2016-12-01
Anomaly detection has become a hot topic in the hyperspectral image analysis and processing fields in recent years. The most important issue for hyperspectral anomaly detection is background estimation and suppression. Unreasonable or non-robust background estimation usually leads to unsatisfactory anomaly detection results. Furthermore, the inherent nonlinearity of hyperspectral images may cover up the intrinsic data structure in the anomaly detection. In order to implement robust background estimation, as well as to explore the intrinsic data structure of the hyperspectral image, we propose a robust background regression based score estimation algorithm (RBRSE) for hyperspectral anomaly detection. The Robust Background Regression (RBR) is actually a label assignment procedure which segments the hyperspectral data into a robust background dataset and a potential anomaly dataset with an intersection boundary. In the RBR, a kernel expansion technique, which explores the nonlinear structure of the hyperspectral data in a reproducing kernel Hilbert space, is utilized to formulate the data as a density feature representation. A minimum squared loss relationship is constructed between the data density feature and the corresponding assigned labels of the hyperspectral data, to formulate the foundation of the regression. Furthermore, a manifold regularization term which explores the manifold smoothness of the hyperspectral data, and a maximization term of the robust background average density, which suppresses the bias caused by the potential anomalies, are jointly appended in the RBR procedure. After this, a paired-dataset based k-nn score estimation method is undertaken on the robust background and potential anomaly datasets, to implement the detection output. The experimental results show that RBRSE achieves ROC curves, AUC values, and background-anomaly separation superior to those of some of the other state-of-the-art anomaly detection methods, and is easy to implement.
Stature estimation from footprint measurements in Indian Tamils by regression analysis
Directory of Open Access Journals (Sweden)
T. Nataraja Moorthy
2014-03-01
Stature estimation is of particular interest to forensic scientists because of its importance in human identification. The footprint is a valuable piece of physical evidence encountered at crime scenes, and its identification can help narrow down the suspects and establish the identity of the criminals. Analysis of footprints helps in estimating an individual's stature because of the strong correlation between footprint length and height. Foot impressions are still found at crime scenes, since offenders often tend to remove their footwear either to avoid noise or to gain a better grip when climbing walls, etc., while entering or exiting. In Asian countries like India, there are people who still have the habit of walking barefoot. The present study aims to estimate stature from a sample of 2,040 bilateral footprints collected from 1,020 healthy adult male Indian Tamils, an ethnic group in Tamilnadu State, India, who consented to participate in the study and range in age from 19 to 42 years; the study generates population-specific equations using a simple linear regression statistical method. All footprint lengths exhibit a statistically significant positive correlation with stature (p-value < 0.01) and the correlation coefficient (r) ranges from 0.546 to 0.578. The accuracy of the regression equations was verified by comparing the estimated stature with the actual stature. The regression equations derived in this research can be used to estimate stature from complete or even partial footprints among Indian Tamils.
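Population-specific equations of this kind come from ordinary least squares. A minimal sketch of deriving such an equation and its correlation coefficient, using invented toy numbers rather than the Tamil sample:

```python
import math

def linear_regression(x, y):
    """Ordinary least squares fit y = a + b*x, plus Pearson correlation r."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    syy = sum((yi - my) ** 2 for yi in y)
    b = sxy / sxx            # slope
    a = my - b * mx          # intercept
    r = sxy / math.sqrt(sxx * syy)
    return a, b, r

# Toy data (cm): footprint lengths and statures -- illustrative only,
# not the Indian Tamil sample from the study.
foot = [24.1, 24.8, 25.3, 25.9, 26.4, 27.0]
height = [162.0, 165.5, 166.8, 169.9, 171.2, 174.0]

a, b, r = linear_regression(foot, height)
est = a + b * 25.0  # estimated stature for a 25.0 cm footprint
```

The same fit-then-predict pattern, applied per footprint measurement, yields the population-specific equations the abstract describes.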
Estimating HIES Data through Ratio and Regression Methods for Different Sampling Designs
Directory of Open Access Journals (Sweden)
Faqir Muhammad
2007-01-01
In this study, a comparison has been made of different sampling designs, using the HIES data of North West Frontier Province (NWFP) for 2001-02 and 1998-99 collected from the Federal Bureau of Statistics, Statistical Division, Government of Pakistan, Islamabad. The performance of the estimators has also been considered using the bootstrap and jackknife. A two-stage stratified random sample design is adopted by HIES. In the first stage, enumeration blocks and villages are treated as the first-stage Primary Sampling Units (PSUs). The sample PSUs are selected with probability proportional to size. Secondary Sampling Units (SSUs), i.e., households, are selected by systematic sampling with a random start. HIES used a single study variable. We have compared the HIES technique with some other designs, namely: stratified simple random sampling, stratified systematic sampling, stratified ranked set sampling, and stratified two-phase sampling. Ratio and regression methods were applied with two study variables: income (y) and household size (x). Jackknife and bootstrap were used for variance replication. Simple random sampling with sample sizes 462 to 561 gave moderate variances by both jackknife and bootstrap. Applying systematic sampling, we obtained moderate variance with sample size 467. In jackknife with systematic sampling, the variance of the regression estimator was greater than that of the ratio estimator for sample sizes 467 to 631. At a sample size of 952, the variance of the ratio estimator becomes greater than that of the regression estimator. The most efficient design turns out to be ranked set sampling compared with the other designs. Ranked set sampling with jackknife and bootstrap gives minimum variance even with the smallest sample size (467). Two-phase sampling gave poor performance. Multi-stage sampling applied by HIES gave large variances, especially when used with a single study variable.
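The ratio and regression estimators compared in the study are the classical survey-sampling ones, which use a known population mean of an auxiliary variable (household size) to sharpen the estimate of the study variable (income). A sketch with invented toy values:

```python
def ratio_estimate(y, x, pop_mean_x):
    """Classical ratio estimator of the population mean of y:
    ybar * (pop_mean_x / xbar)."""
    ybar = sum(y) / len(y)
    xbar = sum(x) / len(x)
    return ybar * pop_mean_x / xbar

def regression_estimate(y, x, pop_mean_x):
    """Classical linear regression estimator of the population mean of y:
    ybar + b * (pop_mean_x - xbar), with b the OLS slope of y on x."""
    n = len(y)
    ybar, xbar = sum(y) / n, sum(x) / n
    b = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
        sum((xi - xbar) ** 2 for xi in x)
    return ybar + b * (pop_mean_x - xbar)

# Toy sample: household income y and household size x (invented values).
y = [30, 45, 52, 60, 75]
x = [2, 3, 4, 4, 6]
X_BAR = 3.5  # assumed known population mean household size
r_est = ratio_estimate(y, x, X_BAR)
g_est = regression_estimate(y, x, X_BAR)
```

Both estimators adjust the plain sample mean of income using how far the sample's mean household size sits from the known population value; the study compares their variances under the different designs via jackknife and bootstrap replication.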
Bowden, Jack; Davey Smith, George; Burgess, Stephen
2015-04-01
The number of Mendelian randomization analyses including large numbers of genetic variants is rapidly increasing. This is due to the proliferation of genome-wide association studies, and the desire to obtain more precise estimates of causal effects. However, some genetic variants may not be valid instrumental variables, in particular because they have more than one proximal phenotypic correlate (pleiotropy). We view Mendelian randomization with multiple instruments as a meta-analysis, and show that bias caused by pleiotropy can be regarded as analogous to small study bias. Causal estimates using each instrument can be displayed visually by a funnel plot to assess potential asymmetry. Egger regression, a tool to detect small study bias in meta-analysis, can be adapted to test for bias from pleiotropy, and the slope coefficient from Egger regression provides an estimate of the causal effect. Under the assumption that the association of each genetic variant with the exposure is independent of the pleiotropic effect of the variant (not via the exposure), Egger's test gives a valid test of the null causal hypothesis and a consistent causal effect estimate even when all the genetic variants are invalid instrumental variables. We illustrate the use of this approach by re-analysing two published Mendelian randomization studies of the causal effect of height on lung function, and the causal effect of blood pressure on coronary artery disease risk. The conservative nature of this approach is illustrated with these examples. An adaptation of Egger regression (which we call MR-Egger) can detect some violations of the standard instrumental variable assumptions, and provide an effect estimate which is not subject to these violations. The approach provides a sensitivity analysis for the robustness of the findings from a Mendelian randomization investigation. © The Author 2015; Published by Oxford University Press on behalf of the International Epidemiological Association.
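At its core, MR-Egger is a weighted linear regression of the variant-outcome associations on the variant-exposure associations that retains an intercept; the slope estimates the causal effect and the intercept indexes directional pleiotropy. A minimal sketch on synthetic summary statistics (not the height or blood-pressure data from the paper):

```python
def mr_egger(beta_exposure, beta_outcome, se_outcome):
    """Weighted least squares of beta_outcome on beta_exposure with an
    intercept (weights 1/se^2). Returns (intercept, slope); the slope is
    the MR-Egger causal effect estimate, the intercept indexes pleiotropy."""
    w = [1.0 / s ** 2 for s in se_outcome]
    sw = sum(w)
    mx = sum(wi * x for wi, x in zip(w, beta_exposure)) / sw
    my = sum(wi * y for wi, y in zip(w, beta_outcome)) / sw
    sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, beta_exposure))
    sxy = sum(wi * (x - mx) * (y - my)
              for wi, x, y in zip(w, beta_exposure, beta_outcome))
    slope = sxy / sxx
    return my - slope * mx, slope

# Synthetic summary statistics (invented, not from the cited studies):
# outcome effects = 0.5 * exposure effects + a constant pleiotropic
# offset of 0.02 shared by every variant.
bx = [0.10, 0.15, 0.20, 0.25, 0.30]
by = [0.5 * b + 0.02 for b in bx]
se = [0.01] * 5
intercept, slope = mr_egger(bx, by, se)
```

In this idealised case the fit separates the two components exactly: the slope recovers the causal effect 0.5 and the intercept the pleiotropic offset 0.02, which a no-intercept (inverse-variance weighted) fit would instead fold into a biased effect estimate.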
Estimating diversification rates from phylogenetic information.
Ricklefs, Robert E
2007-11-01
Patterns of species richness reflect the balance between speciation and extinction over the evolutionary history of life. These processes are influenced by the size and geographical complexity of regions, conditions of the environment, and attributes of individuals and species. Diversity within clades also depends on age and thus the time available for accumulating species. Estimating rates of diversification is key to understanding how these factors have shaped patterns of species richness. Several approaches to calculating both relative and absolute rates of speciation and extinction within clades are based on phylogenetic reconstructions of evolutionary relationships. As the size and quality of phylogenies increases, these approaches will find broader application. However, phylogeny reconstruction fosters a perceptual bias of continual increase in species richness, and the analysis of primarily large clades produces a data selection bias. Recognizing these biases will encourage the development of more realistic models of diversification and the regulation of species richness.
Bayesian Estimation of Thermonuclear Reaction Rates
Iliadis, Christian; Coc, Alain; Timmes, Frank; Starrfield, Sumner
2016-01-01
The problem of estimating non-resonant astrophysical S-factors and thermonuclear reaction rates, based on measured nuclear cross sections, is of major interest for nuclear energy generation, neutrino physics, and element synthesis. Many different methods have been applied in the past to this problem, all of them based on traditional statistics. Bayesian methods, on the other hand, are now in widespread use in the physical sciences. In astronomy, for example, Bayesian statistics is applied to the observation of extra-solar planets, gravitational waves, and type Ia supernovae. However, nuclear physics, in particular, has been slow to adopt Bayesian methods. We present the first astrophysical S-factors and reaction rates based on Bayesian statistics. We develop a framework that incorporates robust parameter estimation, systematic effects, and non-Gaussian uncertainties in a consistent manner. The method is applied to the d(p,$\\gamma$)$^3$He, $^3$He($^3$He,2p)$^4$He, and $^3$He($\\alpha$,$\\gamma$)$^7$Be reactions,...
Bayesian Estimation of Thermonuclear Reaction Rates
Iliadis, C.; Anderson, K. S.; Coc, A.; Timmes, F. X.; Starrfield, S.
2016-11-01
The problem of estimating non-resonant astrophysical S-factors and thermonuclear reaction rates, based on measured nuclear cross sections, is of major interest for nuclear energy generation, neutrino physics, and element synthesis. Many different methods have been applied to this problem in the past, almost all of them based on traditional statistics. Bayesian methods, on the other hand, are now in widespread use in the physical sciences. In astronomy, for example, Bayesian statistics is applied to the observation of extrasolar planets, gravitational waves, and Type Ia supernovae. However, nuclear physics, in particular, has been slow to adopt Bayesian methods. We present astrophysical S-factors and reaction rates based on Bayesian statistics. We develop a framework that incorporates robust parameter estimation, systematic effects, and non-Gaussian uncertainties in a consistent manner. The method is applied to the reactions d(p,γ)3He, 3He(3He,2p)4He, and 3He(α,γ)7Be, important for deuterium burning, solar neutrinos, and Big Bang nucleosynthesis.
Energy Technology Data Exchange (ETDEWEB)
Bracegirdle, Thomas J. [British Antarctic Survey, Cambridge (United Kingdom); Stephenson, David B. [University of Exeter, Mathematics Research Institute, Exeter (United Kingdom); NCAS-Climate, Reading (United Kingdom)
2012-12-15
This study presents projections of twenty-first century wintertime surface temperature changes over the high-latitude regions based on the third Coupled Model Inter-comparison Project (CMIP3) multi-model ensemble. The state-dependence of the climate change response on the present-day mean state is captured using a simple yet robust ensemble linear regression model. The ensemble regression approach gives different and more precise estimated mean responses compared to the ensemble mean approach. Over the Arctic in January, ensemble regression gives less warming than the ensemble mean along the boundary between sea ice and open ocean (sea ice edge). Most notably, the results show 3°C less warming over the Barents Sea (~7°C compared to ~10°C). In addition, the ensemble regression method gives projections that are 30% more precise over the Sea of Okhotsk, Bering Sea and Labrador Sea. For the Antarctic in winter (July) the ensemble regression method gives 2°C more warming over the Southern Ocean close to the Greenwich Meridian (~7°C compared to ~5°C). Projection uncertainty was almost half that of the ensemble mean uncertainty over the Southern Ocean between 30°W and 90°E, and 30% less over the northern Antarctic Peninsula. The ensemble regression model avoids the need for explicit ad hoc weighting of models and exploits the whole ensemble to objectively identify overly influential outlier models. Bootstrap resampling shows that maximum precision over the Southern Ocean can be obtained with ensembles having as few as six climate models. (orig.)
Ulrich, David; Parkhouse, Bonnie L.
1982-01-01
An alumni-based model is proposed as an alternative to sports management curriculum design procedures. The model relies on the assessment of curriculum by sport management alumni and uses performance ratings of employers and measures of satisfaction by alumni in a regression model to identify curriculum leading to increased work performance and…
Numerical investigation on the regression rate of hybrid rocket motor with star swirl fuel grain
Zhang, Shuai; Hu, Fan; Zhang, Weihua
2016-10-01
Although the hybrid rocket motor is expected to have distinct advantages over liquid and solid rocket motors, low regression rate and insufficient efficiency are two major disadvantages that have prevented it from being commercially viable. In recent years, complex fuel grain configurations have become attractive for overcoming these disadvantages with the help of rapid prototyping technology. In this work, an attempt has been made to numerically investigate the flow field characteristics and local regression rate distribution inside a hybrid rocket motor with a complex star swirl grain. A propellant combination of GOX and HTPB has been chosen. The numerical model is established based on the three-dimensional Navier-Stokes equations with turbulence, combustion, and coupled gas/solid phase formulations. The calculated fuel regression rate is compared with experimental data to validate the accuracy of the numerical model. The results indicate that, comparing the star swirl grain with the tube grain under the conditions of the same port area and the same grain length, the burning surface area rises by about 200%, the spatially averaged regression rate rises by as much as about 60%, and the oxidizer can combust sufficiently due to the big vortex around the axis in the aft-mixing chamber. The combustion efficiency of the star swirl grain is better and more stable than that of the tube grain.
Lopez, Michael J; Gutman, Roee
2014-11-28
Propensity score methods are common for estimating a binary treatment effect when treatment assignment is not randomized. When exposure is measured on an ordinal scale (i.e. low-medium-high), however, propensity score inference requires extensions which have received limited attention. Estimands of possible interest with an ordinal exposure are the average treatment effects between each pair of exposure levels. Using these estimands, it is possible to determine an optimal exposure level. Traditional methods, including dichotomization of the exposure or a series of binary propensity score comparisons across exposure pairs, are generally inadequate for identification of optimal levels. We combine subclassification with regression adjustment to estimate transitive, unbiased average causal effects across an ordered exposure, and apply our method on the 2005-2006 National Health and Nutrition Examination Survey to estimate the effects of nutritional label use on body mass index.
Directory of Open Access Journals (Sweden)
Sander MJ van Kuijk
2016-03-01
Background: The purpose of this simulation study is to assess the performance of multiple imputation compared with complete case analysis when assumptions about the missing data mechanism are violated. Methods: The authors performed a stochastic simulation study to assess the performance of complete case (CC) analysis and multiple imputation (MI) under different missing data mechanisms: missing completely at random (MCAR), missing at random (MAR), and missing not at random (MNAR). The study focused on the point estimation of regression coefficients and standard errors. Results: When data were MAR conditional on Y, CC analysis resulted in biased regression coefficients; they were all underestimated in our scenarios. In these scenarios, analysis after MI gave correct estimates. Yet, in the case of MNAR, MI yielded biased regression coefficients, while CC analysis performed well. Conclusion: The authors demonstrated that MI was only superior to CC analysis in the case of MCAR or MAR. In some scenarios CC may be superior to MI. Often it is not feasible to identify the reason why data in a given dataset are missing. Therefore, emphasis should be put on reporting the extent of missing values, the method used to address them, and the assumptions that were made about the mechanism that caused the missing data.
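The bias of complete-case analysis when missingness depends on the outcome can be reproduced in a few lines. A toy version of such a simulation (a single invented scenario, far simpler than the study's design):

```python
import random

def ols_slope(x, y):
    """OLS slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / \
           sum((a - mx) ** 2 for a in x)

random.seed(1)
# True model: y = 2*x + noise.
x = [random.gauss(0, 1) for _ in range(5000)]
y = [2.0 * xi + random.gauss(0, 1) for xi in x]

# Covariate missingness depends on the observed outcome ("MAR
# conditional on Y"): cases with large y are dropped, so the
# complete-case sample is selected on y.
complete = [(xi, yi) for xi, yi in zip(x, y) if yi < 1.0]
cc_x, cc_y = zip(*complete)

full_slope = ols_slope(x, y)      # close to the true value 2.0
cc_slope = ols_slope(cc_x, cc_y)  # attenuated (underestimated)
```

Dropping cases by their outcome value truncates y, which pulls the complete-case slope toward zero, mirroring the study's finding that CC coefficients were underestimated under MAR conditional on Y.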
Threshold estimation based on a p-value framework in dose-response and regression settings
Mallik, Atul; Banerjee, Moulinath; Michailidis, George
2011-01-01
We use p-values to identify the threshold level at which a regression function takes off from its baseline value, a problem motivated by applications in toxicological and pharmacological dose-response studies and environmental statistics. We study the problem in two sampling settings: one where multiple responses can be obtained at a number of different covariate levels, and the other the standard regression setting involving a limited number of response values at each covariate. Our procedure involves testing the hypothesis that the regression function is at its baseline at each covariate value and then computing the potentially approximate p-value of the test. An estimate of the threshold is obtained by fitting a piecewise constant function with a single jump discontinuity, otherwise known as a stump, to these observed p-values, as they behave in markedly different ways on the two sides of the threshold. The estimate is shown to be consistent and its finite sample properties are studied through simulations. Ou...
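Fitting a stump to the observed p-values amounts to scanning all candidate change points and keeping the split that minimises the squared error. A minimal sketch with invented p-values:

```python
def fit_stump(t, p):
    """Fit a one-jump piecewise-constant function (a stump) to p-values p
    observed at sorted covariate values t: minimise total squared error
    over all change points. Returns (threshold, left_level, right_level)."""
    n = len(t)
    best = None
    for k in range(1, n):  # candidate jump between t[k-1] and t[k]
        left, right = p[:k], p[k:]
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        sse = sum((v - ml) ** 2 for v in left) + \
              sum((v - mr) ** 2 for v in right)
        if best is None or sse < best[0]:
            best = (sse, t[k], ml, mr)
    return best[1], best[2], best[3]

# Invented p-values: roughly uniform (large) where the regression function
# sits at its baseline, near zero once it has taken off.
doses = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
pvals = [0.62, 0.48, 0.55, 0.51, 0.03, 0.01, 0.02, 0.01]
thr, lo_level, hi_level = fit_stump(doses, pvals)
```

The jump location of the fitted stump is the threshold estimate, exploiting exactly the contrast the abstract describes: p-values behave very differently on the two sides of the takeoff point.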
Monopole and dipole estimation for multi-frequency sky maps by linear regression
Wehus, I K; Eriksen, H K; Banday, A J; Dickinson, C; Ghosh, T; Gorski, K M; Lawrence, C R; Leahy, J P; Maino, D; Reich, P; Reich, W
2014-01-01
We describe a simple but efficient method for deriving a consistent set of monopole and dipole corrections for multi-frequency sky map data sets, allowing robust parametric component separation with the same data set. The computational core of this method is linear regression between pairs of frequency maps, often called "T-T plots". Individual contributions from monopole and dipole terms are determined by performing the regression locally in patches on the sky, while the degeneracy between different frequencies is lifted whenever the dominant foreground component exhibits a significant spatial spectral index variation. Based on this method, we present two different, but each internally consistent, sets of monopole and dipole coefficients for the 9-year WMAP, Planck 2013, SFD 100 um, Haslam 408 MHz and Reich & Reich 1420 MHz maps. The two sets have been derived with different analysis assumptions and data selection, and provide an estimate of residual systematic uncertainties. In general, our values are...
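The T-T plot regression at the core of the method is an ordinary least-squares fit between two frequency maps evaluated pixel-by-pixel; the intercept carries the relative monopole offset and the slope the foreground's spectral scaling. A toy sketch (synthetic one-dimensional "maps", not real survey data):

```python
def tt_regression(map_a, map_b):
    """T-T plot: regress map_b on map_a pixel-by-pixel. The intercept
    estimates the monopole offset of map_b relative to (scaled) map_a,
    the slope the spectral scaling of the shared foreground."""
    n = len(map_a)
    ma, mb = sum(map_a) / n, sum(map_b) / n
    slope = sum((a - ma) * (b - mb) for a, b in zip(map_a, map_b)) / \
            sum((a - ma) ** 2 for a in map_a)
    return mb - slope * ma, slope

# Synthetic "sky maps" over a small patch (invented numbers): a common
# foreground template scaled by 0.8 between frequencies plus a 5.0 offset.
template = [12.0, 3.5, 7.1, 9.8, 4.4, 6.0]
map_low = template
map_high = [0.8 * t + 5.0 for t in template]
offset, scaling = tt_regression(map_low, map_high)
```

On this idealised patch the fit recovers the offset 5.0 and scaling 0.8 exactly; in the paper the same regression is run locally on sky patches, with spatial spectral-index variation breaking the degeneracy between the maps.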
Strand, L. D.; Schultz, A. L.; Reedy, G. K.
1972-01-01
A microwave Doppler shift system, with increased resolution over earlier microwave techniques, was developed for the purpose of measuring the regression rates of solid propellants during rapid pressure transients. A continuous microwave beam is transmitted to the base of a burning propellant sample cast in a metal waveguide tube. A portion of the wave is reflected from the regressing propellant-flame zone interface. The phase angle difference between the incident and reflected signals and its time differential are continuously measured using a high resolution microwave network analyzer and related instrumentation. The apparent propellant regression rate is directly proportional to this latter differential measurement. Experiments were conducted to verify the (1) spatial and time resolution of the system, (2) effect of propellant surface irregularities and compressibility on the measurements, and (3) accuracy of the system for quasi-steady-state regression rate measurements. The microwave system was also used in two different transient combustion experiments: in a rapid depressurization bomb, and in the high-frequency acoustic pressure environment of a T-burner.
Moore, Richard Bridge; Johnston, Craig M.; Robinson, Keith W.; Deacon, Jeffrey R.
2004-01-01
The U.S. Geological Survey (USGS), in cooperation with the U.S. Environmental Protection Agency (USEPA) and the New England Interstate Water Pollution Control Commission (NEIWPCC), has developed a water-quality model, called SPARROW (Spatially Referenced Regressions on Watershed Attributes), to assist in regional total maximum daily load (TMDL) and nutrient-criteria activities in New England. SPARROW is a spatially detailed, statistical model that uses regression equations to relate total nitrogen and phosphorus (nutrient) stream loads to nutrient sources and watershed characteristics. The statistical relations in these equations are then used to predict nutrient loads in unmonitored streams. The New England SPARROW models are built using a hydrologic network of 42,000 stream reaches and associated watersheds. Watershed boundaries are defined for each stream reach in the network through the use of a digital elevation model and existing digitized watershed divides. Nutrient source data are from permitted wastewater discharge data from USEPA's Permit Compliance System (PCS), various land-use sources, and atmospheric deposition. Physical watershed characteristics include drainage area, land use, streamflow, time-of-travel, stream density, percent wetlands, slope of the land surface, and soil permeability. The New England SPARROW models for total nitrogen and total phosphorus have R-squared values of 0.95 and 0.94, with mean square errors of 0.16 and 0.23, respectively. Variables that were statistically significant in the total nitrogen model include permitted municipal-wastewater discharges, atmospheric deposition, agricultural area, and developed land area. Total nitrogen stream-loss rates were significant only in streams with average annual flows less than or equal to 2.83 cubic meters per second. In streams larger than this, there is nondetectable in-stream loss of annual total nitrogen in New England. Variables that were statistically significant in the total
Wilms, M.; Werner, R.; Ehrhardt, J.; Schmidt-Richberg, A.; Schlemmer, H.-P.; Handels, H.
2014-03-01
Breathing-induced location uncertainties of internal structures are still a relevant issue in the radiation therapy of thoracic and abdominal tumours. Motion compensation approaches like gating or tumour tracking are usually driven by low-dimensional breathing signals, which are acquired in real-time during the treatment. These signals are only surrogates of the internal motion of target structures and organs at risk, and, consequently, appropriate models are needed to establish correspondence between the acquired signals and the sought internal motion patterns. In this work, we present a diffeomorphic framework for correspondence modelling based on the Log-Euclidean framework and multivariate regression. Within the framework, we systematically compare standard and subspace regression approaches (principal component regression, partial least squares, canonical correlation analysis) for different types of common breathing signals (1D: spirometry, abdominal belt, diaphragm tracking; multi-dimensional: skin surface tracking). Experiments are based on 4D CT and 4D MRI data sets and cover intra- and inter-cycle as well as intra- and inter-session motion variations. Only small differences in internal motion estimation accuracy are observed between the 1D surrogates. Increasing the surrogate dimensionality, however, improved the accuracy significantly; this is shown for both 2D signals, which consist of a common 1D signal and its time derivative, and high-dimensional signals containing the motion of many skin surface points. Eventually, comparing the standard and subspace regression variants when applied to the high-dimensional breathing signals, only small differences in terms of motion estimation accuracy are found.
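The subspace regression variants compared above can be illustrated with principal component regression, the simplest of the three: project the (centered) surrogate signals onto their leading principal components and regress the internal motion on the scores. The sketch below is a minimal numpy implementation on synthetic data; the 2D surrogate (a 1D signal plus its time derivative) and the linear internal-motion relation are assumptions for illustration, not the paper's 4D CT/MRI setup.

```python
import numpy as np

def pcr_fit(S, z, k):
    """Principal component regression: project centered surrogate
    signals S onto their first k principal components, then regress
    the internal-motion signal z on the resulting scores."""
    S0 = S - S.mean(axis=0)
    _, _, vt = np.linalg.svd(S0, full_matrices=False)
    comps = vt[:k].T                              # loadings, shape (p, k)
    scores = S0 @ comps
    coef, *_ = np.linalg.lstsq(scores, z - z.mean(), rcond=None)
    return comps, coef, S.mean(axis=0), z.mean()

def pcr_predict(S_new, comps, coef, S_mean, z_mean):
    return (S_new - S_mean) @ comps @ coef + z_mean

# hypothetical 2D surrogate: a breathing signal and its time derivative
rng = np.random.default_rng(2)
sig = rng.normal(size=100)
S = np.column_stack([sig, np.gradient(sig)])
z = 2.0 * sig + 0.1 * rng.normal(size=100)        # "internal motion"
comps, coef, Sm, zm = pcr_fit(S, z, k=2)
z_hat = pcr_predict(S, comps, coef, Sm, zm)
```

With k equal to the number of surrogate channels, PCR reduces to ordinary least squares; choosing k smaller regularizes the fit, which is the point of the subspace variants when the surrogate is high-dimensional.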
Institute of Scientific and Technical Information of China (English)
Anonymous
2007-01-01
To study the sensitivity of inter-subspecific hybrid rice to climatic conditions, the spikelet fertilized rate (SFR) of four types of rice, including indica-japonica hybrids, intermediate hybrids, indica and japonica, was analyzed during 2000-2004. The inter-subspecific hybrids showed lower SFR and much greater fluctuation under varying climatic conditions than indica and japonica rice, indicating that the inter-subspecific hybrids are sensitive to ecological conditions. Among 12 climatic factors, the key factor affecting SFR was temperature, the most significant index being the average temperature of the seven days around panicle flowering (T7). A regression equation of SFR on T7 and a comprehensive synthetic model based on four important temperature indices were put forward. The optimum temperature for inter-subspecific hybrids at panicle flowering was estimated to be 26.1-26.6 °C and the lower limit of safe temperature to be 22.5-23.3 °C, higher by 0.5 °C and 1.7 °C on average, respectively, than for indica and japonica rice. This suggests that inter-subspecific hybrids require suitable climatic conditions. During panicle flowering, the suitable daily average temperature was 23.3-29.0 °C, with the optimum at 26.1-26.6 °C. As an application example, the optimum heading season for inter-subspecific hybrids in the key rice-growing areas of China was the same as for common pure lines, while the lower limit for the safe heading date was about ten days earlier than that of common pure lines.
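An optimum temperature of the kind reported above can be estimated by fitting a quadratic regression of SFR on T7 and taking the vertex, -b/(2a). The numbers below are hypothetical, chosen only to illustrate the calculation, not the study's data.

```python
import numpy as np

# hypothetical SFR (%) against T7 (mean temperature around flowering, deg C)
t7 = np.array([20.0, 22.0, 24.0, 26.0, 28.0, 30.0, 32.0])
sfr = np.array([35.0, 60.0, 78.0, 85.0, 80.0, 62.0, 38.0])

# quadratic regression sfr = a*t7^2 + b*t7 + c; the vertex of the
# fitted parabola gives the estimated optimum temperature
a, b, c = np.polyfit(t7, sfr, 2)
t_opt = -b / (2.0 * a)
```

With real data one would also read off the safe lower limit as the temperature at which the fitted curve crosses a chosen minimum acceptable SFR.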
Estimating Strain Changes in Concrete during Curing Using Regression and Artificial Neural Network
Directory of Open Access Journals (Sweden)
Kaveh Ahangari
2013-01-01
Due to cement hydration heat, concrete deforms during curing. These deformations may lead to cracks in the concrete. Therefore, a method that estimates the strain during curing is very valuable. In this research, two methods, multivariable regression and neural networks, were studied with the aim of estimating strain changes in concrete. For this purpose, laboratory cylindrical specimens were first prepared under controlled conditions, and vibrating-wire strain gauges equipped with thermistors were then placed inside each sample to measure the deformations. Two different groups of input data were used, in which the variables included time, environment temperature, concrete temperature, water-to-cement ratio, aggregate content, height, and specimen diameter. CEM I 42.5 R was utilized in set (I), and strain changes were measured in six concrete specimens. In set (II), CEM II 52.5 R was employed and strain changes were measured in three different specimens in which the diameter was held constant. The best multivariate regression equations yielded determination coefficients of 0.804 and 0.82 for sets (I) and (II), whereas the artificial neural networks predicted the strain with higher determination coefficients of 1 and 0.996. Results show that the neural network method can be utilized as an efficient tool for estimating concrete strain during curing.
Directory of Open Access Journals (Sweden)
Hongjian Wang
2014-01-01
We present a support vector regression-based adaptive divided difference filter (SVRADDF) algorithm for improving the low state-estimation accuracy of nonlinear systems, which are typically affected by large initial estimation errors and imprecise prior knowledge of process and measurement noises. The derivative-free SVRADDF algorithm is significantly simpler to compute than other methods and is implemented using only functional evaluations. The SVRADDF algorithm involves the use of the theoretical and actual covariance of the innovation sequence. Support vector regression (SVR) is employed to generate the adaptive factor that tunes the noise covariance at each sampling instant when the measurement update step executes, which improves the algorithm's robustness. The performance of the proposed algorithm is evaluated by estimating states for (i) an underwater nonmaneuvering target bearing-only tracking system and (ii) maneuvering target bearing-only tracking in an air-traffic control system. The simulation results show that the proposed SVRADDF algorithm exhibits better performance when compared with a traditional DDF algorithm.
Regression Model-Based Walking Speed Estimation Using Wrist-Worn Inertial Sensor
Park, Edward J.
2016-01-01
Walking speed is widely used to study human health status. Wearable inertial measurement units (IMU) are promising tools for the ambulatory measurement of walking speed. Among wearable inertial sensors, the ones worn on the wrist, such as a watch or band, have relatively higher potential to be easily incorporated into daily lifestyle. Using the arm swing motion in walking, this paper proposes a regression model-based method for longitudinal walking speed estimation using a wrist-worn IMU. A novel kinematic variable is proposed, which finds the wrist acceleration in the principal axis (i.e. the direction of the arm swing). This variable (called pca-acc) is obtained by applying sensor fusion on IMU data to find the orientation followed by the use of principal component analysis. An experimental evaluation was performed on 15 healthy young subjects during free walking trials. The experimental results show that the use of the proposed pca-acc variable can significantly improve the walking speed estimation accuracy when compared to the use of raw acceleration information (p<0.01). When Gaussian process regression is used, the resulting walking speed estimation accuracy and precision is about 5.9% and 4.7%, respectively. PMID:27764231
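The pca-acc idea above (projecting wrist acceleration onto the principal axis of the arm swing) can be sketched in a few lines of numpy. The simulated swing signal and swing direction below are assumptions for illustration; the paper additionally applies sensor fusion on the IMU data to recover orientation before the projection step.

```python
import numpy as np

# simulated 3-axis wrist acceleration during walking: an ~1 Hz arm
# swing along one dominant direction plus small sensor noise
rng = np.random.default_rng(1)
t = np.linspace(0.0, 10.0, 500)
swing = np.sin(2.0 * np.pi * 1.0 * t)
direction = np.array([0.8, 0.5, 0.33])
direction /= np.linalg.norm(direction)
acc = np.outer(swing, direction) + 0.05 * rng.normal(size=(500, 3))

# pca-acc: first principal axis of the centered acceleration,
# then the projection of each sample onto that axis
centered = acc - acc.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
principal_axis = vt[0]
pca_acc = centered @ principal_axis
```

The resulting 1D pca-acc series is what a regression model (e.g. Gaussian process regression, as in the paper) would consume as a feature for walking speed.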
Estimating river discharge rates through remotely sensed thermal plumes
Abou Najm, M.; Alameddine, I.; Ibrahim, E.; Nasr, R.
2016-12-01
An empirical relationship is developed for estimating river discharge rates from remotely sensed thermal plumes, which form due to the temperature gradient at the interface between rivers and large water bodies. The method first determines the plume's near-field area, length scale, and length-scale deviation angle from the river channel centerline using Landsat 7 ETM+ satellite images. It also makes use of mean river and ocean temperatures and tidal levels collected from NOAA. A multiple linear regression model is then used to predict measured daily discharge rates from the determined predictors. The approach is tested and validated with discharge rates collected from four USGS-gauged rivers in Oregon and California. Results from 116 Landsat 7 ETM+ satellite images of the four rivers show that the standard errors of the discharge estimates were within a factor of 1.5-2.0 of observed values, with a mean estimate accuracy of 10%. Goodness of fit (R2) ranged from 0.51 for the Rogue River up to 0.64 for the Coquille and Siuslaw rivers. The method offers an opportunity to monitor changes in flow discharge in ungauged basins where tidal flow is not dominant and where a temperature difference of 2 °C exists between the river and the receiving water body.
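A minimal sketch of the multiple linear regression step, with synthetic stand-ins for the plume and temperature predictors (all variable names and values here are assumptions for illustration, not the study's data):

```python
import numpy as np

# hypothetical predictors extracted per satellite scene: plume near-field
# area, length scale, deviation angle, river-ocean temperature difference,
# and tidal level; discharge q is simulated for the sketch
rng = np.random.default_rng(3)
n = 116
area = rng.uniform(0.5, 5.0, n)
length = rng.uniform(0.1, 2.0, n)
angle = rng.uniform(0.0, 45.0, n)
dtemp = rng.uniform(2.0, 8.0, n)
tide = rng.uniform(-1.0, 1.0, n)
q = (10.0 + 4.0 * area + 6.0 * length - 0.05 * angle
     + 0.5 * dtemp - 2.0 * tide + rng.normal(0.0, 0.5, n))

# ordinary least squares fit of the multiple linear regression
X = np.column_stack([np.ones(n), area, length, angle, dtemp, tide])
coef, *_ = np.linalg.lstsq(X, q, rcond=None)
q_hat = X @ coef
r2 = 1.0 - np.sum((q - q_hat) ** 2) / np.sum((q - q.mean()) ** 2)
```

In practice the predictors would come from image processing of each Landsat scene and q from the matching USGS gauge record, and the model would be validated on held-out scenes rather than the training data.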
Impact of regression methods on improved effects of soil structure on soil water retention estimates
Nguyen, Phuong Minh; De Pue, Jan; Le, Khoa Van; Cornelis, Wim
2015-06-01
Increasing the accuracy of pedotransfer functions (PTFs), an indirect method for predicting non-readily available soil features such as soil water retention characteristics (SWRC), is of crucial importance for large-scale agro-hydrological modeling. Adding significant predictors (i.e., soil structure) and implementing more flexible regression algorithms are among the main strategies for PTF improvement. The aim of this study was to investigate whether the improvement gained from categorical soil structure information when estimating soil-water content at various matric potentials, which has been reported in the literature, could be consistently captured by regression techniques other than the usually applied linear regression. Two data mining techniques, i.e., Support Vector Machines (SVM) and k-Nearest Neighbors (kNN), which have recently been introduced as promising tools for PTF development, were utilized to test whether the incorporation of soil structure improves PTF accuracy in a context of rather limited training data. The results show that incorporating descriptive soil structure information (i.e., massive, structured and structureless) as a grouping criterion can improve the accuracy of PTFs derived by the SVM approach in the matric potential range of -6 to -33 kPa (average RMSE decreased by up to 0.005 m3 m-3 after grouping, depending on matric potential). The improvement was primarily attributed to the outperformance of SVM-PTFs calibrated on structureless soils. No improvement was obtained with the kNN technique, at least not in our study, in which the data set became limited in size after grouping. Since the regression technique affects the benefit of incorporating qualitative soil structure information, selecting a proper technique will help to maximize the combined influence of flexible regression algorithms and soil structure information on PTF accuracy.
Wishart, Justin Rory
2011-01-01
In this paper, a lower bound is determined in the minimax sense for change point estimators of the first derivative of a regression function in the fractional white noise model. Similar minimax results presented previously in the area focus on change points in the derivatives of a regression function in the white noise model or consider estimation of the regression function in the presence of correlated errors.
Estimation of Electrically-Evoked Knee Torque from Mechanomyography Using Support Vector Regression
Directory of Open Access Journals (Sweden)
Morufu Olusola Ibitoye
2016-07-01
The difficulty of real-time muscle force or joint torque estimation during neuromuscular electrical stimulation (NMES) in physical therapy and exercise science has motivated recent research interest in torque estimation from other muscle characteristics. This study investigated the accuracy of a computational intelligence technique for estimating NMES-evoked knee extension torque based on the mechanomyographic (MMG) signals of contracting muscles, recorded from eight healthy males. Knee torque was modelled via Support Vector Regression (SVR) due to its good generalization ability in related fields. Inputs to the proposed model were MMG amplitude characteristics, the level of electrical stimulation or contraction intensity, and knee angle. A Gaussian kernel function and its optimal parameters were identified using the best performance measure and applied as the SVR kernel function to build an effective knee torque estimation model. To train and test the model, the data were partitioned into training (70%) and testing (30%) subsets. The SVR estimation accuracy, based on the coefficient of determination (R2) between the actual and estimated torque values, was up to 94% and 89% during the training and testing cases, with root mean square errors (RMSE) of 9.48 and 12.95, respectively. The knee torque estimates obtained using SVR modelling agreed well with the experimental data from an isokinetic dynamometer. These findings support the realization of a closed-loop NMES system for functional tasks using MMG as the feedback signal source and an SVR algorithm for joint torque estimation.
THE PROBLEM OF ESTIMATING CAUSAL RELATIONS BY REGRESSING ACCOUNTING (SEMI) IDENTITIES
F. Javier Sánchez Vidal
2007-01-01
Inferences about the coefficient values of a model estimated with a linear regression cannot be made when both the dependent and the independent variable are part of an accounting (semi) identity. The coefficients will no longer indicate a causal relation as they must adapt to satisfy the identity. A good example is an investment-cash flow sensitivity model.
The limiting behavior of the estimated parameters in a misspecified random field regression model
DEFF Research Database (Denmark)
Dahl, Christian Møller; Qin, Yu
As a consequence, the random field model specification introduces non-stationarity and non-ergodicity in the misspecified model, and it becomes non-trivial, relative to the existing literature, to establish the limiting behavior of the estimated parameters. The asymptotic results are obtained by applying some convenient new uniform convergence results that we propose. This theory may have applications beyond those presented here. Our results indicate that classical statistical inference techniques, in general, work very well for random field regression models in finite samples and that these models successfully
Widyaningsih, Purnami; Retno Sari Saputro, Dewi; Nugrahani Putri, Aulia
2017-06-01
The GWOLR model combines geographically weighted regression (GWR) and ordinal logistic regression (OLR). Its parameters are estimated by maximum likelihood. Such estimation, however, yields a difficult-to-solve system of nonlinear equations, and therefore a numerical approximation approach is required. The iterative approximation approach generally uses the Newton-Raphson (NR) method. The NR method has a disadvantage: its Hessian matrix must be recomputed from the second derivatives at each iteration, and it does not always produce converging results. To address this, the NR method is modified by replacing its Hessian matrix with the Fisher information matrix, a variant termed Fisher scoring (FS). The present research seeks to determine GWOLR model parameter estimates using the Fisher scoring method and to apply the estimation to data on the level of vulnerability to Dengue Hemorrhagic Fever (DHF) in Semarang. The research concludes that health facilities make the greatest contribution to the probability of the number of DHF sufferers in both villages. Based on the number of sufferers, the IR category of DHF in both villages can be determined.
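The Fisher scoring update described above replaces the observed Hessian with the Fisher information matrix. For ordinary (non-geographically-weighted) logistic regression with the canonical link the two coincide, which allows a compact numpy sketch of the iteration; the toy data below are illustrative only, not the GWOLR model or the DHF data.

```python
import numpy as np

def fisher_scoring_logistic(X, y, n_iter=25):
    """Fit logistic regression by Fisher scoring: at each step solve
    (X^T W X) delta = X^T (y - p) with W = diag(p * (1 - p)),
    where X^T W X is the Fisher information matrix."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        W = p * (1.0 - p)
        grad = X.T @ (y - p)                 # score vector
        info = X.T @ (W[:, None] * X)        # Fisher information
        beta = beta + np.linalg.solve(info, grad)
    return beta

# toy data: intercept plus one covariate (illustrative only)
rng = np.random.default_rng(0)
x = rng.normal(size=200)
eta = -0.5 + 1.2 * x
y = (rng.random(200) < 1.0 / (1.0 + np.exp(-eta))).astype(float)
X = np.column_stack([np.ones_like(x), x])
beta_hat = fisher_scoring_logistic(X, y)
```

In the GWOLR setting the same scoring step is applied per location with geographic weights and an ordinal (cumulative-logit) likelihood, but the structure of the update is the same.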
Fu, Yuan-Yuan; Wang, Ji-Hua; Yang, Gui-Jun; Song, Xiao-Yu; Xu, Xin-Gang; Feng, Hai-Kuan
2013-05-01
The major limitation of using existing vegetation indices for crop biomass estimation is that they approach a saturation level asymptotically beyond a certain range of biomass. In order to resolve this problem, band depth analysis and partial least squares regression (PLSR) were combined to establish a winter wheat biomass estimation model in the present study. The models based on the combination of band depth analysis and PLSR were subsequently compared with models based on common vegetation indices in terms of estimation accuracy. Band depth analysis was conducted in the visible spectral domain (550-750 nm). Band depth, band depth ratio (BDR), normalized band depth index, and band depth normalized to area were utilized to represent band depth information. Among the calibrated estimation models, those based on the combination of band depth analysis and PLSR reached higher accuracy than those based on the vegetation indices. Among them, the combination of BDR and PLSR achieved the highest accuracy (R2 = 0.792, RMSE = 0.164 kg m-2). The results indicated that the combination of band depth analysis and PLSR can overcome the saturation problem and improve biomass estimation accuracy when winter wheat biomass is large.
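The band depth quantities named above come from continuum removal: a straight continuum line is fitted across the absorption feature and the depth at each wavelength is measured relative to that line. A minimal numpy sketch on a synthetic spectrum (the Gaussian absorption feature and wavelength grid are assumptions for illustration):

```python
import numpy as np

def band_depth(wl, refl, left=550.0, right=750.0):
    """Continuum-removed band depth over [left, right] nm: fit a
    straight continuum line between the two shoulder reflectances,
    then depth = 1 - R / R_continuum at each wavelength."""
    i = (wl >= left) & (wl <= right)
    w, r = wl[i], refl[i]
    rl, rr = r[0], r[-1]
    continuum = rl + (rr - rl) * (w - w[0]) / (w[-1] - w[0])
    return 1.0 - r / continuum, w

# toy spectrum with a Gaussian absorption feature near 670 nm
wl = np.arange(400.0, 901.0, 10.0)
refl = 0.4 - 0.2 * np.exp(-(((wl - 670.0) / 40.0) ** 2))
depth, w = band_depth(wl, refl)
bdr = depth / depth.max()   # band depth ratio, normalized to the maximum
```

In the study, such per-band depth values (rather than a single vegetation index) form the predictor matrix that PLSR compresses into a few latent components for the biomass regression.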
Directory of Open Access Journals (Sweden)
Schook Lawrence B
2000-07-01
A strategy of multi-step minimal conditional regression analysis has been developed to achieve statistical testing and parameter estimation for a quantitative trait locus (QTL) that are unaffected by linked QTLs. The estimation of marker-QTL recombination frequency needs to consider only three cases: (1) the chromosome has only one QTL; (2) one side of the target QTL has one or more QTLs; and (3) either side of the target QTL has one or more QTLs. Analytical formulae were derived to estimate the marker-QTL recombination frequency for each of the three cases. The formulae involve two flanking markers for case 1, two flanking markers plus a conditional marker for case 2, and two flanking markers plus two conditional markers for case 3. Each QTL variance and effect, and the total QTL variance, were also estimated using analytical formulae. Simulation data show that the formulae for estimating marker-QTL recombination frequency could be a useful statistical tool for fine QTL mapping. With 1,000 observations, a QTL could be mapped to a narrow chromosome region of 1.5 cM if no linked QTL is present, and to a 2.8 cM region if either side of the target QTL has at least one linked QTL.
SECANT-FUZZY LINEAR REGRESSION METHOD FOR HARMONIC COMPONENTS ESTIMATION IN A POWER SYSTEM
Institute of Scientific and Technical Information of China (English)
Garba Inoussa; LUO An
2003-01-01
In order to avoid unnecessary damage to electrical equipment and installations, high-quality power should be delivered to the end user and strict control of frequency should be maintained. It is therefore important to estimate the power system's harmonic components with high accuracy. This paper presents a new approach for estimating harmonic components in a power system using the secant-fuzzy linear regression method. In this approach the non-sinusoidal voltage or current waveform is written as a linear function. The coefficients of this function are assumed to be fuzzy numbers with a membership function that has a center and a spread value. The time-dependent quantity is written as a Taylor series with two different time-dependent quantities. The objective is to use the samples obtained from the transmission line to find the power system's harmonic components and frequencies. We used an experimental voltage signal from a sub-power station as a numerical test.
Estimation in semi-parametric regression with non-stationary regressors
Chen, Jia; Li, Degui; 10.3150/10-BEJ344
2012-01-01
In this paper, we consider a partially linear model of the form $Y_t=X_t^{\tau}\theta_0+g(V_t)+\epsilon_t$, $t=1,\ldots,n$, where $\{V_t\}$ is a $\beta$-null recurrent Markov chain, $\{X_t\}$ is a sequence of either strictly stationary or non-stationary regressors and $\{\epsilon_t\}$ is a stationary sequence. We propose to estimate both $\theta_0$ and $g(\cdot)$ by a semi-parametric least-squares (SLS) estimation method. Under certain conditions, we then show that the proposed SLS estimator of $\theta_0$ is still asymptotically normal with the same rate as in the case of stationary time series. In addition, we also establish an asymptotic distribution for the nonparametric estimator of the function $g(\cdot)$. Some numerical examples are provided to show that our theory and estimation method work well in practice.
Estimation of safe doses: critical review of the hockey stick regression method
Energy Technology Data Exchange (ETDEWEB)
Yanagimoto, T.; Yamamoto, E.
1979-10-01
The hockey stick regression method is a convenient method for estimating safe doses; it is a regression method using segmented lines. The method seems intuitively useful, but it requires the assumption that a positive threshold value exists, and the validity of this assumption is difficult to demonstrate. Alternative methods that are not based on this assumption are given for suitable dose-response curves by introducing a risk level. Here a method using the probit model is compared with the hockey stick regression method. Computational results suggest that the alternative method is preferable. Similar problems for the case in which the response is measured as a continuous value are also considered. The data examined concern the relation of SO2 to simple chronic bronchitis, the relation of photochemical oxidants to eye discomfort, and residual antibiotics in the liver of chicks. These data were analyzed by the original authors under the assumption that positive threshold values exist.
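The hockey stick model itself is a segmented line that is flat up to a threshold and linear beyond it. A common way to fit it is a grid search over candidate thresholds, solving a small least-squares problem at each; the candidate with the lowest residual sum of squares gives the estimated threshold. A sketch on synthetic dose-response data (the threshold and slope are assumed values for illustration):

```python
import numpy as np

def hockey_stick_fit(dose, resp, candidates):
    """Grid-search fit of the hockey stick model:
        resp = a                       for dose <= threshold
        resp = a + b * (dose - thr)    for dose >  threshold
    Returns the candidate threshold minimizing the SSE, with (a, b)."""
    best = None
    for thr in candidates:
        X = np.column_stack([np.ones_like(dose),
                             np.maximum(dose - thr, 0.0)])
        coef, *_ = np.linalg.lstsq(X, resp, rcond=None)
        sse = np.sum((resp - X @ coef) ** 2)
        if best is None or sse < best[0]:
            best = (sse, thr, coef)
    return best[1], best[2]

# toy dose-response data with a true threshold at dose = 3.0
rng = np.random.default_rng(4)
dose = np.linspace(0.0, 10.0, 200)
resp = np.where(dose > 3.0, 2.0 * (dose - 3.0), 0.0) \
       + rng.normal(0.0, 0.2, 200)
thr_hat, coef = hockey_stick_fit(dose, resp, np.linspace(0.5, 9.5, 91))
```

The abstract's criticism applies here directly: the fit always returns some threshold, whether or not a positive threshold truly exists, which is why the authors compare it against probit-based alternatives.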
Statistical downscaling modeling with quantile regression using lasso to estimate extreme rainfall
Santri, Dewi; Wigena, Aji Hamim; Djuraidah, Anik
2016-02-01
Rainfall is one of the climatic elements with high variability and has many negative impacts, especially extreme rainfall; therefore, several methods are required to minimize the damage that may occur. So far, global circulation models (GCMs) are the best method for forecasting global climate change, including extreme rainfall. Statistical downscaling (SD) is a technique for developing the relationship between GCM output, as global-scale independent variables, and rainfall, as a local-scale response variable. Using GCM output directly is difficult when assessed against observations because it has high dimension and multicollinearity between variables. Common methods for handling this problem are principal component analysis (PCA) and partial least squares regression. A newer method that can be used is the lasso, which has the advantage of simultaneously controlling the variance of the fitted coefficients and performing automatic variable selection. Quantile regression is a method that can be used to detect extreme rainfall at the dry and wet extremes. The objective of this study is to model SD using quantile regression with the lasso to predict extreme rainfall in Indramayu. The results showed that extreme rainfall (extreme wet in January, February and December) in Indramayu could be predicted properly by the model at the 90th quantile.
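Quantile regression minimizes the pinball (check) loss rather than squared error. As a minimal illustration of why this targets a quantile, the sketch below shows that the constant minimizing the pinball loss at tau = 0.9 is, up to grid resolution, the 90th percentile of the sample; the gamma-distributed "rainfall" data are an assumption for illustration, and the full method adds covariates and an L1 (lasso) penalty on their coefficients.

```python
import numpy as np

def pinball_loss(y, q_hat, tau):
    """Pinball (check) loss of quantile regression: residuals above
    the fit are weighted tau, those below are weighted (1 - tau)."""
    u = y - q_hat
    return np.mean(np.where(u >= 0, tau * u, (tau - 1.0) * u))

# synthetic skewed "rainfall" sample
rng = np.random.default_rng(5)
rain = rng.gamma(shape=2.0, scale=10.0, size=2000)

# minimize the pinball loss over a grid of constant predictions
tau = 0.9
candidates = np.linspace(rain.min(), rain.max(), 2001)
losses = [pinball_loss(rain, c, tau) for c in candidates]
best = candidates[int(np.argmin(losses))]
```

Replacing the constant with a linear function of GCM-derived predictors, and adding an L1 penalty to the minimization, gives the lasso quantile regression used in the study.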
Using the jackknife for estimation in log link Bernoulli regression models.
Lipsitz, Stuart R; Fitzmaurice, Garrett M; Arriaga, Alex; Sinha, Debajyoti; Gawande, Atul A
2015-02-10
Bernoulli (or binomial) regression using a generalized linear model with a log link function, where the exponentiated regression parameters have interpretation as relative risks, is often more appropriate than logistic regression for prospective studies with common outcomes. In particular, many researchers regard relative risks to be more intuitively interpretable than odds ratios. However, for the log link, when the outcome is very prevalent, the likelihood may not have a unique maximum. To circumvent this problem, a 'COPY method' has been proposed, which is equivalent to creating for each subject an additional observation with the same covariates except the response variable has the outcome values interchanged (1's changed to 0's and 0's changed to 1's). The original response is given weight close to 1, while the new observation is given a positive weight close to 0; this approach always leads to convergence of the maximum likelihood algorithm, except for problems with convergence due to multicollinearity among covariates. Even though this method produces a unique maximum, when the outcome is very prevalent, and/or the sample size is relatively small, the COPY method can yield biased estimates. Here, we propose using the jackknife as a bias-reduction approach for the COPY method. The proposed method is motivated by a study of patients undergoing colorectal cancer surgery.
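The COPY method's data augmentation step can be sketched directly from the description above: each subject gets a duplicate row with the outcome flipped, the original copy weighted close to 1 and the flipped copy close to 0. The weight constant c = 1000 below is an illustrative choice; the expanded data and weights would then be passed to a weighted log-link Bernoulli GLM fit (not shown), and the jackknife bias reduction would refit on leave-one-out subsets.

```python
import numpy as np

def copy_expand(X, y, c=1000):
    """COPY-method expansion for log-link Bernoulli regression:
    append a flipped-outcome duplicate of every observation, giving
    the original rows weight (c - 1) / c and the duplicates weight
    1 / c, which keeps the weighted likelihood maximum away from
    the parameter-space boundary."""
    n = len(y)
    X2 = np.vstack([X, X])
    y2 = np.concatenate([y, 1.0 - y])
    w = np.concatenate([np.full(n, (c - 1) / c), np.full(n, 1.0 / c)])
    return X2, y2, w

# tiny illustrative design matrix (intercept + one covariate)
X = np.array([[1.0, 0.2], [1.0, 1.5], [1.0, -0.7]])
y = np.array([1.0, 1.0, 0.0])
X2, y2, w = copy_expand(X, y)
```

Note that the total weight still sums to the original sample size, so the augmented fit approximates the original likelihood while guaranteeing an interior maximum.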
Using multivariate adaptive regression splines to estimate subadult age from diaphyseal dimensions.
Stull, Kyra E; L'Abbé, Ericka N; Ousley, Stephen D
2014-07-01
Subadult age estimation is considered the most accurate parameter estimated in a subadult biological profile, even though the methods are deficient and the samples from which they are based are inappropriate. The current study addresses the problems that plague subadult age estimation and creates age estimation models from diaphyseal dimensions of modern children. The sample included 1,310 males and females between the ages of birth and 12 years. Eighteen diaphyseal length and breadth measurements were obtained from Lodox Statscan radiographic images generated at two institutions in Cape Town, South Africa, between 2007 and 2012. Univariate and multivariate age estimation models were created using multivariate adaptive regression splines. k-fold cross-validated 95% prediction intervals (PIs) were created for each model, and the precision of each model was assessed. The diaphyseal length models generated the narrowest PIs (2 months to 6 years) for all univariate models. The majority of multivariate models had PIs that ranged from 3 months to 5 and 6 years. Mean bias approximated 0 for each model, but most models lost precision after 10 years of age. Univariate diaphyseal length models are recommended for younger children, whereas multivariate models are recommended for older children where the inclusion of more variables minimized the size of the PIs. If diaphyseal lengths are not available, multivariate breadth models are recommended. The present study provides applicable age estimation formulae and explores the advantages and disadvantages of different subadult age estimation models using diaphyseal dimensions. Am J Phys Anthropol 154:376-386, 2014. © 2014 Wiley Periodicals, Inc.
ACCURACY OF MILK YIELD ESTIMATION IN DAIRY CATTLE FROM MONTHLY RECORD BY REGRESSION METHOD
Directory of Open Access Journals (Sweden)
I.S. Kuswahyuni
2014-10-01
This experiment was conducted to estimate actual milk yield and to compare the estimation accuracy of cumulative monthly records against actual milk yield by the regression method. Materials used in this experiment were records relating to milk yield and pedigree. The data were categorized into two groups: Age Group I (AG I), cows calving at <36 months old (33 cows with 33 lactation records), and AG II, cows calving at ≥36 months old (44 cows with 105 lactation records). The first three to seven months of data were used to estimate actual milk yield. Results showed that the mean milk yield per head per lactation in AG I (2479.5 ± 461.5 kg) was lower than in AG II (2989.7 ± 526.8 kg). Estimated milk yields for three to seven months in AG I were 2455.6 ± 419.7, 2455.7 ± 432.9, 2455.5 ± 446.4, 2455.6 ± 450.8 and 2455.5 ± 459.3 kg, respectively, while in AG II they were 2972.3 ± 479.8, 2972.0 ± 497.2, 2972.4 ± 509.6, 2972.5 ± 523.6 and 2972.5 ± 535.1 kg, respectively. Correlation coefficients between estimated and actual milk yield in AG I were 0.79, 0.82, 0.86, 0.86 and 0.88, respectively, while in AG II they were 0.65, 0.66, 0.67, 0.69 and 0.72, respectively. In conclusion, the mean estimated milk yield in AG I was lower than in AG II. The best record for estimating actual milk yield in both AG I and AG II was the seven-month cumulative record.
Lin, Feng-Chang; Zhu, Jun
2012-01-01
We develop continuous-time models for the analysis of environmental or ecological monitoring data such that subjects are observed at multiple monitoring time points across space. Of particular interest are additive hazards regression models where the baseline hazard function can take on flexible forms. We consider time-varying covariates and take into account spatial dependence via autoregression in space and time. We develop statistical inference for the regression coefficients via partial likelihood. Asymptotic properties, including consistency and asymptotic normality, are established for parameter estimates under suitable regularity conditions. Feasible algorithms utilizing existing statistical software packages are developed for computation. We also consider a simpler additive hazards model with homogeneous baseline hazard and develop hypothesis testing for homogeneity. A simulation study demonstrates that the statistical inference using partial likelihood has sound finite-sample properties and offers a viable alternative to maximum likelihood estimation. For illustration, we analyze data from an ecological study that monitors bark beetle colonization of red pines in a plantation of Wisconsin.
Estimation of Subpixel Snow-Covered Area by Nonparametric Regression Splines
Kuter, S.; Akyürek, Z.; Weber, G.-W.
2016-10-01
Measurement of the areal extent of snow cover with high accuracy plays an important role in hydrological and climate modeling. Remotely sensed data acquired by earth-observing satellites offer great advantages for timely monitoring of snow cover. However, the main obstacle is the tradeoff between the temporal and spatial resolution of satellite imagery. Soft or subpixel classification of low- or moderate-resolution satellite images is a preferred technique to overcome this problem. The most frequently employed snow cover fraction methods applied to Moderate Resolution Imaging Spectroradiometer (MODIS) data have evolved from spectral unmixing and empirical Normalized Difference Snow Index (NDSI) methods to the latest machine learning-based artificial neural networks (ANNs). This study demonstrates the implementation of subpixel snow-covered area estimation based on the state-of-the-art nonparametric spline regression method, namely, Multivariate Adaptive Regression Splines (MARS). MARS models were trained using MODIS top-of-atmosphere reflectance values of bands 1-7 as predictor variables. Reference percentage snow cover maps were generated from higher-spatial-resolution Landsat ETM+ binary snow cover maps. A multilayer feed-forward ANN with one hidden layer trained with backpropagation was also employed to estimate the percentage snow-covered area on the same data set. The results indicated that the developed MARS model performed better than the ANN.
Tian, Guo-Liang; Tang, Man-Lai; Fang, Hong-Bin; Tan, Ming
2008-03-15
Fitting logistic regression models is challenging when their parameters are restricted. In this article, we first develop a quadratic lower-bound (QLB) algorithm for optimization with box or linear inequality constraints and derive the fastest QLB algorithm corresponding to the smallest global majorization matrix. The proposed QLB algorithm is particularly suited to problems to which EM-type algorithms are not applicable (e.g., logistic, multinomial logistic, and Cox's proportional hazards models) while it retains the same EM ascent property and thus assures the monotonic convergence. Secondly, we generalize the QLB algorithm to penalized problems in which the penalty functions may not be totally differentiable. The proposed method thus provides an alternative algorithm for estimation in lasso logistic regression, where the convergence of the existing lasso algorithm is not generally ensured. Finally, by relaxing the ascent requirement, convergence speed can be further accelerated. We introduce a pseudo-Newton method that retains the simplicity of the QLB algorithm and the fast convergence of the Newton method. Theoretical justification and numerical examples show that the pseudo-Newton method is up to 71 (in terms of CPU time) or 107 (in terms of number of iterations) times faster than the fastest QLB algorithm and thus makes bootstrap variance estimation feasible. Simulations and comparisons are performed and three real examples (Down syndrome data, kyphosis data, and colon microarray data) are analyzed to illustrate the proposed methods.
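For logistic regression the classical global majorization matrix is B = X'X / 4, since the negative Hessian X'WX has W = diag(p(1 - p)) with every entry at most 1/4. Fixing B once yields the simple monotone MM iteration sketched below in numpy; this is a generic illustration of the QLB idea on toy data, not the authors' constrained, penalized, or pseudo-Newton variants.

```python
import numpy as np

def qlb_logistic(X, y, n_iter=500):
    """Quadratic lower-bound (MM) iteration for logistic regression.
    B = X^T X / 4 majorizes the negative Hessian X^T W X for every
    beta, so beta <- beta + B^{-1} X^T (y - p) increases the
    log-likelihood monotonically; B is factored once, not per step."""
    B = X.T @ X / 4.0
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        beta = beta + np.linalg.solve(B, X.T @ (y - p))
    return beta

# toy data (illustrative only)
rng = np.random.default_rng(6)
x = rng.normal(size=150)
y = (rng.random(150) < 1.0 / (1.0 + np.exp(-(0.3 + x)))).astype(float)
X = np.column_stack([np.ones_like(x), x])
beta_qlb = qlb_logistic(X, y)
```

The fixed majorizer trades per-iteration cost for iteration count, which is exactly the gap the abstract's pseudo-Newton acceleration aims to close.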
Webb, Sara Jane; Nalty, Theresa; Munson, Jeff; Brock, Catherine; Abbott, Robert; Dawson, Geraldine
2007-10-01
Several reports indicate that autism spectrum disorder is associated with increased rate of head growth in early childhood. Increased rate of growth may index aberrant processes during early development, may precede the onset of symptoms, and may predict severity of the disease course. We examined rate of change in occipitofrontal circumference measurements (abstracted from medical records) in 28 boys with autism spectrum disorder and in 8 boys with developmental delay without autism from birth to age 36 months. Only children who had more than 3 occipitofrontal circumference measurements available during this age period were included. All data were converted to z scores based on the Centers for Disease Control and Prevention norms. Rate of growth from birth to age 36 months was statistically significantly higher for the autism spectrum disorder group than the developmental delay group, with children with autism spectrum disorder showing a statistically significant increase in occipitofrontal circumference relative to norms between 7 and 10 months; this group difference in rate of growth was more robust when height was used as a covariate. Rate of growth was not found to be different for children with autism spectrum disorder whose parents reported a history of loss of skills (regression) vs those whose parents reported early onset of autism symptoms. Findings from this study suggest that the aberrant growth is present in the first year of life and precedes the onset and diagnosis in children with autism spectrum disorder with and without a history of autistic regression.
Flexible regression models for estimating postmortem interval (PMI) in forensic medicine.
Muñoz Barús, José Ignacio; Febrero-Bande, Manuel; Cadarso-Suárez, Carmen
2008-10-30
Correct determination of time of death is an important goal in forensic medicine. Numerous methods have been described for estimating postmortem interval (PMI), but most are imprecise, poorly reproducible and/or have not been validated with real data. In recent years, however, some progress in PMI estimation has been made, notably through the use of new biochemical methods for quantifying relevant indicator compounds in the vitreous humour. The best, but unverified, results have been obtained with [K+] and hypoxanthine [Hx], using simple linear regression (LR) models. The main aim of this paper is to offer more flexible alternatives to LR, such as generalized additive models (GAMs) and support vector machines (SVMs) in order to obtain improved PMI estimates. The present study, based on detailed analysis of [K+] and [Hx] in more than 200 vitreous humour samples from subjects with known PMI, compared classical LR methodology with GAM and SVM methodologies. Both proved better than LR for estimation of PMI. SVM showed somewhat greater precision than GAM, but GAM offers a readily interpretable graphical output, facilitating understanding of findings by legal professionals; there are thus arguments for using both types of models. R code for these methods is available from the authors, permitting accurate prediction of PMI from vitreous humour [K+], [Hx] and [U], with confidence intervals and graphical output provided. Copyright 2008 John Wiley & Sons, Ltd.
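The linear-regression baseline the paper compares against reduces to a closed-form simple fit of PMI on vitreous [K+]. The concentration/PMI pairs below are purely illustrative, not data from the study, and real practice reports confidence intervals rather than point predictions:

```python
def ols_line(x, y):
    """Closed-form simple linear regression y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
    return my - b * mx, b

# hypothetical (vitreous [K+] in mmol/L, PMI in hours) pairs, illustration only
pairs = [(6.0, 4.0), (7.5, 14.0), (9.0, 25.0), (10.5, 36.0), (12.0, 46.0)]
a, b = ols_line([k for k, _ in pairs], [h for _, h in pairs])
pmi_at_8 = a + b * 8.0  # point prediction for [K+] = 8 mmol/L
```

GAMs and SVMs generalize exactly this step by replacing the straight line with a flexible function of [K+] and [Hx].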
A fast nonlinear regression method for estimating permeability in CT perfusion imaging.
Bennink, Edwin; Riordan, Alan J; Horsch, Alexander D; Dankbaar, Jan Willem; Velthuis, Birgitta K; de Jong, Hugo W
2013-11-01
Blood-brain barrier damage, which can be quantified by measuring vascular permeability, is a potential predictor for hemorrhagic transformation in acute ischemic stroke. Permeability is commonly estimated by applying Patlak analysis to computed tomography (CT) perfusion data, but this method lacks precision. Applying more elaborate kinetic models by means of nonlinear regression (NLR) may improve precision, but is more time consuming and therefore less appropriate in an acute stroke setting. We propose a simplified NLR method that may be faster and still precise enough for clinical use. The aim of this study is to evaluate the reliability of a total of 12 variations of Patlak analysis and NLR methods, including the simplified NLR method. Confidence intervals for the permeability estimates were evaluated using simulated CT attenuation-time curves with realistic noise, and clinical data from 20 patients. Although fixing the blood volume improved Patlak analysis, the NLR methods yielded significantly more reliable estimates, but took up to 12 × longer to calculate. The simplified NLR method was ∼4 × faster than other NLR methods, while maintaining the same confidence intervals (CIs). In conclusion, the simplified NLR method is a new, reliable way to estimate permeability in stroke, fast enough for clinical application in an acute stroke setting.
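The Patlak step the abstract refers to is a linear regression in transformed coordinates: tissue/arterial concentration against cumulative-arterial/arterial. A minimal sketch with synthetic noise-free curves (K = 0.05 and V = 0.1 are assumed values; clinical use adds noise handling, delay correction, and fit windows):

```python
import math

def patlak(t, ca, ct):
    """Patlak plot: regress ct/ca on cumint(ca)/ca; the slope estimates
    the permeability K, the intercept the blood-volume term V."""
    cum, acc = [0.0], 0.0
    for i in range(1, len(t)):
        acc += 0.5 * (ca[i] + ca[i - 1]) * (t[i] - t[i - 1])
        cum.append(acc)
    xs = [c / a for c, a in zip(cum[1:], ca[1:])]  # skip t=0 where ca = 0
    ys = [c / a for c, a in zip(ct[1:], ca[1:])]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    k = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return k, my - k * mx

# synthetic arterial curve and a tissue curve built with known K=0.05, V=0.1
t = [float(i) for i in range(11)]
ca = [ti * math.exp(-ti / 3.0) for ti in t]
cum, acc = [0.0], 0.0
for i in range(1, len(t)):
    acc += 0.5 * (ca[i] + ca[i - 1]) * (t[i] - t[i - 1])
    cum.append(acc)
ct = [0.05 * c + 0.1 * a for c, a in zip(cum, ca)]
k_hat, v_hat = patlak(t, ca, ct)
```

Fixing V (the intercept) rather than fitting it is the "fixating the blood volume" variation evaluated in the study.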
Institute of Scientific and Technical Information of China (English)
Mehdi Najafi; Seyed Mohammad Esmaiel Jalali; Reza KhaloKakaie; Farrokh Forouhandeh
2015-01-01
During underground coal gasification (UCG), whereby coal is converted to syngas in situ, a cavity is formed in the coal seam. The cavity growth rate (CGR) or the moving rate of the gasification face is affected by controllable (operation pressure, gasification time, geometry of UCG panel) and uncontrollable (coal seam properties) factors. The CGR is usually predicted by mathematical models and laboratory experiments, which are time consuming, cumbersome and expensive. In this paper, a new simple model for CGR is developed using non-linear regression analysis, based on data from 11 UCG field trials. The empirical model compares satisfactorily with Perkins model and can reliably predict CGR.
Bertipaglia, T S; Carreño, L O D; Aspilcueta-Borquis, R R; Boligon, A A; Farah, M M; Gomes, F J; Machado, C H C; Rey, F S B; da Fonseca, R
2015-08-01
Random regression models (RRM) and multitrait models (MTM) were used to estimate genetic parameters for growth traits in Brazilian Brahman cattle and to compare the estimated breeding values obtained by these 2 methodologies. For RRM, 78,641 weight records taken between 60 and 550 d of age from 16,204 cattle were analyzed, and for MTM, the analysis consisted of 17,385 weight records taken at the same ages from 12,925 cattle. All models included the fixed effects of contemporary group and the additive genetic, maternal genetic, and animal permanent environmental effects and the quadratic effect of age at calving (AAC) as covariate. For RRM, the AAC was nested in the animal's age class. The best RRM considered cubic polynomials and the residual variance heterogeneity (5 levels). For MTM, the weights were adjusted for standard ages. For RRM, additive heritability estimates ranged from 0.42 to 0.75, and for MTM, the estimates ranged from 0.44 to 0.72 for both models at 60, 120, 205, 365, and 550 d of age. The maximum maternal heritability estimate (0.08) was at 140 d for RRM, but for MTM, it was highest at weaning (0.09). The magnitude of the genetic correlations was generally from moderate to high. The RRM adequately modeled changes in variance or covariance with age, and, provided there is a sufficient number of samples, increased accuracy in the estimation of the genetic parameters can be expected. Correlations of bull classifications differed between the two methods at all the ages evaluated, especially at high selection intensities, which could affect the response to selection.
Directory of Open Access Journals (Sweden)
Gholam Reza Sheykhzadeh
2017-02-01
Introduction: Penetration resistance is one of the criteria for evaluating soil compaction. It correlates with several soil properties such as vehicle trafficability, resistance to root penetration, seedling emergence, and soil compaction by farm machinery. Direct measurement of penetration resistance is time consuming and difficult because of high temporal and spatial variability. Therefore, many different regression and artificial neural network pedotransfer functions have been proposed to estimate penetration resistance from readily available soil variables such as particle size distribution, bulk density (Db) and gravimetric water content (θm). The lands of Ardabil Province are one of the main production regions of potato in Iran; thus, obtaining the soil penetration resistance in these regions helps with the management of potato production. The objective of this research was to derive pedotransfer functions by using regression and artificial neural networks to predict penetration resistance from some soil variables in the agricultural soils of the Ardabil plain and to compare the performance of artificial neural networks with regression models. Materials and methods: Disturbed and undisturbed soil samples (n = 105) were systematically taken from 0-10 cm soil depth at nearly 3000 m spacing in the agricultural lands of the Ardabil plain (lat 38°15' to 38°40' N, long 48°16' to 48°61' E). The contents of sand, silt and clay (hydrometer method), CaCO3 (titration method), bulk density (cylinder method), particle density (Dp) (pycnometer method), organic carbon (wet oxidation method), total porosity (calculated from Db and Dp), and saturated (θs) and field (θf) soil water content (gravimetric method) were measured in the laboratory. Mean geometric diameter (dg) and standard deviation (σg) of soil particles were computed using the percentages of sand, silt and clay. Penetration resistance was measured in situ using a cone penetrometer (analog model) at 10
Evaluation of Regression and Neuro_Fuzzy Models in Estimating Saturated Hydraulic Conductivity
Directory of Open Access Journals (Sweden)
J. Behmanesh
2015-06-01
Study of soil hydraulic properties such as saturated and unsaturated hydraulic conductivity is required in environmental investigations. Despite numerous studies, measuring saturated hydraulic conductivity by direct methods remains costly, time consuming and dependent on expertise. Therefore, estimating saturated hydraulic conductivity by rapid and low-cost methods such as pedotransfer functions with acceptable accuracy has been pursued. The purpose of this research was to compare and evaluate 11 pedotransfer functions and the Adaptive Neuro-Fuzzy Inference System (ANFIS) for estimating the saturated hydraulic conductivity of soil. To this end, saturated hydraulic conductivity and physical properties at 40 points of Urmia were measured. The excavated soil was used in the lab to determine its easily accessible parameters. The results showed that among the existing models, the Aimrun et al. model gave the best estimate of soil saturated hydraulic conductivity. For the mentioned model, the Root Mean Square Error and Mean Absolute Error were 0.174 and 0.028 m/day, respectively. The results of the present research emphasise the importance of effective porosity as an important accessible parameter for the accuracy of pedotransfer functions. Sand and silt percent, bulk density and soil particle density were selected as inputs to 561 ANFIS models. In the training phase of the best ANFIS model, the R2 and RMSE were 1 and 1.2×10-7, respectively; in the test phase they were 0.98 and 0.0006, respectively. Comparison of the regression and ANFIS models showed that the ANFIS model gave better results than the regression functions. The Neuro-Fuzzy Inference System was also able to estimate with high accuracy in various soil textures.
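The RMSE and MAE criteria used to rank the pedotransfer functions are straightforward to compute; the measured/estimated conductivity values below are hypothetical placeholders:

```python
import math

def rmse(obs, pred):
    """Root mean square error between observed and predicted values."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def mae(obs, pred):
    """Mean absolute error between observed and predicted values."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)

# hypothetical saturated-conductivity values (m/day): measured vs. estimated
obs = [1.0, 2.0, 3.0, 4.0]
pred = [1.1, 1.9, 3.2, 3.8]
```

RMSE penalizes large errors more heavily than MAE, which is why the two criteria can rank competing pedotransfer functions differently.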
Analyses of Developmental Rate Isomorphy in Ectotherms: Introducing the Dirichlet Regression.
Directory of Open Access Journals (Sweden)
David S Boukal
Temperature drives development in insects and other ectotherms because their metabolic rate and growth depend directly on thermal conditions. However, relative durations of successive ontogenetic stages often remain nearly constant across a substantial range of temperatures. This pattern, termed 'developmental rate isomorphy' (DRI) in insects, appears to be widespread, and reported departures from DRI are generally very small. We show that these conclusions may be due to the caveats hidden in the statistical methods currently used to study DRI. Because the DRI concept is inherently based on proportional data, we propose that Dirichlet regression applied to individual-level data is an appropriate statistical method to critically assess DRI. As a case study we analyze data on five aquatic and four terrestrial insect species. We find that results obtained by Dirichlet regression are consistent with DRI violation in at least eight of the studied species, although standard analysis detects significant departure from DRI in only four of them. Moreover, the departures from DRI detected by Dirichlet regression are consistently much larger than previously reported. The proposed framework can also be used to infer whether observed departures from DRI reflect life history adaptations to size- or stage-dependent effects of varying temperature. Our results indicate that the concept of DRI in insects and other ectotherms should be critically re-evaluated and put in a wider context, including the concept of 'equiproportional development' developed for copepods.
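The compositional response that Dirichlet regression models is just the vector of stage-duration proportions; under DRI these proportions are invariant to temperature. A tiny sketch with hypothetical durations (not the Dirichlet fit itself):

```python
def stage_proportions(durations):
    """Proportions of total development time spent in each stage --
    the compositional response that Dirichlet regression models."""
    total = sum(durations)
    return [d / total for d in durations]

# hypothetical durations (days) of three stages at two temperatures
cool = [10.0, 20.0, 30.0]  # slower development when cool
warm = [5.0, 10.0, 15.0]   # faster overall, same proportions -> DRI holds
```

Dirichlet regression then tests whether these proportion vectors shift systematically with temperature, which is exactly a test of DRI.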
Motulsky, Harvey J; Brown, Ronald E
2006-03-09
Nonlinear regression, like linear regression, assumes that the scatter of data around the ideal curve follows a Gaussian or normal distribution. This assumption leads to the familiar goal of regression: to minimize the sum of the squares of the vertical or Y-value distances between the points and the curve. Outliers can dominate the sum-of-the-squares calculation, and lead to misleading results. However, we know of no practical method for routinely identifying outliers when fitting curves with nonlinear regression. We describe a new method for identifying outliers when fitting data with nonlinear regression. We first fit the data using a robust form of nonlinear regression, based on the assumption that scatter follows a Lorentzian distribution. We devised a new adaptive method that gradually becomes more robust as the method proceeds. To define outliers, we adapted the false discovery rate approach to handling multiple comparisons. We then remove the outliers, and analyze the data using ordinary least-squares regression. Because the method combines robust regression and outlier removal, we call it the ROUT method. When analyzing simulated data, where all scatter is Gaussian, our method detects (falsely) one or more outlier in only about 1-3% of experiments. When analyzing data contaminated with one or several outliers, the ROUT method performs well at outlier identification, with an average False Discovery Rate less than 1%. Our method, which combines a new method of robust nonlinear regression with a new method of outlier identification, identifies outliers from nonlinear curve fits with reasonable power and few false positives.
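The overall shape of the ROUT procedure — robust fit, flag outliers from residuals, then ordinary least squares on the rest — can be sketched with stand-ins: Theil-Sen replaces the Lorentzian robust fit and a k·MAD residual cut replaces the FDR-based test, so this is the pipeline's skeleton, not the published method:

```python
import statistics

def rout_like(x, y, k=4.0):
    """Robust fit (Theil-Sen), outlier cut (k*MAD on residuals), OLS refit.
    Returns (intercept, slope, number of points removed)."""
    slopes = [(y[j] - y[i]) / (x[j] - x[i])
              for i in range(len(x)) for j in range(i + 1, len(x))]
    b = statistics.median(slopes)
    a = statistics.median([yi - b * xi for xi, yi in zip(x, y)])
    res = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    med = statistics.median(res)
    mad = statistics.median([abs(r - med) for r in res])
    keep = [(xi, yi) for xi, yi, r in zip(x, y, res)
            if mad == 0.0 or abs(r - med) <= k * mad]
    xs, ys = [p[0] for p in keep], [p[1] for p in keep]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    b2 = (sum((xi - mx) * (yi - my) for xi, yi in zip(xs, ys))
          / sum((xi - mx) ** 2 for xi in xs))
    return my - b2 * mx, b2, len(x) - len(keep)

x = list(range(10))
y = [2 * xi + 1 + (0.05 if xi % 2 == 0 else -0.05) for xi in x]
y[5] += 50.0  # one gross outlier
a_hat, b_hat, n_removed = rout_like(x, y)
```

The robust first pass keeps the outlier from dragging the fit, so the residual cut can identify it before the final least-squares step.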
Buonaccorsi, John; Prochenka, Agnieszka; Thoresen, Magne; Ploski, Rafal
2016-09-30
Motivated by a genetic application, this paper addresses the problem of fitting regression models when the predictor is a proportion measured with error. While the problem of dealing with additive measurement error in fitting regression models has been extensively studied, the problem where the additive error is of a binomial nature has not been addressed. The measurement errors here are heteroscedastic for two reasons; dependence on the underlying true value and changing sampling effort over observations. While some of the previously developed methods for treating additive measurement error with heteroscedasticity can be used in this setting, other methods need modification. A new version of simulation extrapolation is developed, and we also explore a variation on the standard regression calibration method that uses a beta-binomial model based on the fact that the true value is a proportion. Although most of the methods introduced here can be used for fitting non-linear models, this paper will focus primarily on their use in fitting a linear model. While previous work has focused mainly on estimation of the coefficients, we will, with motivation from our example, also examine estimation of the variance around the regression line. In addressing these problems, we also discuss the appropriate manner in which to bootstrap for both inferences and bias assessment. The various methods are compared via simulation, and the results are illustrated using our motivating data, for which the goal is to relate the methylation rate of a blood sample to the age of the individual providing the sample. Copyright © 2016 John Wiley & Sons, Ltd.
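The extrapolation step at the heart of any SIMEX variant can be shown in isolation: refit the model at inflated error levels lambda, then extrapolate the fitted curve back to lambda = -1 (zero error). The attenuation curve below is synthetic, standing in for slopes that would come from refits, and this sketch omits the binomial-error remeasurement the paper develops:

```python
def quad_extrapolate(lams, vals, target=-1.0):
    """Evaluate at `target` the quadratic through three (lambda, value)
    points (Lagrange form) -- the extrapolation step of SIMEX."""
    out = 0.0
    for i in range(3):
        w = vals[i]
        for j in range(3):
            if j != i:
                w *= (target - lams[j]) / (lams[i] - lams[j])
        out += w
    return out

# illustrative attenuation curve: slope(lambda) = 2 / (1 + 0.25*(1 + lambda)),
# mimicking how added measurement error shrinks a naive regression slope
lams = [0.0, 0.5, 1.0]
slopes = [2.0 / (1.0 + 0.25 * (1.0 + lam)) for lam in lams]
beta_simex = quad_extrapolate(lams, slopes)  # approximates the error-free 2.0
```

The naive slope at lambda = 0 is 1.6; extrapolating to lambda = -1 recovers most of the attenuation toward the true value of 2.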
Speaker height estimation from speech: Fusing spectral regression and statistical acoustic models.
Hansen, John H L; Williams, Keri; Bořil, Hynek
2015-08-01
Estimating speaker height can assist in voice forensic analysis and provide additional side knowledge to benefit automatic speaker identification or acoustic model selection for automatic speech recognition. In this study, a statistical approach to height estimation that incorporates acoustic models within a non-uniform height bin width Gaussian mixture model structure as well as a formant analysis approach that employs linear regression on selected phones are presented. The accuracy and trade-offs of these systems are explored by examining the consistency of the results, location, and causes of error, as well as a combined fusion of the two systems using data from the TIMIT corpus. Open set testing is also presented using the Multi-session Audio Research Project corpus and publicly available YouTube audio to examine the effect of channel mismatch between training and testing data and provide a realistic open domain testing scenario. The proposed algorithms achieve performance highly competitive with previously published literature. Although the different data partitioning in the literature and this study may prevent performance comparisons in absolute terms, the mean average error of 4.89 cm for males and 4.55 cm for females provided by the proposed algorithm on TIMIT utterances containing selected phones suggests a considerable estimation error decrease compared to past efforts.
Estimating stellar atmospheric parameters based on LASSO and support-vector regression
Lu, Yu; Li, Xiangru
2015-09-01
A scheme for estimating atmospheric parameters Teff, log g and [Fe/H] is proposed on the basis of the Least Absolute Shrinkage and Selection Operator (LASSO) algorithm and Haar wavelet. The proposed scheme consists of three processes. A spectrum is decomposed using the Haar wavelet transform and low-frequency components at the fourth level are considered as candidate features. Then, spectral features from the candidate features are detected using the LASSO algorithm to estimate the atmospheric parameters. Finally, atmospheric parameters are estimated from the extracted spectral features using the support-vector regression (SVR) method. The proposed scheme was evaluated using three sets of stellar spectra from the Sloan Digital Sky Survey (SDSS), Large Sky Area Multi-object Fibre Spectroscopic Telescope (LAMOST) and Kurucz's model, respectively. The mean absolute errors are as follows: for the 40 000 SDSS spectra, 0.0062 dex for log Teff (85.83 K for Teff), 0.2035 dex for log g and 0.1512 dex for [Fe/H]; for the 23 963 LAMOST spectra, 0.0074 dex for log Teff (95.37 K for Teff), 0.1528 dex for log g and 0.1146 dex for [Fe/H]; for the 10 469 synthetic spectra, 0.0010 dex for log Teff (14.42K for Teff), 0.0123 dex for log g and 0.0125 dex for [Fe/H].
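The low-frequency candidate features in the first stage come from repeatedly averaging adjacent samples, the Haar approximation filter. A minimal sketch of that lowpass cascade (the real pipeline then runs LASSO selection and SVR on such coefficients):

```python
import math

def haar_lowpass(signal, levels=4):
    """Keep only the Haar approximation (low-frequency) coefficients:
    each level replaces adjacent pairs by their scaled average
    (a+b)/sqrt(2). Length must be divisible by 2**levels."""
    s = list(signal)
    for _ in range(levels):
        s = [(s[i] + s[i + 1]) / math.sqrt(2.0) for i in range(0, len(s), 2)]
    return s

# a flat 16-sample "spectrum": four levels reduce it to one coefficient
low = haar_lowpass([1.0] * 16, levels=4)
```

Four levels shrink the feature vector by a factor of 16 while preserving the smooth continuum shape that carries most of the atmospheric-parameter information.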
Styborski, Jeremy A.
This project was started in the interest of supplementing existing data on additives to composite solid propellants. The study on the addition of iron and aluminum nanoparticles to composite AP/HTPB propellants was conducted at the Combustion and Energy Systems Laboratory at RPI in the new strand-burner experiment setup. For this study, a large literature review was conducted on the history of solid propellant combustion modeling and the empirical results of tests on binders, plasticizers, AP particle size, and additives. The study focused on the addition of nano-scale aluminum and iron in small concentrations to AP/HTPB solid propellants with an average AP particle size of 200 microns. Replacing 1% of the propellant's AP with 40-60 nm aluminum particles produced no change in combustive behavior. The addition of 1% 60-80 nm iron particles produced a significant increase in burn rate, although the increase was smaller at higher pressures. These results are summarized in Table 2. The increase in the burn rate at all pressures due to the addition of iron nanoparticles warranted further study on the effect of concentration of iron. Tests conducted at 10 atm showed that the mean regression rate varied with iron concentration, peaking at 1% and 3%. Regardless of the iron concentration, the regression rate was higher than the baseline AP/HTPB propellants. These results are summarized in Table 3.
Hermanek, P; Guggenmoos-Holzmann, I
1994-01-01
A total of 961 patients who had received resective surgery for gastric carcinoma were grouped according to prognosis by classification and regression trees (CART). This grouping was compared to the present UICC stage grouping. For patients resected for cure (R0) the CART approach allows a better discrimination of patients with poor prognosis (5-year survival rates 15%-30%) from patients with a 5-year survival of 50%, on the one hand, and from patients with extremely poor prognosis (5-year survival rates below 5%) on the other. In the present investigation CART grouping was not influenced by the differentiation between pT1 and pT2 or between pT3 and pT4.
Simulated data supporting inbreeding rate estimates from incomplete pedigrees
U.S. Geological Survey, Department of the Interior — This data release includes: (1) The data from simulations used to illustrate the behavior of inbreeding rate estimators. Estimating inbreeding rates is particularly...
Messier, Kyle P; Akita, Yasuyuki; Serre, Marc L
2012-03-06
Geographic information systems (GIS) based techniques are cost-effective and efficient methods used by state agencies and epidemiology researchers for estimating concentration and exposure. However, budget limitations have made statewide assessments of contamination difficult, especially in groundwater media. Many studies have implemented address geocoding, land use regression, and geostatistics independently, but this is the first to examine the benefits of integrating these GIS techniques to address the need of statewide exposure assessments. A novel framework for concentration exposure is introduced that integrates address geocoding, land use regression (LUR), below detect data modeling, and Bayesian Maximum Entropy (BME). A LUR model was developed for tetrachloroethylene that accounts for point sources and flow direction. We then integrate the LUR model into the BME method as a mean trend while also modeling below detects data as a truncated Gaussian probability distribution function. We increase available PCE data 4.7 times from previously available databases through multistage geocoding. The LUR model shows significant influence of dry cleaners at short ranges. The integration of the LUR model as mean trend in BME results in a 7.5% decrease in cross validation mean square error compared to BME with a constant mean trend.
Urrutia, Jackie D.; Tampis, Razzcelle L.; Mercado, Joseph; Baygan, Aaron Vito M.; Baccay, Edcon B.
2016-02-01
The objective of this research is to formulate a mathematical model for the Philippines' Real Gross Domestic Product (Real GDP). The following factors are considered: Consumers' Spending (x1), Government's Spending (x2), Capital Formation (x3) and Imports (x4) as the Independent Variables that can actually influence the Real GDP of the Philippines (y). The researchers used a Normal Estimation Equation using Matrices to create the model for Real GDP and used α = 0.01. The researchers analyzed quarterly data from 1990 to 2013. The data were acquired from the National Statistical Coordination Board (NSCB), resulting in a total of 96 observations for each variable. The data have undergone a logarithmic transformation, particularly the Dependent Variable (y), to satisfy all the assumptions of the Multiple Linear Regression Analysis. The mathematical model for Real GDP was formulated using Matrices through MATLAB. Based on the results, only three of the Independent Variables are significant to the Dependent Variable, namely: Consumers' Spending (x1), Capital Formation (x3) and Imports (x4), hence, can actually predict Real GDP (y). The regression analysis displays that 98.7% (coefficient of determination) of the Independent Variables can actually predict the Dependent Variable. With a 97.6% result in the Paired T-Test, the Predicted Values obtained from the model showed no significant difference from the Actual Values of Real GDP. This research will be essential in appraising the forthcoming changes to aid the Government in implementing policies for the development of the economy.
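The "normal estimation equation using matrices" is the standard least-squares solve beta = (X'X)^(-1) X'y; a self-contained sketch with a toy design matrix (the regressor values and coefficients below are invented for illustration, not the GDP data):

```python
def normal_equations(X, y):
    """Solve the normal equations (X'X) b = X'y by Gaussian elimination
    with partial pivoting; X includes a column of 1s for the intercept."""
    p = len(X[0])
    A = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    c = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    for col in range(p):                      # forward elimination
        piv = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        c[col], c[piv] = c[piv], c[col]
        for r in range(col + 1, p):
            f = A[r][col] / A[col][col]
            for k in range(col, p):
                A[r][k] -= f * A[col][k]
            c[r] -= f * c[col]
    beta = [0.0] * p                          # back substitution
    for r in range(p - 1, -1, -1):
        beta[r] = (c[r] - sum(A[r][k] * beta[k] for k in range(r + 1, p))) / A[r][r]
    return beta

# toy design matrix: intercept plus two regressors; y is exactly linear
X = [[1.0, 0.0, 0.0],
     [1.0, 1.0, 0.0],
     [1.0, 0.0, 1.0],
     [1.0, 1.0, 1.0],
     [1.0, 2.0, 3.0]]
y = [3.0, 5.0, 2.0, 4.0, 4.0]  # generated as y = 3 + 2*x1 - 1*x2
beta = normal_equations(X, y)
```

With log-transformed y, the fitted coefficients are interpreted on the log scale, which is why the study transforms the dependent variable before applying this solve.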
A Note on Penalized Regression Spline Estimation in the Secondary Analysis of Case-Control Data
Gazioglu, Suzan
2013-05-25
Primary analysis of case-control studies focuses on the relationship between disease (D) and a set of covariates of interest (Y, X). A secondary application of the case-control study, often invoked in modern genetic epidemiologic association studies, is to investigate the interrelationship between the covariates themselves. The task is complicated due to the case-control sampling, and to avoid the biased sampling that arises from the design, it is typical to use the control data only. In this paper, we develop penalized regression spline methodology that uses all the data, and improves precision of estimation compared to using only the controls. A simulation study and an empirical example are used to illustrate the methodology.
Monopole and dipole estimation for multi-frequency sky maps by linear regression
Wehus, I. K.; Fuskeland, U.; Eriksen, H. K.; Banday, A. J.; Dickinson, C.; Ghosh, T.; Górski, K. M.; Lawrence, C. R.; Leahy, J. P.; Maino, D.; Reich, P.; Reich, W.
2017-01-01
We describe a simple but efficient method for deriving a consistent set of monopole and dipole corrections for multi-frequency sky map data sets, allowing robust parametric component separation with the same data set. The computational core of this method is linear regression between pairs of frequency maps, often called T-T plots. Individual contributions from monopole and dipole terms are determined by performing the regression locally in patches on the sky, while the degeneracy between different frequencies is lifted whenever the dominant foreground component exhibits a significant spatial spectral index variation. Based on this method, we present two different, but each internally consistent, sets of monopole and dipole coefficients for the nine-year WMAP, Planck 2013, SFD 100 μm, Haslam 408 MHz and Reich & Reich 1420 MHz maps. The two sets have been derived with different analysis assumptions and data selection, and provide an estimate of residual systematic uncertainties. In general, our values are in good agreement with previously published results. Among the most notable results are a relative dipole between the WMAP and Planck experiments of 10-15μK (depending on frequency), an estimate of the 408 MHz map monopole of 8.9 ± 1.3 K, and a non-zero dipole in the 1420 MHz map of 0.15 ± 0.03 K pointing towards Galactic coordinates (l,b) = (308°,-36°) ± 14°. These values represent the sum of any instrumental and data processing offsets, as well as any Galactic or extra-Galactic component that is spectrally uniform over the full sky.
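The computational core, a T-T plot over one sky patch, is a single linear regression between the pixel values of two frequency maps; the pixel values below are synthetic and the method's patch-wise bookkeeping and degeneracy-lifting are omitted:

```python
def tt_plot(map1, map2):
    """Regress map2 pixel values on map1 (a T-T plot): the intercept
    estimates the relative monopole offset, the slope the scaling of
    the dominant foreground between the two frequencies."""
    n = len(map1)
    mx, my = sum(map1) / n, sum(map2) / n
    slope = (sum((p - mx) * (q - my) for p, q in zip(map1, map2))
             / sum((p - mx) ** 2 for p in map1))
    return slope, my - slope * mx

# pixels from one sky patch at two frequencies (synthetic, offset 2.5)
map1 = [10.0, 12.0, 15.0, 20.0, 30.0]
map2 = [0.8 * p + 2.5 for p in map1]
slope, offset = tt_plot(map1, map2)
```

Repeating this fit patch by patch, and combining offsets across patches, is what separates a global monopole/dipole correction from local foreground scaling.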
Energy Technology Data Exchange (ETDEWEB)
Harlim, John, E-mail: jharlim@psu.edu [Department of Mathematics and Department of Meteorology, the Pennsylvania State University, University Park, PA 16802, United States (United States); Mahdi, Adam, E-mail: amahdi@ncsu.edu [Department of Mathematics, North Carolina State University, Raleigh, NC 27695 (United States); Majda, Andrew J., E-mail: jonjon@cims.nyu.edu [Department of Mathematics and Center for Atmosphere and Ocean Science, Courant Institute of Mathematical Sciences, New York University, New York, NY 10012 (United States)
2014-01-15
A central issue in contemporary science is the development of nonlinear data driven statistical–dynamical models for time series of noisy partial observations from nature or a complex model. It has been established recently that ad-hoc quadratic multi-level regression models can have finite-time blow-up of statistical solutions and/or pathological behavior of their invariant measure. Recently, a new class of physics constrained nonlinear regression models were developed to ameliorate this pathological behavior. Here a new finite ensemble Kalman filtering algorithm is developed for estimating the state, the linear and nonlinear model coefficients, the model and the observation noise covariances from available partial noisy observations of the state. Several stringent tests and applications of the method are developed here. In the most complex application, the perfect model has 57 degrees of freedom involving a zonal (east–west) jet, two topographic Rossby waves, and 54 nonlinearly interacting Rossby waves; the perfect model has significant non-Gaussian statistics in the zonal jet with blocked and unblocked regimes and a non-Gaussian skewed distribution due to interaction with the other 56 modes. We only observe the zonal jet contaminated by noise and apply the ensemble filter algorithm for estimation. Numerically, we find that a three dimensional nonlinear stochastic model with one level of memory mimics the statistical effect of the other 56 modes on the zonal jet in an accurate fashion, including the skew non-Gaussian distribution and autocorrelation decay. On the other hand, a similar stochastic model with zero memory levels fails to capture the crucial non-Gaussian behavior of the zonal jet from the perfect 57-mode model.
ANALYSIS OF TUITION GROWTH RATES BASED ON CLUSTERING AND REGRESSION MODELS
Directory of Open Access Journals (Sweden)
Long Cheng
2016-07-01
Tuition plays a significant role in determining whether a student can afford higher education, which is one of the major driving forces for country development and social prosperity. It is therefore necessary to fully understand which factors might affect tuition and how they affect it. However, many existing studies on the tuition growth rate either lack sufficient real data and proper quantitative models to support their conclusions, or focus on only a few factors that might affect the tuition growth rate, failing to make a comprehensive analysis. In this paper, we explore a wide variety of factors that might affect the tuition growth rate, using large amounts of authentic data and different quantitative methods such as clustering and regression models.
Joint Hacking and Latent Hazard Rate Estimation
Liu, Ziqi; Smola, Alexander J.; Soska, Kyle; Wang, Yu-Xiang; Zheng, Qinghua
2016-01-01
In this paper we describe an algorithm for predicting the websites at risk in a long-range hacking activity, while jointly inferring the provenance and evolution of vulnerabilities on websites over continuous time. Specifically, we use hazard regression with a time-varying additive hazard function parameterized in a generalized linear form. The activation coefficients on each feature are continuous-time functions constrained with a total variation penalty inspired by hacking campaigns. We show ...
Institute of Scientific and Technical Information of China (English)
Anonymous
2000-01-01
The authors derive laws of the iterated logarithm for a kernel estimator of the regression function based on directional data. The results are distribution-free in the sense that they hold for all distributions of the design variable.
Li, Ming; Zhang, Peidong; Leng, Jianxing
2016-03-01
This article presents an improved autocorrelation function (ACF) regression method for estimating the Hurst parameter of a time series with long-range dependence (LRD) by using golden section search (GSS). We show that the present method is substantially more efficient than the conventional ACF regression method of H estimation. Our research uses fractional Gaussian noise as a test case, but the method introduced is applicable to time series with LRD in general.
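The core of the method is a one-dimensional search for the H whose theoretical ACF best fits the empirical one. A minimal sketch of golden section search applied to ACF regression against the fractional-Gaussian-noise ACF follows; the lag count and search interval are illustrative assumptions, not the paper's settings.

```python
def fgn_acf(H, k):
    """Theoretical ACF of fractional Gaussian noise at integer lag k >= 1."""
    return 0.5 * ((k + 1)**(2*H) - 2 * k**(2*H) + (k - 1)**(2*H))

INVPHI = (5**0.5 - 1) / 2  # inverse golden ratio

def gss_minimize(f, a, b, tol=1e-6):
    """Golden section search for the minimum of a unimodal f on [a, b]."""
    c, d = b - INVPHI * (b - a), a + INVPHI * (b - a)
    while b - a > tol:
        if f(c) < f(d):
            b, d = d, c
            c = b - INVPHI * (b - a)
        else:
            a, c = c, d
            d = a + INVPHI * (b - a)
    return (a + b) / 2

def estimate_hurst(acf_hat):
    """Fit H in (0, 1) by least squares between empirical and theoretical ACF."""
    lags = range(1, len(acf_hat) + 1)
    sse = lambda H: sum((r - fgn_acf(H, k))**2 for r, k in zip(acf_hat, lags))
    return gss_minimize(sse, 0.01, 0.99)
```

Feeding the estimator the exact theoretical ACF for some H recovers that H, which is a useful sanity check before applying it to empirical autocorrelations.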
Estimating carbon and showing impacts of drought using satellite data in regression-tree models
Boyte, Stephen; Wylie, Bruce K.; Howard, Danny; Dahal, Devendra; Gilmanov, Tagir G.
2018-01-01
Integrating spatially explicit biogeophysical and remotely sensed data into regression-tree models enables the spatial extrapolation of training data over large geographic spaces, allowing a better understanding of broad-scale ecosystem processes. The current study presents annual gross primary production (GPP) and annual ecosystem respiration (RE) for 2000–2013 in several short-statured vegetation types using carbon flux data from towers that are located strategically across the conterminous United States (CONUS). We calculate carbon fluxes (annual net ecosystem production [NEP]) for each year in our study period, which includes 2012, when drought and higher-than-normal temperatures influenced vegetation productivity in large parts of the study area. We present and analyse carbon flux dynamics in the CONUS to better understand how drought affects GPP, RE, and NEP. Model accuracy metrics show strong correlation coefficients (r ≥ 94%) between training and estimated data for both GPP and RE. Overall, average annual GPP, RE, and NEP are relatively constant throughout the study period except during 2012, when almost 60% less carbon was sequestered than normal. These results allow us to conclude that this modelling method effectively estimates carbon dynamics through time and allows the exploration of impacts of meteorological anomalies and vegetation types on carbon dynamics.
Reddy, K. S.; Somasundharam, S.
2016-09-01
In this work, an inverse heat conduction problem (IHCP) involving the simultaneous estimation of the principal thermal conductivities (kxx, kyy, kzz) and specific heat capacity of orthotropic materials is solved by using a surrogate forward model. Uniformly distributed random samples for each unknown parameter are generated from the prior knowledge about these parameters, and the Finite Volume Method (FVM) is employed to solve the forward problem for the temperature distribution in space and time. A supervised machine learning technique, Gaussian process regression (GPR), is used to construct the surrogate forward model from the available temperature solutions and the randomly generated unknown parameter data. The Statistics and Machine Learning Toolbox available in MATLAB R2015b is used for this purpose. The robustness of the surrogate model constructed using GPR is examined by carrying out the parameter estimation for 100 new randomly generated test samples at a measurement error of ±0.3 K. The temperature measurement is obtained by adding random noise with zero mean and known standard deviation (σ = 0.1) to the FVM solution of the forward problem. The test results show that the Mean Percentage Deviation (MPD) of all test samples for all parameters is below 10%.
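The abstract relies on MATLAB's GPR implementation, but the underlying machinery is compact enough to sketch directly. Below is a minimal NumPy Gaussian process regression with an RBF kernel; the kernel length-scale, noise level and sine test function are illustrative assumptions, not the paper's surrogate setup.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, signal_var=1.0):
    """Squared-exponential (RBF) covariance between two 1-D input vectors."""
    sq = (a[:, None] - b[None, :])**2
    return signal_var * np.exp(-0.5 * sq / length_scale**2)

def gpr_predict(x_train, y_train, x_test, noise_var=1e-4, length_scale=1.0):
    """Posterior mean and pointwise std of a zero-mean GP regressor."""
    K = rbf_kernel(x_train, x_train, length_scale) + noise_var * np.eye(len(x_train))
    K_s = rbf_kernel(x_test, x_train, length_scale)
    alpha = np.linalg.solve(K, y_train)          # K^{-1} y
    mean = K_s @ alpha
    cov = rbf_kernel(x_test, x_test, length_scale) - K_s @ np.linalg.solve(K, K_s.T)
    return mean, np.sqrt(np.clip(np.diag(cov), 0.0, None))
```

Once such a surrogate is trained on (parameter, temperature) pairs, each inverse-problem evaluation replaces an expensive FVM solve with a cheap matrix-vector product, which is the point of the surrogate approach.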
Estimation of the laser cutting operating cost by support vector regression methodology
Jović, Srđan; Radović, Aleksandar; Šarkoćević, Živče; Petković, Dalibor; Alizamir, Meysam
2016-09-01
Laser cutting is a popular manufacturing process utilized to cut various types of materials economically. The operating cost is affected by laser power, cutting speed, assist gas pressure, nozzle diameter and focus point position, as well as the workpiece material. In this article, the process factors investigated were laser power, cutting speed, air pressure and focal point position. The aim of this work is to relate the operating cost to the process parameters mentioned above. CO2 laser cutting of stainless steel of medical grade AISI 316L has been investigated. The main goal was to analyze the operating cost through the laser power, cutting speed, air pressure, focal point position and material thickness. Since estimating the laser operating cost is a complex, non-linear task, soft computing optimization algorithms can be used. The intelligent soft computing scheme support vector regression (SVR) was implemented. The performance of the proposed estimator was confirmed with the simulation results. The SVR results are then compared with artificial neural networks and genetic programming. According to the results, a greater improvement in estimation accuracy can be achieved through SVR compared to other soft computing methodologies. The new optimization methods benefit from the soft computing capabilities of global optimization and multiobjective optimization, rather than choosing a starting point by trial and error and combining multiple criteria into a single criterion.
Towards universal hybrid star formation rate estimators
Boquien, M; Calzetti, D; Dale, D; Galametz, M; Sauvage, M; Croxall, K; Draine, B; Kirkpatrick, A; Kumari, N; Hunt, L; De Looze, I; Pellegrini, E; Relano, M; Smith, J-D; Tabatabaei, F
2016-01-01
To compute the SFR of galaxies from the rest-frame UV, it is essential to take into account the obscuration by dust. To do so, one of the most popular methods consists in combining the UV with the emission from the dust itself in the IR. Yet, different studies have derived different estimators, showing that no such hybrid estimator is truly universal. In this paper we aim to understand and quantify what physical processes drive the variations between different hybrid estimators. In doing so, we aim to derive new universal UV+IR hybrid estimators to correct the UV for dust attenuation, taking into account the intrinsic physical properties of galaxies. We use the CIGALE code to model the spatially-resolved FUV to FIR SED of eight nearby star-forming galaxies drawn from the KINGFISH sample. This allows us to determine their local physical properties, and in particular their UV attenuation, average SFR, average specific SFR (sSFR), and their stellar mass. We then examine how hybrid estimators depend on said p...
Energy Technology Data Exchange (ETDEWEB)
Luukkonen, A.; Korkealaakso, J.; Pitkaenen, P. [VTT Communities and Infrastructure, Espoo (Finland)
1997-11-01
Teollisuuden Voima Oy selected five investigation areas for preliminary site studies (1987-1992). The more detailed site investigation project, launched at the beginning of 1993 and presently supervised by Posiva Oy, is concentrated on three investigation areas. Romuvaara at Kuhmo is one of the present target areas, and the geochemical, structural and hydrological data used in this study are extracted from there. The aim of the study is to develop suitable methods for groundwater composition estimation based on a group of known hydrogeological variables. The input variables used are related to the host type of groundwater, hydrological conditions around the host location, mixing potentials between different types of groundwater, and minerals equilibrated with the groundwater. The output variables are electrical conductivity, Ca, Mg, Mn, Na, K, Fe, Cl, S, HS, SO₄, alkalinity, ³H, ¹⁴C, ¹³C, Al, Sr, F, Br and I concentrations, and pH of the groundwater. The methodology is to associate the known hydrogeological conditions (i.e. input variables) with the known water compositions (output variables), and to evaluate mathematical relations between these groups. Output estimations are done with two separate procedures: partial least squares regression on the principal components of the input variables, and training neural networks with input-output pairs. Coefficients of the linear equations and the trained networks are alternative methods for actual predictions. The quality of the output predictions is monitored with confidence limit estimations, evaluated from input variable covariances and output variances, and with charge balance calculations. Groundwater compositions in Romuvaara borehole KR10 are predicted at 10 metre intervals with both prediction methods. 46 refs.
Estimating earnings losses due to mental illness: a quantile regression approach.
Marcotte, Dave E; Wilcox-Gök, Virginia
2003-09-01
The ability of workers to remain productive and sustain earnings when afflicted with mental illness depends importantly on access to appropriate treatment and on flexibility and support from employers. In the United States there is substantial variation in access to health care and sick leave and other employment flexibilities across the earnings distribution. Consequently, a worker's ability to work and how much his/her earnings are impeded likely depend upon his/her position in the earnings distribution. Because of this, focusing on average earnings losses may provide insufficient information on the impact of mental illness in the labor market. In this paper, we examine the effects of mental illness on earnings by recognizing that effects could vary across the distribution of earnings. Using data from the National Comorbidity Survey, we employ a quantile regression estimator to identify the effects at key points in the earnings distribution. We find that earnings effects vary importantly across the distribution. While average effects are often not large, mental illness more commonly imposes earnings losses at the lower tail of the distribution, especially for women. In only one case do we find an illness to have negative effects across the distribution. Mental illness can have larger negative impacts on economic outcomes than previously estimated, even if those effects are not uniform. Consequently, researchers and policy makers alike should not be placated by findings that mean earnings effects are relatively small. Such estimates miss important features of how and where mental illness is associated with real economic losses for the ill.
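The key device in quantile regression is the asymmetric "pinball" (check) loss, whose minimizing constant is the sample τ-quantile rather than the mean. A minimal intercept-only illustration follows (toy numbers, not the paper's full regression on the National Comorbidity Survey):

```python
def pinball_loss(ys, q, tau):
    """Average check (pinball) loss of predicting the constant q at level tau."""
    return sum(max(tau * (y - q), (tau - 1) * (y - q)) for y in ys) / len(ys)

def best_constant(ys, tau):
    """Constant minimizing pinball loss; a minimizer always lies on a data point."""
    return min(ys, key=lambda q: pinball_loss(ys, q, tau))
```

With earnings = [1, 2, 3, 4, 100], best_constant(earnings, 0.5) is the median 3, while tau = 0.9 returns 100: the fit tracks the chosen part of the distribution, which is why quantile regression can reveal tail effects that mean regression averages away.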
Roşu, M. M.; Tarbă, C. I.; Neagu, C.
2016-11-01
The current models for inventory management are complementary, but together they offer a broad palette of elements for solving complex company problems when establishing the optimum economic order quantity for unfinished products, raw materials, goods, etc. The main objective of this paper is to elaborate an automated decision model for the calculation of the economic order quantity, taking into account price rates that decrease with the total order quantity. This model has two main objectives: first, to determine the ordering periodicity n or the order quantity q; second, to determine the stock levels (e.g., reorder point, safety stock). In this way we can answer two fundamental questions: How much must be ordered? When must it be ordered? In current practice, a business's relationships with its suppliers are based on quantity-dependent price rates. This means that suppliers may grant discounts from a certain level of quantities ordered. Thus, the unit price of the products is a variable which depends on the order size. The most important element for choosing the optimum economic order quantity is therefore the total ordering cost, which depends on the following elements: the average price per unit, the stock-holding cost, the ordering cost, etc.
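The textbook procedure for an economic order quantity under all-units quantity discounts can be sketched briefly; the demand, ordering cost, carrying rate and price breaks below are invented for illustration and are not from the paper.

```python
from math import sqrt, inf

def eoq_with_discounts(D, S, i, tiers):
    """All-units quantity-discount EOQ.

    D: annual demand, S: fixed cost per order, i: carrying rate (fraction of
    unit price per year), tiers: [(min_qty, unit_price), ...] ascending.
    Returns (order_quantity, unit_price, total_annual_cost)."""
    best = (None, None, inf)
    for idx, (q_min, price) in enumerate(tiers):
        q_max = tiers[idx + 1][0] - 1 if idx + 1 < len(tiers) else inf
        h = i * price                          # annual holding cost per unit
        q = sqrt(2 * D * S / h)                # unconstrained EOQ at this price
        q = min(max(q, max(q_min, 1)), q_max)  # clamp into the tier's range
        total = D / q * S + q / 2 * h + D * price  # order + hold + purchase
        if total < best[2]:
            best = (q, price, total)
    return best
```

For example, with D = 1200, S = 100, i = 0.2 and tiers [(0, 10.0), (300, 9.5)], the discounted tier wins with an order quantity of roughly 355 units, even though the base-price EOQ alone would be smaller.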
Hu, L; Zhang, Z G; Mouraux, A; Iannetti, G D
2015-05-01
Transient sensory, motor or cognitive events elicit not only phase-locked event-related potentials (ERPs) in the ongoing electroencephalogram (EEG), but also induce non-phase-locked modulations of ongoing EEG oscillations. These modulations can be detected when single-trial waveforms are analysed in the time-frequency domain, and consist of stimulus-induced decreases (event-related desynchronization, ERD) or increases (event-related synchronization, ERS) of synchrony in the activity of the underlying neuronal populations. ERD and ERS reflect changes in the parameters that control oscillations in neuronal networks and, depending on the frequency at which they occur, represent neuronal mechanisms involved in cortical activation, inhibition and binding. ERD and ERS are commonly estimated by averaging the time-frequency decomposition of single trials. However, their trial-to-trial variability, which can reflect physiologically important information, is lost by across-trial averaging. Here, we aim to (1) develop novel approaches to explore single-trial parameters (including latency, frequency and magnitude) of ERP/ERD/ERS; (2) disclose the relationship between estimated single-trial parameters and other experimental factors (e.g., perceived intensity). We found that (1) stimulus-elicited ERP/ERD/ERS can be correctly separated using principal component analysis (PCA) decomposition with Varimax rotation on the single-trial time-frequency distributions; (2) time-frequency multiple linear regression with dispersion term (TF-MLRd) enhances the signal-to-noise ratio of ERP/ERD/ERS in single trials, and provides an unbiased estimation of their latency, frequency, and magnitude at the single-trial level; (3) these estimates can be meaningfully correlated with each other and with other experimental factors at the single-trial level (e.g., perceived stimulus intensity and ERP magnitude). The methods described in this article allow exploring fully non-phase-locked stimulus-induced cortical ...
Estimation of OCDD degradation rate in soil
Institute of Scientific and Technical Information of China (English)
ZHAO Xing-ru; ZHENG Ming-hui; ZHANG Bing; QIAN Yong; XU Xiao-bai
2005-01-01
The current concentrations of polychlorinated dibenzo-p-dioxins and dibenzofurans (PCDD/Fs) were determined in soils contaminated with the Chinese technical product sodium pentachlorophenate (Na-PCP). The estimated half-life of octachlorodibenzo-p-dioxin (OCDD) was about 14 years in contaminated soils, based on the local historical record and a mass balance calculation over the past 43 years (1960-2003). The isomer profiles remained the same regardless of whether the soil came from a paddy field or a riverbank. The results indicated that congener-specific information is effective in estimating the fate of PCDD/Fs in contaminated soils.
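The half-life figure follows from first-order decay kinetics; a sketch of the calculation (the concentrations below are hypothetical, chosen only to show the arithmetic over the 43-year span):

```python
from math import log

def first_order_half_life(c0, ct, elapsed_years):
    """Half-life implied by a decline from c0 to ct over elapsed_years,
    assuming first-order (exponential) decay: c(t) = c0 * exp(-k*t)."""
    k = log(c0 / ct) / elapsed_years   # decay rate constant (1/yr)
    return log(2) / k
```

A concentration falling from 100 to 50 (arbitrary units) in 14 years gives a half-life of 14 years, and a fall from 100 to about 11.9 over 43 years implies roughly the same half-life, consistent with the time span used in the abstract.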
Nonparametric estimation for hazard rate monotonously decreasing system
Institute of Scientific and Technical Information of China (English)
Han Fengyan; Li Weisong
2005-01-01
Estimation of the density and hazard rate is very important in the reliability analysis of a system. In order to estimate the density and hazard rate of a system whose hazard rate is monotonically decreasing, a new nonparametric estimator is put forward. The estimator is based on the kernel function method and an optimization algorithm. Numerical experiments show that the method is accurate enough and can be used in many cases.
Macnab, Ying C
2009-04-30
This paper presents Bayesian multivariate disease mapping and ecological regression models that take into account errors in covariates. Bayesian hierarchical formulations of multivariate disease models and covariate measurement models, with related methods of estimation and inference, are developed as an integral part of a Bayesian disability adjusted life years (DALYs) methodology for the analysis of multivariate disease or injury data and associated ecological risk factors and for small area DALYs estimation, inference, and mapping. The methodology facilitates the estimation of multivariate small area disease and injury rates and associated risk effects, evaluation of DALYs and 'preventable' DALYs, and identification of regions to which disease or injury prevention resources may be directed to reduce DALYs. The methodology interfaces and intersects the Bayesian disease mapping methodology and the global burden of disease framework such that the impact of disease, injury, and risk factors on population health may be evaluated to inform community health, health needs, and priority considerations for disease and injury prevention. A burden of injury study on road traffic accidents in local health areas in British Columbia, Canada, is presented as an illustrative example.
Estimating the Rate of Occurrence of Renal Stones in Astronauts
Myers, J.; Goodenow, D.; Gokoglu, S.; Kassemi, M.
2016-01-01
Changes in urine chemistry, during and post flight, potentially increase the risk of renal stones in astronauts. Although much is known about the effects of space flight on urine chemistry, no inflight incidence of renal stones in US astronauts exists, and the question "How much does this risk change with space flight?" remains difficult to quantify accurately. In this discussion, we tackle this question utilizing a combination of deterministic and probabilistic modeling that implements the physics behind free stone growth and agglomeration, speciation of urine chemistry, and published observations of population renal stone incidences to estimate changes in the rate of renal stone presentation. The modeling process utilizes a Population Balance Equation-based model, developed in the companion IWS abstract by Kassemi et al. (2016), to evaluate the maximum growth and agglomeration potential from a specified set of urine chemistry values. Changes in renal stone occurrence rates are obtained from this model in a probabilistic simulation that interrogates the range of possible urine chemistries using Monte Carlo techniques. Subsequently, each randomly sampled urine chemistry undergoes speciation analysis using the well-established Joint Expert Speciation System (JESS) code to calculate critical values, such as ionic strength and relative supersaturation. The Kassemi model utilizes this information to predict the mean and maximum stone size. We close the assessment loop by using a transfer function that estimates the rate of stone formation by combining the relative supersaturation with both the mean and maximum free stone growth sizes. The transfer function is established by a simulation analysis which combines population stone formation rates and Poisson regression. Training this transfer function requires using the output of the aforementioned assessment steps with inputs from known non-stone-former and known stone-former urine chemistries. Established in a Monte Carlo ...
Roelfs, David J; Shor, Eran; Blank, Aharon; Schwartz, Joseph E
2015-05-01
Individual-level unemployment has been consistently linked to poor health and higher mortality, but some scholars have suggested that the negative effect of job loss may be lower during times and in places where aggregate unemployment rates are high. We review three logics associated with this moderation hypothesis: health selection, social isolation, and unemployment stigma. We then test whether aggregate unemployment rates moderate the individual-level association between unemployment and all-cause mortality. We use six meta-regression models (each using a different measure of the aggregate unemployment rate) based on 62 relative all-cause mortality risk estimates from 36 studies (from 15 nations). We find that the magnitude of the individual-level unemployment-mortality association is approximately the same during periods of high and low aggregate-level unemployment. Model coefficients (exponentiated) were 1.01 for the crude unemployment rate (P = .27), 0.94 for the change in unemployment rate from the previous year (P = .46), 1.01 for the deviation of the unemployment rate from the 5-year running average (P = .87), 1.01 for the deviation of the unemployment rate from the 10-year running average (P = .73), 1.01 for the deviation of the unemployment rate from the overall average (measured as a continuous variable; P = .61), and showed no variation across unemployment levels when the deviation of the unemployment rate from the overall average was measured categorically. Heterogeneity between studies was significant (P < ...); we find no evidence that the mortality consequences of individual unemployment experiences change when macroeconomic conditions change. Efforts to ameliorate the negative social and economic consequences of unemployment should continue to focus on the individual and should be maintained regardless of periodic changes in macroeconomic conditions. Copyright © 2015 Elsevier Inc. All rights reserved.
Estimation of transition probabilities of credit ratings
Peng, Gan Chew; Hin, Pooi Ah
2015-12-01
The present research is based on the quarterly credit ratings of ten companies over 15 years, taken from the database of the Taiwan Economic Journal. The components of the vector m_i = (m_{i1}, m_{i2}, ..., m_{i10}) denote the credit ratings of the ten companies in the i-th quarter. The vector m_{i+1} in the next quarter is modelled as dependent on the vector m_i via a conditional distribution which is derived from a 20-dimensional power-normal mixture distribution. The transition probability P_{kl}(i, j) of getting m_{i+1,j} = l given that m_{i,j} = k is then computed from the conditional distribution. It is found that the variation of the transition probability P_{kl}(i, j) as i varies gives an indication of the possible transition of the credit rating of the j-th company in the near future.
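For contrast with the power-normal mixture approach, the baseline frequency estimate of a rating transition matrix is easy to sketch; the toy rating sequence below is invented, not data from the Taiwan Economic Journal.

```python
from collections import defaultdict

def transition_matrix(seq):
    """Empirical (maximum-likelihood) transition probabilities P(next | current)
    from one observed sequence of rating states."""
    counts = defaultdict(lambda: defaultdict(int))
    for current, nxt in zip(seq, seq[1:]):
        counts[current][nxt] += 1
    return {s: {t: n / sum(d.values()) for t, n in d.items()}
            for s, d in counts.items()}
```

For the sequence "AABABB" the transitions from A are A→A once and A→B twice, so P(B | A) = 2/3; this simple estimator ignores the cross-company dependence that the mixture model is designed to capture.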
Lillehammer, Marie; Odegård, Jørgen; Meuwissen, Theo H E
2009-03-19
The combination of a sire model and a random regression term describing genotype by environment interactions may lead to biased estimates of genetic variance components because of heterogeneous residual variance. In order to test different models, simulated data with genotype by environment interactions, and dairy cattle data assumed to contain such interactions, were analyzed. Two animal models were compared to four sire models. Models differed in their ability to handle heterogeneous variance from different sources. Including an individual effect with a (co)variance matrix restricted to three times the sire (co)variance matrix permitted the modeling of the additive genetic variance not covered by the sire effect. This made the ability of sire models to handle heterogeneous genetic variance approximately equivalent to that of animal models. When residual variance was heterogeneous, a different approach to account for the heterogeneity of variance was needed, for example when using dairy cattle data in order to prevent overestimation of genetic heterogeneity of variance. Including environmental classes can be used to account for heterogeneous residual variance.
Directory of Open Access Journals (Sweden)
E. Brito-Rocha
Individual leaf area (LA) is a key variable in studies of tree ecophysiology because it directly influences light interception, photosynthesis and evapotranspiration of adult trees and seedlings. We analyzed the leaf dimensions (length, L, and width, W) of seedlings and adults of seven Neotropical rainforest tree species (Brosimum rubescens, Manilkara maxima, Pouteria caimito, Pouteria torta, Psidium cattleyanum, Symphonia globulifera and Tabebuia stenocalyx) to test the feasibility of single regression models for estimating the LA of both adults and seedlings. In southern Bahia, Brazil, a first set of data was collected between March and October 2012. Of the seven species analyzed, only two (P. cattleyanum and T. stenocalyx) had very similar relationships between LW and LA in both ontogenetic stages. For these two species, a second set of data was collected in August 2014 in order to validate the single models encompassing adults and seedlings. Our results show the possibility of developing models for predicting individual leaf area that encompass different ontogenetic stages of tropical tree species. The development of these models was more dependent on the species than on the differences in leaf size between seedlings and adults.
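Models of this kind typically regress LA on the product L×W. A minimal least-squares sketch follows; the measurements are synthetic, not the study's data.

```python
def fit_la_model(lw_products, leaf_areas):
    """Ordinary least squares for LA = b0 + b1*(L*W); returns (b0, b1)."""
    n = len(lw_products)
    mx = sum(lw_products) / n
    my = sum(leaf_areas) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(lw_products, leaf_areas))
    sxx = sum((x - mx)**2 for x in lw_products)
    b1 = sxy / sxx          # slope on the L*W product
    return my - b1 * mx, b1  # intercept, slope
```

When seedlings and adults share the same slope, as reported for P. cattleyanum and T. stenocalyx, a single model fitted on the pooled data serves both ontogenetic stages.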
Zhang, Y J; Xue, F X; Bai, Z P
2017-03-06
The impact of maternal air pollution exposure on offspring health has received much attention. Precise and feasible exposure estimation is particularly important for clarifying exposure-response relationships and reducing heterogeneity among studies. Temporally-adjusted land use regression (LUR) models are exposure assessment methods developed in recent years that have the advantage of high spatial-temporal resolution. Studies on the health effects of outdoor air pollution exposure during pregnancy have increasingly been carried out using this model. In China, research applying LUR models has mostly remained at the model construction stage, and findings from related epidemiological studies have rarely been reported. In this paper, the sources of heterogeneity and research progress of meta-analyses on the associations between air pollution and adverse pregnancy outcomes are analyzed. The characteristics of temporally-adjusted LUR models are introduced. The current epidemiological studies on adverse pregnancy outcomes that applied this model are systematically summarized. Recommendations for the development and application of LUR models in China are presented. This will encourage the implementation of more valid exposure predictions during pregnancy in large-scale epidemiological studies on the health effects of air pollution in China.
Use of binary logistic regression technique with MODIS data to estimate wild fire risk
Fan, Hong; Di, Liping; Yang, Wenli; Bonnlander, Brian; Li, Xiaoyan
2007-11-01
Many forest fires occur across the globe each year, destroying life and property and strongly impacting ecosystems. In recent years, wildland fires and altered fire disturbance regimes have become a significant management and science problem affecting ecosystems and the wildland/urban interface across the United States and globally. In this paper, we discuss the estimation of 504 probability models for forecasting fire risk for 14 fuel types, 12 months, and one day/week/month in advance, which use 19 years of historical fire data in addition to meteorological and vegetation variables. MODIS land products are utilized as a major data source, and binary logistic regression was adopted to estimate fire probability. To better model the change in fire risk with the transition of seasons, several spatial and temporal stratification strategies were applied. To explore the possibility of real-time prediction, the MATLAB distributed computing toolbox was used to accelerate the prediction. Finally, this study evaluates and validates the predictions against collected ground truth. The validation results indicate that these fire risk models achieve nearly 70% prediction accuracy and that MODIS data are a viable data source for implementing near-real-time fire risk prediction.
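The classifier behind such models is plain binary logistic regression fitted by maximum likelihood. A self-contained gradient-descent sketch follows; the tiny one-feature "fire occurrence" data set, learning rate and step count are illustrative, not the paper's 504-model setup.

```python
import numpy as np

def fit_logistic(X, y, lr=0.1, steps=2000):
    """Fit binary logistic regression by gradient descent on the log-loss."""
    X1 = np.hstack([np.ones((len(X), 1)), X])  # prepend intercept column
    w = np.zeros(X1.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-X1 @ w))      # predicted probabilities
        w -= lr * X1.T @ (p - y) / len(y)      # average log-loss gradient
    return w

def predict_proba(X, w):
    """Probability that the label is 1 for each row of X."""
    X1 = np.hstack([np.ones((len(X), 1)), X])
    return 1.0 / (1.0 + np.exp(-X1 @ w))
```

Thresholding the predicted probability at 0.5 gives the pixel-level fire/no-fire classification; in practice one feature column per meteorological or vegetation covariate would be added.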
A land use regression model for estimating the NO2 concentration in Shanghai, China.
Meng, Xia; Chen, Li; Cai, Jing; Zou, Bin; Wu, Chang-Fu; Fu, Qingyan; Zhang, Yan; Liu, Yang; Kan, Haidong
2015-02-01
Limited by data accessibility, few exposure assessment studies of air pollutants have been conducted in China. There is an urgent need to develop models for assessing the intra-urban concentrations of key air pollutants in Chinese cities. In this study, a land use regression (LUR) model was established to estimate NO2 during 2008-2011 in Shanghai. Four predictor variables were retained in the final LUR model: the length of major roads within a 2-km buffer around the monitoring sites, the number of industrial sources (excluding power plants) within a 10-km buffer, the agricultural land area within a 5-km buffer, and population counts. The model R² and the leave-one-out cross-validation (LOOCV) R² of the NO2 LUR model were 0.82 and 0.75, respectively. The prediction surface of the NO2 concentration based on the LUR model was of high spatial resolution. The 1-year predicted concentrations based on the ratio and difference methods fitted the measured NO2 concentrations well. The LUR model of NO2 outperformed the kriging and inverse distance weighted (IDW) interpolation methods in Shanghai. Our findings suggest that the LUR model may provide a cost-effective method of air pollution exposure assessment in a developing country.
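The LOOCV R² reported for LUR models need not be computed by refitting the model n times: for linear regression the leave-one-out residuals follow exactly from the hat matrix of a single fit. A sketch with toy data follows (the identity is exact, so the shortcut matches brute-force refitting):

```python
import numpy as np

def loocv_residuals(X, y):
    """Exact leave-one-out residuals of OLS via the hat-matrix shortcut
    e_loo_i = e_i / (1 - h_ii)."""
    X1 = np.hstack([np.ones((len(y), 1)), X])
    H = X1 @ np.linalg.solve(X1.T @ X1, X1.T)   # hat matrix
    resid = y - H @ y
    return resid / (1.0 - np.diag(H))

def loocv_r2(X, y):
    """Cross-validated R^2: 1 - PRESS / total sum of squares."""
    e = loocv_residuals(X, y)
    return 1.0 - np.sum(e**2) / np.sum((y - y.mean())**2)
```

This makes the cross-validated R² of an LUR model essentially free to compute, even when the predictor set is searched over many candidate buffers.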
Risk factor selection in rate making: EM adaptive LASSO for zero-inflated poisson regression models.
Tang, Yanlin; Xiang, Liya; Zhu, Zhongyi
2014-06-01
Risk factor selection is very important in the insurance industry, as it supports precise rate making and the study of the features of high-quality insureds. Zero-inflated data are common in insurance, such as claim frequency data, and zero-inflation makes the selection of risk factors quite difficult. In this article, we propose a new risk factor selection approach, EM adaptive LASSO, for a zero-inflated Poisson regression model, which combines the EM algorithm and the adaptive LASSO penalty. Under some regularity conditions, we show that, with probability approaching 1, important factors are selected and redundant factors are excluded. We investigate the finite sample performance of the proposed method through a simulation study and an analysis of car insurance data from the SAS Enterprise Miner database.
19 CFR 159.38 - Rates for estimated duties.
2010-04-01
19 Customs Duties 2 2010-04-01 2010-04-01 false Rates for estimated duties. 159.38 Section 159.38 Customs Duties U.S. CUSTOMS AND BORDER PROTECTION, DEPARTMENT OF HOMELAND SECURITY; DEPARTMENT OF... For purposes of calculating estimated duties, the port director shall use the rate or rates...
Outlier Detection in Regression Using an Iterated One-Step Approximation to the Huber-Skip Estimator
DEFF Research Database (Denmark)
Johansen, Søren; Nielsen, Bent
2013-01-01
In regression we can delete outliers based upon a preliminary estimator and reestimate the parameters by least squares based upon the retained observations. We study the properties of an iteratively defined sequence of estimators based on this idea. We relate the sequence to the Huber-skip estimator. We provide a stochastic recursion equation for the estimation error in terms of a kernel, the previous estimation error and a uniformly small error term. The main contribution is the analysis of the solution of the stochastic recursion equation as a fixed point, and the results...
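The iteration studied can be sketched concretely: fit OLS, drop observations whose residuals exceed a cut-off times a robust scale, refit on the retained set, and repeat until the estimate stabilizes. The cut-off, the MAD-based scale, and the synthetic data below are illustrative assumptions, not the paper's exact specification.

```python
import numpy as np

def iterated_huber_skip(X, y, cut=2.5, max_iter=20):
    """Iterated one-step Huber-skip: alternate outlier deletion and OLS refit."""
    X1 = np.hstack([np.ones((len(y), 1)), X])
    beta = np.linalg.lstsq(X1, y, rcond=None)[0]     # preliminary full-sample OLS
    for _ in range(max_iter):
        r = y - X1 @ beta
        scale = 1.4826 * np.median(np.abs(r - np.median(r)))  # MAD-based scale
        keep = np.abs(r) <= cut * scale               # skip large residuals
        beta_new = np.linalg.lstsq(X1[keep], y[keep], rcond=None)[0]
        if np.allclose(beta_new, beta):               # fixed point reached
            break
        beta = beta_new
    return beta
```

On data with one gross outlier, the first refit already moves the estimate close to the clean-data fit, and subsequent iterations converge to a fixed point, mirroring the fixed-point analysis in the abstract.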
Liou, Jyun-you; Smith, Elliot H.; Bateman, Lisa M.; McKhann, Guy M., II; Goodman, Robert R.; Greger, Bradley; Davis, Tyler S.; Kellis, Spencer S.; House, Paul A.; Schevon, Catherine A.
2017-08-01
Objective. Epileptiform discharges, an electrophysiological hallmark of seizures, can propagate across cortical tissue in a manner similar to traveling waves. Recent work has focused attention on the origination and propagation patterns of these discharges, yielding important clues to their source location and mechanism of travel. However, systematic studies of methods for measuring propagation are lacking. Approach. We analyzed epileptiform discharges in microelectrode array recordings of human seizures. The array records multiunit activity and local field potentials at 400 micron spatial resolution, from a small cortical site free of obstructions. We evaluated several computationally efficient statistical methods for calculating traveling wave velocity, benchmarking them to analyses of associated neuronal burst firing. Main results. Over 90% of discharges met statistical criteria for propagation across the sampled cortical territory. Detection rate, direction and speed estimates derived from a multiunit estimator were compared to four field potential-based estimators: negative peak, maximum descent, high gamma power, and cross-correlation. Interestingly, the methods that were computationally simplest and most efficient (negative peak and maximal descent) offer non-inferior results in predicting neuronal traveling wave velocities compared to the other two, more complex methods. Moreover, the negative peak and maximal descent methods proved to be more robust against reduced spatial sampling challenges. Using least absolute deviation in place of least squares error minimized the impact of outliers, and reduced the discrepancies between local field potential-based and multiunit estimators. Significance. Our findings suggest that ictal epileptiform discharges typically take the form of exceptionally strong, rapidly traveling waves, with propagation detectable across millimeter distances. The sequential activation of neurons in space can be inferred from clinically
Simulated data supporting inbreeding rate estimates from incomplete pedigrees
Miller, Mark P.
2017-01-01
This data release includes: (1) The data from simulations used to illustrate the behavior of inbreeding rate estimators. Estimating inbreeding rates is particularly difficult for natural populations because parentage information for many individuals may be incomplete. Our analyses illustrate the behavior of a newly described inbreeding rate estimator that outperforms previously described approaches in the scientific literature. (2) Python source code ("analytical expressions", "computer simulations", and "empricial data set") that can be used to analyze these data.
Roberts, Steven; Martin, Michael
Most investigations of the adverse health effects of multiple air pollutants analyse the time series involved by simultaneously entering the multiple pollutants into a Poisson log-linear model. Concerns have been raised about this type of analysis, and it has been stated that new methodology or models should be developed for investigating the adverse health effects of multiple air pollutants. In this paper, we introduce the use of the lasso for this purpose and compare its statistical properties to those of ridge regression and the Poisson log-linear model. Ridge regression has been used in time series analyses on the adverse health effects of multiple air pollutants but its properties for this purpose have not been investigated. A series of simulation studies was used to compare the performance of the lasso, ridge regression, and the Poisson log-linear model. In these simulations, realistic mortality time series were generated with known air pollution mortality effects permitting the performance of the three models to be compared. Both the lasso and ridge regression produced more accurate estimates of the adverse health effects of the multiple air pollutants than those produced using the Poisson log-linear model. This increase in accuracy came at the expense of increased bias. Ridge regression produced more accurate estimates than the lasso, but the lasso produced more interpretable models. The lasso and ridge regression offer a flexible way of obtaining more accurate estimation of pollutant effects than that provided by the standard Poisson log-linear model.
Wong, Vivian C.; Steiner, Peter M.; Cook, Thomas D.
2013-01-01
In a traditional regression-discontinuity design (RDD), units are assigned to treatment on the basis of a cutoff score and a continuous assignment variable. The treatment effect is measured at a single cutoff location along the assignment variable. This article introduces the multivariate regression-discontinuity design (MRDD), where multiple…
Moment-based estimation of smooth transition regression models with endogenous variables
W.D. Areosa (Waldyr Dutra); M.J. McAleer (Michael); M.C. Medeiros (Marcelo)
2008-01-01
Nonlinear regression models have been widely used in practice for a variety of time series and cross-section datasets. For purposes of analyzing univariate and multivariate time series data, in particular, Smooth Transition Regression (STR) models have been shown to be very useful for re...
Gibbons, A.; Thomas, B. F.; Famiglietti, J. S.
2014-12-01
Global groundwater dependence is likely to increase with continued population growth and climate-driven freshwater redistribution. Recent groundwater quantity studies have estimated large-scale aquifer depletion rates using monthly water storage variations from NASA's Gravity Recovery and Climate Experiment (GRACE) mission. These innovative approaches currently fail to evaluate groundwater quality, integral to assess the availability of potable groundwater resources. We present multivariate relationships to predict total dissolved solid (TDS) concentrations as a function of GRACE-derived variations in water table depth, dominant land use, and other physical parameters in two important aquifer systems in the United States: the High Plains aquifer and the Central Valley aquifer. Model evaluations were performed using goodness of fit procedures and cross validation to identify general model forms. Results of this work demonstrate the potential to characterize global groundwater potability using remote sensing.
Sup-norm convergence rate and sign concentration property of Lasso and Dantzig estimators
Lounici, Karim
2008-01-01
We derive the l∞ convergence rate simultaneously for Lasso and Dantzig estimators in a high-dimensional linear regression model under a mutual coherence assumption on the Gram matrix of the design and two different assumptions on the noise: Gaussian noise and general noise with finite variance. Then we prove that simultaneously the thresholded Lasso and Dantzig estimators with a proper choice of the threshold enjoy a sign concentration property provided that the non-zero components of the tar...
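The thresholded Lasso's sign concentration property is easiest to see in the orthonormal-design case, where the Lasso reduces to coordinate-wise soft-thresholding of the OLS coefficients (a minimal sketch; the coefficients and threshold below are made up):

```python
def soft_threshold(v, t):
    """Per-coordinate Lasso solution under an orthonormal design."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def sign(v):
    return (v > 0) - (v < 0)

# noisy OLS coefficients around a sparse target (2, 0, -3, 0)
ols_coef = [2.1, 0.3, -2.8, -0.4]
thr = 0.5                                  # threshold ~ noise level (assumed)
lasso_coef = [soft_threshold(v, thr) for v in ols_coef]
signs = [sign(v) for v in lasso_coef]
```

With the threshold above the noise level, the estimated sign pattern (+, 0, -, 0) matches the sparse target, which is the sign concentration phenomenon the abstract describes.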
Data-driven fuel consumption estimation: A multivariate adaptive regression spline approach
Energy Technology Data Exchange (ETDEWEB)
Chen, Yuche; Zhu, Lei; Gonder, Jeffrey; Young, Stanley; Walkowicz, Kevin
2017-10-01
Providing guidance and information to drivers to help them make fuel-efficient route choices remains an important and effective strategy in the near term to reduce fuel consumption from the transportation sector. One key component in implementing this strategy is a fuel-consumption estimation model. In this paper, we developed a mesoscopic fuel consumption estimation model that can be implemented into an eco-routing system. Our proposed model presents a framework that utilizes large-scale, real-world driving data, clusters road links by free-flow speed and fits one statistical model for each cluster. This model includes predictor variables that were rarely or never considered before, such as free-flow speed and number of lanes. We applied the model to a real-world driving data set based on a global positioning system travel survey in the Philadelphia-Camden-Trenton metropolitan area. Results from the statistical analyses indicate that the independent variables we chose influence the fuel consumption rates of vehicles, but the magnitude and direction of the influences depend on the type of road link, specifically the free-flow speed of the link. A statistical diagnostic is conducted to ensure the validity of the models and results. Although the real-world driving data we used to develop the statistical relationships are specific to one region, the framework we developed can be easily adjusted and used to explore the fuel consumption relationship in other regions.
Estimating forest conversion rates with annual forest inventory data
Paul C. Van Deusen; Francis A. Roesch
2009-01-01
The rate of land-use conversion from forest to nonforest or natural forest to forest plantation is of interest for forest certification purposes and also as part of the process of assessing forest sustainability. Conversion rates can be estimated from remeasured inventory plots in general, but the emphasis here is on annual inventory data. A new estimator is proposed...
WAVELET BASED SPECTRAL CORRELATION METHOD FOR DPSK CHIP RATE ESTIMATION
Institute of Scientific and Technical Information of China (English)
Li Yingxiang; Xiao Xianci; Tai Hengming
2004-01-01
A wavelet-based spectral correlation algorithm to detect and estimate BPSK signal chip rate is proposed. Simulation results show that the proposed method can correctly estimate the BPSK signal chip rate, which may be corrupted by the quadratic characteristics of the spectral correlation function, in a low SNR environment.
Baghi, Quentin; Bergé, Joël; Christophe, Bruno; Touboul, Pierre; Rodrigues, Manuel
2016-01-01
We present a Gaussian regression method for time series with missing data and stationary residuals of unknown power spectral density (PSD). The missing data are efficiently estimated by their conditional expectation as in universal Kriging, based on the circulant approximation of the complete data covariance. After initialization with an autoregressive fit of the noise, a few iterations of estimation/reconstruction steps are performed until convergence of the regression and PSD estimates, in a way similar to the expectation-conditional-maximization algorithm. The estimation can be performed for an arbitrary PSD provided that it is sufficiently smooth. The algorithm is developed in the framework of the MICROSCOPE space mission whose goal is to test the weak equivalence principle (WEP) with a precision of $10^{-15}$. We show by numerical simulations that the developed method allows us to meet three major requirements: to maintain the targeted precision of the WEP test in spite of the loss of data, to calculate a...
National Aeronautics and Space Administration — The application of the Bayesian theory of managing uncertainty and complexity to regression and classification in the form of Relevance Vector Machine (RVM), and to...
Ni, Karl; Nguyen, Truong Q.
2007-09-01
A stochastic framework combining classification with nonlinear regression is proposed. The performance evaluation is tested in terms of a patch-based image superresolution problem. Assuming a multi-variate Gaussian mixture model for the distribution of all image content, unsupervised probabilistic clustering via expectation maximization allows segmentation of the domain. Subsequently, for the regression component of the algorithm, a modified support vector regression provides per class nonlinear regression while appropriately weighting the relevancy of training points during training. Relevancy is determined by probabilistic values from clustering. Support vector machines, an established convex optimization problem, provide the foundation for additional formulations of learning the kernel matrix via semi-definite programming problems and quadratically constrained quadratic programming problems.
Bertrand-Krajewski, J L
2004-01-01
In order to replace traditional sampling and analysis techniques, turbidimeters can be used to estimate TSS concentration in sewers, by means of sensor- and site-specific empirical equations established by linear regression of on-site turbidity values T with TSS concentrations C measured in corresponding samples. As the ordinary least-squares method is not able to account for measurement uncertainties in both the T and C variables, an appropriate regression method is used to resolve this difficulty and to correctly evaluate the uncertainty in TSS concentrations estimated from measured turbidity. The regression method is described, including detailed calculations of the variances and covariance of the regression parameters. An example of application is given for a calibrated turbidimeter used in a combined sewer system, with data collected during three dry weather days. In order to show how the established regression could be used, an independent 24-hour dry weather turbidity data series recorded at a 2 min time interval is transformed into estimated TSS concentrations and compared to TSS concentrations measured in samples. The comparison is satisfactory and suggests that turbidity measurements could replace traditional samples. Further developments, including wet weather periods and other types of sensors, are suggested.
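One standard regression method that accounts for errors in both variables is Deming regression, shown below with an assumed error-variance ratio delta (a sketch for illustration only; the paper's exact estimator, uncertainty calculations, and data are not reproduced here):

```python
import math

def deming(x, y, delta=1.0):
    """Deming regression: errors in both variables, variance ratio delta.
    Returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    # closed-form slope of the errors-in-variables line
    b = (syy - delta * sxx + math.sqrt((syy - delta * sxx) ** 2
                                       + 4 * delta * sxy ** 2)) / (2 * sxy)
    return my - b * mx, b

# hypothetical turbidity (T) vs TSS concentration (C) pairs on an exact line
turbidity = [10.0, 20.0, 30.0, 40.0, 50.0]
tss = [2.0 * t + 5.0 for t in turbidity]
a, b = deming(turbidity, tss)
```

On noiseless data the estimator recovers the generating line C = 5 + 2T exactly; with real paired measurements, delta would be set from the known measurement variances of T and C.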
Adaptive Estimation of Intravascular Shear Rate Based on Parameter Optimization
Nitta, Naotaka; Takeda, Naoto
2008-05-01
The relationships between the intravascular wall shear stress, controlled by flow dynamics, and the progress of arteriosclerosis plaque have been clarified by various studies. Since the shear stress is determined by the viscosity coefficient and shear rate, both factors must be estimated accurately. In this paper, an adaptive method for improving the accuracy of quantitative shear rate estimation was investigated. First, the parameter dependence of the estimated shear rate was investigated in terms of the differential window width and the number of averaged velocity profiles based on simulation and experimental data, and then the shear rate calculation was optimized. The optimized result revealed that the proposed adaptive method of shear rate estimation was effective for improving the accuracy of shear rate calculation.
Maas, Iris L; Nolte, Sandra; Walter, Otto B; Berger, Thomas; Hautzinger, Martin; Hohagen, Fritz; Lutz, Wolfgang; Meyer, Björn; Schröder, Johanna; Späth, Christina; Klein, Jan Philipp; Moritz, Steffen; Rose, Matthias
2017-02-01
To compare treatment effect estimates obtained from a regression discontinuity (RD) design with results from an actual randomized controlled trial (RCT). Data from an RCT (EVIDENT), which studied the effect of an Internet intervention on depressive symptoms measured with the Patient Health Questionnaire (PHQ-9), were used to perform an RD analysis, in which treatment allocation was determined by a cutoff value at baseline (PHQ-9 = 10). A linear regression model was fitted to the data, selecting participants above the cutoff who had received the intervention (n = 317) and control participants below the cutoff (n = 187). Outcome was PHQ-9 sum score 12 weeks after baseline. Robustness of the effect estimate was studied; the estimate was compared with the RCT treatment effect. The final regression model showed a regression coefficient of -2.29 [95% confidence interval (CI): -3.72 to -0.85] compared with a treatment effect found in the RCT of -1.57 (95% CI: -2.07 to -1.07). Although the estimates obtained from the two designs are not equal, their confidence intervals overlap, suggesting that an RD design can be a valid alternative for RCTs. This finding is particularly important for situations where an RCT may not be feasible or ethical, as is often the case in clinical research settings. Copyright © 2016 Elsevier Inc. All rights reserved.
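The RD estimate at the cutoff can be sketched as the gap between two fitted regression lines evaluated at the cutoff (noiseless hypothetical data; the true effect is set to -2.29 purely to echo the abstract, and the simple two-line estimator below is a generic sketch, not the study's fitted model):

```python
def ols(x, y):
    """Closed-form OLS for y = a + b*x; returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) \
        / sum((xi - mx) ** 2 for xi in x)
    return my - b * mx, b

def rd_effect(score, outcome, cutoff):
    """Treatment effect at the cutoff: gap between the two fitted lines."""
    below = [(s, o) for s, o in zip(score, outcome) if s < cutoff]
    above = [(s, o) for s, o in zip(score, outcome) if s >= cutoff]
    a0, b0 = ols([p[0] for p in below], [p[1] for p in below])
    a1, b1 = ols([p[0] for p in above], [p[1] for p in above])
    return (a1 + b1 * cutoff) - (a0 + b0 * cutoff)

# PHQ-9-like baseline scores 0..20; treatment assigned for score >= 10
score = [float(s) for s in range(21)]
outcome = [0.8 * s + (-2.29 if s >= 10 else 0.0) for s in score]
effect = rd_effect(score, outcome, 10.0)
```

In practice the two fits would use noisy outcomes within a bandwidth around the cutoff; the noiseless version here just makes the discontinuity-as-effect logic explicit.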
Gordovil-Merino, Amalia; Guardia-Olmos, Joan; Pero-Cebollero, Maribel
2012-01-01
In this paper, we used simulations to compare the performance of classical and Bayesian estimations in logistic regression models using small samples. In the performed simulations, conditions were varied, including the type of relationship between independent and dependent variable values (i.e., unrelated and related values), the type of variable…
Cho, M.A.; Skidmore, A.K.; Corsi, F.; Wieren, van S.E.; Sobhan, I.
2007-01-01
The main objective was to determine whether partial least squares (PLS) regression improves grass/herb biomass estimation when compared with hyperspectral indices, that is normalised difference vegetation index (NDVI) and red-edge position (REP). To achieve this objective, fresh green grass/herb bio
CSIR Research Space (South Africa)
Bencherif, H
2010-09-01
Full Text Available The present paper reports on the use of a multi-regression model adapted at Reunion University for temperature and ozone trend estimates. Depending on the location of the observing site, the studied geophysical signal is broken down into the form of a sum...
Chi, Olivia L.; Dow, Aaron W.
2014-01-01
This study focuses on how matching, a method of preprocessing data prior to estimation and analysis, can be used to reduce imbalance between treatment and control group in regression discontinuity design. To examine the effects of academic probation on student outcomes, researchers replicate and expand upon research conducted by Lindo, Sanders,…
Lee, Sang-Hyun; McKeen, Stuart A.; Sailor, David J.
2014-10-01
A statistical regression method is presented for estimating hourly anthropogenic heat flux (AHF) using an anthropogenic pollutant emission inventory for use in mesoscale meteorological and air-quality modeling. Based on bottom-up AHF estimated from detailed energy consumption data and anthropogenic pollutant emissions of carbon monoxide (CO) and nitrogen oxides (NOx) in the US National Emission Inventory year 2005 (NEI-2005), a robust regression relation between the AHF and the pollutant emissions is obtained for Houston. This relation is a combination of two power functions (Y = aXb) relating CO and NOx emissions to AHF, giving a determinant coefficient (R2) of 0.72. The AHF for Houston derived from the regression relation has high temporal (R = 0.91) and spatial (R = 0.83) correlations with the bottom-up AHF. Hourly AHF for the whole US in summer is estimated by applying the regression relation to the NEI-2005 summer pollutant emissions with a high spatial resolution of 4 km. The summer daily mean AHFs range from 10 to 40 W m-2 on a 4 × 4 km2 grid scale, with maximum heat fluxes of 50-140 W m-2 for major US cities. The AHFs derived from the regression relations between the bottom-up AHF and either CO or NOx emissions show a small difference of less than 5% (4.7 W m-2) in city-scale daily mean AHF, and similar R2 statistics, compared to results from their combination. Thus, emissions of either species can be used to estimate AHF in US cities. An hourly AHF inventory at 4 × 4 km2 resolution over the entire US based on the combined regression is derived and made publicly available for use in mesoscale numerical modeling.
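A power function Y = aX^b of the kind used above can be fitted by ordinary linear regression after a log-log transform (a minimal sketch; the emission values and coefficients below are invented for illustration, not taken from NEI-2005):

```python
import math

def fit_power(x, y):
    """Fit Y = a * X^b by linear regression of log Y on log X.
    Returns (a, b)."""
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n = len(x)
    mx, my = sum(lx) / n, sum(ly) / n
    b = sum((u - mx) * (v - my) for u, v in zip(lx, ly)) \
        / sum((u - mx) ** 2 for u in lx)
    return math.exp(my - b * mx), b

co = [1.0, 2.0, 5.0, 10.0, 50.0]           # hypothetical CO emission values
ahf = [3.0 * v ** 0.7 for v in co]         # exact power law for illustration
a, b = fit_power(co, ahf)
```

On exact power-law data the log-log fit recovers a = 3 and b = 0.7; with real emissions data, the scatter around the line is what the robust-regression step in the study handles.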
Asquith, William H.; Thompson, David B.
2008-01-01
The U.S. Geological Survey, in cooperation with the Texas Department of Transportation and in partnership with Texas Tech University, investigated a refinement of the regional regression method and developed alternative equations for estimation of peak-streamflow frequency for undeveloped watersheds in Texas. A common model for estimation of peak-streamflow frequency is based on the regional regression method. The current (2008) regional regression equations for 11 regions of Texas are based on log10 transformations of all regression variables (drainage area, main-channel slope, and watershed shape). Exclusive use of log10 transformation does not fully linearize the relations between the variables. As a result, some systematic bias remains in the current equations. The bias results in overestimation of peak streamflow for both the smallest and largest watersheds. The bias increases with increasing recurrence interval. The primary source of the bias is the discernible curvilinear relation in log10 space between peak streamflow and drainage area. Bias is demonstrated by selected residual plots with superimposed LOWESS trend lines. To address the bias, a statistical framework based on minimization of the PRESS statistic through power transformation of drainage area is described and implemented, and the resulting regression equations are reported. The equations derived from PRESS minimization have smaller PRESS statistics and residual standard errors than the log10-exclusive equations. Selected residual plots for the PRESS-minimized equations are presented to demonstrate that systematic bias in regional regression equations for peak-streamflow frequency estimation in Texas can be reduced. Because the overall error is similar to the error associated with previous equations and because the bias is reduced, the PRESS-minimized equations reported here provide alternative equations for peak-streamflow frequency estimation.
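The PRESS statistic minimized above can be computed without refitting, using the leave-one-out leverage identity for linear regression (a generic sketch with made-up data, not the study's streamflow model):

```python
def press_simple_linear(x, y):
    """PRESS for simple linear regression via the leverage shortcut:
    PRESS = sum_i (e_i / (1 - h_ii))^2, with no explicit refitting."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    a = my - b * mx
    press = 0.0
    for xi, yi in zip(x, y):
        e = yi - (a + b * xi)                 # ordinary residual
        h = 1.0 / n + (xi - mx) ** 2 / sxx    # leverage of this observation
        press += (e / (1.0 - h)) ** 2
    return press

def loo_press(x, y):
    """Brute-force check: leave each point out, refit, predict it."""
    total = 0.0
    for i in range(len(x)):
        xs, ys = x[:i] + x[i + 1:], y[:i] + y[i + 1:]
        n = len(xs)
        mx, my = sum(xs) / n, sum(ys) / n
        b = sum((u - mx) * (v - my) for u, v in zip(xs, ys)) \
            / sum((u - mx) ** 2 for u in xs)
        a = my - b * mx
        total += (y[i] - (a + b * x[i])) ** 2
    return total

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.2, 1.9, 3.4, 3.6, 5.1, 5.8]
p1 = press_simple_linear(x, y)
```

The shortcut and the brute-force leave-one-out loop agree to rounding error, which is what makes PRESS cheap enough to minimize over candidate power transformations.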
Estimation of central shapes of error distributions in linear regression problems
National Research Council Canada - National Science Library
Lai, P Y; Lee, Stephen M. S
2013-01-01
.... Both methods are motivated by the well-known Hill estimator, which has been extensively studied in the related problem of estimating tail indices, but substitute reciprocals of small Lp residuals...
Prolonged decay of molecular rate estimates for metazoan mitochondrial DNA
Directory of Open Access Journals (Sweden)
Martyna Molak
2015-03-01
Full Text Available Evolutionary timescales can be estimated from genetic data using the molecular clock, often calibrated by fossil or geological evidence. However, estimates of molecular rates in mitochondrial DNA appear to scale negatively with the age of the clock calibration. Although such a pattern has been observed in a limited range of data sets, it has not been studied on a large scale in metazoans. In addition, there is uncertainty over the temporal extent of the time-dependent pattern in rate estimates. Here we present a meta-analysis of 239 rate estimates from metazoans, representing a range of timescales and taxonomic groups. We found evidence of time-dependent rates in both coding and non-coding mitochondrial markers, in every group of animals that we studied. The negative relationship between the estimated rate and time persisted across a much wider range of calibration times than previously suggested. This indicates that, over long time frames, purifying selection gives way to mutational saturation as the main driver of time-dependent biases in rate estimates. The results of our study stress the importance of accounting for time-dependent biases in estimating mitochondrial rates regardless of the timescale over which they are inferred.
Zhongyuan Geng; Xue Zhai
2015-01-01
This paper applies the Panel Smooth Transition Regression (PSTR) model to simulate the effects of the interest rate and reserve requirement ratio on bank risk in China. The results reveal the nonlinearity embedded in the interest rate, reserve requirement ratio, and bank risk nexus. Both the interest rate and reserve requirement ratio exert a positive impact on bank risk for the low regime and a negative impact for the high regime. The interest rate performs a significant effect while the res...
Penalized estimation for competing risks regression with applications to high-dimensional covariates
DEFF Research Database (Denmark)
Ambrogi, Federico; Scheike, Thomas H.
2016-01-01
High-dimensional regression has become an increasingly important topic for many research fields. For example, biomedical research generates an increasing amount of data to characterize patients' bio-profiles (e.g. from a genomic high-throughput assay). The increasing complexity... and better response to the therapy. Although in the last years the number of contributions for coping with high- and ultra-high-dimensional data in standard survival analysis has increased (Witten and Tibshirani, 2010. Survival analysis with high-dimensional covariates. Statistical Methods in Medical...)... to the binomial model in the package timereg (Scheike and Martinussen, 2006. Dynamic Regression Models for Survival Data. New York: Springer), available through CRAN.
SYMBOL-RATE ESTIMATION BASED ON TEMPLATE MATCHING RULES
Institute of Scientific and Technical Information of China (English)
Ren Chunhui; Wei Ping; Xiao Xianci
2006-01-01
For non-cooperative communication, the symbol-rate estimation of digital communication signals is an important problem to be solved. In this letter, a new algorithm for the symbol-rate estimation of single-tone digitally modulated signals (i.e. MPSK/QAM) is proposed. First, a section of the received signal is taken as the template, and then the signal is matched section-wise, making use of the signal's self-similarity. This yields a signal containing the symbol-transition information, from which the symbol rate can be estimated by the DFT (Discrete Fourier Transform). The validity of the new method has been verified by experiments.
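The idea of recovering a symbol rate from a spectral line can be sketched as follows: rectifying the sample-to-sample difference of a piecewise-constant signal produces impulses at symbol transitions, and a DFT locates their repetition rate (a toy illustration with an alternating symbol pattern; this is not the letter's template-matching algorithm):

```python
import cmath

def symbol_rate_dft(x, fs):
    """Estimate symbol rate from the spectrum of the rectified
    circular sample-to-sample difference, which is impulsive at
    symbol transitions."""
    n = len(x)
    d = [abs(x[(i + 1) % n] - x[i]) for i in range(n)]
    # naive DFT magnitudes for bins 0..n/2 (O(n^2) is fine for a sketch)
    mags = [abs(sum(d[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t in range(n)))
            for k in range(n // 2 + 1)]
    peak = max(mags[1:])
    # fundamental = lowest bin whose magnitude is close to the peak
    k = next(i for i in range(1, n // 2 + 1) if mags[i] > 0.5 * peak)
    return fs * k / n

fs, sps = 8000.0, 8                         # 8 samples per symbol -> 1000 Baud
symbols = [1.0 if i % 2 == 0 else -1.0 for i in range(8)]  # alternating levels
x = [symbols[i // sps] for i in range(len(symbols) * sps)]
rate = symbol_rate_dft(x, fs)
```

With transitions every 8 samples at fs = 8000 Hz, the spectral line sits at bin N/8 and the estimated rate is 1000 Baud; with random symbols and noise, the line weakens, which is where more robust methods like the one above come in.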
Huiliang, Wang; Zening, Wu; Caihong, Hu; Xinzhong, Du
2015-09-01
Nonpoint source (NPS) pollution is considered the main reason for water quality deterioration; thus, quantifying NPS loads reliably is the key to implementing watershed management practices. In this study, water quality and NPS loads from a watershed with limited data availability were studied in a mountainous area in China. Instantaneous water discharge was measured through the velocity-area method, and samples were taken for water quality analysis on both flood and nonflood days in 2010. The streamflow simulated by the Hydrological Simulation Program-Fortran (HSPF) from 1995 to 2013 and a regression model were used to estimate total annual loads of various water quality parameters. The concentrations of total phosphorus (TP) and total nitrogen (TN) were much higher during the flood seasons, but the concentrations of ammonia nitrogen (NH3-N) and nitrate nitrogen (NO3-N) were lower during the flood seasons. Nevertheless, only the TP concentration was positively correlated with the flow rate. The fluctuation of the annual load from this watershed was significant. Statistical results indicated the significant contribution of pollutant fluxes during flood seasons to annual fluxes. The loads of TP, TN, NH3-N, and NO3-N in the flood seasons accounted for 58-85%, 60-82%, 63-88%, and 64-81% of the total annual loads, respectively. This study presented a new method for estimating water and NPS loads in watersheds with limited data availability, which simplified data collection for watershed modeling and overcame the scale limitations of field experiment methods.
Wang, D Z; Wang, C; Shen, C F; Zhang, Y; Zhang, H; Song, G D; Xue, X D; Xu, Z L; Zhang, S; Jiang, G H
2017-05-10
We described the time trend of acute myocardial infarction (AMI) incidence in Tianjin from 1999 to 2013 with the Cochran-Armitage trend (CAT) test and linear regression analysis, and the results were compared. Based on the actual population, the CAT test had much stronger statistical power than linear regression analysis for both the overall incidence trend and the age-specific incidence trends (Cochran-Armitage trend P value < linear regression P value). The statistical power of the CAT test decreased, while the result of the linear regression analysis remained the same, when the population size was reduced by 100 times and the AMI incidence rate remained unchanged. The two statistical methods have their advantages and disadvantages. It is necessary to choose the statistical method according to the fitting degree of the data, or to comprehensively analyze the results of both methods.
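The Cochran-Armitage trend statistic used above has a simple closed form (a sketch with hypothetical counts, not the Tianjin data):

```python
import math

def cochran_armitage_z(scores, n, r):
    """Cochran-Armitage trend test Z statistic for binomial proportions
    across ordered groups: scores = group scores, n = group sizes,
    r = event counts per group."""
    N, R = sum(n), sum(r)
    p = R / N
    t_stat = sum(t * (ri - ni * p) for t, ni, ri in zip(scores, n, r))
    var = p * (1 - p) * (sum(ni * t * t for ni, t in zip(n, scores))
                         - sum(ni * t for ni, t in zip(n, scores)) ** 2 / N)
    return t_stat / math.sqrt(var)

# hypothetical rising incidence across three ordered time periods
z = cochran_armitage_z(scores=[0, 1, 2], n=[100, 100, 100], r=[10, 20, 30])
```

For these counts the statistic is about 3.54, well beyond the 1.96 two-sided threshold, illustrating the test's sensitivity to a monotone trend in proportions.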
Musa, Rosliza; Ali, Zalila; Baharum, Adam; Nor, Norlida Mohd
2017-08-01
The linear regression model assumes that all random error components are identically and independently distributed with constant variance. Hence, each data point provides equally precise information about the deterministic part of the total variation. In other words, the standard deviations of the error terms are constant over all values of the predictor variables. When the assumption of constant variance is violated, the ordinary least squares estimator of the regression coefficients loses its property of minimum variance in the class of linear unbiased estimators. Weighted least squares estimation is often used to maximize the efficiency of parameter estimation. A procedure that treats all of the data equally would give less precisely measured points more influence than they should have and highly precise points too little influence. Optimizing the weighted fitting criterion to find the parameter estimates allows the weights to determine the contribution of each observation to the final parameter estimates. This study used a polynomial model with weighted least squares estimation to investigate the paddy production of different paddy lots based on paddy cultivation characteristics and environmental characteristics in the area of Kedah and Perlis. The results indicated that the factors affecting paddy production are the mixture fertilizer application cycle, average temperature, the squared effect of average rainfall, the squared effect of pest and disease, the interaction between acreage and amount of mixture fertilizer, the interaction between paddy variety and NPK fertilizer application cycle, and the interaction between pest and disease and NPK fertilizer application cycle.
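Weighted least squares for a straight line has the closed form sketched below (toy data; the weights, taken as inverse error variances, are assumptions for illustration, not the study's polynomial model):

```python
def wls_line(x, y, w):
    """Weighted least squares for y = a + b*x with weights w.
    Returns (intercept, slope)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    b = (sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
         / sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x)))
    return my - b * mx, b

# heteroscedastic toy data: later points are noisier, so they get lower weight
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.0, 8.5, 9.0]
w = [4.0, 4.0, 4.0, 1.0, 1.0]   # assumed weights ~ 1 / error variance
a, b = wls_line(x, y, w)
```

Down-weighting the noisy points pulls the fit toward the precisely measured ones, which is exactly the influence-balancing behavior the abstract describes.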
Directory of Open Access Journals (Sweden)
Ristya Widi Endah Yani
2008-12-01
Full Text Available Background: Bootstrap is a computer simulation-based method that provides estimation accuracy in estimating inferential statistical parameters. Purpose: This article describes a research using secondary data (n = 30) aimed to elucidate the bootstrap method as the estimator of the linear regression test based on the computer programs MINITAB 13, SPSS 13, and MacroMINITAB. Methods: The bootstrap regression method determines the β̂ and Ŷ values from OLS (ordinary least squares), computes the residuals εi = Yi − Ŷi, sets the number of bootstrap repetitions B, draws a sample of size n with replacement from the εi to obtain ε(i), computes Yi = Ŷi + ε(i), and obtains the β̂ value from the bootstrap sample at the i-th vector. If the number of repetitions is less than B, the procedure returns to drawing n samples with replacement from the εi. Otherwise, the "bootstrap" β̂ is determined as the average β̂ value over the B bootstrap samples taken. Result: The result is similar to the linear regression equation obtained with the OLS method (α = 5%). The resulting regression equation for caries was caries = 1.90 + 2.02 (OHI-S), indicating that every one-unit increase of OHI-S will result in a caries increase of 2.02 units. Conclusion: This was conducted with B = 10,500 and 10 iterations.
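The residual-resampling procedure in the Methods section can be sketched as follows (hypothetical OHI-S/caries values chosen to loosely echo the reported equation; B is reduced from 10,500 for brevity, and this is an illustrative sketch rather than the article's MINITAB/SPSS implementation):

```python
import random

def ols(x, y):
    """Closed-form OLS for y = a + b*x; returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) \
        / sum((xi - mx) ** 2 for xi in x)
    return my - b * mx, b

def residual_bootstrap(x, y, B=1000, seed=0):
    """Residual bootstrap for simple linear regression: resample OLS
    residuals with replacement, rebuild y, refit, average the B estimates."""
    rng = random.Random(seed)
    a, b = ols(x, y)
    fitted = [a + b * xi for xi in x]
    res = [yi - fi for yi, fi in zip(y, fitted)]
    intercepts, slopes = [], []
    for _ in range(B):
        e_star = [rng.choice(res) for _ in x]            # resample residuals
        y_star = [fi + ei for fi, ei in zip(fitted, e_star)]
        a_s, b_s = ols(x, y_star)
        intercepts.append(a_s)
        slopes.append(b_s)
    return sum(intercepts) / B, sum(slopes) / B

ohis = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
caries = [2.8, 4.1, 5.2, 5.8, 7.0, 7.6, 9.3, 9.8]
a_boot, b_boot = residual_bootstrap(ohis, caries)
```

As the abstract reports, the bootstrap averages land very close to the plain OLS fit (here roughly intercept 1.98, slope 1.99); the bootstrap's added value is the spread of the B estimates, which gives an accuracy measure.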
Propensity Score Estimation with Data Mining Techniques: Alternatives to Logistic Regression
Keller, Bryan S. B.; Kim, Jee-Seon; Steiner, Peter M.
2013-01-01
Propensity score analysis (PSA) is a methodological technique which may correct for selection bias in a quasi-experiment by modeling the selection process using observed covariates. Because logistic regression is well understood by researchers in a variety of fields and easy to implement in a number of popular software packages, it has…
Shobo, Yetty; Wong, Jen D.; Bell, Angie
2014-01-01
Regression discontinuity (RD), an "as good as randomized," research design is increasingly prominent in education research in recent years; the design gets eligible quasi-experimental designs as close as possible to experimental designs by using a stated threshold on a continuous baseline variable to assign individuals to a…
2014-09-01
driving simulation and ecologically valid subject pool to which the simple linear regression algorithm was applied.
Wong, Vivian C.; Steiner, Peter M.; Cook, Thomas D.
2009-01-01
This paper introduces a generalization of the regression-discontinuity design (RDD). Traditionally, RDD is considered in a two-dimensional framework, with a single assignment variable and cutoff. Treatment effects are measured at a single location along the assignment variable. However, this represents a specialized (and straight-forward)…
Wong, Vivian C.; Steiner, Peter M.; Cook, Thomas D.
2012-01-01
In a traditional regression-discontinuity design (RDD), units are assigned to treatment and comparison conditions solely on the basis of a single cutoff score on a continuous assignment variable. The discontinuity in the functional form of the outcome at the cutoff represents the treatment effect, or the average treatment effect at the cutoff.…
Data on individual daily feed intake, bi-weekly BW, and carcass composition were obtained on 1,212 crossbred steers, in Cycle VII of the Germplasm Evaluation Project at the U.S. Meat Animal Research Center. Within animal regressions of cumulative feed intake and BW on linear and quadratic days on fe...
Directory of Open Access Journals (Sweden)
Kuo-Hsin Tseng
2015-04-01
Full Text Available Accurate estimation of lithium-ion battery life is essential to assure the reliable operation of the energy supply system. This study develops regression models for battery prognostics using statistical methods. The resultant regression models can not only monitor a battery’s degradation trend but also accurately predict its remaining useful life (RUL at an early stage. Three sets of test data are employed in the training stage for regression models. Another set of data is then applied to the regression models for validation. The fully discharged voltage (Vdis and internal resistance (R are adopted as aging parameters in two different mathematical models, with polynomial and exponential functions. A particle swarm optimization (PSO process is applied to search for optimal coefficients of the regression models. Simulations indicate that the regression models using Vdis and R as aging parameters can build a real state of health profile more accurately than those using cycle number, N. The Monte Carlo method is further employed to make the models adaptive. The subsequent results, however, show that this results in an insignificant improvement of the battery life prediction. A reasonable speculation is that the PSO process already yields the major model coefficients.
A Regression-based User Calibration Framework for Real-time Gaze Estimation
Arar, Nuri Murat; Gao, Hua; Thiran, Jean-Philippe
2016-01-01
Eye movements play a very significant role in human computer interaction (HCI) as they are natural and fast, and contain important cues for human cognitive state and visual attention. Over the last two decades, many techniques have been proposed to accurately estimate the gaze. Among these, video-based remote eye trackers have attracted much interest since they enable non-intrusive gaze estimation. To achieve high estimation accuracies for remote systems, user calibration is inevitable in ord...
Directory of Open Access Journals (Sweden)
T. Nataraja Moorthy
2015-05-01
Full Text Available The human foot has been studied for a variety of reasons, i.e., for forensic as well as non-forensic purposes by anatomists, forensic scientists, anthropologists, physicians, podiatrists, and numerous other groups. An aspect of human identification that has received scant attention from forensic anthropologists is the study of human feet and the footprints made by the feet. The present study, conducted during 2013-2014, aimed to derive population-specific regression equations to estimate stature from the footprint anthropometry of indigenous adult Bidayuhs in the east of Malaysia. The study sample consisted of 480 bilateral footprints collected using a footprint kit from 240 Bidayuhs (120 males and 120 females), who consented to taking part in the study. Their ages ranged from 18 to 70 years. Stature was measured using a portable body meter device (SECA model 206). The data were analyzed using PASW Statistics version 20. In this investigation, good results were obtained in terms of the correlation coefficient (R) between stature and various footprint measurements and in the regression analysis for estimating stature. The R values showed a positive and statistically significant (p < 0.001) relationship between the two parameters. The correlation coefficients in the pooled sample (0.861–0.882) were comparatively higher than those of individual males (0.762–0.795) and females (0.722–0.765). This study provided regression equations to estimate stature from footprints in the Bidayuh population. The results showed that the regression equations without sex indicators performed significantly better than models with sex indicators. The regression equations derived for a pooled sample can be used to estimate stature even when the sex of the footprint is unknown, as in real crime scenes.
Directory of Open Access Journals (Sweden)
Piyawat Wuttichaikitcharoen
2014-08-01
Full Text Available Predicting sediment yield is necessary for good land and water management in any river basin. However, sometimes, the sediment data is either not available or is sparse, which renders estimating sediment yield a daunting task. The present study investigates the factors influencing suspended sediment yield using principal component analysis (PCA). Additionally, the regression relationships for estimating suspended sediment yield, based on the selected key factors from the PCA, are developed. The PCA shows six components of key factors that can explain at least up to 86.7% of the variation of all variables. The regression models show that basin size, channel network characteristics, land use, basin steepness and rainfall distribution are the key factors affecting sediment yield. The validation of regression relationships for estimating suspended sediment yield shows the error of estimation ranging from −55% to +315% and −59% to +259% for suspended sediment yield and for area-specific suspended sediment yield, respectively. The proposed relationships may be considered useful for predicting suspended sediment yield in ungauged basins of Northern Thailand that have geologic, climatic and hydrologic conditions similar to the study area.
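The PCA screening step described above (keep enough components to explain a target share of variance, here 86.7%) can be sketched with numpy alone. The basin descriptors below are random placeholders, not the study's variables; only the mechanics of the screening are illustrated.

```python
import numpy as np

# Hypothetical basin descriptors: rows = basins, columns = candidate factors.
rng = np.random.default_rng(2)
data = rng.normal(size=(25, 8))

# Standardize, then perform PCA via SVD of the standardized matrix.
Z = (data - data.mean(axis=0)) / data.std(axis=0)
U, s, Vt = np.linalg.svd(Z, full_matrices=False)
explained = s**2 / np.sum(s**2)          # variance share per component

# Number of leading components needed to reach the 86.7% target used
# in the paper's screening.
cum = np.cumsum(explained)
n_keep = int(np.searchsorted(cum, 0.867)) + 1
print(explained.round(3), n_keep)
```

The retained component loadings (rows of `Vt`) would then point to the key factors carried into the regression stage.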
Selection of the Linear Regression Model According to the Parameter Estimation
Institute of Scientific and Technical Information of China (English)
Anonymous
2000-01-01
In this paper, based on the theory of parameter estimation, we give a selection method and, in the sense of a good property of the parameter estimation, we argue that it is very reasonable. Moreover, we offer a calculation method for the selection statistic and an applied example.
Simplifying cardiovascular risk estimation using resting heart rate.
LENUS (Irish Health Repository)
Cooney, Marie Therese
2010-09-01
Elevated resting heart rate (RHR) is a known, independent cardiovascular (CV) risk factor, but is not included in risk estimation systems, including Systematic COronary Risk Evaluation (SCORE). We aimed to derive risk estimation systems including RHR as an extra variable and assess the value of this addition.
Identification and Estimation of Exchange Rate Models with Unobservable Fundamentals
Chambers, M.J.; McCrorie, J.R.
2004-01-01
This paper is concerned with issues of model specification, identification, and estimation in exchange rate models with unobservable fundamentals. We show that the model estimated by Gardeazabal, Regúlez and Vázquez (International Economic Review, 1997) is not identified and demonstrate how to specify
Sub-pixel estimation of tree cover and bare surface densities using regression tree analysis
Directory of Open Access Journals (Sweden)
Carlos Augusto Zangrando Toneli
2011-09-01
Full Text Available Sub-pixel analysis is capable of generating continuous fields, which represent the spatial variability of certain thematic classes. The aim of this work was to develop numerical models to represent the variability of tree cover and bare surfaces within the study area. This research was conducted in the riparian buffer within a watershed of the São Francisco River in the North of Minas Gerais, Brazil. IKONOS and Landsat TM imagery were used with the GUIDE algorithm to construct the models. The results were two index images derived with regression trees for the entire study area, one representing tree cover and the other representing bare surface. The use of non-parametric and non-linear regression tree models presented satisfactory results to characterize wetland, deciduous and savanna patterns of forest formation.
Maximum likelihood estimation for Cox's regression model under nested case-control sampling
DEFF Research Database (Denmark)
Scheike, Thomas; Juul, Anders
2004-01-01
Nested case-control sampling is designed to reduce the costs of large cohort studies. It is important to estimate the parameters of interest as efficiently as possible. We present a new maximum likelihood estimator (MLE) for nested case-control sampling in the context of Cox's proportional hazards model. The MLE is computed by the EM-algorithm, which is easy to implement in the proportional hazards setting. Standard errors are estimated by a numerical profile likelihood approach based on EM aided differentiation. The work was motivated by a nested case-control study that hypothesized that insulin...
Estimating the normal background rate of species extinction.
De Vos, Jurriaan M; Joppa, Lucas N; Gittleman, John L; Stephens, Patrick R; Pimm, Stuart L
2015-04-01
A key measure of humanity's global impact is by how much it has increased species extinction rates. Familiar statements are that these are 100-1000 times pre-human or background extinction levels. Estimating recent rates is straightforward, but establishing a background rate for comparison is not. Previous researchers chose an approximate benchmark of 1 extinction per million species per year (E/MSY). We explored disparate lines of evidence that suggest a substantially lower estimate. Fossil data yield direct estimates of extinction rates, but they are temporally coarse, mostly limited to marine hard-bodied taxa, and generally involve genera not species. Based on these data, typical background loss is 0.01 genera per million genera per year. Molecular phylogenies are available for more taxa and ecosystems, but it is debated whether they can be used to estimate separately speciation and extinction rates. We selected data to address known concerns and used them to determine median extinction estimates from statistical distributions of probable values for terrestrial plants and animals. We then created simulations to explore effects of violating model assumptions. Finally, we compiled estimates of diversification-the difference between speciation and extinction rates for different taxa. Median estimates of extinction rates ranged from 0.023 to 0.135 E/MSY. Simulation results suggested over- and under-estimation of extinction from individual phylogenies partially canceled each other out when large sets of phylogenies were analyzed. There was no evidence for recent and widespread pre-human overall declines in diversity. This implies that average extinction rates are less than average diversification rates. Median diversification rates were 0.05-0.2 new species per million species per year. On the basis of these results, we concluded that typical rates of background extinction may be closer to 0.1 E/MSY. Thus, current extinction rates are 1,000 times higher than natural
Borodachev, S. M.
2016-06-01
The simple derivation of recursive least squares (RLS) method equations is given as special case of Kalman filter estimation of a constant system state under changing observation conditions. A numerical example illustrates application of RLS to multicollinearity problem.
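The Kalman-filter view of RLS mentioned above can be sketched directly: a constant coefficient vector is the "state", and each new observation updates the estimate and its covariance. This is an illustrative numpy implementation on synthetic data, not the paper's numerical example.

```python
import numpy as np

def rls_update(theta, P, x, y):
    """One recursive-least-squares step (Kalman filter with constant state)."""
    Px = P @ x
    gain = Px / (1.0 + x @ Px)            # Kalman gain for this observation
    theta = theta + gain * (y - x @ theta)  # correct estimate by the innovation
    P = P - np.outer(gain, Px)            # shrink the covariance
    return theta, P

# Synthetic stream of observations y = x' beta + noise.
rng = np.random.default_rng(3)
true_beta = np.array([2.0, -1.0])
theta = np.zeros(2)
P = np.eye(2) * 1000.0                    # large initial uncertainty
for _ in range(500):
    x = rng.normal(size=2)
    y = x @ true_beta + rng.normal(0, 0.1)
    theta, P = rls_update(theta, P, x, y)
print(theta)
```

After enough observations the recursive estimate matches the batch least-squares solution; the recursion avoids refitting from scratch as data arrive.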
Ettinger, Susanne; Mounaud, Loïc; Magill, Christina; Yao-Lafourcade, Anne-Françoise; Thouret, Jean-Claude; Manville, Vern; Negulescu, Caterina; Zuccaro, Giulio; De Gregorio, Daniela; Nardone, Stefano; Uchuchoque, Juan Alexis Luque; Arguedas, Anita; Macedo, Luisa; Manrique Llerena, Nélida
2016-10-01
bivariate analyses were applied to better characterize each vulnerability parameter. Multiple corresponding analyses revealed strong relationships between the "Distance to channel or bridges", "Structural building type", "Building footprint" and the observed damage. Logistic regression enabled quantification of the contribution of each explanatory parameter to potential damage, and determination of the significant parameters that express the damage susceptibility of a building. The model was applied 200 times on different calibration and validation data sets in order to examine performance. Results show that 90% of these tests have a success rate of more than 67%. Probabilities (at building scale) of experiencing different damage levels during a future event similar to the 8 February 2013 flash flood are the major outcomes of this study.
Strand, L. D.; Mcnamara, R. P.
1976-01-01
The feasibility of a system capable of rapidly and directly measuring the low-frequency (motor characteristics length bulk mode) combustion response characteristics of solid propellants has been investigated. The system consists of a variable frequency oscillatory driver device coupled with an improved version of the JPL microwave propellant regression rate measurement system. The ratio of the normalized regression rate and pressure amplitudes and their relative phase are measured as a function of varying pressure level and frequency. Test results with a well-characterized PBAN-AP propellant formulation were found to compare favorably with the results of more conventional stability measurement techniques.
Directory of Open Access Journals (Sweden)
Lujin Hu
2016-08-01
Full Text Available Heavy air pollution, especially fine particulate matter (PM2.5), poses serious challenges to environmental sustainability in Beijing. Epidemiological studies and the identification of measures for preventing serious air pollution both require accurate PM2.5 spatial distribution data. Land use regression (LUR) models are promising for estimating the spatial distribution of PM2.5 at a high spatial resolution. However, typical LUR models have a limited sampling point explanation rate (SPER; i.e., the ratio of sampling points with reasonable predicted concentrations to the total number of sampling points) and limited accuracy. Hence, self-adaptive revised LUR models are proposed in this paper for improving the SPER and accuracy of typical LUR models. The self-adaptive revised LUR model combines a typical LUR model with self-adaptive LUR model groups. The typical LUR model was used to estimate the PM2.5 concentrations, and the self-adaptive LUR model groups were constructed for all of the sampling points removed from the typical LUR model because they were beyond the prediction data range, which was from 60% of the minimum observation to 120% of the maximum observation. The final results were analyzed using three methods, including an accuracy analysis, and were compared with typical LUR model results and the spatial variations in Beijing. The accuracy satisfied the demands of the analysis, and the accuracies at the different monitoring sites indicated spatial variations in the accuracy of the self-adaptive revised LUR model. The accuracy was high in the central area and low in suburban areas. The comparison analysis showed that the self-adaptive LUR model increased the SPER from 75% to 90% and increased the accuracy (based on the root-mean-square error) from 20.643 μg/m3 to 17.443 μg/m3 for the PM2.5 concentrations during the winter of 2014 in Beijing. The spatial variation analysis for Beijing showed that the PM2.5 concentrations were low in the north
Estimation of the Dose and Dose Rate Effectiveness Factor
Chappell, L.; Cucinotta, F. A.
2013-01-01
Current models to estimate radiation risk use the Life Span Study (LSS) cohort that received high doses and high dose rates of radiation. Transferring risks from these high dose rates to the low doses and dose rates received by astronauts in space is a source of uncertainty in our risk calculations. The solid cancer models recommended by BEIR VII [1], UNSCEAR [2], and Preston et al [3] are fitted adequately by a linear dose response model, which implies that risks at low doses and dose rates would be estimated the same as at high doses and dose rates. However, animal and cell experiments imply there should be curvature in the dose response curve for tumor induction. Furthermore, animal experiments that directly compare acute to chronic exposures show lower increases in tumor induction for chronic than for acute exposures. A dose and dose rate effectiveness factor (DDREF) has been estimated and applied to transfer risks from the high doses and dose rates of the LSS cohort to low doses and dose rates such as from missions in space. The BEIR VII committee [1] combined DDREF estimates using the LSS cohort and animal experiments using Bayesian methods for their recommendation of a DDREF value of 1.5 with uncertainty. We reexamined the animal data considered by BEIR VII and included more animal data and human chromosome aberration data to improve the estimate for DDREF. Several experiments chosen by BEIR VII were deemed inappropriate for application to human risk models of solid cancer risk. Animal tumor experiments performed by Ullrich et al [4], Alpen et al [5], and Grahn et al [6] were analyzed to estimate the DDREF. Human chromosome aberration experiments performed on a sample of astronauts within NASA were also available to estimate the DDREF. The LSS cohort results reported by BEIR VII were combined with the new radiobiology results using Bayesian methods.
Cade, Brian S.; Noon, Barry R.; Scherer, Rick D.; Keane, John J.
2017-01-01
Counts of avian fledglings, nestlings, or clutch size that are bounded below by zero and above by some small integer form a discrete random variable distribution that is not approximated well by conventional parametric count distributions such as the Poisson or negative binomial. We developed a logistic quantile regression model to provide estimates of the empirical conditional distribution of a bounded discrete random variable. The logistic quantile regression model requires that counts are randomly jittered to a continuous random variable, logit transformed to bound them between specified lower and upper values, then estimated in conventional linear quantile regression, repeating the 3 steps and averaging estimates. Back-transformation to the original discrete scale relies on the fact that quantiles are equivariant to monotonic transformations. We demonstrate this statistical procedure by modeling 20 years of California Spotted Owl fledgling production (0−3 per territory) on the Lassen National Forest, California, USA, as related to climate, demographic, and landscape habitat characteristics at territories. Spotted Owl fledgling counts increased nonlinearly with decreasing precipitation in the early nesting period, in the winter prior to nesting, and in the prior growing season; with increasing minimum temperatures in the early nesting period; with adult compared to subadult parents; when there was no fledgling production in the prior year; and when percentage of the landscape surrounding nesting sites (202 ha) with trees ≥25 m height increased. Changes in production were primarily driven by changes in the proportion of territories with 2 or 3 fledglings. Average variances of the discrete cumulative distributions of the estimated fledgling counts indicated that temporal changes in climate and parent age class explained 18% of the annual variance in owl fledgling production, which was 34% of the total variance. Prior fledgling production explained as much of
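The three-step procedure above (jitter the counts to a continuous variable, logit-transform them between the bounds, run linear quantile regression, then average over jitters and back-transform) can be sketched in a stripped-down form. To keep the example dependency-free, an intercept-only model is used, so the quantile regression step reduces to a sample quantile; the fledgling data are simulated placeholders, not the Lassen owl data.

```python
import numpy as np

# Bounded discrete outcome: 0-3 "fledglings" per territory (simulated).
rng = np.random.default_rng(4)
counts = rng.integers(0, 4, size=200)

lower, upper = 0.0, 4.0        # bounds enclosing the jittered counts
M, q = 20, 0.5                 # jitter repetitions; target quantile (median)

est = []
for _ in range(M):
    z = counts + rng.uniform(0, 1, size=counts.size)   # step 1: jitter
    z = np.clip(z, lower + 1e-6, upper - 1e-6)
    logit = np.log((z - lower) / (upper - z))          # step 2: logit transform
    est.append(np.quantile(logit, q))                  # step 3: quantile "fit"
eta = np.mean(est)             # average the estimates over the M jitters

# Back-transform: quantiles are equivariant to monotonic transformations,
# so the inverse logit recovers the quantile on the (bounded) count scale.
median_count = lower + (upper - lower) / (1 + np.exp(-eta))
print(round(median_count, 2))
```

With covariates, step 3 would be a linear quantile regression of the logit-transformed values (e.g. on climate and habitat variables), with the same averaging and back-transformation.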
Lo, Ching F.
1999-01-01
The integration of Radial Basis Function Networks and Back Propagation Neural Networks with Multiple Linear Regression has been accomplished to map nonlinear response surfaces over a wide range of independent variables in the process of the Modern Design of Experiments. The integrated method is capable of estimating precision intervals, including confidence and prediction intervals. The power of the method has been demonstrated by applying it to a set of wind tunnel test data in constructing a response surface and estimating precision intervals.
Estimating stutter rates for Y-STR alleles
DEFF Research Database (Denmark)
Andersen, Mikkel Meyer; Olofsson, Jill; Mogensen, Helle Smidt;
2011-01-01
Stutter peaks are artefacts that arise during PCR amplification of short tandem repeats. Stutter peaks are especially important in forensic case work with DNA mixtures. The aim of the study was primarily to estimate the stutter rates of the AmpF/STR Yfiler kit. We found that the stutter rates...
Estimated Interest Rate Rules: Do they Determine Determinacy Properties?
DEFF Research Database (Denmark)
Jensen, Henrik
2011-01-01
I demonstrate that econometric estimations of nominal interest rate rules may tell little, if anything, about an economy's determinacy properties. In particular, correct inference about the interest-rate response to inflation provides no information about determinacy. Instead, it could reveal...
Energy Technology Data Exchange (ETDEWEB)
Koo, Young Do; Yoo, Kwae Hwan; Na, Man Gyun [Dept. of Nuclear Engineering, Chosun University, Gwangju (Korea, Republic of)
2017-06-15
Residual stress is a critical element in determining the integrity of parts and the lifetime of welded structures. It is necessary to estimate the residual stress of a welding zone because residual stress is a major reason for the generation of primary water stress corrosion cracking in nuclear power plants. That is, it is necessary to estimate the distribution of the residual stress in welding of dissimilar metals under manifold welding conditions. In this study, a cascaded support vector regression (CSVR) model was presented to estimate the residual stress of a welding zone. The CSVR model was serially and consecutively structured in terms of SVR modules. Using numerical data obtained from finite element analysis by a subtractive clustering method, learning data that explained the characteristic behavior of the residual stress of a welding zone were selected to optimize the proposed model. The results suggest that the CSVR model yielded a better estimation performance when compared with a classic SVR model.
On parameter estimation in the physics lab based on inverting a slope regression coefficient
Jacquet, W; Sijbers, J
2012-01-01
Measurement uncertainty is a non-trivial aspect of the laboratory component of most undergraduate physics courses. Confusion about the application of statistical tools calls for the elaboration of guidelines and the elimination of inconsistencies where possible. Linear regression is one of the fundamental statistical tools often used in a first-year physics laboratory setting. In what follows we present an argument that leads to an unambiguous choice of the variable(s) to be used as predictor(s) and the variable to be predicted.
Estimating the causes of traffic accidents using logistic regression and discriminant analysis.
Karacasu, Murat; Ergül, Barış; Altin Yavuz, Arzu
2014-01-01
Factors that affect traffic accidents have been analysed in various ways. In this study, we use the methods of logistic regression and discriminant analysis to determine the damages due to injury and non-injury accidents in the Eskisehir Province. Data were obtained from the accident reports of the General Directorate of Security in Eskisehir; 2552 traffic accidents between January and December 2009 were investigated regarding whether they resulted in injury. According to the results, the effects of traffic accidents were reflected in the variables. These results provide a wealth of information that may aid future measures toward the prevention of undesired results.
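Classifying accidents as injury vs. non-injury with logistic regression, as in the study above, can be sketched with a minimal Newton-Raphson fit. The accident covariates and coefficients below are synthetic placeholders, not the Eskisehir data, and the discriminant-analysis comparison from the paper is omitted.

```python
import numpy as np

# Synthetic binary outcome (injury = 1) driven by two hypothetical covariates.
rng = np.random.default_rng(5)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
true_b = np.array([-0.5, 1.0, -2.0])
p = 1 / (1 + np.exp(-(X @ true_b)))
y = rng.binomial(1, p)

# Logistic regression by Newton-Raphson (iteratively reweighted least squares).
b = np.zeros(3)
for _ in range(25):
    mu = 1 / (1 + np.exp(-(X @ b)))       # predicted injury probabilities
    W = mu * (1 - mu)                     # observation weights
    grad = X.T @ (y - mu)
    hess = X.T @ (X * W[:, None])
    b = b + np.linalg.solve(hess, grad)
print(b.round(2))
```

The fitted coefficients give log-odds effects of each covariate on injury probability, which is the kind of output the study interprets.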
Directory of Open Access Journals (Sweden)
Kohei Arai
2016-10-01
Full Text Available A method for Near Infrared (NIR) reflectance estimation with visible camera data, based on regression, for Normalized Difference Vegetation Index (NDVI) estimation is proposed, together with its application to insect damage detection in rice paddy fields. Through experiments at rice paddy fields situated at the Saga Prefectural Agriculture Research Institute (SPARI) in Saga city, Kyushu, Japan, it is found that there is a high correlation between NIR reflectance and green color reflectance. Therefore, it is possible to estimate NIR reflectance from visible camera data, which opens the possibility of estimating NDVI from drone-mounted visible camera data. As is well known, the protein content in rice crops is highly correlated with the NIR intensity, or reflectance, of rice leaves, so it is possible to estimate rice crop quality with drone-based visible camera data.
Image-based human age estimation by manifold learning and locally adjusted robust regression.
Guo, Guodong; Fu, Yun; Dyer, Charles R; Huang, Thomas S
2008-07-01
Estimating human age automatically via facial image analysis has many potential real-world applications, such as human computer interaction and multimedia communication. However, it is still a challenging problem for existing computer vision systems to automatically and effectively estimate human ages. The aging process is determined not only by a person's genes, but also by many external factors, such as health, living style, living location, and weather conditions. Males and females may also age differently. The current age estimation performance is still not good enough for practical use and more effort has to be put into this research direction. In this paper, we introduce the age manifold learning scheme for extracting face aging features and design a locally adjusted robust regressor for learning and prediction of human ages. The novel approach improves the age estimation accuracy significantly over all previous methods. The merit of the proposed approaches for image-based age estimation is shown by extensive experiments on a large internal age database and the publicly available FG-NET database.
Tvedebrink, Torben; Eriksen, Poul Svante; Asplund, Maria; Mogensen, Helle Smidt; Morling, Niels
2012-03-01
We discuss the model for estimating drop-out probabilities presented by Tvedebrink et al. [7] and the concerns that have been raised. The criticism of the model has demonstrated that the model is not perfect. However, the model is very useful for advanced forensic genetic work, where allelic drop-out is occurring. With this discussion, we hope to improve the drop-out model, so that it can be used for practical forensic genetics and stimulate further discussions. We discuss how to estimate drop-out probabilities when using a varying number of PCR cycles and other experimental conditions. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.
DEFF Research Database (Denmark)
Tvedebrink, Torben; Eriksen, Poul Svante; Asplund, Maria
2012-01-01
We discuss the model for estimating drop-out probabilities presented by Tvedebrink et al. [7] and the concerns that have been raised. The criticism of the model has demonstrated that the model is not perfect. However, the model is very useful for advanced forensic genetic work, where allelic dro...
Harding, Brian J; Gehrels, Thomas W; Makela, Jonathan J
2014-02-01
The Earth's thermosphere plays a critical role in driving electrodynamic processes in the ionosphere and in transferring solar energy to the atmosphere, yet measurements of thermospheric state parameters, such as wind and temperature, are sparse. One of the most popular techniques for measuring these parameters is to use a Fabry-Perot interferometer to monitor the Doppler width and breadth of naturally occurring airglow emissions in the thermosphere. In this work, we present a technique for estimating upper-atmospheric winds and temperatures from images of Fabry-Perot fringes captured by a CCD detector. We estimate instrument parameters from fringe patterns of a frequency-stabilized laser, and we use these parameters to estimate winds and temperatures from airglow fringe patterns. A unique feature of this technique is the model used for the laser and airglow fringe patterns, which fits all fringes simultaneously and attempts to model the effects of optical defects. This technique yields accurate estimates for winds, temperatures, and the associated uncertainties in these parameters, as we show with a Monte Carlo simulation.
DEFF Research Database (Denmark)
Fauser, Patrik; Thomsen, Marianne; Pistocchi, Alberto
2010-01-01
This paper proposes a simple method for estimating emissions and predicted environmental concentrations (PECs) in water and air for organic chemicals that are used in household products and industrial processes. The method has been tested on existing data for 63 organic high-production volume che...
Linear regressive model structures for estimation and prediction of compartmental diffusive systems
Vries, D.; Keesman, K.J.; Zwart, H.
2006-01-01
In input-output relations of (compartmental) diffusive systems, physical parameters appear non-linearly, resulting in the use of (constrained) non-linear parameter estimation techniques with their shortcomings regarding global optimality and computational effort. Given a LTI system in state
Linear regressive model structures for estimation and prediction of compartmental diffusive systems
Vries, D.; Keesman, K.J.; Zwart, H.J.
2006-01-01
In input-output relations of (compartmental) diffusive systems, physical parameters appear non-linearly, resulting in the use of (constrained) non-linear parameter estimation techniques with their shortcomings regarding global optimality and computational effort. Given a LTI system in state space for
Wang, Meng; Brunekreef, Bert; Gehring, Ulrike; Szpiro, Adam; Hoek, Gerard; Beelen, Rob
2016-01-01
BACKGROUND: Leave-one-out cross-validation that fails to account for variable selection does not properly reflect prediction accuracy when the number of training sites is small. The impact on health effect estimates has rarely been studied. METHODS: We randomly generated ten training and test sets f
Normalization Ridge Regression in Practice II: The Estimation of Multiple Feedback Linkages.
Bulcock, J. W.
The use of the two-stage least squares (2SLS) procedure for estimating nonrecursive social science models is often impractical when multiple feedback linkages are required. This is because 2SLS is extremely sensitive to multicollinearity. The standard statistical solution to the multicollinearity problem is a biased, variance-reduced procedure…
On penalized likelihood estimation for a non-proportional hazards regression model.
Devarajan, Karthik; Ebrahimi, Nader
2013-07-01
In this paper, a semi-parametric generalization of the Cox model that permits crossing hazard curves is described. A theoretical framework for estimation in this model is developed based on penalized likelihood methods. It is shown that the optimal solution to the baseline hazard, baseline cumulative hazard and their ratio are hyperbolic splines with knots at the distinct failure times.
Estimating Unbiased Treatment Effects in Education Using a Regression Discontinuity Design
Directory of Open Access Journals (Sweden)
William C. Smith
2014-08-01
The ability of regression discontinuity (RD) designs to provide an unbiased treatment effect while overcoming the ethical concerns that plague Randomized Control Trials (RCTs) makes them a valuable and useful approach in education evaluation. RD is the only explicitly recognized quasi-experimental approach identified by the Institute of Education Statistics to meet the prerequisites of a causal relationship. Unfortunately, the statistical complexity of the RD design has limited its application in education research. This article provides a less technical introduction to RD for education researchers and practitioners. Using visual analysis to aid conceptual understanding, the article walks readers through the essential steps of a Sharp RD design using hypothetical, but realistic, district intervention data and provides additional resources for further exploration.
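A sharp RD estimate can be sketched in a few lines: fit separate least-squares lines on either side of the cutoff and take the jump between their predictions at the cutoff. The "district intervention" data below are invented for illustration and are not from the article.

```python
# Minimal sharp regression-discontinuity sketch (illustrative data only).
# Units at or above a cutoff score receive the intervention; the treatment
# effect is the jump in the outcome at the cutoff.

def ols_line(xs, ys):
    """Least-squares fit y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return my - b * mx, b

def sharp_rd_effect(score, outcome, cutoff):
    """Difference of the two side-specific fits, evaluated at the cutoff."""
    left = [(s, y) for s, y in zip(score, outcome) if s < cutoff]
    right = [(s, y) for s, y in zip(score, outcome) if s >= cutoff]
    al, bl = ols_line([s for s, _ in left], [y for _, y in left])
    ar, br = ols_line([s for s, _ in right], [y for _, y in right])
    return (ar + br * cutoff) - (al + bl * cutoff)

# Hypothetical data: outcome = 10 + 0.5*score, plus a jump of 5 at the cutoff.
score = list(range(100))
outcome = [10 + 0.5 * s + (5 if s >= 50 else 0) for s in score]
effect = sharp_rd_effect(score, outcome, 50)
```

With noiseless data the recovered effect equals the built-in jump; with real data the side-specific fits would be local (within a bandwidth around the cutoff).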
Estimation of Saturation Flow Rates at Signalized Intersections
Directory of Open Access Journals (Sweden)
Chang-qiao Shao
2012-01-01
The saturation flow rate is a fundamental parameter used to measure intersection capacity and time traffic signals. However, traditional methods, which mainly use the average value of observed queue discharge headways to estimate the saturation headway, can underestimate the saturation flow rate. The goal of this paper is to study the stochastic nature of queue discharge headways and to develop a more accurate estimation method for the saturation headway and saturation flow rate. Based on the surveyed data, the characteristics of queue discharge headways and the estimation method of the saturated flow rate are studied. It is found that the average value of queue discharge headways is greater than the median value and that the skewness of the headways is positive. Normal distribution tests were conducted before and after a log transformation of the headways. The goodness-of-fit tests showed that for some surveyed sites the queue discharge headways can be fitted by a normal distribution, while for other sites the headways can be fitted by a lognormal distribution. According to these headway characteristics, the median value of queue discharge headways is suggested for estimating the saturation headway, and a new method of estimating saturation flow rates is developed.
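The paper's core recommendation, median rather than mean discharge headway, can be sketched as follows. The headway values are invented to show a typical right-skewed sample; the conversion saturation flow = 3600 / headway is the standard relation.

```python
import statistics

# Saturation flow from queue discharge headways (illustrative numbers).
# Discharge headways are right-skewed, so the mean headway is pulled up
# by the long right tail and the resulting flow estimate is biased low.

headways = [1.8, 1.9, 2.0, 2.0, 2.1, 2.1, 2.2, 2.4, 2.8, 3.5]  # s/vehicle

mean_h = statistics.mean(headways)      # inflated by the right tail
median_h = statistics.median(headways)  # robust central headway

sat_flow_mean = 3600 / mean_h      # vehicles/hour, the traditional estimate
sat_flow_median = 3600 / median_h  # larger, per the paper's recommendation
```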
Katpatal, Y. B.; Paranjpe, S. V.; Kadu, M.
2014-12-01
Effective watershed management requires authentic data on surface runoff potential, for which several methods and models are in use. Generally, the non-availability of field data calls for techniques based on remote observations. The Soil Conservation Services Curve Number (SCS CN) method is an important method that utilizes information generated from remote sensing for estimation of runoff. Several attempts have been made to validate the runoff values generated from the SCS CN method by comparing the results with those obtained from other methods. In the present study, runoff estimation through the SCS CN method has been performed using IRS LISS IV data for the Venna Basin, situated in Central India, for which field data were available. The land use/land cover and soil layers have been generated for the entire watershed using the satellite data and a Geographic Information System (GIS). The Venna Basin has been divided into an intercepted catchment and a free catchment. Runoff values have been estimated using field data through regression analysis. The runoff values estimated using the SCS CN method have been compared with yield values generated using data collected from the tank gauge stations and data from the discharge stations. The correlation helps in validating the results obtained from the SCS CN method and its applicability in Indian conditions. Key Words: SCS CN Method, Regression Analysis, Land Use / Land Cover, Runoff, Remote Sensing, GIS.
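The SCS CN runoff relation itself is standard and compact. A sketch in SI units, with an illustrative curve number and storm depth:

```python
# Standard SCS Curve Number runoff equation (SI form).
# The CN value and rainfall depth below are illustrative, not from the study.

def scs_runoff(p_mm, cn, ia_ratio=0.2):
    """Direct runoff depth Q (mm) for rainfall P (mm) and curve number CN."""
    s = 25400.0 / cn - 254.0   # potential maximum retention (mm)
    ia = ia_ratio * s          # initial abstraction, conventionally 0.2*S
    if p_mm <= ia:
        return 0.0             # all rainfall absorbed before runoff begins
    return (p_mm - ia) ** 2 / (p_mm - ia + s)

q = scs_runoff(100.0, 80)      # e.g. a 100 mm storm on CN = 80 land cover
```

In a GIS workflow the CN raster is derived from the land use/land cover and hydrologic soil group layers, and this equation is applied cell by cell.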
Directory of Open Access Journals (Sweden)
Zhongyuan Geng
2015-01-01
This paper applies the Panel Smooth Transition Regression (PSTR) model to simulate the effects of the interest rate and reserve requirement ratio on bank risk in China. The results reveal the nonlinearity embedded in the interest rate, reserve requirement ratio, and bank risk nexus. Both the interest rate and reserve requirement ratio exert a positive impact on bank risk in the low regime and a negative impact in the high regime. The interest rate has a statistically significant effect on bank risk, while the reserve requirement ratio's effect is insignificant, in both the high and low regimes.
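The mechanism PSTR relies on, a logistic transition that moves a coefficient smoothly between a low and a high regime, can be sketched as follows. All parameter values are invented for illustration.

```python
import math

# Core of a panel smooth transition regression (PSTR): the effect of a
# regressor moves smoothly between regimes via a logistic transition in a
# threshold variable q. Parameter values below are illustrative only.

def transition(q, gamma, c):
    """Logistic transition g in [0, 1]; g -> 0 low regime, g -> 1 high."""
    return 1.0 / (1.0 + math.exp(-gamma * (q - c)))

def regime_coefficient(q, beta_low, beta_shift, gamma, c):
    """Effective slope beta_low + beta_shift * g(q)."""
    return beta_low + beta_shift * transition(q, gamma, c)

# A positive low-regime effect flipping negative in the high regime,
# the sign pattern the abstract reports for bank risk:
low = regime_coefficient(q=-2.0, beta_low=0.4, beta_shift=-0.9, gamma=3.0, c=0.0)
high = regime_coefficient(q=2.0, beta_low=0.4, beta_shift=-0.9, gamma=3.0, c=0.0)
```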
Directory of Open Access Journals (Sweden)
Kelly W Jones
Deforestation and conversion of native habitats continue to be the leading drivers of biodiversity and ecosystem service loss. A number of conservation policies and programs are implemented--from protected areas to payments for ecosystem services (PES)--to deter these losses. Currently, empirical evidence on whether these approaches stop or slow land cover change is lacking, but there is increasing interest in conducting rigorous, counterfactual impact evaluations, especially for many new conservation approaches, such as PES and REDD, which emphasize additionality. In addition, several new, globally available and free high-resolution remote sensing datasets have increased the ease of carrying out an impact evaluation on land cover change outcomes. While the number of conservation evaluations utilizing 'matching' to construct a valid control group is increasing, the majority of these studies use simple differences in means or linear cross-sectional regression to estimate the impact of the conservation program using this matched sample, with relatively few utilizing fixed effects panel methods--an alternative estimation method that relies on temporal variation in the data. In this paper we compare the advantages and limitations of (1) matching to construct the control group combined with differences in means and cross-sectional regression, which control for observable forms of bias in program evaluation, and (2) fixed effects panel methods, which control for observable and time-invariant unobservable forms of bias, with and without matching to create the control group. We then use these four approaches to estimate forest cover outcomes for two conservation programs: a PES program in Northeastern Ecuador and strict protected areas in European Russia. In the Russia case we find statistically significant differences across estimators--due to the presence of unobservable bias--that lead to differences in conclusions about effectiveness. The Ecuador case
Jones, Kelly W; Lewis, David J
2015-01-01
Deforestation and conversion of native habitats continues to be the leading driver of biodiversity and ecosystem service loss. A number of conservation policies and programs are implemented--from protected areas to payments for ecosystem services (PES)--to deter these losses. Currently, empirical evidence on whether these approaches stop or slow land cover change is lacking, but there is increasing interest in conducting rigorous, counterfactual impact evaluations, especially for many new conservation approaches, such as PES and REDD, which emphasize additionality. In addition, several new, globally available and free high-resolution remote sensing datasets have increased the ease of carrying out an impact evaluation on land cover change outcomes. While the number of conservation evaluations utilizing 'matching' to construct a valid control group is increasing, the majority of these studies use simple differences in means or linear cross-sectional regression to estimate the impact of the conservation program using this matched sample, with relatively few utilizing fixed effects panel methods--an alternative estimation method that relies on temporal variation in the data. In this paper we compare the advantages and limitations of (1) matching to construct the control group combined with differences in means and cross-sectional regression, which control for observable forms of bias in program evaluation, to (2) fixed effects panel methods, which control for observable and time-invariant unobservable forms of bias, with and without matching to create the control group. We then use these four approaches to estimate forest cover outcomes for two conservation programs: a PES program in Northeastern Ecuador and strict protected areas in European Russia. In the Russia case we find statistically significant differences across estimators--due to the presence of unobservable bias--that lead to differences in conclusions about effectiveness. The Ecuador case illustrates that
Outcrossing rates and relatedness estimates in pecan (Carya illinoinensis) populations.
Rüter, B; Hamrick, J L; Wood, B W
2000-01-01
Estimates of single and multilocus outcrossing rates as well as relatedness among progeny of individual seed trees were obtained for 14 populations of pecan [Carya illinoinensis (Wangenh.) K. Koch]. Mean outcrossing estimates were not significantly different from 1.0 and relatedness values indicate that most progeny within families are half sibs. Biparental inbreeding was insignificant in all study sites, and inbreeding coefficients indicated that populations were close to inbreeding equilibrium.
Validation and Implementation of Uncertainty Estimates of Calculated Transition Rates
Directory of Open Access Journals (Sweden)
Jörgen Ekman
2014-05-01
Uncertainties of calculated transition rates in LS-allowed electric dipole transitions in boron-like O IV and carbon-like Fe XXI are estimated using an approach in which differences in line strengths calculated in length and velocity gauges are utilized. Estimated uncertainties are compared and validated against several high-quality theoretical data sets in O IV, and implemented in large scale calculations in Fe XXI.
A Bayesian approach to estimating the prehepatic insulin secretion rate
DEFF Research Database (Denmark)
Andersen, Kim Emil; Højbjerre, Malene
The prehepatic insulin secretion rate of the pancreatic β-cells is not directly measurable, since part of the secreted insulin is absorbed by the liver prior to entering the blood stream. However, C-peptide is co-secreted equimolarly and is not absorbed by the liver, allowing for the estimation of the prehepatic insulin secretion rate. We consider a stochastic differential equation model that combines both insulin and C-peptide concentrations in plasma to estimate the prehepatic insulin secretion rate. Previously this model has been analysed in an iterative deterministic set-up, where the time courses of insulin and C-peptide subsequently are used as known forcing functions. In this work we adopt a Bayesian graphical model to describe the unified model simultaneously. We develop a model that also accounts for both measurement error and process variability. The parameters are estimated…
Fang, Sheng; Guo, Hua
2013-01-01
The parallel magnetic resonance imaging (parallel imaging) technique reduces the MR data acquisition time by using multiple receiver coils. Coil sensitivity estimation is critical for the performance of parallel imaging reconstruction. Currently, most coil sensitivity estimation methods are based on linear interpolation techniques. Such methods may result in Gibbs-ringing artifact or resolution loss, when the resolution of coil sensitivity data is limited. To solve the problem, we proposed a nonlinear coil sensitivity estimation method based on steering kernel regression, which performs a local gradient guided interpolation to the coil sensitivity. The in vivo experimental results demonstrate that this method can effectively suppress Gibbs ringing artifact in coil sensitivity and reduces both noise and residual aliasing artifact level in SENSE reconstruction.
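The paper's steering kernel adapts its shape to local image gradients; the plain Gaussian Nadaraya-Watson smoother below, on 1-D data, is only a simplified sketch of the underlying local kernel-regression idea, not the authors' method.

```python
import math

# Kernel-regression smoothing of noisy 1-D samples. A steering kernel
# would additionally orient and stretch the kernel along local gradients;
# this isotropic Gaussian version shows the basic locally weighted average.

def kernel_smooth(xs, ys, x0, bandwidth):
    """Locally weighted average of ys around x0 with a Gaussian kernel."""
    weights = [math.exp(-0.5 * ((x - x0) / bandwidth) ** 2) for x in xs]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.1, 0.9, 2.1, 2.9, 4.2]          # roughly y = x with noise
smooth_mid = kernel_smooth(xs, ys, 2.0, 0.75)
```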
Directory of Open Access Journals (Sweden)
Julio Cesar de Oliveira
2014-04-01
MODerate resolution Imaging Spectroradiometer (MODIS) data are widely used in multitemporal analysis of various Earth-related phenomena, such as vegetation phenology, land use/land cover change, deforestation monitoring, and time series analysis. In general, the MODIS products used for multitemporal analysis are composite mosaics of the best pixels over a certain period of time. However, it is common to find bad pixels in the composition that affect the time series analysis. We present a filtering methodology that considers the pixel position (location in space) and time (position in the temporal data series) to define a new value for the bad pixel. This methodology, called Window Regression (WR), estimates the value of the point of interest based on the regression analysis of the data selected by a spatial-temporal window. The spatial window is represented by eight pixels neighboring the pixel under evaluation, and the temporal window selects a set of dates close to the date of interest (either earlier or later). Noise of varying intensity was simulated over time and space using the MOD13Q1 product. The method presented and other techniques (4253H twice, Mean Value Iteration (MVI) and Savitzky-Golay) were evaluated using the Mean Absolute Percentage Error (MAPE) and the Akaike Information Criterion (AIC). The tests revealed the consistently superior performance of the Window Regression approach in estimating new Normalized Difference Vegetation Index (NDVI) values, irrespective of the intensity of the simulated noise.
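A temporal-only simplification of the Window Regression idea can be sketched as follows: fit a least-squares line to the dates around a flagged value and replace the value with the line's prediction. The full method also uses the eight spatial neighbours; the NDVI series here is invented.

```python
# Window Regression sketch, temporal dimension only (the published method
# also pools the 8 spatial neighbours). A flagged value is replaced by the
# prediction of a line fitted to nearby dates.

def fit_line(ts, vs):
    """Least-squares fit v = a + b*t; returns (a, b)."""
    n = len(ts)
    mt = sum(ts) / n
    mv = sum(vs) / n
    b = (sum((t - mt) * (v - mv) for t, v in zip(ts, vs))
         / sum((t - mt) ** 2 for t in ts))
    return mv - b * mt, b

def fill_bad_value(series, bad_index, half_window=3):
    """Predict series[bad_index] from a regression on surrounding dates."""
    window = [(t, v) for t, v in enumerate(series)
              if t != bad_index and abs(t - bad_index) <= half_window]
    a, b = fit_line([t for t, _ in window], [v for _, v in window])
    return a + b * bad_index

ndvi = [0.60, 0.62, 0.64, 0.10, 0.68, 0.70, 0.72]  # index 3 is a bad pixel
filled = fill_bad_value(ndvi, 3)
```

With the neighbours lying on a trend of +0.02 per date, the dropout at index 3 is replaced by the trend value 0.66.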
Messier, Kyle P; Campbell, Ted; Bradley, Philip J; Serre, Marc L
2015-08-18
Radon ((222)Rn) is a naturally occurring, chemically inert, colorless, and odorless radioactive gas produced from the decay of uranium ((238)U), which is ubiquitous in rocks and soils worldwide. Inhalation exposure to (222)Rn is likely the second leading cause of lung cancer after cigarette smoking; exposure through untreated groundwater contributes to both the inhalation and ingestion routes. A land use regression (LUR) model for groundwater (222)Rn with anisotropic geological and (238)U-based explanatory variables is developed, which helps elucidate the factors contributing to elevated (222)Rn across North Carolina. The LUR is also integrated into the Bayesian Maximum Entropy (BME) geostatistical framework to increase accuracy and produce a point-level LUR-BME model of groundwater (222)Rn across North Carolina, including prediction uncertainty. The LUR-BME model of groundwater (222)Rn results in a leave-one-out cross-validation r(2) of 0.46 (Pearson correlation coefficient = 0.68), effectively predicting within the spatial covariance range. Modeled (222)Rn concentrations show variability among intrusive felsic geological formations, likely due to average bedrock (238)U defined on the basis of overlying stream-sediment (238)U concentrations, which are widely distributed and consistently analyzed point data.
Magnetometer-Only Attitude and Rate Estimates for Spinning Spacecraft
Challa, M.; Natanson, G.; Ottenstein, N.
2000-01-01
A deterministic algorithm and a Kalman filter for gyroless spacecraft are used independently to estimate the three-axis attitude and rates of rapidly spinning spacecraft using only magnetometer data. In-flight data from the Wide-Field Infrared Explorer (WIRE) during its tumble, and the Fast Auroral Snapshot Explorer (FAST) during its nominal mission mode are used to show that the algorithms can successfully estimate the above in spite of the high rates. Results using simulated data are used to illustrate the importance of accurate and frequent data.
A comparison of small-area hospitalisation rates, estimated morbidity and hospital access.
Shulman, H; Birkin, M; Clarke, G P
2015-11-01
Published data on hospitalisation rates tend to reveal marked spatial variations within a city or region. Such variations may simply reflect corresponding variations in need at the small-area level. However, they might also be a consequence of poorer accessibility to medical facilities for certain communities within the region. To help answer this question it is important to compare these variable hospitalisation rates with small-area estimates of need. This paper first maps hospitalisation rates at the small-area level across the region of Yorkshire in the UK to show the spatial variations present. Then the Health Survey of England is used to explore the characteristics of persons with heart disease, using chi-square and logistic regression analysis. Using the most significant variables from this analysis the authors build a spatial microsimulation model of morbidity for heart disease for the Yorkshire region. We then compare these estimates of need with the patterns of hospitalisation rates seen across the region.
Directory of Open Access Journals (Sweden)
Tosun Erdi
2017-01-01
This study was aimed at estimating the variation of several engine control parameters within the rotational speed-load map, using regression analysis and artificial neural network techniques. Duration of injection, specific fuel consumption, and exhaust gas temperature at the turbine inlet and within the catalytic converter brick were chosen as the output parameters for the models, while engine speed and brake mean effective pressure were selected as independent variables for prediction. Measurements were performed on a turbocharged direct injection spark ignition engine fueled with gasoline. A three-layer feed-forward structure and back-propagation algorithm were used for training the artificial neural network. It was concluded that this technique is capable of predicting engine parameters with better accuracy than linear and non-linear regression techniques.
Exchange Rates and Monetary Fundamentals: What Do We Learn from Linear and Nonlinear Regressions?
Directory of Open Access Journals (Sweden)
Guangfeng Zhang
2014-01-01
This paper revisits the association between exchange rates and monetary fundamentals with a focus on both linear and nonlinear approaches. Using monthly data for the Euro/US dollar and Japanese yen/US dollar, our linear analysis demonstrates that the monetary model is a long-run description of exchange rate movements, and our nonlinear modelling suggests that the error correction model describes the short-run adjustment of deviations of exchange rates, with monetary fundamentals capable of explaining exchange rate dynamics under an unrestricted framework.
Kügler, S D; Hoecker, M
2014-01-01
Context: In astronomy, new approaches to process and analyze the exponentially increasing amount of data are inevitable. While classical approaches (e.g. template fitting) are fine for objects of well-known classes, alternative techniques have to be developed to determine those that do not fit. Therefore a classification scheme should be based on individual properties instead of fitting to a global model, which loses valuable information. An important issue when dealing with large data sets is outlier detection, which at the moment is often treated in a problem-oriented way. Aims: In this paper we present a method to statistically estimate the redshift z based on a similarity approach. This allows us to determine redshifts in spectra, in emission as well as in absorption, without using any predefined model. Additionally we show how an estimate of the redshift based on single features is possible. As a consequence we are, e.g., able to filter objects which show multiple redshift components. We propose to apply ...
Estimating Source Recurrence Rates for Probabilistic Tsunami Hazard Analysis (PTHA)
Geist, E. L.; Parsons, T.
2004-12-01
A critical factor in probabilistic tsunami hazard analysis (PTHA) is estimating the average recurrence rate for tsunamigenic sources. Computational PTHA involves aggregating runup values derived from numerical simulations for many far-field and local sources, primarily earthquakes, each with a specified probability of occurrence. Computational PTHA is the primary method used in the ongoing FEMA pilot study at Seaside, Oregon. For a Poissonian arrival time model, the probability for a given source is dependent on a single parameter: the mean inter-event time of the source. In other probability models, parameters such as aperiodicity are also included. In this study, we focus on methods to determine the recurrence rates for large, shallow subduction zone earthquakes. For earthquakes below about M=8, recurrence rates can be obtained from modified Gutenberg-Richter distributions that are constrained by the tectonic moment rate for individual subduction zones. However, significant runup from far-field sources is commonly associated with the largest magnitude earthquakes, for which the recurrence rates are poorly constrained by the tail of empirical frequency-magnitude relationships. For these earthquakes, paleoseismic evidence of great earthquakes can be used to establish recurrence rates. Because the number of geologic horizons representing great earthquakes along a particular subduction zone is limited, special techniques are needed to account for open intervals before the first and after the last observed events. Uncertainty in age dates for the horizons also has to be included in estimating recurrence rates and aperiodicity. A Monte Carlo simulation is performed in which a random sample of earthquake times is drawn from a specified probability distribution with varying average recurrence rates and aperiodicities. A recurrence rate can be determined from the mean rate of all random samples that fit the observations, or a range of rates can be carried through the
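The Monte Carlo step described above can be sketched as an acceptance scheme: draw candidate mean inter-event times, simulate exponential (Poissonian) event catalogues over the observation window, and keep the candidates that reproduce the observed number of palaeoseismic horizons. The window length, event count, and prior range below are illustrative, not from the study.

```python
import random

# Monte-Carlo recurrence-rate sketch: accept candidate mean inter-event
# times whose simulated Poissonian catalogues match the observed count.
# All numbers are illustrative.

random.seed(42)

OBS_WINDOW_YEARS = 3000.0
OBSERVED_EVENTS = 6

def simulate_event_count(mean_interval):
    """Count exponential inter-event times falling inside the window."""
    t, count = 0.0, 0
    while True:
        t += random.expovariate(1.0 / mean_interval)
        if t > OBS_WINDOW_YEARS:
            return count
        count += 1

accepted = []
for _ in range(20000):
    candidate = random.uniform(100.0, 2000.0)   # flat prior on mean interval
    if simulate_event_count(candidate) == OBSERVED_EVENTS:
        accepted.append(candidate)

mean_interval_estimate = sum(accepted) / len(accepted)
```

The accepted sample approximates the posterior over the mean inter-event time; aperiodicity enters the real analysis by replacing the exponential draws with, e.g., lognormal or Brownian-passage-time intervals.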
Complex source rate estimation for atmospheric transport and dispersion models
Energy Technology Data Exchange (ETDEWEB)
Edwards, L.L.
1993-09-13
The accuracy associated with assessing the environmental consequences of an accidental atmospheric release of radioactivity is highly dependent on our knowledge of the source release rate which is generally poorly known. This paper reports on a technique that integrates the radiological measurements with atmospheric dispersion modeling for more accurate source term estimation. We construct a minimum least squares methodology for solving the inverse problem with no a priori information about the source rate.
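For a single constant source, the minimum-least-squares inversion reduces to a closed form: with modelled dilution factors m_i (concentration per unit release rate at each detector) and measured concentrations c_i, the release rate minimising Σ(c_i − m_i q)² is q = Σ m_i c_i / Σ m_i². A sketch with invented detector values:

```python
# Least-squares source-term inversion sketch. Detector dilution factors
# come from the dispersion model; the measurements below are invented.

def estimate_release_rate(dilution, measured):
    """q minimising sum (c_i - m_i*q)^2, i.e. sum(m*c) / sum(m*m)."""
    num = sum(m * c for m, c in zip(dilution, measured))
    den = sum(m * m for m in dilution)
    return num / den

dilution = [2.0e-6, 1.2e-6, 0.5e-6]   # (Bq/m^3) per (Bq/s) at 3 detectors
measured = [4.1e-3, 2.3e-3, 1.1e-3]   # observed concentrations, Bq/m^3
q_hat = estimate_release_rate(dilution, measured)  # ~2.0e3 Bq/s
```

With multiple simultaneous sources the same idea becomes a linear system solved by normal equations or a constrained solver.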
Williams-Sether, Tara
2015-08-06
Annual peak-flow frequency data from 231 U.S. Geological Survey streamflow-gaging stations in North Dakota and parts of Montana, South Dakota, and Minnesota, with 10 or more years of unregulated peak-flow record, were used to develop regional regression equations for exceedance probabilities of 0.5, 0.20, 0.10, 0.04, 0.02, 0.01, and 0.002 using generalized least-squares techniques. Updated peak-flow frequency estimates for 262 streamflow-gaging stations were developed using data through 2009 and log-Pearson Type III procedures outlined by the Hydrology Subcommittee of the Interagency Advisory Committee on Water Data. An average generalized skew coefficient was determined for three hydrologic zones in North Dakota. A StreamStats web application was developed to estimate basin characteristics for the regional regression equation analysis. Methods for estimating a weighted peak-flow frequency for gaged sites and ungaged sites are presented.
Un programme simple de regression non-lineaire pondere adapte aux estimations de biomasse forestiere
Bergez, Jacques-Eric,; BISCH, J.L.; Cabanettes, Alain; Pages, L.
1988-01-01
A program is presented for fitting experimental data to the non-linear models Y = a * X1^α * X2^β + b or Y = a * X1^α + b, which are particularly useful for forest biomass estimation (Y = total biomass of a tree; X1 = diameter at 1.30 m; X2 = total height). The computation combines: - the search for the optimal values of the exponents α and β; - the possibility of weighting the regression residuals by a power function of the variable e...
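The fitting strategy, optimal exponents plus a linear solve for the remaining coefficients, can be sketched for the simpler model Y = a * X1^α + b: for each trial α the model is linear in (a, b), so a coarse grid search over α suffices. The diameter/biomass numbers below are fabricated for illustration.

```python
# Grid-search fit of Y = a * X1**alpha + b: linear OLS in (a, b) for each
# candidate alpha, keep the alpha with the smallest residual sum of squares.
# Data are fabricated (built with alpha = 2.4) purely to exercise the code.

def fit_for_alpha(x1, y, alpha):
    """OLS of y on x1**alpha; returns (a, b, sse)."""
    z = [x ** alpha for x in x1]
    n = len(z)
    mz, my = sum(z) / n, sum(y) / n
    a = (sum((zi - mz) * (yi - my) for zi, yi in zip(z, y))
         / sum((zi - mz) ** 2 for zi in z))
    b = my - a * mz
    sse = sum((yi - (a * zi + b)) ** 2 for zi, yi in zip(z, y))
    return a, b, sse

def grid_fit(x1, y, alphas):
    best = min(alphas, key=lambda al: fit_for_alpha(x1, y, al)[2])
    a, b, _ = fit_for_alpha(x1, y, best)
    return best, a, b

diameters = [5.0, 10.0, 15.0, 20.0, 25.0]             # X1: diameter at 1.30 m
biomass = [0.5 * d ** 2.4 + 3.0 for d in diameters]   # fabricated data
alphas = [1.0 + 0.1 * k for k in range(21)]           # grid 1.0 .. 3.0
alpha_hat, a_hat, b_hat = grid_fit(diameters, biomass, alphas)
```

The program described additionally handles the two-variable model and weighted residuals; weighting amounts to multiplying each squared residual by a power of the fitted variable before summing.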
The effect of PLS regression in PLS path model estimation when multicollinearity is present
DEFF Research Database (Denmark)
Nielsen, Rikke; Kristensen, Kai; Eskildsen, Jacob
PLS path modelling has previously been found to be robust to multicollinearity both between latent variables and between manifest variables of a common latent variable (see e.g. Cassel et al. (1999), Kristensen, Eskildsen (2005), Westlund et al. (2008)). However, most of the studies investigate models with relatively few variables and very simple dependence structures compared to the models that are often estimated in practical settings. A recent study by Nielsen et al. (2009) found that when model structure is more complex, PLS path modelling is not as robust to multicollinearity between latent variables as previously assumed. A difference in the standard error of path coefficients of as much as 83% was found between moderate and severe levels of multicollinearity. Large differences were found not only for large path coefficients, but also for small path coefficients and in some cases…
ATTITUDE RATE ESTIMATION BY GPS DOPPLER SIGNAL PROCESSING
Institute of Scientific and Technical Information of China (English)
He Side; Milos Doroslovacki; Guo Zhenyu; Zhang Yufeng
2003-01-01
A method is presented for estimating the attitude rate of near-Earth spacecraft or aviation vehicles by using the relative Doppler frequency shift of the Global Positioning System (GPS) carrier. It comprises two GPS receiving antennas, a signal processing circuit and an algorithm. The whole system is relatively simple, and the cost, weight and power consumption are very low.
Optical range and range rate estimation for teleoperator systems
Shields, N. L., Jr.; Kirkpatrick, M., III; Malone, T. B.; Huggins, C. T.
1974-01-01
Range and range rate are crucial parameters which must be available to the operator during remote controlled orbital docking operations. A method was developed for the estimation of both these parameters using an aided television system. An experiment was performed to determine the human operator's capability to measure displayed image size using a fixed reticle or movable cursor as the television aid. The movable cursor was found to yield mean image size estimation errors on the order of 2.3 per cent of the correct value. This error rate was significantly lower than that for the fixed reticle. Performance using the movable cursor was found to be less sensitive to signal-to-noise ratio variation than was that for the fixed reticle. The mean image size estimation errors for the movable cursor correspond to an error of approximately 2.25 per cent in range suggesting that the system has some merit. Determining the accuracy of range rate estimation using a rate controlled cursor will require further experimentation.
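The underlying pinhole relation, range = target size × focal length / image size, also shows why a 2.3% error in measured image size maps to roughly a 2.25% error in range. A sketch with invented numbers:

```python
# Range from measured image size (pinhole relation), as in the aided-TV
# estimation task. Target size, focal length and image size are invented.

def range_from_image(target_size_m, focal_px, image_size_px):
    """Range (m) of a target of known size from its size on the sensor."""
    return target_size_m * focal_px / image_size_px

true_range = range_from_image(2.0, 1000.0, 40.0)        # 2 m target -> 50 m
biased = range_from_image(2.0, 1000.0, 40.0 * 1.023)    # 2.3% cursor error
relative_range_error = abs(biased - true_range) / true_range
```

Because range varies inversely with image size, a +2.3% size error gives a range error of 0.023/1.023, about 2.25%, matching the figure quoted in the abstract.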
Lidar method to estimate emission rates from extended sources
Currently, point measurements, often combined with models, are the primary means by which atmospheric emission rates are estimated from extended sources. However, these methods often fall short in their spatial and temporal resolution and accuracy. In recent years, lidar has emerged as a suitable to...
A Maximum Information Rate Quaternion Filter for Spacecraft Attitude Estimation
Reijneveld, J.; Maas, A.; Choukroun, D.; Kuiper, J.M.
2011-01-01
Building on previous works, this paper introduces a novel continuous-time stochastic optimal linear quaternion estimator under the assumptions of rate gyro measurements and of vector observations of the attitude. A quaternion observation model, whose observation matrix is rank-degenerate, is reduced
Estimating Ads’ Click through Rate with Recurrent Neural Network
Directory of Open Access Journals (Sweden)
Chen Qiao-Hong
2016-01-01
With the development of the Internet, online advertising has spread across every corner of the world, and estimation of the ads' click-through rate (CTR) is an important method to improve online advertising revenue. Compared with linear models, nonlinear models can learn much more complex relationships between a large number of nonlinear features, so as to improve the accuracy of CTR estimation. The recurrent neural network (RNN) based on Long Short-Term Memory (LSTM) is an improved model of the feedback neural network with a ring structure, and it overcomes the vanishing-gradient problem of the general RNN. Experiments show that the LSTM-based RNN outperforms the linear models and can effectively improve the estimation of the ads' click-through rate.
Shih, Ching-Lin; Liu, Tien-Hsiang; Wang, Wen-Chung
2014-01-01
The simultaneous item bias test (SIBTEST) method regression procedure and the differential item functioning (DIF)-free-then-DIF strategy are applied to the logistic regression (LR) method simultaneously in this study. These procedures are used to adjust the effects of matching true score on observed score and to better control the Type I error…
Directory of Open Access Journals (Sweden)
Iulian Voicu
2014-01-01
Characterizing fetal wellbeing with a Doppler ultrasound device requires computation of a score based on fetal parameters. In order to analyze the parameters derived from the fetal heart rate correctly, an accuracy of 0.25 beats per minute is needed. We investigated whether various Doppler techniques ensure this accuracy while also yielding the lowest false-negative rate and the highest sensitivity. We found that the accuracy was ensured if directional Doppler signals and autocorrelation estimation were used. Our best estimator provided a sensitivity of 95.5%, corresponding to an improvement of 14% compared to the standard estimator.
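Autocorrelation-based rate estimation, the estimator family the study found adequate, can be sketched on a synthetic periodic signal: the lag maximising the autocorrelation gives the beat period. The sampling rate and beat period below are invented.

```python
import math

# Autocorrelation-based rate estimation on a synthetic periodic signal.
# Real fetal Doppler processing works on the (directional) Doppler
# envelope; the clean sinusoid here just demonstrates the estimator.

FS = 100.0                       # sampling rate (Hz), illustrative
PERIOD_S = 0.42                  # synthetic beat period (~143 bpm)
period_samples = FS * PERIOD_S   # 42 samples per beat
signal = [math.sin(2 * math.pi * t / period_samples) for t in range(840)]

def best_lag(x, min_lag, max_lag):
    """Lag in [min_lag, max_lag] maximising the raw autocorrelation."""
    def acorr(lag):
        return sum(x[i] * x[i + lag] for i in range(len(x) - lag))
    return max(range(min_lag, max_lag + 1), key=acorr)

lag = best_lag(signal, 20, 120)  # search periods of 0.2-1.2 s
bpm = 60.0 * FS / lag
```

The lag resolution of this sketch is one sample; the sub-beat-per-minute accuracy the study requires is reached by interpolating the autocorrelation peak.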
Tutorial: Survival Estimation for Cox Regression Models with Time-Varying Coefficients Using SAS and R
Directory of Open Access Journals (Sweden)
Laine Thomas
2014-10-01
Survival estimates are an essential complement to multivariable regression models for time-to-event data, both for prediction and for illustration of covariate effects. They are easily obtained under the Cox proportional-hazards model. In populations defined by an initial, acute event, like myocardial infarction, or in studies with long-term follow-up, the proportional-hazards assumption of constant hazard ratios is frequently violated. One alternative is to fit an interaction between covariates and a prespecified function of time, implemented as a time-dependent covariate. This effectively creates a time-varying coefficient that is easily estimated in software such as SAS and R. However, the usual programming statements for survival estimation are not directly applicable. Unique data manipulation and syntax are required, but are not well documented for either software. This paper offers a tutorial in survival estimation for the time-varying coefficient model, implemented in SAS and R. We provide a macro coxtvc to facilitate estimation in SAS, where the current functionality is more limited. The macro is validated in simulated data and illustrated in an application.
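The data manipulation the tutorial refers to is, in counting-process form, an episode split: each subject's record is divided at the distinct event times so that a covariate-by-time interaction (here x·log t) can be evaluated within each interval. A language-agnostic sketch with toy records (the actual tutorial uses SAS and R; the three subjects below are invented):

```python
import math

# Episode splitting for a time-varying-coefficient Cox model: replicate
# each subject into (start, stop] intervals at the distinct event times,
# evaluating the time-dependent term x*log(t) per interval. Toy data.

subjects = [  # (id, follow-up time, event indicator, covariate x)
    (1, 5.0, 1, 0.0),
    (2, 8.0, 1, 1.0),
    (3, 8.0, 0, 1.0),
]

event_times = sorted({t for _, t, e, _ in subjects if e == 1})

expanded = []
for sid, time, event, x in subjects:
    start = 0.0
    for cut in event_times:
        if cut > time:
            break
        expanded.append({
            "id": sid, "start": start, "stop": cut,
            "event": int(event == 1 and cut == time),
            "x": x,
            "x_logt": x * math.log(cut),   # time-varying interaction term
        })
        start = cut
    # A final censored interval after the last event time is omitted here.
```

Each expanded row is one risk-set contribution; a Cox fitter in counting-process form (e.g. start/stop syntax) then treats `x_logt` as an ordinary covariate.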
Estimation of Aerosol Optical Depth at Different Wavelengths by Multiple Regression Method
Tan, Fuyi; Lim, Hwee San; Abdullah, Khiruddin; Holben, Brent
2015-01-01
This study aims to investigate and establish a suitable model that can help to estimate aerosol optical depth (AOD) in order to monitor aerosol variations especially during non-retrieval time. The relationship between actual ground measurements (such as air pollution index, visibility, relative humidity, temperature, and pressure) and AOD obtained with a CIMEL sun photometer was determined through a series of statistical procedures to produce an AOD prediction model with reasonable accuracy. The AOD prediction model calibrated for each wavelength has a set of coefficients. The model was validated using a set of statistical tests. The validated model was then employed to calculate AOD at different wavelengths. The results show that the proposed model successfully predicted AOD at each studied wavelength ranging from 340 nm to 1020 nm. To illustrate the application of the model, the aerosol size determined using measured AOD data for Penang was compared with that determined using the model. This was done by examining the curvature in the ln [AOD]-ln [wavelength] plot. Consistency was obtained when it was concluded that Penang was dominated by fine mode aerosol in 2012 and 2013 using both measured and predicted AOD data. These results indicate that the proposed AOD prediction model using routine measurements as input is a promising tool for the regular monitoring of aerosol variation during non-retrieval time.
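The core of such an AOD prediction model is an ordinary multiple regression of AOD on the ground measurements; a minimal sketch with synthetic data (the predictors and coefficients below are placeholders, not the paper's calibration):

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical standardized predictors: visibility, relative humidity, temperature.
X = rng.normal(size=(100, 3))
true_coef = np.array([0.5, -0.3, 0.1])
aod = X @ true_coef + 0.2 + rng.normal(scale=0.01, size=100)

# Add an intercept column and solve the least-squares problem.
A = np.column_stack([np.ones(100), X])
coef, *_ = np.linalg.lstsq(A, aod, rcond=None)
pred = A @ coef
r2 = 1 - np.sum((aod - pred) ** 2) / np.sum((aod - aod.mean()) ** 2)
```

In practice a separate coefficient vector `coef` would be fitted per wavelength, as the abstract describes.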
A Regression Equation for the Estimation of Maximum Oxygen Uptake in Nepalese Adult Females
Chatterjee, Pinaki; Banerjee, Alok K; Das, Paulomi; Debnath, Parimal
2010-01-01
Purpose Validity of the 20-meter multi stage shuttle run test (20-m MST) has not been studied in Nepalese population. The purpose of this study was to validate the applicability of the 20-m MST in Nepalese adult females. Methods Forty female college students (age range, 20.42 ~24.75 years) from different colleges of Nepal were recruited for the study. Direct estimation of VO2 max comprised treadmill exercise followed by expired gas analysis by scholander micro-gas analyzer whereas VO2 max was indirectly predicted by the 20-m MST. Results The difference between the mean (±SD) VO2 max values of direct measurement (VO2 max = 32.78 +/-2.88 ml/kg/min) and the 20-m MST (SPVO2 max = 32.53 + /-3.36 ml/kg/min) was statistically insignificant (P>0.1). Highly significant correlation (r=0.94, PVO2 max. Limits of agreement analysis also suggest that the 20-m MST can be applied for the studied population. Conclusion The results of limits of agreement analysis suggest that the application of the present form of the 20-m MST may be justified in the studied population. However, for better prediction of VO2 max, a new equation has been computed based on the present data to be used for female college students of Nepal. PMID:22375191
Mean square convergence rates for maximum quasi-likelihood estimator
Directory of Open Access Journals (Sweden)
Arnoud V. den Boer
2015-03-01
Full Text Available In this note we study the behavior of maximum quasi-likelihood estimators (MQLEs) for a class of statistical models in which only knowledge about the first two moments of the response variable is assumed. This class includes, but is not restricted to, generalized linear models with a general link function. Our main results concern guarantees on the existence, strong consistency and mean square convergence rates of MQLEs. The rates are obtained from first principles and are stronger than known a.s. rates. Our results find important application in sequential decision problems with parametric uncertainty arising in dynamic pricing.
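For the canonical Poisson/log-link case, the MQLE solves the estimating equation Σᵢ (yᵢ − μᵢ) xᵢ = 0 using only the first two moments of the response; a minimal Fisher-scoring sketch on synthetic data (not from the paper):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(0, 1, size=500)
X = np.column_stack([np.ones_like(x), x])
beta_true = np.array([0.5, 1.0])
y = rng.poisson(np.exp(X @ beta_true))

# Quasi-likelihood estimating equation for the log link, solved by Fisher
# scoring; only the assumption Var(y) proportional to the mean is used.
beta = np.zeros(2)
for _ in range(25):
    mu = np.exp(X @ beta)
    W = mu  # working weights: variance proportional to the mean
    grad = X.T @ (y - mu)
    hess = X.T @ (X * W[:, None])
    beta = beta + np.linalg.solve(hess, grad)
```

The same iteration applies unchanged if the true response distribution is not Poisson, as long as the mean-variance relationship holds, which is the point of the quasi-likelihood framework.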
A Pulse Rate Estimation Algorithm Using PPG and Smartphone Camera.
Siddiqui, Sarah Ali; Zhang, Yuan; Feng, Zhiquan; Kos, Anton
2016-05-01
The ubiquitous use of and advancements in built-in smartphone sensors and developments in big data processing have been beneficial in several fields, including healthcare. Among basic vital signs, pulse rate is the most important to monitor. A multimedia video stream acquired by the built-in smartphone camera can be used to estimate it. In this paper, an algorithm that uses only the smartphone camera as a sensor to estimate pulse rate from photoplethysmograph (PPG) signals is proposed. The results obtained by the proposed algorithm are compared with the actual pulse rate, and the maximum error found is 3 beats per minute. The standard deviation of the percentage error is 0.68%, whereas the average percentage error and percentage accuracy are 1.98% and 98.02%, respectively.
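A frequency-domain variant of such an estimator can be sketched in a few lines: the dominant spectral peak of a (here synthetic) PPG trace gives the pulse rate. The frame rate and signal model are assumptions for illustration, not the paper's pipeline:

```python
import numpy as np

fs = 30.0  # hypothetical smartphone camera frame rate, Hz
t = np.arange(0, 20, 1 / fs)
# Synthetic PPG-like signal: 1.2 Hz pulsation (72 bpm) plus noise.
signal = np.sin(2 * np.pi * 1.2 * t) \
    + 0.3 * np.random.default_rng(2).normal(size=t.size)

# The dominant frequency of the detrended signal gives the pulse rate.
spectrum = np.abs(np.fft.rfft(signal - signal.mean()))
freqs = np.fft.rfftfreq(signal.size, d=1 / fs)
bpm = 60.0 * freqs[np.argmax(spectrum)]
```

A real implementation would first extract the PPG trace as the mean green-channel intensity per video frame and band-pass it to the physiological pulse range before the spectral step.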
Estimating risk factors of urban malaria in Blantyre, Malawi: A spatial regression analysis
Institute of Scientific and Technical Information of China (English)
Lawrence N Kazembe; Don P Mathanga
2016-01-01
Objective: To estimate risk factors of urban malaria in Blantyre, Malawi, with the goal of understanding the epidemiology and ecology of the disease, and informing malaria elimination policies for African urban cities that have markedly low prevalence of malaria. Methods: We used a case-control study design, with cases being children under the age of five years diagnosed with malaria, and matched controls obtained at hospital and communities. The data were obtained from the Ndirande health facility catchment area. We then fitted a multivariate spatial logistic model of malaria risk. Covariates and risk factors in the model included child-specific, household and environmental risk factors (nearness to garden, standing water, river and swamps). The spatial component was assumed to follow a Gaussian process, and the model was fitted using Bayesian inference. Results: Our findings showed that children who visited rural areas were 6 times more likely to have malaria than those who did not [odds ratio (OR)=6.66, 95% confidence interval (CI): 4.79–9.61]. The risk of malaria increased with the age of the child (OR=1.01, 95% CI: 1.003–1.020), but was reduced with high socio-economic status compared to lower status (OR=0.39, 95% CI: 0.25–0.54 for the highest level and OR=0.67, 95% CI: 0.47–0.94 for the medium level). Although nearness to a garden, river and standing water showed increased risk, these effects were not significant. Furthermore, significant spatial clusters of risk emerged, which suggests that factors other than those established above also explain malaria risk variability. Conclusions: As malaria in urban areas is highly fuelled by rural-urban migration, emphasis should be placed on optimizing information, education and communication prevention strategies, particularly targeting children from lower socio-economic positions.
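Ignoring the spatial Gaussian-process term, the fixed-effects part of such a model is an ordinary logistic regression whose exponentiated slope is the reported odds ratio; a minimal Newton-Raphson sketch on synthetic data (only the OR of 6 is taken from the abstract; everything else is illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 2000
visited_rural = rng.integers(0, 2, size=n)  # hypothetical binary covariate
X = np.column_stack([np.ones(n), visited_rural])
beta_true = np.array([-2.0, np.log(6.0)])   # true odds ratio of 6
p = 1 / (1 + np.exp(-(X @ beta_true)))
y = rng.binomial(1, p)

# Newton-Raphson for the logistic log-likelihood; exp(slope) is the odds ratio.
beta = np.zeros(2)
for _ in range(25):
    mu = 1 / (1 + np.exp(-(X @ beta)))
    W = mu * (1 - mu)
    beta = beta + np.linalg.solve(X.T @ (X * W[:, None]), X.T @ (y - mu))
odds_ratio = np.exp(beta[1])
```

The paper's Bayesian spatial model additionally places a Gaussian-process prior on a location-specific random effect; the sketch above covers only the non-spatial logistic core.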
Institute of Scientific and Technical Information of China (English)
YANG Xiao-Hua; WANG Fu-Min; HUANG Jing-Feng; WANG Jian-Wen; WANG Ren-Chao; SHEN Zhang-Quan; WANG Xiu-Zhen
2009-01-01
The radial basis function (RBF) network emerged as a variant of the artificial neural network. The generalized regression neural network (GRNN) is one type of RBF network, and its principal advantages are that it can learn quickly and rapidly converge to the optimal regression surface with large numbers of data sets. Hyperspectral reflectance (350 to 2 500 nm) data were recorded at two different rice sites in two experiment fields with two cultivars, three nitrogen treatments and one plant density (45 plants m-2). A stepwise multivariable regression model (SMR) and RBF were used to compare their predictability for the leaf area index (LAI) and green leaf chlorophyll density (GLCD) of rice based on reflectance (R) and three of its transformations: the first derivative reflectance (D1), the second derivative reflectance (D2) and the log-transformed reflectance (LOG). GRNN based on D1 was the best model for the prediction of rice LAI and GLCD. The relationships between the different transformations of reflectance and the rice parameters could be further improved when RBF was employed. Owing to its strong capacity for nonlinear mapping and good robustness, GRNN could maximize the sensitivity to chlorophyll content using D1. It is concluded that RBF may provide a useful exploratory and predictive tool for the estimation of rice biophysical parameters.
Skou, Peter B; Berg, Thilo A; Aunsbjerg, Stina D; Thaysen, Dorrit; Rasmussen, Morten A; van den Berg, Frans
2017-03-01
Reuse of process water in dairy ingredient production, and food processing in general, opens the possibility for sustainable water regimes. Membrane filtration processes are an attractive source of process water recovery, since the technology is already utilized in the dairy industry and its use is expected to grow considerably. At Arla Foods Ingredients (AFI), permeate from a reverse osmosis polisher filtration unit is sought to be reused as process water, replacing the intake of potable water. However, as for all dairy and food producers, the process water quality must be monitored continuously to ensure food safety. In the present investigation we found urea to be the main organic compound, which potentially could represent a microbiological risk. Near-infrared spectroscopy (NIRS) in combination with multivariate modeling has a long-standing reputation as a real-time measurement technology in quality assurance. Urea was quantified using NIRS and partial least squares regression (PLS) in the concentration range 50-200 ppm (RMSEP = 12 ppm, R² = 0.88) in laboratory settings, with potential for on-line application. A drawback of using NIRS together with PLS is that uncertainty estimates are seldom reported but are essential to establishing real-time risk assessment. In a multivariate regression setting, sample-specific prediction errors are needed, which complicates the uncertainty estimation. We give a straightforward strategy for implementing an already developed, but seldom used, method for estimating sample-specific prediction uncertainty, and we suggest an improvement. Comparing independent reference analyses with the sample-specific prediction error estimates showed that the method worked on industrial samples when the model was appropriate and unbiased, and was simple to implement.
Bardsley, Nicholas; Büchs, Milena; Schnepf, Sylke V
2017-01-01
Consumption surveys often record zero purchases of a good because of a short observation window. Measures of distribution are then precluded and only mean consumption rates can be inferred. We show that Propensity Score Matching can be applied to recover the distribution of consumption rates. We demonstrate the method using the UK National Travel Survey, in which c.40% of motorist households purchase no fuel. Estimated consumption rates are plausible judging by households' annual mileages, and highly skewed. We apply the same approach to estimate CO2 emissions and outcomes of a carbon cap or tax. Reliance on means apparently distorts analysis of such policies because of skewness of the underlying distributions. The regressiveness of a simple tax or cap is overstated, and redistributive features of a revenue-neutral policy are understated.
Peer Rated Therapeutic Talent and Affective Sensitivity: A Multiple Regression Approach.
Jackson, Eugene
1985-01-01
Used peer rated measures of Warmth, Understanding and Openness to predict scores on the Kagan Affective Sensitivity Scale-E80 among 66 undergraduates who had participated in interpersonal skills training groups. Results indicated that, as an additive composite index of Therapeutic Talent, they were positively correlated with affective…
Directory of Open Access Journals (Sweden)
Matthias Schmid
Full Text Available Regression analysis with a bounded outcome is a common problem in applied statistics. Typical examples include regression models for percentage outcomes and the analysis of ratings that are measured on a bounded scale. In this paper, we consider beta regression, which is a generalization of logit models to situations where the response is continuous on the interval (0,1). Consequently, beta regression is a convenient tool for analyzing percentage responses. The classical approach to fitting a beta regression model is to use maximum likelihood estimation with subsequent AIC-based variable selection. As an alternative to this established - yet unstable - approach, we propose a new estimation technique called boosted beta regression. With boosted beta regression, estimation and variable selection can be carried out simultaneously in a highly efficient way. Additionally, both the mean and the variance of a percentage response can be modeled using flexible nonlinear covariate effects. As a consequence, the new method accounts for common problems such as overdispersion and non-binomial variance structures.
Directory of Open Access Journals (Sweden)
Carla Maria Abido Valentini
2008-03-01
Full Text Available Many research groups have been studying the contribution of tropical forests to the global carbon cycle, and the climatic consequences of substituting pastures for forests. Considering that soil CO2 efflux is the largest component of the carbon cycle of the biosphere, this work derived an equation for estimating the soil CO2 efflux of an area of transition forest, using a multiple regression model for time series of temperature and soil moisture. The study was carried out in the northwest of Mato Grosso, Brazil (11°24.75'S; 55°19.50'W), in a transition forest between cerrado and Amazon forest, 50 km from Sinop county. Each month, throughout one year, soil CO2 efflux, temperature and soil moisture were measured. The annual average soil CO2 efflux was 7.5 ± 0.6 (mean ± SE) μmol m-2 s-1, and the annual mean soil temperature was 25.06 ± 0.12 (mean ± SE) °C. The study indicated that humidity had a strong influence on soil CO2 efflux; however, the results were more significant using a multiple regression model that estimated the logarithm of soil CO2 efflux, considering time, soil moisture and the interaction between time duration and the inverse of soil temperature.
Directory of Open Access Journals (Sweden)
Maria Gabriela Campolina Diniz Peixoto
2014-05-01
Full Text Available The objective of this work was to compare random regression models for the estimation of genetic parameters for Guzerat milk production, using orthogonal Legendre polynomials. Records (20,524) of test-day milk yield (TDMY) from 2,816 first-lactation Guzerat cows were used. TDMY grouped into 10 monthly classes were analyzed for the additive genetic effect and for the permanent environmental and residual effects (random effects), whereas the contemporary group, calving age (linear and quadratic effects) and mean lactation curve were analyzed as fixed effects. Trajectories for the additive genetic and permanent environmental effects were modeled by means of a covariance function employing orthogonal Legendre polynomials ranging from the second to the fifth order. Residual variances were considered in one, four, six, or ten variance classes. The best model had six residual variance classes. The heritability estimates for the TDMY records varied from 0.19 to 0.32. The random regression model that used a second-order Legendre polynomial for the additive genetic effect and a fifth-order polynomial for the permanent environmental effect was the most adequate according to the main criteria employed. The model with a second-order Legendre polynomial for the additive genetic effect and a fourth-order polynomial for the permanent environmental effect could also be employed in these analyses.
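The covariance-function basis used in such models, Legendre polynomials evaluated on test-day classes standardized to [-1, 1], can be generated with the Bonnet recurrence; a minimal sketch (the 10 monthly classes follow the abstract, the rest is illustrative):

```python
import numpy as np

def legendre_basis(classes, order):
    """Legendre polynomial covariables for test-day classes.

    Classes are standardized to [-1, 1], then the Bonnet recurrence
    (n+1) P_{n+1}(x) = (2n+1) x P_n(x) - n P_{n-1}(x) builds the basis.
    """
    a = np.asarray(classes, dtype=float)
    x = 2 * (a - a.min()) / (a.max() - a.min()) - 1
    P = [np.ones_like(x), x]
    for n in range(1, order):
        P.append(((2 * n + 1) * x * P[n] - n * P[n - 1]) / (n + 1))
    return np.column_stack(P[: order + 1])

# 10 monthly test-day classes, polynomials up to order 2.
Z = legendre_basis(range(1, 11), order=2)
```

The columns of `Z` enter the mixed model as covariables whose random coefficients define the genetic and permanent environmental trajectories.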
Directory of Open Access Journals (Sweden)
Y. Plancherel
2013-07-01
Full Text Available Quantifying oceanic anthropogenic carbon uptake by monitoring interior dissolved inorganic carbon (DIC concentrations is complicated by the influence of natural variability. The "eMLR method" aims to address this issue by using empirical regression fits of the data instead of the data themselves, inferring the change in anthropogenic carbon in time by difference between predictions generated by the regressions at each time. The advantages of the method are that it provides in principle a means to filter out natural variability, which theoretically becomes the regression residuals, and a way to deal with sparsely and unevenly distributed data. The degree to which these advantages are realized in practice is unclear, however. The ability of the eMLR method to recover the anthropogenic carbon signal is tested here using a global circulation and biogeochemistry model in which the true signal is known. Results show that regression model selection is particularly important when the observational network changes in time. When the observational network is fixed, the likelihood that co-located systematic misfits between the empirical model and the underlying, yet unknown, true model cancel is greater, improving eMLR results. Changing the observational network modifies how the spatio-temporal variance pattern is captured by the respective datasets, resulting in empirical models that are dynamically or regionally inconsistent, leading to systematic errors. In consequence, the use of regression formulae that change in time to represent systematically best-fit models at all times does not guarantee the best estimates of anthropogenic carbon change if the spatial distributions of the stations emphasize hydrographic features differently in time. Other factors, such as a balanced and representative station coverage, vertical continuity of the regression formulae consistent with the hydrographic context and resiliency of the spatial distribution of the residual
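The eMLR idea itself is compact: fit the same regression formula to the two surveys and difference the predictions, so that natural variability captured by the covariates cancels. A minimal synthetic sketch (the covariates, noise levels and the 10-unit anthropogenic perturbation are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(4)
# Hypothetical standardized hydrographic predictors (e.g. temperature,
# salinity) sampled at two survey times.
X1 = rng.normal(size=(200, 2))
X2 = rng.normal(size=(200, 2))
natural = lambda X: 2000 + 5 * X[:, 0] - 3 * X[:, 1]
dic1 = natural(X1) + rng.normal(scale=0.5, size=200)
dic2 = natural(X2) + 10.0 + rng.normal(scale=0.5, size=200)  # +10 anthropogenic

fit = lambda X, y: np.linalg.lstsq(
    np.column_stack([np.ones(len(X)), X]), y, rcond=None)[0]
b1, b2 = fit(X1, dic1), fit(X2, dic2)

# eMLR: evaluate both fitted models on a common grid, then difference.
grid = np.column_stack([np.ones(50), rng.normal(size=(50, 2))])
delta_canth = (grid @ b2 - grid @ b1).mean()
```

The paper's point is precisely that this clean cancellation degrades when the regression formula or the station coverage changes between surveys, which this idealized sketch does not capture.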
LENUS (Irish Health Repository)
Francis, Dawn L
2011-03-01
The adenoma detection rate (ADR) is a quality benchmark for colonoscopy. Many practices find it difficult to determine the ADR because it requires a combination of endoscopic and histologic findings. It may be possible to apply a conversion factor to estimate the ADR from the polyp detection rate (PDR).
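The conversion the abstract alludes to is simple arithmetic: multiply the easily measured PDR by the fraction of detected polyps that prove to be adenomas. The 0.64 ratio below is a hypothetical value, not the paper's estimate:

```python
# Estimate the adenoma detection rate (ADR) from the polyp detection rate
# (PDR) via a practice-specific conversion factor (hypothetical here).
def estimate_adr(pdr, adenoma_to_polyp_ratio):
    return pdr * adenoma_to_polyp_ratio

adr = estimate_adr(pdr=0.40, adenoma_to_polyp_ratio=0.64)  # 0.256
```

The conversion factor would have to be calibrated per endoscopist population, since the adenoma fraction varies with patient mix.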
Rate of Penetration Optimization using Moving Horizon Estimation
Directory of Open Access Journals (Sweden)
Dan Sui
2016-07-01
Full Text Available Increased drilling safety and reduced drilling operation costs, especially improved drilling efficiency, are two important considerations in the oil and gas industry. The rate of penetration (ROP), also called drilling speed, is a critical drilling parameter for evaluating and improving drilling safety and efficiency. ROP estimation has an important role in drilling optimization as well as in the interpretation of all stages of the well life cycle. In this paper, we use a moving horizon estimation (MHE) method to estimate ROP as well as other drilling parameters. In the MHE formulation the states are estimated by a forward simulation with a pre-estimating observer. Moreover, the formulation accounts for constraints on states/outputs in the MHE problem. It is shown that the estimation error is input-to-state stable. Furthermore, ROP optimization (to achieve minimum drilling cost/drilling energy subject to efficient hole cleaning and downhole environmental stability) is presented. The performance of the methodology is demonstrated in a case study.
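A drastically simplified, constraint-aware version of the idea, estimating ROP at each step from only a sliding window of recent bit-depth measurements and clipping to the physical constraint ROP ≥ 0, can be sketched as follows (all numbers are illustrative; the paper's MHE uses a full drilling model with a pre-estimating observer):

```python
import numpy as np

rng = np.random.default_rng(6)
dt = 1.0 / 60.0        # measurement interval, hours
true_rop = 30.0        # m/h, assumed constant for this toy example
times = np.arange(0, 2, dt)
depth = 100 + true_rop * times + rng.normal(scale=0.5, size=times.size)

# Moving-horizon flavor: at each step, fit bit depth over the last N
# samples only, and enforce the constraint ROP >= 0.
N = 20
rop_est = []
for k in range(N, times.size):
    slope, _ = np.polyfit(times[k - N:k], depth[k - N:k], 1)
    rop_est.append(max(slope, 0.0))
rop_hat = float(np.mean(rop_est))
```

The fixed-length window is what distinguishes MHE-style estimation from a full-history least-squares fit: old data drop out, so the estimate can track a changing ROP.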
Heart rate variability regression and risk of sudden unexpected death in epilepsy.
Galli, Alessio; Lombardi, Federico
2017-02-01
The exact mechanisms of sudden unexpected death in epilepsy (SUDEP) remain elusive, although there is consensus that SUDEP is associated with severe derangements in the autonomic control of vital functions such as breathing and heart rate regulation. Heart rate variability (HRV) has been advocated as a biomarker of autonomic control of the heart. Cardiac dysautonomia has been found in diseases where other branches of the autonomic nervous system are damaged, such as Parkinson's disease and multiple system atrophy. In this perspective, an impaired HRV is not only a risk factor for sudden cardiac death mediated by arrhythmias, but also a potential biomarker for monitoring a progressive decline of the autonomic nervous system. This decline may lead to an acute imbalance of the regulatory pathways of vital functions after seizure and then to SUDEP. Copyright © 2016 Elsevier Ltd. All rights reserved.
Institute of Scientific and Technical Information of China (English)
L.P. Karjalainen; M.C. Somani; S.F. Medina
2004-01-01
The analysis of numerous experimental equations published in the literature reveals a wide scatter in the predictions for the static recrystallization kinetics of steels. The powers of the deformation variables (strain and strain rate), as well as the power of the grain size, vary among these equations. These differences are highlighted, and the typical values are compared between torsion and compression tests. Potential errors in physical simulation testing are discussed.
Carbon Nanotube Growth Rate Regression using Support Vector Machines and Artificial Neural Networks
2014-03-27
…rates are realized by this faster search. 1.3 Assumptions: the machine learning approach used for extracting optimal growth parameters assumes the catalyst… …and high-strength polymers [25]. All carbon-to-carbon bonds are filled in a CNT, so they are chemically inert and stable in acids, bases and solvents… …research in maximizing CNT length. SWNTs of 18.5 cm in length were obtained by using an ethanol precursor and an iron-molybdenum catalyst [10]. Also, by…
Energy Technology Data Exchange (ETDEWEB)
Nakagawa, S. [Maizuru National College of Technology, Kyoto (Japan); Kenmoku, Y.; Sakakibara, T. [Toyohashi University of Technology, Aichi (Japan); Kawamoto, T. [Shizuoka University, Shizuoka (Japan). Faculty of Engineering
1996-10-27
A study is under way on more accurate solar radiation prediction for the enhancement of solar energy utilization efficiency. Using a technique that roughly estimates the day's clearness index from the forecast weather, the forecast weather (consisting of weather conditions such as 'clear' and 'cloudy,' and adverbs or adjectives such as 'afterward,' 'temporary,' and 'intermittent') has been quantified relative to the clearness index. This index is named the 'weather index' for the purpose of this article. The largest errors in the weather index occur on cloudy days, that is, for weather index values of 0.2-0.5. It has also been found that there is a high correlation between the clearness index and the north-south wind direction component. A multiple regression analysis has therefore been carried out to estimate the clearness index from the maximum temperature and the north-south wind direction component. Compared with estimating the clearness index from the weather index alone, estimation using the weather index and maximum temperature achieves a 3% improvement throughout the year. It has also been learned that estimation using the weather index and the north-south wind direction component enables a 2% improvement for summer and a 5% or higher improvement for winter. 2 refs., 6 figs., 4 tabs.
Kahane, Leo H
2007-01-01
Using a friendly, nontechnical approach, the Second Edition of Regression Basics introduces readers to the fundamentals of regression. Accessible to anyone with an introductory statistics background, this book builds from a simple two-variable model to a model of greater complexity. Author Leo H. Kahane weaves four engaging examples throughout the text to illustrate not only the techniques of regression but also how this empirical tool can be applied in creative ways to consider a broad array of topics. New to the Second Edition Offers greater coverage of simple panel-data estimation:
Glomerular filtration rate in cows estimated by a prediction formula.
Murayama, Isao; Miyano, Anna; Sato, Tsubasa; Iwama, Ryosuke; Satoh, Hiroshi; Ichijyo, Toshihiro; Sato, Shigeru; Furuhama, Kazuhisa
2014-12-01
To test the relevance of Jacobsson's equation for estimating bovine glomerular filtration rate (GFR), we prepared an integrated formula based on the equation using clinically healthy dairy (n=99) and beef (n=63) cows, and cows with reduced renal function (n=15). The isotonic, nonionic contrast medium iodixanol was used as a test tracer. The GFR values estimated from the integrated formula agreed well with those from the standard multisample method in each cow strain, and with the Holstein equation derived from a single blood sample in Holstein dairy cows. The basal reference GFR value in healthy dairy cows was significantly higher than that in healthy beef cows, presumably due to a breed difference or a difference in physiological state. It is concluded that the validity of applying Jacobsson's equation to estimate bovine GFR is proven, and it can be used in bovine practice.
Directory of Open Access Journals (Sweden)
N. Zahir
2015-12-01
Full Text Available Lake Urmia is one of the most important ecosystems of the country, and it is on the verge of disappearing. Many factors contribute to this crisis; among them, precipitation plays an important role. Precipitation takes many forms, one of which is snow. The snow on Sahand Mountain is one of the main and important sources of Lake Urmia's water. Snow depth (SD) is a vital parameter for estimating the water balance for future years. In this regard, this study focuses on the SD parameter using the Special Sensor Microwave/Imager (SSM/I) instrument on board the Defense Meteorological Satellite Program (DMSP) F16 satellite. The usual statistical methods for retrieving SD include linear and non-linear ones, which use a least-squares procedure to estimate the SD model. Recently, kernel-based methods have been widely used for statistical modelling. Among these methods, support vector regression (SVR) achieves high performance. Examination of the obtained data shows the existence of outliers; to remove them, a wavelet denoising method is applied. After removing the outliers, the optimum bands and parameters for SVR must be selected. For this purpose, feature selection methods have shown a direct effect on improving the regression performance. We used a genetic algorithm (GA) to select suitable features of the SSM/I bands in order to estimate the SD model. The results for the training and testing data in the Sahand mountain area [R²_TEST=0.9049 and RMSE=6.9654] show the high performance of SVR.
Simple estimate of fission rate during JCO criticality accident
Energy Technology Data Exchange (ETDEWEB)
Oyamatsu, Kazuhiro [Faculty of Studies on Contemporary Society, Aichi Shukutoku Univ., Nagakute, Aichi (Japan)
2000-03-01
The fission rate during the JCO criticality accident is estimated from fission-product (FP) radioactivities in a uranium solution sample taken from the preparation basin 20 days after the accident. The FP radioactivity data are taken from a report by JAERI released in the Accident Investigation Committee. The total fission number is found to depend strongly on the FP radioactivities and is estimated to be about 4×10¹⁶ per liter, or 2×10¹⁸ per 16 kgU (assuming a uranium concentration of 278.9 g/liter). In contrast, the time dependence of the fission rate is rather insensitive to the FP radioactivities. Hence, it is difficult to determine the fission number in the initial burst from the radioactivity data. (author)
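The back-calculation underlying such an estimate is a one-line decay correction per fission product: the measured activity satisfies A(t) = λ·Y·N_f·exp(−λt), so the fission number N_f follows by division. All numerical values below are illustrative, not the JAERI data:

```python
import math

# Back-calculate the fission number from one FP activity measured t days
# after the accident. Nuclide data and activity are illustrative only.
half_life_d = 8.02          # days, e.g. I-131
fission_yield = 0.029       # cumulative yield per fission (illustrative)
t_days = 20.0               # sampling delay after the accident
activity_bq = 1.0e9         # measured activity per litre (illustrative)

lam = math.log(2) / (half_life_d * 86400)   # decay constant, 1/s
# A(t) = lam * Y * N_f * exp(-lam*t)  =>  N_f = A / (lam * Y * exp(-lam*t))
fissions = activity_bq / (lam * fission_yield
                          * math.exp(-lam * t_days * 86400))
```

Because the decay correction is exponential in the sampling delay, the fission-number estimate is sensitive to the assumed activities, which is consistent with the abstract's observation.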
Bener, A; Hussain, S J; Al-Malki, M a; Shotar, M M; Al-Said, M F; Jadaan, K S
2010-03-01
Smeed's equation is a widely used model for prediction of traffic fatalities but has been inadequate for use in developing countries. We applied regression analysis to time-series data on vehicles, population and traffic fatalities in the United Arab Emirates (UAE), Jordan and Qatar. The data were fitted to exponential models for fatality prediction, producing an average absolute error of 20.9% for Qatar, 10.9% for Jordan and 5.5% for the UAE. We found a strong linear relationship between gross domestic product and fatality rate.
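An exponential fatality model of this kind is conveniently fitted as a log-linear regression; a minimal sketch with synthetic data (the growth rate, years and noise level are invented, not the countries' series):

```python
import numpy as np

years = np.arange(2000, 2010)
# Hypothetical fatality counts growing ~5% per year with small noise.
fatalities = 100 * np.exp(0.05 * (years - 2000)) \
    * (1 + 0.01 * np.random.default_rng(5).normal(size=10))

# Exponential model F = a * exp(b*t) fitted as a log-linear regression.
t = years - 2000
b, log_a = np.polyfit(t, np.log(fatalities), 1)
pred = np.exp(log_a + b * t)
mape = 100 * np.mean(np.abs(pred - fatalities) / fatalities)
```

The mean absolute percentage error `mape` is the same accuracy measure quoted in the abstract (20.9%, 10.9% and 5.5% for the three countries).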
Estimating Divergence Times and Substitution Rates in Rhizobia.
Chriki-Adeeb, Rim; Chriki, Ali
2016-01-01
Accurate estimation of divergence times of soil bacteria that form nitrogen-fixing associations with most leguminous plants is challenging because of a limited fossil record and complexities associated with molecular clocks and phylogenetic diversity of root nodule bacteria, collectively called rhizobia. To overcome the lack of fossil record in bacteria, divergence times of host legumes were used to calibrate molecular clocks and perform phylogenetic analyses in rhizobia. The 16S rRNA gene and intergenic spacer region remain among the favored molecular markers to reconstruct the timescale of rhizobia. We evaluate the performance of the random local clock model and the classical uncorrelated lognormal relaxed clock model, in combination with four tree models (coalescent constant size, birth-death, birth-death incomplete sampling, and Yule processes) on rhizobial divergence time estimates. Bayes factor tests based on the marginal likelihoods estimated from the stepping-stone sampling analyses strongly favored the random local clock model in combination with Yule process. Our results on the divergence time estimation from 16S rRNA gene and intergenic spacer region sequences are compatible with age estimates based on the conserved core genes but significantly older than those obtained from symbiotic genes, such as nodIJ genes. This difference may be due to the accelerated evolutionary rates of symbiotic genes compared to those of other genomic regions not directly implicated in nodulation processes.
Institute of Scientific and Technical Information of China (English)
MA Zhangyong; YAN Yongqing; ZHAO Chunming; YOU Xiaohu
2003-01-01
In this paper, an improved channel estimation algorithm based on tracking the level crossing rate (LCR) of the fading process is proposed for CDMA systems with a continuous pilot channel. By using a simple LCR estimator, the Doppler shift can be calculated approximately, so that the observation length of the channel estimation can be adjusted dynamically. An iterative procedure for the time-varying channel is presented. Computer simulation results show that the algorithm achieves a good tradeoff between noise compression capability and channel tracking performance.
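The LCR-to-Doppler step can be sketched numerically. For a Rayleigh-fading envelope, the classical relation LCR = √(2π)·f_d·ρ·exp(−ρ²) (with ρ the threshold normalized to the RMS level) can be inverted to estimate the maximum Doppler shift f_d. The function names below are illustrative, not from the paper:

```python
import numpy as np

def level_crossing_rate(envelope, threshold, fs):
    """Count upward crossings of `threshold` per second in a sampled envelope."""
    above = envelope >= threshold
    up_crossings = np.count_nonzero(~above[:-1] & above[1:])
    return up_crossings * fs / len(envelope)

def doppler_from_lcr(lcr, rho):
    """Invert the Rayleigh-fading relation LCR = sqrt(2*pi)*fd*rho*exp(-rho**2)."""
    return lcr / (np.sqrt(2.0 * np.pi) * rho * np.exp(-rho ** 2))
```

Given a measured LCR at a normalized level ρ, `doppler_from_lcr` returns the Doppler estimate that could then drive the adaptive observation-length logic described in the abstract.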
Phylogenetic estimates of diversification rate are affected by molecular rate variation.
Duchêne, D A; Hua, X; Bromham, L
2017-10-01
Molecular phylogenies are increasingly being used to investigate the patterns and mechanisms of macroevolution. In particular, node heights in a phylogeny can be used to detect changes in rates of diversification over time. Such analyses rest on the assumption that node heights in a phylogeny represent the timing of diversification events, which in turn rests on the assumption that evolutionary time can be accurately predicted from DNA sequence divergence. But there are many influences on the rate of molecular evolution, which might also influence node heights in molecular phylogenies, and thus affect estimates of diversification rate. In particular, a growing number of studies have revealed an association between the net diversification rate estimated from phylogenies and the rate of molecular evolution. Such an association might, by influencing the relative position of node heights, systematically bias estimates of diversification time. We simulated the evolution of DNA sequences under several scenarios where rates of diversification and molecular evolution vary through time, including models where diversification and molecular evolutionary rates are linked. We show that commonly used methods, including metric-based, likelihood and Bayesian approaches, can have a low power to identify changes in diversification rate when molecular substitution rates vary. Furthermore, the association between the rates of speciation and molecular evolution rate can cause the signature of a slowdown or speedup in speciation rates to be lost or misidentified. These results suggest that the multiple sources of variation in molecular evolutionary rates need to be considered when inferring macroevolutionary processes from phylogenies. © 2017 European Society For Evolutionary Biology. Journal of Evolutionary Biology © 2017 European Society For Evolutionary Biology.
Functional response models to estimate feeding rates of wading birds
Collazo, J.A.; Gilliam, J.F.; Miranda-Castro, L.
2010-01-01
Forager (predator) abundance may mediate feeding rates in wading birds. Yet, when modeled, feeding rates are typically derived from the purely prey-dependent Holling Type II (HoII) functional response model. Estimates of feeding rates are necessary to evaluate wading bird foraging strategies and their role in food webs; thus, models that incorporate predator dependence warrant consideration. Here, data collected in a mangrove swamp in Puerto Rico in 1994 were reanalyzed, reporting feeding rates for mixed-species flocks after comparing fits of the HoII model, as used in the original work, to the Beddington-DeAngelis (BD) and Crowley-Martin (CM) predator-dependent models. Model CM received most support (AICc wi = 0.44), but models BD and HoII were plausible alternatives (ΔAICc ≤ 2). Results suggested that feeding rates were constrained by predator abundance. Reductions in rates were attributed to interference, which was consistent with the independently observed increase in aggression as flock size increased (P rates. However, inferences derived from the HoII model, as used in the original work, were sound. While Holling's Type II and other purely prey-dependent models have fostered advances in wading bird foraging ecology, evaluating models that incorporate predator dependence could lead to a more adequate description of data and processes of interest. The mechanistic bases used to derive the models used here lead to biologically interpretable results and advance understanding of wading bird foraging ecology.
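For concreteness, one common parameterization of the three competing functional responses can be written down directly (a = attack rate, h = handling time, c = interference coefficient; the exact forms fitted in the paper may differ slightly):

```python
def holling_type2(N, a, h):
    """Purely prey-dependent Holling Type II intake rate."""
    return a * N / (1.0 + a * h * N)

def beddington_deangelis(N, P, a, h, c):
    """BD model: interference adds a predator term to the denominator."""
    return a * N / (1.0 + a * h * N + c * (P - 1))

def crowley_martin(N, P, a, h, c):
    """CM model: interference also acts during prey handling (multiplicative)."""
    return a * N / ((1.0 + a * h * N) * (1.0 + c * (P - 1)))
```

With P = 1 (a lone forager) both predator-dependent models collapse to Holling Type II, and for c > 0 the CM model predicts a stronger depression of feeding rate than BD because interference operates even while prey are being handled.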
Estimation of multiple transmission rates for epidemics in heterogeneous populations.
Cook, Alex R; Otten, Wilfred; Marion, Glenn; Gibson, Gavin J; Gilligan, Christopher A
2007-12-18
One of the principal challenges in epidemiological modeling is to parameterize models with realistic estimates for transmission rates in order to analyze strategies for control and to predict disease outcomes. Using a combination of replicated experiments, Bayesian statistical inference, and stochastic modeling, we introduce and illustrate a strategy to estimate transmission parameters for the spread of infection through a two-phase mosaic, comprising favorable and unfavorable hosts. We focus on epidemics with local dispersal and formulate a spatially explicit, stochastic set of transition probabilities using a percolation paradigm for a susceptible-infected (S-I) epidemiological model. The S-I percolation model is further generalized to allow for multiple sources of infection including external inoculum and host-to-host infection. We fit the model using Bayesian inference and Markov chain Monte Carlo simulation to successive snapshots of damping-off disease spreading through replicated plant populations that differ in relative proportions of favorable and unfavorable hosts and with time-varying rates of transmission. Epidemiologically plausible parametric forms for these transmission rates are compared by using the deviance information criterion. Our results show that there are four transmission rates for a two-phase system, corresponding to each combination of infected donor and susceptible recipient. Knowing the number and magnitudes of the transmission rates allows the dominant pathways for transmission in a heterogeneous population to be identified. Finally, we show how failure to allow for multiple transmission rates can overestimate or underestimate the rate of spread of epidemics in heterogeneous environments, which could lead to marked failure or inefficiency of control strategies.
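The inferential machinery can be illustrated with a stripped-down version of the approach: a single transmission rate β estimated from successive snapshots of an S-I process by Metropolis sampling. The paper fits four rates plus external inoculum; the discrete-time escape probability and flat prior below are simplifying assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

def log_lik(beta, snapshots, dt):
    """Binomial log-likelihood; snapshots are (S, I, new_infections) tuples.

    Each susceptible escapes infection over dt with probability exp(-beta*I*dt)."""
    ll = 0.0
    for S, I, k in snapshots:
        p = 1.0 - np.exp(-beta * I * dt)
        ll += k * np.log(p) + (S - k) * np.log(1.0 - p)
    return ll

def metropolis(snapshots, dt, n_iter=5000, step=0.02):
    """Random-walk Metropolis sampler for beta under an (improper) flat prior."""
    beta = 0.1
    ll = log_lik(beta, snapshots, dt)
    chain = []
    for _ in range(n_iter):
        prop = beta + step * rng.standard_normal()
        if prop > 0:
            ll_prop = log_lik(prop, snapshots, dt)
            if np.log(rng.random()) < ll_prop - ll:
                beta, ll = prop, ll_prop
        chain.append(beta)
    return np.array(chain)
```

On simulated snapshot data the posterior mean recovers the generating rate; in the paper's setting the same accept/reject core runs over four transmission parameters and the external-inoculum term.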
bz-rates: A Web Tool to Estimate Mutation Rates from Fluctuation Analysis.
Gillet-Markowska, Alexandre; Louvel, Guillaume; Fischer, Gilles
2015-09-02
Fluctuation analysis is the standard experimental method for measuring mutation rates in micro-organisms. The appearance of mutants is classically described by a Luria-Delbrück distribution composed of two parameters: the number of mutations per culture (m) and the differential growth rate between mutant and wild-type cells (b). A precise estimation of these two parameters is a prerequisite to the calculation of the mutation rate. Here, we developed bz-rates, a Web tool to calculate mutation rates that provides three useful advances over existing Web tools. First, it allows taking into account b, the differential growth rate between mutant and wild-type cells, in the estimation of m with the generating function. Second, bz-rates allows the user to take into account a deviation from the Luria-Delbrück distribution called z, the plating efficiency, in the estimation of m. Finally, the Web site provides a graphical visualization of the goodness-of-fit between the experimental data and the model. bz-rates is accessible at http://www.lcqb.upmc.fr/bzrates.
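As a point of comparison with the generating-function approach, the simplest classical estimator is the p0 method, which uses only the fraction of mutant-free cultures; it ignores the differential growth rate b and plating efficiency z that bz-rates accounts for. Function names are illustrative:

```python
import math

def p0_mutation_number(mutant_counts):
    """Luria-Delbruck p0 method: P(no mutation) = exp(-m), so m = -ln(p0)."""
    p0 = sum(1 for r in mutant_counts if r == 0) / len(mutant_counts)
    if p0 == 0.0:
        raise ValueError("no mutant-free cultures: p0 method not applicable")
    return -math.log(p0)

def p0_mutation_rate(mutant_counts, cells_per_culture):
    """Mutation rate per cell: m divided by the final population size."""
    return p0_mutation_number(mutant_counts) / cells_per_culture
```

Because it discards all information in the nonzero counts, the p0 method is noisier than likelihood-based estimators of m, which motivates tools like bz-rates.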
Estimation for Traffic Arrival Rate and Service Rate of Primary Users in Cognitive Radio Networks
Institute of Scientific and Technical Information of China (English)
Xiaolong Yang; Xuezhi Tan∗
2015-01-01
In order to estimate the traffic arrival rate and service rate parameters of primary users in cognitive radio networks, a hidden Markov model estimation algorithm (HMM-EA) is proposed, which can provide better estimation performance than the energy detection estimation algorithm (ED-EA). Firstly, spectrum usage behaviors of primary users are described by establishing a preemptive priority queue model, from which a real state transition probability matrix is derived. Secondly, cooperative detection is utilized to detect the real state of primary users, and the emission matrix is derived by considering both detection and false alarm probabilities. Then, a hidden Markov model is built based on the previous two steps and evaluated through the forward-backward algorithm. Finally, simulation results verify that the HMM-EA algorithm outperforms the ED-EA in terms of convergence performance, and therefore the secondary user is able to access the unused channel with the least busy probability in real time.
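The evaluation step mentioned above reduces, for likelihood computation, to the scaled forward recursion. A minimal sketch with generic state and emission matrices (not the queueing-model specifics of the paper):

```python
import numpy as np

def forward_loglik(pi, A, B, obs):
    """Scaled forward algorithm: log P(obs) for an HMM.

    pi: (S,) initial state probabilities; A: (S, S) transition matrix;
    B: (S, M) emission matrix; obs: sequence of observed symbol indices."""
    alpha = pi * B[:, obs[0]]
    c = alpha.sum()              # scaling constant prevents underflow
    logp = np.log(c)
    alpha = alpha / c
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]   # predict one step, weight by emission
        c = alpha.sum()
        logp += np.log(c)
        alpha = alpha / c
    return logp
```

The log-likelihood equals the sum of the logs of the scaling constants, which is the quantity an HMM-based estimator would maximize over the arrival/service-rate parameters.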
Demirturk Kocasarac, Husniye; Sinanoglu, Alper; Noujeim, Marcel; Helvacioglu Yigit, Dilek; Baydemir, Canan
2016-05-01
For forensic age estimation, radiographic assessment of third molar mineralization is important between 14 and 21 years, which coincides with the legal age in most countries. The spheno-occipital synchondrosis (SOS) is an important growth site during development, and its use for age estimation is beneficial when combined with other markers. In this study, we aimed to develop a regression model to estimate and narrow the age range based on the radiologic assessment of the third molar and SOS in a Turkish subpopulation. Panoramic radiographs and cone beam CT scans of 349 subjects (182 males, 167 females) aged between 8 and 25 were evaluated. A four-stage system was used to evaluate the fusion degree of the SOS, and Demirjian's eight stages of development for third molar calcification. The Pearson correlation indicated a strong positive relationship between age and third molar calcification for both sexes (r = 0.850 for females, r = 0.839 for males, P age and SOS fusion for females (r = 0.814), but a moderate relationship was found for males (r = 0.599), P age determination formula using these scores was established.
Owolabi, Taoreed O.; Akande, Kabiru O.; Olatunji, Sunday O.; Alqahtani, Abdullah; Aldhafferi, Nahier
2016-10-01
Magnetic refrigeration (MR) technology stands a good chance of replacing the conventional gas compression system (CGCS) of refrigeration due to its unique features such as high efficiency, low cost as well as being environmental friendly. Its operation involves the use of magnetocaloric effect (MCE) of a magnetic material caused by application of magnetic field. Manganite-based material demonstrates maximum MCE at its magnetic ordering temperature known as Curie temperature (TC). Consequently, manganite-based material with TC around room temperature is essentially desired for effective utilization of this technology. The TC of manganite-based materials can be adequately altered to a desired value through doping with appropriate foreign materials. In order to determine a manganite with TC around room temperature and to circumvent experimental challenges therein, this work proposes a model that can effectively estimates the TC of manganite-based material doped with different materials with the aid of support vector regression (SVR) hybridized with gravitational search algorithm (GSA). Implementation of GSA algorithm ensures optimum selection of SVR hyper-parameters for improved performance of the developed model using lattice distortions as the descriptors. The result of the developed model is promising and agrees excellently with the experimental results. The outstanding estimates of the proposed model suggest its potential in promoting room temperature magnetic refrigeration through quick estimation of the effect of dopants on TC so as to obtain manganite that works well around the room temperature.
Institute of Scientific and Technical Information of China (English)
Jinhong YOU; CHEN Min; Gemai CHEN
2004-01-01
Consider a semiparametric regression model with linear time series errors Y_k = x_k′β + g(t_k) + ε_k, 1 ≤ k ≤ n, where the Y_k are responses, x_k = (x_k1, x_k2, …, x_kp)′ and t_k ∈ T ⊂ R are fixed design points, β = (β_1, β_2, …, β_p)′ is an unknown parameter vector, g(·) is an unknown bounded real-valued function defined on a compact subset T of the real line R, and ε_k is a linear process given by ε_k = ∑_{j=0}^∞ ψ_j e_{k−j}, ψ_0 = 1, where ∑_{j=0}^∞ |ψ_j| < ∞ and the e_j, j = 0, ±1, ±2, …, are i.i.d. random variables. In this paper we establish the asymptotic normality of the least squares estimator of β, a smooth estimator of g(·), and estimators of the autocovariance and autocorrelation functions of the linear process ε_k.
Directory of Open Access Journals (Sweden)
Jingyan Song
2011-07-01
Full Text Available The star centroid estimation is the most important operation, which directly affects the precision of attitude determination for star sensors. This paper presents a theoretical study of the systematic error introduced by the star centroid estimation algorithm. The systematic error is analyzed through a frequency domain approach and numerical simulations. It is shown that the systematic error consists of the approximation error and truncation error which resulted from the discretization approximation and sampling window limitations, respectively. A criterion for choosing the size of the sampling window to reduce the truncation error is given in this paper. The systematic error can be evaluated as a function of the actual star centroid positions under different Gaussian widths of star intensity distribution. In order to eliminate the systematic error, a novel compensation algorithm based on the least squares support vector regression (LSSVR with Radial Basis Function (RBF kernel is proposed. Simulation results show that when the compensation algorithm is applied to the 5-pixel star sampling window, the accuracy of star centroid estimation is improved from 0.06 to 6 × 10−5 pixels.
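The truncation-error mechanism is easy to reproduce: integrate a 1-D Gaussian star image over a finite pixel window and compare the center-of-mass estimate with the true centroid. Parameters below are illustrative, not those of the paper:

```python
import numpy as np
from math import erf, sqrt

def sampled_spot(center, sigma, size=5):
    """Exact per-pixel integrals of a 1-D Gaussian PSF over a size-pixel window."""
    edges = np.arange(size + 1) - size / 2.0            # pixel boundaries
    cdf = np.array([0.5 * (1.0 + erf((e - center) / (sqrt(2.0) * sigma)))
                    for e in edges])
    return np.diff(cdf)

def com_centroid(pixels):
    """Center-of-mass centroid, with coordinates taken at pixel centers."""
    pos = np.arange(len(pixels)) - (len(pixels) - 1) / 2.0
    return float(np.sum(pos * pixels) / np.sum(pixels))

# Systematic error = estimate minus truth: zero by symmetry at the window
# center, nonzero (truncation plus discretization) for off-center stars.
err = com_centroid(sampled_spot(0.3, 0.7)) - 0.3
```

It is this deterministic, position-dependent error surface that the LSSVR compensation model in the paper learns and subtracts.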
Estimation of construction and demolition waste using waste generation rates in Chennai, India.
Ram, V G; Kalidindi, Satyanarayana N
2017-06-01
A large amount of construction and demolition waste is being generated owing to rapid urbanisation in Indian cities. A reliable estimate of construction and demolition waste generation is essential to create awareness about this stream of solid waste among the government bodies in India. However, the required data to estimate construction and demolition waste generation in India are unavailable or not explicitly documented. This study proposed an approach to estimate construction and demolition waste generation using waste generation rates and demonstrated it by estimating construction and demolition waste generation in Chennai city. The demolition waste generation rates of primary materials were determined through regression analysis using waste generation data from 45 case studies. Materials, such as wood, electrical wires, doors, windows and reinforcement steel, were found to be salvaged and sold on the secondary market. Concrete and masonry debris were dumped in either landfills or unauthorised places. The total quantity of construction and demolition debris generated in Chennai city in 2013 was estimated to be 1.14 million tonnes. The proportion of masonry debris was found to be 76% of the total quantity of demolition debris. Construction and demolition debris forms about 36% of the total solid waste generated in Chennai city. A gross underestimation of construction and demolition waste generation in some earlier studies in India has also been shown. The methodology proposed could be utilised by government bodies, policymakers and researchers to generate reliable estimates of construction and demolition waste in other developing countries facing similar challenges of limited data availability.
Molecular-clock methods for estimating evolutionary rates and timescales.
Ho, Simon Y W; Duchêne, Sebastián
2014-12-01
The molecular clock presents a means of estimating evolutionary rates and timescales using genetic data. These estimates can lead to important insights into evolutionary processes and mechanisms, as well as providing a framework for further biological analyses. To deal with rate variation among genes and among lineages, a diverse range of molecular-clock methods have been developed. These methods have been implemented in various software packages and differ in their statistical properties, ability to handle different models of rate variation, capacity to incorporate various forms of calibrating information and tractability for analysing large data sets. Choosing a suitable molecular-clock model can be a challenging exercise, but a number of model-selection techniques are available. In this review, we describe the different forms of evolutionary rate heterogeneity and explain how they can be accommodated in molecular-clock analyses. We provide an outline of the various clock methods and models that are available, including the strict clock, local clocks, discrete clocks and relaxed clocks. Techniques for calibration and clock-model selection are also described, along with methods for handling multilocus data sets. We conclude our review with some comments about the future of molecular clocks.
Juan Collados-Lara, Antonio; Pardo-Iguzquiza, Eulogio; Pulido-Velazquez, David
2016-04-01
The estimation of Snow Water Equivalent (SWE) is essential for an appropriate assessment of the available water resources in Alpine catchments. The hydrologic regime in these areas is dominated by the storage of water in the snowpack, which is discharged to rivers throughout the melt season. An accurate estimation of the resources will be necessary for an appropriate analysis of the system operation alternatives using basin-scale management models. In order to obtain an appropriate estimation of the SWE we need to know the spatial distribution of snowpack and snow density within the Snow Cover Area (SCA). Data for these snow variables can be extracted from in-situ point measurements and air-borne/space-borne remote sensing observations. Different interpolation and simulation techniques have been employed for the estimation of the cited variables. In this paper we propose to estimate snowpack from a reduced number of ground-truth data (1 or 2 campaigns per year with 23 observation points from 2000-2014) and MODIS satellite-based observations in the Sierra Nevada Mountains (Southern Spain). Regression-based methodologies have been used to study snowpack distribution using different kinds of explicative variables: geographic, topographic and climatic. 40 explicative variables were considered: the longitude, latitude, altitude, slope, eastness, northness, radiation, maximum upwind slope and some mathematical transformations of each of them [Ln(v), (v)^-1, (v)^2, (v)^0.5]. Eight different regression model structures were tested (combining 1, 2, 3 or 4 explicative variables): Y=B0+B1Xi (1); Y=B0+B1XiXj (2); Y=B0+B1Xi+B2Xj (3); Y=B0+B1Xi+B2XjXl (4); Y=B0+B1XiXk+B2XjXl (5); Y=B0+B1Xi+B2Xj+B3Xl (6); Y=B0+B1Xi+B2Xj+B3XlXk (7); Y=B0+B1Xi+B2Xj+B3Xl+B4Xk (8), where Y is the snow depth, (Xi, Xj, Xl, Xk) are the prediction variables (any of the 40 variables) and (B0, B1, B2, B3, B4) are the coefficients to be estimated. The ground data are employed to calibrate the multiple regressions.
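A minimal version of this model-structure search, restricted to structure (3), Y=B0+B1Xi+B2Xj, and scored by R², can be written with ordinary least squares. Variable names and data are illustrative:

```python
import numpy as np

def fit_best_pair(y, X, names):
    """Fit y = b0 + b1*Xi + b2*Xj for every pair of candidate predictors
    and return (r2, (name_i, name_j), coefficients) for the best pair."""
    n = len(y)
    best = (-np.inf, None, None)
    for i in range(X.shape[1]):
        for j in range(i + 1, X.shape[1]):
            D = np.column_stack([np.ones(n), X[:, i], X[:, j]])
            coef, *_ = np.linalg.lstsq(D, y, rcond=None)
            resid = y - D @ coef
            r2 = 1.0 - resid.var() / y.var()
            if r2 > best[0]:
                best = (r2, (names[i], names[j]), coef)
    return best
```

The full search in the paper additionally loops over the transformed variables and the other seven structures, but the calibration step per candidate model is the same least-squares fit.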
Redefinition and global estimation of basal ecosystem respiration rate
DEFF Research Database (Denmark)
Yuan, Wenping; Luo, Yiqi; Li, Xianglan;
2011-01-01
Basal ecosystem respiration rate (BR), the ecosystem respiration rate at a given temperature, is a common and important parameter in empirical models for quantifying ecosystem respiration (ER) globally. Numerous studies have indicated that BR varies in space. However, many empirical ER models still...... use a global constant BR largely due to the lack of a functional description for BR. In this study, we redefined BR to be ecosystem respiration rate at the mean annual temperature. To test the validity of this concept, we conducted a synthesis analysis using 276 site-years of eddy covariance data...... use efficiency GPP model (i.e., EC-LUE) was applied to estimate global GPP, BR and ER with input data from MERRA (Modern Era Retrospective-Analysis for Research and Applications) and MODIS (Moderate resolution Imaging Spectroradiometer). The global ER was 103 Pg C yr −1, with the highest respiration...
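Under this redefinition, a standard exponential (Q10-type) formulation of ER simply anchors the reference temperature at the site's mean annual temperature. A sketch of the idea (the exact temperature response used with the EC-LUE model may differ):

```python
def ecosystem_respiration(temp_c, br, q10, t_mean_annual):
    """ER at a given temperature, with the basal rate BR defined as the
    respiration rate at the site's mean annual temperature (the redefinition above)."""
    return br * q10 ** ((temp_c - t_mean_annual) / 10.0)
```

At temp_c equal to t_mean_annual the expression returns BR itself, which is what makes the redefined BR directly comparable across sites with different climates.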
Suresh, Arumuganainar; Choi, Hong Lim
2011-10-01
Swine waste land application has increased due to organic fertilization, but excess application in an arable system can cause environmental risk. Therefore, in situ characterizations of such resources are important prior to application. To explore this, 41 swine slurry samples were collected from Korea, and wide differences were observed in the physico-biochemical properties. However, significant (Pspecific gravity (SG), electrical conductivity (EC), total solids (TS) and pH. The different combinations of hydrometer, EC meter, drying oven and pH meter were found useful to estimate Mn, Fe, Ca, K, Al, Na, N and 5-day biochemical oxygen demands (BOD₅) at improved R² values of 0.83, 0.82, 0.77, 0.75, 0.67, 0.47, 0.88 and 0.70, respectively. The results from this study suggest that multiple property regressions can facilitate the prediction of micronutrients and organic matter much better than a single property regression for livestock waste. Copyright © 2011 Elsevier Ltd. All rights reserved.
Battaglin, William A.; Ulery, Randy L.; Winterstein, Thomas; Welborn, Toby
2003-01-01
In the State of Texas, surface water (streams, canals, and reservoirs) and ground water are used as sources of public water supply. Surface-water sources of public water supply are susceptible to contamination from point and nonpoint sources. To help protect sources of drinking water and to aid water managers in designing protective yet cost-effective and risk-mitigated monitoring strategies, the Texas Commission on Environmental Quality and the U.S. Geological Survey developed procedures to assess the susceptibility of public water-supply source waters in Texas to the occurrence of 227 contaminants. One component of the assessments is the determination of susceptibility of surface-water sources to nonpoint-source contamination. To accomplish this, water-quality data at 323 monitoring sites were matched with geographic information system-derived watershed- characteristic data for the watersheds upstream from the sites. Logistic regression models then were developed to estimate the probability that a particular contaminant will exceed a threshold concentration specified by the Texas Commission on Environmental Quality. Logistic regression models were developed for 63 of the 227 contaminants. Of the remaining contaminants, 106 were not modeled because monitoring data were available at less than 10 percent of the monitoring sites; 29 were not modeled because there were less than 15 percent detections of the contaminant in the monitoring data; 27 were not modeled because of the lack of any monitoring data; and 2 were not modeled because threshold values were not specified.
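The modeling step (logistic regression of threshold exceedance on watershed characteristics) can be sketched from scratch with Newton-Raphson, also known as iteratively reweighted least squares. Simulated data stand in for the monitoring-site records:

```python
import numpy as np

def fit_logistic(X, y, n_iter=25):
    """Newton-Raphson (IRLS) fit of P(exceed) = 1 / (1 + exp(-b0 - x.b))."""
    Z = np.column_stack([np.ones(len(y)), X])      # prepend an intercept column
    beta = np.zeros(Z.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Z @ beta))
        W = p * (1.0 - p)                          # IRLS weights
        beta = beta + np.linalg.solve(Z.T @ (W[:, None] * Z), Z.T @ (y - p))
    return beta

def exceedance_probability(beta, x):
    """Probability that the contaminant exceeds the threshold at covariates x."""
    return 1.0 / (1.0 + np.exp(-(beta[0] + x @ beta[1:])))
```

In the study, y would be the exceed/not-exceed indicator at each monitoring site and the columns of X the GIS-derived watershed characteristics.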
Soares dos Santos, T.; Mendes, D.; Rodrigues Torres, R.
2016-01-01
Several studies have been devoted to dynamic and statistical downscaling for analysis of both climate variability and climate change. This paper introduces an application of artificial neural networks (ANNs) and multiple linear regression (MLR) by principal components to estimate rainfall in South America. This method is proposed for downscaling monthly precipitation time series over South America for three regions: the Amazon; northeastern Brazil; and the La Plata Basin, which is one of the regions of the planet that will be most affected by the climate change projected for the end of the 21st century. The downscaling models were developed and validated using CMIP5 model output and observed monthly precipitation. We used general circulation model (GCM) experiments for the 20th century (RCP historical; 1970-1999) and two scenarios (RCP 2.6 and 8.5; 2070-2100). The model test results indicate that the ANNs significantly outperform the MLR downscaling of monthly precipitation variability.
Indian Academy of Sciences (India)
P Shivakumara; G Hemantha Kumar; D S Guru; P Nagabhushan
2005-02-01
When a document is scanned either mechanically or manually for digitization, it often suffers from some degree of skew or tilt. Skew-angle detection plays an important role in the field of document analysis systems and OCR in achieving the expected accuracy. In this paper, we consider skew estimation of Roman script. The method uses a boundary-growing approach to extract the lowermost and uppermost coordinates of pixels of characters of text lines present in the document, which can be subjected to linear regression analysis (LRA) to determine the skew angle of a skewed document. The proposed technique also works for scaled binary text documents. The technique is based on the assumption that the space between text lines is greater than the space between words and characters. Finally, in order to evaluate the performance of the proposed methodology we compare the experimental results with those of well-known existing methods.
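The LRA step at the core of the method is just a least-squares line through the extracted boundary pixel coordinates; the slope of the fitted baseline gives the skew angle:

```python
import numpy as np

def skew_angle_degrees(xs, ys):
    """Fit ys = slope*xs + c by least squares (the LRA step) and
    convert the baseline slope to a skew angle in degrees."""
    slope, _intercept = np.polyfit(xs, ys, 1)
    return float(np.degrees(np.arctan(slope)))
```

Applied to the lowermost pixel coordinates of one text line, this returns the angle by which the page would need to be rotated back.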
Directory of Open Access Journals (Sweden)
Branimir Milosavljević
2015-01-01
Full Text Available This paper determines, by experiments, the CO emissions at idle running of 1,785 vehicles powered by spark ignition engines, in order to verify the correctness of emissions values with a representative sample of vehicles in Serbia. The permissible emissions limits were considered for three (3) fitted binary logistic regression (BLR) models, and the key reason for such analysis is finding the predictors that can have a crucial influence on the accuracy of estimating whether such vehicles have correct emissions or not. Having summarized the research results, we found that vehicles produced in Serbia (hereinafter referred to as "domestic vehicles") cause more pollution than imported cars (hereinafter referred to as "foreign vehicles"), although domestic vehicles are of lower average age and mileage. Another trend was observed: low-power vehicles and vehicles produced before 1992 are potentially more serious polluters.
Likelihood of tree topologies with fossils and diversification rate estimation.
Didier, Gilles; Fau, Marine; Laurin, Michel
2017-04-18
Since the diversification process cannot be directly observed at the human scale, it has to be studied from the information available, namely the extant taxa and the fossil record. In this sense, phylogenetic trees including both extant taxa and fossils are the most complete representations of the diversification process that one can get. Such phylogenetic trees can be reconstructed from molecular and morphological data, to some extent. Among the temporal information of such phylogenetic trees, fossil ages are by far the most precisely known (divergence times are inferences calibrated mostly with fossils). We propose here a method to compute the likelihood of a phylogenetic tree with fossils in which the only considered time information is the fossil ages, and apply it to the estimation of diversification rates from such data. Since it is required in our computation, we provide a method for determining the probability of a tree topology under the standard diversification model. Testing our approach on simulated data shows that the maximum likelihood rate estimates from the phylogenetic tree topology and the fossil dates are almost as accurate as those obtained by taking into account all the data, including the divergence times. Moreover, they are substantially more accurate than the estimates obtained only from the exact divergence times (without taking into account the fossil record). We also provide an empirical example composed of 50 Permo-Carboniferous eupelycosaur (early synapsid) taxa ranging in age from about 315 Ma (Late Carboniferous) to 270 Ma (shortly after the end of the Early Permian). Our analyses suggest a speciation (cladogenesis, or birth) rate of about 0.1 per lineage and per My, a marginally lower extinction rate, and a considerable hidden paleobiodiversity of early synapsids. © The Author(s) 2017. Published by Oxford University Press, on behalf of the Society of Systematic Biologists. All rights reserved.
Can we estimate bacterial growth rates from ribosomal RNA content?
Energy Technology Data Exchange (ETDEWEB)
Kemp, P.F.
1995-12-31
Several studies have demonstrated a strong relationship between the quantity of RNA in bacterial cells and their growth rate under laboratory conditions. It may be possible to use this relationship to provide information on the activity of natural bacterial communities, and in particular on growth rate. However, if this approach is to provide reliably interpretable information, the relationship between RNA content and growth rate must be well-understood. In particular, a requisite of such applications is that the relationship must be universal among bacteria, or alternately that the relationship can be determined and measured for specific bacterial taxa. The RNA-growth rate relationship has not been used to evaluate bacterial growth in field studies, although RNA content has been measured in single cells and in bulk extracts of field samples taken from coastal environments. These measurements have been treated as probable indicators of bacterial activity, but have not yet been interpreted as estimators of growth rate. The primary obstacle to such interpretations is a lack of information on biological and environmental factors that affect the RNA-growth rate relationship. In this paper, the available data on the RNA-growth rate relationship in bacteria will be reviewed, including hypotheses regarding the regulation of RNA synthesis and degradation as a function of growth rate and environmental factors; i.e. the basic mechanisms for maintaining RNA content in proportion to growth rate. An assessment of the published laboratory and field data, the current status of this research area, and some of the remaining questions will be presented.
Wavelet-based Poisson rate estimation using the Skellam distribution
Hirakawa, Keigo; Baqai, Farhan; Wolfe, Patrick J.
2009-02-01
Owing to the stochastic nature of discrete processes such as photon counts in imaging, real-world data measurements often exhibit heteroscedastic behavior. In particular, time series components and other measurements may frequently be assumed to be non-iid Poisson random variables, whose rate parameter is proportional to the underlying signal of interest; witness the literature in digital communications, signal processing, astronomy, and magnetic resonance imaging applications. In this work, we show that certain wavelet and filterbank transform coefficients corresponding to vector-valued measurements of this type are distributed as sums and differences of independent Poisson counts, taking the so-called Skellam distribution. While exact estimates rarely admit analytical forms, we present Skellam mean estimators under both frequentist and Bayes models, as well as computationally efficient approximations and shrinkage rules, that may be interpreted as Poisson rate estimation methods performed in certain wavelet/filterbank transform domains. This indicates a promising potential approach for denoising of Poisson counts in the above-mentioned applications.
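The distributional fact at the center of this abstract is easy to check numerically: the difference of two independent Poisson counts is Skellam-distributed, with mean λ1 − λ2 and variance λ1 + λ2, and unnormalized Haar detail coefficients of Poisson data are exactly such differences. A minimal sketch (rates chosen arbitrarily, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
lam1, lam2 = 8.0, 3.0
n = 200_000

# Differences of independent Poisson counts (e.g. unnormalized Haar detail
# coefficients of a Poisson-count signal) follow the Skellam distribution:
# mean = lam1 - lam2, variance = lam1 + lam2.
d = rng.poisson(lam1, n) - rng.poisson(lam2, n)

print(d.mean(), d.var())  # close to 5.0 and 11.0
```

The empirical mean and variance match λ1 − λ2 = 5 and λ1 + λ2 = 11, which is what makes variance-aware shrinkage in the transform domain possible.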
Estimation of evapotranspiration rate in irrigated lands using stable isotopes
Umirzakov, Gulomjon; Windhorst, David; Forkutsa, Irina; Brauer, Lutz; Frede, Hans-Georg
2013-04-01
Agriculture in the Aral Sea basin is the main consumer of water resources, and under current agricultural management practices inefficient water usage causes huge losses of freshwater resources. There is large potential to save water resources and reach a more efficient water use in irrigated areas. Therefore, research is required to reveal the mechanisms of hydrological fluxes in irrigated areas. This paper focuses on estimation of evapotranspiration, which is one of the crucial components in the water balance of irrigated lands. Our main objective is to estimate the rate of evapotranspiration on irrigated lands and to partition evapotranspiration into evaporation and transpiration using stable isotope measurements. Experiments were conducted on irrigated areas with two different soil types (sandy and sandy loam) in the Ferghana Valley (Uzbekistan). Soil samples were collected during the vegetation period. The soil water from these samples was extracted via a cryogenic extraction method and analyzed for the isotopic ratio of the water isotopes (2H and 18O) using a laser spectroscopy method (DLT-100, Los Gatos, USA). Evapotranspiration rates were estimated with the isotope mass balance method. The evapotranspiration results obtained using the isotope mass balance method are compared with results of the Catchment Modeling Framework 1D model applied to the same area over the same period.
An Empirical Calibration of Star Formation Rate Estimators
Rosa-Gonzalez, D; Terlevich, R J
2002-01-01
(Abridged) The observational determination of the behaviour of the star formation rate (SFR) with look-back time or redshift has two main weaknesses: 1- the large uncertainty of the dust/extinction corrections, and 2- that systematic errors may be introduced by the fact that the SFR is estimated using different methods at different redshifts. To assess the possible systematic differences among the different SFR estimators and the role of dust, we have compared SFR estimates using H$\\alpha$, SFR(H$\\alpha$), [OII]$\\lambda$3727\\AA, SFR(OII), UV, SFR(UV) and FIR, SFR(FIR) luminosities of a sample of 31 nearby star-forming galaxies with high-quality photometric data in the UV, optical and FIR. We review the different "standard" methods for the estimation of the SFR and find that while the standard method provides good agreement between SFR(H$\\alpha$) and SFR(FIR), both SFR(OII) and SFR(UV) are systematically higher than SFR(FIR), irrespective of the extinction law. We show that the excess in the SFR(...
Institute of Scientific and Technical Information of China (English)
Tarquinio; Mateus; Magalhães
2016-01-01
Background: Biomass regression equations are claimed to yield more accurate biomass estimates than biomass expansion factors (BEFs). Yet, national and regional biomass estimates are generally calculated based on BEFs, especially when using national forest inventory data. Comparisons of regression-based and BEF-based biomass estimates are scarce. Thus, this study was intended to compare these two commonly used methods for estimating tree and forest biomass with regard to errors and biases. Methods: The data were collected in 2012 and 2014. In 2012, a two-phase sampling design was used to fit tree component biomass regression models and determine tree BEFs. In 2014, additional trees were felled outside sampling plots to estimate the biases associated with regression-based and BEF-based biomass estimates; those estimates were then compared in terms of the following sources of error: plot selection and variability, biomass model, model parameter estimates, and residual variability around model prediction. Results: The regression-based below-, aboveground and whole tree biomass stocks were, approximately, 7.7, 8.5 and 8.3% larger than the BEF-based ones. For the whole tree biomass stock, the percentage of the total error attributed to the first phase (random plot selection and variability) was 90 and 88% for regression- and BEF-based estimates, respectively, the remainder being attributed to the biomass models (regression and BEF models, respectively). The percent biases of regression-based and BEF-based estimates of the whole tree biomass stock were −2.7 and 5.4%, respectively. The errors due to model parameter estimates, those due to residual variability around model prediction, and the percentage of the total error attributed to the biomass model were larger for BEF models than for regression models, except for the stem and stem wood components. Conclusions: The regression-based biomass stocks were found to be slightly larger
Directory of Open Access Journals (Sweden)
Tarquinio Mateus Magalhães
2015-10-01
Full Text Available Background Biomass regression equations are claimed to yield more accurate biomass estimates than biomass expansion factors (BEFs). Yet, national and regional biomass estimates are generally calculated based on BEFs, especially when using national forest inventory data. Comparisons of regression-based and BEF-based biomass estimates are scarce. Thus, this study was intended to compare these two commonly used methods for estimating tree and forest biomass with regard to errors and biases. Methods The data were collected in 2012 and 2014. In 2012, a two-phase sampling design was used to fit tree component biomass regression models and determine tree BEFs. In 2014, additional trees were felled outside sampling plots to estimate the biases associated with regression-based and BEF-based biomass estimates; those estimates were then compared in terms of the following sources of error: plot selection and variability, biomass model, model parameter estimates, and residual variability around model prediction. Results The regression-based below-, aboveground and whole tree biomass stocks were, approximately, 7.7, 8.5 and 8.3 % larger than the BEF-based ones. For the whole tree biomass stock, the percentage of the total error attributed to the first phase (random plot selection and variability) was 90 and 88 % for regression- and BEF-based estimates, respectively, the remainder being attributed to the biomass models (regression and BEF models, respectively). The percent biases of regression-based and BEF-based estimates of the whole tree biomass stock were −2.7 and 5.4 %, respectively. The errors due to model parameter estimates, those due to residual variability around model prediction, and the percentage of the total error attributed to the biomass model were larger for BEF models than for regression models, except for the stem and stem wood components. Conclusions The regression-based biomass stocks were found to
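The two estimation routes being compared reduce to simple arithmetic per tree. A sketch of both, where the allometric coefficients, wood density, BEF value, and tree measurements are all invented for illustration and not taken from the study:

```python
import numpy as np

# Hypothetical stand: diameters (cm) and stem volumes (m^3) for five trees.
dbh = np.array([12.0, 18.0, 25.0, 31.0, 40.0])
stem_vol = np.array([0.08, 0.22, 0.50, 0.85, 1.70])

# (a) Regression-equation estimate, B = a * DBH**b (coefficients invented).
a, b = 0.20, 2.4
biomass_reg = a * dbh ** b                       # kg per tree

# (b) BEF estimate: wood density * stem volume * expansion factor
# (values likewise invented).
wood_density = 650.0                             # kg/m^3
bef = 1.3                                        # accounts for branches, foliage
biomass_bef = wood_density * stem_vol * bef      # kg per tree

diff_pct = 100 * (biomass_reg.sum() - biomass_bef.sum()) / biomass_bef.sum()
print(round(diff_pct, 1))  # % difference between the two stock estimates
```

With these invented inputs the two stock estimates differ by a few percent, the same order as the 7.7 to 8.5% differences the study reports.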
THE RATES OF CONVERGENCE OF M-ESTIMATORS FOR PARTLY LINEAR MODELS IN DEPENDENT CASES
Institute of Scientific and Technical Information of China (English)
SHIPEIDE; CHENXIRU
1996-01-01
Consider the partly linear model Yi=Xi'β0+g0(Ti)+ei, where {(Ti,Xi)}i≥1 is a strictly stationary sequence of random variables, the ei's are i.i.d. random errors, the Yi's are real-valued responses, β0 is a d-vector of parameters, Xi is a d-vector of explanatory variables, and Ti is another explanatory variable ranging over a nondegenerate compact interval. Based on a segment of observations (T1,X1',Y1),…,(Tn,Xn',Yn), this article investigates the rates of convergence of the M-estimators for β0 and g0 obtained from the minimization problem min over β∈Rd, gn∈Fn of ∑i=1..n ρ(Yi−Xi'β−gn(Ti)), where Fn is a space of B-spline functions of order m+1 and ρ(·) is a suitably chosen function. Under some regularity conditions, it is shown that the estimator of g0 achieves the optimal global rate of convergence of estimators for nonparametric regression, and that the estimator of β0 is asymptotically normal. The M-estimators here include regression quantile estimators, L1-estimators, Lp-norm estimators, Huber-type M-estimators and the usual least squares estimator. Applications of the asymptotic theory to testing the hypothesis H0: A'β0=β are also discussed, where β is a given vector and A is a known d×d0 matrix with rank d0.
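The minimization described above can be sketched numerically. Here a Huber ρ gives one of the M-estimators covered by the theory, and, as a simplifying assumption, a degree-5 polynomial basis stands in for the B-spline space Fn; the data are simulated, with heavy-tailed errors to motivate robustness:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
n = 400
T = rng.uniform(0, 1, n)
X = rng.normal(size=(n, 2))
beta0 = np.array([1.5, -2.0])
# Nonparametric part g0(T) = sin(2*pi*T); heavy-tailed t(3) errors.
y = X @ beta0 + np.sin(2 * np.pi * T) + rng.standard_t(df=3, size=n)

# Degree-5 polynomial basis as a stand-in for the B-spline space F_n.
B = np.vander(T, 6)
D = np.hstack([X, B])

def huber(r, c=1.345):
    a = np.abs(r)
    return np.where(a <= c, 0.5 * r ** 2, c * a - 0.5 * c ** 2)

# Minimize sum_i rho(Y_i - X_i'beta - g_n(T_i)) over (beta, basis coefs).
theta_hat = minimize(lambda th: huber(y - D @ th).sum(),
                     np.zeros(D.shape[1]), method="BFGS").x
print(theta_hat[:2])  # estimate of beta0 = (1.5, -2.0)
```

Because X is independent of T, the polynomial approximation error in g does not bias the parametric part, and the Huber fit recovers β0 closely despite the t(3) noise.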
Estimation of uncertainty for fatigue growth rate at cryogenic temperatures
Nyilas, Arman; Weiss, Klaus P.; Urbach, Elisabeth; Marcinek, Dawid J.
2014-01-01
Fatigue crack growth rate (FCGR) measurement data for high strength austenitic alloys in cryogenic environments suffer in general from a high degree of data scatter, in particular in the ΔK regime below 25 MPa√m. Standard mathematical smoothing techniques ultimately force a linear relationship in the stage II regime (crack propagation rate versus ΔK) in a double-log field, called the Paris law. However, the bandwidth of uncertainty then depends somewhat arbitrarily upon the researcher's interpretation. The present paper deals with the use of the uncertainty concept on FCGR data as given by GUM (Guide to the Expression of Uncertainty in Measurement), which since 1993 is a recommended procedure to avoid subjective estimation of error bands. Within this context, in the absence of a true value, the best estimate is evaluated by a statistical method, using the crack propagation law as a mathematical measurement model equation and identifying all input parameters. Each parameter necessary for the measurement was processed using the Gaussian distribution law, with partial differentiation of the terms to estimate the sensitivity coefficients. The combined standard uncertainty determined for each term with its computed sensitivity coefficients finally resulted in the measurement uncertainty of the FCGR test result. The described uncertainty procedure has been applied within the framework of ITER on a recent FCGR measurement for high strength and high toughness Type 316LN material tested at 7 K using a standard ASTM proportional compact tension specimen. The determined values of the Paris law constants, such as C0 and the exponent m, as best estimates along with their uncertainty values, may serve as a realistic basis for the life expectancy of cyclically loaded members.
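The GUM propagation step described above amounts to combining sensitivity coefficients (partial derivatives of the Paris law) with the standard uncertainties of the inputs. A sketch with invented numbers, not the ITER measurement:

```python
import math

# Paris law da/dN = C * dK**m; all values invented, not the ITER data.
C, u_C = 2.0e-12, 0.3e-12   # constant and its standard uncertainty
m, u_m = 3.0, 0.1           # exponent and its standard uncertainty
dK, u_dK = 30.0, 0.5        # stress intensity factor range, MPa*sqrt(m)

f = C * dK ** m                       # crack growth rate
c_C = dK ** m                         # sensitivity coefficient df/dC
c_m = C * dK ** m * math.log(dK)      # df/dm
c_dK = C * m * dK ** (m - 1)          # df/d(dK)

# Combined standard uncertainty, assuming uncorrelated inputs (GUM eq. form).
u_f = math.sqrt((c_C * u_C) ** 2 + (c_m * u_m) ** 2 + (c_dK * u_dK) ** 2)
print(f, round(100 * u_f / f, 1))     # rate and relative uncertainty in %
```

With these inputs the exponent term dominates (its relative contribution scales with ln ΔK), which is why small uncertainties in m translate into wide bands on the predicted growth rate.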
Quantitative Model for Estimating Soil Erosion Rates Using 137Cs
Institute of Scientific and Technical Information of China (English)
YANGHAO; GHANGQING; et al.
1998-01-01
A quantitative model was developed to relate the amount of 137Cs loss from the soil profile to the rate of soil erosion. Following the mass balance model, the depth distribution pattern of 137Cs in the soil profile, the radioactive decay of 137Cs, the sampling year, and the difference in 137Cs fallout amount among years were taken into consideration. By introducing typical depth distribution functions of 137Cs into the model, detailed equations were obtained for different soils. The model shows that the rate of soil erosion is mainly controlled by the depth distribution pattern of 137Cs, the year of sampling, and the percentage reduction in total 137Cs. The relationship between the rate of soil loss and 137Cs depletion is neither linear nor logarithmic. The depth distribution pattern of 137Cs is a major factor for estimating the rate of soil loss, and the soil erosion rate is directly related to the fraction of 137Cs content near the soil surface. The influences of the radioactive decay of 137Cs, the sampling year and the 137Cs input fraction are small compared with the others.
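A simplified special case shows how a depth distribution function turns an inventory loss into an erosion rate: for an uncultivated soil with an exponential 137Cs profile, removing the top d cm leaves a fraction exp(−d/h0) of the reference inventory. All parameter values below are invented for illustration, and this is only one of the depth distributions the model covers:

```python
import math

# Uncultivated-soil sketch: 137Cs activity falls off exponentially with
# depth, C(z) ~ exp(-z/h0), so losing the top d cm leaves exp(-d/h0) of
# the reference inventory. All parameter values are invented.
h0 = 4.0           # profile relaxation depth, cm
loss_pct = 30.0    # measured reduction in total 137Cs vs. a reference site
years = 40         # time since the main fallout input

d = -h0 * math.log(1.0 - loss_pct / 100.0)   # total eroded depth, cm
bulk_density = 1.3e3                         # kg/m^3
# Mean erosion rate in t ha^-1 yr^-1.
rate = (d / 100.0) * bulk_density * 10000.0 / 1000.0 / years
print(round(d, 2), round(rate, 1))
```

Even in this one-parameter case the loss-to-rate mapping is nonlinear in the percentage reduction, consistent with the abstract's point that the relationship is neither linear nor logarithmic in general.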
Effects of systematic sampling on satellite estimates of deforestation rates
Energy Technology Data Exchange (ETDEWEB)
Steininger, M K; Godoy, F; Harper, G, E-mail: msteininger@conservation.or [Center for Applied Biodiversity Science-Conservation International, 2011 Crystal Drive Suite 500, Arlington, VA 22202 (United States)
2009-09-15
Options for satellite monitoring of deforestation rates over large areas include the use of sampling. Sampling may reduce the cost of monitoring but is also a source of error in estimates of areas and rates. A common sampling approach is systematic sampling, in which sample units of a constant size are distributed in some regular manner, such as a grid. The proposed approach for the 2010 Forest Resources Assessment (FRA) of the UN Food and Agriculture Organization (FAO) is a systematic sample of 10 km wide squares at every 1 deg. intersection of latitude and longitude. We assessed the outcome of this and other systematic samples for estimating deforestation at national, sub-national and continental levels. The study is based on digital data on deforestation patterns for the five Amazonian countries outside Brazil plus the Brazilian Amazon. We tested these schemes by varying sample-unit size and frequency. We calculated two estimates of sampling error. First we calculated the standard errors, based on the size, variance and covariance of the samples, and from this calculated the 95% confidence intervals (CI). Second, we calculated the actual errors, based on the difference between the sample-based estimates and the estimates from the full-coverage maps. At the continental level, the 1 deg., 10 km scheme had a CI of 21% and an actual error of 8%. At the national level, this scheme had CIs of 126% for Ecuador and up to 67% for other countries. At this level, increasing sampling density to every 0.25 deg. produced a CI of 32% for Ecuador and CIs of up to 25% for other countries, with only Brazil having a CI of less than 10%. Actual errors were within the limits of the CIs in all but two of the 56 cases. Actual errors were half or less of the CIs in all but eight of these cases. These results indicate that the FRA 2010 should have CIs of smaller than or close to 10% at the continental level. However, systematic sampling at the national level yields large CIs unless the
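The error comparison described above (a confidence interval computed from the sample variance versus the actual error against the full-coverage map) can be mimicked on synthetic data; the grid size, sampling interval, and deforestation field below are all invented:

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic "full-coverage map": deforested fraction per 10 km x 10 km block
# on a 60 x 60 grid (values invented, gamma-distributed).
truth = np.clip(rng.gamma(2.0, 0.01, size=(60, 60)), 0.0, 1.0)
true_total = truth.sum()

k = 6                                   # keep one block in every k x k
sample = truth[::k, ::k]
n = sample.size
est_total = sample.mean() * truth.size  # expansion estimator of the total

# 95% CI treating the systematic sample like a simple random sample,
# plus the "actual error" against the full map.
se_total = truth.size * sample.std(ddof=1) / np.sqrt(n)
ci_pct = 100 * 1.96 * se_total / est_total
actual_err_pct = 100 * abs(est_total - true_total) / true_total
print(round(ci_pct, 1), round(actual_err_pct, 1))
```

As in the study, the realized error of any one systematic sample typically falls well inside the variance-based confidence interval, which is itself driven by the spatial variability of deforestation and the sampling density.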
Directory of Open Access Journals (Sweden)
Lokuge Dona Manori Nimanthika Lokuge
2015-06-01
Full Text Available Due to the high prevalence of dietary diseases and malnutrition in Sri Lanka, it is essential to assess food consumption patterns. Because pulses are a major source of nutrients, this paper employed the Linear Approximation of the Almost Ideal Demand System (LA/AIDS) to estimate price and expenditure elasticities for six types of pulses, utilizing the Household Income and Expenditure Survey, 2006/07. The infrequency of purchases, a typical problem encountered in LA/AIDS estimation, is circumvented by using a probit regression in the first stage to capture the effect of demographic factors on consumption choice. Results reveal that the buying decision for pulses is influenced by the sector (rural, urban and estate), household size, education level, presence of children, and prevalence of blood pressure and diabetes. All pulse types except dhal are highly responsive to their own prices. Dhal is identified as the most prominent choice among all alternatives and hence is distinguished as a necessity, whereas the rest behave as luxuries with respect to income. Because dhal is an imported product, consumption of dhal may be severely affected by any action which exporting countries introduce, while the rest of the pulses will be affected by both price- and income-oriented policies.
Levy, Ilan; Levin, Noam; Yuval; Schwartz, Joel D; Kark, Jeremy D
2015-03-17
Land use regression (LUR) models rely on air pollutant measurements for their development, and are therefore limited to recent periods where such measurements are available. Here we propose an approach to overcome this gap and calculate LUR models several decades before measurements were available. We first developed a LUR model for NOx using annual averages of NOx at all available air quality monitoring sites in Israel between 1991 and 2011 with time as one of the independent variables. We then reconstructed historical spatial data (e.g., road network) from historical topographic maps to apply the model's prediction to each year from 1961 to 2011. The model's predictions were then validated against independent estimates about the national annual NOx emissions from on-road vehicles in a top-down approach. The model's cross validated R2 was 0.74, and the correlation between the model's annual averages and the national annual NOx emissions between 1965 and 2011 was 0.75. Information about the road network and population are persistent predictors in many LUR models. The use of available historical data about these predictors to resolve the spatial variability of air pollutants together with complementary national estimates on the change in pollution levels over time enable historical reconstruction of exposures.
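The core trick here is including calendar year as a covariate, so the fitted surface can be projected backward once historical road and population layers have been reconstructed from old maps. A sketch on synthetic data, with all variables, coefficients, and the 1961 site values invented:

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic training data: annual mean NOx at monitoring sites, explained by
# road density, population density, and calendar year (all values invented).
n = 300
year = rng.integers(1991, 2012, n)
road = rng.uniform(0, 10, n)       # e.g. km of roads within 500 m
pop = rng.uniform(0, 5, n)         # e.g. 1000s of residents within 1 km
nox = 5 + 3.0 * road + 2.0 * pop - 0.4 * (year - 1991) + rng.normal(0, 2, n)

A = np.column_stack([np.ones(n), road, pop, year - 1991])
coef, *_ = np.linalg.lstsq(A, nox, rcond=None)

# "Back-cast" to 1961 using predictor values reconstructed from historical
# topographic maps (hypothetical values for one site).
road_1961, pop_1961 = 2.0, 0.8
pred_1961 = coef @ [1.0, road_1961, pop_1961, 1961 - 1991]
print(coef.round(2), round(pred_1961, 1))
```

The regression recovers the spatial coefficients and the time trend, and the back-cast is then an extrapolation along the time axis, which is exactly why the authors validate it against independent national emission estimates.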
Inverse method for estimating respiration rates from decay time series
Directory of Open Access Journals (Sweden)
D. C. Forney
2012-03-01
Full Text Available Long-term organic matter decomposition experiments typically measure the mass lost from decaying organic matter as a function of time. These experiments can provide information about the dynamics of carbon dioxide input to the atmosphere and controls on natural respiration processes. Decay slows down with time, suggesting that organic matter is composed of components (pools) with varied lability. Yet it is unclear how the appropriate rates, sizes, and number of pools vary with organic matter type, climate, and ecosystem. To better understand these relations, it is necessary to properly extract the decay rates from decomposition data. Here we present a regularized inverse method to identify an optimally-fitting distribution of decay rates associated with a decay time series. We motivate our study by first evaluating a standard, direct inversion of the data. The direct inversion identifies a discrete distribution of decay rates, where mass is concentrated in just a small number of discrete pools. It is consistent with identifying the best fitting "multi-pool" model, without prior assumption of the number of pools. However we find these multi-pool solutions are not robust to noise and are over-parametrized. We therefore introduce a method of regularized inversion, which identifies the solution which best fits the data but not the noise. This method shows that the data are described by a continuous distribution of rates which we find is well approximated by a lognormal distribution, and consistent with the idea that decomposition results from a continuum of processes at different rates. The ubiquity of the lognormal distribution suggests that decay may be simply described by just two parameters: a mean and a variance of log rates. We conclude by describing a procedure that estimates these two lognormal parameters from decay data. Matlab codes for all numerical methods and procedures are provided.
Inverse method for estimating respiration rates from decay time series
Directory of Open Access Journals (Sweden)
D. C. Forney
2012-09-01
Full Text Available Long-term organic matter decomposition experiments typically measure the mass lost from decaying organic matter as a function of time. These experiments can provide information about the dynamics of carbon dioxide input to the atmosphere and controls on natural respiration processes. Decay slows down with time, suggesting that organic matter is composed of components (pools) with varied lability. Yet it is unclear how the appropriate rates, sizes, and number of pools vary with organic matter type, climate, and ecosystem. To better understand these relations, it is necessary to properly extract the decay rates from decomposition data. Here we present a regularized inverse method to identify an optimally-fitting distribution of decay rates associated with a decay time series. We motivate our study by first evaluating a standard, direct inversion of the data. The direct inversion identifies a discrete distribution of decay rates, where mass is concentrated in just a small number of discrete pools. It is consistent with identifying the best fitting "multi-pool" model, without prior assumption of the number of pools. However we find these multi-pool solutions are not robust to noise and are over-parametrized. We therefore introduce a method of regularized inversion, which identifies the solution which best fits the data but not the noise. This method shows that the data are described by a continuous distribution of rates, which we find is well approximated by a lognormal distribution, and consistent with the idea that decomposition results from a continuum of processes at different rates. The ubiquity of the lognormal distribution suggests that decay may be simply described by just two parameters: a mean and a variance of log rates. We conclude by describing a procedure that estimates these two lognormal parameters from decay data. Matlab codes for all numerical methods and procedures are provided.
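The regularized inversion idea can be sketched as follows: discretize the rate distribution on a grid, then solve a Tikhonov-regularized non-negative least squares problem so the recovered spectrum fits the data but not the noise. This mirrors the paper's approach only in spirit; the grid, noise level, and regularization strength are assumptions:

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(4)

# Synthetic decay series generated from a lognormal distribution of rates.
t = np.linspace(0.0, 10.0, 40)
k_true = rng.lognormal(mean=-1.0, sigma=0.8, size=500)
m_obs = np.exp(-np.outer(t, k_true)).mean(axis=1) + rng.normal(0, 0.005, t.size)

# Discretize the rate distribution on a log grid: m(t) = sum_j w_j e^{-k_j t}.
k_grid = np.logspace(-2, 1, 60)
A = np.exp(-np.outer(t, k_grid))

# Tikhonov regularization folded into NNLS by augmenting the system.
alpha = 1e-2
A_reg = np.vstack([A, np.sqrt(alpha) * np.eye(k_grid.size)])
b_reg = np.concatenate([m_obs, np.zeros(k_grid.size)])
w, _ = nnls(A_reg, b_reg)

resid = np.sqrt(np.mean((A @ w - m_obs) ** 2))
print(round(w.sum(), 3), round(resid, 4))  # total mass and fit rmse
```

Without the penalty term, the unregularized fit concentrates mass in a few spikes (the over-parametrized "multi-pool" behavior the abstract describes); the penalty spreads the recovered spectrum toward a smooth continuum.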
Baker, R.J.; Baehr, A.L.; Lahvis, M.A.
2000-01-01
An open microcosm method for quantifying microbial respiration and estimating biodegradation rates of hydrocarbons in gasoline-contaminated sediment samples has been developed and validated. Stainless-steel bioreactors are filled with soil or sediment samples, and the vapor-phase composition (concentrations of oxygen (O2), nitrogen (N2), carbon dioxide (CO2), and selected hydrocarbons) is monitored over time. Replacement gas is added as the vapor sample is taken, and selection of the replacement gas composition facilitates real-time decision-making regarding environmental conditions within the bioreactor. This capability allows for maintenance of field conditions over time, which is not possible in closed microcosms. Reaction rates of CO2 and O2 are calculated from the vapor-phase composition time series. Rates of hydrocarbon biodegradation are either measured directly from the hydrocarbon mass balance, or estimated from CO2 and O2 reaction rates and assumed reaction stoichiometries. Open microcosm experiments using sediments spiked with toluene and p-xylene were conducted to validate the stoichiometric assumptions. Respiration rates calculated from O2 consumption and from CO2 production provide estimates of toluene and p-xylene degradation rates within about ±50% of measured values when complete mineralization stoichiometry is assumed. Measured values ranged from 851.1 to 965.1 g m-3 year-1 for toluene, and 407.2 to 942.3 g m-3 year-1 for p-xylene. Contaminated sediment samples from a gasoline-spill site were used in a second set of microcosm experiments. Here, reaction rates of O2 and CO2 were measured and used to estimate hydrocarbon respiration rates. Total hydrocarbon reaction rates ranged from 49.0 g m-3 year-1 in uncontaminated (background) sediment to 1040.4 g m-3 year-1 for highly contaminated sediment, based on CO2 production data. These rate estimates were similar to those obtained independently from in situ CO2 vertical gradient and flux determinations at the
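The stoichiometric step, converting a measured CO2 production rate into a hydrocarbon degradation rate under an assumed complete-mineralization reaction, is simple bookkeeping. The CO2 rate below is illustrative, not a measurement from the study:

```python
# Complete mineralization of toluene: C7H8 + 9 O2 -> 7 CO2 + 4 H2O.
# The CO2 production rate below is illustrative, not a paper measurement.
MW_TOLUENE = 92.14   # g/mol
MW_CO2 = 44.01       # g/mol

co2_rate = 3000.0                      # g CO2 m^-3 yr^-1 from the time series
mol_co2 = co2_rate / MW_CO2
mol_toluene = mol_co2 / 7.0            # 7 mol CO2 per mol toluene
toluene_rate = mol_toluene * MW_TOLUENE
print(round(toluene_rate, 1))          # g toluene m^-3 yr^-1
```

An analogous calculation from O2 consumption (9 mol O2 per mol toluene) gives an independent estimate, and the spread between the two is one handle on the ±50% accuracy the validation experiments report.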
DEFF Research Database (Denmark)
Garde, Eva; Heide-Jørgensen, Mads Peter; Ditlevsen, Susanne
2012-01-01
Ages of marine mammals have traditionally been estimated by counting dentinal growth layers in teeth. However, this method is difficult to use on narwhals (Monodon monoceros) because of their special tooth structures. Alternative methods are therefore needed. The aspartic acid racemization (AAR) technique has been used in age estimation studies of cetaceans, including narwhals. The purpose of this study was to estimate a species-specific racemization rate for narwhals by regressing aspartic acid D/L ratios in eye lens nuclei against growth layer groups in tusks (n=9). Two racemization rates were … rate and (D/L)0 value be used in future AAR age estimation studies of narwhals, but also recommend the collection of tusks and eyes of narwhals for further improving the (D/L)0 and 2kAsp estimates obtained in this study.
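The calibration regression described here is a straight line, D/L = (D/L)0 + 2·kAsp·age, fitted to known-age animals and then inverted to age an unknown one. A sketch with hypothetical calibration data, not the study's nine narwhals:

```python
import numpy as np

rng = np.random.default_rng(5)

# Hypothetical calibration data: ages from tusk growth layer groups (years)
# and matching eye-lens aspartic acid D/L ratios.
age_glg = np.array([2.0, 8.0, 15.0, 22.0, 30.0, 38.0, 45.0, 50.0, 57.0])
dl = 0.025 + 2 * 0.001 * age_glg + rng.normal(0, 0.002, age_glg.size)

# Linear model D/L = (D/L)0 + 2*k_Asp * age; regress, then invert.
two_k_asp, dl0 = np.polyfit(age_glg, dl, 1)

est_age = (0.085 - dl0) / two_k_asp   # age an animal with measured D/L = 0.085
print(round(two_k_asp, 4), round(dl0, 3), round(est_age, 1))
```

The precision of the inverted age depends directly on how well 2kAsp and (D/L)0 are pinned down, which is why the abstract recommends collecting more paired tusks and eyes.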
Directory of Open Access Journals (Sweden)
Casey P Durand
Full Text Available INTRODUCTION: Statistical interactions are a common component of data analysis across a broad range of scientific disciplines. However, the statistical power to detect interactions is often undesirably low. One solution is to elevate the Type 1 error rate so that important interactions are not missed in a low power situation. To date, no study has quantified the effects of this practice on power in a linear regression model. METHODS: A Monte Carlo simulation study was performed. A continuous dependent variable was specified, along with three types of interactions: continuous variable by continuous variable; continuous by dichotomous; and dichotomous by dichotomous. For each of the three scenarios, the interaction effect sizes, sample sizes, and Type 1 error rate were varied, resulting in a total of 240 unique simulations. RESULTS: In general, power to detect the interaction effect was either so low or so high at α = 0.05 that raising the Type 1 error rate only served to increase the probability of including a spurious interaction in the model. A small number of scenarios were identified in which an elevated Type 1 error rate may be justified. CONCLUSIONS: Routinely elevating Type 1 error rate when testing interaction effects is not an advisable practice. Researchers are best served by positing interaction effects a priori and accounting for them when conducting sample size calculations.
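The simulation design described above is easy to reproduce in miniature: generate data with a known interaction, fit OLS, and count how often the interaction's p-value clears each α. This sketch covers only the continuous-by-continuous case, with an invented effect size and sample size:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)

def interaction_power(n, beta_int, alpha, reps=500):
    """Fraction of simulated datasets where the x1*x2 term is 'significant'."""
    hits = 0
    for _ in range(reps):
        x1, x2 = rng.normal(size=(2, n))
        y = 0.5 * x1 + 0.5 * x2 + beta_int * x1 * x2 + rng.normal(size=n)
        X = np.column_stack([np.ones(n), x1, x2, x1 * x2])
        beta, res, *_ = np.linalg.lstsq(X, y, rcond=None)
        df = n - X.shape[1]
        se = np.sqrt(res[0] / df * np.linalg.inv(X.T @ X)[3, 3])
        p = 2 * stats.t.sf(abs(beta[3]) / se, df)   # t-test on interaction
        hits += p < alpha
    return hits / reps

p05 = interaction_power(100, 0.3, alpha=0.05)
p10 = interaction_power(100, 0.3, alpha=0.10)
print(p05, p10)
```

With this configuration power is already high at α = 0.05, and doubling the Type 1 error rate buys only a modest gain, illustrating the abstract's conclusion that routinely elevating α is a poor trade.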
Empirical Estimation of Term Structure of Interbank Rates in China
Institute of Scientific and Technical Information of China (English)
Min Xiaoping
2006-01-01
Nelson-Siegel model (NS model) and 2 extended NS models were compared by using daily interbank government bond data. Based on the grouping of bonds according to the residual term to maturity, the empirical research proceeded with in-sample and out-of-sample tests. The results show that the 3 models are almost equivalent in estimating the interbank term structure of interest rates. Within terms to maturity between 0 and 7 years, the gap of the absolute errors of the 3 models between in-sample and out-of-sample is smaller than 0.2 Yuan, and the absolute values of the in-sample and out-of-sample errors are smaller than 0.1 Yuan, so the estimation is credible. Within terms to maturity between 7 and 20 years, the gap of the absolute errors of the 3 models between in-sample and out-of-sample is larger than 0.4 Yuan, and the absolute values of the in-sample and out-of-sample errors are larger than 1.0 Yuan, so the estimation is not credible.
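For reference, the basic Nelson-Siegel curve underlying these comparisons can be fitted by nonlinear least squares. The maturities, rates, and starting values below are hypothetical, not the interbank data of the study:

```python
import numpy as np
from scipy.optimize import curve_fit

def ns_yield(tau, b0, b1, b2, lam):
    """Basic Nelson-Siegel curve: level, slope, and curvature factors."""
    u = tau / lam
    loading = (1 - np.exp(-u)) / u
    return b0 + b1 * loading + b2 * (loading - np.exp(-u))

# Hypothetical interbank zero rates (percent) at maturities in years.
tau = np.array([0.25, 0.5, 1.0, 2.0, 3.0, 5.0, 7.0, 10.0, 15.0, 20.0])
y_obs = ns_yield(tau, 4.5, -2.0, 1.0, 1.8) \
        + np.random.default_rng(7).normal(0, 0.02, tau.size)

params, _ = curve_fit(ns_yield, tau, y_obs, p0=[4.0, -1.0, 0.5, 2.0])
mae = np.abs(ns_yield(tau, *params) - y_obs).mean()
print(params.round(2), round(mae, 3))
```

The extended NS models in the abstract add further curvature terms to the same factor-loading structure, which is why the three fits behave so similarly at short maturities.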
Increasing fMRI sampling rate improves Granger causality estimates.
Directory of Open Access Journals (Sweden)
Fa-Hsuan Lin
Full Text Available Estimation of causal interactions between brain areas is necessary for elucidating large-scale functional brain networks underlying behavior and cognition. Granger causality analysis of time series data can quantitatively estimate directional information flow between brain regions. Here, we show that such estimates are significantly improved when the temporal sampling rate of functional magnetic resonance imaging (fMRI) is increased 20-fold. Specifically, healthy volunteers performed a simple visuomotor task during blood oxygenation level dependent (BOLD) contrast based whole-head inverse imaging (InI). Granger causality analysis based on raw InI BOLD data sampled at 100-ms resolution detected the expected causal relations, whereas when the data were downsampled to the temporal resolution of 2 s typically used in echo-planar fMRI, the causality could not be detected. An additional control analysis, in which we SINC interpolated additional data points to the downsampled time series at 0.1-s intervals, confirmed that the improvements achieved with the real InI data were not explainable by the increased time-series length alone. We therefore conclude that the high temporal resolution of InI improves the Granger causality connectivity analysis of the human brain.
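The underlying test compares a model of y predicted from its own past against one that adds the past of x, with an F-test on the improvement. A one-lag sketch on a simulated pair of series (the lag order, coefficients, and series length are assumptions, not fMRI data):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(8)
n = 2000                       # long series, as with a high sampling rate
x = np.zeros(n)
y = np.zeros(n)
for t in range(1, n):
    x[t] = 0.5 * x[t - 1] + rng.normal()
    y[t] = 0.4 * y[t - 1] + 0.3 * x[t - 1] + rng.normal()  # x drives y

def rss(X, z):
    _, res, *_ = np.linalg.lstsq(X, z, rcond=None)
    return res[0]

z = y[1:]
restricted = np.column_stack([np.ones(n - 1), y[:-1]])    # y ~ own past
full = np.column_stack([restricted, x[:-1]])              # ... + past of x
df = (n - 1) - full.shape[1]
rss_f = rss(full, z)
F = (rss(restricted, z) - rss_f) / (rss_f / df)           # 1 restriction
p = stats.f.sf(F, 1, df)
print(F > 10, p < 0.001)
```

Downsampling shortens the series and smears the lagged dependence across sampling intervals, which is the intuition for why the 2-s data in the study fail to reach significance.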
Automatic estimation of pressure-dependent rate coefficients
Allen, Joshua W.
2012-01-01
A general framework is presented for accurately and efficiently estimating the phenomenological pressure-dependent rate coefficients for reaction networks of arbitrary size and complexity using only high-pressure-limit information. Two aspects of this framework are discussed in detail. First, two methods of estimating the density of states of the species in the network are presented, including a new method based on characteristic functional group frequencies. Second, three methods of simplifying the full master equation model of the network to a single set of phenomenological rates are discussed, including a new method based on the reservoir state and pseudo-steady state approximations. Both sets of methods are evaluated in the context of the chemically-activated reaction of acetyl with oxygen. All three simplifications of the master equation are usually accurate, but each fails in certain situations, which are discussed. The new methods usually provide good accuracy at a computational cost appropriate for automated reaction mechanism generation. This journal is © the Owner Societies.
Commercial Discount Rate Estimation for Efficiency Standards Analysis
Energy Technology Data Exchange (ETDEWEB)
Fujita, K. Sydny [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
2016-04-13
Underlying each of the Department of Energy's (DOE's) federal appliance and equipment standards are a set of complex analyses of the projected costs and benefits of regulation. Any new or amended standard must be designed to achieve significant additional energy conservation, provided that it is technologically feasible and economically justified (42 U.S.C. 6295(o)(2)(A)). A proposed standard is considered economically justified when its benefits exceed its burdens, as represented by the projected net present value of costs and benefits. DOE performs multiple analyses to evaluate the balance of costs and benefits of commercial appliance and equipment efficiency standards, at the national and individual building or business level, each framed to capture different nuances of the complex impact of standards on the commercial end user population. The Life-Cycle Cost (LCC) analysis models the combined impact of appliance first cost and operating cost changes on a representative commercial building sample in order to identify the fraction of customers achieving LCC savings or incurring net cost at the considered efficiency levels. Thus, the choice of commercial discount rate value(s) used to calculate the present value of energy cost savings within the Life-Cycle Cost model implicitly plays a key role in estimating the economic impact of potential standard levels. This report is intended to provide a more in-depth discussion of the commercial discount rate estimation process than can be readily included in standard rulemaking Technical Support Documents (TSDs).
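The sensitivity the report is concerned with is visible in a toy LCC calculation: the same efficiency investment flips from clearly justified toward marginal as the discount rate rises. All dollar figures, the lifetime, and the rates below are hypothetical:

```python
# All dollar figures, lifetime, and discount rates are hypothetical.
def npv_savings(annual_saving, years, rate):
    """Present value of a constant annual saving, discounted annually."""
    return sum(annual_saving / (1 + rate) ** t for t in range(1, years + 1))

first_cost_premium = 400.0   # extra purchase cost of the efficient unit, $
annual_saving = 75.0         # operating-cost saving, $/yr
lifetime = 12                # years

results = {}
for rate in (0.03, 0.07, 0.12):
    results[rate] = round(npv_savings(annual_saving, lifetime, rate)
                          - first_cost_premium, 2)
print(results)  # net LCC saving shrinks as the discount rate rises
```

Because the net LCC saving is monotone in the discount rate, the estimated fraction of customers who benefit from a standard depends directly on how that rate is chosen, which is the report's subject.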
DEFF Research Database (Denmark)
Johansen, Søren
2008-01-01
The reduced rank regression model is a multivariate regression model with a coefficient matrix with reduced rank. The reduced rank regression algorithm is an estimation procedure, which estimates the reduced rank regression model. It is related to canonical correlations and involves calculating e...
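A minimal sketch of the model described: one standard route to a reduced rank regression estimate fits ordinary least squares and then projects the fitted values onto their leading singular directions (Johansen's procedure works through canonical correlations instead; this SVD route is a common equivalent formulation for the unweighted case). Data are simulated.

```python
# Reduced rank regression sketch: OLS fit, then rank-r truncation via SVD.
import numpy as np

rng = np.random.default_rng(0)
n, p, q, r = 200, 5, 4, 2
X = rng.normal(size=(n, p))
B_true = rng.normal(size=(p, r)) @ rng.normal(size=(r, q))  # true rank-2 coefficients
Y = X @ B_true + 0.1 * rng.normal(size=(n, q))

B_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)
# leading right singular vectors of the fitted values define the response subspace
U, s, Vt = np.linalg.svd(X @ B_ols, full_matrices=False)
P = Vt[:r].T @ Vt[:r]          # projection onto the top-r response directions
B_rr = B_ols @ P               # reduced-rank coefficient estimate

rank = np.linalg.matrix_rank(B_rr, tol=1e-8)
print("rank of estimate:", rank)
```

The estimate has rank at most r by construction, while staying close to the unrestricted OLS fit.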
Stackelberg, Paul E.; Barbash, Jack E.; Gilliom, Robert J.; Stone, Wesley W.; Wolock, David M.
2012-01-01
Tobit regression models were developed to predict the summed concentration of atrazine [6-chloro-N-ethyl-N'-(1-methylethyl)-1,3,5-triazine-2,4-diamine] and its degradate deethylatrazine [6-chloro-N-(1-methylethyl)-1,3,5-triazine-2,4-diamine] (DEA) in shallow groundwater underlying agricultural settings across the conterminous United States. The models were developed from atrazine and DEA concentrations in samples from 1298 wells and explanatory variables that represent the source of atrazine and various aspects of the transport and fate of atrazine and DEA in the subsurface. One advantage of these newly developed models over previous national regression models is that they predict concentrations (rather than detection frequency), which can be compared with water quality benchmarks. Model results indicate that variability in the concentration of atrazine residues (atrazine plus DEA) in groundwater underlying agricultural areas is more strongly controlled by the history of atrazine use in relation to the timing of recharge (groundwater age) than by processes that control the dispersion, adsorption, or degradation of these compounds in the saturated zone. Current (1990s) atrazine use was found to be a weak explanatory variable, perhaps because it does not represent the use of atrazine at the time of recharge of the sampled groundwater and because the likelihood that these compounds will reach the water table is affected by other factors operating within the unsaturated zone, such as soil characteristics, artificial drainage, and water movement. Results show that only about 5% of agricultural areas have greater than a 10% probability of exceeding the USEPA maximum contaminant level of 3.0 μg L-1. These models are not developed for regulatory purposes but rather can be used to (i) identify areas of potential concern, (ii) provide conservative estimates of the concentrations of atrazine residues in deeper potential drinking water supplies, and (iii) set priorities
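The Tobit idea used here can be sketched with a toy left-censored likelihood: samples below a detection limit contribute the normal CDF at the limit, while detected values contribute the normal density. This is a simplified illustration with invented data, not the authors' fitted national model.

```python
# Tobit (censored regression) log-likelihood sketch for a detection limit.
import math, random

def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def tobit_loglik(beta0, beta1, sigma, x, y, limit):
    ll = 0.0
    for xi, yi in zip(x, y):
        mu = beta0 + beta1 * xi
        if yi <= limit:                       # censored: only know y <= limit
            ll += math.log(max(norm_cdf((limit - mu) / sigma), 1e-300))
        else:                                 # detected: full normal density
            z = (yi - mu) / sigma
            ll += -0.5 * z * z - math.log(sigma * math.sqrt(2 * math.pi))
    return ll

random.seed(1)
x = [random.uniform(0, 1) for _ in range(500)]
y = [0.5 + 2.0 * xi + random.gauss(0, 0.5) for xi in x]
limit = 1.0
y_obs = [limit if yi <= limit else yi for yi in y]  # censor at the detection limit

ll_true = tobit_loglik(0.5, 2.0, 0.5, x, y_obs, limit)
ll_bad = tobit_loglik(0.0, 0.0, 0.5, x, y_obs, limit)
print(ll_true, ll_bad)
```

Maximizing this likelihood over the parameters gives the Tobit estimates; a full fit would hand `tobit_loglik` to a numerical optimizer.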
Diversity, disparity, and evolutionary rate estimation for unresolved Yule trees.
Crawford, Forrest W; Suchard, Marc A
2013-05-01
The branching structure of biological evolution confers statistical dependencies on phenotypic trait values in related organisms. For this reason, comparative macroevolutionary studies usually begin with an inferred phylogeny that describes the evolutionary relationships of the organisms of interest. The probability of the observed trait data can be computed by assuming a model for trait evolution, such as Brownian motion, over the branches of this fixed tree. However, the phylogenetic tree itself contributes statistical uncertainty to estimates of rates of phenotypic evolution, and many comparative evolutionary biologists regard the tree as a nuisance parameter. In this article, we present a framework for analytically integrating over unknown phylogenetic trees in comparative evolutionary studies by assuming that the tree arises from a continuous-time Markov branching model called the Yule process. To do this, we derive a closed-form expression for the distribution of phylogenetic diversity (PD), which is the sum of branch lengths connecting the species in a clade. We then present a generalization of PD which is equivalent to the expected trait disparity in a set of taxa whose evolutionary relationships are generated by a Yule process and whose traits evolve by Brownian motion. We find expressions for the distribution of expected trait disparity under a Yule tree. Given one or more observations of trait disparity in a clade, we perform fast likelihood-based estimation of the Brownian variance for unresolved clades. Our method does not require simulation or a fixed phylogenetic tree. We conclude with a brief example illustrating Brownian rate estimation for 12 families in the mammalian order Carnivora, in which the phylogenetic tree for each family is unresolved.
Alwee, Razana; Shamsuddin, Siti Mariyam Hj; Sallehuddin, Roselina
2013-01-01
Crime forecasting is an important area in the field of criminology. Linear models, such as regression and econometric models, are commonly applied in crime forecasting. However, real crime data commonly consist of both linear and nonlinear components, and a single model may not be sufficient to identify all the characteristics of the data. The purpose of this study is to introduce a hybrid model that combines support vector regression (SVR) and autoregressive integrated moving average (ARIMA) to be applied in crime rate forecasting. SVR is very robust with small training data and high-dimensional problems, while ARIMA has the ability to model several types of time series. However, the accuracy of the SVR model depends on the values of its parameters, and ARIMA is not robust when applied to small data sets. Therefore, to overcome these problems, particle swarm optimization is used to estimate the parameters of the SVR and ARIMA models. The proposed hybrid model is used to forecast the property crime rates of the United States based on economic indicators. The experimental results show that the proposed hybrid model is able to produce more accurate forecasting results as compared to the individual models.
Spahr, Norman E.; Mueller, David K.; Wolock, David M.; Hitt, Kerie J.; Gronberg, JoAnn M.
2010-01-01
Data collected for the U.S. Geological Survey National Water-Quality Assessment program from 1992-2001 were used to investigate the relations between nutrient concentrations and nutrient sources, hydrology, and basin characteristics. Regression models were developed to estimate annual flow-weighted concentrations of total nitrogen and total phosphorus using explanatory variables derived from currently available national ancillary data. Different total-nitrogen regression models were used for agricultural (25 percent or more of basin area classified as agricultural land use) and nonagricultural basins. Atmospheric, fertilizer, and manure inputs of nitrogen, percent sand in soil, subsurface drainage, overland flow, mean annual precipitation, and percent undeveloped area were significant variables in the agricultural basin total nitrogen model. Significant explanatory variables in the nonagricultural total nitrogen model were total nonpoint-source nitrogen input (sum of nitrogen from manure, fertilizer, and atmospheric deposition), population density, mean annual runoff, and percent base flow. The concentrations of nutrients derived from regression (CONDOR) models were applied to drainage basins associated with the U.S. Environmental Protection Agency (USEPA) River Reach File (RF1) to predict flow-weighted mean annual total nitrogen concentrations for the conterminous United States. The majority of stream miles in the Nation have predicted concentrations less than 5 milligrams per liter. Concentrations greater than 5 milligrams per liter were predicted for a broad area extending from Ohio to eastern Nebraska, areas spatially associated with greater application of fertilizer and manure. Probabilities that mean annual total-nitrogen concentrations exceed the USEPA regional nutrient criteria were determined by incorporating model prediction uncertainty. In all nutrient regions where criteria have been established, there is at least a 50 percent probability of exceeding
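The last step described, turning a concentration prediction and its uncertainty into an exceedance probability, can be sketched as follows, assuming (as is common for concentration models) lognormal prediction error. The numbers are illustrative, not outputs of the CONDOR models.

```python
# Probability that a mean annual concentration exceeds a criterion,
# given a regression prediction and its standard error in log10 units.
import math

def exceedance_probability(pred_conc, log10_se, criterion):
    """P(true concentration > criterion) for a lognormal prediction."""
    z = (math.log10(criterion) - math.log10(pred_conc)) / log10_se
    return 1.0 - 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

p = exceedance_probability(pred_conc=4.0, log10_se=0.3, criterion=5.0)
print(f"P(exceeds 5 mg/L) = {p:.2f}")
```

When the prediction equals the criterion the probability is exactly 0.5, which is why incorporating prediction uncertainty matters for reaches predicted near a threshold.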
Divergence of conserved non-coding sequences: rate estimates and relative rate tests.
Wagner, Günter P; Fried, Claudia; Prohaska, Sonja J; Stadler, Peter F
2004-11-01
In many eukaryotic genomes only a small fraction of the DNA codes for proteins, but the non-protein coding DNA harbors important genetic elements directing the development and the physiology of the organisms, like promoters, enhancers, insulators, and micro-RNA genes. The molecular evolution of these genetic elements is difficult to study because their functional significance is hard to deduce from sequence information alone. Here we propose an approach to the study of the rate of evolution of functional non-coding sequences at a macro-evolutionary scale. We identify functionally important non-coding sequences as Conserved Non-Coding Nucleotide (CNCN) sequences from the comparison of two outgroup species. The CNCN sequences so identified are then compared to their homologous sequences in a pair of ingroup species, and we monitor the degree of modification these sequences suffered in the two ingroup lineages. We propose a method to test for rate differences in the modification of CNCN sequences among the two ingroup lineages, as well as a method to estimate their rate of modification. We apply this method to the full sequences of the HoxA clusters from six gnathostome species: a shark, Heterodontus francisci; a basal ray finned fish, Polypterus senegalus; the amphibian, Xenopus tropicalis; as well as three mammalian species, human, rat and mouse. The results show that the evolutionary rate of CNCN sequences is not distinguishable among the three mammalian lineages, while the Xenopus lineage has a significantly increased rate of evolution. Furthermore the estimates of the rate parameters suggest that in the stem lineage of mammals the rate of CNCN sequence evolution was more than twice the rate observed within the placental amniotes clade, suggesting a high rate of evolution of cis-regulatory elements during the origin of amniotes and mammals. We conclude that the proposed methods can be used for testing hypotheses about the rate and pattern of evolution of putative
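A relative-rate test in the spirit described can be sketched simply: under equal rates, each CNCN modification is equally likely to fall in either ingroup lineage, so the lineage-1 count is Binomial(n1 + n2, 1/2). The exact statistic in the paper may differ, and the counts below are invented.

```python
# Exact two-sided binomial relative-rate test on modification counts.
import math

def two_sided_binomial_p(k, n, p=0.5):
    """Two-sided exact binomial p-value by summing tail probabilities."""
    def pmf(i):
        return math.comb(n, i) * p**i * (1 - p)**(n - i)
    p_obs = pmf(k)
    return min(1.0, sum(pmf(i) for i in range(n + 1) if pmf(i) <= p_obs + 1e-12))

n_lineage1, n_lineage2 = 42, 18   # hypothetical modified-CNCN counts per lineage
p_val = two_sided_binomial_p(n_lineage1, n_lineage1 + n_lineage2)
print(f"two-sided p = {p_val:.4f}")
```

A small p-value, as here, indicates that one lineage (in the paper, Xenopus) has modified its conserved non-coding sequences significantly faster than the other.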
Directory of Open Access Journals (Sweden)
Hukharnsusatrue, A.
2005-11-01
The objective of this research is to compare methods for estimating multiple regression coefficients in the presence of multicollinearity among the independent variables. The estimation methods are the Ordinary Least Squares method (OLS), the Restricted Least Squares method (RLS), the Restricted Ridge Regression method (RRR) and the Restricted Liu method (RL), when the restrictions are true and when they are not. The study used the Monte Carlo simulation method, with the experiment repeated 1,000 times under each situation. The results are as follows. CASE 1: The restrictions are true. In all cases, the RRR and RL methods have a smaller Average Mean Square Error (AMSE) than the OLS and RLS methods, respectively. The RRR method provides the smallest AMSE when the level of correlation is high, and also provides the smallest AMSE for all levels of correlation and all sample sizes when the standard deviation is equal to 5. However, the RL method provides the smallest AMSE when the level of correlation is low or middle, except in the case of standard deviation equal to 3 with small sample sizes, where the RRR method provides the smallest AMSE. The AMSE varies, most to least, with the level of correlation, the standard deviation and the number of independent variables, but inversely with the sample size. CASE 2: The restrictions are not true. In all cases, the RRR method provides the smallest AMSE, except in the case of standard deviation equal to 1 and error of restrictions equal to 5%, where the OLS method provides the smallest AMSE when the level of correlation is low or medium and the sample size is large, but for small sample sizes the RL method provides the smallest AMSE. In addition, when the error of the restrictions increases, the OLS method provides the smallest AMSE for all levels of correlation and all sample sizes, except when the level of correlation is high and the sample size is small. Moreover, in the cases where the OLS method provides the smallest AMSE, the RLS method mostly has a smaller AMSE than
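The core comparison can be sketched in a small Monte Carlo experiment: under strong multicollinearity, ridge-type shrinkage typically attains a smaller average mean squared error of the coefficients than OLS. This toy version uses ordinary (unrestricted) ridge and invented settings; the restricted variants in the study are omitted.

```python
# Monte Carlo AMSE comparison of OLS vs ridge under severe multicollinearity.
import numpy as np

rng = np.random.default_rng(42)
beta = np.array([1.0, 2.0, -1.5])
n, reps, k = 30, 500, 1.0           # sample size, replications, ridge penalty

mse_ols = mse_ridge = 0.0
for _ in range(reps):
    z = rng.normal(size=n)
    # three nearly identical regressors -> ill-conditioned design
    X = np.column_stack([z + 0.05 * rng.normal(size=n) for _ in range(3)])
    y = X @ beta + 3.0 * rng.normal(size=n)
    b_ols = np.linalg.solve(X.T @ X, X.T @ y)
    b_ridge = np.linalg.solve(X.T @ X + k * np.eye(3), X.T @ y)
    mse_ols += np.sum((b_ols - beta) ** 2) / reps
    mse_ridge += np.sum((b_ridge - beta) ** 2) / reps

print(f"AMSE OLS = {mse_ols:.2f}, AMSE ridge = {mse_ridge:.2f}")
```

The ridge penalty trades a little bias for a large variance reduction along the near-collinear directions, which is exactly the mechanism the AMSE comparisons in the study exploit.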
Drilling Penetration Rate Estimation using Rock Drillability Characterization Index
Taheri, Abbas; Qao, Qi; Chanda, Emmanuel
2016-10-01
Rock drilling Penetration Rate (PR) is influenced by many parameters including rock properties, machine parameters of the chosen rig and the working process. Five datasets were utilized to quantitatively assess the effect of various rock properties on PR. The datasets consisted of two sets of diamond and percussive drilling and one set of rotary drilling data. A new rating system called Rock Drillability Characterization index (RDCi) is proposed to predict PR for different drilling methods. This drillability model incorporates the uniaxial compressive strength of intact rock, the P-wave velocity and the density of rock. The RDCi system is further applied to predict PR in the diamond rotary drilling, non-coring rotary drilling and percussive drilling. Strong correlations between PR and RDCi values were observed indicating that the developed drillability rating model is relevant and can be utilized to effectively predict the rock drillability in any operating environment. A practical procedure for predicting PR using the RDCi was established. The drilling engineers can follow this procedure to use RDCi as an effective method to estimate drillability.
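The abstract names the RDCi inputs (uniaxial compressive strength, P-wave velocity, density) but not its weights, so the composite index and the fitted curve below are hypothetical stand-ins for the published rating, meant only to show the shape of such a procedure.

```python
# Hypothetical drillability index from UCS, P-wave velocity and density,
# with a power-law fit of penetration rate (PR) against the index.
import numpy as np

def drillability_index(ucs_mpa, vp_km_s, density_g_cm3):
    """Hypothetical composite: stronger, faster, denser rock -> higher index."""
    return (ucs_mpa / 100.0) * (vp_km_s / 4.0) * (density_g_cm3 / 2.7)

rocks = np.array([
    # UCS (MPa), Vp (km/s), density (g/cm^3), measured PR (m/h) -- all made up
    [60.0, 3.2, 2.5, 2.4],
    [120.0, 4.5, 2.7, 1.1],
    [200.0, 5.8, 2.9, 0.6],
])
idx = drillability_index(rocks[:, 0], rocks[:, 1], rocks[:, 2])
# fit PR = a * index^b on the log scale
b, log_a = np.polyfit(np.log(idx), np.log(rocks[:, 3]), 1)
print(f"PR ~= {np.exp(log_a):.2f} * index^{b:.2f}")
```

The negative exponent reflects the expected behaviour: harder, stiffer rock drills more slowly.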
Directory of Open Access Journals (Sweden)
Elisabet Zamora
BACKGROUND: To compare the prognostic value of estimated glomerular filtration rate, cystatin-C, an alternative renal biomarker, and their combination, in an outpatient population with heart failure. Estimated glomerular filtration rate is routinely used to assess renal function in heart failure patients. We recently demonstrated that the Cockcroft-Gault formula is the best among the most commonly used estimated glomerular filtration rate formulas for predicting heart failure prognosis. METHODOLOGY/PRINCIPAL FINDINGS: A total of 879 consecutive patients (72% men, age 70.4 years [P25-75 60.5-77.2]) were studied. The etiology of heart failure was mainly ischemic heart disease (52.7%). The left ventricular ejection fraction was 34% (P25-75 26-43%). Most patients were New York Heart Association class II (65.8%) or III (25.9%). During a median follow-up of 3.46 years (P25-75 1.85-5.05), 312 deaths were recorded. In an adjusted model, estimated glomerular filtration rate and cystatin-C showed similar prognostic value according to the area under the curve (0.763 and 0.765, respectively). In Cox regression, the multivariable analysis hazard ratios were 0.99 (95% CI: 0.98-1, P = 0.006) and 1.14 (95% CI: 1.02-1.28, P = 0.02) for estimated glomerular filtration rate and cystatin-C, respectively. Reclassification, assessed by the integrated discrimination improvement and the net reclassification improvement indices, was poorer with cystatin-C (-0.5 [-1.0; -0.1], P = 0.024 and -4.9 [-8.8; -1.0], P = 0.013, respectively). The value of cystatin-C over estimated glomerular filtration rate for risk stratification only emerged in patients with moderate renal dysfunction (eGFR 30-60 mL/min/1.73 m2, chi-square 12.9, P<0.001). CONCLUSIONS/SIGNIFICANCE: Taken together, the results indicate that estimated glomerular filtration rate and cystatin-C have similar long-term predictive values in a real-life ambulatory heart failure population. Cystatin-C seems to
Zou, Kelly H; Carlsson, Martin O; Quinn, Sheila A
2010-10-30
Patient reported outcome and observer evaluative studies in clinical trials and post-hoc analyses often use instruments that measure responses on ordinal-rating or Likert scales. We propose a flexible distributional approach by modeling the change scores from the baseline to the end of the study using independent beta distributions. The two shape parameters of the fitted beta distributions are estimated by matching-moments. Covariates and the interaction terms are included in multivariate beta-regression analyses under generalized linear mixed models. These methods are illustrated on the treatment satisfaction data in an overactive bladder drug study with four treatment arms. Monte-Carlo simulations were conducted to compare the Type 1 errors and statistical powers using a beta likelihood ratio test of the proposed method against its fully nonparametric or parametric alternatives. Copyright © 2010 John Wiley & Sons, Ltd.
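The matching-moments step described can be shown directly: the two shape parameters of a beta distribution are recovered in closed form from a sample mean and variance. The values below are illustrative, not the trial's data.

```python
# Method-of-moments fit of a Beta(alpha, beta) distribution.
def beta_moments(mean, var):
    """Return (alpha, beta) matching the given mean and variance."""
    common = mean * (1.0 - mean) / var - 1.0
    return mean * common, (1.0 - mean) * common

alpha, beta = beta_moments(0.7, 0.03)
print(f"alpha = {alpha:.2f}, beta = {beta:.2f}")
```

Plugging the estimates back into the beta mean and variance formulas recovers the inputs exactly, which is a quick sanity check on the closed-form inversion.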
Kanungo, D. P.; Sharma, Shaifaly; Pain, Anindya
2014-09-01
The shear strength parameters of soil (cohesion and angle of internal friction) are quite essential in solving many civil engineering problems. In order to determine these parameters, laboratory tests are used. The main objective of this work is to evaluate the potential of Artificial Neural Network (ANN) and Regression Tree (CART) techniques for the indirect estimation of these parameters. Four different models, considering different combinations of 6 inputs, such as gravel %, sand %, silt %, clay %, dry density, and plasticity index, were investigated to evaluate the degree of their effects on the prediction of shear parameters. A performance evaluation was carried out using Correlation Coefficient and Root Mean Squared Error measures. It was observed that for the prediction of friction angle, the performance of both the techniques is about the same. However, for the prediction of cohesion, the ANN technique performs better than the CART technique. It was further observed that the model considering all of the 6 input soil parameters is the most appropriate model for the prediction of shear parameters. Also, connection weight and bias analyses of the best neural network (i.e., 6/2/2) were attempted using Connection Weight, Garson, and proposed Weight-bias approaches to characterize the influence of input variables on shear strength parameters. It was observed that the Connection Weight Approach provides the best overall methodology for accurately quantifying variable importance, and should be favored over the other approaches examined in this study.
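Garson's connection-weight importance measure mentioned above can be sketched on a tiny single-hidden-layer network; the 6/2/2 architecture in the study would work the same way. The weight values here are made up for illustration.

```python
# Garson's algorithm: input importance from |weight| products in a
# one-hidden-layer network.
import numpy as np

def garson_importance(w_ih, w_ho):
    """Relative importance of each input variable (rows of w_ih)."""
    contrib = np.abs(w_ih) * np.abs(w_ho).sum(axis=1)   # input x hidden
    contrib = contrib / contrib.sum(axis=0)             # normalise per hidden unit
    imp = contrib.sum(axis=1)
    return imp / imp.sum()

w_ih = np.array([[0.8, -0.1], [0.2, 0.9], [-0.1, 0.2]])  # 3 inputs -> 2 hidden
w_ho = np.array([[1.0], [-0.5]])                          # 2 hidden -> 1 output
imp = garson_importance(w_ih, w_ho)
print(imp)
```

The importances sum to one, so each entry reads directly as the share of the network's weight magnitude attributable to that input.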
Directory of Open Access Journals (Sweden)
Jevrić Lidija R.
2013-01-01
The estimation of retention factors by correlation equations with physico-chemical properties can be of great help in chromatographic studies. The retention factors were experimentally measured by RP-HPTLC on silica gel impregnated with paraffin oil, using two-component solvent systems. The relationships between solute retention and modifier concentration were described by Snyder's linear equation. A quantitative structure-retention relationship was developed for a series of s-triazine compounds by multiple linear regression (MLR) analysis. The MLR procedure was used to model the relationships between the molecular descriptors and the retention of s-triazine derivatives. The physico-chemical molecular descriptors were calculated from the optimized structures. The physico-chemical properties were the lipophilicity (log P), connectivity indices (χ), total energy (Et), water solubility (log W), dissociation constant (pKa), molar refractivity (MR), and Gibbs energy (GibbsE) of the s-triazines. A high agreement between the experimental and predicted retention parameters was obtained when the dissociation constant and the hydrophilic-lipophilic balance were used as the molecular descriptors. The empirical equations may be successfully used for the prediction of various chromatographic characteristics of substances with a similar chemical structure. [Project of the Ministry of Science of the Republic of Serbia, nos. 31055, 172012, 172013 and 172014]
Akita, Yasuyuki; Baldasano, Jose M; Beelen, Rob; Cirach, Marta; de Hoogh, Kees; Hoek, Gerard; Nieuwenhuijsen, Mark; Serre, Marc L; de Nazelle, Audrey
2014-04-15
In recognition that intraurban exposure gradients may be as large as between-city variations, recent air pollution epidemiologic studies have become increasingly interested in capturing within-city exposure gradients. In addition, because of the rapidly accumulating health data, recent studies also need to handle large study populations distributed over large geographic domains. Even though several modeling approaches have been introduced, a consistent modeling framework capturing within-city exposure variability and applicable to large geographic domains is still missing. To address these needs, we proposed a modeling framework based on the Bayesian Maximum Entropy method that integrates monitoring data and outputs from existing air quality models based on Land Use Regression (LUR) and Chemical Transport Models (CTM). The framework was applied to estimate the yearly average NO2 concentrations over the region of Catalunya in Spain. By jointly accounting for the global scale variability in the concentration from the output of CTM and the intraurban scale variability through LUR model output, the proposed framework outperformed more conventional approaches.
Directory of Open Access Journals (Sweden)
Corrado Dimauro
2010-01-01
Two methods of SNP pre-selection based on single-marker regression for the estimation of genomic breeding values (G-EBVs) were compared using simulated data provided by the XII QTL-MAS workshop: (i) Bonferroni correction of the significance threshold and (ii) a permutation test to obtain the reference distribution of the null hypothesis and identify significant markers at the P<0.01 and P<0.001 significance thresholds. From the set of markers significant at P<0.001, random subsets of 50% and 25% of the markers were extracted, to evaluate the effect of further reducing the number of significant SNPs on G-EBV predictions. The Bonferroni correction method allowed the identification of 595 significant SNPs that gave the best G-EBV accuracies in the prediction generations (82.80%). The permutation methods gave slightly lower G-EBV accuracies even though a larger number of SNPs resulted significant (2,053 and 1,352 for the 0.01 and 0.001 significance thresholds, respectively). Interestingly, halving or dividing by four the number of SNPs significant at P<0.001 resulted in only a slight decrease of G-EBV accuracies. The genetic structure of the simulated population, with few QTL carrying large effects, might have favoured the Bonferroni method.
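The permutation approach described can be sketched on simulated data: shuffle the phenotype, record the maximum single-marker statistic over all markers, and take an upper quantile of those maxima as the genome-wide threshold. Marker counts, effect size, and the number of permutations below are invented for illustration.

```python
# Single-marker regression with a permutation-based significance threshold.
import numpy as np

rng = np.random.default_rng(7)
n, m = 300, 1000                            # animals, markers
geno = rng.integers(0, 3, size=(n, m)).astype(float)
y = geno[:, 0] * 0.8 + rng.normal(size=n)   # only marker 0 has a true effect

def marker_stats(y, geno):
    """|t|-statistic of each single-marker regression slope."""
    g = (geno - geno.mean(axis=0)) / geno.std(axis=0)
    yc = (y - y.mean()) / y.std()
    r = g.T @ yc / len(y)                   # marker-trait correlations
    return np.abs(r) * np.sqrt((len(y) - 2) / (1 - r**2))

t_obs = marker_stats(y, geno)
# null distribution of the maximum statistic under shuffled phenotypes
null_max = [marker_stats(rng.permutation(y), geno).max() for _ in range(100)]
perm_threshold = np.quantile(null_max, 0.95)
print("markers passing:", int((t_obs > perm_threshold).sum()))
```

Taking the maximum over markers in each permutation controls the genome-wide error rate empirically, which is why permutation thresholds tend to be less conservative than Bonferroni and admit more SNPs.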
Cheng, Yu-Huei
2015-01-01
Primers play an important role in polymerase chain reaction (PCR) experiments, so it is necessary to select primers with suitable characteristics. Unfortunately, manual primer design is time-consuming and prone to human error because many PCR constraints must be considered simultaneously, so automated primer design programs are urgently needed. In this study, the teaching-learning-based optimization (TLBO) method, which is robust and free of algorithm-specific parameters, is applied to screen primers that conform to the primer constraints. The optimal primer frequency (OPF), based on three known melting temperature formulas, is estimated from 500 runs of primer design at each number of generations. We selected optimal primers from fifty random nucleotide sequences of Homo sapiens at NCBI. The results indicate that SantaLucia's formula is better coupled with the method, yielding a higher optimal primer frequency and shorter CPU time than Wallace's formula and Bolton and McCarthy's formula. Through regression analysis, we also find that the number of generations is significantly associated with the optimal primer frequency. These results are helpful for developing a novel TLBO-based computational method to design feasible primers.
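Two of the three melting-temperature formulas mentioned are simple enough to sketch directly: the Wallace rule and the Bolton-McCarthy GC-content formula (the SantaLucia model needs nearest-neighbour thermodynamic tables and is omitted). The 50 mM Na+ concentration is an assumed default, and the primer sequence is invented.

```python
# Two classical primer melting-temperature (Tm) formulas.
import math

def tm_wallace(primer):
    """Wallace rule for short oligos: Tm = 2(A+T) + 4(G+C)."""
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc

def tm_bolton_mccarthy(primer, na_molar=0.05):
    """Tm = 81.5 + 16.6*log10[Na+] + 0.41*(%GC) - 600/length."""
    gc_percent = 100.0 * (primer.count("G") + primer.count("C")) / len(primer)
    return 81.5 + 16.6 * math.log10(na_molar) + 0.41 * gc_percent - 600.0 / len(primer)

primer = "AGCGTACCGTTAGCACGTAG"   # invented 20-mer
print(tm_wallace(primer), round(tm_bolton_mccarthy(primer), 1))
```

A primer-design constraint check would compare these Tm values (and the Tm difference between a primer pair) against the accepted design ranges.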
Estimating glomerular filtration rate preoperatively for patients undergoing hepatectomy
Institute of Scientific and Technical Information of China (English)
Yoshimi Iwasaki; Tokihiko Sawada; Shozo Mori; Yukihiro Iso; Masato Katoh; Kyu Rokkaku; Junji Kita; Mitsugi Shimoda; Keiichi Kubota
2009-01-01
AIM: To compare creatinine clearance (Ccr) with estimated glomerular filtration rate (eGFR) in preoperative renal function tests in patients undergoing hepatectomy. METHODS: The records of 197 patients undergoing hepatectomy between August 2006 and August 2008 were studied, and preoperative Ccr, a three-variable equation for eGFR (eGFR3) and a five-variable equation for eGFR (eGFR5) were calculated. Abnormal values were defined as Ccr < 50 mL/min, and eGFR3 and eGFR5 < 60 mL/min per 1.73 m2. The maximum increases in the postoperative serum creatinine (post Cr) level and the postoperative rate of increase in the serum Cr level (post Cr rate) were compared. RESULTS: There were 37 patients (18.8%) with abnormal Ccr, 31 (15.7%) with abnormal eGFR3, and 40 (20.3%) with abnormal eGFR5. Although there were no significant differences in the post Cr rate between patients with normal and abnormal Ccr, eGFR3 and eGFR5 values, the post Cr level was significantly higher in patients with eGFR3 and eGFR5 abnormality than in normal patients (P < 0.0001). The post Cr level tended to be higher in patients with Ccr abnormality (P = 0.0936 and P = 0.0875, respectively). CONCLUSION: eGFR5 and the simpler eGFR3, rather than Ccr, are recommended as preoperative renal function tests in patients undergoing hepatectomy.
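For concreteness, here is a sketch of two of the measures compared: the Cockcroft-Gault creatinine clearance and the Japanese three-variable eGFR equation. Whether these are the exact equations used in this study is an assumption, and the patient values below are invented.

```python
# Cockcroft-Gault Ccr and the Japanese 3-variable eGFR equation (assumed
# forms; illustrative patient values).

def cockcroft_gault_ccr(age, weight_kg, scr_mg_dl, female=False):
    """Creatinine clearance (mL/min): (140 - age) * weight / (72 * SCr)."""
    ccr = (140 - age) * weight_kg / (72.0 * scr_mg_dl)
    return ccr * 0.85 if female else ccr

def egfr3_japanese(age, scr_mg_dl, female=False):
    """eGFR (mL/min per 1.73 m2): 194 * SCr^-1.094 * age^-0.287."""
    egfr = 194.0 * scr_mg_dl ** -1.094 * age ** -0.287
    return egfr * 0.739 if female else egfr

ccr = cockcroft_gault_ccr(age=65, weight_kg=60, scr_mg_dl=1.0)
egfr3 = egfr3_japanese(age=65, scr_mg_dl=1.0)
print(round(ccr, 1), round(egfr3, 1))
```

Note how the same patient can be classified normal by the Ccr cutoff (>= 50 mL/min) yet abnormal by the eGFR cutoff (< 60 mL/min per 1.73 m2), which is the kind of discordance the study's counts reflect.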
Energy Technology Data Exchange (ETDEWEB)
Dey, Prasenjit; Dad, Ajoy K. [Mechanical Engineering Department, National Institute of Technology, Agartala (India)
2016-12-15
The present study aims to predict the heat transfer characteristics around a square cylinder with different corner radii using multivariate adaptive regression splines (MARS). Further, the MARS-generated objective function is optimized by particle swarm optimization. The data for the prediction are taken from the recently published article by the present authors [P. Dey, A. Sarkar, A.K. Das, Development of GEP and ANN model to predict the unsteady forced convection over a cylinder, Neural Comput. Appl. (2015)]. Further, the MARS model is compared with artificial neural network and gene expression programming. It has been found that the MARS model is very efficient in predicting the heat transfer characteristics. It has also been found that MARS is more efficient than artificial neural network and gene expression programming in predicting the forced convection data, and also that particle swarm optimization can efficiently optimize the heat transfer rate.
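The core MARS ingredient can be sketched with its piecewise-linear hinge basis functions, h(x - t) and h(t - x), combined by least squares. This toy version uses a single known knot and invented data; real MARS searches over knots and interaction terms adaptively.

```python
# Hinge-basis regression: the building block of MARS, with one fixed knot.
import numpy as np

def hinge(x, t):
    return np.maximum(x - t, 0.0)

rng = np.random.default_rng(3)
x = np.sort(rng.uniform(0, 10, 200))
# piecewise-linear truth with a kink at x = 5, plus noise
y = np.where(x < 5, 1.0 * x, 5.0 + 0.2 * (x - 5)) + 0.1 * rng.normal(size=200)

knot = 5.0
B = np.column_stack([np.ones_like(x), hinge(x, knot), hinge(knot, x)])
coef, *_ = np.linalg.lstsq(B, y, rcond=None)
resid = y - B @ coef
rmse = np.sqrt(np.mean(resid ** 2))
print("RMSE:", rmse)
```

Because the truth is itself piecewise linear with a kink at the knot, the three hinge terms reproduce it almost exactly, leaving only the noise in the residuals.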
Gambling disorder: estimated prevalence rates and risk factors in Macao.
Wu, Anise M S; Lai, Mark H C; Tong, Kwok-Kit
2014-12-01
An excessive, problematic gambling pattern has been regarded as a mental disorder in the Diagnostic and Statistical Manual for Mental Disorders (DSM) for more than 3 decades (American Psychiatric Association [APA], 1980). In this study, its latest prevalence in Macao (one of very few cities with legalized gambling in China and the Far East) was estimated with 2 major changes in the diagnostic criteria, suggested by the 5th edition of DSM (APA, 2013): (a) removing the "Illegal Act" criterion, and (b) lowering the threshold for diagnosis. A random, representative sample of 1,018 Macao residents was surveyed with a phone poll design in January 2013. After the 2 changes were adopted, the present study showed that the estimated prevalence rate of gambling disorder was 2.1% of the Macao adult population. Moreover, the present findings also provided empirical support to the application of these 2 recommended changes when assessing symptoms of gambling disorder among Chinese community adults. Personal risk factors of gambling disorder, namely being male, having low education, a preference for casino gambling, as well as high materialism, were identified.
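The reported point prevalence (2.1% of n = 1,018) carries sampling uncertainty that can be sketched with a 95% Wilson confidence interval; the interval below is our illustration, not a figure from the study.

```python
# Wilson score interval for a surveyed prevalence proportion.
import math

def wilson_interval(p_hat, n, z=1.96):
    centre = (p_hat + z * z / (2 * n)) / (1 + z * z / n)
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / (1 + z * z / n)
    return centre - half, centre + half

lo, hi = wilson_interval(0.021, 1018)
print(f"95% CI: {lo:.3f} - {hi:.3f}")
```

The Wilson interval is preferred over the simple Wald interval for proportions this close to zero, where the Wald form can be badly anti-conservative.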