Yelland, Lisa N; Salter, Amy B; Ryan, Philip
2011-10-15
Modified Poisson regression, which combines a log Poisson regression model with robust variance estimation, is a useful alternative to log binomial regression for estimating relative risks. Previous studies have shown both analytically and by simulation that modified Poisson regression is appropriate for independent prospective data. This method is often applied to clustered prospective data, despite a lack of evidence to support its use in this setting. The purpose of this article is to evaluate the performance of the modified Poisson regression approach for estimating relative risks from clustered prospective data, by using generalized estimating equations to account for clustering. A simulation study is conducted to compare log binomial regression and modified Poisson regression for analyzing clustered data from intervention and observational studies. Both methods generally perform well in terms of bias, type I error, and coverage. Unlike log binomial regression, modified Poisson regression is not prone to convergence problems. The methods are contrasted by using example data sets from 2 large studies. The results presented in this article support the use of modified Poisson regression as an alternative to log binomial regression for analyzing clustered prospective data when clustering is taken into account by using generalized estimating equations.
Regression estimators for generic health-related quality of life and quality-adjusted life years.
Basu, Anirban; Manca, Andrea
2012-01-01
To develop regression models for outcomes with truncated supports, such as health-related quality of life (HRQoL) data, and account for features typical of such data such as a skewed distribution, spikes at 1 or 0, and heteroskedasticity. Regression estimators based on features of the Beta distribution. First, both a single equation and a 2-part model are presented, along with estimation algorithms based on maximum-likelihood, quasi-likelihood, and Bayesian Markov-chain Monte Carlo methods. A novel Bayesian quasi-likelihood estimator is proposed. Second, a simulation exercise is presented to assess the performance of the proposed estimators against ordinary least squares (OLS) regression for a variety of HRQoL distributions that are encountered in practice. Finally, the performance of the proposed estimators is assessed by using them to quantify the treatment effect on QALYs in the EVALUATE hysterectomy trial. Overall model fit is studied using several goodness-of-fit tests such as Pearson's correlation test, link and reset tests, and a modified Hosmer-Lemeshow test. The simulation results indicate that the proposed methods are more robust in estimating covariate effects than OLS, especially when the effects are large or the HRQoL distribution has a large spike at 1. Quasi-likelihood techniques are more robust than maximum likelihood estimators. When applied to the EVALUATE trial, all but the maximum likelihood estimators produce unbiased estimates of the treatment effect. One and 2-part Beta regression models provide flexible approaches to regress the outcomes with truncated supports, such as HRQoL, on covariates, after accounting for many idiosyncratic features of the outcomes distribution. This work will provide applied researchers with a practical set of tools to model outcomes in cost-effectiveness analysis.
Rudra, Carole B; Williams, Michelle A; Sheppard, Lianne; Koenig, Jane Q; Schiff, Melissa A; Frederick, Ihunnaya O; Dills, Russell
2010-04-15
Exposure to carbon monoxide (CO) and other ambient air pollutants is associated with adverse pregnancy outcomes. While there are several methods of estimating CO exposure, few have been evaluated against exposure biomarkers. The authors examined the relation between estimated CO exposure and blood carboxyhemoglobin concentration in 708 pregnant western Washington State women (1996-2004). Carboxyhemoglobin was measured in whole blood drawn around 13 weeks' gestation. CO exposure during the month of blood draw was estimated using a regression model containing predictor terms for year, month, street and population densities, and distance to the nearest major road. Year and month were the strongest predictors. Carboxyhemoglobin level was correlated with estimated CO exposure (rho = 0.22, 95% confidence interval (CI): 0.15, 0.29). After adjustment for covariates, each 10% increase in estimated exposure was associated with a 1.12% increase in median carboxyhemoglobin level (95% CI: 0.54, 1.69). This association remained after exclusion of 286 women who reported smoking or being exposed to secondhand smoke (rho = 0.24). In this subgroup, the median carboxyhemoglobin concentration increased 1.29% (95% CI: 0.67, 1.91) for each 10% increase in CO exposure. Monthly estimated CO exposure was moderately correlated with an exposure biomarker. These results support the validity of this regression model for estimating ambient CO exposures in this population and geographic setting.
Estimation of genetic parameters related to eggshell strength using random regression models.
Guo, J; Ma, M; Qu, L; Shen, M; Dou, T; Wang, K
2015-01-01
This study examined the changes in eggshell strength and the genetic parameters related to this trait throughout a hen's laying life using random regression. The data were collected from a crossbred population between 2011 and 2014, where the eggshell strength was determined repeatedly for 2260 hens. Using random regression models (RRMs), several Legendre polynomials were employed to estimate the fixed, direct genetic and permanent environment effects. The residual effects were treated as independently distributed with heterogeneous variance for each test week. The direct genetic variance was included with second-order Legendre polynomials and the permanent environment with third-order Legendre polynomials. The heritability of eggshell strength ranged from 0.26 to 0.43, the repeatability ranged between 0.47 and 0.69, and the estimated genetic correlations between test weeks was high at > 0.67. The first eigenvalue of the genetic covariance matrix accounted for about 97% of the sum of all the eigenvalues. The flexibility and statistical power of RRM suggest that this model could be an effective method to improve eggshell quality and to reduce losses due to cracked eggs in a breeding plan.
van de Kassteele, Jan; Zwakhals, Laurens; Breugelmans, Oscar; Ameling, Caroline; van den Brink, Carolien
2017-07-01
Local policy makers increasingly need information on health-related indicators at smaller geographic levels like districts or neighbourhoods. Although more large data sources have become available, direct estimates of the prevalence of a health-related indicator cannot be produced for neighbourhoods for which only small samples or no samples are available. Small area estimation provides a solution, but unit-level models for binary-valued outcomes that can handle both non-linear effects of the predictors and spatially correlated random effects in a unified framework are rarely encountered. We used data on 26 binary-valued health-related indicators collected on 387,195 persons in the Netherlands. We associated the health-related indicators at the individual level with a set of 12 predictors obtained from national registry data. We formulated a structured additive regression model for small area estimation. The model captured potential non-linear relations between the predictors and the outcome through additive terms in a functional form using penalized splines and included a term that accounted for spatially correlated heterogeneity between neighbourhoods. The registry data were used to predict individual outcomes which in turn are aggregated into higher geographical levels, i.e. neighbourhoods. We validated our method by comparing the estimated prevalences with observed prevalences at the individual level and by comparing the estimated prevalences with direct estimates obtained by weighting methods at municipality level. We estimated the prevalence of the 26 health-related indicators for 415 municipalities, 2599 districts and 11,432 neighbourhoods in the Netherlands. We illustrate our method on overweight data and show that there are distinct geographic patterns in the overweight prevalence. Calibration plots show that the estimated prevalences agree very well with observed prevalences at the individual level. The estimated prevalences agree reasonably well with the
Moreno-Betancur, Margarita; Latouche, Aurélien; Menvielle, Gwenn; Kunst, Anton E.; Rey, Grégoire
2015-01-01
The relative index of inequality and the slope index of inequality are the two major indices used in epidemiologic studies for the measurement of socioeconomic inequalities in health. Yet the current definitions of these indices are not adapted to their main purpose, which is to provide summary
Directory of Open Access Journals (Sweden)
Wedagama D.M.P.
2010-01-01
Full Text Available In Denpasar the capital of Bali Province, motorcycle accident contributes to about 80% of total road accidents. Out of those motorcycle accidents, 32% are fatal accidents. This study investigates the influence of accident related factors on motorcycle fatal accidents in the city of Denpasar during period 2006-2008 using a logistic regression model. The study found that the fatality of collision with pedestrians and right angle accidents were respectively about 0.44 and 0.40 times lower than collision with other vehicles and accidents due to other factors. In contrast, the odds that a motorcycle accident will be fatal due to collision with heavy and light vehicles were 1.67 times more likely than with other motorcycles. Collision with pedestrians, right angle accidents, and heavy and light vehicles were respectively accounted for 31%, 29%, and 63% of motorcycle fatal accidents.
Dynamic travel time estimation using regression trees.
2008-10-01
This report presents a methodology for travel time estimation by using regression trees. The dissemination of travel time information has become crucial for effective traffic management, especially under congested road conditions. In the absence of c...
A logistic regression estimating function for spatial Gibbs point processes
DEFF Research Database (Denmark)
Baddeley, Adrian; Coeurjolly, Jean-François; Rubak, Ege
We propose a computationally efficient logistic regression estimating function for spatial Gibbs point processes. The sample points for the logistic regression consist of the observed point pattern together with a random pattern of dummy points. The estimating function is closely related to the p......We propose a computationally efficient logistic regression estimating function for spatial Gibbs point processes. The sample points for the logistic regression consist of the observed point pattern together with a random pattern of dummy points. The estimating function is closely related...
Regression Equations for Birth Weight Estimation using ...
African Journals Online (AJOL)
In this study, Birth Weight has been estimated from anthropometric measurements of hand and foot. Linear regression equations were formed from each of the measured variables. These simple equations can be used to estimate Birth Weight of new born babies, in order to identify those with low birth weight and referred to ...
Principal component regression for crop yield estimation
Suryanarayana, T M V
2016-01-01
This book highlights the estimation of crop yield in Central Gujarat, especially with regard to the development of Multiple Regression Models and Principal Component Regression (PCR) models using climatological parameters as independent variables and crop yield as a dependent variable. It subsequently compares the multiple linear regression (MLR) and PCR results, and discusses the significance of PCR for crop yield estimation. In this context, the book also covers Principal Component Analysis (PCA), a statistical procedure used to reduce a number of correlated variables into a smaller number of uncorrelated variables called principal components (PC). This book will be helpful to the students and researchers, starting their works on climate and agriculture, mainly focussing on estimation models. The flow of chapters takes the readers in a smooth path, in understanding climate and weather and impact of climate change, and gradually proceeds towards downscaling techniques and then finally towards development of ...
Robust median estimator in logisitc regression
Czech Academy of Sciences Publication Activity Database
Hobza, T.; Pardo, L.; Vajda, Igor
2008-01-01
Roč. 138, č. 12 (2008), s. 3822-3840 ISSN 0378-3758 R&D Projects: GA MŠk 1M0572 Grant - others:Instituto Nacional de Estadistica (ES) MPO FI - IM3/136; GA MŠk(CZ) MTM 2006-06872 Institutional research plan: CEZ:AV0Z10750506 Keywords : Logistic regression * Median * Robustness * Consistency and asymptotic normality * Morgenthaler * Bianco and Yohai * Croux and Hasellbroeck Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 0.679, year: 2008 http://library.utia.cas.cz/separaty/2008/SI/vajda-robust%20median%20estimator%20in%20logistic%20regression.pdf
Hodgson, Robert; Reason, Timothy; Trueman, David; Wickstead, Rose; Kusel, Jeanette; Jasilek, Adam; Claxton, Lindsay; Taylor, Matthew; Pulikottil-Jacob, Ruth
2017-10-01
The estimation of utility values for the economic evaluation of therapies for wet age-related macular degeneration (AMD) is a particular challenge. Previous economic models in wet AMD have been criticized for failing to capture the bilateral nature of wet AMD by modelling visual acuity (VA) and utility values associated with the better-seeing eye only. Here we present a de novo regression analysis using generalized estimating equations (GEE) applied to a previous dataset of time trade-off (TTO)-derived utility values from a sample of the UK population that wore contact lenses to simulate visual deterioration in wet AMD. This analysis allows utility values to be estimated as a function of VA in both the better-seeing eye (BSE) and worse-seeing eye (WSE). VAs in both the BSE and WSE were found to be statistically significant (p regression analysis provides a possible source of utility values to allow future economic models to capture the quality of life impact of changes in VA in both eyes. Novartis Pharmaceuticals UK Limited.
Estimating the exceedance probability of rain rate by logistic regression
Chiu, Long S.; Kedem, Benjamin
1990-01-01
Recent studies have shown that the fraction of an area with rain intensity above a fixed threshold is highly correlated with the area-averaged rain rate. To estimate the fractional rainy area, a logistic regression model, which estimates the conditional probability that rain rate over an area exceeds a fixed threshold given the values of related covariates, is developed. The problem of dependency in the data in the estimation procedure is bypassed by the method of partial likelihood. Analyses of simulated scanning multichannel microwave radiometer and observed electrically scanning microwave radiometer data during the Global Atlantic Tropical Experiment period show that the use of logistic regression in pixel classification is superior to multiple regression in predicting whether rain rate at each pixel exceeds a given threshold, even in the presence of noisy data. The potential of the logistic regression technique in satellite rain rate estimation is discussed.
Independent contrasts and PGLS regression estimators are equivalent.
Blomberg, Simon P; Lefevre, James G; Wells, Jessie A; Waterhouse, Mary
2012-05-01
We prove that the slope parameter of the ordinary least squares regression of phylogenetically independent contrasts (PICs) conducted through the origin is identical to the slope parameter of the method of generalized least squares (GLSs) regression under a Brownian motion model of evolution. This equivalence has several implications: 1. Understanding the structure of the linear model for GLS regression provides insight into when and why phylogeny is important in comparative studies. 2. The limitations of the PIC regression analysis are the same as the limitations of the GLS model. In particular, phylogenetic covariance applies only to the response variable in the regression and the explanatory variable should be regarded as fixed. Calculation of PICs for explanatory variables should be treated as a mathematical idiosyncrasy of the PIC regression algorithm. 3. Since the GLS estimator is the best linear unbiased estimator (BLUE), the slope parameter estimated using PICs is also BLUE. 4. If the slope is estimated using different branch lengths for the explanatory and response variables in the PIC algorithm, the estimator is no longer the BLUE, so this is not recommended. Finally, we discuss whether or not and how to accommodate phylogenetic covariance in regression analyses, particularly in relation to the problem of phylogenetic uncertainty. This discussion is from both frequentist and Bayesian perspectives.
A flexible fuzzy regression algorithm for forecasting oil consumption estimation
International Nuclear Information System (INIS)
Azadeh, A.; Khakestani, M.; Saberi, M.
2009-01-01
Oil consumption plays a vital role in socio-economic development of most countries. This study presents a flexible fuzzy regression algorithm for forecasting oil consumption based on standard economic indicators. The standard indicators are annual population, cost of crude oil import, gross domestic production (GDP) and annual oil production in the last period. The proposed algorithm uses analysis of variance (ANOVA) to select either fuzzy regression or conventional regression for future demand estimation. The significance of the proposed algorithm is three fold. First, it is flexible and identifies the best model based on the results of ANOVA and minimum absolute percentage error (MAPE), whereas previous studies consider the best fitted fuzzy regression model based on MAPE or other relative error results. Second, the proposed model may identify conventional regression as the best model for future oil consumption forecasting because of its dynamic structure, whereas previous studies assume that fuzzy regression always provide the best solutions and estimation. Third, it utilizes the most standard independent variables for the regression models. To show the applicability and superiority of the proposed flexible fuzzy regression algorithm the data for oil consumption in Canada, United States, Japan and Australia from 1990 to 2005 are used. The results show that the flexible algorithm provides accurate solution for oil consumption estimation problem. The algorithm may be used by policy makers to accurately foresee the behavior of oil consumption in various regions.
Ridge regression estimator: combining unbiased and ordinary ridge regression methods of estimation
Directory of Open Access Journals (Sweden)
Sharad Damodar Gore
2009-10-01
Full Text Available Statistical literature has several methods for coping with multicollinearity. This paper introduces a new shrinkage estimator, called modified unbiased ridge (MUR. This estimator is obtained from unbiased ridge regression (URR in the same way that ordinary ridge regression (ORR is obtained from ordinary least squares (OLS. Properties of MUR are derived. Results on its matrix mean squared error (MMSE are obtained. MUR is compared with ORR and URR in terms of MMSE. These results are illustrated with an example based on data generated by Hoerl and Kennard (1975.
Parameter Estimation for Improving Association Indicators in Binary Logistic Regression
Directory of Open Access Journals (Sweden)
Mahdi Bashiri
2012-02-01
Full Text Available The aim of this paper is estimation of Binary logistic regression parameters for maximizing the log-likelihood function with improved association indicators. In this paper the parameter estimation steps have been explained and then measures of association have been introduced and their calculations have been analyzed. Moreover a new related indicators based on membership degree level have been expressed. Indeed association measures demonstrate the number of success responses occurred in front of failure in certain number of Bernoulli independent experiments. In parameter estimation, existing indicators values is not sensitive to the parameter values, whereas the proposed indicators are sensitive to the estimated parameters during the iterative procedure. Therefore, proposing a new association indicator of binary logistic regression with more sensitivity to the estimated parameters in maximizing the log- likelihood in iterative procedure is innovation of this study.
Efficient estimation of an additive quantile regression model
Cheng, Y.; de Gooijer, J.G.; Zerom, D.
2011-01-01
In this paper, two non-parametric estimators are proposed for estimating the components of an additive quantile regression model. The first estimator is a computationally convenient approach which can be viewed as a more viable alternative to existing kernel-based approaches. The second estimator
Efficient estimation of an additive quantile regression model
Cheng, Y.; de Gooijer, J.G.; Zerom, D.
2009-01-01
In this paper two kernel-based nonparametric estimators are proposed for estimating the components of an additive quantile regression model. The first estimator is a computationally convenient approach which can be viewed as a viable alternative to the method of De Gooijer and Zerom (2003). By
Efficient estimation of an additive quantile regression model
Cheng, Y.; de Gooijer, J.G.; Zerom, D.
2010-01-01
In this paper two kernel-based nonparametric estimators are proposed for estimating the components of an additive quantile regression model. The first estimator is a computationally convenient approach which can be viewed as a viable alternative to the method of De Gooijer and Zerom (2003). By
Parameters Estimation of Geographically Weighted Ordinal Logistic Regression (GWOLR) Model
Zuhdi, Shaifudin; Retno Sari Saputro, Dewi; Widyaningsih, Purnami
2017-06-01
A regression model is the representation of relationship between independent variable and dependent variable. The dependent variable has categories used in the logistic regression model to calculate odds on. The logistic regression model for dependent variable has levels in the logistics regression model is ordinal. GWOLR model is an ordinal logistic regression model influenced the geographical location of the observation site. Parameters estimation in the model needed to determine the value of a population based on sample. The purpose of this research is to parameters estimation of GWOLR model using R software. Parameter estimation uses the data amount of dengue fever patients in Semarang City. Observation units used are 144 villages in Semarang City. The results of research get GWOLR model locally for each village and to know probability of number dengue fever patient categories.
Sparse reduced-rank regression with covariance estimation
Chen, Lisha
2014-12-08
Improving the predicting performance of the multiple response regression compared with separate linear regressions is a challenging question. On the one hand, it is desirable to seek model parsimony when facing a large number of parameters. On the other hand, for certain applications it is necessary to take into account the general covariance structure for the errors of the regression model. We assume a reduced-rank regression model and work with the likelihood function with general error covariance to achieve both objectives. In addition we propose to select relevant variables for reduced-rank regression by using a sparsity-inducing penalty, and to estimate the error covariance matrix simultaneously by using a similar penalty on the precision matrix. We develop a numerical algorithm to solve the penalized regression problem. In a simulation study and real data analysis, the new method is compared with two recent methods for multivariate regression and exhibits competitive performance in prediction and variable selection.
Sparse reduced-rank regression with covariance estimation
Chen, Lisha; Huang, Jianhua Z.
2014-01-01
Improving the predicting performance of the multiple response regression compared with separate linear regressions is a challenging question. On the one hand, it is desirable to seek model parsimony when facing a large number of parameters. On the other hand, for certain applications it is necessary to take into account the general covariance structure for the errors of the regression model. We assume a reduced-rank regression model and work with the likelihood function with general error covariance to achieve both objectives. In addition we propose to select relevant variables for reduced-rank regression by using a sparsity-inducing penalty, and to estimate the error covariance matrix simultaneously by using a similar penalty on the precision matrix. We develop a numerical algorithm to solve the penalized regression problem. In a simulation study and real data analysis, the new method is compared with two recent methods for multivariate regression and exhibits competitive performance in prediction and variable selection.
On the estimation and testing of predictive panel regressions
Karabiyik, H.; Westerlund, Joakim; Narayan, Paresh
2016-01-01
Hjalmarsson (2010) considers an OLS-based estimator of predictive panel regressions that is argued to be mixed normal under very general conditions. In a recent paper, Westerlund et al. (2016) show that while consistent, the estimator is generally not mixed normal, which invalidates standard normal
Using the Ridge Regression Procedures to Estimate the Multiple Linear Regression Coefficients
Gorgees, HazimMansoor; Mahdi, FatimahAssim
2018-05-01
This article concerns with comparing the performance of different types of ordinary ridge regression estimators that have been already proposed to estimate the regression parameters when the near exact linear relationships among the explanatory variables is presented. For this situations we employ the data obtained from tagi gas filling company during the period (2008-2010). The main result we reached is that the method based on the condition number performs better than other methods since it has smaller mean square error (MSE) than the other stated methods.
Small sample GEE estimation of regression parameters for longitudinal data.
Paul, Sudhir; Zhang, Xuemao
2014-09-28
Longitudinal (clustered) response data arise in many bio-statistical applications which, in general, cannot be assumed to be independent. Generalized estimating equation (GEE) is a widely used method to estimate marginal regression parameters for correlated responses. The advantage of the GEE is that the estimates of the regression parameters are asymptotically unbiased even if the correlation structure is misspecified, although their small sample properties are not known. In this paper, two bias adjusted GEE estimators of the regression parameters in longitudinal data are obtained when the number of subjects is small. One is based on a bias correction, and the other is based on a bias reduction. Simulations show that the performances of both the bias-corrected methods are similar in terms of bias, efficiency, coverage probability, average coverage length, impact of misspecification of correlation structure, and impact of cluster size on bias correction. Both these methods show superior properties over the GEE estimates for small samples. Further, analysis of data involving a small number of subjects also shows improvement in bias, MSE, standard error, and length of the confidence interval of the estimates by the two bias adjusted methods over the GEE estimates. For small to moderate sample sizes (N ≤50), either of the bias-corrected methods GEEBc and GEEBr can be used. However, the method GEEBc should be preferred over GEEBr, as the former is computationally easier. For large sample sizes, the GEE method can be used. Copyright © 2014 John Wiley & Sons, Ltd.
Performance of a New Restricted Biased Estimator in Logistic Regression
Directory of Open Access Journals (Sweden)
Yasin ASAR
2017-12-01
Full Text Available It is known that the variance of the maximum likelihood estimator (MLE inflates when the explanatory variables are correlated. This situation is called the multicollinearity problem. As a result, the estimations of the model may not be trustful. Therefore, this paper introduces a new restricted estimator (RLTE that may be applied to get rid of the multicollinearity when the parameters lie in some linear subspace in logistic regression. The mean squared errors (MSE and the matrix mean squared errors (MMSE of the estimators considered in this paper are given. A Monte Carlo experiment is designed to evaluate the performances of the proposed estimator, the restricted MLE (RMLE, MLE and Liu-type estimator (LTE. The criterion of performance is chosen to be MSE. Moreover, a real data example is presented. According to the results, proposed estimator has better performance than MLE, RMLE and LTE.
Estimating monotonic rates from biological data using local linear regression.
Olito, Colin; White, Craig R; Marshall, Dustin J; Barneche, Diego R
2017-03-01
Accessing many fundamental questions in biology begins with empirical estimation of simple monotonic rates of underlying biological processes. Across a variety of disciplines, ranging from physiology to biogeochemistry, these rates are routinely estimated from non-linear and noisy time series data using linear regression and ad hoc manual truncation of non-linearities. Here, we introduce the R package LoLinR, a flexible toolkit to implement local linear regression techniques to objectively and reproducibly estimate monotonic biological rates from non-linear time series data, and demonstrate possible applications using metabolic rate data. LoLinR provides methods to easily and reliably estimate monotonic rates from time series data in a way that is statistically robust, facilitates reproducible research and is applicable to a wide variety of research disciplines in the biological sciences. © 2017. Published by The Company of Biologists Ltd.
On the robust nonparametric regression estimation for a functional regressor
Azzedine , Nadjia; Laksaci , Ali; Ould-Saïd , Elias
2009-01-01
On the robust nonparametric regression estimation for a functional regressor correspondance: Corresponding author. (Ould-Said, Elias) (Azzedine, Nadjia) (Laksaci, Ali) (Ould-Said, Elias) Departement de Mathematiques--> , Univ. Djillali Liabes--> , BP 89--> , 22000 Sidi Bel Abbes--> - ALGERIA (Azzedine, Nadjia) Departement de Mathema...
Two biased estimation techniques in linear regression: Application to aircraft
Klein, Vladislav
1988-01-01
Several ways for detection and assessment of collinearity in measured data are discussed. Because data collinearity usually results in poor least squares estimates, two estimation techniques which can limit a damaging effect of collinearity are presented. These two techniques, the principal components regression and mixed estimation, belong to a class of biased estimation techniques. Detection and assessment of data collinearity and the two biased estimation techniques are demonstrated in two examples using flight test data from longitudinal maneuvers of an experimental aircraft. The eigensystem analysis and parameter variance decomposition appeared to be a promising tool for collinearity evaluation. The biased estimators had far better accuracy than the results from the ordinary least squares technique.
Online and Batch Supervised Background Estimation via L1 Regression
Dutta, Aritra
2017-11-23
We propose a surprisingly simple model for supervised video background estimation. Our model is based on $\\\\ell_1$ regression. As existing methods for $\\\\ell_1$ regression do not scale to high-resolution videos, we propose several simple and scalable methods for solving the problem, including iteratively reweighted least squares, a homotopy method, and stochastic gradient descent. We show through extensive experiments that our model and methods match or outperform the state-of-the-art online and batch methods in virtually all quantitative and qualitative measures.
Online and Batch Supervised Background Estimation via L1 Regression
Dutta, Aritra; Richtarik, Peter
2017-01-01
We propose a surprisingly simple model for supervised video background estimation. Our model is based on $\\ell_1$ regression. As existing methods for $\\ell_1$ regression do not scale to high-resolution videos, we propose several simple and scalable methods for solving the problem, including iteratively reweighted least squares, a homotopy method, and stochastic gradient descent. We show through extensive experiments that our model and methods match or outperform the state-of-the-art online and batch methods in virtually all quantitative and qualitative measures.
Directory of Open Access Journals (Sweden)
Qiutong Jin
2016-06-01
Full Text Available Estimating the spatial distribution of precipitation is an important and challenging task in hydrology, climatology, ecology, and environmental science. In order to generate a highly accurate distribution map of average annual precipitation for the Loess Plateau in China, multiple linear regression Kriging (MLRK and geographically weighted regression Kriging (GWRK methods were employed using precipitation data from the period 1980–2010 from 435 meteorological stations. The predictors in regression Kriging were selected by stepwise regression analysis from many auxiliary environmental factors, such as elevation (DEM, normalized difference vegetation index (NDVI, solar radiation, slope, and aspect. All predictor distribution maps had a 500 m spatial resolution. Validation precipitation data from 130 hydrometeorological stations were used to assess the prediction accuracies of the MLRK and GWRK approaches. Results showed that both prediction maps with a 500 m spatial resolution interpolated by MLRK and GWRK had a high accuracy and captured detailed spatial distribution data; however, MLRK produced a lower prediction error and a higher variance explanation than GWRK, although the differences were small, in contrast to conclusions from similar studies.
Optimized support vector regression for drilling rate of penetration estimation
Bodaghi, Asadollah; Ansari, Hamid Reza; Gholami, Mahsa
2015-12-01
In the petroleum industry, drilling optimization involves the selection of operating conditions for achieving the desired depth with the minimum expenditure while requirements of personal safety, environment protection, adequate information of penetrated formations and productivity are fulfilled. Since drilling optimization is highly dependent on the rate of penetration (ROP), estimation of this parameter is of great importance during well planning. In this research, a novel approach called `optimized support vector regression' is employed for making a formulation between input variables and ROP. Algorithms used for optimizing the support vector regression are the genetic algorithm (GA) and the cuckoo search algorithm (CS). Optimization implementation improved the support vector regression performance by virtue of selecting proper values for its parameters. In order to evaluate the ability of optimization algorithms in enhancing SVR performance, their results were compared to the hybrid of pattern search and grid search (HPG) which is conventionally employed for optimizing SVR. The results demonstrated that the CS algorithm achieved further improvement on prediction accuracy of SVR compared to the GA and HPG as well. Moreover, the predictive model derived from back propagation neural network (BPNN), which is the traditional approach for estimating ROP, is selected for comparisons with CSSVR. The comparative results revealed the superiority of CSSVR. This study inferred that CSSVR is a viable option for precise estimation of ROP.
Nonparametric Regression Estimation for Multivariate Null Recurrent Processes
Directory of Open Access Journals (Sweden)
Biqing Cai
2015-04-01
Full Text Available This paper discusses nonparametric kernel regression with the regressor being a \\(d\\-dimensional \\(\\beta\\-null recurrent process in presence of conditional heteroscedasticity. We show that the mean function estimator is consistent with convergence rate \\(\\sqrt{n(Th^{d}}\\, where \\(n(T\\ is the number of regenerations for a \\(\\beta\\-null recurrent process and the limiting distribution (with proper normalization is normal. Furthermore, we show that the two-step estimator for the volatility function is consistent. The finite sample performance of the estimate is quite reasonable when the leave-one-out cross validation method is used for bandwidth selection. We apply the proposed method to study the relationship of Federal funds rate with 3-month and 5-year T-bill rates and discover the existence of nonlinearity of the relationship. Furthermore, the in-sample and out-of-sample performance of the nonparametric model is far better than the linear model.
Hierarchical Matching and Regression with Application to Photometric Redshift Estimation
Murtagh, Fionn
2017-06-01
This work emphasizes that heterogeneity, diversity, discontinuity, and discreteness in data is to be exploited in classification and regression problems. A global a priori model may not be desirable. For data analytics in cosmology, this is motivated by the variety of cosmological objects such as elliptical, spiral, active, and merging galaxies at a wide range of redshifts. Our aim is matching and similarity-based analytics that takes account of discrete relationships in the data. The information structure of the data is represented by a hierarchy or tree where the branch structure, rather than just the proximity, is important. The representation is related to p-adic number theory. The clustering or binning of the data values, related to the precision of the measurements, has a central role in this methodology. If used for regression, our approach is a method of cluster-wise regression, generalizing nearest neighbour regression. Both to exemplify this analytics approach, and to demonstrate computational benefits, we address the well-known photometric redshift or `photo-z' problem, seeking to match Sloan Digital Sky Survey (SDSS) spectroscopic and photometric redshifts.
Estimating Frequency by Interpolation Using Least Squares Support Vector Regression
Directory of Open Access Journals (Sweden)
Changwei Ma
2015-01-01
Full Text Available Discrete Fourier transform- (DFT- based maximum likelihood (ML algorithm is an important part of single sinusoid frequency estimation. As signal to noise ratio (SNR increases and is above the threshold value, it will lie very close to Cramer-Rao lower bound (CRLB, which is dependent on the number of DFT points. However, its mean square error (MSE performance is directly proportional to its calculation cost. As a modified version of support vector regression (SVR, least squares SVR (LS-SVR can not only still keep excellent capabilities for generalizing and fitting but also exhibit lower computational complexity. In this paper, therefore, LS-SVR is employed to interpolate on Fourier coefficients of received signals and attain high frequency estimation accuracy. Our results show that the proposed algorithm can make a good compromise between calculation cost and MSE performance under the assumption that the sample size, number of DFT points, and resampling points are already known.
Estimation of adjusted rate differences using additive negative binomial regression.
Donoghoe, Mark W; Marschner, Ian C
2016-08-15
Rate differences are an important effect measure in biostatistics and provide an alternative perspective to rate ratios. When the data are event counts observed during an exposure period, adjusted rate differences may be estimated using an identity-link Poisson generalised linear model, also known as additive Poisson regression. A problem with this approach is that the assumption of equality of mean and variance rarely holds in real data, which often show overdispersion. An additive negative binomial model is the natural alternative to account for this; however, standard model-fitting methods are often unable to cope with the constrained parameter space arising from the non-negativity restrictions of the additive model. In this paper, we propose a novel solution to this problem using a variant of the expectation-conditional maximisation-either algorithm. Our method provides a reliable way to fit an additive negative binomial regression model and also permits flexible generalisations using semi-parametric regression functions. We illustrate the method using a placebo-controlled clinical trial of fenofibrate treatment in patients with type II diabetes, where the outcome is the number of laser therapy courses administered to treat diabetic retinopathy. An R package is available that implements the proposed method. Copyright © 2016 John Wiley & Sons, Ltd. Copyright © 2016 John Wiley & Sons, Ltd.
Genomic breeding value estimation using nonparametric additive regression models
Directory of Open Access Journals (Sweden)
Solberg Trygve
2009-01-01
Full Text Available Abstract Genomic selection refers to the use of genomewide dense markers for breeding value estimation and subsequently for selection. The main challenge of genomic breeding value estimation is the estimation of many effects from a limited number of observations. Bayesian methods have been proposed to successfully cope with these challenges. As an alternative class of models, non- and semiparametric models were recently introduced. The present study investigated the ability of nonparametric additive regression models to predict genomic breeding values. The genotypes were modelled for each marker or pair of flanking markers (i.e. the predictors separately. The nonparametric functions for the predictors were estimated simultaneously using additive model theory, applying a binomial kernel. The optimal degree of smoothing was determined by bootstrapping. A mutation-drift-balance simulation was carried out. The breeding values of the last generation (genotyped was predicted using data from the next last generation (genotyped and phenotyped. The results show moderate to high accuracies of the predicted breeding values. A determination of predictor specific degree of smoothing increased the accuracy.
Image Jacobian Matrix Estimation Based on Online Support Vector Regression
Directory of Open Access Journals (Sweden)
Shangqin Mao
2012-10-01
Full Text Available Research into robotics visual servoing is an important area in the field of robotics. It has proven difficult to achieve successful results for machine vision and robotics in unstructured environments without using any a priori camera or kinematic models. In uncalibrated visual servoing, image Jacobian matrix estimation methods can be divided into two groups: the online method and the offline method. The offline method is not appropriate for most natural environments. The online method is robust but rough. Moreover, if the images feature configuration changes, it needs to restart the approximating procedure. A novel approach based on an online support vector regression (OL-SVR algorithm is proposed which overcomes the drawbacks and combines the virtues just mentioned.
Nonparametric Estimation of Regression Parameters in Measurement Error Models
Czech Academy of Sciences Publication Activity Database
Ehsanes Saleh, A.K.M.D.; Picek, J.; Kalina, Jan
2009-01-01
Roč. 67, č. 2 (2009), s. 177-200 ISSN 0026-1424 Grant - others:GA AV ČR(CZ) IAA101120801; GA MŠk(CZ) LC06024 Institutional research plan: CEZ:AV0Z10300504 Keywords : asymptotic relative efficiency(ARE) * asymptotic theory * emaculate mode * Me model * R-estimation * Reliabilty ratio(RR) Subject RIV: BB - Applied Statistics, Operational Research
Regression and kriging analysis for grid power factor estimation
Directory of Open Access Journals (Sweden)
Rajesh Guntaka
2014-12-01
Full Text Available The measurement of power factor (PF in electrical utility grids is a mainstay of load balancing and is also a critical element of transmission and distribution efficiency. The measurement of PF dates back to the earliest periods of electrical power distribution to public grids. In the wide-area distribution grid, measurement of current waveforms is trivial and may be accomplished at any point in the grid using a current tap transformer. However, voltage measurement requires reference to ground and so is more problematic and measurements are normally constrained to points that have ready and easy access to a ground source. We present two mathematical analysis methods based on kriging and linear least square estimation (LLSE (regression to derive PF at nodes with unknown voltages that are within a perimeter of sample nodes with ground reference across a selected power grid. Our results indicate an error average of 1.884% that is within acceptable tolerances for PF measurements that are used in load balancing tasks.
Efficient Smoothed Concomitant Lasso Estimation for High Dimensional Regression
Ndiaye, Eugene; Fercoq, Olivier; Gramfort, Alexandre; Leclère, Vincent; Salmon, Joseph
2017-10-01
In high dimensional settings, sparse structures are crucial for efficiency, both in term of memory, computation and performance. It is customary to consider ℓ 1 penalty to enforce sparsity in such scenarios. Sparsity enforcing methods, the Lasso being a canonical example, are popular candidates to address high dimension. For efficiency, they rely on tuning a parameter trading data fitting versus sparsity. For the Lasso theory to hold this tuning parameter should be proportional to the noise level, yet the latter is often unknown in practice. A possible remedy is to jointly optimize over the regression parameter as well as over the noise level. This has been considered under several names in the literature: Scaled-Lasso, Square-root Lasso, Concomitant Lasso estimation for instance, and could be of interest for uncertainty quantification. In this work, after illustrating numerical difficulties for the Concomitant Lasso formulation, we propose a modification we coined Smoothed Concomitant Lasso, aimed at increasing numerical stability. We propose an efficient and accurate solver leading to a computational cost no more expensive than the one for the Lasso. We leverage on standard ingredients behind the success of fast Lasso solvers: a coordinate descent algorithm, combined with safe screening rules to achieve speed efficiency, by eliminating early irrelevant features.
DEFF Research Database (Denmark)
Sharifzadeh, Sara; Skytte, Jacob Lercke; Nielsen, Otto Højager Attermann
2012-01-01
Statistical solutions find wide spread use in food and medicine quality control. We investigate the effect of different regression and sparse regression methods for a viscosity estimation problem using the spectro-temporal features from new Sub-Surface Laser Scattering (SLS) vision system. From...... with sparse LAR, lasso and Elastic Net (EN) sparse regression methods. Due to the inconsistent measurement condition, Locally Weighted Scatter plot Smoothing (Loess) has been employed to alleviate the undesired variation in the estimated viscosity. The experimental results of applying different methods show...
Magnitude conversion to unified moment magnitude using orthogonal regression relation
Das, Ranjit; Wason, H. R.; Sharma, M. L.
2012-05-01
Homogenization of earthquake catalog being a pre-requisite for seismic hazard assessment requires region based magnitude conversion relationships. Linear Standard Regression (SR) relations fail when both the magnitudes have measurement errors. To accomplish homogenization, techniques like Orthogonal Standard Regression (OSR) are thus used. In this paper a technique is proposed for using such OSR for preparation of homogenized earthquake catalog in moment magnitude Mw. For derivation of orthogonal regression relation between mb and Mw, a data set consisting of 171 events with observed body wave magnitudes (mb,obs) and moment magnitude (Mw,obs) values has been taken from ISC and GCMT databases for Northeast India and adjoining region for the period 1978-2006. Firstly, an OSR relation given below has been developed using mb,obs and Mw,obs values corresponding to 150 events from this data set. M=1.3(±0.004)m-1.4(±0.130), where mb,proxy are body wave magnitude values of the points on the OSR line given by the orthogonality criterion, for observed (mb,obs, Mw,obs) points. A linear relation is then developed between these 150 mb,obs values and corresponding mb,proxy values given by the OSR line using orthogonality criterion. The relation obtained is m=0.878(±0.03)m+0.653(±0.15). The accuracy of the above procedure has been checked with the rest of the data i.e., 21 events values. The improvement in the correlation coefficient value between mb,obs and Mw estimated using the proposed procedure compared to the correlation coefficient value between mb,obs and Mw,obs shows the advantage of OSR relationship for homogenization. The OSR procedure developed in this study can be used to homogenize any catalog containing various magnitudes (e.g., ML, mb, MS) with measurement errors, by their conversion to unified moment magnitude Mw. The proposed procedure also remains valid in case the magnitudes have measurement errors of different orders, i.e. the error variance ratio is
Bulcock, J. W.; And Others
Multicollinearity refers to the presence of highly intercorrelated independent variables in structural equation models, that is, models estimated by using techniques such as least squares regression and maximum likelihood. There is a problem of multicollinearity in both the natural and social sciences where theory formulation and estimation is in…
Directory of Open Access Journals (Sweden)
Danilo F Pereira
2011-02-01
Full Text Available The increasing demand of consumer markets for the welfare of birds in poultry house has motivated many scientific researches to monitor and classify the welfare according to the production environment. Given the complexity between the birds and the environment of the aviary, the correct interpretation of the conduct becomes an important way to estimate the welfare of these birds. This study obtained multiple logistic regression models with capacity of estimating the welfare of broiler breeders in relation to the environment of the aviaries and behaviors expressed by the birds. In the experiment, were observed several behaviors expressed by breeders housed in a climatic chamber under controlled temperatures and three different ammonia concentrations from the air monitored daily. From the analysis of the data it was obtained two logistic regression models, of which the first model uses a value of ammonia concentration measured by unit and the second model uses a binary value to classify the ammonia concentration that is assigned by a person through his olfactory perception. The analysis showed that both models classified the broiler breeder's welfare successfully.As crescentes demandas e exigências dos mercados consumidores pelo bem-estar das aves nos aviários têm motivado diversas pesquisas científicas a monitorar e a classificar o bem-estar em função do ambiente de criação. Diante da complexidade com que as aves interagem com o ambiente do aviário, a correta interpretação dos comportamentos torna-se uma importante maneira para estimar o bem-estar dessas aves. Este trabalho criou modelos de regressão logística múltipla capazes de estimar o bem-estar de matrizes pesadas em função do ambiente do aviário e dos comportamentos expressos pelas aves. No experimento, foram observados diversos comportamentos expressos por matrizes pesadas alojadas em câmara climática sob três temperaturas controladas e diferentes concentrações de am
Ariffin, Syaiba Balqish; Midi, Habshah
2014-06-01
This article is concerned with the performance of logistic ridge regression estimation technique in the presence of multicollinearity and high leverage points. In logistic regression, multicollinearity exists among predictors and in the information matrix. The maximum likelihood estimator suffers a huge setback in the presence of multicollinearity which cause regression estimates to have unduly large standard errors. To remedy this problem, a logistic ridge regression estimator is put forward. It is evident that the logistic ridge regression estimator outperforms the maximum likelihood approach for handling multicollinearity. The effect of high leverage points are then investigated on the performance of the logistic ridge regression estimator through real data set and simulation study. The findings signify that logistic ridge regression estimator fails to provide better parameter estimates in the presence of both high leverage points and multicollinearity.
Bias in regression coefficient estimates upon different treatments of ...
African Journals Online (AJOL)
MS and PW consistently overestimated the population parameter. EM and RI, on the other hand, tended to consistently underestimate the population parameter under non-monotonic pattern. Keywords: Missing data, bias, regression, percent missing, non-normality, missing pattern > East African Journal of Statistics Vol.
Stellar atmospheric parameter estimation using Gaussian process regression
Bu, Yude; Pan, Jingchang
2015-02-01
As is well known, it is necessary to derive stellar parameters from massive amounts of spectral data automatically and efficiently. However, in traditional automatic methods such as artificial neural networks (ANNs) and kernel regression (KR), it is often difficult to optimize the algorithm structure and determine the optimal algorithm parameters. Gaussian process regression (GPR) is a recently developed method that has been proven to be capable of overcoming these difficulties. Here we apply GPR to derive stellar atmospheric parameters from spectra. Through evaluating the performance of GPR on Sloan Digital Sky Survey (SDSS) spectra, Medium resolution Isaac Newton Telescope Library of Empirical Spectra (MILES) spectra, ELODIE spectra and the spectra of member stars of galactic globular clusters, we conclude that GPR can derive stellar parameters accurately and precisely, especially when we use data preprocessed with principal component analysis (PCA). We then compare the performance of GPR with that of several widely used regression methods (ANNs, support-vector regression and KR) and find that with GPR it is easier to optimize structures and parameters and more efficient and accurate to extract atmospheric parameters.
Estimation of Production KWS Maize Hybrids Using Nonlinear Regression
Directory of Open Access Journals (Sweden)
Florica MORAR
2018-06-01
Full Text Available This article approaches the model of non-linear regression and the method of smallest squares with examples, including calculations for the model of logarithmic function. This required data obtained from a study which involved the observation of the phases of growth and development in KWS maize hybrids in order to analyze the influence of the MMB quality indicator on grain production per hectare.
Regularized Regression and Density Estimation based on Optimal Transport
Burger, M.; Franek, M.; Schonlieb, C.-B.
2012-01-01
for estimating densities and for preserving edges in the case of total variation regularization. In order to compute solutions of the variational problems, a regularized optimal transport problem needs to be solved, for which we discuss several formulations
Estimating life expectancies for US small areas: a regression framework
Congdon, Peter
2014-01-01
Analysis of area mortality variations and estimation of area life tables raise methodological questions relevant to assessing spatial clustering, and socioeconomic inequalities in mortality. Existing small area analyses of US life expectancy variation generally adopt ad hoc amalgamations of counties to alleviate potential instability of mortality rates involved in deriving life tables, and use conventional life table analysis which takes no account of correlated mortality for adjacent areas or ages. The alternative strategy here uses structured random effects methods that recognize correlations between adjacent ages and areas, and allows retention of the original county boundaries. This strategy generalizes to include effects of area category (e.g. poverty status, ethnic mix), allowing estimation of life tables according to area category, and providing additional stabilization of estimated life table functions. This approach is used here to estimate stabilized mortality rates, derive life expectancies in US counties, and assess trends in clustering and in inequality according to county poverty category.
Allelic drop-out probabilities estimated by logistic regression
DEFF Research Database (Denmark)
Tvedebrink, Torben; Eriksen, Poul Svante; Asplund, Maria
2012-01-01
We discuss the model for estimating drop-out probabilities presented by Tvedebrink et al. [7] and the concerns, that have been raised. The criticism of the model has demonstrated that the model is not perfect. However, the model is very useful for advanced forensic genetic work, where allelic drop-out...... is occurring. With this discussion, we hope to improve the drop-out model, so that it can be used for practical forensic genetics and stimulate further discussions. We discuss how to estimate drop-out probabilities when using a varying number of PCR cycles and other experimental conditions....
Directory of Open Access Journals (Sweden)
Hsin-Lun Wu
Full Text Available Although procedure time analyses are important for operating room management, it is not easy to extract useful information from clinical procedure time data. A novel approach was proposed to analyze procedure time during anesthetic induction. A two-step regression analysis was performed to explore influential factors of anesthetic induction time (AIT. Linear regression with stepwise model selection was used to select significant correlates of AIT and then quantile regression was employed to illustrate the dynamic relationships between AIT and selected variables at distinct quantiles. A total of 1,060 patients were analyzed. The first and second-year residents (R1-R2 required longer AIT than the third and fourth-year residents and attending anesthesiologists (p = 0.006. Factors prolonging AIT included American Society of Anesthesiologist physical status ≧ III, arterial, central venous and epidural catheterization, and use of bronchoscopy. Presence of surgeon before induction would decrease AIT (p < 0.001. Types of surgery also had significant influence on AIT. Quantile regression satisfactorily estimated extra time needed to complete induction for each influential factor at distinct quantiles. Our analysis on AIT demonstrated the benefit of quantile regression analysis to provide more comprehensive view of the relationships between procedure time and related factors. This novel two-step regression approach has potential applications to procedure time analysis in operating room management.
On the estimation of the degree of regression polynomial
International Nuclear Information System (INIS)
Toeroek, Cs.
1997-01-01
The mathematical functions most commonly used to model curvature in plots are polynomials. Generally, the higher the degree of the polynomial, the more complex is the trend that its graph can represent. We propose a new statistical-graphical approach based on the discrete projective transformation (DPT) to estimating the degree of polynomial that adequately describes the trend in the plot
Regularized Regression and Density Estimation based on Optimal Transport
Burger, M.
2012-03-11
The aim of this paper is to investigate a novel nonparametric approach for estimating and smoothing density functions as well as probability densities from discrete samples based on a variational regularization method with the Wasserstein metric as a data fidelity. The approach allows a unified treatment of discrete and continuous probability measures and is hence attractive for various tasks. In particular, the variational model for special regularization functionals yields a natural method for estimating densities and for preserving edges in the case of total variation regularization. In order to compute solutions of the variational problems, a regularized optimal transport problem needs to be solved, for which we discuss several formulations and provide a detailed analysis. Moreover, we compute special self-similar solutions for standard regularization functionals and we discuss several computational approaches and results. © 2012 The Author(s).
A SAS-macro for estimation of the cumulative incidence using Poisson regression
DEFF Research Database (Denmark)
Waltoft, Berit Lindum
2009-01-01
the hazard rates, and the hazard rates are often estimated by the Cox regression. This procedure may not be suitable for large studies due to limited computer resources. Instead one uses Poisson regression, which approximates the Cox regression. Rosthøj et al. presented a SAS-macro for the estimation...... of the cumulative incidences based on the Cox regression. I present the functional form of the probabilities and variances when using piecewise constant hazard rates and a SAS-macro for the estimation using Poisson regression. The use of the macro is demonstrated through examples and compared to the macro presented...
Mass estimation of loose parts in nuclear power plant based on multiple regression
International Nuclear Information System (INIS)
He, Yuanfeng; Cao, Yanlong; Yang, Jiangxin; Gan, Chunbiao
2012-01-01
According to the application of the Hilbert–Huang transform to the non-stationary signal and the relation between the mass of loose parts in nuclear power plant and corresponding frequency content, a new method for loose part mass estimation based on the marginal Hilbert–Huang spectrum (MHS) and multiple regression is proposed in this paper. The frequency spectrum of a loose part in a nuclear power plant can be expressed by the MHS. The multiple regression model that is constructed by the MHS feature of the impact signals for mass estimation is used to predict the unknown masses of a loose part. A simulated experiment verified that the method is feasible and the errors of the results are acceptable. (paper)
[Hyperspectral Estimation of Apple Tree Canopy LAI Based on SVM and RF Regression].
Han, Zhao-ying; Zhu, Xi-cun; Fang, Xian-yi; Wang, Zhuo-yuan; Wang, Ling; Zhao, Geng-Xing; Jiang, Yuan-mao
2016-03-01
Leaf area index (LAI) is the dynamic index of crop population size. Hyperspectral technology can be used to estimate apple canopy LAI rapidly and nondestructively. It can be provide a reference for monitoring the tree growing and yield estimation. The Red Fuji apple trees of full bearing fruit are the researching objects. Ninety apple trees canopies spectral reflectance and LAI values were measured by the ASD Fieldspec3 spectrometer and LAI-2200 in thirty orchards in constant two years in Qixia research area of Shandong Province. The optimal vegetation indices were selected by the method of correlation analysis of the original spectral reflectance and vegetation indices. The models of predicting the LAI were built with the multivariate regression analysis method of support vector machine (SVM) and random forest (RF). The new vegetation indices, GNDVI527, ND-VI676, RVI682, FD-NVI656 and GRVI517 and the previous two main vegetation indices, NDVI670 and NDVI705, are in accordance with LAI. In the RF regression model, the calibration set decision coefficient C-R2 of 0.920 and validation set decision coefficient V-R2 of 0.889 are higher than the SVM regression model by 0.045 and 0.033 respectively. The root mean square error of calibration set C-RMSE of 0.249, the root mean square error validation set V-RMSE of 0.236 are lower than that of the SVM regression model by 0.054 and 0.058 respectively. Relative analysis of calibrating error C-RPD and relative analysis of validation set V-RPD reached 3.363 and 2.520, 0.598 and 0.262, respectively, which were higher than the SVM regression model. The measured and predicted the scatterplot trend line slope of the calibration set and validation set C-S and V-S are close to 1. The estimation result of RF regression model is better than that of the SVM. RF regression model can be used to estimate the LAI of red Fuji apple trees in full fruit period.
Directory of Open Access Journals (Sweden)
Rajesh D. R
2015-01-01
Full Text Available Background: At times fragments of soft tissues are found disposed off in the open, in ditches at the crime scene and the same are brought to forensic experts for the purpose of identification and such type of cases pose a real challenge. Objectives: This study was aimed at developing a methodology which could help in personal identification by studying the relation between foot dimensions and stature among south subjects using regression models. Material and Methods: Stature and foot length of 100 subjects (age range 18-22 years were measured. Linear regression equations for stature estimation were calculated. Result: The correlation coefficients between stature and foot lengths were found to be positive and statistically significant. Height = 98.159 + 3.746 × FLRT (r = 0.821 and Height = 91.242 + 3.284 × FLRT (r = 0.837 are the regression formulas from foot lengths for males and females respectively. Conclusion: The regression equation derived in the study can be used reliably for estimation of stature in a diverse population group thus would be of immense value in the field of personal identification especially from mutilated bodies or fragmentary remains.
Directory of Open Access Journals (Sweden)
Menon Carlo
2011-09-01
Full Text Available Abstract Background Several regression models have been proposed for estimation of isometric joint torque using surface electromyography (SEMG signals. Common issues related to torque estimation models are degradation of model accuracy with passage of time, electrode displacement, and alteration of limb posture. This work compares the performance of the most commonly used regression models under these circumstances, in order to assist researchers with identifying the most appropriate model for a specific biomedical application. Methods Eleven healthy volunteers participated in this study. A custom-built rig, equipped with a torque sensor, was used to measure isometric torque as each volunteer flexed and extended his wrist. SEMG signals from eight forearm muscles, in addition to wrist joint torque data were gathered during the experiment. Additional data were gathered one hour and twenty-four hours following the completion of the first data gathering session, for the purpose of evaluating the effects of passage of time and electrode displacement on accuracy of models. Acquired SEMG signals were filtered, rectified, normalized and then fed to models for training. Results It was shown that mean adjusted coefficient of determination (Ra2 values decrease between 20%-35% for different models after one hour while altering arm posture decreased mean Ra2 values between 64% to 74% for different models. Conclusions Model estimation accuracy drops significantly with passage of time, electrode displacement, and alteration of limb posture. Therefore model retraining is crucial for preserving estimation accuracy. Data resampling can significantly reduce model training time without losing estimation accuracy. Among the models compared, ordinary least squares linear regression model (OLS was shown to have high isometric torque estimation accuracy combined with very short training times.
Francisco, Fabiane Lacerda; Saviano, Alessandro Morais; Almeida, Túlia de Souza Botelho; Lourenço, Felipe Rebello
2016-05-01
Microbiological assays are widely used to estimate the relative potencies of antibiotics in order to guarantee the efficacy, safety, and quality of drug products. Despite of the advantages of turbidimetric bioassays when compared to other methods, it has limitations concerning the linearity and range of the dose-response curve determination. Here, we proposed to use partial least squares (PLS) regression to solve these limitations and to improve the prediction of relative potencies of antibiotics. Kinetic-reading microplate turbidimetric bioassays for apramacyin and vancomycin were performed using Escherichia coli (ATCC 8739) and Bacillus subtilis (ATCC 6633), respectively. Microbial growths were measured as absorbance up to 180 and 300min for apramycin and vancomycin turbidimetric bioassays, respectively. Conventional dose-response curves (absorbances or area under the microbial growth curve vs. log of antibiotic concentration) showed significant regression, however there were significant deviation of linearity. Thus, they could not be used for relative potency estimations. PLS regression allowed us to construct a predictive model for estimating the relative potencies of apramycin and vancomycin without over-fitting and it improved the linear range of turbidimetric bioassay. In addition, PLS regression provided predictions of relative potencies equivalent to those obtained from agar diffusion official methods. Therefore, we conclude that PLS regression may be used to estimate the relative potencies of antibiotics with significant advantages when compared to conventional dose-response curve determination. Copyright © 2016 Elsevier B.V. All rights reserved.
Tightness of M-estimators for multiple linear regression in time series
DEFF Research Database (Denmark)
Johansen, Søren; Nielsen, Bent
We show tightness of a general M-estimator for multiple linear regression in time series. The positive criterion function for the M-estimator is assumed lower semi-continuous and sufficiently large for large argument: Particular cases are the Huber-skip and quantile regression. Tightness requires...
Estimation of monthly solar exposure on horizontal surface by Angstrom-type regression equation
International Nuclear Information System (INIS)
Ravanshid, S.H.
1981-01-01
To obtain solar flux intensity, solar radiation measuring instruments are the best. In the absence of instrumental data there are other meteorological measurements which are related to solar energy and also it is possible to use empirical relationships to estimate solar flux intensit. One of these empirical relationships to estimate monthly averages of total solar radiation on a horizontal surface is the modified angstrom-type regression equation which has been employed in this report in order to estimate the solar flux intensity on a horizontal surface for Tehran. By comparing the results of this equation with four years measured valued by Tehran's meteorological weather station the values of meteorological constants (a,b) in the equation were obtained for Tehran. (author)
Outlier Detection in Regression Using an Iterated One-Step Approximation to the Huber-Skip Estimator
DEFF Research Database (Denmark)
Johansen, Søren; Nielsen, Bent
2013-01-01
In regression we can delete outliers based upon a preliminary estimator and reestimate the parameters by least squares based upon the retained observations. We study the properties of an iteratively defined sequence of estimators based on this idea. We relate the sequence to the Huber-skip estima......In regression we can delete outliers based upon a preliminary estimator and reestimate the parameters by least squares based upon the retained observations. We study the properties of an iteratively defined sequence of estimators based on this idea. We relate the sequence to the Huber...... that the normalized estimation errors are tight and are close to a linear function of the kernel, thus providing a stochastic expansion of the estimators, which is the same as for the Huber-skip. This implies that the iterated estimator is a close approximation of the Huber-skip...
Comparison of Classical and Robust Estimates of Threshold Auto-regression Parameters
Directory of Open Access Journals (Sweden)
V. B. Goryainov
2017-01-01
Full Text Available The study object is the first-order threshold auto-regression model with a single zero-located threshold. The model describes a stochastic temporal series with discrete time by means of a piecewise linear equation consisting of two linear classical first-order autoregressive equations. One of these equations is used to calculate a running value of the temporal series. A control variable that determines the choice between these two equations is the sign of the previous value of the same series.The first-order threshold autoregressive model with a single threshold depends on two real parameters that coincide with the coefficients of the piecewise linear threshold equation. These parameters are assumed to be unknown. The paper studies an estimate of the least squares, an estimate the least modules, and the M-estimates of these parameters. The aim of the paper is a comparative study of the accuracy of these estimates for the main probabilistic distributions of the updating process of the threshold autoregressive equation. These probability distributions were normal, contaminated normal, logistic, double-exponential distributions, a Student's distribution with different number of degrees of freedom, and a Cauchy distribution.As a measure of the accuracy of each estimate, was chosen its variance to measure the scattering of the estimate around the estimated parameter. An estimate with smaller variance made from the two estimates was considered to be the best. The variance was estimated by computer simulation. To estimate the smallest modules an iterative weighted least-squares method was used and the M-estimates were done by the method of a deformable polyhedron (the Nelder-Mead method. To calculate the least squares estimate, an explicit analytic expression was used.It turned out that the estimation of least squares is best only with the normal distribution of the updating process. For the logistic distribution and the Student's distribution with the
Sidik, S. M.
1975-01-01
Ridge, Marquardt's generalized inverse, shrunken, and principal components estimators are discussed in terms of the objectives of point estimation of parameters, estimation of the predictive regression function, and hypothesis testing. It is found that as the normal equations approach singularity, more consideration must be given to estimable functions of the parameters as opposed to estimation of the full parameter vector; that biased estimators all introduce constraints on the parameter space; that adoption of mean squared error as a criterion of goodness should be independent of the degree of singularity; and that ordinary least-squares subset regression is the best overall method.
Zahari, Siti Meriam; Ramli, Norazan Mohamed; Moktar, Balkiah; Zainol, Mohammad Said
2014-09-01
In the presence of multicollinearity and multiple outliers, statistical inference of linear regression model using ordinary least squares (OLS) estimators would be severely affected and produces misleading results. To overcome this, many approaches have been investigated. These include robust methods which were reported to be less sensitive to the presence of outliers. In addition, ridge regression technique was employed to tackle multicollinearity problem. In order to mitigate both problems, a combination of ridge regression and robust methods was discussed in this study. The superiority of this approach was examined when simultaneous presence of multicollinearity and multiple outliers occurred in multiple linear regression. This study aimed to look at the performance of several well-known robust estimators; M, MM, RIDGE and robust ridge regression estimators, namely Weighted Ridge M-estimator (WRM), Weighted Ridge MM (WRMM), Ridge MM (RMM), in such a situation. Results of the study showed that in the presence of simultaneous multicollinearity and multiple outliers (in both x and y-direction), the RMM and RIDGE are more or less similar in terms of superiority over the other estimators, regardless of the number of observation, level of collinearity and percentage of outliers used. However, when outliers occurred in only single direction (y-direction), the WRMM estimator is the most superior among the robust ridge regression estimators, by producing the least variance. In conclusion, the robust ridge regression is the best alternative as compared to robust and conventional least squares estimators when dealing with simultaneous presence of multicollinearity and outliers.
truncSP: An R Package for Estimation of Semi-Parametric Truncated Linear Regression Models
Directory of Open Access Journals (Sweden)
Maria Karlsson
2014-05-01
Full Text Available Problems with truncated data occur in many areas, complicating estimation and inference. Regarding linear regression models, the ordinary least squares estimator is inconsistent and biased for these types of data and is therefore unsuitable for use. Alternative estimators, designed for the estimation of truncated regression models, have been developed. This paper presents the R package truncSP. The package contains functions for the estimation of semi-parametric truncated linear regression models using three different estimators: the symmetrically trimmed least squares, quadratic mode, and left truncated estimators, all of which have been shown to have good asymptotic and ?nite sample properties. The package also provides functions for the analysis of the estimated models. Data from the environmental sciences are used to illustrate the functions in the package.
On the relation between S-Estimators and M-Estimators of multivariate location and covariance
Lopuhaa, H.P.
1987-01-01
We discuss the relation between S-estimators and M-estimators of multivariate location and covariance. As in the case of the estimation of a multiple regression parameter, S-estimators are shown to satisfy first-order conditions of M-estimators. We show that the influence function IF (x;S F) of
The efficiency of modified jackknife and ridge type regression estimators: a comparison
Directory of Open Access Journals (Sweden)
Sharad Damodar Gore
2008-09-01
Full Text Available A common problem in multiple regression models is multicollinearity, which produces undesirable effects on the least squares estimator. To circumvent this problem, two well known estimation procedures are often suggested in the literature. They are Generalized Ridge Regression (GRR estimation suggested by Hoerl and Kennard iteb8 and the Jackknifed Ridge Regression (JRR estimation suggested by Singh et al. iteb13. The GRR estimation leads to a reduction in the sampling variance, whereas, JRR leads to a reduction in the bias. In this paper, we propose a new estimator namely, Modified Jackknife Ridge Regression Estimator (MJR. It is based on the criterion that combines the ideas underlying both the GRR and JRR estimators. We have investigated standard properties of this new estimator. From a simulation study, we find that the new estimator often outperforms the LASSO, and it is superior to both GRR and JRR estimators, using the mean squared error criterion. The conditions under which the MJR estimator is better than the other two competing estimators have been investigated.
Krishan, Kewal; Kanchan, Tanuj; Sharma, Abhilasha
2012-05-01
Estimation of stature is an important parameter in identification of human remains in forensic examinations. The present study is aimed to compare the reliability and accuracy of stature estimation and to demonstrate the variability in estimated stature and actual stature using multiplication factor and regression analysis methods. The study is based on a sample of 246 subjects (123 males and 123 females) from North India aged between 17 and 20 years. Four anthropometric measurements; hand length, hand breadth, foot length and foot breadth taken on the left side in each subject were included in the study. Stature was measured using standard anthropometric techniques. Multiplication factors were calculated and linear regression models were derived for estimation of stature from hand and foot dimensions. Derived multiplication factors and regression formula were applied to the hand and foot measurements in the study sample. The estimated stature from the multiplication factors and regression analysis was compared with the actual stature to find the error in estimated stature. The results indicate that the range of error in estimation of stature from regression analysis method is less than that of multiplication factor method thus, confirming that the regression analysis method is better than multiplication factor analysis in stature estimation. Copyright © 2012 Elsevier Ltd and Faculty of Forensic and Legal Medicine. All rights reserved.
International Nuclear Information System (INIS)
Bae, In Ho; Naa, Man Gyun; Lee, Yoon Joon; Park, Goon Cherl
2009-01-01
The monitoring of detailed 3-dimensional (3D) reactor core power distribution is a prerequisite in the operation of nuclear power reactors to ensure that various safety limits imposed on the LPD and DNBR, are not violated during nuclear power reactor operation. The LPD and DNBR should be calculated in order to perform the two major functions of the core protection calculator system (CPCS) and the core operation limit supervisory system (COLSS). The LPD at the hottest part of a hot fuel rod, which is related to the power peaking factor (PPF, F q ), is more important than the LPD at any other position in a reactor core. The LPD needs to be estimated accurately to prevent nuclear fuel rods from melting. In this study, support vector regression (SVR) and uncertainty analysis have been applied to estimation of reactor core power peaking factor
Estimation of Electrically-Evoked Knee Torque from Mechanomyography Using Support Vector Regression
Directory of Open Access Journals (Sweden)
Morufu Olusola Ibitoye
2016-07-01
Full Text Available The difficulty of real-time muscle force or joint torque estimation during neuromuscular electrical stimulation (NMES in physical therapy and exercise science has motivated recent research interest in torque estimation from other muscle characteristics. This study investigated the accuracy of a computational intelligence technique for estimating NMES-evoked knee extension torque based on the Mechanomyographic signals (MMG of contracting muscles that were recorded from eight healthy males. Simulation of the knee torque was modelled via Support Vector Regression (SVR due to its good generalization ability in related fields. Inputs to the proposed model were MMG amplitude characteristics, the level of electrical stimulation or contraction intensity, and knee angle. Gaussian kernel function, as well as its optimal parameters were identified with the best performance measure and were applied as the SVR kernel function to build an effective knee torque estimation model. To train and test the model, the data were partitioned into training (70% and testing (30% subsets, respectively. The SVR estimation accuracy, based on the coefficient of determination (R2 between the actual and the estimated torque values was up to 94% and 89% during the training and testing cases, with root mean square errors (RMSE of 9.48 and 12.95, respectively. The knee torque estimations obtained using SVR modelling agreed well with the experimental data from an isokinetic dynamometer. These findings support the realization of a closed-loop NMES system for functional tasks using MMG as the feedback signal source and an SVR algorithm for joint torque estimation.
Estimating nonlinear selection gradients using quadratic regression coefficients: double or nothing?
Stinchcombe, John R; Agrawal, Aneil F; Hohenlohe, Paul A; Arnold, Stevan J; Blows, Mark W
2008-09-01
The use of regression analysis has been instrumental in allowing evolutionary biologists to estimate the strength and mode of natural selection. Although directional and correlational selection gradients are equal to their corresponding regression coefficients, quadratic regression coefficients must be doubled to estimate stabilizing/disruptive selection gradients. Based on a sample of 33 papers published in Evolution between 2002 and 2007, at least 78% of papers have not doubled quadratic regression coefficients, leading to an appreciable underestimate of the strength of stabilizing and disruptive selection. Proper treatment of quadratic regression coefficients is necessary for estimation of fitness surfaces and contour plots, canonical analysis of the gamma matrix, and modeling the evolution of populations on an adaptive landscape.
Relative Importance for Linear Regression in R: The Package relaimpo
Directory of Open Access Journals (Sweden)
Ulrike Gromping
2006-09-01
Full Text Available Relative importance is a topic that has seen a lot of interest in recent years, particularly in applied work. The R package relaimpo implements six different metrics for assessing relative importance of regressors in the linear model, two of which are recommended - averaging over orderings of regressors and a newly proposed metric (Feldman 2005 called pmvd. Apart from delivering the metrics themselves, relaimpo also provides (exploratory bootstrap confidence intervals. This paper offers a brief tutorial introduction to the package. The methods and relaimpo’s functionality are illustrated using the data set swiss that is generally available in R. The paper targets readers who have a basic understanding of multiple linear regression. For the background of more advanced aspects, references are provided.
U.S. Environmental Protection Agency — Population-based estimates of pesticide intake are needed to characterize exposure for particular demographic groups based on their dietary behaviors. Regression...
Simultaneous Estimation of Regression Functions for Marine Corps Technical Training Specialties.
Dunbar, Stephen B.; And Others
This paper considers the application of Bayesian techniques for simultaneous estimation to the specification of regression weights for selection tests used in various technical training courses in the Marine Corps. Results of a method for m-group regression developed by Molenaar and Lewis (1979) suggest that common weights for training courses…
Soil moisture estimation using multi linear regression with terraSAR-X data
Directory of Open Access Journals (Sweden)
G. García
2016-06-01
Full Text Available The first five centimeters of soil form an interface where the main heat fluxes exchanges between the land surface and the atmosphere occur. Besides ground measurements, remote sensing has proven to be an excellent tool for the monitoring of spatial and temporal distributed data of the most relevant Earth surface parameters including soil’s parameters. Indeed, active microwave sensors (Synthetic Aperture Radar - SAR offer the opportunity to monitor soil moisture (HS at global, regional and local scales by monitoring involved processes. Several inversion algorithms, that derive geophysical information as HS from SAR data, were developed. Many of them use electromagnetic models for simulating the backscattering coefficient and are based on statistical techniques, such as neural networks, inversion methods and regression models. Recent studies have shown that simple multiple regression techniques yield satisfactory results. The involved geophysical variables in these methodologies are descriptive of the soil structure, microwave characteristics and land use. Therefore, in this paper we aim at developing a multiple linear regression model to estimate HS on flat agricultural regions using TerraSAR-X satellite data and data from a ground weather station. The results show that the backscatter, the precipitation and the relative humidity are the explanatory variables of HS. The results obtained presented a RMSE of 5.4 and a R2 of about 0.6
Sparling, D.W.; Barzen, J.A.; Lovvorn, J.R.; Serie, J.R.
1992-01-01
Regression equations that use mensural data to estimate body condition have been developed for several water birds. These equations often have been based on data that represent different sexes, age classes, or seasons, without being adequately tested for intergroup differences. We used proximate carcass analysis of 538 adult and juvenile canvasbacks (Aythya valisineria ) collected during fall migration, winter, and spring migrations in 1975-76 and 1982-85 to test regression methods for estimating body condition.
Support vector regression methodology for estimating global solar radiation in Algeria
Guermoui, Mawloud; Rabehi, Abdelaziz; Gairaa, Kacem; Benkaciali, Said
2018-01-01
Accurate estimation of Daily Global Solar Radiation (DGSR) has been a major goal for solar energy applications. In this paper we show the possibility of developing a simple model based on the Support Vector Regression (SVM-R), which could be used to estimate DGSR on the horizontal surface in Algeria based only on sunshine ratio as input. The SVM model has been developed and tested using a data set recorded over three years (2005-2007). The data was collected at the Applied Research Unit for Renewable Energies (URAER) in Ghardaïa city. The data collected between 2005-2006 are used to train the model while the 2007 data are used to test the performance of the selected model. The measured and the estimated values of DGSR were compared during the testing phase statistically using the Root Mean Square Error (RMSE), Relative Square Error (rRMSE), and correlation coefficient (r2), which amount to 1.59(MJ/m2), 8.46 and 97,4%, respectively. The obtained results show that the SVM-R is highly qualified for DGSR estimation using only sunshine ratio.
Ding, A Adam; Wu, Hulin
2014-10-01
We propose a new method to use a constrained local polynomial regression to estimate the unknown parameters in ordinary differential equation models with a goal of improving the smoothing-based two-stage pseudo-least squares estimate. The equation constraints are derived from the differential equation model and are incorporated into the local polynomial regression in order to estimate the unknown parameters in the differential equation model. We also derive the asymptotic bias and variance of the proposed estimator. Our simulation studies show that our new estimator is clearly better than the pseudo-least squares estimator in estimation accuracy with a small price of computational cost. An application example on immune cell kinetics and trafficking for influenza infection further illustrates the benefits of the proposed new method.
[Evaluation of estimation of prevalence ratio using bayesian log-binomial regression model].
Gao, W L; Lin, H; Liu, X N; Ren, X W; Li, J S; Shen, X P; Zhu, S L
2017-03-10
To evaluate the estimation of prevalence ratio ( PR ) by using bayesian log-binomial regression model and its application, we estimated the PR of medical care-seeking prevalence to caregivers' recognition of risk signs of diarrhea in their infants by using bayesian log-binomial regression model in Openbugs software. The results showed that caregivers' recognition of infant' s risk signs of diarrhea was associated significantly with a 13% increase of medical care-seeking. Meanwhile, we compared the differences in PR 's point estimation and its interval estimation of medical care-seeking prevalence to caregivers' recognition of risk signs of diarrhea and convergence of three models (model 1: not adjusting for the covariates; model 2: adjusting for duration of caregivers' education, model 3: adjusting for distance between village and township and child month-age based on model 2) between bayesian log-binomial regression model and conventional log-binomial regression model. The results showed that all three bayesian log-binomial regression models were convergence and the estimated PRs were 1.130(95 %CI : 1.005-1.265), 1.128(95 %CI : 1.001-1.264) and 1.132(95 %CI : 1.004-1.267), respectively. Conventional log-binomial regression model 1 and model 2 were convergence and their PRs were 1.130(95 % CI : 1.055-1.206) and 1.126(95 % CI : 1.051-1.203), respectively, but the model 3 was misconvergence, so COPY method was used to estimate PR , which was 1.125 (95 %CI : 1.051-1.200). In addition, the point estimation and interval estimation of PRs from three bayesian log-binomial regression models differed slightly from those of PRs from conventional log-binomial regression model, but they had a good consistency in estimating PR . Therefore, bayesian log-binomial regression model can effectively estimate PR with less misconvergence and have more advantages in application compared with conventional log-binomial regression model.
Estimation of Panel Data Regression Models with Two-Sided Censoring or Truncation
DEFF Research Database (Denmark)
Alan, Sule; Honore, Bo E.; Hu, Luojia
2014-01-01
This paper constructs estimators for panel data regression models with individual speci…fic heterogeneity and two–sided censoring and truncation. Following Powell (1986) the estimation strategy is based on moment conditions constructed from re–censored or re–truncated residuals. While these moment...
DEFF Research Database (Denmark)
Petersen, Jørgen Holm
2016-01-01
This paper describes a new approach to the estimation in a logistic regression model with two crossed random effects where special interest is in estimating the variance of one of the effects while not making distributional assumptions about the other effect. A composite likelihood is studied...
Asymptotic normality of kernel estimator of $\\psi$-regression function for functional ergodic data
Laksaci ALI; Benziadi Fatima; Gheriballak Abdelkader
2016-01-01
In this paper we consider the problem of the estimation of the $\\psi$-regression function when the covariates take values in an infinite dimensional space. Our main aim is to establish, under a stationary ergodic process assumption, the asymptotic normality of this estimate.
Estimation Methods for Non-Homogeneous Regression - Minimum CRPS vs Maximum Likelihood
Gebetsberger, Manuel; Messner, Jakob W.; Mayr, Georg J.; Zeileis, Achim
2017-04-01
Non-homogeneous regression models are widely used to statistically post-process numerical weather prediction models. Such regression models correct for errors in mean and variance and are capable to forecast a full probability distribution. In order to estimate the corresponding regression coefficients, CRPS minimization is performed in many meteorological post-processing studies since the last decade. In contrast to maximum likelihood estimation, CRPS minimization is claimed to yield more calibrated forecasts. Theoretically, both scoring rules used as an optimization score should be able to locate a similar and unknown optimum. Discrepancies might result from a wrong distributional assumption of the observed quantity. To address this theoretical concept, this study compares maximum likelihood and minimum CRPS estimation for different distributional assumptions. First, a synthetic case study shows that, for an appropriate distributional assumption, both estimation methods yield to similar regression coefficients. The log-likelihood estimator is slightly more efficient. A real world case study for surface temperature forecasts at different sites in Europe confirms these results but shows that surface temperature does not always follow the classical assumption of a Gaussian distribution. KEYWORDS: ensemble post-processing, maximum likelihood estimation, CRPS minimization, probabilistic temperature forecasting, distributional regression models
The importance of the chosen technique to estimate diffuse solar radiation by means of regression
Energy Technology Data Exchange (ETDEWEB)
Arslan, Talha; Altyn Yavuz, Arzu [Department of Statistics. Science and Literature Faculty. Eskisehir Osmangazi University (Turkey)], email: mtarslan@ogu.edu.tr, email: aaltin@ogu.edu.tr; Acikkalp, Emin [Department of Mechanical and Manufacturing Engineering. Engineering Faculty. Bilecik University (Turkey)], email: acikkalp@gmail.com
2011-07-01
The Ordinary Least Squares (OLS) method is one of the most frequently used for estimation of diffuse solar radiation. The data set must provide certain assumptions for the OLS method to work. The most important is that the regression equation offered by OLS error terms must fit within the normal distribution. Utilizing an alternative robust estimator to get parameter estimations is highly effective in solving problems where there is a lack of normal distribution due to the presence of outliers or some other factor. The purpose of this study is to investigate the value of the chosen technique for the estimation of diffuse radiation. This study described alternative robust methods frequently used in applications and compared them with the OLS method. Making a comparison of the data set analysis of the OLS and that of the M Regression (Huber, Andrews and Tukey) techniques, it was study found that robust regression techniques are preferable to OLS because of the smoother explanation values.
Dai, Wenlin; Tong, Tiejun; Zhu, Lixing
2017-01-01
Difference-based methods do not require estimating the mean function in nonparametric regression and are therefore popular in practice. In this paper, we propose a unified framework for variance estimation that combines the linear regression method with the higher-order difference estimators systematically. The unified framework has greatly enriched the existing literature on variance estimation that includes most existing estimators as special cases. More importantly, the unified framework has also provided a smart way to solve the challenging difference sequence selection problem that remains a long-standing controversial issue in nonparametric regression for several decades. Using both theory and simulations, we recommend to use the ordinary difference sequence in the unified framework, no matter if the sample size is small or if the signal-to-noise ratio is large. Finally, to cater for the demands of the application, we have developed a unified R package, named VarED, that integrates the existing difference-based estimators and the unified estimators in nonparametric regression and have made it freely available in the R statistical program http://cran.r-project.org/web/packages/.
Robust estimation for homoscedastic regression in the secondary analysis of case-control data
Wei, Jiawei
2012-12-04
Primary analysis of case-control studies focuses on the relationship between disease D and a set of covariates of interest (Y, X). A secondary application of the case-control study, which is often invoked in modern genetic epidemiologic association studies, is to investigate the interrelationship between the covariates themselves. The task is complicated owing to the case-control sampling, where the regression of Y on X is different from what it is in the population. Previous work has assumed a parametric distribution for Y given X and derived semiparametric efficient estimation and inference without any distributional assumptions about X. We take up the issue of estimation of a regression function when Y given X follows a homoscedastic regression model, but otherwise the distribution of Y is unspecified. The semiparametric efficient approaches can be used to construct semiparametric efficient estimates, but they suffer from a lack of robustness to the assumed model for Y given X. We take an entirely different approach. We show how to estimate the regression parameters consistently even if the assumed model for Y given X is incorrect, and thus the estimates are model robust. For this we make the assumption that the disease rate is known or well estimated. The assumption can be dropped when the disease is rare, which is typically so for most case-control studies, and the estimation algorithm simplifies. Simulations and empirical examples are used to illustrate the approach.
Dai, Wenlin
2017-09-01
Difference-based methods do not require estimating the mean function in nonparametric regression and are therefore popular in practice. In this paper, we propose a unified framework for variance estimation that combines the linear regression method with the higher-order difference estimators systematically. The unified framework has greatly enriched the existing literature on variance estimation that includes most existing estimators as special cases. More importantly, the unified framework has also provided a smart way to solve the challenging difference sequence selection problem that remains a long-standing controversial issue in nonparametric regression for several decades. Using both theory and simulations, we recommend to use the ordinary difference sequence in the unified framework, no matter if the sample size is small or if the signal-to-noise ratio is large. Finally, to cater for the demands of the application, we have developed a unified R package, named VarED, that integrates the existing difference-based estimators and the unified estimators in nonparametric regression and have made it freely available in the R statistical program http://cran.r-project.org/web/packages/.
Robust estimation for homoscedastic regression in the secondary analysis of case-control data
Wei, Jiawei; Carroll, Raymond J.; Mü ller, Ursula U.; Keilegom, Ingrid Van; Chatterjee, Nilanjan
2012-01-01
Primary analysis of case-control studies focuses on the relationship between disease D and a set of covariates of interest (Y, X). A secondary application of the case-control study, which is often invoked in modern genetic epidemiologic association studies, is to investigate the interrelationship between the covariates themselves. The task is complicated owing to the case-control sampling, where the regression of Y on X is different from what it is in the population. Previous work has assumed a parametric distribution for Y given X and derived semiparametric efficient estimation and inference without any distributional assumptions about X. We take up the issue of estimation of a regression function when Y given X follows a homoscedastic regression model, but otherwise the distribution of Y is unspecified. The semiparametric efficient approaches can be used to construct semiparametric efficient estimates, but they suffer from a lack of robustness to the assumed model for Y given X. We take an entirely different approach. We show how to estimate the regression parameters consistently even if the assumed model for Y given X is incorrect, and thus the estimates are model robust. For this we make the assumption that the disease rate is known or well estimated. The assumption can be dropped when the disease is rare, which is typically so for most case-control studies, and the estimation algorithm simplifies. Simulations and empirical examples are used to illustrate the approach.
Estimating integrated variance in the presence of microstructure noise using linear regression
Holý, Vladimír
2017-07-01
Using financial high-frequency data for estimation of integrated variance of asset prices is beneficial but with increasing number of observations so-called microstructure noise occurs. This noise can significantly bias the realized variance estimator. We propose a method for estimation of the integrated variance robust to microstructure noise as well as for testing the presence of the noise. Our method utilizes linear regression in which realized variances estimated from different data subsamples act as dependent variable while the number of observations act as explanatory variable. We compare proposed estimator with other methods on simulated data for several microstructure noise structures.
A Gaussian IV estimator of cointegrating relations
DEFF Research Database (Denmark)
Bårdsen, Gunnar; Haldrup, Niels
2006-01-01
In static single equation cointegration regression modelsthe OLS estimator will have a non-standard distribution unless regressors arestrictly exogenous. In the literature a number of estimators have been suggestedto deal with this problem, especially by the use of semi-nonparametricestimators. T......In static single equation cointegration regression modelsthe OLS estimator will have a non-standard distribution unless regressors arestrictly exogenous. In the literature a number of estimators have been suggestedto deal with this problem, especially by the use of semi...... in cointegrating regressions. These instruments are almost idealand simulations show that the IV estimator using such instruments alleviatethe endogeneity problem extremely well in both finite and large samples....
Height and Weight Estimation From Anthropometric Measurements Using Machine Learning Regressions.
Rativa, Diego; Fernandes, Bruno J T; Roque, Alexandre
2018-01-01
Height and weight are measurements explored to tracking nutritional diseases, energy expenditure, clinical conditions, drug dosages, and infusion rates. Many patients are not ambulant or may be unable to communicate, and a sequence of these factors may not allow accurate estimation or measurements; in those cases, it can be estimated approximately by anthropometric means. Different groups have proposed different linear or non-linear equations which coefficients are obtained by using single or multiple linear regressions. In this paper, we present a complete study of the application of different learning models to estimate height and weight from anthropometric measurements: support vector regression, Gaussian process, and artificial neural networks. The predicted values are significantly more accurate than that obtained with conventional linear regressions. In all the cases, the predictions are non-sensitive to ethnicity, and to gender, if more than two anthropometric parameters are analyzed. The learning model analysis creates new opportunities for anthropometric applications in industry, textile technology, security, and health care.
International Nuclear Information System (INIS)
Lang, Corey; Siler, Matthew
2013-01-01
Energy efficiency upgrades have been gaining widespread attention across global channels as a cost-effective approach to addressing energy challenges. The cost-effectiveness of these projects is generally predicted using engineering estimates pre-implementation, often with little ex post analysis of project success. In this paper, for a suite of energy efficiency projects, we directly compare ex ante engineering estimates of energy savings to ex post econometric estimates that use 15-min interval, building-level energy consumption data. In contrast to most prior literature, our econometric results confirm the engineering estimates, even suggesting the engineering estimates were too modest. Further, we find heterogeneous efficiency impacts by time of day, suggesting select efficiency projects can be useful in reducing peak load. - Highlights: • Regression discontinuity used to estimate energy savings from efficiency projects. • Ex post econometric estimates validate ex ante engineering estimates of energy savings. • Select efficiency projects shown to reduce peak load
Monopole and dipole estimation for multi-frequency sky maps by linear regression
Wehus, I. K.; Fuskeland, U.; Eriksen, H. K.; Banday, A. J.; Dickinson, C.; Ghosh, T.; Górski, K. M.; Lawrence, C. R.; Leahy, J. P.; Maino, D.; Reich, P.; Reich, W.
2017-01-01
We describe a simple but efficient method for deriving a consistent set of monopole and dipole corrections for multi-frequency sky map data sets, allowing robust parametric component separation with the same data set. The computational core of this method is linear regression between pairs of frequency maps, often called T-T plots. Individual contributions from monopole and dipole terms are determined by performing the regression locally in patches on the sky, while the degeneracy between different frequencies is lifted whenever the dominant foreground component exhibits a significant spatial spectral index variation. Based on this method, we present two different, but each internally consistent, sets of monopole and dipole coefficients for the nine-year WMAP, Planck 2013, SFD 100 μm, Haslam 408 MHz and Reich & Reich 1420 MHz maps. The two sets have been derived with different analysis assumptions and data selection, and provide an estimate of residual systematic uncertainties. In general, our values are in good agreement with previously published results. Among the most notable results are a relative dipole between the WMAP and Planck experiments of 10-15μK (depending on frequency), an estimate of the 408 MHz map monopole of 8.9 ± 1.3 K, and a non-zero dipole in the 1420 MHz map of 0.15 ± 0.03 K pointing towards Galactic coordinates (l,b) = (308°,-36°) ± 14°. These values represent the sum of any instrumental and data processing offsets, as well as any Galactic or extra-Galactic component that is spectrally uniform over the full sky.
Directory of Open Access Journals (Sweden)
Jibo Yue
2018-01-01
Full Text Available Above-ground biomass (AGB provides a vital link between solar energy consumption and yield, so its correct estimation is crucial to accurately monitor crop growth and predict yield. In this work, we estimate AGB by using 54 vegetation indexes (e.g., Normalized Difference Vegetation Index, Soil-Adjusted Vegetation Index and eight statistical regression techniques: artificial neural network (ANN, multivariable linear regression (MLR, decision-tree regression (DT, boosted binary regression tree (BBRT, partial least squares regression (PLSR, random forest regression (RF, support vector machine regression (SVM, and principal component regression (PCR, which are used to analyze hyperspectral data acquired by using a field spectrophotometer. The vegetation indexes (VIs determined from the spectra were first used to train regression techniques for modeling and validation to select the best VI input, and then summed with white Gaussian noise to study how remote sensing errors affect the regression techniques. Next, the VIs were divided into groups of different sizes by using various sampling methods for modeling and validation to test the stability of the techniques. Finally, the AGB was estimated by using a leave-one-out cross validation with these powerful techniques. The results of the study demonstrate that, of the eight techniques investigated, PLSR and MLR perform best in terms of stability and are most suitable when high-accuracy and stable estimates are required from relatively few samples. In addition, RF is extremely robust against noise and is best suited to deal with repeated observations involving remote-sensing data (i.e., data affected by atmosphere, clouds, observation times, and/or sensor noise. Finally, the leave-one-out cross-validation method indicates that PLSR provides the highest accuracy (R2 = 0.89, RMSE = 1.20 t/ha, MAE = 0.90 t/ha, NRMSE = 0.07, CV (RMSE = 0.18; thus, PLSR is best suited for works requiring high
Islamiyati, A.; Fatmawati; Chamidah, N.
2018-03-01
The correlation assumption of the longitudinal data with bi-response occurs on the measurement between the subjects of observation and the response. It causes the auto-correlation of error, and this can be overcome by using a covariance matrix. In this article, we estimate the covariance matrix based on the penalized spline regression model. Penalized spline involves knot points and smoothing parameters simultaneously in controlling the smoothness of the curve. Based on our simulation study, the estimated regression model of the weighted penalized spline with covariance matrix gives a smaller error value compared to the error of the model without covariance matrix.
Generalized allometric regression to estimate biomass of Populus in short-rotation coppice
Energy Technology Data Exchange (ETDEWEB)
Ben Brahim, Mohammed; Gavaland, Andre; Cabanettes, Alain [INRA Centre de Toulouse, Castanet-Tolosane Cedex (France). Unite Agroforesterie et Foret Paysanne
2000-07-01
Data from four different stands were combined to establish a single generalized allometric equation to estimate above-ground biomass of individual Populus trees grown on short-rotation coppice. The generalized model was performed using diameter at breast height, the mean diameter and the mean height of each site as dependent variables and then compared with the stand-specific regressions using F-test. Results showed that this single regression estimates tree biomass well at each stand and does not introduce bias with increasing diameter.
Lusiana, Evellin Dewi
2017-12-01
The parameters of binary probit regression model are commonly estimated by using Maximum Likelihood Estimation (MLE) method. However, MLE method has limitation if the binary data contains separation. Separation is the condition where there are one or several independent variables that exactly grouped the categories in binary response. It will result the estimators of MLE method become non-convergent, so that they cannot be used in modeling. One of the effort to resolve the separation is using Firths approach instead. This research has two aims. First, to identify the chance of separation occurrence in binary probit regression model between MLE method and Firths approach. Second, to compare the performance of binary probit regression model estimator that obtained by MLE method and Firths approach using RMSE criteria. Those are performed using simulation method and under different sample size. The results showed that the chance of separation occurrence in MLE method for small sample size is higher than Firths approach. On the other hand, for larger sample size, the probability decreased and relatively identic between MLE method and Firths approach. Meanwhile, Firths estimators have smaller RMSE than MLEs especially for smaller sample sizes. But for larger sample sizes, the RMSEs are not much different. It means that Firths estimators outperformed MLE estimator.
Regression tools for CO2 inversions: application of a shrinkage estimator to process attribution
International Nuclear Information System (INIS)
Shaby, Benjamin A.; Field, Christopher B.
2006-01-01
In this study we perform an atmospheric inversion based on a shrinkage estimator. This method is used to estimate surface fluxes of CO 2 , first partitioned according to constituent geographic regions, and then according to constituent processes that are responsible for the total flux. Our approach differs from previous approaches in two important ways. The first is that the technique of linear Bayesian inversion is recast as a regression problem. Seen as such, standard regression tools are employed to analyse and reduce errors in the resultant estimates. A shrinkage estimator, which combines standard ridge regression with the linear 'Bayesian inversion' model, is introduced. This method introduces additional bias into the model with the aim of reducing variance such that errors are decreased overall. Compared with standard linear Bayesian inversion, the ridge technique seems to reduce both flux estimation errors and prediction errors. The second divergence from previous studies is that instead of dividing the world into geographically distinct regions and estimating the CO 2 flux in each region, the flux space is divided conceptually into processes that contribute to the total global flux. Formulating the problem in this manner adds to the interpretability of the resultant estimates and attempts to shed light on the problem of attributing sources and sinks to their underlying mechanisms
Robust best linear estimation for regression analysis using surrogate and instrumental variables.
Wang, C Y
2012-04-01
We investigate methods for regression analysis when covariates are measured with errors. In a subset of the whole cohort, a surrogate variable is available for the true unobserved exposure variable. The surrogate variable satisfies the classical measurement error model, but it may not have repeated measurements. In addition to the surrogate variables that are available among the subjects in the calibration sample, we assume that there is an instrumental variable (IV) that is available for all study subjects. An IV is correlated with the unobserved true exposure variable and hence can be useful in the estimation of the regression coefficients. We propose a robust best linear estimator that uses all the available data, which is the most efficient among a class of consistent estimators. The proposed estimator is shown to be consistent and asymptotically normal under very weak distributional assumptions. For Poisson or linear regression, the proposed estimator is consistent even if the measurement error from the surrogate or IV is heteroscedastic. Finite-sample performance of the proposed estimator is examined and compared with other estimators via intensive simulation studies. The proposed method and other methods are applied to a bladder cancer case-control study.
Hu, L; Zhang, Z G; Mouraux, A; Iannetti, G D
2015-05-01
Transient sensory, motor or cognitive event elicit not only phase-locked event-related potentials (ERPs) in the ongoing electroencephalogram (EEG), but also induce non-phase-locked modulations of ongoing EEG oscillations. These modulations can be detected when single-trial waveforms are analysed in the time-frequency domain, and consist in stimulus-induced decreases (event-related desynchronization, ERD) or increases (event-related synchronization, ERS) of synchrony in the activity of the underlying neuronal populations. ERD and ERS reflect changes in the parameters that control oscillations in neuronal networks and, depending on the frequency at which they occur, represent neuronal mechanisms involved in cortical activation, inhibition and binding. ERD and ERS are commonly estimated by averaging the time-frequency decomposition of single trials. However, their trial-to-trial variability that can reflect physiologically-important information is lost by across-trial averaging. Here, we aim to (1) develop novel approaches to explore single-trial parameters (including latency, frequency and magnitude) of ERP/ERD/ERS; (2) disclose the relationship between estimated single-trial parameters and other experimental factors (e.g., perceived intensity). We found that (1) stimulus-elicited ERP/ERD/ERS can be correctly separated using principal component analysis (PCA) decomposition with Varimax rotation on the single-trial time-frequency distributions; (2) time-frequency multiple linear regression with dispersion term (TF-MLRd) enhances the signal-to-noise ratio of ERP/ERD/ERS in single trials, and provides an unbiased estimation of their latency, frequency, and magnitude at single-trial level; (3) these estimates can be meaningfully correlated with each other and with other experimental factors at single-trial level (e.g., perceived stimulus intensity and ERP magnitude). The methods described in this article allow exploring fully non-phase-locked stimulus-induced cortical
Energy Technology Data Exchange (ETDEWEB)
Luukkonen, A.; Korkealaakso, J.; Pitkaenen, P. [VTT Communities and Infrastructure, Espoo (Finland)
1997-11-01
Teollisuuden Voima Oy selected five investigation areas for preliminary site studies (1987Ae1992). The more detailed site investigation project, launched at the beginning of 1993 and presently supervised by Posiva Oy, is concentrated to three investigation areas. Romuvaara at Kuhmo is one of the present target areas, and the geochemical, structural and hydrological data used in this study are extracted from there. The aim of the study is to develop suitable methods for groundwater composition estimation based on a group of known hydrogeological variables. The input variables used are related to the host type of groundwater, hydrological conditions around the host location, mixing potentials between different types of groundwater, and minerals equilibrated with the groundwater. The output variables are electrical conductivity, Ca, Mg, Mn, Na, K, Fe, Cl, S, HS, SO{sub 4}, alkalinity, {sup 3}H, {sup 14}C, {sup 13}C, Al, Sr, F, Br and I concentrations, and pH of the groundwater. The methodology is to associate the known hydrogeological conditions (i.e. input variables), with the known water compositions (output variables), and to evaluate mathematical relations between these groups. Output estimations are done with two separate procedures: partial least squares regressions on the principal components of input variables, and by training neural networks with input-output pairs. Coefficients of linear equations and trained networks are optional methods for actual predictions. The quality of output predictions are monitored with confidence limit estimations, evaluated from input variable covariances and output variances, and with charge balance calculations. Groundwater compositions in Romuvaara borehole KR10 are predicted at 10 metre intervals with both prediction methods. 46 refs.
On the degrees of freedom of reduced-rank estimators in multivariate regression.
Mukherjee, A; Chen, K; Wang, N; Zhu, J
We study the effective degrees of freedom of a general class of reduced-rank estimators for multivariate regression in the framework of Stein's unbiased risk estimation. A finite-sample exact unbiased estimator is derived that admits a closed-form expression in terms of the thresholded singular values of the least-squares solution and hence is readily computable. The results continue to hold in the high-dimensional setting where both the predictor and the response dimensions may be larger than the sample size. The derived analytical form facilitates the investigation of theoretical properties and provides new insights into the empirical behaviour of the degrees of freedom. In particular, we examine the differences and connections between the proposed estimator and a commonly-used naive estimator. The use of the proposed estimator leads to efficient and accurate prediction risk estimation and model selection, as demonstrated by simulation studies and a data example.
Estimation of the laser cutting operating cost by support vector regression methodology
Jović, Srđan; Radović, Aleksandar; Šarkoćević, Živče; Petković, Dalibor; Alizamir, Meysam
2016-09-01
Laser cutting is a popular manufacturing process utilized to cut various types of materials economically. The operating cost is affected by laser power, cutting speed, assist gas pressure, nozzle diameter and focus point position as well as the workpiece material. In this article, the process factors investigated were: laser power, cutting speed, air pressure and focal point position. The aim of this work is to relate the operating cost to the process parameters mentioned above. CO2 laser cutting of stainless steel of medical grade AISI316L has been investigated. The main goal was to analyze the operating cost through the laser power, cutting speed, air pressure, focal point position and material thickness. Since the laser operating cost is a complex, non-linear task, soft computing optimization algorithms can be used. Intelligent soft computing scheme support vector regression (SVR) was implemented. The performance of the proposed estimator was confirmed with the simulation results. The SVR results are then compared with artificial neural network and genetic programing. According to the results, a greater improvement in estimation accuracy can be achieved through the SVR compared to other soft computing methodologies. The new optimization methods benefit from the soft computing capabilities of global optimization and multiobjective optimization rather than choosing a starting point by trial and error and combining multiple criteria into a single criterion.
Estimating carbon and showing impacts of drought using satellite data in regression-tree models
Boyte, Stephen; Wylie, Bruce K.; Howard, Danny; Dahal, Devendra; Gilmanov, Tagir G.
2018-01-01
Integrating spatially explicit biogeophysical and remotely sensed data into regression-tree models enables the spatial extrapolation of training data over large geographic spaces, allowing a better understanding of broad-scale ecosystem processes. The current study presents annual gross primary production (GPP) and annual ecosystem respiration (RE) for 2000–2013 in several short-statured vegetation types using carbon flux data from towers that are located strategically across the conterminous United States (CONUS). We calculate carbon fluxes (annual net ecosystem production [NEP]) for each year in our study period, which includes 2012 when drought and higher-than-normal temperatures influence vegetation productivity in large parts of the study area. We present and analyse carbon flux dynamics in the CONUS to better understand how drought affects GPP, RE, and NEP. Model accuracy metrics show strong correlation coefficients (r) (r ≥ 94%) between training and estimated data for both GPP and RE. Overall, average annual GPP, RE, and NEP are relatively constant throughout the study period except during 2012 when almost 60% less carbon is sequestered than normal. These results allow us to conclude that this modelling method effectively estimates carbon dynamics through time and allows the exploration of impacts of meteorological anomalies and vegetation types on carbon dynamics.
Brillouin Scattering Spectrum Analysis Based on Auto-Regressive Spectral Estimation
Huang, Mengyun; Li, Wei; Liu, Zhangyun; Cheng, Linghao; Guan, Bai-Ou
2018-06-01
Auto-regressive (AR) spectral estimation technology is proposed to analyze the Brillouin scattering spectrum in Brillouin optical time-domain refelectometry. It shows that AR based method can reliably estimate the Brillouin frequency shift with an accuracy much better than fast Fourier transform (FFT) based methods provided the data length is not too short. It enables about 3 times improvement over FFT at a moderate spatial resolution.
Brillouin Scattering Spectrum Analysis Based on Auto-Regressive Spectral Estimation
Huang, Mengyun; Li, Wei; Liu, Zhangyun; Cheng, Linghao; Guan, Bai-Ou
2018-03-01
Auto-regressive (AR) spectral estimation technology is proposed to analyze the Brillouin scattering spectrum in Brillouin optical time-domain refelectometry. It shows that AR based method can reliably estimate the Brillouin frequency shift with an accuracy much better than fast Fourier transform (FFT) based methods provided the data length is not too short. It enables about 3 times improvement over FFT at a moderate spatial resolution.
Directory of Open Access Journals (Sweden)
Danang Ariyanto
2017-11-01
Full Text Available Regression is a method connected independent variable and dependent variable with estimation parameter as an output. Principal problem in this method is its application in spatial data. Geographically Weighted Regression (GWR method used to solve the problem. GWR is a regression technique that extends the traditional regression framework by allowing the estimation of local rather than global parameters. In other words, GWR runs a regression for each location, instead of a sole regression for the entire study area. The purpose of this research is to analyze the factors influencing wet land paddy productivities in Tulungagung Regency. The methods used in this research is GWR using cross validation bandwidth and weighted by adaptive Gaussian kernel fungtion.This research using 4 variables which are presumed affecting the wet land paddy productivities such as: the rate of rainfall(X1, the average cost of fertilizer per hectare(X2, the average cost of pestisides per hectare(X3 and Allocation of subsidized NPK fertilizer of food crops sub-sector(X4. Based on the result, X1, X2, X3 and X4 has a different effect on each Distric. So, to improve the productivity of wet land paddy in Tulungagung Regency required a special policy based on the GWR model in each distric.
Amalia, Junita; Purhadi, Otok, Bambang Widjanarko
2017-11-01
Poisson distribution is a discrete distribution with count data as the random variables and it has one parameter defines both mean and variance. Poisson regression assumes mean and variance should be same (equidispersion). Nonetheless, some case of the count data unsatisfied this assumption because variance exceeds mean (over-dispersion). The ignorance of over-dispersion causes underestimates in standard error. Furthermore, it causes incorrect decision in the statistical test. Previously, paired count data has a correlation and it has bivariate Poisson distribution. If there is over-dispersion, modeling paired count data is not sufficient with simple bivariate Poisson regression. Bivariate Poisson Inverse Gaussian Regression (BPIGR) model is mix Poisson regression for modeling paired count data within over-dispersion. BPIGR model produces a global model for all locations. In another hand, each location has different geographic conditions, social, cultural and economic so that Geographically Weighted Regression (GWR) is needed. The weighting function of each location in GWR generates a different local model. Geographically Weighted Bivariate Poisson Inverse Gaussian Regression (GWBPIGR) model is used to solve over-dispersion and to generate local models. Parameter estimation of GWBPIGR model obtained by Maximum Likelihood Estimation (MLE) method. Meanwhile, hypothesis testing of GWBPIGR model acquired by Maximum Likelihood Ratio Test (MLRT) method.
Estimating traffic volume on Wyoming low volume roads using linear and logistic regression methods
Directory of Open Access Journals (Sweden)
Dick Apronti
2016-12-01
Full Text Available Traffic volume is an important parameter in most transportation planning applications. Low volume roads make up about 69% of road miles in the United States. Estimating traffic on the low volume roads is a cost-effective alternative to taking traffic counts. This is because traditional traffic counts are expensive and impractical for low priority roads. The purpose of this paper is to present the development of two alternative means of cost-effectively estimating traffic volumes for low volume roads in Wyoming and to make recommendations for their implementation. The study methodology involves reviewing existing studies, identifying data sources, and carrying out the model development. The utility of the models developed were then verified by comparing actual traffic volumes to those predicted by the model. The study resulted in two regression models that are inexpensive and easy to implement. The first regression model was a linear regression model that utilized pavement type, access to highways, predominant land use types, and population to estimate traffic volume. In verifying the model, an R2 value of 0.64 and a root mean square error of 73.4% were obtained. The second model was a logistic regression model that identified the level of traffic on roads using five thresholds or levels. The logistic regression model was verified by estimating traffic volume thresholds and determining the percentage of roads that were accurately classified as belonging to the given thresholds. For the five thresholds, the percentage of roads classified correctly ranged from 79% to 88%. In conclusion, the verification of the models indicated both model types to be useful for accurate and cost-effective estimation of traffic volumes for low volume Wyoming roads. The models developed were recommended for use in traffic volume estimations for low volume roads in pavement management and environmental impact assessment studies.
Salmerón, Diego; Cano, Juan A; Chirlaque, María D
2015-08-30
In cohort studies, binary outcomes are very often analyzed by logistic regression. However, it is well known that when the goal is to estimate a risk ratio, the logistic regression is inappropriate if the outcome is common. In these cases, a log-binomial regression model is preferable. On the other hand, the estimation of the regression coefficients of the log-binomial model is difficult owing to the constraints that must be imposed on these coefficients. Bayesian methods allow a straightforward approach for log-binomial regression models and produce smaller mean squared errors in the estimation of risk ratios than the frequentist methods, and the posterior inferences can be obtained using the software WinBUGS. However, Markov chain Monte Carlo methods implemented in WinBUGS can lead to large Monte Carlo errors in the approximations to the posterior inferences because they produce correlated simulations, and the accuracy of the approximations are inversely related to this correlation. To reduce correlation and to improve accuracy, we propose a reparameterization based on a Poisson model and a sampling algorithm coded in R. Copyright © 2015 John Wiley & Sons, Ltd.
DEFF Research Database (Denmark)
Johansen, Søren
2008-01-01
The reduced rank regression model is a multivariate regression model with a coefficient matrix with reduced rank. The reduced rank regression algorithm is an estimation procedure, which estimates the reduced rank regression model. It is related to canonical correlations and involves calculating...
Huang, C.; Townshend, J.R.G.
2003-01-01
A stepwise regression tree (SRT) algorithm was developed for approximating complex nonlinear relationships. Based on the regression tree of Breiman et al . (BRT) and a stepwise linear regression (SLR) method, this algorithm represents an improvement over SLR in that it can approximate nonlinear relationships and over BRT in that it gives more realistic predictions. The applicability of this method to estimating subpixel forest was demonstrated using three test data sets, on all of which it gave more accurate predictions than SLR and BRT. SRT also generated more compact trees and performed better than or at least as well as BRT at all 10 equal forest proportion interval ranging from 0 to 100%. This method is appealing to estimating subpixel land cover over large areas.
Melguizo, Tatiana; Bos, Johannes M.; Ngo, Federick; Mills, Nicholas; Prather, George
2016-01-01
This study evaluates the effectiveness of math placement policies for entering community college students on these students' academic success in math. We estimate the impact of placement decisions by using a discrete-time survival model within a regression discontinuity framework. The primary conclusion that emerges is that initial placement in a…
Replicating Experimental Impact Estimates Using a Regression Discontinuity Approach. NCEE 2012-4025
Gleason, Philip M.; Resch, Alexandra M.; Berk, Jillian A.
2012-01-01
This NCEE Technical Methods Paper compares the estimated impacts of an educational intervention using experimental and regression discontinuity (RD) study designs. The analysis used data from two large-scale randomized controlled trials--the Education Technology Evaluation and the Teach for America Study--to provide evidence on the performance of…
Adding a Parameter Increases the Variance of an Estimated Regression Function
Withers, Christopher S.; Nadarajah, Saralees
2011-01-01
The linear regression model is one of the most popular models in statistics. It is also one of the simplest models in statistics. It has received applications in almost every area of science, engineering and medicine. In this article, the authors show that adding a predictor to a linear model increases the variance of the estimated regression…
Early cost estimating for road construction projects using multiple regression techniques
Directory of Open Access Journals (Sweden)
Ibrahim Mahamid
2011-12-01
Full Text Available The objective of this study is to develop early cost estimating models for road construction projects using multiple regression techniques, based on 131 sets of data collected in the West Bank in Palestine. As the cost estimates are required at early stages of a project, considerations were given to the fact that the input data for the required regression model could be easily extracted from sketches or scope definition of the project. 11 regression models are developed to estimate the total cost of road construction project in US dollar; 5 of them include bid quantities as input variables and 6 include road length and road width. The coefficient of determination r2 for the developed models is ranging from 0.92 to 0.98 which indicate that the predicted values from a forecast models fit with the real-life data. The values of the mean absolute percentage error (MAPE of the developed regression models are ranging from 13% to 31%, the results compare favorably with past researches which have shown that the estimate accuracy in the early stages of a project is between ±25% and ±50%.
The limiting behavior of the estimated parameters in a misspecified random field regression model
DEFF Research Database (Denmark)
Dahl, Christian Møller; Qin, Yu
This paper examines the limiting properties of the estimated parameters in the random field regression model recently proposed by Hamilton (Econometrica, 2001). Though the model is parametric, it enjoys the flexibility of the nonparametric approach since it can approximate a large collection of n...
Directory of Open Access Journals (Sweden)
Yoonseok Shin
2015-01-01
Full Text Available Among the recent data mining techniques available, the boosting approach has attracted a great deal of attention because of its effective learning algorithm and strong boundaries in terms of its generalization performance. However, the boosting approach has yet to be used in regression problems within the construction domain, including cost estimations, but has been actively utilized in other domains. Therefore, a boosting regression tree (BRT is applied to cost estimations at the early stage of a construction project to examine the applicability of the boosting approach to a regression problem within the construction domain. To evaluate the performance of the BRT model, its performance was compared with that of a neural network (NN model, which has been proven to have a high performance in cost estimation domains. The BRT model has shown results similar to those of NN model using 234 actual cost datasets of a building construction project. In addition, the BRT model can provide additional information such as the importance plot and structure model, which can support estimators in comprehending the decision making process. Consequently, the boosting approach has potential applicability in preliminary cost estimations in a building construction project.
Inverse estimation of multiple muscle activations based on linear logistic regression.
Sekiya, Masashi; Tsuji, Toshiaki
2017-07-01
This study deals with a technology to estimate the muscle activity from the movement data using a statistical model. A linear regression (LR) model and artificial neural networks (ANN) have been known as statistical models for such use. Although ANN has a high estimation capability, it is often in the clinical application that the lack of data amount leads to performance deterioration. On the other hand, the LR model has a limitation in generalization performance. We therefore propose a muscle activity estimation method to improve the generalization performance through the use of linear logistic regression model. The proposed method was compared with the LR model and ANN in the verification experiment with 7 participants. As a result, the proposed method showed better generalization performance than the conventional methods in various tasks.
Directory of Open Access Journals (Sweden)
Yi-Ming Kuo
2011-06-01
Full Text Available Fine airborne particulate matter (PM2.5 has adverse effects on human health. Assessing the long-term effects of PM2.5 exposure on human health and ecology is often limited by a lack of reliable PM2.5 measurements. In Taipei, PM2.5 levels were not systematically measured until August, 2005. Due to the popularity of geographic information systems (GIS, the landuse regression method has been widely used in the spatial estimation of PM concentrations. This method accounts for the potential contributing factors of the local environment, such as traffic volume. Geostatistical methods, on other hand, account for the spatiotemporal dependence among the observations of ambient pollutants. This study assesses the performance of the landuse regression model for the spatiotemporal estimation of PM2.5 in the Taipei area. Specifically, this study integrates the landuse regression model with the geostatistical approach within the framework of the Bayesian maximum entropy (BME method. The resulting epistemic framework can assimilate knowledge bases including: (a empirical-based spatial trends of PM concentration based on landuse regression, (b the spatio-temporal dependence among PM observation information, and (c site-specific PM observations. The proposed approach performs the spatiotemporal estimation of PM2.5 levels in the Taipei area (Taiwan from 2005–2007.
Yu, Hwa-Lung; Wang, Chih-Hsih; Liu, Ming-Che; Kuo, Yi-Ming
2011-06-01
Fine airborne particulate matter (PM2.5) has adverse effects on human health. Assessing the long-term effects of PM2.5 exposure on human health and ecology is often limited by a lack of reliable PM2.5 measurements. In Taipei, PM2.5 levels were not systematically measured until August, 2005. Due to the popularity of geographic information systems (GIS), the landuse regression method has been widely used in the spatial estimation of PM concentrations. This method accounts for the potential contributing factors of the local environment, such as traffic volume. Geostatistical methods, on other hand, account for the spatiotemporal dependence among the observations of ambient pollutants. This study assesses the performance of the landuse regression model for the spatiotemporal estimation of PM2.5 in the Taipei area. Specifically, this study integrates the landuse regression model with the geostatistical approach within the framework of the Bayesian maximum entropy (BME) method. The resulting epistemic framework can assimilate knowledge bases including: (a) empirical-based spatial trends of PM concentration based on landuse regression, (b) the spatio-temporal dependence among PM observation information, and (c) site-specific PM observations. The proposed approach performs the spatiotemporal estimation of PM2.5 levels in the Taipei area (Taiwan) from 2005-2007.
DEFF Research Database (Denmark)
Kirkeby, Carsten Thure; Hisham Beshara Halasa, Tariq; Gussmann, Maya Katrin
2017-01-01
the transmission rate. We use data from the two simulation models and vary the sampling intervals and the size of the population sampled. We devise two new methods to determine transmission rate, and compare these to the frequently used Poisson regression method in both epidemic and endemic situations. For most...... tested scenarios these new methods perform similar or better than Poisson regression, especially in the case of long sampling intervals. We conclude that transmission rate estimates are easily biased, which is important to take into account when using these rates in simulation models....
Graphical evaluation of the ridge-type robust regression estimators in mixture experiments.
Erkoc, Ali; Emiroglu, Esra; Akay, Kadri Ulas
2014-01-01
In mixture experiments, estimation of the parameters is generally based on ordinary least squares (OLS). However, in the presence of multicollinearity and outliers, OLS can result in very poor estimates. In this case, effects due to the combined outlier-multicollinearity problem can be reduced to certain extent by using alternative approaches. One of these approaches is to use biased-robust regression techniques for the estimation of parameters. In this paper, we evaluate various ridge-type robust estimators in the cases where there are multicollinearity and outliers during the analysis of mixture experiments. Also, for selection of biasing parameter, we use fraction of design space plots for evaluating the effect of the ridge-type robust estimators with respect to the scaled mean squared error of prediction. The suggested graphical approach is illustrated on Hald cement data set.
Higher-order Multivariable Polynomial Regression to Estimate Human Affective States
Wei, Jie; Chen, Tong; Liu, Guangyuan; Yang, Jiemin
2016-03-01
From direct observations, facial, vocal, gestural, physiological, and central nervous signals, estimating human affective states through computational models such as multivariate linear-regression analysis, support vector regression, and artificial neural network, have been proposed in the past decade. In these models, linear models are generally lack of precision because of ignoring intrinsic nonlinearities of complex psychophysiological processes; and nonlinear models commonly adopt complicated algorithms. To improve accuracy and simplify model, we introduce a new computational modeling method named as higher-order multivariable polynomial regression to estimate human affective states. The study employs standardized pictures in the International Affective Picture System to induce thirty subjects’ affective states, and obtains pure affective patterns of skin conductance as input variables to the higher-order multivariable polynomial model for predicting affective valence and arousal. Experimental results show that our method is able to obtain efficient correlation coefficients of 0.98 and 0.96 for estimation of affective valence and arousal, respectively. Moreover, the method may provide certain indirect evidences that valence and arousal have their brain’s motivational circuit origins. Thus, the proposed method can serve as a novel one for efficiently estimating human affective states.
Luque-Fernandez, Miguel Angel; Belot, Aurélien; Quaresma, Manuela; Maringe, Camille; Coleman, Michel P; Rachet, Bernard
2016-10-01
In population-based cancer research, piecewise exponential regression models are used to derive adjusted estimates of excess mortality due to cancer using the Poisson generalized linear modelling framework. However, the assumption that the conditional mean and variance of the rate parameter given the set of covariates x i are equal is strong and may fail to account for overdispersion given the variability of the rate parameter (the variance exceeds the mean). Using an empirical example, we aimed to describe simple methods to test and correct for overdispersion. We used a regression-based score test for overdispersion under the relative survival framework and proposed different approaches to correct for overdispersion including a quasi-likelihood, robust standard errors estimation, negative binomial regression and flexible piecewise modelling. All piecewise exponential regression models showed the presence of significant inherent overdispersion (p-value regression modelling, with either a quasi-likelihood or robust standard errors, was the best approach as it deals with both, overdispersion due to model misspecification and true or inherent overdispersion.
Energy Technology Data Exchange (ETDEWEB)
Lekadir, Karim, E-mail: karim.lekadir@upf.edu; Hoogendoorn, Corné [Center for Computational Imaging and Simulation Technologies in Biomedicine, Universitat Pompeu Fabra, Barcelona 08018 (Spain); Armitage, Paul [The Academic Unit of Radiology, The University of Sheffield, Sheffield S10 2JF (United Kingdom); Whitby, Elspeth [The Academic Unit of Reproductive and Developmental Medicine, The University of Sheffield, Sheffield S10 2SF (United Kingdom); King, David [The Academic Unit of Child Health, The University of Sheffield, Sheffield S10 2TH (United Kingdom); Dimitri, Paul [The Mellanby Centre for Bone Research, The University of Sheffield, Sheffield S10 2RX (United Kingdom); Frangi, Alejandro F. [Center for Computational Imaging and Simulation Technologies in Biomedicine, The University of Sheffield, Sheffield S1 3JD (United Kingdom)
2016-06-15
Purpose: This paper presents a statistical approach for the prediction of trabecular bone parameters from low-resolution multisequence magnetic resonance imaging (MRI) in children, thus addressing the limitations of high-resolution modalities such as HR-pQCT, including the significant exposure of young patients to radiation and the limited applicability of such modalities to peripheral bones in vivo. Methods: A statistical predictive model is constructed from a database of MRI and HR-pQCT datasets, to relate the low-resolution MRI appearance in the cancellous bone to the trabecular parameters extracted from the high-resolution images. The description of the MRI appearance is achieved between subjects by using a collection of feature descriptors, which describe the texture properties inside the cancellous bone, and which are invariant to the geometry and size of the trabecular areas. The predictive model is built by fitting to the training data a nonlinear partial least square regression between the input MRI features and the output trabecular parameters. Results: Detailed validation based on a sample of 96 datasets shows correlations >0.7 between the trabecular parameters predicted from low-resolution multisequence MRI based on the proposed statistical model and the values extracted from high-resolution HRp-QCT. Conclusions: The obtained results indicate the promise of the proposed predictive technique for the estimation of trabecular parameters in children from multisequence MRI, thus reducing the need for high-resolution radiation-based scans for a fragile population that is under development and growth.
Direct and simultaneous estimation of cardiac four chamber volumes by multioutput sparse regression.
Zhen, Xiantong; Zhang, Heye; Islam, Ali; Bhaduri, Mousumi; Chan, Ian; Li, Shuo
2017-02-01
Cardiac four-chamber volume estimation serves as a fundamental and crucial role in clinical quantitative analysis of whole heart functions. It is a challenging task due to the huge complexity of the four chambers including great appearance variations, huge shape deformation and interference between chambers. Direct estimation has recently emerged as an effective and convenient tool for cardiac ventricular volume estimation. However, existing direct estimation methods were specifically developed for one single ventricle, i.e., left ventricle (LV), or bi-ventricles; they can not be directly used for four chamber volume estimation due to the great combinatorial variability and highly complex anatomical interdependency of the four chambers. In this paper, we propose a new, general framework for direct and simultaneous four chamber volume estimation. We have addressed two key issues, i.e., cardiac image representation and simultaneous four chamber volume estimation, which enables accurate and efficient four-chamber volume estimation. We generate compact and discriminative image representations by supervised descriptor learning (SDL) which can remove irrelevant information and extract discriminative features. We propose direct and simultaneous four-chamber volume estimation by the multioutput sparse latent regression (MSLR), which enables jointly modeling nonlinear input-output relationships and capturing four-chamber interdependence. The proposed method is highly generalized, independent of imaging modalities, which provides a general regression framework that can be extensively used for clinical data prediction to achieve automated diagnosis. Experiments on both MR and CT images show that our method achieves high performance with a correlation coefficient of up to 0.921 with ground truth obtained manually by human experts, which is clinically significant and enables more accurate, convenient and comprehensive assessment of cardiac functions. Copyright © 2016 Elsevier
Burns, Douglas A.; Smith, Martyn J.; Freehafer, Douglas A.
2015-12-31
A new Web-based application, titled “Application of Flood Regressions and Climate Change Scenarios To Explore Estimates of Future Peak Flows”, has been developed by the U.S. Geological Survey, in cooperation with the New York State Department of Transportation, that allows a user to apply a set of regression equations to estimate the magnitude of future floods for any stream or river in New York State (exclusive of Long Island) and the Lake Champlain Basin in Vermont. The regression equations that are the basis of the current application were developed in previous investigations by the U.S. Geological Survey (USGS) and are described at the USGS StreamStats Web sites for New York (http://water.usgs.gov/osw/streamstats/new_york.html) and Vermont (http://water.usgs.gov/osw/streamstats/Vermont.html). These regression equations include several fixed landscape metrics that quantify aspects of watershed geomorphology, basin size, and land cover as well as a climate variable—either annual precipitation or annual runoff.
In search of a corrected prescription drug elasticity estimate: a meta-regression approach.
Gemmill, Marin C; Costa-Font, Joan; McGuire, Alistair
2007-06-01
An understanding of the relationship between cost sharing and drug consumption depends on consistent and unbiased price elasticity estimates. However, there is wide heterogeneity among studies, which constrains the applicability of elasticity estimates for empirical purposes and policy simulation. This paper attempts to provide a corrected measure of the drug price elasticity by employing meta-regression analysis (MRA). The results indicate that the elasticity estimates are significantly different from zero, and the corrected elasticity is -0.209 when the results are made robust to heteroskedasticity and clustering of observations. Elasticity values are higher when the study was published in an economic journal, when the study employed a greater number of observations, and when the study used aggregate data. Elasticity estimates are lower when the institutional setting was a tax-based health insurance system.
Estimating the Impact of Urbanization on Air Quality in China Using Spatial Regression Models
Fang, Chuanglin; Liu, Haimeng; Li, Guangdong; Sun, Dongqi; Miao, Zhuang
2015-01-01
Urban air pollution is one of the most visible environmental problems to have accompanied China’s rapid urbanization. Based on emission inventory data from 2014, gathered from 289 cities, we used Global and Local Moran’s I to measure the spatial autorrelation of Air Quality Index (AQI) values at the city level, and employed Ordinary Least Squares (OLS), Spatial Lag Model (SAR), and Geographically Weighted Regression (GWR) to quantitatively estimate the comprehensive impact and spatial variati...
Ordinal Regression Based Subpixel Shift Estimation for Video Super-Resolution
Directory of Open Access Journals (Sweden)
Petrovic Nemanja
2007-01-01
Full Text Available We present a supervised learning-based approach for subpixel motion estimation which is then used to perform video super-resolution. The novelty of this work is the formulation of the problem of subpixel motion estimation in a ranking framework. The ranking formulation is a variant of classification and regression formulation, in which the ordering present in class labels namely, the shift between patches is explicitly taken into account. Finally, we demonstrate the applicability of our approach on superresolving synthetically generated images with global subpixel shifts and enhancing real video frames by accounting for both local integer and subpixel shifts.
Kim, Yoonsang; Emery, Sherry
2013-01-01
Several statistical packages are capable of estimating generalized linear mixed models and these packages provide one or more of three estimation methods: penalized quasi-likelihood, Laplace, and Gauss-Hermite. Many studies have investigated these methods’ performance for the mixed-effects logistic regression model. However, the authors focused on models with one or two random effects and assumed a simple covariance structure between them, which may not be realistic. When there are multiple correlated random effects in a model, the computation becomes intensive, and often an algorithm fails to converge. Moreover, in our analysis of smoking status and exposure to anti-tobacco advertisements, we have observed that when a model included multiple random effects, parameter estimates varied considerably from one statistical package to another even when using the same estimation method. This article presents a comprehensive review of the advantages and disadvantages of each estimation method. In addition, we compare the performances of the three methods across statistical packages via simulation, which involves two- and three-level logistic regression models with at least three correlated random effects. We apply our findings to a real dataset. Our results suggest that two packages—SAS GLIMMIX Laplace and SuperMix Gaussian quadrature—perform well in terms of accuracy, precision, convergence rates, and computing speed. We also discuss the strengths and weaknesses of the two packages in regard to sample sizes. PMID:24288415
Kim, Yoonsang; Choi, Young-Ku; Emery, Sherry
2013-08-01
Several statistical packages are capable of estimating generalized linear mixed models and these packages provide one or more of three estimation methods: penalized quasi-likelihood, Laplace, and Gauss-Hermite. Many studies have investigated these methods' performance for the mixed-effects logistic regression model. However, the authors focused on models with one or two random effects and assumed a simple covariance structure between them, which may not be realistic. When there are multiple correlated random effects in a model, the computation becomes intensive, and often an algorithm fails to converge. Moreover, in our analysis of smoking status and exposure to anti-tobacco advertisements, we have observed that when a model included multiple random effects, parameter estimates varied considerably from one statistical package to another even when using the same estimation method. This article presents a comprehensive review of the advantages and disadvantages of each estimation method. In addition, we compare the performances of the three methods across statistical packages via simulation, which involves two- and three-level logistic regression models with at least three correlated random effects. We apply our findings to a real dataset. Our results suggest that two packages-SAS GLIMMIX Laplace and SuperMix Gaussian quadrature-perform well in terms of accuracy, precision, convergence rates, and computing speed. We also discuss the strengths and weaknesses of the two packages in regard to sample sizes.
Accounting for estimated IQ in neuropsychological test performance with regression-based techniques.
Testa, S Marc; Winicki, Jessica M; Pearlson, Godfrey D; Gordon, Barry; Schretlen, David J
2009-11-01
Regression-based normative techniques account for variability in test performance associated with multiple predictor variables and generate expected scores based on algebraic equations. Using this approach, we show that estimated IQ, based on oral word reading, accounts for 1-9% of the variability beyond that explained by individual differences in age, sex, race, and years of education for most cognitive measures. These results confirm that adding estimated "premorbid" IQ to demographic predictors in multiple regression models can incrementally improve the accuracy with which regression-based norms (RBNs) benchmark expected neuropsychological test performance in healthy adults. It remains to be seen whether the incremental variance in test performance explained by estimated "premorbid" IQ translates to improved diagnostic accuracy in patient samples. We describe these methods, and illustrate the step-by-step application of RBNs with two cases. We also discuss the rationale, assumptions, and caveats of this approach. More broadly, we note that adjusting test scores for age and other characteristics might actually decrease the accuracy with which test performance predicts absolute criteria, such as the ability to drive or live independently.
Sanford, Ward E.; Selnick, David L.
2013-01-01
Evapotranspiration (ET) is an important quantity for water resource managers to know because it often represents the largest sink for precipitation (P) arriving at the land surface. In order to estimate actual ET across the conterminous United States (U.S.) in this study, a water-balance method was combined with a climate and land-cover regression equation. Precipitation and streamflow records were compiled for 838 watersheds for 1971-2000 across the U.S. to obtain long-term estimates of actual ET. A regression equation was developed that related the ratio ET/P to climate and land-cover variables within those watersheds. Precipitation and temperatures were used from the PRISM climate dataset, and land-cover data were used from the USGS National Land Cover Dataset. Results indicate that ET can be predicted relatively well at a watershed or county scale with readily available climate variables alone, and that land-cover data can also improve those predictions. Using the climate and land-cover data at an 800-m scale and then averaging to the county scale, maps were produced showing estimates of ET and ET/P for the entire conterminous U.S. Using the regression equation, such maps could also be made for more detailed state coverages, or for other areas of the world where climate and land-cover data are plentiful.
Regression to fuzziness method for estimation of remaining useful life in power plant components
Alamaniotis, Miltiadis; Grelle, Austin; Tsoukalas, Lefteri H.
2014-10-01
Mitigation of severe accidents in power plants requires the reliable operation of all systems and the on-time replacement of mechanical components. Therefore, the continuous surveillance of power systems is a crucial concern for the overall safety, cost control, and on-time maintenance of a power plant. In this paper a methodology called regression to fuzziness is presented that estimates the remaining useful life (RUL) of power plant components. The RUL is defined as the difference between the time that a measurement was taken and the estimated failure time of that component. The methodology aims to compensate for a potential lack of historical data by modeling an expert's operational experience and expertise applied to the system. It initially identifies critical degradation parameters and their associated value range. Once completed, the operator's experience is modeled through fuzzy sets which span the entire parameter range. This model is then synergistically used with linear regression and a component's failure point to estimate the RUL. The proposed methodology is tested on estimating the RUL of a turbine (the basic electrical generating component of a power plant) in three different cases. Results demonstrate the benefits of the methodology for components for which operational data is not readily available and emphasize the significance of the selection of fuzzy sets and the effect of knowledge representation on the predicted output. To verify the effectiveness of the methodology, it was benchmarked against the data-based simple linear regression model used for predictions which was shown to perform equal or worse than the presented methodology. Furthermore, methodology comparison highlighted the improvement in estimation offered by the adoption of appropriate of fuzzy sets for parameter representation.
Carroll, Raymond J.
2011-03-01
In many applications we can expect that, or are interested to know if, a density function or a regression curve satisfies some specific shape constraints. For example, when the explanatory variable, X, represents the value taken by a treatment or dosage, the conditional mean of the response, Y , is often anticipated to be a monotone function of X. Indeed, if this regression mean is not monotone (in the appropriate direction) then the medical or commercial value of the treatment is likely to be significantly curtailed, at least for values of X that lie beyond the point at which monotonicity fails. In the case of a density, common shape constraints include log-concavity and unimodality. If we can correctly guess the shape of a curve, then nonparametric estimators can be improved by taking this information into account. Addressing such problems requires a method for testing the hypothesis that the curve of interest satisfies a shape constraint, and, if the conclusion of the test is positive, a technique for estimating the curve subject to the constraint. Nonparametric methodology for solving these problems already exists, but only in cases where the covariates are observed precisely. However in many problems, data can only be observed with measurement errors, and the methods employed in the error-free case typically do not carry over to this error context. In this paper we develop a novel approach to hypothesis testing and function estimation under shape constraints, which is valid in the context of measurement errors. Our method is based on tilting an estimator of the density or the regression mean until it satisfies the shape constraint, and we take as our test statistic the distance through which it is tilted. Bootstrap methods are used to calibrate the test. The constrained curve estimators that we develop are also based on tilting, and in that context our work has points of contact with methodology in the error-free case.
Estimating HIES Data through Ratio and Regression Methods for Different Sampling Designs
Directory of Open Access Journals (Sweden)
Faqir Muhammad
2007-01-01
Full Text Available In this study, comparison has been made for different sampling designs, using the HIES data of North West Frontier Province (NWFP for 2001-02 and 1998-99 collected from the Federal Bureau of Statistics, Statistical Division, Government of Pakistan, Islamabad. The performance of the estimators has also been considered using bootstrap and Jacknife. A two-stage stratified random sample design is adopted by HIES. In the first stage, enumeration blocks and villages are treated as the first stage Primary Sampling Units (PSU. The sample PSU’s are selected with probability proportional to size. Secondary Sampling Units (SSU i.e., households are selected by systematic sampling with a random start. They have used a single study variable. We have compared the HIES technique with some other designs, which are: Stratified Simple Random Sampling. Stratified Systematic Sampling. Stratified Ranked Set Sampling. Stratified Two Phase Sampling. Ratio and Regression methods were applied with two study variables, which are: Income (y and Household sizes (x. Jacknife and Bootstrap are used for variance replication. Simple Random Sampling with sample size (462 to 561 gave moderate variances both by Jacknife and Bootstrap. By applying Systematic Sampling, we received moderate variance with sample size (467. In Jacknife with Systematic Sampling, we obtained variance of regression estimator greater than that of ratio estimator for a sample size (467 to 631. At a sample size (952 variance of ratio estimator gets greater than that of regression estimator. The most efficient design comes out to be Ranked set sampling compared with other designs. The Ranked set sampling with jackknife and bootstrap, gives minimum variance even with the smallest sample size (467. Two Phase sampling gave poor performance. Multi-stage sampling applied by HIES gave large variances especially if used with a single study variable.
A robust background regression based score estimation algorithm for hyperspectral anomaly detection
Zhao, Rui; Du, Bo; Zhang, Liangpei; Zhang, Lefei
2016-12-01
Anomaly detection has become a hot topic in the hyperspectral image analysis and processing fields in recent years. The most important issue for hyperspectral anomaly detection is the background estimation and suppression. Unreasonable or non-robust background estimation usually leads to unsatisfactory anomaly detection results. Furthermore, the inherent nonlinearity of hyperspectral images may cover up the intrinsic data structure in the anomaly detection. In order to implement robust background estimation, as well as to explore the intrinsic data structure of the hyperspectral image, we propose a robust background regression based score estimation algorithm (RBRSE) for hyperspectral anomaly detection. The Robust Background Regression (RBR) is actually a label assignment procedure which segments the hyperspectral data into a robust background dataset and a potential anomaly dataset with an intersection boundary. In the RBR, a kernel expansion technique, which explores the nonlinear structure of the hyperspectral data in a reproducing kernel Hilbert space, is utilized to formulate the data as a density feature representation. A minimum squared loss relationship is constructed between the data density feature and the corresponding assigned labels of the hyperspectral data, to formulate the foundation of the regression. Furthermore, a manifold regularization term which explores the manifold smoothness of the hyperspectral data, and a maximization term of the robust background average density, which suppresses the bias caused by the potential anomalies, are jointly appended in the RBR procedure. After this, a paired-dataset based k-nn score estimation method is undertaken on the robust background and potential anomaly datasets, to implement the detection output. The experimental results show that RBRSE achieves superior ROC curves, AUC values, and background-anomaly separation than some of the other state-of-the-art anomaly detection methods, and is easy to implement
International Nuclear Information System (INIS)
Baser, Furkan; Demirhan, Haydar
2017-01-01
Accurate estimation of the amount of horizontal global solar radiation for a particular field is an important input for decision processes in solar radiation investments. In this article, we focus on the estimation of yearly mean daily horizontal global solar radiation by using an approach that utilizes fuzzy regression functions with support vector machine (FRF-SVM). This approach is not seriously affected by outlier observations and does not suffer from the over-fitting problem. To demonstrate the utility of the FRF-SVM approach in the estimation of horizontal global solar radiation, we conduct an empirical study over a dataset collected in Turkey and applied the FRF-SVM approach with several kernel functions. Then, we compare the estimation accuracy of the FRF-SVM approach to an adaptive neuro-fuzzy system and a coplot supported-genetic programming approach. We observe that the FRF-SVM approach with a Gaussian kernel function is not affected by both outliers and over-fitting problem and gives the most accurate estimates of horizontal global solar radiation among the applied approaches. Consequently, the use of hybrid fuzzy functions and support vector machine approaches is found beneficial in long-term forecasting of horizontal global solar radiation over a region with complex climatic and terrestrial characteristics. - Highlights: • A fuzzy regression functions with support vector machines approach is proposed. • The approach is robust against outlier observations and over-fitting problem. • Estimation accuracy of the model is superior to several existent alternatives. • A new solar radiation estimation model is proposed for the region of Turkey. • The model is useful under complex terrestrial and climatic conditions.
A different approach to estimate nonlinear regression model using numerical methods
Mahaboob, B.; Venkateswarlu, B.; Mokeshrayalu, G.; Balasiddamuni, P.
2017-11-01
This research paper concerns with the computational methods namely the Gauss-Newton method, Gradient algorithm methods (Newton-Raphson method, Steepest Descent or Steepest Ascent algorithm method, the Method of Scoring, the Method of Quadratic Hill-Climbing) based on numerical analysis to estimate parameters of nonlinear regression model in a very different way. Principles of matrix calculus have been used to discuss the Gradient-Algorithm methods. Yonathan Bard [1] discussed a comparison of gradient methods for the solution of nonlinear parameter estimation problems. However this article discusses an analytical approach to the gradient algorithm methods in a different way. This paper describes a new iterative technique namely Gauss-Newton method which differs from the iterative technique proposed by Gorden K. Smyth [2]. Hans Georg Bock et.al [10] proposed numerical methods for parameter estimation in DAE’s (Differential algebraic equation). Isabel Reis Dos Santos et al [11], Introduced weighted least squares procedure for estimating the unknown parameters of a nonlinear regression metamodel. For large-scale non smooth convex minimization the Hager and Zhang (HZ) conjugate gradient Method and the modified HZ (MHZ) method were presented by Gonglin Yuan et al [12].
Wang, Wei; Griswold, Michael E
2016-11-30
The random effect Tobit model is a regression model that accommodates both left- and/or right-censoring and within-cluster dependence of the outcome variable. Regression coefficients of random effect Tobit models have conditional interpretations on a constructed latent dependent variable and do not provide inference of overall exposure effects on the original outcome scale. Marginalized random effects model (MREM) permits likelihood-based estimation of marginal mean parameters for the clustered data. For random effect Tobit models, we extend the MREM to marginalize over both the random effects and the normal space and boundary components of the censored response to estimate overall exposure effects at population level. We also extend the 'Average Predicted Value' method to estimate the model-predicted marginal means for each person under different exposure status in a designated reference group by integrating over the random effects and then use the calculated difference to assess the overall exposure effect. The maximum likelihood estimation is proposed utilizing a quasi-Newton optimization algorithm with Gauss-Hermite quadrature to approximate the integration of the random effects. We use these methods to carefully analyze two real datasets. Copyright © 2016 John Wiley & Sons, Ltd. Copyright © 2016 John Wiley & Sons, Ltd.
Saputro, D. R. S.; Amalia, F.; Widyaningsih, P.; Affan, R. C.
2018-05-01
Bayesian method is a method that can be used to estimate the parameters of multivariate multiple regression model. Bayesian method has two distributions, there are prior and posterior distributions. Posterior distribution is influenced by the selection of prior distribution. Jeffreys’ prior distribution is a kind of Non-informative prior distribution. This prior is used when the information about parameter not available. Non-informative Jeffreys’ prior distribution is combined with the sample information resulting the posterior distribution. Posterior distribution is used to estimate the parameter. The purposes of this research is to estimate the parameters of multivariate regression model using Bayesian method with Non-informative Jeffreys’ prior distribution. Based on the results and discussion, parameter estimation of β and Σ which were obtained from expected value of random variable of marginal posterior distribution function. The marginal posterior distributions for β and Σ are multivariate normal and inverse Wishart. However, in calculation of the expected value involving integral of a function which difficult to determine the value. Therefore, approach is needed by generating of random samples according to the posterior distribution characteristics of each parameter using Markov chain Monte Carlo (MCMC) Gibbs sampling algorithm.
A subagging regression method for estimating the qualitative and quantitative state of groundwater
Jeong, Jina; Park, Eungyu; Han, Weon Shik; Kim, Kue-Young
2017-08-01
A subsample aggregating (subagging) regression (SBR) method for the analysis of groundwater data pertaining to trend-estimation-associated uncertainty is proposed. The SBR method is validated against synthetic data competitively with other conventional robust and non-robust methods. From the results, it is verified that the estimation accuracies of the SBR method are consistent and superior to those of other methods, and the uncertainties are reasonably estimated; the others have no uncertainty analysis option. To validate further, actual groundwater data are employed and analyzed comparatively with Gaussian process regression (GPR). For all cases, the trend and the associated uncertainties are reasonably estimated by both SBR and GPR regardless of Gaussian or non-Gaussian skewed data. However, it is expected that GPR has a limitation in applications to severely corrupted data by outliers owing to its non-robustness. From the implementations, it is determined that the SBR method has the potential to be further developed as an effective tool of anomaly detection or outlier identification in groundwater state data such as the groundwater level and contaminant concentration.
Relative Importance for Linear Regression in R: The Package relaimpo
Groemping, Ulrike
2006-01-01
Relative importance is a topic that has seen a lot of interest in recent years, particularly in applied work. The R package relaimpo implements six different metrics for assessing relative importance of regressors in the linear model, two of which are recommended - averaging over orderings of regressors and a newly proposed metric (Feldman 2005) called pmvd. Apart from delivering the metrics themselves, relaimpo also provides (exploratory) bootstrap confidence intervals. This paper offers a b...
Directory of Open Access Journals (Sweden)
Francesco Sarracino
2017-04-01
Full Text Available Recent studies documented that survey data contain duplicate records. We assess how duplicate records affect regression estimates, and we evaluate the effectiveness of solutions to deal with duplicate records. Results show that the chances of obtaining unbiased estimates when data contain 40 doublets (about 5% of the sample range between 3.5% and 11.5% depending on the distribution of duplicates. If 7 quintuplets are present in the data (2% of the sample, then the probability of obtaining biased estimates ranges between 11% and 20%. Weighting the duplicate records by the inverse of their multiplicity, or dropping superfluous duplicates outperform other solutions in all considered scenarios. Our results illustrate the risk of using data in presence of duplicate records and call for further research on strategies to analyze affected data.
Oil and gas pipeline construction cost analysis and developing regression models for cost estimation
Thaduri, Ravi Kiran
In this study, cost data for 180 pipelines and 136 compressor stations have been analyzed. On the basis of the distribution analysis, regression models have been developed. Material, Labor, ROW and miscellaneous costs make up the total cost of a pipeline construction. The pipelines are analyzed based on different pipeline lengths, diameter, location, pipeline volume and year of completion. In a pipeline construction, labor costs dominate the total costs with a share of about 40%. Multiple non-linear regression models are developed to estimate the component costs of pipelines for various cross-sectional areas, lengths and locations. The Compressor stations are analyzed based on the capacity, year of completion and location. Unlike the pipeline costs, material costs dominate the total costs in the construction of compressor station, with an average share of about 50.6%. Land costs have very little influence on the total costs. Similar regression models are developed to estimate the component costs of compressor station for various capacities and locations.
Saunders, Christina T; Blume, Jeffrey D
2017-10-26
Mediation analysis explores the degree to which an exposure's effect on an outcome is diverted through a mediating variable. We describe a classical regression framework for conducting mediation analyses in which estimates of causal mediation effects and their variance are obtained from the fit of a single regression model. The vector of changes in exposure pathway coefficients, which we named the essential mediation components (EMCs), is used to estimate standard causal mediation effects. Because these effects are often simple functions of the EMCs, an analytical expression for their model-based variance follows directly. Given this formula, it is instructive to revisit the performance of routinely used variance approximations (e.g., delta method and resampling methods). Requiring the fit of only one model reduces the computation time required for complex mediation analyses and permits the use of a rich suite of regression tools that are not easily implemented on a system of three equations, as would be required in the Baron-Kenny framework. Using data from the BRAIN-ICU study, we provide examples to illustrate the advantages of this framework and compare it with the existing approaches. © The Author 2017. Published by Oxford University Press.
Visser, H.; Molenaar, J.
1995-05-01
The detection of trends in climatological data has become central to the discussion on climate change due to the enhanced greenhouse effect. To prove detection, a method is needed (i) to make inferences on significant rises or declines in trends, (ii) to take into account natural variability in climate series, and (iii) to compare output from GCMs with the trends in observed climate data. To meet these requirements, flexible mathematical tools are needed. A structural time series model is proposed with which a stochastic trend, a deterministic trend, and regression coefficients can be estimated simultaneously. The stochastic trend component is described using the class of ARIMA models. The regression component is assumed to be linear. However, the regression coefficients corresponding with the explanatory variables may be time dependent to validate this assumption. The mathematical technique used to estimate this trend-regression model is the Kaiman filter. The main features of the filter are discussed.Examples of trend estimation are given using annual mean temperatures at a single station in the Netherlands (1706-1990) and annual mean temperatures at Northern Hemisphere land stations (1851-1990). The inclusion of explanatory variables is shown by regressing the latter temperature series on four variables: Southern Oscillation index (SOI), volcanic dust index (VDI), sunspot numbers (SSN), and a simulated temperature signal, induced by increasing greenhouse gases (GHG). In all analyses, the influence of SSN on global temperatures is found to be negligible. The correlations between temperatures and SOI and VDI appear to be negative. For SOI, this correlation is significant, but for VDI it is not, probably because of a lack of volcanic eruptions during the sample period. The relation between temperatures and GHG is positive, which is in agreement with the hypothesis of a warming climate because of increasing levels of greenhouse gases. The prediction performance of
Espelt, Albert; Marí-Dell'Olmo, Marc; Penelo, Eva; Bosque-Prous, Marina
2016-06-14
To examine the differences between Prevalence Ratio (PR) and Odds Ratio (OR) in a cross-sectional study and to provide tools to calculate PR using two statistical packages widely used in substance use research (STATA and R). We used cross-sectional data from 41,263 participants of 16 European countries participating in the Survey on Health, Ageing and Retirement in Europe (SHARE). The dependent variable, hazardous drinking, was calculated using the Alcohol Use Disorders Identification Test - Consumption (AUDIT-C). The main independent variable was gender. Other variables used were: age, educational level and country of residence. PR of hazardous drinking in men with relation to women was estimated using Mantel-Haenszel method, log-binomial regression models and poisson regression models with robust variance. These estimations were compared to the OR calculated using logistic regression models. Prevalence of hazardous drinkers varied among countries. Generally, men have higher prevalence of hazardous drinking than women [PR=1.43 (1.38-1.47)]. Estimated PR was identical independently of the method and the statistical package used. However, OR overestimated PR, depending on the prevalence of hazardous drinking in the country. In cross-sectional studies, where comparisons between countries with differences in the prevalence of the disease or condition are made, it is advisable to use PR instead of OR.
International Nuclear Information System (INIS)
Urrutia, J D; Bautista, L A; Baccay, E B
2014-01-01
The aim of this study was to develop mathematical models for estimating earthquake casualties such as death, number of injured persons, affected families and total cost of damage. To quantify the direct damages from earthquakes to human beings and properties given the magnitude, intensity, depth of focus, location of epicentre and time duration, the regression models were made. The researchers formulated models through regression analysis using matrices and used α = 0.01. The study considered thirty destructive earthquakes that hit the Philippines from the inclusive years 1968 to 2012. Relevant data about these said earthquakes were obtained from Philippine Institute of Volcanology and Seismology. Data on damages and casualties were gathered from the records of National Disaster Risk Reduction and Management Council. This study will be of great value in emergency planning, initiating and updating programs for earthquake hazard reduction in the Philippines, which is an earthquake-prone country.
Cade, Brian S.; Noon, Barry R.; Scherer, Rick D.; Keane, John J.
2017-01-01
Counts of avian fledglings, nestlings, or clutch size that are bounded below by zero and above by some small integer form a discrete random variable distribution that is not approximated well by conventional parametric count distributions such as the Poisson or negative binomial. We developed a logistic quantile regression model to provide estimates of the empirical conditional distribution of a bounded discrete random variable. The logistic quantile regression model requires that counts are randomly jittered to a continuous random variable, logit transformed to bound them between specified lower and upper values, then estimated in conventional linear quantile regression, repeating the 3 steps and averaging estimates. Back-transformation to the original discrete scale relies on the fact that quantiles are equivariant to monotonic transformations. We demonstrate this statistical procedure by modeling 20 years of California Spotted Owl fledgling production (0−3 per territory) on the Lassen National Forest, California, USA, as related to climate, demographic, and landscape habitat characteristics at territories. Spotted Owl fledgling counts increased nonlinearly with decreasing precipitation in the early nesting period, in the winter prior to nesting, and in the prior growing season; with increasing minimum temperatures in the early nesting period; with adult compared to subadult parents; when there was no fledgling production in the prior year; and when percentage of the landscape surrounding nesting sites (202 ha) with trees ≥25 m height increased. Changes in production were primarily driven by changes in the proportion of territories with 2 or 3 fledglings. Average variances of the discrete cumulative distributions of the estimated fledgling counts indicated that temporal changes in climate and parent age class explained 18% of the annual variance in owl fledgling production, which was 34% of the total variance. Prior fledgling production explained as much of
Liu, Pudong; Shi, Runhe; Wang, Hong; Bai, Kaixu; Gao, Wei
2014-10-01
Leaf pigments are key elements for plant photosynthesis and growth. Traditional manual sampling of these pigments is labor-intensive and costly, which also has the difficulty in capturing their temporal and spatial characteristics. The aim of this work is to estimate photosynthetic pigments at large scale by remote sensing. For this purpose, inverse model were proposed with the aid of stepwise multiple linear regression (SMLR) analysis. Furthermore, a leaf radiative transfer model (i.e. PROSPECT model) was employed to simulate the leaf reflectance where wavelength varies from 400 to 780 nm at 1 nm interval, and then these values were treated as the data from remote sensing observations. Meanwhile, simulated chlorophyll concentration (Cab), carotenoid concentration (Car) and their ratio (Cab/Car) were taken as target to build the regression model respectively. In this study, a total of 4000 samples were simulated via PROSPECT with different Cab, Car and leaf mesophyll structures as 70% of these samples were applied for training while the last 30% for model validation. Reflectance (r) and its mathematic transformations (1/r and log (1/r)) were all employed to build regression model respectively. Results showed fair agreements between pigments and simulated reflectance with all adjusted coefficients of determination (R2) larger than 0.8 as 6 wavebands were selected to build the SMLR model. The largest value of R2 for Cab, Car and Cab/Car are 0.8845, 0.876 and 0.8765, respectively. Meanwhile, mathematic transformations of reflectance showed little influence on regression accuracy. We concluded that it was feasible to estimate the chlorophyll and carotenoids and their ratio based on statistical model with leaf reflectance data.
Zhai, Liang; Li, Shuang; Zou, Bin; Sang, Huiyong; Fang, Xin; Xu, Shan
2018-05-01
Considering the spatial non-stationary contributions of environment variables to PM2.5 variations, the geographically weighted regression (GWR) modeling method has been using to estimate PM2.5 concentrations widely. However, most of the GWR models in reported studies so far were established based on the screened predictors through pretreatment correlation analysis, and this process might cause the omissions of factors really driving PM2.5 variations. This study therefore developed a best subsets regression (BSR) enhanced principal component analysis-GWR (PCA-GWR) modeling approach to estimate PM2.5 concentration by fully considering all the potential variables' contributions simultaneously. The performance comparison experiment between PCA-GWR and regular GWR was conducted in the Beijing-Tianjin-Hebei (BTH) region over a one-year-period. Results indicated that the PCA-GWR modeling outperforms the regular GWR modeling with obvious higher model fitting- and cross-validation based adjusted R2 and lower RMSE. Meanwhile, the distribution map of PM2.5 concentration from PCA-GWR modeling also clearly depicts more spatial variation details in contrast to the one from regular GWR modeling. It can be concluded that the BSR enhanced PCA-GWR modeling could be a reliable way for effective air pollution concentration estimation in the coming future by involving all the potential predictor variables' contributions to PM2.5 variations.
Large biases in regression-based constituent flux estimates: causes and diagnostic tools
Hirsch, Robert M.
2014-01-01
It has been documented in the literature that, in some cases, widely used regression-based models can produce severely biased estimates of long-term mean river fluxes of various constituents. These models, estimated using sample values of concentration, discharge, and date, are used to compute estimated fluxes for a multiyear period at a daily time step. This study compares results of the LOADEST seven-parameter model, LOADEST five-parameter model, and the Weighted Regressions on Time, Discharge, and Season (WRTDS) model using subsampling of six very large datasets to better understand this bias problem. This analysis considers sample datasets for dissolved nitrate and total phosphorus. The results show that LOADEST-7 and LOADEST-5, although they often produce very nearly unbiased results, can produce highly biased results. This study identifies three conditions that can give rise to these severe biases: (1) lack of fit of the log of concentration vs. log discharge relationship, (2) substantial differences in the shape of this relationship across seasons, and (3) severely heteroscedastic residuals. The WRTDS model is more resistant to the bias problem than the LOADEST models but is not immune to them. Understanding the causes of the bias problem is crucial to selecting an appropriate method for flux computations. Diagnostic tools for identifying the potential for bias problems are introduced, and strategies for resolving bias problems are described.
Directory of Open Access Journals (Sweden)
Hongjian Wang
2014-01-01
Full Text Available We present a support vector regression-based adaptive divided difference filter (SVRADDF algorithm for improving the low state estimation accuracy of nonlinear systems, which are typically affected by large initial estimation errors and imprecise prior knowledge of process and measurement noises. The derivative-free SVRADDF algorithm is significantly simpler to compute than other methods and is implemented using only functional evaluations. The SVRADDF algorithm involves the use of the theoretical and actual covariance of the innovation sequence. Support vector regression (SVR is employed to generate the adaptive factor to tune the noise covariance at each sampling instant when the measurement update step executes, which improves the algorithm’s robustness. The performance of the proposed algorithm is evaluated by estimating states for (i an underwater nonmaneuvering target bearing-only tracking system and (ii maneuvering target bearing-only tracking in an air-traffic control system. The simulation results show that the proposed SVRADDF algorithm exhibits better performance when compared with a traditional DDF algorithm.
Cawyer, Chase R; Anderson, Sarah B; Szychowski, Jeff M; Neely, Cherry; Owen, John
2018-03-01
To compare the accuracy of a new regression-derived formula developed from the National Fetal Growth Studies data to the common alternative method that uses the average of the gestational ages (GAs) calculated for each fetal biometric measurement (biparietal diameter, head circumference, abdominal circumference, and femur length). This retrospective cross-sectional study identified nonanomalous singleton pregnancies that had a crown-rump length plus at least 1 additional sonographic examination with complete fetal biometric measurements. With the use of the crown-rump length to establish the referent estimated date of delivery, each method's (National Institute of Child Health and Human Development regression versus Hadlock average [Radiology 1984; 152:497-501]), error at every examination was computed. Error, defined as the difference between the crown-rump length-derived GA and each method's predicted GA (weeks), was compared in 3 GA intervals: 1 (14 weeks-20 weeks 6 days), 2 (21 weeks-28 weeks 6 days), and 3 (≥29 weeks). In addition, the proportion of each method's examinations that had errors outside prespecified (±) day ranges was computed by using odds ratios. A total of 16,904 sonograms were identified. The overall and prespecified GA range subset mean errors were significantly smaller for the regression compared to the average (P < .01), and the regression had significantly lower odds of observing examinations outside the specified range of error in GA intervals 2 (odds ratio, 1.15; 95% confidence interval, 1.01-1.31) and 3 (odds ratio, 1.24; 95% confidence interval, 1.17-1.32) than the average method. In a contemporary unselected population of women dated by a crown-rump length-derived GA, the National Institute of Child Health and Human Development regression formula produced fewer estimates outside a prespecified margin of error than the commonly used Hadlock average; the differences were most pronounced for GA estimates at 29 weeks and later.
Impact of regression methods on improved effects of soil structure on soil water retention estimates
Nguyen, Phuong Minh; De Pue, Jan; Le, Khoa Van; Cornelis, Wim
2015-06-01
Increasing the accuracy of pedotransfer functions (PTFs), an indirect method for predicting non-readily available soil features such as soil water retention characteristics (SWRC), is of crucial importance for large scale agro-hydrological modeling. Adding significant predictors (i.e., soil structure), and implementing more flexible regression algorithms are among the main strategies of PTFs improvement. The aim of this study was to investigate whether the improved effect of categorical soil structure information on estimating soil-water content at various matric potentials, which has been reported in literature, could be enduringly captured by regression techniques other than the usually applied linear regression. Two data mining techniques, i.e., Support Vector Machines (SVM), and k-Nearest Neighbors (kNN), which have been recently introduced as promising tools for PTF development, were utilized to test if the incorporation of soil structure will improve PTF's accuracy under a context of rather limited training data. The results show that incorporating descriptive soil structure information, i.e., massive, structured and structureless, as grouping criterion can improve the accuracy of PTFs derived by SVM approach in the range of matric potential of -6 to -33 kPa (average RMSE decreased up to 0.005 m3 m-3 after grouping, depending on matric potentials). The improvement was primarily attributed to the outperformance of SVM-PTFs calibrated on structureless soils. No improvement was obtained with kNN technique, at least not in our study in which the data set became limited in size after grouping. Since there is an impact of regression techniques on the improved effect of incorporating qualitative soil structure information, selecting a proper technique will help to maximize the combined influence of flexible regression algorithms and soil structure information on PTF accuracy.
Larson, Steven J.; Crawford, Charles G.; Gilliom, Robert J.
2004-01-01
Regression models were developed for predicting atrazine concentration distributions in rivers and streams, using the Watershed Regressions for Pesticides (WARP) methodology. Separate regression equations were derived for each of nine percentiles of the annual distribution of atrazine concentrations and for the annual time-weighted mean atrazine concentration. In addition, seasonal models were developed for two specific periods of the year--the high season, when the highest atrazine concentrations are expected in streams, and the low season, when concentrations are expected to be low or undetectable. Various nationally available watershed parameters were used as explanatory variables, including atrazine use intensity, soil characteristics, hydrologic parameters, climate and weather variables, land use, and agricultural management practices. Concentration data from 112 river and stream stations sampled as part of the U.S. Geological Survey's National Water-Quality Assessment and National Stream Quality Accounting Network Programs were used for computing the concentration percentiles and mean concentrations used as the response variables in regression models. Tobit regression methods, using maximum likelihood estimation, were used for developing the models because some of the concentration values used for the response variables were censored (reported as less than a detection threshold). Data from 26 stations not used for model development were used for model validation. The annual models accounted for 62 to 77 percent of the variability in concentrations among the 112 model development stations. Atrazine use intensity (the amount of atrazine used in the watershed divided by watershed area) was the most important explanatory variable in all models, but additional watershed parameters significantly increased the amount of variability explained by the models. Predicted concentrations from all 10 models were within a factor of 10 of the observed concentrations at most
Longitudinal beta regression models for analyzing health-related quality of life scores over time
Directory of Open Access Journals (Sweden)
Hunger Matthias
2012-09-01
Full Text Available Abstract Background Health-related quality of life (HRQL has become an increasingly important outcome parameter in clinical trials and epidemiological research. HRQL scores are typically bounded at both ends of the scale and often highly skewed. Several regression techniques have been proposed to model such data in cross-sectional studies, however, methods applicable in longitudinal research are less well researched. This study examined the use of beta regression models for analyzing longitudinal HRQL data using two empirical examples with distributional features typically encountered in practice. Methods We used SF-6D utility data from a German older age cohort study and stroke-specific HRQL data from a randomized controlled trial. We described the conceptual differences between mixed and marginal beta regression models and compared both models to the commonly used linear mixed model in terms of overall fit and predictive accuracy. Results At any measurement time, the beta distribution fitted the SF-6D utility data and stroke-specific HRQL data better than the normal distribution. The mixed beta model showed better likelihood-based fit statistics than the linear mixed model and respected the boundedness of the outcome variable. However, it tended to underestimate the true mean at the upper part of the distribution. Adjusted group means from marginal beta model and linear mixed model were nearly identical but differences could be observed with respect to standard errors. Conclusions Understanding the conceptual differences between mixed and marginal beta regression models is important for their proper use in the analysis of longitudinal HRQL data. Beta regression fits the typical distribution of HRQL data better than linear mixed models, however, if focus is on estimating group mean scores rather than making individual predictions, the two methods might not differ substantially.
The Roots of Inequality: Estimating Inequality of Opportunity from Regression Trees
DEFF Research Database (Denmark)
Brunori, Paolo; Hufe, Paul; Mahler, Daniel Gerszon
2017-01-01
the risk of arbitrary and ad-hoc model selection. Second, they provide a standardized way of trading off upward and downward biases in inequality of opportunity estimations. Finally, regression trees can be graphically represented; their structure is immediate to read and easy to understand. This will make...... the measurement of inequality of opportunity more easily comprehensible to a large audience. These advantages are illustrated by an empirical application based on the 2011 wave of the European Union Statistics on Income and Living Conditions....
Yang, Duo; Zhang, Xu; Pan, Rui; Wang, Yujie; Chen, Zonghai
2018-04-01
The state-of-health (SOH) estimation is always a crucial issue for lithium-ion batteries. In order to provide an accurate and reliable SOH estimation, a novel Gaussian process regression (GPR) model based on charging curve is proposed in this paper. Different from other researches where SOH is commonly estimated by cycle life, in this work four specific parameters extracted from charging curves are used as inputs of the GPR model instead of cycle numbers. These parameters can reflect the battery aging phenomenon from different angles. The grey relational analysis method is applied to analyze the relational grade between selected features and SOH. On the other hand, some adjustments are made in the proposed GPR model. Covariance function design and the similarity measurement of input variables are modified so as to improve the SOH estimate accuracy and adapt to the case of multidimensional input. Several aging data from NASA data repository are used for demonstrating the estimation effect by the proposed method. Results show that the proposed method has high SOH estimation accuracy. Besides, a battery with dynamic discharging profile is used to verify the robustness and reliability of this method.
Moore, Richard Bridge; Johnston, Craig M.; Robinson, Keith W.; Deacon, Jeffrey R.
2004-01-01
The U.S. Geological Survey (USGS), in cooperation with the U.S. Environmental Protection Agency (USEPA) and the New England Interstate Water Pollution Control Commission (NEIWPCC), has developed a water-quality model, called SPARROW (Spatially Referenced Regressions on Watershed Attributes), to assist in regional total maximum daily load (TMDL) and nutrient-criteria activities in New England. SPARROW is a spatially detailed, statistical model that uses regression equations to relate total nitrogen and phosphorus (nutrient) stream loads to nutrient sources and watershed characteristics. The statistical relations in these equations are then used to predict nutrient loads in unmonitored streams. The New England SPARROW models are built using a hydrologic network of 42,000 stream reaches and associated watersheds. Watershed boundaries are defined for each stream reach in the network through the use of a digital elevation model and existing digitized watershed divides. Nutrient source data is from permitted wastewater discharge data from USEPA's Permit Compliance System (PCS), various land-use sources, and atmospheric deposition. Physical watershed characteristics include drainage area, land use, streamflow, time-of-travel, stream density, percent wetlands, slope of the land surface, and soil permeability. The New England SPARROW models for total nitrogen and total phosphorus have R-squared values of 0.95 and 0.94, with mean square errors of 0.16 and 0.23, respectively. Variables that were statistically significant in the total nitrogen model include permitted municipal-wastewater discharges, atmospheric deposition, agricultural area, and developed land area. Total nitrogen stream-loss rates were significant only in streams with average annual flows less than or equal to 2.83 cubic meters per second. In streams larger than this, there is nondetectable in-stream loss of annual total nitrogen in New England. Variables that were statistically significant in the total
Estimating the Impact of Urbanization on Air Quality in China Using Spatial Regression Models
Directory of Open Access Journals (Sweden)
Chuanglin Fang
2015-11-01
Full Text Available Urban air pollution is one of the most visible environmental problems to have accompanied China’s rapid urbanization. Based on emission inventory data from 2014, gathered from 289 cities, we used Global and Local Moran’s I to measure the spatial autorrelation of Air Quality Index (AQI values at the city level, and employed Ordinary Least Squares (OLS, Spatial Lag Model (SAR, and Geographically Weighted Regression (GWR to quantitatively estimate the comprehensive impact and spatial variations of China’s urbanization process on air quality. The results show that a significant spatial dependence and heterogeneity existed in AQI values. Regression models revealed urbanization has played an important negative role in determining air quality in Chinese cities. The population, urbanization rate, automobile density, and the proportion of secondary industry were all found to have had a significant influence over air quality. Per capita Gross Domestic Product (GDP and the scale of urban land use, however, failed the significance test at 10% level. The GWR model performed better than global models and the results of GWR modeling show that the relationship between urbanization and air quality was not constant in space. Further, the local parameter estimates suggest significant spatial variation in the impacts of various urbanization factors on air quality.
Flexible regression models for estimating postmortem interval (PMI) in forensic medicine.
Muñoz Barús, José Ignacio; Febrero-Bande, Manuel; Cadarso-Suárez, Carmen
2008-10-30
Correct determination of time of death is an important goal in forensic medicine. Numerous methods have been described for estimating postmortem interval (PMI), but most are imprecise, poorly reproducible and/or have not been validated with real data. In recent years, however, some progress in PMI estimation has been made, notably through the use of new biochemical methods for quantifying relevant indicator compounds in the vitreous humour. The best, but unverified, results have been obtained with [K+] and hypoxanthine [Hx], using simple linear regression (LR) models. The main aim of this paper is to offer more flexible alternatives to LR, such as generalized additive models (GAMs) and support vector machines (SVMs) in order to obtain improved PMI estimates. The present study, based on detailed analysis of [K+] and [Hx] in more than 200 vitreous humour samples from subjects with known PMI, compared classical LR methodology with GAM and SVM methodologies. Both proved better than LR for estimation of PMI. SVM showed somewhat greater precision than GAM, but GAM offers a readily interpretable graphical output, facilitating understanding of findings by legal professionals; there are thus arguments for using both types of models. R code for these methods is available from the authors, permitting accurate prediction of PMI from vitreous humour [K+], [Hx] and [U], with confidence intervals and graphical output provided. Copyright 2008 John Wiley & Sons, Ltd.
Evaluation of Regression and Neuro_Fuzzy Models in Estimating Saturated Hydraulic Conductivity
Directory of Open Access Journals (Sweden)
J. Behmanesh
2015-06-01
Full Text Available Study of soil hydraulic properties such as saturated and unsaturated hydraulic conductivity is required in the environmental investigations. Despite numerous research, measuring saturated hydraulic conductivity using by direct methods are still costly, time consuming and professional. Therefore estimating saturated hydraulic conductivity using rapid and low cost methods such as pedo-transfer functions with acceptable accuracy was developed. The purpose of this research was to compare and evaluate 11 pedo-transfer functions and Adaptive Neuro-Fuzzy Inference System (ANFIS to estimate saturated hydraulic conductivity of soil. In this direct, saturated hydraulic conductivity and physical properties in 40 points of Urmia were calculated. The soil excavated was used in the lab to determine its easily accessible parameters. The results showed that among existing models, Aimrun et al model had the best estimation for soil saturated hydraulic conductivity. For mentioned model, the Root Mean Square Error and Mean Absolute Error parameters were 0.174 and 0.028 m/day respectively. The results of the present research, emphasises the importance of effective porosity application as an important accessible parameter in accuracy of pedo-transfer functions. sand and silt percent, bulk density and soil particle density were selected to apply in 561 ANFIS models. In training phase of best ANFIS model, the R2 and RMSE were calculated 1 and 1.2×10-7 respectively. These amounts in the test phase were 0.98 and 0.0006 respectively. Comparison of regression and ANFIS models showed that the ANFIS model had better results than regression functions. Also Nuro-Fuzzy Inference System had capability to estimatae with high accuracy in various soil textures.
Directory of Open Access Journals (Sweden)
Gholam Reza Sheykhzadeh
2017-02-01
Full Text Available Introduction: Penetration resistance is one of the criteria for evaluating soil compaction. It correlates with several soil properties such as vehicle trafficability, resistance to root penetration, seedling emergence, and soil compaction by farm machinery. Direct measurement of penetration resistance is time consuming and difficult because of high temporal and spatial variability. Therefore, many different regressions and artificial neural network pedotransfer functions have been proposed to estimate penetration resistance from readily available soil variables such as particle size distribution, bulk density (Db and gravimetric water content (θm. The lands of Ardabil Province are one of the main production regions of potato in Iran, thus, obtaining the soil penetration resistance in these regions help with the management of potato production. The objective of this research was to derive pedotransfer functions by using regression and artificial neural network to predict penetration resistance from some soil variations in the agricultural soils of Ardabil plain and to compare the performance of artificial neural network with regression models. Materials and methods: Disturbed and undisturbed soil samples (n= 105 were systematically taken from 0-10 cm soil depth with nearly 3000 m distance in the agricultural lands of the Ardabil plain ((lat 38°15' to 38°40' N, long 48°16' to 48°61' E. The contents of sand, silt and clay (hydrometer method, CaCO3 (titration method, bulk density (cylinder method, particle density (Dp (pychnometer method, organic carbon (wet oxidation method, total porosity(calculating from Db and Dp, saturated (θs and field soil water (θf using the gravimetric method were measured in the laboratory. Mean geometric diameter (dg and standard deviation (σg of soil particles were computed using the percentages of sand, silt and clay. Penetration resistance was measured in situ using cone penetrometer (analog model at 10
Jones, Kelly W; Lewis, David J
2015-01-01
Deforestation and conversion of native habitats continues to be the leading driver of biodiversity and ecosystem service loss. A number of conservation policies and programs are implemented--from protected areas to payments for ecosystem services (PES)--to deter these losses. Currently, empirical evidence on whether these approaches stop or slow land cover change is lacking, but there is increasing interest in conducting rigorous, counterfactual impact evaluations, especially for many new conservation approaches, such as PES and REDD, which emphasize additionality. In addition, several new, globally available and free high-resolution remote sensing datasets have increased the ease of carrying out an impact evaluation on land cover change outcomes. While the number of conservation evaluations utilizing 'matching' to construct a valid control group is increasing, the majority of these studies use simple differences in means or linear cross-sectional regression to estimate the impact of the conservation program using this matched sample, with relatively few utilizing fixed effects panel methods--an alternative estimation method that relies on temporal variation in the data. In this paper we compare the advantages and limitations of (1) matching to construct the control group combined with differences in means and cross-sectional regression, which control for observable forms of bias in program evaluation, to (2) fixed effects panel methods, which control for observable and time-invariant unobservable forms of bias, with and without matching to create the control group. We then use these four approaches to estimate forest cover outcomes for two conservation programs: a PES program in Northeastern Ecuador and strict protected areas in European Russia. In the Russia case we find statistically significant differences across estimators--due to the presence of unobservable bias--that lead to differences in conclusions about effectiveness. The Ecuador case illustrates that
Directory of Open Access Journals (Sweden)
Kelly W Jones
Full Text Available Deforestation and conversion of native habitats continues to be the leading driver of biodiversity and ecosystem service loss. A number of conservation policies and programs are implemented--from protected areas to payments for ecosystem services (PES--to deter these losses. Currently, empirical evidence on whether these approaches stop or slow land cover change is lacking, but there is increasing interest in conducting rigorous, counterfactual impact evaluations, especially for many new conservation approaches, such as PES and REDD, which emphasize additionality. In addition, several new, globally available and free high-resolution remote sensing datasets have increased the ease of carrying out an impact evaluation on land cover change outcomes. While the number of conservation evaluations utilizing 'matching' to construct a valid control group is increasing, the majority of these studies use simple differences in means or linear cross-sectional regression to estimate the impact of the conservation program using this matched sample, with relatively few utilizing fixed effects panel methods--an alternative estimation method that relies on temporal variation in the data. In this paper we compare the advantages and limitations of (1 matching to construct the control group combined with differences in means and cross-sectional regression, which control for observable forms of bias in program evaluation, to (2 fixed effects panel methods, which control for observable and time-invariant unobservable forms of bias, with and without matching to create the control group. We then use these four approaches to estimate forest cover outcomes for two conservation programs: a PES program in Northeastern Ecuador and strict protected areas in European Russia. In the Russia case we find statistically significant differences across estimators--due to the presence of unobservable bias--that lead to differences in conclusions about effectiveness. The Ecuador case
A regressive methodology for estimating missing data in rainfall daily time series
Barca, E.; Passarella, G.
2009-04-01
The "presence" of gaps in environmental data time series represents a very common, but extremely critical problem, since it can produce biased results (Rubin, 1976). Missing data plagues almost all surveys. The problem is how to deal with missing data once it has been deemed impossible to recover the actual missing values. Apart from the amount of missing data, another issue which plays an important role in the choice of any recovery approach is the evaluation of "missingness" mechanisms. When data missing is conditioned by some other variable observed in the data set (Schafer, 1997) the mechanism is called MAR (Missing at Random). Otherwise, when the missingness mechanism depends on the actual value of the missing data, it is called NCAR (Not Missing at Random). This last is the most difficult condition to model. In the last decade interest arose in the estimation of missing data by using regression (single imputation). More recently multiple imputation has become also available, which returns a distribution of estimated values (Scheffer, 2002). In this paper an automatic methodology for estimating missing data is presented. In practice, given a gauging station affected by missing data (target station), the methodology checks the randomness of the missing data and classifies the "similarity" between the target station and the other gauging stations spread over the study area. Among different methods useful for defining the similarity degree, whose effectiveness strongly depends on the data distribution, the Spearman correlation coefficient was chosen. Once defined the similarity matrix, a suitable, nonparametric, univariate, and regressive method was applied in order to estimate missing data in the target station: the Theil method (Theil, 1950). Even though the methodology revealed to be rather reliable an improvement of the missing data estimation can be achieved by a generalization. A first possible improvement consists in extending the univariate technique to
Tang, Robert; Tian, Linwei; Thach, Thuan-Quoc; Tsui, Tsz Him; Brauer, Michael; Lee, Martha; Allen, Ryan; Yuchi, Weiran; Lai, Poh-Chin; Wong, Paulina; Barratt, Benjamin
2018-04-01
Epidemiological studies typically use subjects' residential address to estimate individuals' air pollution exposure. However, in reality this exposure is rarely static as people move from home to work/study locations and commute during the day. Integrating mobility and time-activity data may reduce errors and biases, thereby improving estimates of health risks. To incorporate land use regression with movement and building infiltration data to estimate time-weighted air pollution exposures stratified by age, sex, and employment status for population subgroups in Hong Kong. A large population-representative survey (N = 89,385) was used to characterize travel behavior, and derive time-activity pattern for each subject. Infiltration factors calculated from indoor/outdoor monitoring campaigns were used to estimate micro-environmental concentrations. We evaluated dynamic and static (residential location-only) exposures in a staged modeling approach to quantify effects of each component. Higher levels of exposures were found for working adults and students due to increased mobility. Compared to subjects aged 65 or older, exposures to PM 2.5 , BC, and NO 2 were 13%, 39% and 14% higher, respectively for subjects aged below 18, and 3%, 18% and 11% higher, respectively for working adults. Exposures of females were approximately 4% lower than those of males. Dynamic exposures were around 20% lower than ambient exposures at residential addresses. The incorporation of infiltration and mobility increased heterogeneity in population exposure and allowed identification of highly exposed groups. The use of ambient concentrations may lead to exposure misclassification which introduces bias, resulting in lower effect estimates than 'true' exposures. Copyright © 2018 Elsevier Ltd. All rights reserved.
Improved regression models for ventilation estimation based on chest and abdomen movements
International Nuclear Information System (INIS)
Liu, Shaopeng; Gao, Robert; He, Qingbo; Staudenmayer, John; Freedson, Patty
2012-01-01
Non-invasive estimation of minute ventilation is important for quantifying the intensity of physical activity of individuals. In this paper, several improved regression models are presented, based on the measurement of chest and abdomen movements from sensor belts worn by subjects (n = 50) engaged in 14 types of physical activity. Five linear models involving a combination of 11 features were developed, and the effects of different model training approaches and window sizes for computing the features were investigated. The performance of the models was evaluated using experimental data collected during the physical activity protocol. The predicted minute ventilation was compared to the criterion ventilation measured using a bidirectional digital volume transducer housed in a respiratory gas exchange system. The results indicate that the inclusion of breathing frequency and the use of percentile points instead of interdecile ranges over a 60 s window size reduced error by about 43%, when applied to the classical two-degrees-of-freedom model. The mean percentage error of the minute ventilation estimated for all the activities was below 7.5%, verifying reasonably good performance of the models and the applicability of the wearable sensing system for minute ventilation estimation during physical activity. (paper)
Oldenburg, Catherine E; Venkatesh Prajna, N; Krishnan, Tiruvengada; Rajaraman, Revathi; Srinivasan, Muthiah; Ray, Kathryn J; O'Brien, Kieran S; Glymour, M Maria; Porco, Travis C; Acharya, Nisha R; Rose-Nussbaumer, Jennifer; Lietman, Thomas M
2018-08-01
We compare results from regression discontinuity (RD) analysis to primary results of a randomized controlled trial (RCT) utilizing data from two contemporaneous RCTs for treatment of fungal corneal ulcers. Patients were enrolled in the Mycotic Ulcer Treatment Trials I and II (MUTT I & MUTT II) based on baseline visual acuity: patients with acuity ≤ 20/400 (logMAR 1.3) enrolled in MUTT I, and >20/400 in MUTT II. MUTT I investigated the effect of topical natamycin versus voriconazole on best spectacle-corrected visual acuity. MUTT II investigated the effect of topical voriconazole plus placebo versus topical voriconazole plus oral voriconazole. We compared the RD estimate (natamycin arm of MUTT I [N = 162] versus placebo arm of MUTT II [N = 54]) to the RCT estimate from MUTT I (topical natamycin [N = 162] versus topical voriconazole [N = 161]). In the RD, patients receiving natamycin had mean improvement of 4-lines of visual acuity at 3 months (logMAR -0.39, 95% CI: -0.61, -0.17) compared to topical voriconazole plus placebo, and 2-lines in the RCT (logMAR -0.18, 95% CI: -0.30, -0.05) compared to topical voriconazole. The RD and RCT estimates were similar, although the RD design overestimated effects compared to the RCT.
Spady, Richard; Stouli, Sami
2012-01-01
We propose dual regression as an alternative to the quantile regression process for the global estimation of conditional distribution functions under minimal assumptions. Dual regression provides all the interpretational power of the quantile regression process while avoiding the need for repairing the intersecting conditional quantile surfaces that quantile regression often produces in practice. Our approach introduces a mathematical programming characterization of conditional distribution f...
Novikov, I; Fund, N; Freedman, L S
2010-01-15
Different methods for the calculation of sample size for simple logistic regression (LR) with one normally distributed continuous covariate give different results. Sometimes the difference can be large. Furthermore, some methods require the user to specify the prevalence of cases when the covariate equals its population mean, rather than the more natural population prevalence. We focus on two commonly used methods and show through simulations that the power for a given sample size may differ substantially from the nominal value for one method, especially when the covariate effect is large, while the other method performs poorly if the user provides the population prevalence instead of the required parameter. We propose a modification of the method of Hsieh et al. that requires specification of the population prevalence and that employs Schouten's sample size formula for a t-test with unequal variances and group sizes. This approach appears to increase the accuracy of the sample size estimates for LR with one continuous covariate.
A Note on Penalized Regression Spline Estimation in the Secondary Analysis of Case-Control Data
Gazioglu, Suzan; Wei, Jiawei; Jennings, Elizabeth M.; Carroll, Raymond J.
2013-01-01
Primary analysis of case-control studies focuses on the relationship between disease (D) and a set of covariates of interest (Y, X). A secondary application of the case-control study, often invoked in modern genetic epidemiologic association studies, is to investigate the interrelationship between the covariates themselves. The task is complicated due to the case-control sampling, and to avoid the biased sampling that arises from the design, it is typical to use the control data only. In this paper, we develop penalized regression spline methodology that uses all the data, and improves precision of estimation compared to using only the controls. A simulation study and an empirical example are used to illustrate the methodology.
A Note on Penalized Regression Spline Estimation in the Secondary Analysis of Case-Control Data
Gazioglu, Suzan
2013-05-25
Primary analysis of case-control studies focuses on the relationship between disease (D) and a set of covariates of interest (Y, X). A secondary application of the case-control study, often invoked in modern genetic epidemiologic association studies, is to investigate the interrelationship between the covariates themselves. The task is complicated due to the case-control sampling, and to avoid the biased sampling that arises from the design, it is typical to use the control data only. In this paper, we develop penalized regression spline methodology that uses all the data, and improves precision of estimation compared to using only the controls. A simulation study and an empirical example are used to illustrate the methodology.
Messier, Kyle P; Akita, Yasuyuki; Serre, Marc L
2012-03-06
Geographic information systems (GIS) based techniques are cost-effective and efficient methods used by state agencies and epidemiology researchers for estimating concentration and exposure. However, budget limitations have made statewide assessments of contamination difficult, especially in groundwater media. Many studies have implemented address geocoding, land use regression, and geostatistics independently, but this is the first to examine the benefits of integrating these GIS techniques to address the need of statewide exposure assessments. A novel framework for concentration exposure is introduced that integrates address geocoding, land use regression (LUR), below detect data modeling, and Bayesian Maximum Entropy (BME). A LUR model was developed for tetrachloroethylene that accounts for point sources and flow direction. We then integrate the LUR model into the BME method as a mean trend while also modeling below detects data as a truncated Gaussian probability distribution function. We increase available PCE data 4.7 times from previously available databases through multistage geocoding. The LUR model shows significant influence of dry cleaners at short ranges. The integration of the LUR model as mean trend in BME results in a 7.5% decrease in cross validation mean square error compared to BME with a constant mean trend.
Al-Mudhafar, W. J.
2013-12-01
Precisely prediction of rock facies leads to adequate reservoir characterization by improving the porosity-permeability relationships to estimate the properties in non-cored intervals. It also helps to accurately identify the spatial facies distribution to perform an accurate reservoir model for optimal future reservoir performance. In this paper, the facies estimation has been done through Multinomial logistic regression (MLR) with respect to the well logs and core data in a well in upper sandstone formation of South Rumaila oil field. The entire independent variables are gamma rays, formation density, water saturation, shale volume, log porosity, core porosity, and core permeability. Firstly, Robust Sequential Imputation Algorithm has been considered to impute the missing data. This algorithm starts from a complete subset of the dataset and estimates sequentially the missing values in an incomplete observation by minimizing the determinant of the covariance of the augmented data matrix. Then, the observation is added to the complete data matrix and the algorithm continues with the next observation with missing values. The MLR has been chosen to estimate the maximum likelihood and minimize the standard error for the nonlinear relationships between facies & core and log data. The MLR is used to predict the probabilities of the different possible facies given each independent variable by constructing a linear predictor function having a set of weights that are linearly combined with the independent variables by using a dot product. Beta distribution of facies has been considered as prior knowledge and the resulted predicted probability (posterior) has been estimated from MLR based on Baye's theorem that represents the relationship between predicted probability (posterior) with the conditional probability and the prior knowledge. To assess the statistical accuracy of the model, the bootstrap should be carried out to estimate extra-sample prediction error by randomly
Jones, Kelly W.; Lewis, David J.
2015-01-01
Deforestation and conversion of native habitats continues to be the leading driver of biodiversity and ecosystem service loss. A number of conservation policies and programs are implemented—from protected areas to payments for ecosystem services (PES)—to deter these losses. Currently, empirical evidence on whether these approaches stop or slow land cover change is lacking, but there is increasing interest in conducting rigorous, counterfactual impact evaluations, especially for many new conservation approaches, such as PES and REDD, which emphasize additionality. In addition, several new, globally available and free high-resolution remote sensing datasets have increased the ease of carrying out an impact evaluation on land cover change outcomes. While the number of conservation evaluations utilizing ‘matching’ to construct a valid control group is increasing, the majority of these studies use simple differences in means or linear cross-sectional regression to estimate the impact of the conservation program using this matched sample, with relatively few utilizing fixed effects panel methods—an alternative estimation method that relies on temporal variation in the data. In this paper we compare the advantages and limitations of (1) matching to construct the control group combined with differences in means and cross-sectional regression, which control for observable forms of bias in program evaluation, to (2) fixed effects panel methods, which control for observable and time-invariant unobservable forms of bias, with and without matching to create the control group. We then use these four approaches to estimate forest cover outcomes for two conservation programs: a PES program in Northeastern Ecuador and strict protected areas in European Russia. In the Russia case we find statistically significant differences across estimators—due to the presence of unobservable bias—that lead to differences in conclusions about effectiveness. The Ecuador case
International Nuclear Information System (INIS)
Harlim, John; Mahdi, Adam; Majda, Andrew J.
2014-01-01
A central issue in contemporary science is the development of nonlinear data driven statistical–dynamical models for time series of noisy partial observations from nature or a complex model. It has been established recently that ad-hoc quadratic multi-level regression models can have finite-time blow-up of statistical solutions and/or pathological behavior of their invariant measure. Recently, a new class of physics constrained nonlinear regression models were developed to ameliorate this pathological behavior. Here a new finite ensemble Kalman filtering algorithm is developed for estimating the state, the linear and nonlinear model coefficients, the model and the observation noise covariances from available partial noisy observations of the state. Several stringent tests and applications of the method are developed here. In the most complex application, the perfect model has 57 degrees of freedom involving a zonal (east–west) jet, two topographic Rossby waves, and 54 nonlinearly interacting Rossby waves; the perfect model has significant non-Gaussian statistics in the zonal jet with blocked and unblocked regimes and a non-Gaussian skewed distribution due to interaction with the other 56 modes. We only observe the zonal jet contaminated by noise and apply the ensemble filter algorithm for estimation. Numerically, we find that a three dimensional nonlinear stochastic model with one level of memory mimics the statistical effect of the other 56 modes on the zonal jet in an accurate fashion, including the skew non-Gaussian distribution and autocorrelation decay. On the other hand, a similar stochastic model with zero memory levels fails to capture the crucial non-Gaussian behavior of the zonal jet from the perfect 57-mode model
Linear regressive model structures for estimation and prediction of compartmental diffusive systems
Vries, D; Keesman, K.J.; Zwart, Heiko J.
In input-output relations of (compartmental) diffusive systems, physical parameters appear non-linearly, resulting in the use of (constrained) non-linear parameter estimation techniques with its short-comings regarding global optimality and computational effort. Given a LTI system in state space
Linear regressive model structures for estimation and prediction of compartmental diffusive systems
Vries, D.; Keesman, K.J.; Zwart, H.
2006-01-01
Abstract In input-output relations of (compartmental) diffusive systems, physical parameters appear non-linearly, resulting in the use of (constrained) non-linear parameter estimation techniques with its short-comings regarding global optimality and computational effort. Given a LTI system in state
National Research Council Canada - National Science Library
Bielecki, John
2003-01-01
.... Previous research has demonstrated the use of a two-step logistic and multiple regression methodology to predicting cost growth produces desirable results versus traditional single-step regression...
The Growth Points of Regional Economy and Regression Estimation for Branch Investment Multipliers
Directory of Open Access Journals (Sweden)
Nina Pavlovna Goridko
2018-03-01
Full Text Available The article develops the methodology of using investment multipliers to identify growth points for a regional economy. The paper discusses various options for the assessment of multiplicative effects caused by investments in certain sectors of the economy. All calculations are carried out on the example of economy of the Republic of Tatarstan for the period 2005–2015. The instrument of regression modeling using the method of least squares, permits to estimate sectoral and cross-sectoral investment multipliers in the economy of the Republic of Tatarstan. Moreover, this method allows to assess the elasticity of gross output of regional economy and its individual sectors depending on investment in various sectors of the economy. Calculations results allowed to identify three growth points of the economy of the Republic of Tatarstan. They are mining industry, manufacturing industry and construction. The success of a particular industry or sub-industry in a country or a region should be measured not only by its share in macro-system’s gross output or value added, but also by the multiplicative effect that investments in the industry have on the development of other industries, on employment and on general national or regional product. In recent years, the growth of the Russian was close to zero. Thus, it is crucial to understand the structural consequences of the increasing investments in various sectors of the Russian economy. In this regard, the problems solved in the article are relevant for a number of countries and regions with a similar economic situation. The obtained results can be applied for similar estimations of investment multipliers as well as multipliers of government spending, and other components of aggregate demand in various countries and regions to identify growth points. Investments in these growth points will induce the greatest and the most evident increment of the outcome from the macro-system’s economic activities.
Brakebill, J.W.; Wolock, D.M.; Terziotti, S.E.
2011-01-01
Digital hydrologic networks depicting surface-water pathways and their associated drainage catchments provide a key component to hydrologic analysis and modeling. Collectively, they form common spatial units that can be used to frame the descriptions of aquatic and watershed processes. In addition, they provide the ability to simulate and route the movement of water and associated constituents throughout the landscape. Digital hydrologic networks have evolved from derivatives of mapping products to detailed, interconnected, spatially referenced networks of water pathways, drainage areas, and stream and watershed characteristics. These properties are important because they enhance the ability to spatially evaluate factors that affect the sources and transport of water-quality constituents at various scales. SPAtially Referenced Regressions On Watershed attributes (SPARROW), a process-based/statistical model, relies on a digital hydrologic network in order to establish relations between quantities of monitored contaminant flux, contaminant sources, and the associated physical characteristics affecting contaminant transport. Digital hydrologic networks modified from the River Reach File (RF1) and National Hydrography Dataset (NHD) geospatial datasets provided frameworks for SPARROW in six regions of the conterminous United States. In addition, characteristics of the modified RF1 were used to update estimates of mean-annual streamflow. This produced more current flow estimates for use in SPARROW modeling. ?? 2011 American Water Resources Association. This article is a U.S. Government work and is in the public domain in the USA.
Estimation of Chinese surface NO2 concentrations combining satellite data and Land Use Regression
Anand, J.; Monks, P.
2016-12-01
Monitoring surface-level air quality is often limited by in-situ instrument placement and issues arising from harmonisation over long timescales. Satellite instruments can offer a synoptic view of regional pollution sources, but in many cases only a total or tropospheric column can be measured. In this work a new technique of estimating surface NO2 combining both satellite and in-situ data is presented, in which a Land Use Regression (LUR) model is used to create high resolution pollution maps based on known predictor variables such as population density, road networks, and land cover. By employing a mixed effects approach, it is possible to take advantage of the spatiotemporal variability in the satellite-derived column densities to account for daily and regional variations in surface NO2 caused by factors such as temperature, elevation, and wind advection. In this work, surface NO2 maps are modelled over the North China Plain and Pearl River Delta during high-pollution episodes by combining in-situ measurements and tropospheric columns from the Ozone Monitoring Instrument (OMI). The modelled concentrations show good agreement with in-situ data and surface NO2 concentrations derived from the MACC-II global reanalysis.
Estimated prevalence of halitosis: a systematic review and meta-regression analysis.
Silva, Manuela F; Leite, Fábio R M; Ferreira, Larissa B; Pola, Natália M; Scannapieco, Frank A; Demarco, Flávio F; Nascimento, Gustavo G
2018-01-01
This study aims to conduct a systematic review to determine the prevalence of halitosis in adolescents and adults. Electronic searches were performed using four different databases without restrictions: PubMed, Scopus, Web of Science, and SciELO. Population-based observational studies that provided data about the prevalence of halitosis in adolescents and adults were included. Additionally, meta-analyses, meta-regression, and sensitivity analyses were conducted to synthesize the evidence. A total of 584 articles were initially found and considered for title and abstract evaluation. Thirteen articles met inclusion criteria. The combined prevalence of halitosis was found to be 31.8% (95% CI 24.6-39.0%). Methodological aspects such as the year of publication and the socioeconomic status of the country where the study was conducted seemed to influence the prevalence of halitosis. Our results demonstrated that the estimated prevalence of halitosis was 31.8%, with high heterogeneity between studies. The results suggest a worldwide trend towards a rise in halitosis prevalence. Given the high prevalence of halitosis and its complex etiology, dental professionals should be aware of their roles in halitosis prevention and treatment.
ESTIMATION OF GENETIC PARAMETERS IN TROPICARNE CATTLE WITH RANDOM REGRESSION MODELS USING B-SPLINES
Directory of Open Access Journals (Sweden)
Joel DomÃnguez Viveros
2015-04-01
Full Text Available The objectives were to estimate variance components, and direct (h2 and maternal (m2 heritability in the growth of Tropicarne cattle based on a random regression model using B-Splines for random effects modeling. Information from 12 890 monthly weightings of 1787 calves, from birth to 24 months old, was analyzed. The pedigree included 2504 animals. The random effects model included genetic and permanent environmental (direct and maternal of cubic order, and residuals. The fixed effects included contemporaneous groups (year â€“ season of weighed, sex and the covariate age of the cow (linear and quadratic. The B-Splines were defined in four knots through the growth period analyzed. Analyses were performed with the software Wombat. The variances (phenotypic and residual presented a similar behavior; of 7 to 12 months of age had a negative trend; from birth to 6 months and 13 to 18 months had positive trend; after 19 months were maintained constant. The m2 were low and near to zero, with an average of 0.06 in an interval of 0.04 to 0.11; the h2 also were close to zero, with an average of 0.10 in an interval of 0.03 to 0.23.
An Analysis of Bank Service Satisfaction Based on Quantile Regression and Grey Relational Analysis
Directory of Open Access Journals (Sweden)
Wen-Tsao Pan
2016-01-01
Full Text Available Bank service satisfaction is vital to the success of a bank. In this paper, we propose to use the grey relational analysis to gauge the levels of service satisfaction of the banks. With the grey relational analysis, we compared the effects of different variables on service satisfaction. We gave ranks to the banks according to their levels of service satisfaction. We further used the quantile regression model to find the variables that affected the satisfaction of a customer at a specific quantile of satisfaction level. The result of the quantile regression analysis provided a bank manager with information to formulate policies to further promote satisfaction of the customers at different quantiles of satisfaction level. We also compared the prediction accuracies of the regression models at different quantiles. The experiment result showed that, among the seven quantile regression models, the median regression model has the best performance in terms of RMSE, RTIC, and CE performance measures.
Directory of Open Access Journals (Sweden)
Y. Plancherel
2013-07-01
Full Text Available Quantifying oceanic anthropogenic carbon uptake by monitoring interior dissolved inorganic carbon (DIC concentrations is complicated by the influence of natural variability. The "eMLR method" aims to address this issue by using empirical regression fits of the data instead of the data themselves, inferring the change in anthropogenic carbon in time by difference between predictions generated by the regressions at each time. The advantages of the method are that it provides in principle a means to filter out natural variability, which theoretically becomes the regression residuals, and a way to deal with sparsely and unevenly distributed data. The degree to which these advantages are realized in practice is unclear, however. The ability of the eMLR method to recover the anthropogenic carbon signal is tested here using a global circulation and biogeochemistry model in which the true signal is known. Results show that regression model selection is particularly important when the observational network changes in time. When the observational network is fixed, the likelihood that co-located systematic misfits between the empirical model and the underlying, yet unknown, true model cancel is greater, improving eMLR results. Changing the observational network modifies how the spatio-temporal variance pattern is captured by the respective datasets, resulting in empirical models that are dynamically or regionally inconsistent, leading to systematic errors. In consequence, the use of regression formulae that change in time to represent systematically best-fit models at all times does not guarantee the best estimates of anthropogenic carbon change if the spatial distributions of the stations emphasize hydrographic features differently in time. Other factors, such as a balanced and representative station coverage, vertical continuity of the regression formulae consistent with the hydrographic context and resiliency of the spatial distribution of the residual
International Nuclear Information System (INIS)
Fruehwirth, R.
1993-01-01
We present an estimation procedure of the error components in a linear regression model with multiple independent stochastic error contributions. After solving the general problem we apply the results to the estimation of the actual trajectory in track fitting with multiple scattering. (orig.)
Dynamical System and Nonlinear Regression for Estimate Host-Parasitoid Relationship
Directory of Open Access Journals (Sweden)
Ileana Miranda Cabrera
2010-01-01
Full Text Available The complex relationships of a crop with the pest, its natural enemies, and the climate factors exist in all the ecosystems, but the mathematic models has studied only some components to know the relation cause-effect. The most studied system has been concerned with the relationship pest-natural enemies such as prey-predator or host-parasitoid. The present paper shows a dynamical system for studying the relationship host-parasitoid (Diaphorina citri, Tamarixia radiata and shows that a nonlinear model permits the estimation of the parasite nymphs using nymphs healthy as the known variable. The model showed the functional answer of the parasitoid, in which a point arrives that its density is not augmented although the number host increases, and it becomes necessary to intervene in the ecosystem. A simple algorithm is used to estimate the parasitoids level using the priori relationship between the host and the climate factors and then the nonlinear model.
van der Zijden, A M; Groen, B E; Tanck, E; Nienhuis, B; Verdonschot, N; Weerdesteyn, V
2017-03-21
Many research groups have studied fall impact mechanics to understand how fall severity can be reduced to prevent hip fractures. Yet, direct impact force measurements with force plates are restricted to a very limited repertoire of experimental falls. The purpose of this study was to develop a generic model for estimating hip impact forces (i.e. fall severity) in in vivo sideways falls without the use of force plates. Twelve experienced judokas performed sideways Martial Arts (MA) and Block ('natural') falls on a force plate, both with and without a mat on top. Data were analyzed to determine the hip impact force and to derive 11 selected (subject-specific and kinematic) variables. Falls from kneeling height were used to perform a stepwise regression procedure to assess the effects of these input variables and build the model. The final model includes four input variables, involving one subject-specific measure and three kinematic variables: maximum upper body deceleration, body mass, shoulder angle at the instant of 'maximum impact' and maximum hip deceleration. The results showed that estimated and measured hip impact forces were linearly related (explained variances ranging from 46 to 63%). Hip impact forces of MA falls onto the mat from a standing position (3650±916N) estimated by the final model were comparable with measured values (3698±689N), even though these data were not used for training the model. In conclusion, a generic linear regression model was developed that enables the assessment of fall severity through kinematic measures of sideways falls, without using force plates. Copyright © 2017 Elsevier Ltd. All rights reserved.
Penalized estimation for competing risks regression with applications to high-dimensional covariates
DEFF Research Database (Denmark)
Ambrogi, Federico; Scheike, Thomas H.
2016-01-01
of competing events. The direct binomial regression model of Scheike and others (2008. Predicting cumulative incidence probability by direct binomial regression. Biometrika 95: (1), 205-220) is reformulated in a penalized framework to possibly fit a sparse regression model. The developed approach is easily...... Research 19: (1), 29-51), the research regarding competing risks is less developed (Binder and others, 2009. Boosting for high-dimensional time-to-event data with competing risks. Bioinformatics 25: (7), 890-896). The aim of this work is to consider how to do penalized regression in the presence...... implementable using existing high-performance software to do penalized regression. Results from simulation studies are presented together with an application to genomic data when the endpoint is progression-free survival. An R function is provided to perform regularized competing risks regression according...
Guermoui, Mawloud; Gairaa, Kacem; Rabehi, Abdelaziz; Djafer, Djelloul; Benkaciali, Said
2018-06-01
Accurate estimation of solar radiation is the major concern in renewable energy applications. Over the past few years, a lot of machine learning paradigms have been proposed in order to improve the estimation performances, mostly based on artificial neural networks, fuzzy logic, support vector machine and adaptive neuro-fuzzy inference system. The aim of this work is the prediction of the daily global solar radiation, received on a horizontal surface through the Gaussian process regression (GPR) methodology. A case study of Ghardaïa region (Algeria) has been used in order to validate the above methodology. In fact, several combinations have been tested; it was found that, GPR-model based on sunshine duration, minimum air temperature and relative humidity gives the best results in term of mean absolute bias error (MBE), root mean square error (RMSE), relative mean square error (rRMSE), and correlation coefficient ( r) . The obtained values of these indicators are 0.67 MJ/m2, 1.15 MJ/m2, 5.2%, and 98.42%, respectively.
2017-12-01
Fig. 2 Simulation method; the process for one iteration of the simulation . It was repeated 250 times per combination of HR and FAR. Analysis was...distribution is unlimited. 8 Fig. 2 Simulation method; the process for one iteration of the simulation . It was repeated 250 times per combination of HR...stimuli. Simulations show that this regression method results in an unbiased and accurate estimate of target detection performance. The regression
Relative Pose Estimation Algorithm with Gyroscope Sensor
Directory of Open Access Journals (Sweden)
Shanshan Wei
2016-01-01
Full Text Available This paper proposes a novel vision and inertial fusion algorithm S2fM (Simplified Structure from Motion for camera relative pose estimation. Different from current existing algorithms, our algorithm estimates rotation parameter and translation parameter separately. S2fM employs gyroscopes to estimate camera rotation parameter, which is later fused with the image data to estimate camera translation parameter. Our contributions are in two aspects. (1 Under the circumstance that no inertial sensor can estimate accurately enough translation parameter, we propose a translation estimation algorithm by fusing gyroscope sensor and image data. (2 Our S2fM algorithm is efficient and suitable for smart devices. Experimental results validate efficiency of the proposed S2fM algorithm.
The effect of PLS regression in PLS path model estimation when multicollinearity is present
DEFF Research Database (Denmark)
Nielsen, Rikke; Kristensen, Kai; Eskildsen, Jacob
PLS path modelling has previously been found to be robust to multicollinearity both between latent variables and between manifest variables of a common latent variable (see e.g. Cassel et al. (1999), Kristensen, Eskildsen (2005), Westlund et al. (2008)). However, most of the studies investigate...... models with relatively few variables and very simple dependence structures compared to the models that are often estimated in practical settings. A recent study by Nielsen et al. (2009) found that when model structure is more complex, PLS path modelling is not as robust to multicollinearity between...... latent variables as previously assumed. A difference in the standard error of path coefficients of as much as 83% was found between moderate and severe levels of multicollinearity. Large differences were found not only for large path coefficients, but also for small path coefficients and in some cases...
A note on modeling of tumor regression for estimation of radiobiological parameters
International Nuclear Information System (INIS)
Zhong, Hualiang; Chetty, Indrin
2014-01-01
Purpose: Accurate calculation of radiobiological parameters is crucial to predicting radiation treatment response. Modeling differences may have a significant impact on derived parameters. In this study, the authors have integrated two existing models with kinetic differential equations to formulate a new tumor regression model for estimation of radiobiological parameters for individual patients. Methods: A system of differential equations that characterizes the birth-and-death process of tumor cells in radiation treatment was analytically solved. The solution of this system was used to construct an iterative model (Z-model). The model consists of three parameters: tumor doubling time T d , half-life of dead cells T r , and cell survival fraction SF D under dose D. The Jacobian determinant of this model was proposed as a constraint to optimize the three parameters for six head and neck cancer patients. The derived parameters were compared with those generated from the two existing models: Chvetsov's model (C-model) and Lim's model (L-model). The C-model and L-model were optimized with the parameter T d fixed. Results: With the Jacobian-constrained Z-model, the mean of the optimized cell survival fractions is 0.43 ± 0.08, and the half-life of dead cells averaged over the six patients is 17.5 ± 3.2 days. The parameters T r and SF D optimized with the Z-model differ by 1.2% and 20.3% from those optimized with the T d -fixed C-model, and by 32.1% and 112.3% from those optimized with the T d -fixed L-model, respectively. Conclusions: The Z-model was analytically constructed from the differential equations of cell populations that describe changes in the number of different tumor cells during the course of radiation treatment. The Jacobian constraints were proposed to optimize the three radiobiological parameters. The generated model and its optimization method may help develop high-quality treatment regimens for individual patients
National Aeronautics and Space Administration — The application of the Bayesian theory of managing uncertainty and complexity to regression and classification in the form of Relevance Vector Machine (RVM), and to...
Carroll, Raymond J.; Delaigle, Aurore; Hall, Peter
2011-01-01
In many applications we can expect that, or are interested to know if, a density function or a regression curve satisfies some specific shape constraints. For example, when the explanatory variable, X, represents the value taken by a treatment
Singh, Jagmahender; Pathak, R K; Chavali, Krishnadutt H
2011-03-20
Skeletal height estimation from regression analysis of eight sternal lengths in the subjects of Chandigarh zone of Northwest India is the topic of discussion in this study. Analysis of eight sternal lengths (length of manubrium, length of mesosternum, combined length of manubrium and mesosternum, total sternal length and first four intercostals lengths of mesosternum) measured from 252 male and 91 female sternums obtained at postmortems revealed that mean cadaver stature and sternal lengths were more in North Indians and males than the South Indians and females. Except intercostal lengths, all the sternal lengths were positively correlated with stature of the deceased in both sexes (P regression analysis of sternal lengths was found more useful than the linear regression for stature estimation. Using multivariate regression analysis, the combined length of manubrium and mesosternum in both sexes and the length of manubrium along with 2nd and 3rd intercostal lengths of mesosternum in males were selected as best estimators of stature. Nonetheless, the stature of males can be predicted with SEE of 6.66 (R(2) = 0.16, r = 0.318) from combination of MBL+BL_3+LM+BL_2, and in females from MBL only, it can be estimated with SEE of 6.65 (R(2) = 0.10, r = 0.318), whereas from the multiple regression analysis of pooled data, stature can be known with SEE of 6.97 (R(2) = 0.387, r = 575) from the combination of MBL+LM+BL_2+TSL+BL_3. The R(2) and F-ratio were found to be statistically significant for almost all the variables in both the sexes, except 4th intercostal length in males and 2nd to 4th intercostal lengths in females. The 'major' sternal lengths were more useful than the 'minor' ones for stature estimation The universal regression analysis used by Kanchan et al. [39] when applied to sternal lengths, gave satisfactory estimates of stature for males only but female stature was comparatively better estimated from simple linear regressions. But they are not proposed for the
Uncertainty relations for approximation and estimation
Energy Technology Data Exchange (ETDEWEB)
Lee, Jaeha, E-mail: jlee@post.kek.jp [Department of Physics, University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033 (Japan); Tsutsui, Izumi, E-mail: izumi.tsutsui@kek.jp [Department of Physics, University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033 (Japan); Theory Center, Institute of Particle and Nuclear Studies, High Energy Accelerator Research Organization (KEK), 1-1 Oho, Tsukuba, Ibaraki 305-0801 (Japan)
2016-05-27
We present a versatile inequality of uncertainty relations which are useful when one approximates an observable and/or estimates a physical parameter based on the measurement of another observable. It is shown that the optimal choice for proxy functions used for the approximation is given by Aharonov's weak value, which also determines the classical Fisher information in parameter estimation, turning our inequality into the genuine Cramér–Rao inequality. Since the standard form of the uncertainty relation arises as a special case of our inequality, and since the parameter estimation is available as well, our inequality can treat both the position–momentum and the time–energy relations in one framework albeit handled differently. - Highlights: • Several inequalities interpreted as uncertainty relations for approximation/estimation are derived from a single ‘versatile inequality’. • The ‘versatile inequality’ sets a limit on the approximation of an observable and/or the estimation of a parameter by another observable. • The ‘versatile inequality’ turns into an elaboration of the Robertson–Kennard (Schrödinger) inequality and the Cramér–Rao inequality. • Both the position–momentum and the time–energy relation are treated in one framework. • In every case, Aharonov's weak value arises as a key geometrical ingredient, deciding the optimal choice for the proxy functions.
Uncertainty relations for approximation and estimation
International Nuclear Information System (INIS)
Lee, Jaeha; Tsutsui, Izumi
2016-01-01
We present a versatile inequality of uncertainty relations which are useful when one approximates an observable and/or estimates a physical parameter based on the measurement of another observable. It is shown that the optimal choice for proxy functions used for the approximation is given by Aharonov's weak value, which also determines the classical Fisher information in parameter estimation, turning our inequality into the genuine Cramér–Rao inequality. Since the standard form of the uncertainty relation arises as a special case of our inequality, and since the parameter estimation is available as well, our inequality can treat both the position–momentum and the time–energy relations in one framework albeit handled differently. - Highlights: • Several inequalities interpreted as uncertainty relations for approximation/estimation are derived from a single ‘versatile inequality’. • The ‘versatile inequality’ sets a limit on the approximation of an observable and/or the estimation of a parameter by another observable. • The ‘versatile inequality’ turns into an elaboration of the Robertson–Kennard (Schrödinger) inequality and the Cramér–Rao inequality. • Both the position–momentum and the time–energy relation are treated in one framework. • In every case, Aharonov's weak value arises as a key geometrical ingredient, deciding the optimal choice for the proxy functions.
Sun, Jianguo; Feng, Yanqin; Zhao, Hui
2015-01-01
Interval-censored failure time data occur in many fields including epidemiological and medical studies as well as financial and sociological studies, and many authors have investigated their analysis (Sun, The statistical analysis of interval-censored failure time data, 2006; Zhang, Stat Modeling 9:321-343, 2009). In particular, a number of procedures have been developed for regression analysis of interval-censored data arising from the proportional hazards model (Finkelstein, Biometrics 42:845-854, 1986; Huang, Ann Stat 24:540-568, 1996; Pan, Biometrics 56:199-203, 2000). For most of these procedures, however, one drawback is that they involve estimation of both regression parameters and baseline cumulative hazard function. In this paper, we propose two simple estimation approaches that do not need estimation of the baseline cumulative hazard function. The asymptotic properties of the resulting estimates are given, and an extensive simulation study is conducted and indicates that they work well for practical situations.
INLA goes extreme: Bayesian tail regression for the estimation of high spatio-temporal quantiles
Opitz, Thomas; Huser, Raphaë l; Bakka, Haakon; Rue, Haavard
2018-01-01
approach is based on a Bayesian generalized additive modeling framework that is designed to estimate complex trends in marginal extremes over space and time. First, we estimate a high non-stationary threshold using a gamma distribution for precipitation
Khalil, Mohamed H; Shebl, Mostafa K; Kosba, Mohamed A; El-Sabrout, Karim; Zaki, Nesma
2016-08-01
This research was conducted to determine the most affecting parameters on hatchability of indigenous and improved local chickens' eggs. Five parameters were studied (fertility, early and late embryonic mortalities, shape index, egg weight, and egg weight loss) on four strains, namely Fayoumi, Alexandria, Matrouh, and Montazah. Multiple linear regression was performed on the studied parameters to determine the most influencing one on hatchability. The results showed significant differences in commercial and scientific hatchability among strains. Alexandria strain has the highest significant commercial hatchability (80.70%). Regarding the studied strains, highly significant differences in hatching chick weight among strains were observed. Using multiple linear regression analysis, fertility made the greatest percent contribution (71.31%) to hatchability, and the lowest percent contributions were made by shape index and egg weight loss. A prediction of hatchability using multiple regression analysis could be a good tool to improve hatchability percentage in chickens.
Berry, D.P.; Buckley, F.; Dillon, P.; Evans, R.D.; Rath, M.; Veerkamp, R.F.
2003-01-01
Genetic (co)variances between body condition score (BCS), body weight (BW), milk yield, and fertility were estimated using a random regression animal model extended to multivariate analysis. The data analyzed included 81,313 BCS observations, 91,937 BW observations, and 100,458 milk test-day yields
CSIR Research Space (South Africa)
Bencherif, H
2010-09-01
Full Text Available The present reports on the use of a multi-regression model adapted at Reunion University for temperature and ozone trend estimates. Depending on the location of the observing site, the studied geophysical signal is broken down in form of a sum...
Chai, Rui; Xu, Li-Sheng; Yao, Yang; Hao, Li-Ling; Qi, Lin
2017-01-01
This study analyzed ascending branch slope (A_slope), dicrotic notch height (Hn), diastolic area (Ad) and systolic area (As) diastolic blood pressure (DBP), systolic blood pressure (SBP), pulse pressure (PP), subendocardial viability ratio (SEVR), waveform parameter (k), stroke volume (SV), cardiac output (CO), and peripheral resistance (RS) of central pulse wave invasively and non-invasively measured. Invasively measured parameters were compared with parameters measured from brachial pulse waves by regression model and transfer function model. Accuracy of parameters estimated by regression and transfer function model, was compared too. Findings showed that k value, central pulse wave and brachial pulse wave parameters invasively measured, correlated positively. Regression model parameters including A_slope, DBP, SEVR, and transfer function model parameters had good consistency with parameters invasively measured. They had same effect of consistency. SBP, PP, SV, and CO could be calculated through the regression model, but their accuracies were worse than that of transfer function model.
DEFF Research Database (Denmark)
Brink, Carsten; Bernchou, Uffe; Bertelsen, Anders
2014-01-01
was estimated on the basis of the first one third and two thirds of the scans. The concordance between estimated and actual relative volume at the end of radiation therapy was quantified by Pearson's correlation coefficient. On the basis of the estimated relative volume, the patients were stratified into 2...... for other clinical characteristics. RESULTS: Automatic measurement of the tumor regression from standard CBCT images was feasible. Pearson's correlation coefficient between manual and automatic measurement was 0.86 in a sample of 9 patients. Most patients experienced tumor volume regression, and this could...
Combination of supervised and semi-supervised regression models for improved unbiased estimation
DEFF Research Database (Denmark)
Arenas-Garía, Jeronimo; Moriana-Varo, Carlos; Larsen, Jan
2010-01-01
In this paper we investigate the steady-state performance of semisupervised regression models adjusted using a modified RLS-like algorithm, identifying the situations where the new algorithm is expected to outperform standard RLS. By using an adaptive combination of the supervised and semisupervi......In this paper we investigate the steady-state performance of semisupervised regression models adjusted using a modified RLS-like algorithm, identifying the situations where the new algorithm is expected to outperform standard RLS. By using an adaptive combination of the supervised...
Estimating the relative utility of screening mammography.
Abbey, Craig K; Eckstein, Miguel P; Boone, John M
2013-05-01
The concept of diagnostic utility is a fundamental component of signal detection theory, going back to some of its earliest works. Attaching utility values to the various possible outcomes of a diagnostic test should, in principle, lead to meaningful approaches to evaluating and comparing such systems. However, in many areas of medical imaging, utility is not used because it is presumed to be unknown. In this work, we estimate relative utility (the utility benefit of a detection relative to that of a correct rejection) for screening mammography using its known relation to the slope of a receiver operating characteristic (ROC) curve at the optimal operating point. The approach assumes that the clinical operating point is optimal for the goal of maximizing expected utility and therefore the slope at this point implies a value of relative utility for the diagnostic task, for known disease prevalence. We examine utility estimation in the context of screening mammography using the Digital Mammographic Imaging Screening Trials (DMIST) data. We show how various conditions can influence the estimated relative utility, including characteristics of the rating scale, verification time, probability model, and scope of the ROC curve fit. Relative utility estimates range from 66 to 227. We argue for one particular set of conditions that results in a relative utility estimate of 162 (±14%). This is broadly consistent with values in screening mammography determined previously by other means. At the disease prevalence found in the DMIST study (0.59% at 365-day verification), optimal ROC slopes are near unity, suggesting that utility-based assessments of screening mammography will be similar to those found using Youden's index.
Bianca N.I. Eskelson; Hailemariam Temesgen; Tara M. Barrett
2009-01-01
Cavity tree and snag abundance data are highly variable and contain many zero observations. We predict cavity tree and snag abundance from variables that are readily available from forest cover maps or remotely sensed data using negative binomial (NB), zero-inflated NB, and zero-altered NB (ZANB) regression models as well as nearest neighbor (NN) imputation methods....
Yan, Jun; Aseltine, Robert H., Jr.; Harel, Ofer
2013-01-01
Comparing regression coefficients between models when one model is nested within another is of great practical interest when two explanations of a given phenomenon are specified as linear models. The statistical problem is whether the coefficients associated with a given set of covariates change significantly when other covariates are added into…
Propensity Score Estimation with Data Mining Techniques: Alternatives to Logistic Regression
Keller, Bryan S. B.; Kim, Jee-Seon; Steiner, Peter M.
2013-01-01
Propensity score analysis (PSA) is a methodological technique which may correct for selection bias in a quasi-experiment by modeling the selection process using observed covariates. Because logistic regression is well understood by researchers in a variety of fields and easy to implement in a number of popular software packages, it has…
Maximum likelihood estimation for Cox's regression model under nested case-control sampling
DEFF Research Database (Denmark)
Scheike, Thomas; Juul, Anders
2004-01-01
Nested case-control sampling is designed to reduce the costs of large cohort studies. It is important to estimate the parameters of interest as efficiently as possible. We present a new maximum likelihood estimator (MLE) for nested case-control sampling in the context of Cox's proportional hazard...
Closed-Loop Surface Related Multiple Estimation
Lopez Angarita, G.A.
2016-01-01
Surface-related multiple elimination (SRME) is one of the most commonly used methods for suppressing surface multiples. However, in order to obtain an accurate surface multiple estimation, dense source and receiver sampling is required. The traditional approach to this problem is performing data
Stackelberg, Paul E.; Barbash, Jack E.; Gilliom, Robert J.; Stone, Wesley W.; Wolock, David M.
2012-01-01
Tobit regression models were developed to predict the summed concentration of atrazine [6-chloro-N-ethyl-N'-(1-methylethyl)-1,3,5-triazine-2,4-diamine] and its degradate deethylatrazine [6-chloro-N-(1-methylethyl)-1,3,5,-triazine-2,4-diamine] (DEA) in shallow groundwater underlying agricultural settings across the conterminous United States. The models were developed from atrazine and DEA concentrations in samples from 1298 wells and explanatory variables that represent the source of atrazine and various aspects of the transport and fate of atrazine and DEA in the subsurface. One advantage of these newly developed models over previous national regression models is that they predict concentrations (rather than detection frequency), which can be compared with water quality benchmarks. Model results indicate that variability in the concentration of atrazine residues (atrazine plus DEA) in groundwater underlying agricultural areas is more strongly controlled by the history of atrazine use in relation to the timing of recharge (groundwater age) than by processes that control the dispersion, adsorption, or degradation of these compounds in the saturated zone. Current (1990s) atrazine use was found to be a weak explanatory variable, perhaps because it does not represent the use of atrazine at the time of recharge of the sampled groundwater and because the likelihood that these compounds will reach the water table is affected by other factors operating within the unsaturated zone, such as soil characteristics, artificial drainage, and water movement. Results show that only about 5% of agricultural areas have greater than a 10% probability of exceeding the USEPA maximum contaminant level of 3.0 μg L-1. These models are not developed for regulatory purposes but rather can be used to (i) identify areas of potential concern, (ii) provide conservative estimates of the concentrations of atrazine residues in deeper potential drinking water supplies, and (iii) set priorities
International Nuclear Information System (INIS)
Elliott Campbell, J.; Moen, Jeremie C.; Ney, Richard A.; Schnoor, Jerald L.
2008-01-01
Estimates of forest soil organic carbon (SOC) have applications in carbon science, soil quality studies, carbon sequestration technologies, and carbon trading. Forest SOC has been modeled using a regression coefficient methodology that applies mean SOC densities (mass/area) to broad forest regions. A higher resolution model is based on an approach that employs a geographic information system (GIS) with soil databases and satellite-derived landcover images. Despite this advancement, the regression approach remains the basis of current state and federal level greenhouse gas inventories. Both approaches are analyzed in detail for Wisconsin forest soils from 1983 to 2001, applying rigorous error-fixing algorithms to soil databases. Resulting SOC stock estimates are 20% larger when determined using the GIS method rather than the regression approach. Average annual rates of increase in SOC stocks are 3.6 and 1.0 million metric tons of carbon per year for the GIS and regression approaches respectively. - Large differences in estimates of soil organic carbon stocks and annual changes in stocks for Wisconsin forestlands indicate a need for validation from forthcoming forest surveys
Karabatsos, George
2017-02-01
Most of applied statistics involves regression analysis of data. In practice, it is important to specify a regression model that has minimal assumptions which are not violated by data, to ensure that statistical inferences from the model are informative and not misleading. This paper presents a stand-alone and menu-driven software package, Bayesian Regression: Nonparametric and Parametric Models, constructed from MATLAB Compiler. Currently, this package gives the user a choice from 83 Bayesian models for data analysis. They include 47 Bayesian nonparametric (BNP) infinite-mixture regression models; 5 BNP infinite-mixture models for density estimation; and 31 normal random effects models (HLMs), including normal linear models. Each of the 78 regression models handles either a continuous, binary, or ordinal dependent variable, and can handle multi-level (grouped) data. All 83 Bayesian models can handle the analysis of weighted observations (e.g., for meta-analysis), and the analysis of left-censored, right-censored, and/or interval-censored data. Each BNP infinite-mixture model has a mixture distribution assigned one of various BNP prior distributions, including priors defined by either the Dirichlet process, Pitman-Yor process (including the normalized stable process), beta (two-parameter) process, normalized inverse-Gaussian process, geometric weights prior, dependent Dirichlet process, or the dependent infinite-probits prior. The software user can mouse-click to select a Bayesian model and perform data analysis via Markov chain Monte Carlo (MCMC) sampling. After the sampling completes, the software automatically opens text output that reports MCMC-based estimates of the model's posterior distribution and model predictive fit to the data. Additional text and/or graphical output can be generated by mouse-clicking other menu options. This includes output of MCMC convergence analyses, and estimates of the model's posterior predictive distribution, for selected
Sub-pixel estimation of tree cover and bare surface densities using regression tree analysis
Directory of Open Access Journals (Sweden)
Carlos Augusto Zangrando Toneli
2011-09-01
Full Text Available Sub-pixel analysis is capable of generating continuous fields, which represent the spatial variability of certain thematic classes. The aim of this work was to develop numerical models to represent the variability of tree cover and bare surfaces within the study area. This research was conducted in the riparian buffer within a watershed of the São Francisco River in the North of Minas Gerais, Brazil. IKONOS and Landsat TM imagery were used with the GUIDE algorithm to construct the models. The results were two index images derived with regression trees for the entire study area, one representing tree cover and the other representing bare surface. The use of non-parametric and non-linear regression tree models presented satisfactory results to characterize wetland, deciduous and savanna patterns of forest formation.
Detrended fluctuation analysis as a regression framework: Estimating dependence at different scales
Czech Academy of Sciences Publication Activity Database
Krištoufek, Ladislav
2015-01-01
Roč. 91, č. 1 (2015), 022802-1-022802-5 ISSN 1539-3755 R&D Projects: GA ČR(CZ) GP14-11402P Grant - others:GA ČR(CZ) GAP402/11/0948 Program:GA Institutional support: RVO:67985556 Keywords : Detrended cross-correlation analysis * Regression * Scales Subject RIV: AH - Economics Impact factor: 2.288, year: 2014 http://library.utia.cas.cz/separaty/2015/E/kristoufek-0452315.pdf
Estimating transmitted waves of floating breakwater using support vector regression model
Digital Repository Service at National Institute of Oceanography (India)
Mandal, S.; Hegde, A.V.; Kumar, V.; Patil, S.G.
is first mapped onto an m-dimensional feature space using some fixed (nonlinear) mapping, and then a linear model is constructed in this feature space (Ivanciuc Ovidiu 2007). Using mathematical notation, the linear model in the feature space f(x, w... regressive vector machines, Ocean Engineering Journal, Vol – 36, pp 339 – 347, 2009. 3. Ivanciuc Ovidiu, Applications of support vector machines in chemistry, Review in Computational Chemistry, Eds K. B. Lipkouitz and T. R. Cundari, Vol – 23...
Directory of Open Access Journals (Sweden)
T. Nataraja Moorthy
2015-05-01
Full Text Available The human foot has been studied for a variety of reasons, i.e., for forensic as well as non-forensic purposes by anatomists, forensic scientists, anthropologists, physicians, podiatrists, and numerous other groups. An aspect of human identification that has received scant attention from forensic anthropologists is the study of human feet and the footprints made by the feet. The present study, conducted during 2013-2014, aimed to derive population specific regression equations to estimate stature from the footprint anthropometry of indigenous adult Bidayuhs in the east of Malaysia. The study sample consisted of 480 bilateral footprints collected using a footprint kit from 240 Bidayuhs (120 males and 120 females, who consented to taking part in the study. Their ages ranged from 18 to 70 years. Stature was measured using a portable body meter device (SECA model 206. The data were analyzed using PASW Statistics version 20. In this investigation, better results were obtained in terms of correlation coefficient (R between stature and various footprint measurements and regression analysis in estimating the stature. The (R values showed a positive and statistically significant (p < 0.001 relationship between the two parameters. The correlation coefficients in the pooled sample (0.861–0.882 were comparatively higher than those of an individual male (0.762-0.795 and female (0.722-0.765. This study provided regression equations to estimate stature from footprints in the Bidayuh population. The result showed that the regression equations without sex indicators performed significantly better than models with gender indications. The regression equations derived for a pooled sample can be used to estimate stature, even when the sex of the footprint is unknown, as in real crime scenes.
Borodachev, S. M.
2016-06-01
The simple derivation of recursive least squares (RLS) method equations is given as special case of Kalman filter estimation of a constant system state under changing observation conditions. A numerical example illustrates application of RLS to multicollinearity problem.
Ono, Tomohiro; Nakamura, Mitsuhiro; Hirose, Yoshinori; Kitsuda, Kenji; Ono, Yuka; Ishigaki, Takashi; Hiraoka, Masahiro
2017-09-01
To estimate the lung tumor position from multiple anatomical features on four-dimensional computed tomography (4D-CT) data sets using single regression analysis (SRA) and multiple regression analysis (MRA) approach and evaluate an impact of the approach on internal target volume (ITV) for stereotactic body radiotherapy (SBRT) of the lung. Eleven consecutive lung cancer patients (12 cases) underwent 4D-CT scanning. The three-dimensional (3D) lung tumor motion exceeded 5 mm. The 3D tumor position and anatomical features, including lung volume, diaphragm, abdominal wall, and chest wall positions, were measured on 4D-CT images. The tumor position was estimated by SRA using each anatomical feature and MRA using all anatomical features. The difference between the actual and estimated tumor positions was defined as the root-mean-square error (RMSE). A standard partial regression coefficient for the MRA was evaluated. The 3D lung tumor position showed a high correlation with the lung volume (R = 0.92 ± 0.10). Additionally, ITVs derived from SRA and MRA approaches were compared with ITV derived from contouring gross tumor volumes on all 10 phases of the 4D-CT (conventional ITV). The RMSE of the SRA was within 3.7 mm in all directions. Also, the RMSE of the MRA was within 1.6 mm in all directions. The standard partial regression coefficient for the lung volume was the largest and had the most influence on the estimated tumor position. Compared with conventional ITV, average percentage decrease of ITV were 31.9% and 38.3% using SRA and MRA approaches, respectively. The estimation accuracy of lung tumor position was improved by the MRA approach, which provided smaller ITV than conventional ITV. © 2017 The Authors. Journal of Applied Clinical Medical Physics published by Wiley Periodicals, Inc. on behalf of American Association of Physicists in Medicine.
Liu, Xun; Li, Ning-shan; Lv, Lin-sheng; Huang, Jian-hua; Tang, Hua; Chen, Jin-xia; Ma, Hui-juan; Wu, Xiao-ming; Lou, Tan-qi
2013-12-01
Accurate estimation of glomerular filtration rate (GFR) is important in clinical practice. Current models derived from regression are limited by the imprecision of GFR estimates. We hypothesized that an artificial neural network (ANN) might improve the precision of GFR estimates. A study of diagnostic test accuracy. 1,230 patients with chronic kidney disease were enrolled, including the development cohort (n=581), internal validation cohort (n=278), and external validation cohort (n=371). Estimated GFR (eGFR) using a new ANN model and a new regression model using age, sex, and standardized serum creatinine level derived in the development and internal validation cohort, and the CKD-EPI (Chronic Kidney Disease Epidemiology Collaboration) 2009 creatinine equation. Measured GFR (mGFR). GFR was measured using a diethylenetriaminepentaacetic acid renal dynamic imaging method. Serum creatinine was measured with an enzymatic method traceable to isotope-dilution mass spectrometry. In the external validation cohort, mean mGFR was 49±27 (SD) mL/min/1.73 m2 and biases (median difference between mGFR and eGFR) for the CKD-EPI, new regression, and new ANN models were 0.4, 1.5, and -0.5 mL/min/1.73 m2, respectively (P30% from mGFR) were 50.9%, 77.4%, and 78.7%, respectively (Psource of systematic bias in comparisons of new models to CKD-EPI, and both the derivation and validation cohorts consisted of a group of patients who were referred to the same institution. An ANN model using 3 variables did not perform better than a new regression model. Whether ANN can improve GFR estimation using more variables requires further investigation. Copyright © 2013 National Kidney Foundation, Inc. Published by Elsevier Inc. All rights reserved.
Estimating the causes of traffic accidents using logistic regression and discriminant analysis.
Karacasu, Murat; Ergül, Barış; Altin Yavuz, Arzu
2014-01-01
Factors that affect traffic accidents have been analysed in various ways. In this study, we use the methods of logistic regression and discriminant analysis to determine the damages due to injury and non-injury accidents in the Eskisehir Province. Data were obtained from the accident reports of the General Directorate of Security in Eskisehir; 2552 traffic accidents between January and December 2009 were investigated regarding whether they resulted in injury. According to the results, the effects of traffic accidents were reflected in the variables. These results provide a wealth of information that may aid future measures toward the prevention of undesired results.
A constrained polynomial regression procedure for estimating the local False Discovery Rate
Directory of Open Access Journals (Sweden)
Broët Philippe
2007-06-01
Full Text Available Abstract Background In the context of genomic association studies, for which a large number of statistical tests are performed simultaneously, the local False Discovery Rate (lFDR, which quantifies the evidence of a specific gene association with a clinical or biological variable of interest, is a relevant criterion for taking into account the multiple testing problem. The lFDR not only allows an inference to be made for each gene through its specific value, but also an estimate of Benjamini-Hochberg's False Discovery Rate (FDR for subsets of genes. Results In the framework of estimating procedures without any distributional assumption under the alternative hypothesis, a new and efficient procedure for estimating the lFDR is described. The results of a simulation study indicated good performances for the proposed estimator in comparison to four published ones. The five different procedures were applied to real datasets. Conclusion A novel and efficient procedure for estimating lFDR was developed and evaluated.
Maximum likelihood estimation for Cox's regression model under nested case-control sampling
DEFF Research Database (Denmark)
Scheike, Thomas Harder; Juul, Anders
2004-01-01
-like growth factor I was associated with ischemic heart disease. The study was based on a population of 3784 Danes and 231 cases of ischemic heart disease where controls were matched on age and gender. We illustrate the use of the MLE for these data and show how the maximum likelihood framework can be used......Nested case-control sampling is designed to reduce the costs of large cohort studies. It is important to estimate the parameters of interest as efficiently as possible. We present a new maximum likelihood estimator (MLE) for nested case-control sampling in the context of Cox's proportional hazards...... model. The MLE is computed by the EM-algorithm, which is easy to implement in the proportional hazards setting. Standard errors are estimated by a numerical profile likelihood approach based on EM aided differentiation. The work was motivated by a nested case-control study that hypothesized that insulin...
DEFF Research Database (Denmark)
Tvedebrink, Torben; Eriksen, Poul Svante; Asplund, Maria
2012-01-01
We discuss the model for estimating drop-out probabilities presented by Tvedebrink et al. [7] and the concerns, that have been raised. The criticism of the model has demonstrated that the model is not perfect. However, the model is very useful for advanced forensic genetic work, where allelic drop-out...... is occurring. With this discussion, we hope to improve the drop-out model, so that it can be used for practical forensic genetics and stimulate further discussions. We discuss how to estimate drop-out probabilities when using a varying number of PCR cycles and other experimental conditions....
Normalization Ridge Regression in Practice II: The Estimation of Multiple Feedback Linkages.
Bulcock, J. W.
The use of the two-stage least squares (2 SLS) procedure for estimating nonrecursive social science models is often impractical when multiple feedback linkages are required. This is because 2 SLS is extremely sensitive to multicollinearity. The standard statistical solution to the multicollinearity problem is a biased, variance reduced procedure…
Energy Technology Data Exchange (ETDEWEB)
Koo, Young Do; Yoo, Kwae Hwan; Na, Man Gyun [Dept. of Nuclear Engineering, Chosun University, Gwangju (Korea, Republic of)
2017-06-15
Residual stress is a critical element in determining the integrity of parts and the lifetime of welded structures. It is necessary to estimate the residual stress of a welding zone because residual stress is a major reason for the generation of primary water stress corrosion cracking in nuclear power plants. That is, it is necessary to estimate the distribution of the residual stress in welding of dissimilar metals under manifold welding conditions. In this study, a cascaded support vector regression (CSVR) model was presented to estimate the residual stress of a welding zone. The CSVR model was serially and consecutively structured in terms of SVR modules. Using numerical data obtained from finite element analysis by a subtractive clustering method, learning data that explained the characteristic behavior of the residual stress of a welding zone were selected to optimize the proposed model. The results suggest that the CSVR model yielded a better estimation performance when compared with a classic SVR model.
Lo, Ching F.
1999-01-01
The integration of Radial Basis Function Networks and Back Propagation Neural Networks with the Multiple Linear Regression has been accomplished to map nonlinear response surfaces over a wide range of independent variables in the process of the Modem Design of Experiments. The integrated method is capable to estimate the precision intervals including confidence and predicted intervals. The power of the innovative method has been demonstrated by applying to a set of wind tunnel test data in construction of response surface and estimation of precision interval.
International Nuclear Information System (INIS)
Carausu, A.
1996-01-01
A method for the fragility estimation of seismically isolated nuclear power plant structure is proposed. The relationship between the ground motion intensity parameter (e.g. peak ground velocity or peak ground acceleration) and the response of isolated structures is expressed in terms of a bi-linear regression line, whose coefficients are estimated by the least-square method in terms of available data on seismic input and structural response. The notion of high confidence low probability of failure (HCLPF) value is also used for deriving compound fragility curves for coupled subsystems. (orig.)
Directory of Open Access Journals (Sweden)
Jonathan E. Leightner
2012-01-01
Full Text Available The omitted variables problem is one of regression analysis’ most serious problems. The standard approach to the omitted variables problem is to find instruments, or proxies, for the omitted variables, but this approach makes strong assumptions that are rarely met in practice. This paper introduces best projection reiterative truncated projected least squares (BP-RTPLS, the third generation of a technique that solves the omitted variables problem without using proxies or instruments. This paper presents a theoretical argument that BP-RTPLS produces unbiased reduced form estimates when there are omitted variables. This paper also provides simulation evidence that shows OLS produces between 250% and 2450% more errors than BP-RTPLS when there are omitted variables and when measurement and round-off error is 1 percent or less. In an example, the government spending multiplier, , is estimated using annual data for the USA between 1929 and 2010.
Energy Technology Data Exchange (ETDEWEB)
Jabr, R.A. [Electrical, Computer and Communication Engineering Department, Notre Dame University, P.O. Box 72, Zouk Mikhael, Zouk Mosbeh (Lebanon)
2006-02-15
This paper presents an implementation of the least absolute value (LAV) power system state estimator based on obtaining a sequence of solutions to the L{sub 1}-regression problem using an iteratively reweighted least squares (IRLS{sub L1}) method. The proposed implementation avoids reformulating the regression problem into standard linear programming (LP) form and consequently does not require the use of common methods of LP, such as those based on the simplex method or interior-point methods. It is shown that the IRLS{sub L1} method is equivalent to solving a sequence of linear weighted least squares (LS) problems. Thus, its implementation presents little additional effort since the sparse LS solver is common to existing LS state estimators. Studies on the termination criteria of the IRLS{sub L1} method have been carried out to determine a procedure for which the proposed estimator is more computationally efficient than a previously proposed non-linear iteratively reweighted least squares (IRLS) estimator. Indeed, it is revealed that the proposed method is a generalization of the previously reported IRLS estimator, but is based on more rigorous theory. (author)
Directory of Open Access Journals (Sweden)
Antonella Costanzo
2017-09-01
Full Text Available Abstract The number of studies addressing issues of inequality in educational outcomes using cognitive achievement tests and variables from large-scale assessment data has increased. Here the value of using a quantile regression approach is compared with a classical regression analysis approach to study the relationships between educational outcomes and likely predictor variables. Italian primary school data from INVALSI large-scale assessments were analyzed using both quantile and standard regression approaches. Mathematics and reading scores were regressed on students' characteristics and geographical variables selected for their theoretical and policy relevance. The results demonstrated that, in Italy, the role of gender and immigrant status varied across the entire conditional distribution of students’ performance. Analogous results emerged pertaining to the difference in students’ performance across Italian geographic areas. These findings suggest that quantile regression analysis is a useful tool to explore the determinants and mechanisms of inequality in educational outcomes. A proper interpretation of quantile estimates may enable teachers to identify effective learning activities and help policymakers to develop tailored programs that increase equity in education.
Energy Technology Data Exchange (ETDEWEB)
Azadeh, A; Seraj, O [Department of Industrial Engineering and Research Institute of Energy Management and Planning, Center of Excellence for Intelligent-Based Experimental Mechanics, College of Engineering, University of Tehran, P.O. Box 11365-4563 (Iran); Saberi, M [Department of Industrial Engineering, University of Tafresh (Iran); Institute for Digital Ecosystems and Business Intelligence, Curtin University of Technology, Perth (Australia)
2010-06-15
This study presents an integrated fuzzy regression and time series framework to estimate and predict electricity demand for seasonal and monthly changes in electricity consumption especially in developing countries such as China and Iran with non-stationary data. Furthermore, it is difficult to model uncertain behavior of energy consumption with only conventional fuzzy regression (FR) or time series and the integrated algorithm could be an ideal substitute for such cases. At First, preferred Time series model is selected from linear or nonlinear models. For this, after selecting preferred Auto Regression Moving Average (ARMA) model, Mcleod-Li test is applied to determine nonlinearity condition. When, nonlinearity condition is satisfied, the preferred nonlinear model is selected and defined as preferred time series model. At last, the preferred model from fuzzy regression and time series model is selected by the Granger-Newbold. Also, the impact of data preprocessing on the fuzzy regression performance is considered. Monthly electricity consumption of Iran from March 1994 to January 2005 is considered as the case of this study. The superiority of the proposed algorithm is shown by comparing its results with other intelligent tools such as Genetic Algorithm (GA) and Artificial Neural Network (ANN). (author)
Chai Rui; Li Si-Man; Xu Li-Sheng; Yao Yang; Hao Li-Ling
2017-07-01
This study mainly analyzed the parameters such as ascending branch slope (A_slope), dicrotic notch height (Hn), diastolic area (Ad) and systolic area (As) diastolic blood pressure (DBP), systolic blood pressure (SBP), pulse pressure (PP), subendocardial viability ratio (SEVR), waveform parameter (k), stroke volume (SV), cardiac output (CO) and peripheral resistance (RS) of central pulse wave invasively and non-invasively measured. These parameters extracted from the central pulse wave invasively measured were compared with the parameters measured from the brachial pulse waves by a regression model and a transfer function model. The accuracy of the parameters which were estimated by the regression model and the transfer function model was compared too. Our findings showed that in addition to the k value, the above parameters of the central pulse wave and the brachial pulse wave invasively measured had positive correlation. Both the regression model parameters including A_slope, DBP, SEVR and the transfer function model parameters had good consistency with the parameters invasively measured, and they had the same effect of consistency. The regression equations of the three parameters were expressed by Y'=a+bx. The SBP, PP, SV, CO of central pulse wave could be calculated through the regression model, but their accuracies were worse than that of transfer function model.
International Nuclear Information System (INIS)
Antanasijević, Davor; Pocajt, Viktor; Ristić, Mirjana; Perić-Grujić, Aleksandra
2015-01-01
This paper presents a new approach for the estimation of energy-related GHG (greenhouse gas) emissions at the national level that combines the simplicity of the concept of GHG intensity and the generalization capabilities of ANNs (artificial neural networks). The main objectives of this work includes the determination of the accuracy of a GRNN (general regression neural network) model applied for the prediction of EC (energy consumption) and GHG intensity of energy consumption, utilizing general country statistics as inputs, as well as analysis of the accuracy of energy-related GHG emissions obtained by multiplying the two aforementioned outputs. The models were developed using historical data from the period 2004–2012, for a set of 26 European countries (EU Members). The obtained results demonstrate that the GRNN GHG intensity model provides a more accurate prediction, with the MAPE (mean absolute percentage error) of 4.5%, than tested MLR (multiple linear regression) and second-order and third-order non-linear MPR (multiple polynomial regression) models. Also, the GRNN EC model has high accuracy (MAPE = 3.6%), and therefore both GRNN models and the proposed approach can be considered as suitable for the calculation of GHG emissions. The energy-related predicted GHG emissions were very similar to the actual GHG emissions of EU Members (MAPE = 6.4%). - Highlights: • ANN modeling of GHG intensity of energy consumption is presented. • ANN modeling of energy consumption at the national level is presented. • GHG intensity concept was used for the estimation of energy-related GHG emissions. • The ANN models provide better results in comparison with conventional models. • Forecast of GHG emissions for 26 countries was made successfully with MAPE of 6.4%
Biomass estimates of freshwater zooplankton from length-carbon regression equations
Directory of Open Access Journals (Sweden)
Patrizia COMOLI
2000-02-01
Full Text Available We present length/carbon regression equations of zooplankton species collected from Lake Maggiore (N. Italy during 1992. The results are discussed in terms of the environmental factors, e.g. food availability, predation, controlling biomass production of particle- feeders and predators in the pelagic system of lakes. The marked seasonality in the length-standardized carbon content of Daphnia, and its time-specific trend suggest that from spring onward food availability for Daphnia population may be regarded as a simple decay function. Seasonality does not affect the carbon content/unit length of the two predator Cladocera Leptodora kindtii and Bythotrephes longimanus. Predation is probably the most important regulating factor for the seasonal dynamics of their carbon biomass. The existence of a constant factor to convert the diameter of Conochilus colonies into carbon seems reasonable for an organism whose population comes on quickly and just as quickly disappears.
Estimating Unbiased Treatment Effects in Education Using a Regression Discontinuity Design
Directory of Open Access Journals (Sweden)
William C. Smith
2014-08-01
Full Text Available The ability of regression discontinuity (RD designs to provide an unbiased treatment effect while overcoming the ethical concerns plagued by Random Control Trials (RCTs make it a valuable and useful approach in education evaluation. RD is the only explicitly recognized quasi-experimental approach identified by the Institute of Education Statistics to meet the prerequisites of a causal relationship. Unfortunately, the statistical complexity of the RD design has limited its application in education research. This article provides a less technical introduction to RD for education researchers and practitioners. Using visual analysis to aide conceptual understanding, the article walks readers through the essential steps of a Sharp RD design using hypothetical, but realistic, district intervention data and provides additional resources for further exploration.
DEFF Research Database (Denmark)
Scott, Neil W; Fayers, Peter M; Aaronson, Neil K
2010-01-01
Differential item functioning (DIF) methods can be used to determine whether different subgroups respond differently to particular items within a health-related quality of life (HRQoL) subscale, after allowing for overall subgroup differences in that scale. This article reviews issues that arise ...... when testing for DIF in HRQoL instruments. We focus on logistic regression methods, which are often used because of their efficiency, simplicity and ease of application....
A case study to estimate costs using Neural Networks and regression based models
Directory of Open Access Journals (Sweden)
Nadia Bhuiyan
2012-07-01
Full Text Available Bombardier Aerospace’s high performance aircrafts and services set the utmost standard for the Aerospace industry. A case study in collaboration with Bombardier Aerospace is conducted in order to estimate the target cost of a landing gear. More precisely, the study uses both parametric model and neural network models to estimate the cost of main landing gears, a major aircraft commodity. A comparative analysis between the parametric based model and those upon neural networks model will be considered in order to determine the most accurate method to predict the cost of a main landing gear. Several trials are presented for the design and use of the neural network model. The analysis for the case under study shows the flexibility in the design of the neural network model. Furthermore, the performance of the neural network model is deemed superior to the parametric models for this case study.
Liou, Jyun-you; Smith, Elliot H.; Bateman, Lisa M.; McKhann, Guy M., II; Goodman, Robert R.; Greger, Bradley; Davis, Tyler S.; Kellis, Spencer S.; House, Paul A.; Schevon, Catherine A.
2017-08-01
Objective. Epileptiform discharges, an electrophysiological hallmark of seizures, can propagate across cortical tissue in a manner similar to traveling waves. Recent work has focused attention on the origination and propagation patterns of these discharges, yielding important clues to their source location and mechanism of travel. However, systematic studies of methods for measuring propagation are lacking. Approach. We analyzed epileptiform discharges in microelectrode array recordings of human seizures. The array records multiunit activity and local field potentials at 400 micron spatial resolution, from a small cortical site free of obstructions. We evaluated several computationally efficient statistical methods for calculating traveling wave velocity, benchmarking them to analyses of associated neuronal burst firing. Main results. Over 90% of discharges met statistical criteria for propagation across the sampled cortical territory. Detection rate, direction and speed estimates derived from a multiunit estimator were compared to four field potential-based estimators: negative peak, maximum descent, high gamma power, and cross-correlation. Interestingly, the methods that were computationally simplest and most efficient (negative peak and maximal descent) offer non-inferior results in predicting neuronal traveling wave velocities compared to the other two, more complex methods. Moreover, the negative peak and maximal descent methods proved to be more robust against reduced spatial sampling challenges. Using least absolute deviation in place of least squares error minimized the impact of outliers, and reduced the discrepancies between local field potential-based and multiunit estimators. Significance. Our findings suggest that ictal epileptiform discharges typically take the form of exceptionally strong, rapidly traveling waves, with propagation detectable across millimeter distances. The sequential activation of neurons in space can be inferred from clinically
Ettinger, Susanne; Mounaud, Loïc; Magill, Christina; Yao-Lafourcade, Anne-Françoise; Thouret, Jean-Claude; Manville, Vern; Negulescu, Caterina; Zuccaro, Giulio; De Gregorio, Daniela; Nardone, Stefano; Uchuchoque, Juan Alexis Luque; Arguedas, Anita; Macedo, Luisa; Manrique Llerena, Nélida
2016-10-01
The focus of this study is an analysis of building vulnerability through investigating impacts from the 8 February 2013 flash flood event along the Avenida Venezuela channel in the city of Arequipa, Peru. On this day, 124.5 mm of rain fell within 3 h (monthly mean: 29.3 mm) triggering a flash flood that inundated at least 0.4 km2 of urban settlements along the channel, affecting more than 280 buildings, 23 of a total of 53 bridges (pedestrian, vehicle and railway), and leading to the partial collapse of sections of the main road, paralyzing central parts of the city for more than one week. This study assesses the aspects of building design and site specific environmental characteristics that render a building vulnerable by considering the example of a flash flood event in February 2013. A statistical methodology is developed that enables estimation of damage probability for buildings. The applied method uses observed inundation height as a hazard proxy in areas where more detailed hydrodynamic modeling data is not available. Building design and site-specific environmental conditions determine the physical vulnerability. The mathematical approach considers both physical vulnerability and hazard related parameters and helps to reduce uncertainty in the determination of descriptive parameters, parameter interdependency and respective contributions to damage. This study aims to (1) enable the estimation of damage probability for a certain hazard intensity, and (2) obtain data to visualize variations in damage susceptibility for buildings in flood prone areas. Data collection is based on a post-flood event field survey and the analysis of high (sub-metric) spatial resolution images (Pléiades 2012, 2013). An inventory of 30 city blocks was collated in a GIS database in order to estimate the physical vulnerability of buildings. As many as 1103 buildings were surveyed along the affected drainage and 898 buildings were included in the statistical analysis. Univariate and
International Nuclear Information System (INIS)
Hu, Chao; Jain, Gaurav; Zhang, Puqiang; Schmidt, Craig; Gomadam, Parthasarathy; Gorka, Tom
2014-01-01
Highlights: • We develop a data-driven method for the battery capacity estimation. • Five charge-related features that are indicative of the capacity are defined. • The kNN regression model captures the dependency of the capacity on the features. • Results with 10 years’ continuous cycling data verify the effectiveness of the method. - Abstract: Reliability of lithium-ion (Li-ion) rechargeable batteries used in implantable medical devices has been recognized as of high importance from a broad range of stakeholders, including medical device manufacturers, regulatory agencies, physicians, and patients. To ensure Li-ion batteries in these devices operate reliably, it is important to be able to assess the battery health condition by estimating the battery capacity over the life-time. This paper presents a data-driven method for estimating the capacity of Li-ion battery based on the charge voltage and current curves. The contributions of this paper are three-fold: (i) the definition of five characteristic features of the charge curves that are indicative of the capacity, (ii) the development of a non-linear kernel regression model, based on the k-nearest neighbor (kNN) regression, that captures the complex dependency of the capacity on the five features, and (iii) the adaptation of particle swarm optimization (PSO) to finding the optimal combination of feature weights for creating a kNN regression model that minimizes the cross validation (CV) error in the capacity estimation. Verification with 10 years’ continuous cycling data suggests that the proposed method is able to accurately estimate the capacity of Li-ion battery throughout the whole life-time
Zhu, Xiaofeng; Suk, Heung-Il; Wang, Li; Lee, Seong-Whan; Shen, Dinggang
2017-05-01
In this paper, we focus on joint regression and classification for Alzheimer's disease diagnosis and propose a new feature selection method by embedding the relational information inherent in the observations into a sparse multi-task learning framework. Specifically, the relational information includes three kinds of relationships (such as feature-feature relation, response-response relation, and sample-sample relation), for preserving three kinds of the similarity, such as for the features, the response variables, and the samples, respectively. To conduct feature selection, we first formulate the objective function by imposing these three relational characteristics along with an ℓ 2,1 -norm regularization term, and further propose a computationally efficient algorithm to optimize the proposed objective function. With the dimension-reduced data, we train two support vector regression models to predict the clinical scores of ADAS-Cog and MMSE, respectively, and also a support vector classification model to determine the clinical label. We conducted extensive experiments on the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset to validate the effectiveness of the proposed method. Our experimental results showed the efficacy of the proposed method in enhancing the performances of both clinical scores prediction and disease status identification, compared to the state-of-the-art methods. Copyright © 2015 Elsevier B.V. All rights reserved.
INLA goes extreme: Bayesian tail regression for the estimation of high spatio-temporal quantiles
Opitz, Thomas
2018-05-25
This work is motivated by the challenge organized for the 10th International Conference on Extreme-Value Analysis (EVA2017) to predict daily precipitation quantiles at the 99.8% level for each month at observed and unobserved locations. Our approach is based on a Bayesian generalized additive modeling framework that is designed to estimate complex trends in marginal extremes over space and time. First, we estimate a high non-stationary threshold using a gamma distribution for precipitation intensities that incorporates spatial and temporal random effects. Then, we use the Bernoulli and generalized Pareto (GP) distributions to model the rate and size of threshold exceedances, respectively, which we also assume to vary in space and time. The latent random effects are modeled additively using Gaussian process priors, which provide high flexibility and interpretability. We develop a penalized complexity (PC) prior specification for the tail index that shrinks the GP model towards the exponential distribution, thus preventing unrealistically heavy tails. Fast and accurate estimation of the posterior distributions is performed thanks to the integrated nested Laplace approximation (INLA). We illustrate this methodology by modeling the daily precipitation data provided by the EVA2017 challenge, which consist of observations from 40 stations in the Netherlands recorded during the period 1972–2016. Capitalizing on INLA’s fast computational capacity and powerful distributed computing resources, we conduct an extensive cross-validation study to select the model parameters that govern the smoothness of trends. Our results clearly outperform simple benchmarks and are comparable to the best-scoring approaches of the other teams.
CSIR Research Space (South Africa)
Gregor, Luke
2017-12-01
Full Text Available understanding with spatially integrated air–sea flux estimates (Fay and McKinley, 2014). Conversely, ocean biogeochemical process models are good tools for mechanis- tic understanding, but fail to represent the seasonality of CO2 fluxes in the Southern Ocean... of including coordinate variables as proxies of 1pCO2 in the empirical methods. In the inter- comparison study by Rödenbeck et al. (2015) proxies typi- cally include, but are not limited to, sea surface temperature (SST), chlorophyll a (Chl a), mixed layer...
Directory of Open Access Journals (Sweden)
Marjan Čeh
2018-05-01
Full Text Available The goal of this study is to analyse the predictive performance of the random forest machine learning technique in comparison to commonly used hedonic models based on multiple regression for the prediction of apartment prices. A data set that includes 7407 records of apartment transactions referring to real estate sales from 2008–2013 in the city of Ljubljana, the capital of Slovenia, was used in order to test and compare the predictive performances of both models. Apparent challenges faced during modelling included (1 the non-linear nature of the prediction assignment task; (2 input data being based on transactions occurring over a period of great price changes in Ljubljana whereby a 28% decline was noted in six consecutive testing years; and (3 the complex urban form of the case study area. Available explanatory variables, organised as a Geographic Information Systems (GIS ready dataset, including the structural and age characteristics of the apartments as well as environmental and neighbourhood information were considered in the modelling procedure. All performance measures (R2 values, sales ratios, mean average percentage error (MAPE, coefficient of dispersion (COD revealed significantly better results for predictions obtained by the random forest method, which confirms the prospective of this machine learning technique on apartment price prediction.
Nose, Takashi; Kobayashi, Takao
In this paper, we propose a technique for estimating the degree or intensity of emotional expressions and speaking styles appearing in speech. The key idea is based on a style control technique for speech synthesis using a multiple regression hidden semi-Markov model (MRHSMM), and the proposed technique can be viewed as the inverse of the style control. In the proposed technique, the acoustic features of spectrum, power, fundamental frequency, and duration are simultaneously modeled using the MRHSMM. We derive an algorithm for estimating explanatory variables of the MRHSMM, each of which represents the degree or intensity of emotional expressions and speaking styles appearing in acoustic features of speech, based on a maximum likelihood criterion. We show experimental results to demonstrate the ability of the proposed technique using two types of speech data, simulated emotional speech and spontaneous speech with different speaking styles. It is found that the estimated values have correlation with human perception.
DEFF Research Database (Denmark)
Fauser, Patrik; Thomsen, Marianne; Pistocchi, Alberto
2010-01-01
chemicals available in the European Chemicals Bureau risk assessment reports (RARs). The method suggests a simple linear relationship between Henry's Law constant, octanol-water coefficient, use and production volumes, and emissions and PECs on a regional scale in the European Union. Emissions and PECs......This paper proposes a simple method for estimating emissions and predicted environmental concentrations (PECs) in water and air for organic chemicals that are used in household products and industrial processes. The method has been tested on existing data for 63 organic high-production volume...... are a result of a complex interaction between chemical properties, production and use patterns and geographical characteristics. A linear relationship cannot capture these complexities; however, it may be applied at a cost-efficient screening level for suggesting critical chemicals that are candidates...
Directory of Open Access Journals (Sweden)
Tosun Erdi
2017-01-01
Full Text Available This study was aimed at estimating the variation of several engine control parameters within the rotational speed-load map, using regression analysis and artificial neural network techniques. Duration of injection, specific fuel consumption, exhaust gas at turbine inlet, and within the catalytic converter brick were chosen as the output parameters for the models, while engine speed and brake mean effective pressure were selected as independent variables for prediction. Measurements were performed on a turbocharged direct injection spark ignition engine fueled with gasoline. A three-layer feed-forward structure and back-propagation algorithm was used for training the artificial neural network. It was concluded that this technique is capable of predicting engine parameters with better accuracy than linear and non-linear regression techniques.
Suresh, Arumuganainar; Choi, Hong Lim
2011-10-01
Swine waste land application has increased due to organic fertilization, but excess application in an arable system can cause environmental risk. Therefore, in situ characterizations of such resources are important prior to application. To explore this, 41 swine slurry samples were collected from Korea, and wide differences were observed in the physico-biochemical properties. However, significant (Phydrometer, EC meter, drying oven and pH meter were found useful to estimate Mn, Fe, Ca, K, Al, Na, N and 5-day biochemical oxygen demands (BOD₅) at improved R² values of 0.83, 0.82, 0.77, 0.75, 0.67, 0.47, 0.88 and 0.70, respectively. The results from this study suggest that multiple property regressions can facilitate the prediction of micronutrients and organic matter much better than a single property regression for livestock waste. Copyright © 2011 Elsevier Ltd. All rights reserved.
Almirall, Daniel; Griffin, Beth Ann; McCaffrey, Daniel F.; Ramchand, Rajeev; Yuen, Robert A.; Murphy, Susan A.
2014-01-01
This article considers the problem of examining time-varying causal effect moderation using observational, longitudinal data in which treatment, candidate moderators, and possible confounders are time varying. The structural nested mean model (SNMM) is used to specify the moderated time-varying causal effects of interest in a conditional mean model for a continuous response given time-varying treatments and moderators. We present an easy-to-use estimator of the SNMM that combines an existing regression-with-residuals (RR) approach with an inverse-probability-of-treatment weighting (IPTW) strategy. The RR approach has been shown to identify the moderated time-varying causal effects if the time-varying moderators are also the sole time-varying confounders. The proposed IPTW+RR approach provides estimators of the moderated time-varying causal effects in the SNMM in the presence of an additional, auxiliary set of known and measured time-varying confounders. We use a small simulation experiment to compare IPTW+RR versus the traditional regression approach and to compare small and large sample properties of asymptotic versus bootstrap estimators of the standard errors for the IPTW+RR approach. This article clarifies the distinction between time-varying moderators and time-varying confounders. We illustrate the methodology in a case study to assess if time-varying substance use moderates treatment effects on future substance use. PMID:23873437
Watson, Kara M.; McHugh, Amy R.
2014-01-01
Regional regression equations were developed for estimating monthly flow-duration and monthly low-flow frequency statistics for ungaged streams in Coastal Plain and non-coastal regions of New Jersey for baseline and current land- and water-use conditions. The equations were developed to estimate 87 different streamflow statistics, which include the monthly 99-, 90-, 85-, 75-, 50-, and 25-percentile flow-durations of the minimum 1-day daily flow; the August–September 99-, 90-, and 75-percentile minimum 1-day daily flow; and the monthly 7-day, 10-year (M7D10Y) low-flow frequency. These 87 streamflow statistics were computed for 41 continuous-record streamflow-gaging stations (streamgages) with 20 or more years of record and 167 low-flow partial-record stations in New Jersey with 10 or more streamflow measurements. The regression analyses used to develop equations to estimate selected streamflow statistics were performed by testing the relation between flow-duration statistics and low-flow frequency statistics for 32 basin characteristics (physical characteristics, land use, surficial geology, and climate) at the 41 streamgages and 167 low-flow partial-record stations. The regression analyses determined drainage area, soil permeability, average April precipitation, average June precipitation, and percent storage (water bodies and wetlands) were the significant explanatory variables for estimating the selected flow-duration and low-flow frequency statistics. Streamflow estimates were computed for two land- and water-use conditions in New Jersey—land- and water-use during the baseline period of record (defined as the years a streamgage had little to no change in development and water use) and current land- and water-use conditions (1989–2008)—for each selected station using data collected through water year 2008. The baseline period of record is representative of a period when the basin was unaffected by change in development. The current period is
Regression relation for pure quantum states and its implications for efficient computing.
Elsayed, Tarek A; Fine, Boris V
2013-02-15
We obtain a modified version of the Onsager regression relation for the expectation values of quantum-mechanical operators in pure quantum states of isolated many-body quantum systems. We use the insights gained from this relation to show that high-temperature time correlation functions in many-body quantum systems can be controllably computed without complete diagonalization of the Hamiltonians, using instead the direct integration of the Schrödinger equation for randomly sampled pure states. This method is also applicable to quantum quenches and other situations describable by time-dependent many-body Hamiltonians. The method implies exponential reduction of the computer memory requirement in comparison with the complete diagonalization. We illustrate the method by numerically computing infinite-temperature correlation functions for translationally invariant Heisenberg chains of up to 29 spins 1/2. Thereby, we also test the spin diffusion hypothesis and find it in a satisfactory agreement with the numerical results. Both the derivation of the modified regression relation and the justification of the computational method are based on the notion of quantum typicality.
International Nuclear Information System (INIS)
Kim, J. Y.; Shin, C. H.; Kim, J. K.; Lee, J. K.; Park, Y. J.
2003-01-01
The variation transitions of the inventories for the liquid radwaste system and the radioactive gas have being released in containment, and their predictive values according to the operation histories of Yonggwang(YGN) 3 and 4 were analyzed by linear regression analysis methodology. The results show that the variation transitions of the inventories for those systems are linearly increasing according to the operation histories but the inventories released to the environment are considerably lower than the recommended values based on the FSAR suggestions. It is considered that some conservation were presented in the estimation methodology in preparing stage of FSAR
Directory of Open Access Journals (Sweden)
Mohd Faris Dziauddin
2017-07-01
Full Text Available This study estimates the effect of locational attributes on residential property values in Kuala Lumpur, Malaysia. Geographically weighted regression (GWR enables the use of the local parameter rather than the global parameter to be estimated, with the results presented in map form. The results of this study reveal that residential property values are mainly determined by the property’s physical (structural attributes, but proximity to locational attributes also contributes marginally. The use of GWR in this study is considered a better approach than other methods to examine the effect of locational attributes on residential property values. GWR has the capability to produce meaningful results in which different locational attributes have differential spatial effects across a geographical area on residential property values. This method has the ability to determine the factors on which premiums depend, and in turn it can assist the government in taxation matters.
Holtschlag, David J.
2011-01-01
In Michigan, index flow Q50 is a streamflow characteristic defined as the minimum of median flows for July, August, and September. The state of Michigan uses index flow estimates to help regulate large (greater than 100,000 gallons per day) water withdrawals to prevent adverse effects on characteristic fish populations. At sites where long-term streamgages are located, index flows are computed directly from continuous streamflow records as GageQ50. In an earlier study, a multiple-regression equation was developed to estimate index flows IndxQ50 at ungaged sites. The index equation explains about 94 percent of the variability of index flows at 147 (index) streamgages by use of six explanatory variables describing soil type, aquifer transmissivity, land cover, and precipitation characteristics. This report extends the results of the previous study, by use of Monte Carlo simulations, to evaluate alternative flow estimators, DiscQ50, IntgQ50, SiteQ50, and AugmQ50. The Monte Carlo simulations treated each of the available index streamgages, in turn, as a miscellaneous site where streamflow conditions are described by one or more instantaneous measurements of flow. In the simulations, instantaneous flows were approximated by daily mean flows at the corresponding site. All estimators use information that can be obtained from instantaneous flow measurements and contemporaneous daily mean flow data from nearby long-term streamgages. The efficacy of these estimators was evaluated over a set of measurement intensities in which the number of simulated instantaneous flow measurements ranged from 1 to 100 at a site. The discrete measurement estimator DiscQ50 is based on a simple linear regression developed between information on daily mean flows at five or more streamgages near the miscellaneous site and their corresponding GageQ50 index flows. The regression relation then was used to compute a DiscQ50 estimate at the miscellaneous site by use of the simulated instantaneous flow
Martínez Gila, Diego Manuel; Cano Marchal, Pablo; Gómez Ortega, Juan; Gámez García, Javier
2018-03-25
Normally the olive oil quality is assessed by chemical analysis according to international standards. These norms define chemical and organoleptic markers, and depending on the markers, the olive oil can be labelled as lampante, virgin, or extra virgin olive oil (EVOO), the last being an indicator of top quality. The polyphenol content is related to EVOO organoleptic features, and different scientific works have studied the positive influence that these compounds have on human health. The works carried out in this paper are focused on studying relations between the polyphenol content in olive oil samples and its spectral response in the near infrared spectra. In this context, several acquisition parameters have been assessed to optimize the measurement process within the virgin olive oil production process. The best regression model reached a mean error value of 156.14 mg/kg in leave one out cross validation, and the higher regression coefficient was 0.81 through holdout validation.
Directory of Open Access Journals (Sweden)
Diego Manuel Martínez Gila
2018-03-01
Full Text Available Normally the olive oil quality is assessed by chemical analysis according to international standards. These norms define chemical and organoleptic markers, and depending on the markers, the olive oil can be labelled as lampante, virgin, or extra virgin olive oil (EVOO, the last being an indicator of top quality. The polyphenol content is related to EVOO organoleptic features, and different scientific works have studied the positive influence that these compounds have on human health. The works carried out in this paper are focused on studying relations between the polyphenol content in olive oil samples and its spectral response in the near infrared spectra. In this context, several acquisition parameters have been assessed to optimize the measurement process within the virgin olive oil production process. The best regression model reached a mean error value of 156.14 mg/kg in leave one out cross validation, and the higher regression coefficient was 0.81 through holdout validation.
Motlagh, Mohadeseh Ghanbari; Kafaky, Sasan Babaie; Mataji, Asadollah; Akhavan, Reza
2018-05-21
Hyrcanian forests of North of Iran are of great importance in terms of various economic and environmental aspects. In this study, Spot-6 satellite images and regression models were applied to estimate above-ground biomass in these forests. This research was carried out in six compartments in three climatic (semi-arid to humid) types and two altitude classes. In the first step, ground sampling methods at the compartment level were used to estimate aboveground biomass (Mg/ha). Then, by reviewing the results of other studies, the most appropriate vegetation indices were selected. In this study, three indices of NDVI, RVI, and TVI were calculated. We investigated the relationship between the vegetation indices and aboveground biomass measured at sample-plot level. Based on the results, the relationship between aboveground biomass values and vegetation indices was a linear regression with the highest level of significance for NDVI in all compartments. Since at the compartment level the correlation coefficient between NDVI and aboveground biomass was the highest, NDVI was used for mapping aboveground biomass. According to the results of this study, biomass values were highly different in various climatic and altitudinal classes with the highest biomass value observed in humid climate and high-altitude class.
Directory of Open Access Journals (Sweden)
Maria Gabriela Campolina Diniz Peixoto
2014-05-01
Full Text Available The objective of this work was to compare random regression models for the estimation of genetic parameters for Guzerat milk production, using orthogonal Legendre polynomials. Records (20,524 of test-day milk yield (TDMY from 2,816 first-lactation Guzerat cows were used. TDMY grouped into 10-monthly classes were analyzed for additive genetic effect and for environmental and residual permanent effects (random effects, whereas the contemporary group, calving age (linear and quadratic effects and mean lactation curve were analized as fixed effects. Trajectories for the additive genetic and permanent environmental effects were modeled by means of a covariance function employing orthogonal Legendre polynomials ranging from the second to the fifth order. Residual variances were considered in one, four, six, or ten variance classes. The best model had six residual variance classes. The heritability estimates for the TDMY records varied from 0.19 to 0.32. The random regression model that used a second-order Legendre polynomial for the additive genetic effect, and a fifth-order polynomial for the permanent environmental effect is adequate for comparison by the main employed criteria. The model with a second-order Legendre polynomial for the additive genetic effect, and that with a fourth-order for the permanent environmental effect could also be employed in these analyses.
Directory of Open Access Journals (Sweden)
Tetsuya Matsubayashi
Full Text Available Evidence collected in many parts of the world suggests that, compared to older students, students who are relatively younger at school entry tend to have worse academic performance and lower levels of income. This study examined how relative age in a grade affects suicide rates of adolescents and young adults between 15 and 25 years of age using data from Japan.We examined individual death records in the Vital Statistics of Japan from 1989 to 2010. In contrast to other countries, late entry to primary school is not allowed in Japan. We took advantage of the school entry cutoff date to implement a regression discontinuity (RD design, assuming that the timing of births around the school entry cutoff date was randomly determined and therefore that individuals who were born just before and after the cutoff date have similar baseline characteristics.We found that those who were born right before the school cutoff day and thus youngest in their cohort have higher mortality rates by suicide, compared to their peers who were born right after the cutoff date and thus older. We also found that those with relative age disadvantage tend to follow a different career path than those with relative age advantage, which may explain their higher suicide mortality rates.Relative age effects have broader consequences than was previously supposed. This study suggests that policy intervention that alleviates the relative age effect can be important.
Chirombo, James; Lowe, Rachel; Kazembe, Lawrence
2014-01-01
Background After years of implementing Roll Back Malaria (RBM) interventions, the changing landscape of malaria in terms of risk factors and spatial pattern has not been fully investigated. This paper uses the 2010 malaria indicator survey data to investigate if known malaria risk factors remain relevant after many years of interventions. Methods We adopted a structured additive logistic regression model that allowed for spatial correlation, to more realistically estimate malaria risk factors. Our model included child and household level covariates, as well as climatic and environmental factors. Continuous variables were modelled by assuming second order random walk priors, while spatial correlation was specified as a Markov random field prior, with fixed effects assigned diffuse priors. Inference was fully Bayesian resulting in an under five malaria risk map for Malawi. Results Malaria risk increased with increasing age of the child. With respect to socio-economic factors, the greater the household wealth, the lower the malaria prevalence. A general decline in malaria risk was observed as altitude increased. Minimum temperatures and average total rainfall in the three months preceding the survey did not show a strong association with disease risk. Conclusions The structured additive regression model offered a flexible extension to standard regression models by enabling simultaneous modelling of possible nonlinear effects of continuous covariates, spatial correlation and heterogeneity, while estimating usual fixed effects of categorical and continuous observed variables. Our results confirmed that malaria epidemiology is a complex interaction of biotic and abiotic factors, both at the individual, household and community level and that risk factors are still relevant many years after extensive implementation of RBM activities. PMID:24991915
Zhang, L; Liu, X J
2016-06-03
With the rapid development of next-generation high-throughput sequencing technology, RNA-seq has become a standard and important technique for transcriptome analysis. For multi-sample RNA-seq data, the existing expression estimation methods usually deal with each single-RNA-seq sample, and ignore that the read distributions are consistent across multiple samples. In the current study, we propose a structured sparse regression method, SSRSeq, to estimate isoform expression using multi-sample RNA-seq data. SSRSeq uses a non-parameter model to capture the general tendency of non-uniformity read distribution for all genes across multiple samples. Additionally, our method adds a structured sparse regularization, which not only incorporates the sparse specificity between a gene and its corresponding isoform expression levels, but also reduces the effects of noisy reads, especially for lowly expressed genes and isoforms. Four real datasets were used to evaluate our method on isoform expression estimation. Compared with other popular methods, SSRSeq reduced the variance between multiple samples, and produced more accurate isoform expression estimations, and thus more meaningful biological interpretations.
Evaluating penalized logistic regression models to predict Heat-Related Electric grid stress days
Energy Technology Data Exchange (ETDEWEB)
Bramer, L. M.; Rounds, J.; Burleyson, C. D.; Fortin, D.; Hathaway, J.; Rice, J.; Kraucunas, I.
2017-11-01
Understanding the conditions associated with stress on the electricity grid is important in the development of contingency plans for maintaining reliability during periods when the grid is stressed. In this paper, heat-related grid stress and the relationship with weather conditions is examined using data from the eastern United States. Penalized logistic regression models were developed and applied to predict stress on the electric grid using weather data. The inclusion of other weather variables, such as precipitation, in addition to temperature improved model performance. Several candidate models and datasets were examined. A penalized logistic regression model fit at the operation-zone level was found to provide predictive value and interpretability. Additionally, the importance of different weather variables observed at different time scales were examined. Maximum temperature and precipitation were identified as important across all zones while the importance of other weather variables was zone specific. The methods presented in this work are extensible to other regions and can be used to aid in planning and development of the electrical grid.
Käyser, Sabine C; Dekkers, Tanja; Groenewoud, Hans J; van der Wilt, Gert Jan; Carel Bakx, J; van der Wel, Mark C; Hermus, Ad R; Lenders, Jacques W; Deinum, Jaap
2016-07-01
For health care planning and allocation of resources, realistic estimation of the prevalence of primary aldosteronism is necessary. Reported prevalences of primary aldosteronism are highly variable, possibly due to study heterogeneity. Our objective was to identify and explain heterogeneity in studies that aimed to establish the prevalence of primary aldosteronism in hypertensive patients. PubMed, EMBASE, Web of Science, Cochrane Library, and reference lists from January 1, 1990, to January 31, 2015, were used as data sources. Description of an adult hypertensive patient population with confirmed diagnosis of primary aldosteronism was included in this study. Dual extraction and quality assessment were the forms of data extraction. Thirty-nine studies provided data on 42 510 patients (nine studies, 5896 patients from primary care). Prevalence estimates varied from 3.2% to 12.7% in primary care and from 1% to 29.8% in referral centers. Heterogeneity was too high to establish point estimates (I(2) = 57.6% in primary care; 97.1% in referral centers). Meta-regression analysis showed higher prevalences in studies 1) published after 2000, 2) from Australia, 3) aimed at assessing prevalence of secondary hypertension, 4) that were retrospective, 5) that selected consecutive patients, and 6) not using a screening test. All studies had minor or major flaws. This study demonstrates that it is pointless to claim low or high prevalence of primary aldosteronism based on published reports. Because of the significant impact of a diagnosis of primary aldosteronism on health care resources and the necessary facilities, our findings urge for a prevalence study whose design takes into account the factors identified in the meta-regression analysis.
Directory of Open Access Journals (Sweden)
N. Zahir
2015-12-01
Full Text Available Lake Urmia is one of the most important ecosystems of the country which is on the verge of elimination. Many factors contribute to this crisis among them is the precipitation, paly important roll. Precipitation has many forms one of them is in the form of snow. The snow on Sahand Mountain is one of the main and important sources of the Lake Urmia’s water. Snow Depth (SD is vital parameters for estimating water balance for future year. In this regards, this study is focused on SD parameter using Special Sensor Microwave/Imager (SSM/I instruments on board the Defence Meteorological Satellite Program (DMSP F16. The usual statistical methods for retrieving SD include linear and non-linear ones. These methods used least square procedure to estimate SD model. Recently, kernel base methods widely used for modelling statistical problem. From these methods, the support vector regression (SVR is achieved the high performance for modelling the statistical problem. Examination of the obtained data shows the existence of outlier in them. For omitting these outliers, wavelet denoising method is applied. After the omission of the outliers it is needed to select the optimum bands and parameters for SVR. To overcome these issues, feature selection methods have shown a direct effect on improving the regression performance. We used genetic algorithm (GA for selecting suitable features of the SSMI bands in order to estimate SD model. The results for the training and testing data in Sahand mountain is [R²_TEST=0.9049 and RMSE= 6.9654] that show the high SVR performance.
Energy Technology Data Exchange (ETDEWEB)
Nakagawa, S [Maizuru National College of Technology, Kyoto (Japan); Kenmoku, Y; Sakakibara, T [Toyohashi University of Technology, Aichi (Japan); Kawamoto, T [Shizuoka University, Shizuoka (Japan). Faculty of Engineering
1996-10-27
Study is under way for a more accurate solar radiation quantity prediction for the enhancement of solar energy utilization efficiency. Utilizing the technique of roughly estimating the day`s clearness index from forecast weather, the forecast weather (constituted of weather conditions such as `clear,` `cloudy,` etc., and adverbs or adjectives such as `afterward,` `temporary,` and `intermittent`) has been quantified relative to the clearness index. This index is named the `weather index` for the purpose of this article. The error high in rate in the weather index relates to cloudy days, which means a weather index falling in 0.2-0.5. It has also been found that there is a high correlation between the clearness index and the north-south wind direction component. A multiple regression analysis has been carried out, under the circumstances, for the estimation of clearness index from the maximum temperature and the north-south wind direction component. As compared with estimation of the clearness index on the basis only of the weather index, estimation using the weather index and maximum temperature achieves a 3% improvement throughout the year. It has also been learned that estimation by use of the weather index and north-south wind direction component enables a 2% improvement for summer and a 5% or higher improvement for winter. 2 refs., 6 figs., 4 tabs.
Estimating water equivalent snow depth from related meteorological variables
International Nuclear Information System (INIS)
Steyaert, L.T.; LeDuc, S.K.; Strommen, N.D.; Nicodemus, M.L.; Guttman, N.B.
1980-05-01
Engineering design must take into consideration natural loads and stresses caused by meteorological elements, such as, wind, snow, precipitation and temperature. The purpose of this study was to determine a relationship of water equivalent snow depth measurements to meteorological variables. Several predictor models were evaluated for use in estimating water equivalent values. These models include linear regression, principal component regression, and non-linear regression models. Linear, non-linear and Scandanavian models are used to generate annual water equivalent estimates for approximately 1100 cooperative data stations where predictor variables are available, but which have no water equivalent measurements. These estimates are used to develop probability estimates of snow load for each station. Map analyses for 3 probability levels are presented
Demirturk Kocasarac, Husniye; Sinanoglu, Alper; Noujeim, Marcel; Helvacioglu Yigit, Dilek; Baydemir, Canan
2016-05-01
For forensic age estimation, radiographic assessment of third molar mineralization is important between 14 and 21 years which coincides with the legal age in most countries. The spheno-occipital synchondrosis (SOS) is an important growth site during development, and its use for age estimation is beneficial when combined with other markers. In this study, we aimed to develop a regression model to estimate and narrow the age range based on the radiologic assessment of third molar and SOS in a Turkish subpopulation. Panoramic radiographs and cone beam CT scans of 349 subjects (182 males, 167 females) with age between 8 and 25 were evaluated. Four-stage system was used to evaluate the fusion degree of SOS, and Demirjian's eight stages of development for calcification for third molars. The Pearson correlation indicated a strong positive relationship between age and third molar calcification for both sexes (r = 0.850 for females, r = 0.839 for males, P < 0.001) and also between age and SOS fusion for females (r = 0.814), but a moderate relationship was found for males (r = 0.599), P < 0.001). Based on the results obtained, an age determination formula using these scores was established.
International Nuclear Information System (INIS)
Edwards, L.L.; Freis, R.P.; Peters, L.G.; Gudiksen, P.H.; Pitovranov, S.E.
1993-01-01
The accuracy associated with assessing the environmental consequences of an accidental release of radioactivity is highly dependent on the knowledge of the source term characteristics, which are generally poorly known. The development of an automated numerical technique that integrates the radiological measurements with atmospheric dispersion modeling for more accurate source term estimation is reported. Often, this process of parameter estimation is performed by an emergency response assessor, who takes an intelligent first guess at the model parameters, then, comparing the model results with whatever measurements are available, makes an intuitive, informed next guess of the model parameters. This process may be repeated any number of times until the assessor feels that the model results are reasonable in terms of the measured observations. A new approach, based on a nonlinear least-squares regression scheme coupled with the existing Atmospheric Release Advisory Capability three-dimensional atmospheric dispersion models, is to supplement the assessor's intuition with automated mathematical methods that do not significantly increase the response time of the existing predictive models. The viability of the approach is evaluated by estimation of the known SF 6 tracer release rates associated with the Mesoscale Atmospheric Transport Studies tracer experiments conducted at the Savannah River Laboratory during 1983. These 19 experiments resulted in 14 successful, separate tracer releases with sampling of the tracer plumes along the cross-plume arc situated ∼30 km from the release site
A History of Regression and Related Model-Fitting in the Earth Sciences (1636?-2000)
International Nuclear Information System (INIS)
Howarth, Richard J.
2001-01-01
roots in meeting the evident need for improved estimators in spatial interpolation. Technical advances in regression analysis during the 1970s embraced the development of regression diagnostics and consequent attention to outliers; the recognition of problems caused by correlated predictors, and the subsequent introduction of ridge regression to overcome them; and techniques for fitting errors-in-variables and mixture models. Improvements in computational power have enabled ever more computer-intensive methods to be applied. These include algorithms which are robust in the presence of outliers, for example Rousseeuw's 1984 Least Median Squares; nonparametric smoothing methods, such as kernel-functions, splines and Cleveland's 1979 LOcally WEighted Scatterplot Smoother (LOWESS); and the Classification and Regression Tree (CART) technique of Breiman and others in 1984. Despite a continuing improvement in the rate of technology-transfer from the statistical to the earth-science community, despite an abrupt drop to a time-lag of about 10 years following the introduction of digital computers, these more recent developments are only just beginning to penetrate beyond the research community of earth scientists. Examples of applications to problem-solving in the earth sciences are given
Juan Collados-Lara, Antonio; Pardo-Iguzquiza, Eulogio; Pulido-Velazquez, David
2016-04-01
The estimation of Snow Water Equivalent (SWE) is essential for an appropriate assessment of the available water resources in Alpine catchment. The hydrologic regime in these areas is dominated by the storage of water in the snowpack, which is discharged to rivers throughout the melt season. An accurate estimation of the resources will be necessary for an appropriate analysis of the system operation alternatives using basin scale management models. In order to obtain an appropriate estimation of the SWE we need to know the spatial distribution snowpack and snow density within the Snow Cover Area (SCA). Data for these snow variables can be extracted from in-situ point measurements and air-borne/space-borne remote sensing observations. Different interpolation and simulation techniques have been employed for the estimation of the cited variables. In this paper we propose to estimate snowpack from a reduced number of ground-truth data (1 or 2 campaigns per year with 23 observation point from 2000-2014) and MODIS satellite-based observations in the Sierra Nevada Mountain (Southern Spain). Regression based methodologies has been used to study snowpack distribution using different kind of explicative variables: geographic, topographic, climatic. 40 explicative variables were considered: the longitude, latitude, altitude, slope, eastness, northness, radiation, maximum upwind slope and some mathematical transformation of each of them [Ln(v), (v)^-1; (v)^2; (v)^0.5). Eight different structure of regression models have been tested (combining 1, 2, 3 or 4 explicative variables). Y=B0+B1Xi (1); Y=B0+B1XiXj (2); Y=B0+B1Xi+B2Xj (3); Y=B0+B1Xi+B2XjXl (4); Y=B0+B1XiXk+B2XjXl (5); Y=B0+B1Xi+B2Xj+B3Xl (6); Y=B0+B1Xi+B2Xj+B3XlXk (7); Y=B0+B1Xi+B2Xj+B3Xl+B4Xk (8). Where: Y is the snow depth; (Xi, Xj, Xl, Xk) are the prediction variables (any of the 40 variables); (B0, B1, B2, B3) are the coefficients to be estimated. The ground data are employed to calibrate the multiple regressions. In
Maximum Correntropy Unscented Kalman Filter for Spacecraft Relative State Estimation
Directory of Open Access Journals (Sweden)
Xi Liu
2016-09-01
Full Text Available A new algorithm called maximum correntropy unscented Kalman filter (MCUKF is proposed and applied to relative state estimation in space communication networks. As is well known, the unscented Kalman filter (UKF provides an efficient tool to solve the non-linear state estimate problem. However, the UKF usually plays well in Gaussian noises. Its performance may deteriorate substantially in the presence of non-Gaussian noises, especially when the measurements are disturbed by some heavy-tailed impulsive noises. By making use of the maximum correntropy criterion (MCC, the proposed algorithm can enhance the robustness of UKF against impulsive noises. In the MCUKF, the unscented transformation (UT is applied to obtain a predicted state estimation and covariance matrix, and a nonlinear regression method with the MCC cost is then used to reformulate the measurement information. Finally, the UT is adopted to the measurement equation to obtain the filter state and covariance matrix. Illustrative examples demonstrate the superior performance of the new algorithm.
Berry, D P; Buckley, F; Dillon, P; Evans, R D; Rath, M; Veerkamp, R F
2003-11-01
Genetic (co)variances between body condition score (BCS), body weight (BW), milk yield, and fertility were estimated using a random regression animal model extended to multivariate analysis. The data analyzed included 81,313 BCS observations, 91,937 BW observations, and 100,458 milk test-day yields from 8725 multiparous Holstein-Friesian cows. A cubic random regression was sufficient to model the changing genetic variances for BCS, BW, and milk across different days in milk. The genetic correlations between BCS and fertility changed little over the lactation; genetic correlations between BCS and interval to first service and between BCS and pregnancy rate to first service varied from -0.47 to -0.31, and from 0.15 to 0.38, respectively. This suggests that maximum genetic gain in fertility from indirect selection on BCS should be based on measurements taken in midlactation when the genetic variance for BCS is largest. Selection for increased BW resulted in shorter intervals to first service, but more services and poorer pregnancy rates; genetic correlations between BW and pregnancy rate to first service varied from -0.52 to -0.45. Genetic selection for higher lactation milk yield alone through selection on increased milk yield in early lactation is likely to have a more deleterious effect on genetic merit for fertility than selection on higher milk yield in late lactation.
Cancer Related-Knowledge - Small Area Estimates
These model-based estimates are produced using statistical models that combine data from the Health Information National Trends Survey, and auxiliary variables obtained from relevant sources and borrow strength from other areas with similar characteristics.
Method-related estimates of sperm vitality.
Cooper, Trevor G; Hellenkemper, Barbara
2009-01-01
Comparison of methods that estimate viability of human spermatozoa by monitoring head membrane permeability revealed that wet preparations (whether using positive or negative phase-contrast microscopy) generated significantly higher percentages of nonviable cells than did air-dried eosin-nigrosin smears. Only with the latter method did the sum of motile (presumed live) and stained (presumed dead) preparations never exceed 100%, making this the method of choice for sperm viability estimates.
Chen, Ming-Jen; Hsu, Hui-Tsung; Lin, Cheng-Li; Ju, Wei-Yuan
2012-10-01
Human exposure to acrylamide (AA) through consumption of French fries and other foods has been recognized as a potential health concern. Here, we used a statistical non-linear regression model, based on the two most influential factors, cooking temperature and time, to estimate AA concentrations in French fries. The R(2) of the predictive model is 0.83, suggesting the developed model was significant and valid. Based on French fry intake survey data conducted in this study and eight frying temperature-time schemes which can produce tasty and visually appealing French fries, the Monte Carlo simulation results showed that if AA concentration is higher than 168 ppb, the estimated cancer risk for adolescents aged 13-18 years in Taichung City would be already higher than the target excess lifetime cancer risk (ELCR), and that by taking into account this limited life span only. In order to reduce the cancer risk associated with AA intake, the AA levels in French fries might have to be reduced even further if the epidemiological observations are valid. Our mathematical model can serve as basis for further investigations on ELCR including different life stages and behavior and population groups. Copyright © 2012 Elsevier Ltd. All rights reserved.
Directory of Open Access Journals (Sweden)
Lokuge Dona Manori Nimanthika Lokuge
2015-06-01
Full Text Available Due to high prevalence of dietary diseases and malnutrition in Sri Lanka, it is essential to assess food consumption patterns. Because pulses are a major source of nutrients, this paper employed the Linear Approximation of the Almost Ideal Demand System (LA/AIDS to estimate price and expenditure elasticities for six types of pulses, by utilizing the Household Income and Expenditure Survey, 2006/07. The infrequency of purchases, a typical problem encountered in LA/AIDS estimation is circumvented by using a probit regression in the first stage, to capture the effect of demographic factors, in consumption choice. Results reveal that the buying decision of pulses is influenced by the sector (rural, urban and estate, household size, education level, presence of children, prevalence of blood pressure and diabetes. All pulses types except dhal are highly responsive to their own prices. Dhal is identified as the most prominent choice among all other alternatives and hence, it is distinguished as a necessity whereas, the rest show luxurious behavior, with the income. Because dhal is an import product, consumption choices of dhal may be severely affected by any action which exporting countries introduce, while rest of the pulses will be affected by both price and income oriented policies.
Directory of Open Access Journals (Sweden)
Abdolreza Yazdani-Chamzini
2017-12-01
Full Text Available Cost estimation is an essential issue in feasibility studies in civil engineering. Many different methods can be applied to modelling costs. These methods can be divided into several main groups: (1 artificial intelligence, (2 statistical methods, and (3 analytical methods. In this paper, the multivariate regression (MVR method, which is one of the most popular linear models, and the artificial neural network (ANN method, which is widely applied to solving different prediction problems with a high degree of accuracy, have been combined to provide a cost estimate model for a shovel machine. This hybrid methodology is proposed, taking the advantages of MVR and ANN models in linear and nonlinear modelling, respectively. In the proposed model, the unique advantages of the MVR model in linear modelling are used first to recognize the existing linear structure in data, and, then, the ANN for determining nonlinear patterns in preprocessed data is applied. The results with three indices indicate that the proposed model is efficient and capable of increasing the prediction accuracy.
Directory of Open Access Journals (Sweden)
Leif E. Peterson
1997-11-01
Full Text Available A computer program for multifactor relative risks, confidence limits, and tests of hypotheses using regression coefficients and a variance-covariance matrix obtained from a previous additive or multiplicative regression analysis is described in detail. Data used by the program can be stored and input from an external disk-file or entered via the keyboard. The output contains a list of the input data, point estimates of single or joint effects, confidence intervals and tests of hypotheses based on a minimum modified chi-square statistic. Availability of the program is also discussed.
International Nuclear Information System (INIS)
Heinzel, F.; Mueller-Duysing, W.; Blattman, H.; Bacesa, L.; Rao, K.R.; Mindek, G.
In order to be able to test the therapeutic value of the pions in comparison with conventional X-rays, analyses of animal experiments with induced tumors, transplantation tumors, and comparative cellular kinetic studies of tissue cultures will be performed. So that differences in radiation effect and a possible superiority of the pion therapy be objectively acknowledged, the reaction systems to be tested must be as homogenous as possible. For this purpose, the dependence of the radiation related regression on various parameters such as sex, age of hosts, environmental factors radiation conditions (intensity, fractionation, and so on), tumor size, and so on, must be investigated on sterile animals in a sterile environment. The experiments should be conducted under conditions as close as possible to clinical ones. For comparison, the reaction of normal tissue (in vitro and in vivo) and of malignant cells in short-time tissue cultures will be analysed. Cellular kinetics, alteration of chromosomes and metabolic activity of the cells will be studied
Estimated Perennial Streams of Idaho and Related Geospatial Datasets
Rea, Alan; Skinner, Kenneth D.
2009-01-01
The perennial or intermittent status of a stream has bearing on many regulatory requirements. Because of changing technologies over time, cartographic representation of perennial/intermittent status of streams on U.S. Geological Survey (USGS) topographic maps is not always accurate and (or) consistent from one map sheet to another. Idaho Administrative Code defines an intermittent stream as one having a 7-day, 2-year low flow (7Q2) less than 0.1 cubic feet per second. To establish consistency with the Idaho Administrative Code, the USGS developed regional regression equations for Idaho streams for several low-flow statistics, including 7Q2. Using these regression equations, the 7Q2 streamflow may be estimated for naturally flowing streams anywhere in Idaho to help determine perennial/intermittent status of streams. Using these equations in conjunction with a Geographic Information System (GIS) technique known as weighted flow accumulation allows for an automated and continuous estimation of 7Q2 streamflow at all points along a stream, which in turn can be used to determine if a stream is intermittent or perennial according to the Idaho Administrative Code operational definition. The selected regression equations were applied to create continuous grids of 7Q2 estimates for the eight low-flow regression regions of Idaho. By applying the 0.1 ft3/s criterion, the perennial streams have been estimated in each low-flow region. Uncertainty in the estimates is shown by identifying a 'transitional' zone, corresponding to flow estimates of 0.1 ft3/s plus and minus one standard error. Considerable additional uncertainty exists in the model of perennial streams presented in this report. The regression models provide overall estimates based on general trends within each regression region. These models do not include local factors such as a large spring or a losing reach that may greatly affect flows at any given point. Site-specific flow data, assuming a sufficient period of
Paul, Suman; Ali, Muhammad; Chatterjee, Rima
2018-01-01
Velocity of compressional wave ( V P) of coal and non-coal lithology is predicted from five wells from the Bokaro coalfield (CF), India. Shear sonic travel time logs are not recorded for all wells under the study area. Shear wave velocity ( Vs) is available only for two wells: one from east and other from west Bokaro CF. The major lithologies of this CF are dominated by coal, shaly coal of Barakar formation. This paper focuses on the (a) relationship between Vp and Vs, (b) prediction of Vp using regression and neural network modeling and (c) estimation of maximum horizontal stress from image log. Coal characterizes with low acoustic impedance (AI) as compared to the overlying and underlying strata. The cross-plot between AI and Vp/ Vs is able to identify coal, shaly coal, shale and sandstone from wells in Bokaro CF. The relationship between Vp and Vs is obtained with excellent goodness of fit ( R 2) ranging from 0.90 to 0.93. Linear multiple regression and multi-layered feed-forward neural network (MLFN) models are developed for prediction Vp from two wells using four input log parameters: gamma ray, resistivity, bulk density and neutron porosity. Regression model predicted Vp shows poor fit (from R 2 = 0.28) to good fit ( R 2 = 0.79) with the observed velocity. MLFN model predicted Vp indicates satisfactory to good R2 values varying from 0.62 to 0.92 with the observed velocity. Maximum horizontal stress orientation from a well at west Bokaro CF is studied from Formation Micro-Imager (FMI) log. Breakouts and drilling-induced fractures (DIFs) are identified from the FMI log. Breakout length of 4.5 m is oriented towards N60°W whereas the orientation of DIFs for a cumulative length of 26.5 m is varying from N15°E to N35°E. The mean maximum horizontal stress in this CF is towards N28°E.
Directory of Open Access Journals (Sweden)
Hukharnsusatrue, A.
2005-11-01
Full Text Available The objective of this research is to compare multiple regression coefficients estimating methods with existence of multicollinearity among independent variables. The estimation methods are Ordinary Least Squares method (OLS, Restricted Least Squares method (RLS, Restricted Ridge Regression method (RRR and Restricted Liu method (RL when restrictions are true and restrictions are not true. The study used the Monte Carlo Simulation method. The experiment was repeated 1,000 times under each situation. The analyzed results of the data are demonstrated as follows. CASE 1: The restrictions are true. In all cases, RRR and RL methods have a smaller Average Mean Square Error (AMSE than OLS and RLS method, respectively. RRR method provides the smallest AMSE when the level of correlations is high and also provides the smallest AMSE for all level of correlations and all sample sizes when standard deviation is equal to 5. However, RL method provides the smallest AMSE when the level of correlations is low and middle, except in the case of standard deviation equal to 3, small sample sizes, RRR method provides the smallest AMSE.The AMSE varies with, most to least, respectively, level of correlations, standard deviation and number of independent variables but inversely with to sample size.CASE 2: The restrictions are not true.In all cases, RRR method provides the smallest AMSE, except in the case of standard deviation equal to 1 and error of restrictions equal to 5%, OLS method provides the smallest AMSE when the level of correlations is low or median and there is a large sample size, but the small sample sizes, RL method provides the smallest AMSE. In addition, when error of restrictions is increased, OLS method provides the smallest AMSE for all level, of correlations and all sample sizes, except when the level of correlations is high and sample sizes small. Moreover, the case OLS method provides the smallest AMSE, the most RLS method has a smaller AMSE than
Veerkamp, R F; Koenen, E P; De Jong, G
2001-10-01
Twenty type classifiers scored body condition (BCS) of 91,738 first-parity cows from 601 sires and 5518 maternal grandsires. Fertility data during first lactation were extracted for 177,220 cows, of which 67,278 also had a BCS observation, and first-lactation 305-d milk, fat, and protein yields were added for 180,631 cows. Heritabilities and genetic correlations were estimated using a sire-maternal grandsire model. Heritability of BCS was 0.38. Heritabilities for fertility traits were low (0.01 to 0.07), but genetic standard deviations were substantial, 9 d for days to first service and calving interval, 0.25 for number of services, and 5% for first-service conception. Phenotypic correlations between fertility and yield or BCS were small (-0.15 to 0.20). Genetic correlations between yield and all fertility traits were unfavorable (0.37 to 0.74). Genetic correlations with BCS were between -0.4 and -0.6 for calving interval and days to first service. Random regression analysis (RR) showed that correlations changed with days in milk for BCS. Little agreement was found between variances and correlations from RR, and analysis including a single month (mo 1 to 10) of data for BCS, especially during early and late lactation. However, this was due to excluding data from the conventional analysis, rather than due to the polynomials used. RR and a conventional five-traits model where BCS in mo 1, 4, 7, and 10 was treated as a separate traits (plus yield or fertility) gave similar results. Thus a parsimonious random regression model gave more realistic estimates for the (co)variances than a series of bivariate analysis on subsets of the data for BCS. A higher genetic merit for yield has unfavorable effects on fertility, but the genetic correlation suggests that BCS (at some stages of lactation) might help to alleviate the unfavorable effect of selection for higher yield on fertility.
A new relation to estimate nuclear radius
International Nuclear Information System (INIS)
Singh, M.; Kumar, Pradeep; Singh, Y.; Gupta, K.K.; Varshney, A.K.; Gupta, D.K.
2013-01-01
The uncertainty found in Grodzins semi empirical relation may be due to the non - consideration of asymmetry in the relation. In the present work we propose a new relation connecting B(E2; 2 1 + → 0 1 + ) and E2 1 + with asymmetric parameter γ
As a fast and effective technique, the multiple linear regression (MLR) method has been widely used in modeling and prediction of beach bacteria concentrations. Among previous works on this subject, however, several issues were insufficiently or inconsistently addressed. Those is...
Estimating the temporal distribution of exposure-related cancers
International Nuclear Information System (INIS)
Carter, R.L.; Sposto, R.; Preston, D.L.
1993-09-01
The temporal distribution of exposure-related cancers is relevant to the study of carcinogenic mechanisms. Statistical methods for extracting pertinent information from time-to-tumor data, however, are not well developed. Separation of incidence from 'latency' and the contamination of background cases are two problems. In this paper, we present methods for estimating both the conditional distribution given exposure-related cancers observed during the study period and the unconditional distribution. The methods adjust for confounding influences of background cases and the relationship between time to tumor and incidence. Two alternative methods are proposed. The first is based on a structured, theoretically derived model and produces direct inferences concerning the distribution of interest but often requires more-specialized software. The second relies on conventional modeling of incidence and is implemented through readily available, easily used computer software. Inferences concerning the effects of radiation dose and other covariates, however, are not always obtainable directly. We present three examples to illustrate the use of these two methods and suggest criteria for choosing between them. The first approach was used, with a log-logistic specification of the distribution of interest, to analyze times to bone sarcoma among a group of German patients injected with 224 Ra. Similarly, a log-logistic specification was used in the analysis of time to chronic myelogenous leukemias among male atomic-bomb survivors. We used the alternative approach, involving conventional modeling, to estimate the conditional distribution of exposure-related acute myelogenous leukemias among male atomic-bomb survivors, given occurrence between 1 October 1950 and 31 December 1985. All analyses were performed using Poisson regression methods for analyzing grouped survival data. (J.P.N.)
Directory of Open Access Journals (Sweden)
Corrado Dimauro
2010-01-01
Full Text Available Two methods of SNPs pre-selection based on single marker regression for the estimation of genomic breeding values (G-EBVs were compared using simulated data provided by the XII QTL-MAS workshop: i Bonferroni correction of the significance threshold and ii Permutation test to obtain the reference distribution of the null hypothesis and identify significant markers at P<0.01 and P<0.001 significance thresholds. From the set of markers significant at P<0.001, random subsets of 50% and 25% markers were extracted, to evaluate the effect of further reducing the number of significant SNPs on G-EBV predictions. The Bonferroni correction method allowed the identification of 595 significant SNPs that gave the best G-EBV accuracies in prediction generations (82.80%. The permutation methods gave slightly lower G-EBV accuracies even if a larger number of SNPs resulted significant (2,053 and 1,352 for 0.01 and 0.001 significance thresholds, respectively. Interestingly, halving or dividing by four the number of SNPs significant at P<0.001 resulted in an only slightly decrease of G-EBV accuracies. The genetic structure of the simulated population with few QTL carrying large effects, might have favoured the Bonferroni method.
Akita, Yasuyuki; Baldasano, Jose M; Beelen, Rob; Cirach, Marta; de Hoogh, Kees; Hoek, Gerard; Nieuwenhuijsen, Mark; Serre, Marc L; de Nazelle, Audrey
2014-04-15
In recognition that intraurban exposure gradients may be as large as between-city variations, recent air pollution epidemiologic studies have become increasingly interested in capturing within-city exposure gradients. In addition, because of the rapidly accumulating health data, recent studies also need to handle large study populations distributed over large geographic domains. Even though several modeling approaches have been introduced, a consistent modeling framework capturing within-city exposure variability and applicable to large geographic domains is still missing. To address these needs, we proposed a modeling framework based on the Bayesian Maximum Entropy method that integrates monitoring data and outputs from existing air quality models based on Land Use Regression (LUR) and Chemical Transport Models (CTM). The framework was applied to estimate the yearly average NO2 concentrations over the region of Catalunya in Spain. By jointly accounting for the global scale variability in the concentration from the output of CTM and the intraurban scale variability through LUR model output, the proposed framework outperformed more conventional approaches.
Directory of Open Access Journals (Sweden)
Saeid Mirzaei
2017-10-01
molecular date (as independent variable and morphological data (as dependent variable was performed using multiple regression analysis to identify informative markers associated with the yield related traits. Multiple regression analysis was conducted using stepwise method of linear regression analysis option of SPSS. Student t-test was performed to assess significance difference between mean trait estimates of genotypes where specific markers were present and absent. Markers shown significant regression values were considered to be associated with the trait under consideration. Results and Discussion: Finally 11 primers were polymorphic and a total of 56 pieces (loci were amplified that among these, 36 segments (64.29% showed polymorphism with an average of 5.09% per primers and the rate of this polymorphism ranged from at least 25% for AJ05 primer up to 87.5% for OPAD02 primer. Polymorphic information content ranged from 0.095 (AJ05 and OPAD14 to 0.39 (OPC05, with an average of 0.23. Stepwise regression analysis between molecular data and traits was performed to identify informative markers associated with yield component traits. Nineteen RAPD fragments were found associated with six yield related traits. Some of RAPD markers were associated with more than one trait in multiple regression analysis that may be due to pleiotropic effect of the linked quantitative trait locus on different traits. However, to better understand these relationships, preparation of segregating population and linkage mapping is necessary. Also, these results could be useful in marker-assisted breeding programs when no other genetic information is available. Conclusion: This investigation on molecular markers associated with yield traits in Pistachio has provided clues for identification of the genotypes with higher yield value. In breeding programs selection of quality material is often a time-consuming process, and thus marker-assisted selection could be of great useful in identification of
Alishiri, Gholam Hossein; Bayat, Noushin; Fathi Ashtiani, Ali; Tavallaii, Seyed Abbas; Assari, Shervin; Moharamzad, Yashar
2008-01-01
The aim of this work was to develop two logistic regression models capable of predicting physical and mental health related quality of life (HRQOL) among rheumatoid arthritis (RA) patients. In this cross-sectional study which was conducted during 2006 in the outpatient rheumatology clinic of our university hospital, Short Form 36 (SF-36) was used for HRQOL measurements in 411 RA patients. A cutoff point to define poor versus good HRQOL was calculated using the first quartiles of SF-36 physical and mental component scores (33.4 and 36.8, respectively). Two distinct logistic regression models were used to derive predictive variables including demographic, clinical, and psychological factors. The sensitivity, specificity, and accuracy of each model were calculated. Poor physical HRQOL was positively associated with pain score, disease duration, monthly family income below 300 US$, comorbidity, patient global assessment of disease activity or PGA, and depression (odds ratios: 1.1; 1.004; 15.5; 1.1; 1.02; 2.08, respectively). The variables that entered into the poor mental HRQOL prediction model were monthly family income below 300 US$, comorbidity, PGA, and bodily pain (odds ratios: 6.7; 1.1; 1.01; 1.01, respectively). Optimal sensitivity and specificity were achieved at a cutoff point of 0.39 for the estimated probability of poor physical HRQOL and 0.18 for mental HRQOL. Sensitivity, specificity, and accuracy of the physical and mental models were 73.8, 87, 83.7% and 90.38, 70.36, 75.43%, respectively. The results show that the suggested models can be used to predict poor physical and mental HRQOL separately among RA patients using simple variables with acceptable accuracy. These models can be of use in the clinical decision-making of RA patients and to recognize patients with poor physical or mental HRQOL in advance, for better management.
Coskuntuncel, Orkun
2013-01-01
The purpose of this study is two-fold; the first aim being to show the effect of outliers on the widely used least squares regression estimator in social sciences. The second aim is to compare the classical method of least squares with the robust M-estimator using the "determination of coefficient" (R[superscript 2]). For this purpose,…
Directory of Open Access Journals (Sweden)
C. I. Cho
2016-05-01
Full Text Available The objectives of the study were to estimate genetic parameters for milk production traits of Holstein cattle using random regression models (RRMs, and to compare the goodness of fit of various RRMs with homogeneous and heterogeneous residual variances. A total of 126,980 test-day milk production records of the first parity Holstein cows between 2007 and 2014 from the Dairy Cattle Improvement Center of National Agricultural Cooperative Federation in South Korea were used. These records included milk yield (MILK, fat yield (FAT, protein yield (PROT, and solids-not-fat yield (SNF. The statistical models included random effects of genetic and permanent environments using Legendre polynomials (LP of the third to fifth order (L3–L5, fixed effects of herd-test day, year-season at calving, and a fixed regression for the test-day record (third to fifth order. The residual variances in the models were either homogeneous (HOM or heterogeneous (15 classes, HET15; 60 classes, HET60. A total of nine models (3 orders of polynomials×3 types of residual variance including L3-HOM, L3-HET15, L3-HET60, L4-HOM, L4-HET15, L4-HET60, L5-HOM, L5-HET15, and L5-HET60 were compared using Akaike information criteria (AIC and/or Schwarz Bayesian information criteria (BIC statistics to identify the model(s of best fit for their respective traits. The lowest BIC value was observed for the models L5-HET15 (MILK; PROT; SNF and L4-HET15 (FAT, which fit the best. In general, the BIC values of HET15 models for a particular polynomial order was lower than that of the HET60 model in most cases. This implies that the orders of LP and types of residual variances affect the goodness of models. Also, the heterogeneity of residual variances should be considered for the test-day analysis. The heritability estimates of from the best fitted models ranged from 0.08 to 0.15 for MILK, 0.06 to 0.14 for FAT, 0.08 to 0.12 for PROT, and 0.07 to 0.13 for SNF according to days in milk of first
Baldwin, Austin K.; Robertson, Dale M.; Saad, David A.; Magruder, Christopher
2013-01-01
In 2008, the U.S. Geological Survey and the Milwaukee Metropolitan Sewerage District initiated a study to develop regression models to estimate real-time concentrations and loads of chloride, suspended solids, phosphorus, and bacteria in streams near Milwaukee, Wisconsin. To collect monitoring data for calibration of models, water-quality sensors and automated samplers were installed at six sites in the Menomonee River drainage basin. The sensors continuously measured four potential explanatory variables: water temperature, specific conductance, dissolved oxygen, and turbidity. Discrete water-quality samples were collected and analyzed for five response variables: chloride, total suspended solids, total phosphorus, Escherichia coli bacteria, and fecal coliform bacteria. Using the first year of data, regression models were developed to continuously estimate the response variables on the basis of the continuously measured explanatory variables. Those models were published in a previous report. In this report, those models are refined using 2 years of additional data, and the relative improvement in model predictability is discussed. In addition, a set of regression models is presented for a new site in the Menomonee River Basin, Underwood Creek at Wauwatosa. The refined models use the same explanatory variables as the original models. The chloride models all used specific conductance as the explanatory variable, except for the model for the Little Menomonee River near Freistadt, which used both specific conductance and turbidity. Total suspended solids and total phosphorus models used turbidity as the only explanatory variable, and bacteria models used water temperature and turbidity as explanatory variables. An analysis of covariance (ANCOVA), used to compare the coefficients in the original models to those in the refined models calibrated using all of the data, showed that only 3 of the 25 original models changed significantly. Root-mean-squared errors (RMSEs
Ibanez, C. A. G.; Carcellar, B. G., III; Paringit, E. C.; Argamosa, R. J. L.; Faelga, R. A. G.; Posilero, M. A. V.; Zaragosa, G. P.; Dimayacyac, N. A.
2016-06-01
Diameter-at-Breast-Height Estimation is a prerequisite in various allometric equations estimating important forestry indices like stem volume, basal area, biomass and carbon stock. LiDAR Technology has a means of directly obtaining different forest parameters, except DBH, from the behavior and characteristics of point cloud unique in different forest classes. Extensive tree inventory was done on a two-hectare established sample plot in Mt. Makiling, Laguna for a natural growth forest. Coordinates, height, and canopy cover were measured and types of species were identified to compare to LiDAR derivatives. Multiple linear regression was used to get LiDAR-derived DBH by integrating field-derived DBH and 27 LiDAR-derived parameters at 20m, 10m, and 5m grid resolutions. To know the best combination of parameters in DBH Estimation, all possible combinations of parameters were generated and automated using python scripts and additional regression related libraries such as Numpy, Scipy, and Scikit learn were used. The combination that yields the highest r-squared or coefficient of determination and lowest AIC (Akaike's Information Criterion) and BIC (Bayesian Information Criterion) was determined to be the best equation. The equation is at its best using 11 parameters at 10mgrid size and at of 0.604 r-squared, 154.04 AIC and 175.08 BIC. Combination of parameters may differ among forest classes for further studies. Additional statistical tests can be supplemented to help determine the correlation among parameters such as Kaiser- Meyer-Olkin (KMO) Coefficient and the Barlett's Test for Spherecity (BTS).
ESTIMATION OF INTRINSIC AND EXTRINSIC ENVIRONMENT FACTORS OF AGE-RELATED TOOTH COLOUR CHANGES
Czech Academy of Sciences Publication Activity Database
Hyšpler, P.; Jezbera, D.; Fürst, T.; Mikšík, Ivan; Waclawek, M.
2010-01-01
Roč. 17, č. 4 (2010), s. 515-525 ISSN 1898-6196 Institutional research plan: CEZ:AV0Z50110509 Keywords : age-related colour changes of teeth * intrinsic and extrinsic factors * 3D mathematical regression models * estimation of real age Subject RIV: ED - Physiology Impact factor: 0.294, year: 2010
Lee, Chi M.; Schock, Harold J.
1988-01-01
Currently, the heat transfer equation used in the rotary combustion engine (RCE) simulation model is taken from piston engine studies. These relations have been empirically developed by the experimental input coming from piston engines whose geometry differs considerably from that of the RCE. The objective of this work was to derive equations to estimate heat transfer coefficients in the combustion chamber of an RCE. This was accomplished by making detailed temperature and pressure measurements in a direct injection stratified charge (DISC) RCE under a range of conditions. For each specific measurement point, the local gas velocity was assumed equal to the local rotor tip speed. Local physical properties of the fluids were then calculated. Two types of correlation equations were derived and are described in this paper. The first correlation expresses the Nusselt number as a function of the Prandtl number, Reynolds number, and characteristic temperature ratio; the second correlation expresses the forced convection heat transfer coefficient as a function of fluid temperature, pressure and velocity.
Wiley, Kristofor R.
2013-01-01
Many of the social and emotional needs that have historically been associated with gifted students have been questioned on the basis of recent empirical evidence. Research on the topic, however, is often limited by sample size, selection bias, or definition. This study addressed these limitations by applying linear regression methodology to data…
Hu, L; Liang, M; Mouraux, A; Wise, R G; Hu, Y; Iannetti, G D
2011-12-01
Across-trial averaging is a widely used approach to enhance the signal-to-noise ratio (SNR) of event-related potentials (ERPs). However, across-trial variability of ERP latency and amplitude may contain physiologically relevant information that is lost by across-trial averaging. Hence, we aimed to develop a novel method that uses 1) wavelet filtering (WF) to enhance the SNR of ERPs and 2) a multiple linear regression with a dispersion term (MLR(d)) that takes into account shape distortions to estimate the single-trial latency and amplitude of ERP peaks. Using simulated ERP data sets containing different levels of noise, we provide evidence that, compared with other approaches, the proposed WF+MLR(d) method yields the most accurate estimate of single-trial ERP features. When applied to a real laser-evoked potential data set, the WF+MLR(d) approach provides reliable estimation of single-trial latency, amplitude, and morphology of ERPs and thereby allows performing meaningful correlations at single-trial level. We obtained three main findings. First, WF significantly enhances the SNR of single-trial ERPs. Second, MLR(d) effectively captures and measures the variability in the morphology of single-trial ERPs, thus providing an accurate and unbiased estimate of their peak latency and amplitude. Third, intensity of pain perception significantly correlates with the single-trial estimates of N2 and P2 amplitude. These results indicate that WF+MLR(d) can be used to explore the dynamics between different ERP features, behavioral variables, and other neuroimaging measures of brain activity, thus providing new insights into the functional significance of the different brain processes underlying the brain responses to sensory stimuli.
Directory of Open Access Journals (Sweden)
Stefanie M. Herrmann
2013-10-01
Full Text Available Field trees are an integral part of the farmed parkland landscape in West Africa and provide multiple benefits to the local environment and livelihoods. While field trees have received increasing interest in the context of strengthening resilience to climate variability and change, the actual extent of farmed parkland and spatial patterns of tree cover are largely unknown. We used the rule-based predictive modeling tool Cubist® to estimate field tree cover in the west-central agricultural region of Senegal. A collection of rules and associated multiple linear regression models was constructed from (1 a reference dataset of percent tree cover derived from very high spatial resolution data (2 m Orbview as the dependent variable, and (2 ten years of 10-day 250 m Moderate Resolution Imaging Spectrometer (MODIS Normalized Difference Vegetation Index (NDVI composites and derived phenological metrics as independent variables. Correlation coefficients between modeled and reference percent tree cover of 0.88 and 0.77 were achieved for training and validation data respectively, with absolute mean errors of 1.07 and 1.03 percent tree cover. The resulting map shows a west-east gradient from high tree cover in the peri-urban areas of horticulture and arboriculture to low tree cover in the more sparsely populated eastern part of the study area. A comparison of current (2000s tree cover along this gradient with historic cover as seen on Corona images reveals dynamics of change but also areas of remarkable stability of field tree cover since 1968. The proposed modeling approach can help to identify locations of high and low tree cover in dryland environments and guide ground studies and management interventions aimed at promoting the integration of field trees in agricultural systems.
Passos, Cláudia P; Cardoso, Susana M; Barros, António S; Silva, Carlos M; Coimbra, Manuel A
2010-02-28
Fourier transform infrared (FTIR) spectroscopy has being emphasised as a widespread technique in the quick assess of food components. In this work, procyanidins were extracted with methanol and acetone/water from the seeds of white and red grape varieties. A fractionation by graded methanol/chloroform precipitations allowed to obtain 26 samples that were characterised using thiolysis as pre-treatment followed by HPLC-UV and MS detection. The average degree of polymerisation (DPn) of the procyanidins in the samples ranged from 2 to 11 flavan-3-ol residues. FTIR spectroscopy within the wavenumbers region of 1800-700 cm(-1) allowed to build a partial least squares (PLS1) regression model with 8 latent variables (LVs) for the estimation of the DPn, giving a RMSECV of 11.7%, with a R(2) of 0.91 and a RMSEP of 2.58. The application of orthogonal projection to latent structures (O-PLS1) clarifies the interpretation of the regression model vectors. Moreover, the O-PLS procedure has removed 88% of non-correlated variations with the DPn, allowing to relate the increase of the absorbance peaks at 1203 and 1099 cm(-1) with the increase of the DPn due to the higher proportion of substitutions in the aromatic ring of the polymerised procyanidin molecules. Copyright 2009 Elsevier B.V. All rights reserved.
2017-03-23
Logistic Regression to Estimate the Median Will-Cost and Probability of Cost and Schedule Overrun for Program Managers Ryan C. Trudelle, B.S...not the other. We are able to give logistic regression models to program managers that identify several program characteristics for either...considered acceptable. We recommend the use of our logistic models as a tool to manage a portfolio of programs in order to gain potential elusive
Toy, Brian C; Krishnadev, Nupura; Indaram, Maanasa; Cunningham, Denise; Cukras, Catherine A; Chew, Emily Y; Wong, Wai T
2013-09-01
To investigate the association of spontaneous drusen regression in intermediate age-related macular degeneration (AMD) with changes on fundus photography and fundus autofluorescence (FAF) imaging. Prospective observational case series. Fundus images from 58 eyes (in 58 patients) with intermediate AMD and large drusen were assessed over 2 years for areas of drusen regression that exceeded the area of circle C1 (diameter 125 μm; Age-Related Eye Disease Study grading protocol). Manual segmentation and computer-based image analysis were used to detect and delineate areas of drusen regression. Delineated regions were graded as to their appearance on fundus photographs and FAF images, and changes in FAF signal were graded manually and quantitated using automated image analysis. Drusen regression was detected in approximately half of study eyes using manual (48%) and computer-assisted (50%) techniques. At year-2, the clinical appearance of areas of drusen regression on fundus photography was mostly unremarkable, with a majority of eyes (71%) demonstrating no detectable clinical abnormalities, and the remainder (29%) showing minor pigmentary changes. However, drusen regression areas were associated with local changes in FAF that were significantly more prominent than changes on fundus photography. A majority of eyes (64%-66%) demonstrated a predominant decrease in overall FAF signal, while 14%-21% of eyes demonstrated a predominant increase in overall FAF signal. FAF imaging demonstrated that drusen regression in intermediate AMD was often accompanied by changes in local autofluorescence signal. Drusen regression may be associated with concurrent structural and physiologic changes in the outer retina. Published by Elsevier Inc.
Pedrini, D. T.; Pedrini, Bonnie C.
Regression, another mechanism studied by Sigmund Freud, has had much research, e.g., hypnotic regression, frustration regression, schizophrenic regression, and infra-human-animal regression (often directly related to fixation). Many investigators worked with hypnotic age regression, which has a long history, going back to Russian reflexologists.…
Lorenzo-Seva, Urbano; Ferrando, Pere J
2011-03-01
We provide an SPSS program that implements currently recommended techniques and recent developments for selecting variables in multiple linear regression analysis via the relative importance of predictors. The approach consists of: (1) optimally splitting the data for cross-validation, (2) selecting the final set of predictors to be retained in the equation regression, and (3) assessing the behavior of the chosen model using standard indices and procedures. The SPSS syntax, a short manual, and data files related to this article are available as supplemental materials from brm.psychonomic-journals.org/content/supplemental.
Filgueiras, Paulo R; Terra, Luciana A; Castro, Eustáquio V R; Oliveira, Lize M S L; Dias, Júlio C M; Poppi, Ronei J
2015-09-01
This paper aims to estimate the temperature equivalent to 10% (T10%), 50% (T50%) and 90% (T90%) of distilled volume in crude oils using (1)H NMR and support vector regression (SVR). Confidence intervals for the predicted values were calculated using a boosting-type ensemble method in a procedure called ensemble support vector regression (eSVR). The estimated confidence intervals obtained by eSVR were compared with previously accepted calculations from partial least squares (PLS) models and a boosting-type ensemble applied in the PLS method (ePLS). By using the proposed boosting strategy, it was possible to identify outliers in the T10% property dataset. The eSVR procedure improved the accuracy of the distillation temperature predictions in relation to standard PLS, ePLS and SVR. For T10%, a root mean square error of prediction (RMSEP) of 11.6°C was obtained in comparison with 15.6°C for PLS, 15.1°C for ePLS and 28.4°C for SVR. The RMSEPs for T50% were 24.2°C, 23.4°C, 22.8°C and 14.4°C for PLS, ePLS, SVR and eSVR, respectively. For T90%, the values of RMSEP were 39.0°C, 39.9°C and 39.9°C for PLS, ePLS, SVR and eSVR, respectively. The confidence intervals calculated by the proposed boosting methodology presented acceptable values for the three properties analyzed; however, they were lower than those calculated by the standard methodology for PLS. Copyright © 2015 Elsevier B.V. All rights reserved.
Bartoli, Francesco; Clerici, Massimo; Di Brita, Carmen; Riboldi, Ilaria; Crocamo, Cristina; Carrà, Giuseppe
2018-04-01
Randomised placebo-controlled trials investigating treatments for bipolar disorder have been hampered by wide variations of active drugs and placebo clinical response rates. It is important to estimate whether the active drug or placebo response has a greater influence in determining the relative efficacy of drugs for psychosis (antipsychotics) and relapse prevention (mood stabilisers) for bipolar depression and mania. We identified 53 randomised, placebo-controlled trials assessing antipsychotic or mood stabiliser monotherapy ('active drugs') for bipolar depression or mania. We carried out random-effects meta-regressions, estimating the influence of active drugs and placebo response rates on treatment relative efficacy. Meta-regressions showed that treatment relative efficacy for bipolar mania was influenced by the magnitude of clinical response to active drugs ( p=0.002), but not to placebo ( p=0.60). On the other hand, treatment relative efficacy for bipolar depression was influenced by response to placebo ( p=0.047), but not to active drugs ( p=0.98). Despite several limitations, our unexpected findings showed that antipsychotics / mood stabilisers relative efficacy for bipolar depression seems unrelated to active drugs response rates, depending only on clinical response to placebo. Future research should explore strategies to reduce placebo-related issues in randomised, placebo-controlled trials for bipolar depression.
International Nuclear Information System (INIS)
Fang, Yu-Hua; Kao, Tsair; Liu, Ren-Shyan; Wu, Liang-Chih
2004-01-01
A novel statistical method, namely Regression-Estimated Input Function (REIF), is proposed in this study for the purpose of non-invasive estimation of the input function for fluorine-18 2-fluoro-2-deoxy-d-glucose positron emission tomography (FDG-PET) quantitative analysis. We collected 44 patients who had undergone a blood sampling procedure during their FDG-PET scans. First, we generated tissue time-activity curves of the grey matter and the whole brain with a segmentation technique for every subject. Summations of different intervals of these two curves were used as a feature vector, which also included the net injection dose. Multiple linear regression analysis was then applied to find the correlation between the input function and the feature vector. After a simulation study with in vivo data, the data of 29 patients were applied to calculate the regression coefficients, which were then used to estimate the input functions of the other 15 subjects. Comparing the estimated input functions with the corresponding real input functions, the averaged error percentages of the area under the curve and the cerebral metabolic rate of glucose (CMRGlc) were 12.13±8.85 and 16.60±9.61, respectively. Regression analysis of the CMRGlc values derived from the real and estimated input functions revealed a high correlation (r=0.91). No significant difference was found between the real CMRGlc and that derived from our regression-estimated input function (Student's t test, P>0.05). The proposed REIF method demonstrated good abilities for input function and CMRGlc estimation, and represents a reliable replacement for the blood sampling procedures in FDG-PET quantification. (orig.)
International Nuclear Information System (INIS)
Díaz, Santiago; Carta, José A.; Matías, José M.
2017-01-01
Highlights: • Eight measure-correlate-predict (MCP) models used to estimate the wind power densities (WPDs) at a target site are compared. • Support vector regressions are used as the main prediction techniques in the proposed MCPs. • The most precise MCP uses two sub-models which predict wind speed and air density in an unlinked manner. • The most precise model allows to construct a bivariable (wind speed and air density) WPD probability density function. • MCP models trained to minimise wind speed prediction error do not minimise WPD prediction error. - Abstract: The long-term annual mean wind power density (WPD) is an important indicator of wind as a power source which is usually included in regional wind resource maps as useful prior information to identify potentially attractive sites for the installation of wind projects. In this paper, a comparison is made of eight proposed Measure-Correlate-Predict (MCP) models to estimate the WPDs at a target site. Seven of these models use the Support Vector Regression (SVR) and the eighth the Multiple Linear Regression (MLR) technique, which serves as a basis to compare the performance of the other models. In addition, a wrapper technique with 10-fold cross-validation has been used to select the optimal set of input features for the SVR and MLR models. Some of the eight models were trained to directly estimate the mean hourly WPDs at a target site. Others, however, were firstly trained to estimate the parameters on which the WPD depends (i.e. wind speed and air density) and then, using these parameters, the target site mean hourly WPDs. The explanatory features considered are different combinations of the mean hourly wind speeds, wind directions and air densities recorded in 2014 at ten weather stations in the Canary Archipelago (Spain). The conclusions that can be drawn from the study undertaken include the argument that the most accurate method for the long-term estimation of WPDs requires the execution of a
Rasmussen, Teresa J.; Lee, Casey J.; Ziegler, Andrew C.
2008-01-01
Johnson County is one of the most rapidly developing counties in Kansas. Population growth and expanding urban land use affect the quality of county streams, which are important for human and environmental health, water supply, recreation, and aesthetic value. This report describes estimates of streamflow and constituent concentrations, loads, and yields in relation to watershed characteristics in five Johnson County streams using continuous in-stream sensor measurements. Specific conductance, pH, water temperature, turbidity, and dissolved oxygen were monitored in five watersheds from October 2002 through December 2006. These continuous data were used in conjunction with discrete water samples to develop regression models for continuously estimating concentrations of other constituents. Continuous regression-based concentrations were estimated for suspended sediment, total suspended solids, dissolved solids and selected major ions, nutrients (nitrogen and phosphorus species), and fecal-indicator bacteria. Continuous daily, monthly, seasonal, and annual loads were calculated from concentration estimates and streamflow. The data are used to describe differences in concentrations, loads, and yields and to explain these differences relative to watershed characteristics. Water quality at the five monitoring sites varied according to hydrologic conditions; contributing drainage area; land use (including degree of urbanization); relative contributions from point and nonpoint constituent sources; and human activity within each watershed. Dissolved oxygen (DO) concentrations were less than the Kansas aquatic-life-support criterion of 5.0 mg/L less than 10 percent of the time at all sites except Indian Creek, which had DO concentrations less than the criterion about 15 percent of the time. Concentrations of suspended sediment, chloride (winter only), indicator bacteria, and pesticides were substantially larger during periods of increased streamflow. Suspended
Walker, Mary Ellen; Anonson, June; Szafron, Michael
2015-01-01
The relationship between political environment and health services accessibility (HSA) has not been the focus of any specific studies. The purpose of this study was to address this gap in the literature by examining the relationship between political environment and HSA. This relationship that HSA indicators (physicians, nurses and hospital beds per 10 000 people) has with political environment was analyzed with multiple least-squares regression using the components of democracy (electoral processes and pluralism, functioning of government, political participation, political culture, and civil liberties). The components of democracy were represented by the 2011 Economist Intelligence Unit Democracy Index (EIUDI) sub-scores. The EIUDI sub-scores and the HSA indicators were evaluated for significant relationships with multiple least-squares regression. While controlling for a country's geographic location and level of democracy, we found that two components of a nation's political environment: functioning of government and political participation, and their interaction had significant relationships with the three HSA indicators. These study findings are of significance to health professionals because they examine the political contexts in which citizens access health services, they come from research that is the first of its kind, and they help explain the effect political environment has on health. © The Author 2014. Published by Oxford University Press on behalf of Royal Society of Tropical Medicine and Hygiene. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Regression models for categorical, count, and related variables an applied approach
Hoffmann, John P
2016-01-01
Social science and behavioral science students and researchers are often confronted with data that are categorical, count a phenomenon, or have been collected over time. Sociologists examining the likelihood of interracial marriage, political scientists studying voting behavior, criminologists counting the number of offenses people commit, health scientists studying the number of suicides across neighborhoods, and psychologists modeling mental health treatment success are all interested in outcomes that are not continuous. Instead, they must measure and analyze these events and phenomena in a discrete manner. This book provides an introduction and overview of several statistical models designed for these types of outcomes--all presented with the assumption that the reader has only a good working knowledge of elementary algebra and has taken introductory statistics and linear regression analysis. Numerous examples from the social sciences demonstrate the practical applications of these models. The chapte...
Dons, Evi; Van Poppel, Martine; Kochan, Bruno; Wets, Geert; Int Panis, Luc
2013-08-01
Land use regression (LUR) modeling is a statistical technique used to determine exposure to air pollutants in epidemiological studies. Time-activity diaries can be combined with LUR models, enabling detailed exposure estimation and limiting exposure misclassification, both in shorter and longer time lags. In this study, the traffic related air pollutant black carbon was measured with μ-aethalometers on a 5-min time base at 63 locations in Flanders, Belgium. The measurements show that hourly concentrations vary between different locations, but also over the day. Furthermore the diurnal pattern is different for street and background locations. This suggests that annual LUR models are not sufficient to capture all the variation. Hourly LUR models for black carbon are developed using different strategies: by means of dummy variables, with dynamic dependent variables and/or with dynamic and static independent variables. The LUR model with 48 dummies (weekday hours and weekend hours) performs not as good as the annual model (explained variance of 0.44 compared to 0.77 in the annual model). The dataset with hourly concentrations of black carbon can be used to recalibrate the annual model, resulting in many of the original explaining variables losing their statistical significance, and certain variables having the wrong direction of effect. Building new independent hourly models, with static or dynamic covariates, is proposed as the best solution to solve these issues. R2 values for hourly LUR models are mostly smaller than the R2 of the annual model, ranging from 0.07 to 0.8. Between 6 a.m. and 10 p.m. on weekdays the R2 approximates the annual model R2. Even though models of consecutive hours are developed independently, similar variables turn out to be significant. Using dynamic covariates instead of static covariates, i.e. hourly traffic intensities and hourly population densities, did not significantly improve the models' performance.
Parametric Bayesian Estimation of Differential Entropy and Relative Entropy
Directory of Open Access Journals (Sweden)
Maya Gupta
2010-04-01
Full Text Available Given iid samples drawn from a distribution with known parametric form, we propose the minimization of expected Bregman divergence to form Bayesian estimates of differential entropy and relative entropy, and derive such estimators for the uniform, Gaussian, Wishart, and inverse Wishart distributions. Additionally, formulas are given for a log gamma Bregman divergence and the differential entropy and relative entropy for the Wishart and inverse Wishart. The results, as always with Bayesian estimates, depend on the accuracy of the prior parameters, but example simulations show that the performance can be substantially improved compared to maximum likelihood or state-of-the-art nonparametric estimators.
Brian S. Cade; Barry R. Noon; Rick D. Scherer; John J. Keane
2017-01-01
Counts of avian fledglings, nestlings, or clutch size that are bounded below by zero and above by some small integer form a discrete random variable distribution that is not approximated well by conventional parametric count distributions such as the Poisson or negative binomial. We developed a logistic quantile regression model to provide estimates of the empirical...
Berry, D.P.; Buckley, F.; Dillon, P.; Evans, R.D.; Rath, M.; Veerkamp, R.F.
2003-01-01
(Co)variance components for milk yield, body condition score (BCS), body weight (BW), BCS change and BW change over different herd-year mean milk yields (HMY) and nutritional environments (concentrate feeding level, grazing severity and silage quality) were estimated using a random regression model.
Directory of Open Access Journals (Sweden)
Aaloke Mody
2018-06-01
Full Text Available Although randomized trials have established the clinical efficacy of treating all persons living with HIV (PLWHs, expanding treatment eligibility in the real world may have additional behavioral effects (e.g., changes in retention or lead to unintended consequences (e.g., crowding out sicker patients owing to increased patient volume. Using a regression discontinuity design, we sought to assess the effects of a previous change to Zambia's HIV treatment guidelines increasing the threshold for treatment eligibility from 350 to 500 cells/μL to anticipate effects of current global efforts to treat all PLWHs.We analyzed antiretroviral therapy (ART-naïve adults who newly enrolled in HIV care in a network of 64 clinics operated by the Zambian Ministry of Health and supported by the Centre for Infectious Disease Research in Zambia (CIDRZ. Patients were restricted to those enrolling in a narrow window around the April 1, 2014 change to Zambian HIV treatment guidelines that raised the CD4 threshold for treatment from 350 to 500 cells/μL (i.e., August 1, 2013, to November 1, 2014. Clinical and sociodemographic data were obtained from an electronic medical record system used in routine care. We used a regression discontinuity design to estimate the effects of this change in treatment eligibility on ART initiation within 3 months of enrollment, retention in care at 6 months (defined as clinic attendance between 3 and 9 months after enrollment, and a composite of both ART initiation by 3 months and retention in care at 6 months in all new enrollees. We also performed an instrumental variable (IV analysis to quantify the effect of actually initiating ART because of this guideline change on retention. Overall, 34,857 ART-naïve patients (39.1% male, median age 34 years [IQR 28-41], median CD4 268 cells/μL [IQR 134-430] newly enrolled in HIV care during this period; 23,036 were analyzed after excluding patients around the threshold to allow for clinic
Masselot, Pierre; Chebana, Fateh; Bélanger, Diane; St-Hilaire, André; Abdous, Belkacem; Gosselin, Pierre; Ouarda, Taha B. M. J.
2018-01-01
In a number of environmental studies, relationships between natural processes are often assessed through regression analyses, using time series data. Such data are often multi-scale and non-stationary, leading to a poor accuracy of the resulting regression models and therefore to results with moderate reliability. To deal with this issue, the present paper introduces the EMD-regression methodology consisting in applying the empirical mode decomposition (EMD) algorithm on data series and then using the resulting components in regression models. The proposed methodology presents a number of advantages. First, it accounts of the issues of non-stationarity associated to the data series. Second, this approach acts as a scan for the relationship between a response variable and the predictors at different time scales, providing new insights about this relationship. To illustrate the proposed methodology it is applied to study the relationship between weather and cardiovascular mortality in Montreal, Canada. The results shed new knowledge concerning the studied relationship. For instance, they show that the humidity can cause excess mortality at the monthly time scale, which is a scale not visible in classical models. A comparison is also conducted with state of the art methods which are the generalized additive models and distributed lag models, both widely used in weather-related health studies. The comparison shows that EMD-regression achieves better prediction performances and provides more details than classical models concerning the relationship.
DEFF Research Database (Denmark)
Frutiger, Jerome; Marcarie, Camille; Abildskov, Jens
2016-01-01
regression and outlier treatment have been applied to achieve high accuracy. Furthermore, linear error propagation based on covariance matrix of estimated parameters was performed. Therefore, every estimated property value of the flammability-related properties is reported together with its corresponding 95......%-confidence interval of the prediction. Compared to existing models the developed ones have a higher accuracy, are simple to apply and provide uncertainty information on the calculated prediction. The average relative error and correlation coefficient are 11.5% and 0.99 for LFL, 15.9% and 0.91 for UFL, 2...
Semantic relations and compound transparency: A regression study in CARIN theory
Directory of Open Access Journals (Sweden)
Pham Hien
2013-01-01
Full Text Available According to the CARIN theory of Gagné and Shoben (1997, conceptual relations play an important role in compound interpretation. This study develops three measures gauging the role of conceptual relations, and pits these measures against measures based on latent semantic analysis (Landauer & Dumais, 1997. The CARIN measures successfully predict response latencies in a familiarity categorization task, in a semantic transparency task, and in visual lexical decision. Of the measures based on latent semantic analysis, only a measure orthogonal to the conceptual relations, which instead gauges the extent to which the concepts for the compound’s head and the compound itself are discriminated, also reached significance. Results further indicate that in tasks requiring careful assessment of the meaning of the compound, general knowledge of conceptual relations plays a central role, whereas in the lexical decision task, attention shifts to co-activated meanings and the specifics of the conceptual relations realized in the compound’s modifier family.
Directory of Open Access Journals (Sweden)
Zhi-Sai Ma
2017-01-01
Full Text Available Modal parameter estimation plays an important role in vibration-based damage detection and is worth more attention and investigation, as changes in modal parameters are usually being used as damage indicators. This paper focuses on the problem of output-only modal parameter recursive estimation of time-varying structures based upon parameterized representations of the time-dependent autoregressive moving average (TARMA. A kernel ridge regression functional series TARMA (FS-TARMA recursive identification scheme is proposed and subsequently employed for the modal parameter estimation of a numerical three-degree-of-freedom time-varying structural system and a laboratory time-varying structure consisting of a simply supported beam and a moving mass sliding on it. The proposed method is comparatively assessed against an existing recursive pseudolinear regression FS-TARMA approach via Monte Carlo experiments and shown to be capable of accurately tracking the time-varying dynamics in a recursive manner.
Fragkaki, A G; Farmaki, E; Thomaidis, N; Tsantili-Kakoulidou, A; Angelis, Y S; Koupparis, M; Georgakopoulos, C
2012-09-21
The comparison among different modelling techniques, such as multiple linear regression, partial least squares and artificial neural networks, has been performed in order to construct and evaluate models for prediction of gas chromatographic relative retention times of trimethylsilylated anabolic androgenic steroids. The performance of the quantitative structure-retention relationship study, using the multiple linear regression and partial least squares techniques, has been previously conducted. In the present study, artificial neural networks models were constructed and used for the prediction of relative retention times of anabolic androgenic steroids, while their efficiency is compared with that of the models derived from the multiple linear regression and partial least squares techniques. For overall ranking of the models, a novel procedure [Trends Anal. Chem. 29 (2010) 101-109] based on sum of ranking differences was applied, which permits the best model to be selected. The suggested models are considered useful for the estimation of relative retention times of designer steroids for which no analytical data are available. Copyright © 2012 Elsevier B.V. All rights reserved.
Fitzpatrick, Cole D; Rakasi, Saritha; Knodler, Michael A
2017-01-01
Speed is one of the most important factors in traffic safety as higher speeds are linked to increased crash risk and higher injury severities. Nearly a third of fatal crashes in the United States are designated as "speeding-related", which is defined as either "the driver behavior of exceeding the posted speed limit or driving too fast for conditions." While many studies have utilized the speeding-related designation in safety analyses, no studies have examined the underlying accuracy of this designation. Herein, we investigate the speeding-related crash designation through the development of a series of logistic regression models that were derived from the established speeding-related crash typologies and validated using a blind review, by multiple researchers, of 604 crash narratives. The developed logistic regression model accurately identified crashes which were not originally designated as speeding-related but had crash narratives that suggested speeding as a causative factor. Only 53.4% of crashes designated as speeding-related contained narratives which described speeding as a causative factor. Further investigation of these crashes revealed that the driver contributing code (DCC) of "driving too fast for conditions" was being used in three separate situations. Additionally, this DCC was also incorrectly used when "exceeding the posted speed limit" would likely have been a more appropriate designation. Finally, it was determined that the responding officer only utilized one DCC in 82% of crashes not designated as speeding-related but contained a narrative indicating speed as a contributing causal factor. The use of logistic regression models based upon speeding-related crash typologies offers a promising method by which all possible speeding-related crashes could be identified. Published by Elsevier Ltd.
International Nuclear Information System (INIS)
Briggs, D.J.; De Hoogh, C.; Elliot, P.; Gulliver, J.; Wills, J.; Kingham, S.; Smallbone, K.
2000-01-01
Accurate, high-resolution maps of traffic-related air pollution are needed both as a basis for assessing exposures as part of epidemiological studies, and to inform urban air-quality policy and traffic management. This paper assesses the use of a GIS-based, regression mapping technique to model spatial patterns of traffic-related air pollution. The model - developed using data from 80 passive sampler sites in Huddersfield, as part of the SAVIAH (Small Area Variations in Air Quality and Health) project - uses data on traffic flows and land cover in the 300-m buffer zone around each site, and altitude of the site, as predictors of NO 2 concentrations. It was tested here by application in four urban areas in the UK: Huddersfield (for the year following that used for initial model development), Sheffield, Northampton, and part of London. In each case, a GIS was built in ArcInfo, integrating relevant data on road traffic, urban land use and topography. Monitoring of NO 2 was undertaken using replicate passive samplers (in London, data were obtained from surveys carried out as part of the London network). In Huddersfield, Sheffield and Northampton, the model was first calibrated by comparing modelled results with monitored NO 2 concentrations at 10 randomly selected sites; the calibrated model was then validated against data from a further 10-28 sites. In London, where data for only 11 sites were available, validation was not undertaken. Results showed that the model performed well in all cases. After local calibration, the model gave estimates of mean annual NO 2 concentrations within a factor of 1.5 of the actual mean (approx. 70-90%) of the time and within a factor of 2 between 70 and 100% of the time. r 2 values between modelled and observed concentrations are in the range of 0.58-0.76. These results are comparable to those achieved by more sophisticated dispersion models. The model also has several advantages over dispersion modelling. It is able, for example, to
Parametric Bayesian Estimation of Differential Entropy and Relative Entropy
Gupta; Srivastava
2010-01-01
Given iid samples drawn from a distribution with known parametric form, we propose the minimization of expected Bregman divergence to form Bayesian estimates of differential entropy and relative entropy, and derive such estimators for the uniform, Gaussian, Wishart, and inverse Wishart distributions. Additionally, formulas are given for a log gamma Bregman divergence and the differential entropy and relative entropy for the Wishart and inverse Wishart. The results, as always with Bayesian est...
Najera-Zuloaga, Josu; Lee, Dae-Jin; Arostegui, Inmaculada
2017-01-01
Health-related quality of life has become an increasingly important indicator of health status in clinical trials and epidemiological research. Moreover, the study of the relationship of health-related quality of life with patients and disease characteristics has become one of the primary aims of many health-related quality of life studies. Health-related quality of life scores are usually assumed to be distributed as binomial random variables and often highly skewed. The use of the beta-binomial distribution in the regression context has been proposed to model such data; however, the beta-binomial regression has been performed by means of two different approaches in the literature: (i) beta-binomial distribution with a logistic link; and (ii) hierarchical generalized linear models. None of the existing literature in the analysis of health-related quality of life survey data has performed a comparison of both approaches in terms of adequacy and regression parameter interpretation context. This paper is motivated by the analysis of a real data application of health-related quality of life outcomes in patients with Chronic Obstructive Pulmonary Disease, where the use of both approaches yields to contradictory results in terms of covariate effects significance and consequently the interpretation of the most relevant factors in health-related quality of life. We present an explanation of the results in both methodologies through a simulation study and address the need to apply the proper approach in the analysis of health-related quality of life survey data for practitioners, providing an R package.
Bolarinwa, O A; Adeola, O
2012-12-01
Digestible and metabolizable energy contents of feed ingredients for pigs can be determined by direct or indirect methods. There are situations when only the indirect approach is suitable and the regression method is a robust indirect approach. This study was conducted to compare the direct and regression methods for determining the energy value of wheat for pigs. Twenty-four barrows with an average initial BW of 31 kg were assigned to 4 diets in a randomized complete block design. The 4 diets consisted of 969 g wheat/kg plus minerals and vitamins (sole wheat) for the direct method, corn (Zea mays)-soybean (Glycine max) meal reference diet (RD), RD + 300 g wheat/kg, and RD + 600 g wheat/kg. The 3 corn-soybean meal diets were used for the regression method and wheat replaced the energy-yielding ingredients, corn and soybean meal, so that the same ratio of corn and soybean meal across the experimental diets was maintained. The wheat used was analyzed to contain 883 g DM, 15.2 g N, and 3.94 Mcal GE/kg. Each diet was fed to 6 barrows in individual metabolism crates for a 5-d acclimation followed by a 5-d total but separate collection of feces and urine. The DE and ME for the sole wheat diet were 3.83 and 3.77 Mcal/kg DM, respectively. Because the sole wheat diet contained 969 g wheat/kg, these translate to 3.95 Mcal DE/kg DM and 3.89 Mcal ME/kg DM. The RD used for the regression approach yielded 4.00 Mcal DE and 3.91 Mcal ME/kg DM diet. Increasing levels of wheat in the RD linearly reduced (P direct method (3.95 and 3.89 Mcal/kg DM) did not differ (0.78 < P < 0.89) from those obtained using the regression method (3.96 and 3.88 Mcal/kg DM).
DEFF Research Database (Denmark)
Scott, Neil W.; Fayers, Peter M.; Aaronson, Neil K.
2010-01-01
Differential item functioning (DIF) methods can be used to determine whether different subgroups respond differently to particular items within a health-related quality of life (HRQoL) subscale, after allowing for overall subgroup differences in that scale. This article reviews issues that arise...
Blind estimation of a ship's relative wave heading
DEFF Research Database (Denmark)
Nielsen, Ulrik Dam; Iseki, Toshio
2012-01-01
This article proposes a method to estimate a ship’s relative heading against the waves. The procedure relies purely on ship- board measurements of global responses such as motion components, accelerations and the bending moment amidships. There is no particular (mathematical) model connected to t...... to the estimate, and therefore it is called a ’blind estimate’. The approach is in this introductory study tested by analysing simulated data. The analysis reveals that it is possible to estimate a ship’s relative heading on the basis of shipboard measurements only....
Oden, Timothy D.; Asquith, William H.; Milburn, Matthew S.
2009-01-01
In December 2005, the U.S. Geological Survey in cooperation with the City of Houston, Texas, began collecting discrete water-quality samples for nutrients, total organic carbon, bacteria (total coliform and Escherichia coli), atrazine, and suspended sediment at two U.S. Geological Survey streamflow-gaging stations upstream from Lake Houston near Houston (08068500 Spring Creek near Spring, Texas, and 08070200 East Fork San Jacinto River near New Caney, Texas). The data from the discrete water-quality samples collected during 2005-07, in conjunction with monitored real-time data already being collected - physical properties (specific conductance, pH, water temperature, turbidity, and dissolved oxygen), streamflow, and rainfall - were used to develop regression models for predicting water-quality constituent concentrations for inflows to Lake Houston. Rainfall data were obtained from a rain gage monitored by Harris County Homeland Security and Emergency Management and colocated with the Spring Creek station. The leaps and bounds algorithm was used to find the best subsets of possible regression models (minimum residual sum of squares for a given number of variables). The potential explanatory or predictive variables included discharge (streamflow), specific conductance, pH, water temperature, turbidity, dissolved oxygen, rainfall, and time (to account for seasonal variations inherent in some water-quality data). The response variables at each site were nitrite plus nitrate nitrogen, total phosphorus, organic carbon, Escherichia coli, atrazine, and suspended sediment. The explanatory variables provide easily measured quantities as a means to estimate concentrations of the various constituents under investigation, with accompanying estimates of measurement uncertainty. Each regression equation can be used to estimate concentrations of a given constituent in real time. In conjunction with estimated concentrations, constituent loads were estimated by multiplying the
Arjmand, N; Ekrami, O; Shirazi-Adl, A; Plamondon, A; Parnianpour, M
2013-05-31
Two artificial neural networks (ANNs) are constructed, trained, and tested to map inputs of a complex trunk finite element (FE) model to its outputs for spinal loads and muscle forces. Five input variables (thorax flexion angle, load magnitude, its anterior and lateral positions, load handling technique, i.e., one- or two-handed static lifting) and four model outputs (L4-L5 and L5-S1 disc compression and anterior-posterior shear forces) for spinal loads and 76 model outputs (forces in individual trunk muscles) are considered. Moreover, full quadratic regression equations mapping input-outputs of the model developed here for muscle forces and previously for spine loads are used to compare the relative accuracy of these two mapping tools (ANN and regression equations). Results indicate that the ANNs are more accurate in mapping input-output relationships of the FE model (RMSE= 20.7 N for spinal loads and RMSE= 4.7 N for muscle forces) as compared to regression equations (RMSE= 120.4 N for spinal loads and RMSE=43.2 N for muscle forces). Quadratic regression equations map up to second order variations of outputs with inputs while ANNs capture higher order variations too. Despite satisfactory achievement in estimating overall muscle forces by the ANN, some inadequacies are noted including assigning force to antagonistic muscles with no activity in the optimization algorithm of the FE model or predicting slightly different forces in bilateral pair muscles in symmetric lifting activities. Using these user-friendly tools spine loads and trunk muscle forces during symmetric and asymmetric static lifts can be easily estimated. Copyright © 2013 Elsevier Ltd. All rights reserved.
Ali, Emad Abdulgabbar; Zhandi, Mahdi; Towhidi, Armin; Zaghari, Mojtaba; Ansari, Mahdi; Najafi, Mojtaba; Deldar, Hamid
2017-08-01
This study was designed to evaluate orally administrated Letrozole (Lz) on reproductive performance, plasma testosterone and estradiol concentrations and relative abundance of mRNA of GnRH, FSH and LH in roosters. Ross 308 roosters (n=32) that were 40-weeks of age were individually housed and received a basal standard diet supplemented different amounts of capsulated Lz [0 (Lz-0), 0.5 (Lz-0.5), 1 (Lz-1) or 1.5 (Lz-1.5), mg Lz/bird/day] for 12 weeks. Sperm quality variables and plasma testosterone and estradiol concentrations were assessed from the first to the tenth week of the treatment period. Semen samples from the 11th to 12th week were used for artificial insemination and eggs were collected and allotted to assess fertility and hatchability rates. Relative abundance of hypothalamic and pituitary GnRH, LH and FSH mRNA was evaluated at the end of 12th week. The results indicated that total and forward sperm motility as well as egg hatchability rate were greater in the Lz-0.5 group. Greater sperm concentrations, ejaculate volume, sperm plasma membrane integrity, testis index and fertility rates were recorded for both Lz-0.5 and Lz-1 groups compared with the Lz-0 group (Proosters. Copyright © 2017 Elsevier B.V. All rights reserved.
Support vector regression model for the estimation of γ-ray buildup factors for multi-layer shields
International Nuclear Information System (INIS)
Trontl, Kresimir; Smuc, Tomislav; Pevec, Dubravko
2007-01-01
The accuracy of the point-kernel method, which is a widely used practical tool for γ-ray shielding calculations, strongly depends on the quality and accuracy of buildup factors used in the calculations. Although, buildup factors for single-layer shields comprised of a single material are well known, calculation of buildup factors for stratified shields, each layer comprised of different material or a combination of materials, represent a complex physical problem. Recently, a new compact mathematical model for multi-layer shield buildup factor representation has been suggested for embedding into point-kernel codes thus replacing traditionally generated complex mathematical expressions. The new regression model is based on support vector machines learning technique, which is an extension of Statistical Learning Theory. The paper gives complete description of the novel methodology with results pertaining to realistic engineering multi-layer shielding geometries. The results based on support vector regression machine learning confirm that this approach provides a framework for general, accurate and computationally acceptable multi-layer buildup factor model
Shinozuka, Jun; Awaguni, Hitoshi; Tanaka, Shin-Ichiro; Makino, Shigeru; Maruyama, Rikken; Inaba, Tohru; Imashuku, Shinsaku
2016-07-01
Pulmonary nodules associated with Epstein-Barr virus (EBV)-related atypical infectious mononucleosis have rarely been described. A 12-year-old Japanese boy, upon admission, revealed multiple small round nodules (a total of 7 nodules in 4 to 8 mm size) in the lungs on computed tomography. The hemorrhagic pharyngeal tonsils with hot signals on 18F-fluorodeoxyglucose-positron emission tomography-computed tomography were biopsied revealing the presence of EBV-encoded small nuclear RNA (EBER)-positive cells; however, no lymphoma was noted. The patient was diagnosed as having atypical EBV-infectious mononucleosis associated with primary EBV infection. Pulmonary nodules markedly reduced in numbers and sizes spontaneously over a 2-year period. Differential diagnosis of pulmonary nodules in childhood should include atypical EBV infection.
Baker, Ronald J.; Chepiga, Mary M.; Cauller, Stephen J.
2015-01-01
Nitrate-concentration data are used in conjunction with land-use and land-cover data to estimate median nitrate concentrations in groundwater underlying the New Jersey (NJ) Highlands Region. Sources of data on nitrate in 19,670 groundwater samples are from the U.S. Geological Survey (USGS) National Water Information System (NWIS) and the NJ Private Well Testing Act (PWTA).
State-Level Estimates of Cancer-Related Absenteeism Costs
Tangka, Florence K.; Trogdon, Justin G.; Nwaise, Isaac; Ekwueme, Donatus U.; Guy, Gery P.; Orenstein, Diane
2016-01-01
Background Cancer is one of the top five most costly diseases in the United States and leads to substantial work loss. Nevertheless, limited state-level estimates of cancer absenteeism costs have been published. Methods In analyses of data from the 2004–2008 Medical Expenditure Panel Survey, the 2004 National Nursing Home Survey, the U.S. Census Bureau for 2008, and the 2009 Current Population Survey, we used regression modeling to estimate annual state-level absenteeism costs attributable to cancer from 2004 to 2008. Results We estimated that the state-level median number of days of absenteeism per year among employed cancer patients was 6.1 days and that annual state-level cancer absenteeism costs ranged from $14.9 million to $915.9 million (median = $115.9 million) across states in 2010 dollars. Absenteeism costs are approximately 6.5% of the costs of premature cancer mortality. Conclusions The results from this study suggest that lost productivity attributable to cancer is a substantial cost to employees and employers and contributes to estimates of the overall impact of cancer in a state population. PMID:23969498
Nazeer, Majid; Bilal, Muhammad
2018-04-01
Landsat-5 Thematic Mapper (TM) dataset have been used to estimate salinity in the coastal area of Hong Kong. Four adjacent Landsat TM images were used in this study, which was atmospherically corrected using the Second Simulation of the Satellite Signal in the Solar Spectrum (6S) radiative transfer code. The atmospherically corrected images were further used to develop models for salinity using Ordinary Least Square (OLS) regression and Geographically Weighted Regression (GWR) based on in situ data of October 2009. Results show that the coefficient of determination ( R 2) of 0.42 between the OLS estimated and in situ measured salinity is much lower than that of the GWR model, which is two times higher ( R 2 = 0.86). It indicates that the GWR model has more ability than the OLS regression model to predict salinity and show its spatial heterogeneity better. It was observed that the salinity was high in Deep Bay (north-western part of Hong Kong) which might be due to the industrial waste disposal, whereas the salinity was estimated to be constant (32 practical salinity units) towards the open sea.
Tang, Yongqiang
2015-01-01
A sample size formula is derived for negative binomial regression for the analysis of recurrent events, in which subjects can have unequal follow-up time. We obtain sharp lower and upper bounds on the required size, which is easy to compute. The upper bound is generally only slightly larger than the required size, and hence can be used to approximate the sample size. The lower and upper size bounds can be decomposed into two terms. The first term relies on the mean number of events in each group, and the second term depends on two factors that measure, respectively, the extent of between-subject variability in event rates, and follow-up time. Simulation studies are conducted to assess the performance of the proposed method. An application of our formulae to a multiple sclerosis trial is provided.
NeCamp, Timothy; Kilbourne, Amy; Almirall, Daniel
2017-08-01
Cluster-level dynamic treatment regimens can be used to guide sequential treatment decision-making at the cluster level in order to improve outcomes at the individual or patient-level. In a cluster-level dynamic treatment regimen, the treatment is potentially adapted and re-adapted over time based on changes in the cluster that could be impacted by prior intervention, including aggregate measures of the individuals or patients that compose it. Cluster-randomized sequential multiple assignment randomized trials can be used to answer multiple open questions preventing scientists from developing high-quality cluster-level dynamic treatment regimens. In a cluster-randomized sequential multiple assignment randomized trial, sequential randomizations occur at the cluster level and outcomes are observed at the individual level. This manuscript makes two contributions to the design and analysis of cluster-randomized sequential multiple assignment randomized trials. First, a weighted least squares regression approach is proposed for comparing the mean of a patient-level outcome between the cluster-level dynamic treatment regimens embedded in a sequential multiple assignment randomized trial. The regression approach facilitates the use of baseline covariates which is often critical in the analysis of cluster-level trials. Second, sample size calculators are derived for two common cluster-randomized sequential multiple assignment randomized trial designs for use when the primary aim is a between-dynamic treatment regimen comparison of the mean of a continuous patient-level outcome. The methods are motivated by the Adaptive Implementation of Effective Programs Trial which is, to our knowledge, the first-ever cluster-randomized sequential multiple assignment randomized trial in psychiatry.
Estimating maneuvers for precise relative orbit determination using GPS
Allende-Alba, Gerardo; Montenbruck, Oliver; Ardaens, Jean-Sébastien; Wermuth, Martin; Hugentobler, Urs
2017-01-01
Precise relative orbit determination is an essential element for the generation of science products from distributed instrumentation of formation flying satellites in low Earth orbit. According to the mission profile, the required formation is typically maintained and/or controlled by executing maneuvers. In order to generate consistent and precise orbit products, a strategy for maneuver handling is mandatory in order to avoid discontinuities or precision degradation before, after and during maneuver execution. Precise orbit determination offers the possibility of maneuver estimation in an adjustment of single-satellite trajectories using GPS measurements. However, a consistent formulation of a precise relative orbit determination scheme requires the implementation of a maneuver estimation strategy which can be used, in addition, to improve the precision of maneuver estimates by drawing upon the use of differential GPS measurements. The present study introduces a method for precise relative orbit determination based on a reduced-dynamic batch processing of differential GPS pseudorange and carrier phase measurements, which includes maneuver estimation as part of the relative orbit adjustment. The proposed method has been validated using flight data from space missions with different rates of maneuvering activity, including the GRACE, TanDEM-X and PRISMA missions. The results show the feasibility of obtaining precise relative orbits without degradation in the vicinity of maneuvers as well as improved maneuver estimates that can be used for better maneuver planning in flight dynamics operations.
Estimating one's own and one's relatives' multiple intelligence: a study from Argentina.
Furnham, Adrian; Chamorro-Premuzic, Tomas
2005-05-01
Participants from Argentina (N = 217) estimated their own, their partner's, their parents' and their grandparents' overall and multiple intelligences. The Argentinean data showed that men gave higher overall estimates than women (M = 110.4 vs. 105.1) as well as higher estimates on mathematical and spatial intelligence. Participants thought themselves slightly less bright than their fathers (2 IQ points) but brighter than their mothers (6 points), their grandfathers (8 points), but especially their grandmothers (11 points). Regressions showed that participants thought verbal and mathematical IQ to be the best predictors of overall IQ. Results were broadly in agreement with other studies in the area. A comparison was also made with British data using the same questionnaire. British participants tended to give significantly higher self-estimates than for relatives, though the pattern was generally similar. Results are discussed in terms of the studies in the field.
Directory of Open Access Journals (Sweden)
Yaxiong Ma
2018-03-01
Full Text Available Housing is a key component of urban sustainability. The objective of this study was to assess the significance of key spatial determinants of median home price in towns in Massachusetts that impact sustainable growth. Our analysis investigates the presence or absence of spatial non-stationarity in the relationship between sustainable growth, measured in terms of the relationship between home values and various parameters including the amount of unprotected forest land, residential land, unemployment, education, vehicle ownership, accessibility to commuter rail stations, school district performance, and senior population. We use the standard geographically weighted regression (GWR and Mixed GWR models to analyze the effects of spatial non-stationarity. Mixed GWR performed better than GWR in terms of Akaike Information Criterion (AIC values. Our findings highlight the nature and spatial extent of the non-stationary vs. stationary qualities of key environmental and social determinants of median home price. Understanding the key determinants of housing values, such as valuation of green spaces, public school performance metrics, and proximity to public transport, enable towns to use different strategies of sustainable urban planning, while understanding urban housing determinants—such as unemployment and senior population—can help modify urban sustainable housing policies.
Anand, Jasdeep S.; Monks, Paul S.
2017-07-01
Land use regression (LUR) models have been used in epidemiology to determine the fine-scale spatial variation in air pollutants such as nitrogen dioxide (NO2) in cities and larger regions. However, they are often limited in their temporal resolution, which may potentially be rectified by employing the synoptic coverage provided by satellite measurements. In this work a mixed-effects LUR model is developed to model daily surface NO2 concentrations over the Hong Kong SAR during the period 2005-2015. In situ measurements from the Hong Kong Air Quality Monitoring Network, along with tropospheric vertical column density (VCD) data from the OMI, GOME-2A, and SCIAMACHY satellite instruments were combined with fine-scale land use parameters to provide the spatiotemporal information necessary to predict daily surface concentrations. Cross-validation with the in situ data shows that the mixed-effects LUR model using OMI data has a high predictive power (adj. R2 = 0. 84), especially when compared with surface concentrations derived using the MACC-II reanalysis model dataset (adj. R2 = 0. 11). Time series analysis shows no statistically significant trend in NO2 concentrations during 2005-2015, despite a reported decline in NOx emissions. This study demonstrates the utility in combining satellite data with LUR models to derive daily maps of ambient surface NO2 for use in exposure studies.
Estimating relative demand for wildlife: Conservation activity indicators
Gray, Gary G.; Larson, Joseph S.
1982-09-01
An alternative method of estimating relative demand among nonconsumptive uses of wildlife and among wildlife species is proposed. A demand intensity score (DIS), derived from the relative extent of an individual's involvement in outdoor recreation and conservation activities, is used as a weighting device to adjust the importance of preference rankings for wildlife uses and wildlife species relative to other members of a survey population. These adjusted preference rankings were considered to reflect relative demand levels (RDLs) for wildlife uses and for species by the survey population. This technique may be useful where it is not possible or desirable to estimate demand using traditional economic means. In one of the findings from a survey of municipal conservation commission members in Massachusetts, presented as an illustration of this methodology, poisonous snakes were ranked third in preference among five groups of reptiles. The relative demand level for poisonous snakes, however, was last among the five groups.
Lee, Eugenia E; Stewart, Barclay; Zha, Yuanting A; Groen, Thomas A; Burkle, Frederick M; Kushner, Adam L
2016-08-10
Climate extremes will increase the frequency and severity of natural disasters worldwide. Climate-related natural disasters were anticipated to affect 375 million people in 2015, more than 50% greater than the yearly average in the previous decade. To inform surgical assistance preparedness, we estimated the number of surgical procedures needed. The numbers of people affected by climate-related disasters from 2004 to 2014 were obtained from the Centre for Research of the Epidemiology of Disasters database. Using 5,000 procedures per 100,000 persons as the minimum, baseline estimates were calculated. A linear regression of the number of surgical procedures performed annually and the estimated number of surgical procedures required for climate-related natural disasters was performed. Approximately 140 million people were affected by climate-related natural disasters annually requiring 7.0 million surgical procedures. The greatest need for surgical care was in the People's Republic of China, India, and the Philippines. Linear regression demonstrated a poor relationship between national surgical capacity and estimated need for surgical care resulting from natural disaster, but countries with the least surgical capacity will have the greatest need for surgical care for persons affected by climate-related natural disasters. As climate extremes increase the frequency and severity of natural disasters, millions will need surgical care beyond baseline needs. Countries with insufficient surgical capacity will have the most need for surgical care for persons affected by climate-related natural disasters. Estimates of surgical are particularly important for countries least equipped to meet surgical care demands given critical human and physical resource deficiencies.
Bolarinwa, O A; Adeola, O
2016-02-01
Direct or indirect methods can be used to determine the DE and ME of feed ingredients for pigs. In situations when only the indirect approach is suitable, the regression method presents a robust indirect approach. Three experiments were conducted to compare the direct and regression methods for determining the DE and ME values of barley, sorghum, and wheat for pigs. In each experiment, 24 barrows with an average initial BW of 31, 32, and 33 kg were assigned to 4 diets in a randomized complete block design. The 4 diets consisted of 969 g barley, sorghum, or wheat/kg plus minerals and vitamins for the direct method; a corn-soybean meal reference diet (RD); the RD + 300 g barley, sorghum, or wheat/kg; and the RD + 600 g barley, sorghum, or wheat/kg. The 3 corn-soybean meal diets were used for the regression method. Each diet was fed to 6 barrows in individual metabolism crates for a 5-d acclimation followed by a 5-d period of total but separate collection of feces and urine in each experiment. Graded substitution of barley or wheat, but not sorghum, into the RD linearly reduced ( direct method-derived DE and ME for barley were 3,669 and 3,593 kcal/kg DM, respectively. The regressions of barley contribution to DE and ME in kilocalories against the quantity of barley DMI in kilograms generated 3,746 kcal DE/kg DM and 3,647 kcal ME/kg DM. The DE and ME for sorghum by the direct method were 4,097 and 4,042 kcal/kg DM, respectively; the corresponding regression-derived estimates were 4,145 and 4,066 kcal/kg DM. Using the direct method, energy values for wheat were 3,953 kcal DE/kg DM and 3,889 kcal ME/kg DM. The regressions of wheat contribution to DE and ME in kilocalories against the quantity of wheat DMI in kilograms generated 3,960 kcal DE/kg DM and 3,874 kcal ME/kg DM. The DE and ME of barley using the direct method were not different (0.3 direct method-derived DE and ME of sorghum were not different (0.5 direct method- and regression method-derived DE (3,953 and 3
Relative azimuth inversion by way of damped maximum correlation estimates
Ringler, A.T.; Edwards, J.D.; Hutt, C.R.; Shelly, F.
2012-01-01
Horizontal seismic data are utilized in a large number of Earth studies. Such work depends on the published orientations of the sensitive axes of seismic sensors relative to true North. These orientations can be estimated using a number of different techniques: SensOrLoc (Sensitivity, Orientation and Location), comparison to synthetics (Ekstrom and Busby, 2008), or by way of magnetic compass. Current methods for finding relative station azimuths are unable to do so with arbitrary precision quickly because of limitations in the algorithms (e.g. grid search methods). Furthermore, in order to determine instrument orientations during station visits, it is critical that any analysis software be easily run on a large number of different computer platforms and the results be obtained quickly while on site. We developed a new technique for estimating relative sensor azimuths by inverting for the orientation with the maximum correlation to a reference instrument, using a non-linear parameter estimation routine. By making use of overlapping windows, we are able to make multiple azimuth estimates, which helps to identify the confidence of our azimuth estimate, even when the signal-to-noise ratio (SNR) is low. Finally, our algorithm has been written as a stand-alone, platform independent, Java software package with a graphical user interface for reading and selecting data segments to be analyzed.
The Hurst Phenomenon in Error Estimates Related to Atmospheric Turbulence
Dias, Nelson Luís; Crivellaro, Bianca Luhm; Chamecki, Marcelo
2018-05-01
The Hurst phenomenon is a well-known feature of long-range persistence first observed in hydrological and geophysical time series by E. Hurst in the 1950s. It has also been found in several cases in turbulence time series measured in the wind tunnel, the atmosphere, and in rivers. Here, we conduct a systematic investigation of the value of the Hurst coefficient H in atmospheric surface-layer data, and its impact on the estimation of random errors. We show that usually H > 0.5 , which implies the non-existence (in the statistical sense) of the integral time scale. Since the integral time scale is present in the Lumley-Panofsky equation for the estimation of random errors, this has important practical consequences. We estimated H in two principal ways: (1) with an extension of the recently proposed filtering method to estimate the random error (H_p ), and (2) with the classical rescaled range introduced by Hurst (H_R ). Other estimators were tried but were found less able to capture the statistical behaviour of the large scales of turbulence. Using data from three micrometeorological campaigns we found that both first- and second-order turbulence statistics display the Hurst phenomenon. Usually, H_R is larger than H_p for the same dataset, raising the question that one, or even both, of these estimators, may be biased. For the relative error, we found that the errors estimated with the approach adopted by us, that we call the relaxed filtering method, and that takes into account the occurrence of the Hurst phenomenon, are larger than both the filtering method and the classical Lumley-Panofsky estimates. Finally, we found that there is no apparent relationship between H and the Obukhov stability parameter. The relative errors, however, do show stability dependence, particularly in the case of the error of the kinematic momentum flux in unstable conditions, and that of the kinematic sensible heat flux in stable conditions.
Abdulredha, Muhammad; Al Khaddar, Rafid; Jordan, David; Kot, Patryk; Abdulridha, Ali; Hashim, Khalid
2018-04-26
Major-religious festivals hosted in the city of Kerbala, Iraq, annually generate large quantities of Municipal Solid Waste (MSW) which negatively impacts the environment and human health when poorly managed. The hospitality sector, specifically hotels, is one of the major sources of MSW generated during these festivals. Because it is essential to establish a proper waste management system for such festivals, accurate information regarding MSW generation is required. This study therefore investigated the rate of production of MSW from hotels in Kerbala during major festivals. A field questionnaire survey was conducted with 150 hotels during the Arba'een festival, one of the largest festivals in the world, attended by about 18 million participants, to identify how much MSW is produced and what features of hotels impact on this. Hotel managers responded to questions regarding features of the hotel such as size (Hs), expenditure (Hex), area (Ha) and number of staff (Hst). An on-site audit was also carried out with all participating hotels to estimate the mass of MSW generated from these hotels. The results indicate that MSW produced by hotels varies widely. In general, it was found that each hotel guest produces an estimated 0.89 kg of MSW per day. However, this figure varies according to the hotels' rating. Average rates of MSW production from one and four star hotels were 0.83 and 1.22 kg per guest per day, respectively. Statistically, it was found that the relationship between MSW production and hotel features can be modelled with an R 2 of 0.799, where the influence of hotel feature on MSW production followed the order Hs > Hex > Hst. Copyright © 2018 Elsevier Ltd. All rights reserved.
Estimating Body Related Soft Biometric Traits in Video Frames
Directory of Open Access Journals (Sweden)
Olasimbo Ayodeji Arigbabu
2014-01-01
Full Text Available Soft biometrics can be used as a prescreening filter, either by using single trait or by combining several traits to aid the performance of recognition systems in an unobtrusive way. In many practical visual surveillance scenarios, facial information becomes difficult to be effectively constructed due to several varying challenges. However, from distance the visual appearance of an object can be efficiently inferred, thereby providing the possibility of estimating body related information. This paper presents an approach for estimating body related soft biometrics; specifically we propose a new approach based on body measurement and artificial neural network for predicting body weight of subjects and incorporate the existing technique on single view metrology for height estimation in videos with low frame rate. Our evaluation on 1120 frame sets of 80 subjects from a newly compiled dataset shows that the mentioned soft biometric information of human subjects can be adequately predicted from set of frames.
Galloway, Joel M.
2014-01-01
The Red River of the North (hereafter referred to as “Red River”) Basin is an important hydrologic region where water is a valuable resource for the region’s economy. Continuous water-quality monitors have been operated by the U.S. Geological Survey, in cooperation with the North Dakota Department of Health, Minnesota Pollution Control Agency, City of Fargo, City of Moorhead, City of Grand Forks, and City of East Grand Forks at the Red River at Fargo, North Dakota, from 2003 through 2012 and at Grand Forks, N.Dak., from 2007 through 2012. The purpose of the monitoring was to provide a better understanding of the water-quality dynamics of the Red River and provide a way to track changes in water quality. Regression equations were developed that can be used to estimate concentrations and loads for dissolved solids, sulfate, chloride, nitrate plus nitrite, total phosphorus, and suspended sediment using explanatory variables such as streamflow, specific conductance, and turbidity. Specific conductance was determined to be a significant explanatory variable for estimating dissolved solids concentrations at the Red River at Fargo and Grand Forks. The regression equations provided good relations between dissolved solid concentrations and specific conductance for the Red River at Fargo and at Grand Forks, with adjusted coefficients of determination of 0.99 and 0.98, respectively. Specific conductance, log-transformed streamflow, and a seasonal component were statistically significant explanatory variables for estimating sulfate in the Red River at Fargo and Grand Forks. Regression equations provided good relations between sulfate concentrations and the explanatory variables, with adjusted coefficients of determination of 0.94 and 0.89, respectively. For the Red River at Fargo and Grand Forks, specific conductance, streamflow, and a seasonal component were statistically significant explanatory variables for estimating chloride. For the Red River at Grand Forks, a time
Directory of Open Access Journals (Sweden)
Ajay Singh
2016-06-01
Full Text Available A single trait linear mixed random regression test-day model was applied for the first time for analyzing the first lactation monthly test-day milk yield records in Karan Fries cattle. The test-day milk yield data was modeled using a random regression model (RRM considering different order of Legendre polynomial for the additive genetic effect (4th order and the permanent environmental effect (5th order. Data pertaining to 1,583 lactation records spread over a period of 30 years were recorded and analyzed in the study. The variance component, heritability and genetic correlations among test-day milk yields were estimated using RRM. RRM heritability estimates of test-day milk yield varied from 0.11 to 0.22 in different test-day records. The estimates of genetic correlations between different test-day milk yields ranged 0.01 (test-day 1 [TD-1] and TD-11 to 0.99 (TD-4 and TD-5. The magnitudes of genetic correlations between test-day milk yields decreased as the interval between test-days increased and adjacent test-day had higher correlations. Additive genetic and permanent environment variances were higher for test-day milk yields at both ends of lactation. The residual variance was observed to be lower than the permanent environment variance for all the test-day milk yields.
DEFF Research Database (Denmark)
Bache, Stefan Holst
A new and alternative quantile regression estimator is developed and it is shown that the estimator is root n-consistent and asymptotically normal. The estimator is based on a minimax ‘deviance function’ and has asymptotically equivalent properties to the usual quantile regression estimator. It is......, however, a different and therefore new estimator. It allows for both linear- and nonlinear model specifications. A simple algorithm for computing the estimates is proposed. It seems to work quite well in practice but whether it has theoretical justification is still an open question....
Steganalysis using logistic regression
Lubenko, Ivans; Ker, Andrew D.
2011-02-01
We advocate Logistic Regression (LR) as an alternative to the Support Vector Machine (SVM) classifiers commonly used in steganalysis. LR offers more information than traditional SVM methods - it estimates class probabilities as well as providing a simple classification - and can be adapted more easily and efficiently for multiclass problems. Like SVM, LR can be kernelised for nonlinear classification, and it shows comparable classification accuracy to SVM methods. This work is a case study, comparing accuracy and speed of SVM and LR classifiers in detection of LSB Matching and other related spatial-domain image steganography, through the state-of-art 686-dimensional SPAM feature set, in three image sets.
Shastri, Niket; Pathak, Kamlesh
2018-05-01
The water vapor content in atmosphere plays very important role in climate. In this paper the application of GPS signal in meteorology is discussed, which is useful technique that is used to estimate the perceptible water vapor of atmosphere. In this paper various algorithms like artificial neural network, support vector machine and multiple linear regression are use to predict perceptible water vapor. The comparative studies in terms of root mean square error and mean absolute errors are also carried out for all the algorithms.
Barunik, Jozef; Barunikova, Michaela
2015-01-01
This paper revisits the fractional co-integrating relationship between ex-ante implied volatility and ex-post realized volatility. Previous studies on stock index options have found biases and inefficiencies in implied volatility as a forecast of future volatility. It is argued that the concept of corridor implied volatility (CIV) should be used instead of the popular model-free option-implied volatility (MFIV) when assessing the relation as the latter may introduce bias to the estimation. In...
Estimation of nuclear power-related expenditures in fiscal 1982
International Nuclear Information System (INIS)
1981-01-01
In fiscal 1982 (April to March, 1983), the research and development on nuclear power should be promoted actively and extensively by taking the appropriate measures. In view of the importance, the budgetary expenditures are to be estimated duly for the purpose, considering also the stringent financial situation. The budgetary expenditures for nuclear power estimated for the fiscal year 1982 are about 292,800 Million in total and the obligation act limit is about 139,900 Million. The following matters are described: nuclear power-related measures for securing nuclear power safety, promotion of nuclear power generation, establishment of the nuclear fuel cycle, development of power reactors, research on nuclear fusion, strengthening of the foundation in nuclear power research, development and utilization, promotion of international cooperation, etc.; estimated budgetary expenditures; tables of budgetary demands in various categories. (J.P.N.)
Ichii, Kazuhito; Ueyama, Masahito; Kondo, Masayuki; Saigusa, Nobuko; Kim, Joon; Alberto, Ma. Carmelita; Ardö, Jonas; Euskirchen, Eugénie S.; Kang, Minseok; Hirano, Takashi; Joiner, Joanna; Kobayashi, Hideki; Marchesini, Luca Belelli; Merbold, Lutz; Miyata, Akira; Saitoh, Taku M.; Takagi, Kentaro; Varlagin, Andrej; Bret-Harte, M. Syndonia; Kitamura, Kenzo; Kosugi, Yoshiko; Kotani, Ayumi; Kumar, Kireet; Li, Sheng-Gong; Machimura, Takashi; Matsuura, Yojiro; Mizoguchi, Yasuko; Ohta, Takeshi; Mukherjee, Sandipan; Yanagi, Yuji; Yasuda, Yukio; Zhang, Yiping; Zhao, Fenghua
2017-04-01
The lack of a standardized database of eddy covariance observations has been an obstacle for data-driven estimation of terrestrial CO2 fluxes in Asia. In this study, we developed such a standardized database using 54 sites from various databases by applying consistent postprocessing for data-driven estimation of gross primary productivity (GPP) and net ecosystem CO2 exchange (NEE). Data-driven estimation was conducted by using a machine learning algorithm: support vector regression (SVR), with remote sensing data for 2000 to 2015 period. Site-level evaluation of the estimated CO2 fluxes shows that although performance varies in different vegetation and climate classifications, GPP and NEE at 8 days are reproduced (e.g., r2 = 0.73 and 0.42 for 8 day GPP and NEE). Evaluation of spatially estimated GPP with Global Ozone Monitoring Experiment 2 sensor-based Sun-induced chlorophyll fluorescence shows that monthly GPP variations at subcontinental scale were reproduced by SVR (r2 = 1.00, 0.94, 0.91, and 0.89 for Siberia, East Asia, South Asia, and Southeast Asia, respectively). Evaluation of spatially estimated NEE with net atmosphere-land CO2 fluxes of Greenhouse Gases Observing Satellite (GOSAT) Level 4A product shows that monthly variations of these data were consistent in Siberia and East Asia; meanwhile, inconsistency was found in South Asia and Southeast Asia. Furthermore, differences in the land CO2 fluxes from SVR-NEE and GOSAT Level 4A were partially explained by accounting for the differences in the definition of land CO2 fluxes. These data-driven estimates can provide a new opportunity to assess CO2 fluxes in Asia and evaluate and constrain terrestrial ecosystem models.
A Simulation Investigation of Principal Component Regression.
Allen, David E.
Regression analysis is one of the more common analytic tools used by researchers. However, multicollinearity between the predictor variables can cause problems in using the results of regression analyses. Problems associated with multicollinearity include entanglement of relative influences of variables due to reduced precision of estimation,…
Einav, Sharon; Alon, Gady; Kaufman, Nechama; Braunstein, Rony; Carmel, Sara; Varon, Joseph; Hersch, Moshe
2012-09-01
To determine whether variables in physicians' backgrounds influenced their decision to forego resuscitating a patient they did not previously know. Questionnaire survey of a convenience sample of 204 physicians working in the departments of internal medicine, anaesthesiology and cardiology in 11 hospitals in Israel. Twenty per cent of the participants had elected to forego resuscitating a patient they did not previously know without additional consultation. Physicians who had more frequently elected to forego resuscitation had practised medicine for more than 5 years (p=0.013), estimated the number of resuscitations they had performed as being higher (p=0.009), and perceived their experience in resuscitation as sufficient (p=0.001). The variable that predicted the outcome of always performing resuscitation in the logistic regression model was less than 5 years of experience in medicine (OR 0.227, 95% CI 0.065 to 0.793; p=0.02). Physicians' level of experience may affect the probability of a patient's receiving resuscitation, whereas the physicians' personal beliefs and values did not seem to affect this outcome.
Zhang, Hongyang; Welch, William J.; Zamar, Ruben H.
2017-01-01
Tomal et al. (2015) introduced the notion of "phalanxes" in the context of rare-class detection in two-class classification problems. A phalanx is a subset of features that work well for classification tasks. In this paper, we propose a different class of phalanxes for application in regression settings. We define a "Regression Phalanx" - a subset of features that work well together for prediction. We propose a novel algorithm which automatically chooses Regression Phalanxes from high-dimensi...
Directory of Open Access Journals (Sweden)
Kowal Robert
2016-12-01
Full Text Available A simple linear regression model is one of the pillars of classic econometrics. Multiple areas of research function within its scope. One of the many fundamental questions in the model concerns proving the efficiency of the most commonly used OLS estimators and examining their properties. In the literature of the subject one can find taking back to this scope and certain solutions in that regard. Methodically, they are borrowed from the multiple regression model or also from a boundary partial model. Not everything, however, is here complete and consistent. In the paper a completely new scheme is proposed, based on the implementation of the Cauchy-Schwarz inequality in the arrangement of the constraint aggregated from calibrated appropriately secondary constraints of unbiasedness which in a result of choice the appropriate calibrator for each variable directly leads to showing this property. A separate range-is a matter of choice of such a calibrator. These deliberations, on account of the volume and kinds of the calibration, were divided into a few parts. In the one the efficiency of OLS estimators is proven in a mixed scheme of the calibration by averages, that is preliminary, and in the most basic frames of the proposed methodology. In these frames the future outlines and general premises constituting the base of more distant generalizations are being created.
Farhadian, Maryam; Aliabadi, Mohsen; Darvishi, Ebrahim
2015-01-01
Prediction models are used in a variety of medical domains, and they are frequently built from experience which constitutes data acquired from actual cases. This study aimed to analyze the potential of artificial neural networks and logistic regression techniques for estimation of hearing impairment among industrial workers. A total of 210 workers employed in a steel factory (in West of Iran) were selected, and their occupational exposure histories were analyzed. The hearing loss thresholds of the studied workers were determined using a calibrated audiometer. The personal noise exposures were also measured using a noise dosimeter in the workstations. Data obtained from five variables, which can influence the hearing loss, were used as input features, and the hearing loss thresholds were considered as target feature of the prediction methods. Multilayer feedforward neural networks and logistic regression were developed using MATLAB R2011a software. Based on the World Health Organization classification for the grades of hearing loss, 74.2% of the studied workers have normal hearing thresholds, 23.4% have slight hearing loss, and 2.4% have moderate hearing loss. The accuracy and kappa coefficient of the best developed neural networks for prediction of the grades of hearing loss were 88.6 and 66.30, respectively. The accuracy and kappa coefficient of the logistic regression were also 84.28 and 51.30, respectively. Neural networks could provide more accurate predictions of the hearing loss than logistic regression. The prediction method can provide reliable and comprehensible information for occupational health and medicine experts.
Markham, Francis; Young, Martin; Doran, Bruce; Sugden, Mark
2017-05-23
Many jurisdictions regularly conduct surveys to estimate the prevalence of problem gambling in their adult populations. However, the comparison of such estimates is problematic due to methodological variations between studies. Total consumption theory suggests that an association between mean electronic gaming machine (EGM) and casino gambling losses and problem gambling prevalence estimates may exist. If this is the case, then changes in EGM losses may be used as a proxy indicator for changes in problem gambling prevalence. To test for this association this study examines the relationship between aggregated losses on electronic gaming machines (EGMs) and problem gambling prevalence estimates for Australian states and territories between 1994 and 2016. A Bayesian meta-regression analysis of 41 cross-sectional problem gambling prevalence estimates was undertaken using EGM gambling losses, year of survey and methodological variations as predictor variables. General population studies of adults in Australian states and territory published before 1 July 2016 were considered in scope. 41 studies were identified, with a total of 267,367 participants. Problem gambling prevalence, moderate-risk problem gambling prevalence, problem gambling screen, administration mode and frequency threshold were extracted from surveys. Administrative data on EGM and casino gambling loss data were extracted from government reports and expressed as the proportion of household disposable income lost. Money lost on EGMs is correlated with problem gambling prevalence. An increase of 1% of household disposable income lost on EGMs and in casinos was associated with problem gambling prevalence estimates that were 1.33 times higher [95% credible interval 1.04, 1.71]. There was no clear association between EGM losses and moderate-risk problem gambling prevalence estimates. Moderate-risk problem gambling prevalence estimates were not explained by the models (I 2 ≥ 0.97; R 2 ≤ 0.01). The
Directory of Open Access Journals (Sweden)
Francis Markham
2017-05-01
Full Text Available Abstract Background Many jurisdictions regularly conduct surveys to estimate the prevalence of problem gambling in their adult populations. However, the comparison of such estimates is problematic due to methodological variations between studies. Total consumption theory suggests that an association between mean electronic gaming machine (EGM and casino gambling losses and problem gambling prevalence estimates may exist. If this is the case, then changes in EGM losses may be used as a proxy indicator for changes in problem gambling prevalence. To test for this association this study examines the relationship between aggregated losses on electronic gaming machines (EGMs and problem gambling prevalence estimates for Australian states and territories between 1994 and 2016. Methods A Bayesian meta-regression analysis of 41 cross-sectional problem gambling prevalence estimates was undertaken using EGM gambling losses, year of survey and methodological variations as predictor variables. General population studies of adults in Australian states and territory published before 1 July 2016 were considered in scope. 41 studies were identified, with a total of 267,367 participants. Problem gambling prevalence, moderate-risk problem gambling prevalence, problem gambling screen, administration mode and frequency threshold were extracted from surveys. Administrative data on EGM and casino gambling loss data were extracted from government reports and expressed as the proportion of household disposable income lost. Results Money lost on EGMs is correlated with problem gambling prevalence. An increase of 1% of household disposable income lost on EGMs and in casinos was associated with problem gambling prevalence estimates that were 1.33 times higher [95% credible interval 1.04, 1.71]. There was no clear association between EGM losses and moderate-risk problem gambling prevalence estimates. Moderate-risk problem gambling prevalence estimates were not explained by
Relative Pose Estimation and Accuracy Verification of Spherical Panoramic Image
Directory of Open Access Journals (Sweden)
XIE Donghai
2017-11-01
Full Text Available This paper improves the method of the traditional 5-point relative pose estimation algorithm, and proposes a relative pose estimation algorithm which is suitable for spherical panoramic images. The algorithm firstly computes the essential matrix, then decomposes the essential matrix to obtain the rotation matrix and the translation vector using SVD, and finally the reconstructed three-dimensional points are used to eliminate the error solution. The innovation of the algorithm lies the derivation of panorama epipolar formula and the use of the spherical distance from the point to the epipolar plane as the error term for the spherical panorama co-planarity function. The simulation experiment shows that when the random noise of the image feature points is within the range of pixel, the error of the three Euler angles is about 0.1°, and the error between the relative translational displacement and the simulated value is about 1.5°. The result of the experiment using the data obtained by the vehicle panorama camera and the POS shows that:the error of the roll angle and pitch angle can be within 0.2°, the error of the heading angle can be within 0.4°, and the error between the relative translational displacement and the POS can be within 2°. The result of our relative pose estimation algorithm is used to generate the spherical panoramic epipolar images, then we extract the key points between the spherical panoramic images and calculate the errors in the column direction. The result shows that the errors is less than 1 pixel.
Directory of Open Access Journals (Sweden)
Marek Vochozka
2017-12-01
Full Text Available Purpose of the article: Palladium is presently used for producing electronics, industrial products or jewellery, as well as products in the medical field. Its value is raised especially by its unique physical and chemical characteristics. Predicting the value of such a metal is not an easy matter (with regard to the fact that prices may change significantly in time. Methodology/methods: To carry out the analysis, London Fix Price PM data was used, i.e. amounts reported in the afternoon for a period longer than 10 years. To process the data, Statistica software is used. Linear regression is carried out using a whole range of functions, and subsequently regression via neural structures is performed, where several distributional functions are used again. Subsequently, 1000 neural networks are generated, out of which 5 proving the best characteristics are chosen. Scientific aim: The aim of the paper is to perform a regression analysis of the development of the palladium price on the New York Stock Exchange using neural structures and linear regression, then to compare the two methods and determine the more suitable one for a possible prediction of the future development of the palladium price on the New York Stock Exchange. Findings: Results are compared on the level of an expert perspective and the evaluator’s – economist’s experience. Within regression time lines, the curve obtained by the least squares methods via negative-exponential smoothing gets closest to Palladium price line development. Out of the neural networks, all 5 chosen networks prove to be the most practically useful. Conclusions: Because it is not possible to predict extraordinary situations and their impact on the palladium price (at most in the short term, but certainly not over a long period of time, simplification and the creation of a relatively simple model is appropriate and the result is useful.
Kernel PLS Estimation of Single-trial Event-related Potentials
Rosipal, Roman; Trejo, Leonard J.
2004-01-01
Nonlinear kernel partial least squaes (KPLS) regressior, is a novel smoothing approach to nonparametric regression curve fitting. We have developed a KPLS approach to the estimation of single-trial event related potentials (ERPs). For improved accuracy of estimation, we also developed a local KPLS method for situations in which there exists prior knowledge about the approximate latency of individual ERP components. To assess the utility of the KPLS approach, we compared non-local KPLS and local KPLS smoothing with other nonparametric signal processing and smoothing methods. In particular, we examined wavelet denoising, smoothing splines, and localized smoothing splines. We applied these methods to the estimation of simulated mixtures of human ERPs and ongoing electroencephalogram (EEG) activity using a dipole simulator (BESA). In this scenario we considered ongoing EEG to represent spatially and temporally correlated noise added to the ERPs. This simulation provided a reasonable but simplified model of real-world ERP measurements. For estimation of the simulated single-trial ERPs, local KPLS provided a level of accuracy that was comparable with or better than the other methods. We also applied the local KPLS method to the estimation of human ERPs recorded in an experiment on co,onitive fatigue. For these data, the local KPLS method provided a clear improvement in visualization of single-trial ERPs as well as their averages. The local KPLS method may serve as a new alternative to the estimation of single-trial ERPs and improvement of ERP averages.
Directory of Open Access Journals (Sweden)
Soon Jae Lee
2016-09-01
Full Text Available Some recent studies have found regression of liver cirrhosis after antiviral therapy in patients with hepatitis C virus (HCV-related liver cirrhosis, but there have been no reports of complete regression of esophageal varices after interferon/peg-interferon and ribavirin combination therapy. We describe two cases of complete regression of esophageal varices and splenomegaly after interferon-alpha and ribavirin combination therapy in patients with HCV-related liver cirrhosis. Esophageal varices and splenomegaly regressed after 3 and 8 years of sustained virologic responses in cases 1 and 2, respectively. To our knowledge, this is the first study demonstrating that complications of liver cirrhosis, such as esophageal varices and splenomegaly, can regress after antiviral therapy in patients with HCV-related liver cirrhosis.
Sethuramalingam, Prabhu; Vinayagam, Babu Kupusamy
2016-07-01
Carbon nanotube mixed grinding wheel is used in the grinding process to analyze the surface characteristics of AISI D2 tool steel material. Till now no work has been carried out using carbon nanotube based grinding wheel. Carbon nanotube based grinding wheel has excellent thermal conductivity and good mechanical properties which are used to improve the surface finish of the workpiece. In the present study, the multi response optimization of process parameters like surface roughness and metal removal rate of grinding process of single wall carbon nanotube (CNT) in mixed cutting fluids is undertaken using orthogonal array with grey relational analysis. Experiments are performed with designated grinding conditions obtained using the L9 orthogonal array. Based on the results of the grey relational analysis, a set of optimum grinding parameters is obtained. Using the analysis of variance approach the significant machining parameters are found. Empirical model for the prediction of output parameters has been developed using regression analysis and the results are compared empirically, for conditions of with and without CNT grinding wheel in grinding process.
Directory of Open Access Journals (Sweden)
Baxter Lisa K
2008-05-01
Full Text Available Abstract Background There is a growing body of literature linking GIS-based measures of traffic density to asthma and other respiratory outcomes. However, no consensus exists on which traffic indicators best capture variability in different pollutants or within different settings. As part of a study on childhood asthma etiology, we examined variability in outdoor concentrations of multiple traffic-related air pollutants within urban communities, using a range of GIS-based predictors and land use regression techniques. Methods We measured fine particulate matter (PM2.5, nitrogen dioxide (NO2, and elemental carbon (EC outside 44 homes representing a range of traffic densities and neighborhoods across Boston, Massachusetts and nearby communities. Multiple three to four-day average samples were collected at each home during winters and summers from 2003 to 2005. Traffic indicators were derived using Massachusetts Highway Department data and direct traffic counts. Multivariate regression analyses were performed separately for each pollutant, using traffic indicators, land use, meteorology, site characteristics, and central site concentrations. Results PM2.5 was strongly associated with the central site monitor (R2 = 0.68. Additional variability was explained by total roadway length within 100 m of the home, smoking or grilling near the monitor, and block-group population density (R2 = 0.76. EC showed greater spatial variability, especially during winter months, and was predicted by roadway length within 200 m of the home. The influence of traffic was greater under low wind speed conditions, and concentrations were lower during summer (R2 = 0.52. NO2 showed significant spatial variability, predicted by population density and roadway length within 50 m of the home, modified by site characteristics (obstruction, and with higher concentrations during summer (R2 = 0.56. Conclusion Each pollutant examined displayed somewhat different spatial patterns
DEFF Research Database (Denmark)
Fitzenberger, Bernd; Wilke, Ralf Andreas
2015-01-01
if the mean regression model does not. We provide a short informal introduction into the principle of quantile regression which includes an illustrative application from empirical labor market research. This is followed by briefly sketching the underlying statistical model for linear quantile regression based......Quantile regression is emerging as a popular statistical approach, which complements the estimation of conditional mean models. While the latter only focuses on one aspect of the conditional distribution of the dependent variable, the mean, quantile regression provides more detailed insights...... by modeling conditional quantiles. Quantile regression can therefore detect whether the partial effect of a regressor on the conditional quantiles is the same for all quantiles or differs across quantiles. Quantile regression can provide evidence for a statistical relationship between two variables even...
Santos-Concejero, Jordan; Tucker, Ross; Granados, Cristina; Irazusta, Jon; Bidaurrazaga-Letona, Iraia; Zabala-Lili, Jon; Gil, Susana María
2014-01-01
This study investigated the influence of the regression model and initial intensity during an incremental test on the relationship between the lactate threshold estimated by the maximal-deviation method and performance in elite-standard runners. Twenty-three well-trained runners completed a discontinuous incremental running test on a treadmill. Speed started at 9 km · h(-1) and increased by 1.5 km · h(-1) every 4 min until exhaustion, with a minute of recovery for blood collection. Lactate-speed data were fitted by exponential and polynomial models. The lactate threshold was determined for both models, using all the co-ordinates, excluding the first and excluding the first and second points. The exponential lactate threshold was greater than the polynomial equivalent in any co-ordinate condition (P performance and is independent of the initial intensity of the test.
Southard, Rodney E.
2013-01-01
The weather and precipitation patterns in Missouri vary considerably from year to year. In 2008, the statewide average rainfall was 57.34 inches and in 2012, the statewide average rainfall was 30.64 inches. This variability in precipitation and resulting streamflow in Missouri underlies the necessity for water managers and users to have reliable streamflow statistics and a means to compute select statistics at ungaged locations for a better understanding of water availability. Knowledge of surface-water availability is dependent on the streamflow data that have been collected and analyzed by the U.S. Geological Survey for more than 100 years at approximately 350 streamgages throughout Missouri. The U.S. Geological Survey, in cooperation with the Missouri Department of Natural Resources, computed streamflow statistics at streamgages through the 2010 water year, defined periods of drought and defined methods to estimate streamflow statistics at ungaged locations, and developed regional regression equations to compute selected streamflow statistics at ungaged locations. Streamflow statistics and flow durations were computed for 532 streamgages in Missouri and in neighboring States of Missouri. For streamgages with more than 10 years of record, Kendall’s tau was computed to evaluate for trends in streamflow data. If trends were detected, the variable length method was used to define the period of no trend. Water years were removed from the dataset from the beginning of the record for a streamgage until no trend was detected. Low-flow frequency statistics were then computed for the entire period of record and for the period of no trend if 10 or more years of record were available for each analysis. Three methods are presented for computing selected streamflow statistics at ungaged locations. The first method uses power curve equations developed for 28 selected streams in Missouri and neighboring States that have multiple streamgages on the same streams. Statistical
Directory of Open Access Journals (Sweden)
Perinetti, G.
2016-04-01
Full Text Available Introduction: The identification of the onset of the pubertal growth spurt has major clinical implications when dealing with orthodontic treatment in growing subjects. Aim: Through multivariate methods, this study evaluated possible relationships between the gingival crevicular fluid (GCF alkaline phosphatase (ALP activity and pubertal growth spurt and dentition phase. Materials and methods: One hundred healthy growing subjects (62 females, 38 males; mean age, 11.5±2.4 years were enrolled into this doubleblind, prospective, cross-sectional-design study. Phases of skeletal maturation (pre - pubertal, pubertal, post - pubertal was assessed using the cervical vertebral maturation method. Samples of GCF for the ALP activity determination were collected at the mesial and distal sites of the mandibular central incisors. The phases of the dentition were recorded as intermediate mixed, late mixed, or permanent. A multinomial multiple logistic regression model was used to assess relationships of the enzymatic activity to growth phases and dentition phases. Results: The GCF ALP activity was greater in the pubertal growth phase as compared to the pre - pubertal and post - pubertal growth phases. Significant adjusted odds ratios for the GCF ALP activity for the pre - pubertal and post - pubertal subjects, in relation to the pubertal group, were 0.76 and 0.84, respectively. No significant correlations were seen for the dentition phase. Conclusions: The GCF ALP activity is a valid candidate as a non - invasive biomarker for the identification of the pubertal growth spurt irrespective of the dentition phase.
Regression analysis with categorized regression calibrated exposure: some interesting findings
Directory of Open Access Journals (Sweden)
Hjartåker Anette
2006-07-01
Full Text Available Abstract Background Regression calibration as a method for handling measurement error is becoming increasingly well-known and used in epidemiologic research. However, the standard version of the method is not appropriate for exposure analyzed on a categorical (e.g. quintile scale, an approach commonly used in epidemiologic studies. A tempting solution could then be to use the predicted continuous exposure obtained through the regression calibration method and treat it as an approximation to the true exposure, that is, include the categorized calibrated exposure in the main regression analysis. Methods We use semi-analytical calculations and simulations to evaluate the performance of the proposed approach compared to the naive approach of not correcting for measurement error, in situations where analyses are performed on quintile scale and when incorporating the original scale into the categorical variables, respectively. We also present analyses of real data, containing measures of folate intake and depression, from the Norwegian Women and Cancer study (NOWAC. Results In cases where extra information is available through replicated measurements and not validation data, regression calibration does not maintain important qualities of the true exposure distribution, thus estimates of variance and percentiles can be severely biased. We show that the outlined approach maintains much, in some cases all, of the misclassification found in the observed exposure. For that reason, regression analysis with the corrected variable included on a categorical scale is still biased. In some cases the corrected estimates are analytically equal to those obtained by the naive approach. Regression calibration is however vastly superior to the naive method when applying the medians of each category in the analysis. Conclusion Regression calibration in its most well-known form is not appropriate for measurement error correction when the exposure is analyzed on a
Uncertainty related to Environmental Data and Estimated Extreme Events
DEFF Research Database (Denmark)
Burcharth, H. F.
The design loads on rubble mound breakwaters are almost entirely determined by the environmental conditions, i.e. sea state, water levels, sea bed characteristics, etc. It is the objective of sub-group B to identify the most important environmental parameters and evaluate the related uncertainties...... including those corresponding to extreme estimates typically used for design purposes. Basically a design condition is made up of a set of parameter values stemming from several environmental parameters. To be able to evaluate the uncertainty related to design states one must know the corresponding joint....... Consequently this report deals mainly with each parameter separately. Multi parameter problems are briefly discussed in section 9. It is important to notice that the quantified uncertainties reported in section 7.7 represent what might be regarded as typical figures to be used only when no more qualified...
Risk Estimates and Risk Factors Related to Psychiatric Inpatient Suicide
DEFF Research Database (Denmark)
Madsen, Trine; Erlangsen, Annette; Nordentoft, Merete
2017-01-01
trends, and socio-demographic and clinical risk factors of suicide in psychiatric inpatients. Psychiatric inpatients have a very high risk of suicide relative to the background population, but it remains challenging for clinicians to identify those patients that are most likely to die from suicide during......People with mental illness have an increased risk of suicide. The aim of this paper is to provide an overview of suicide risk estimates among psychiatric inpatients based on the body of evidence found in scientific peer-reviewed literature; primarily focusing on the relative risks, rates, time...... admission. Most studies are based on low power, thus compromising quality and generalisability. The few studies with sufficient statistical power mainly identified non-modifiable risk predictors such as male gender, diagnosis, or recent deliberate self-harm. Also, the predictive value of these predictors...
Risk Estimates and Risk Factors Related to Psychiatric Inpatient Suicide
DEFF Research Database (Denmark)
Madsen, Trine; Erlangsen, Annette; Nordentoft, Merete
2017-01-01
People with mental illness have an increased risk of suicide. The aim of this paper is to provide an overview of suicide risk estimates among psychiatric inpatients based on the body of evidence found in scientific peer-reviewed literature; primarily focusing on the relative risks, rates, time...... trends, and socio-demographic and clinical risk factors of suicide in psychiatric inpatients. Psychiatric inpatients have a very high risk of suicide relative to the background population, but it remains challenging for clinicians to identify those patients that are most likely to die from suicide during...... is low. It would be of great benefit if future studies would be based on large samples while focusing on modifiable predictors over the course of an admission, such as hopelessness, depressive symptoms, and family/social situations. This would improve our chances of developing better risk assessment...
Work related injuries: estimating the incidence among illegally employed immigrants
Directory of Open Access Journals (Sweden)
Fadda Emanuela
2010-12-01
Full Text Available Abstract Background Statistics on occupational accidents are based on data from registered employees. With the increasing number of immigrants employed illegally and/or without regular working visas in many developed countries, it is of interest to estimate the injury rate among such unregistered workers. Findings The current study was conducted in an area of North-Eastern Italy. The sources of information employed in the present study were the Accidents and Emergencies records of a hospital; the population data on foreign-born residents in the hospital catchment area (Health Care District 4, Primary Care Trust 20, Province of Verona, Veneto Region, North-Eastern Italy; and the estimated proportion of illegally employed workers in representative samples from the Province of Verona and the Veneto Region. Of the 419 A&E records collected between January and December 2004 among non European Union (non-EU immigrants, 146 aroused suspicion by reporting the home, rather than the workplace, as the site of the accident. These cases were the numerator of the rate. The number of illegally employed non-EU workers, denominator of the rate, was estimated according to different assumptions and ranged from between 537 to 1,338 individuals. The corresponding rates varied from 109.1 to 271.8 per 1,000 non-EU illegal employees, against 65 per 1,000 reported in Italy in 2004. Conclusions The results of this study suggest that there is an unrecorded burden of illegally employed immigrants suffering from work related injuries. Additional efforts for prevention of injuries in the workplace are required to decrease this number. It can be concluded that the Italian National Institute for the Insurance of Work Related Injuries (INAIL probably underestimates the incidence of these accidents in Italy.
Work related injuries: estimating the incidence among illegally employed immigrants.
Mastrangelo, Giuseppe; Rylander, Ragnar; Buja, Alessandra; Marangi, Gianluca; Fadda, Emanuela; Fedeli, Ugo; Cegolon, Luca
2010-12-08
Statistics on occupational accidents are based on data from registered employees. With the increasing number of immigrants employed illegally and/or without regular working visas in many developed countries, it is of interest to estimate the injury rate among such unregistered workers. The current study was conducted in an area of North-Eastern Italy. The sources of information employed in the present study were the Accidents and Emergencies records of a hospital; the population data on foreign-born residents in the hospital catchment area (Health Care District 4, Primary Care Trust 20, Province of Verona, Veneto Region, North-Eastern Italy); and the estimated proportion of illegally employed workers in representative samples from the Province of Verona and the Veneto Region. Of the 419 A&E records collected between January and December 2004 among non European Union (non-EU) immigrants, 146 aroused suspicion by reporting the home, rather than the workplace, as the site of the accident. These cases were the numerator of the rate. The number of illegally employed non-EU workers, denominator of the rate, was estimated according to different assumptions and ranged from between 537 to 1,338 individuals. The corresponding rates varied from 109.1 to 271.8 per 1,000 non-EU illegal employees, against 65 per 1,000 reported in Italy in 2004. The results of this study suggest that there is an unrecorded burden of illegally employed immigrants suffering from work related injuries. Additional efforts for prevention of injuries in the workplace are required to decrease this number. It can be concluded that the Italian National Institute for the Insurance of Work Related Injuries (INAIL) probably underestimates the incidence of these accidents in Italy.
Mack, W J; Krauss, R M; Hodis, H N
1996-05-01
Accumulating evidence suggests that triglyceride-rich lipoproteins contribute to coronary artery disease. Using data from the Monitored Atherosclerosis Regression Study, an angiographic trial of middle-aged men and women randomized to lovastatin or placebo, we investigated relationships between lipoprotein subclasses and progression of coronary artery atherosclerosis. Coronary artery lesion progression was determined by quantitative coronary angiography in low-grade ( or = 50% diameter stenosis), and all coronary artery lesions in 220 baseline/2-year angiogram pairs. Analytical ultracentrifugation was used to measure lipoprotein masses that were statistically evaluated for treatment group differences and relationships to progression of coronary artery atherosclerosis. All low density lipoprotein (LDL), intermediate density lipoprotein (IDL), and very low density lipoprotein (VLDL) masses were significantly lowered and all high density lipoprotein (HDL) masses were significantly raised with lovastatin therapy. The mass of smallest LDL (Svedberg flotation rate [Sf] 0 to 3), IDL (Sf 12 to 20), all VLDL subclasses (Sf 20 to 60, Sf 60 to 100, and Sf 100 to 400), and peak LDL flotation rate were significantly related to the progression of coronary artery lesions, specifically low-grade lesions. Greater baseline levels of HDL3, were related to a lower likelihood of coronary artery lesion progression. In multivariate analyses, small VLDL (Sf 20 to 60) and HDL3 mass were the most important correlates of coronary artery lesion progression. These results provide further evidence for the importance of triglyceride-rich lipoproteins in the progression of coronary artery disease. In addition, these results present new evidence for the possible protective role of HDL3 in the progression of coronary artery lesions. More specific information on coronary artery lesion progression may be obtained through the study of specific apolipoprotein B-containing lipoproteins.
How Do Different Aspects of Spatial Skills Relate to Early Arithmetic and Number Line Estimation?
Directory of Open Access Journals (Sweden)
Véronique Cornu
2017-12-01
Full Text Available The present study investigated the predictive role of spatial skills for arithmetic and number line estimation in kindergarten children (N = 125. Spatial skills are known to be related to mathematical development, but due to the construct’s non-unitary nature, different aspects of spatial skills need to be differentiated. In the present study, a spatial orientation task, a spatial visualization task and visuo-motor integration task were administered to assess three different aspects of spatial skills. Furthermore, we assessed counting abilities, knowledge of Arabic numerals, quantitative knowledge, as well as verbal working memory and verbal intelligence in kindergarten. Four months later, the same children performed an arithmetic and a number line estimation task to evaluate how the abilities measured at Time 1 predicted early mathematics outcomes. Hierarchical regression analysis revealed that children’s performance in arithmetic was predicted by their performance on the spatial orientation and visuo-motor integration task, as well as their knowledge of the Arabic numerals. Performance in number line estimation was significantly predicted by the children’s spatial orientation performance. Our findings emphasize the role of spatial skills, notably spatial orientation, in mathematical development. The relation between spatial orientation and arithmetic was partially mediated by the number line estimation task. Our results further show that some aspects of spatial skills might be more predictive of mathematical development than others, underlining the importance to differentiate within the construct of spatial skills when it comes to understanding numerical development.
The relative pose estimation of aircraft based on contour model
Fu, Tai; Sun, Xiangyi
2017-02-01
This paper proposes a relative pose estimation approach based on object contour model. The first step is to obtain a two-dimensional (2D) projection of three-dimensional (3D)-model-based target, which will be divided into 40 forms by clustering and LDA analysis. Then we proceed by extracting the target contour in each image and computing their Pseudo-Zernike Moments (PZM), thus a model library is constructed in an offline mode. Next, we spot a projection contour that resembles the target silhouette most in the present image from the model library with reference of PZM; then similarity transformation parameters are generated as the shape context is applied to match the silhouette sampling location, from which the identification parameters of target can be further derived. Identification parameters are converted to relative pose parameters, in the premise that these values are the initial result calculated via iterative refinement algorithm, as the relative pose parameter is in the neighborhood of actual ones. At last, Distance Image Iterative Least Squares (DI-ILS) is employed to acquire the ultimate relative pose parameters.
Thevissen, P W; Fieuws, S; Willems, G
2010-01-01
Dental age estimation methods based on the radiologically detected third molar developmental stages are implemented in forensic age assessments to discriminate between juveniles and adults considering the judgment of young unaccompanied asylum seekers. Accurate and unbiased age estimates combined with appropriate quantified uncertainties are the required properties for accurate forensic reporting. In this study, a subset of 910 individuals uniformly distributed in age between 16 and 22 years was selected from an existing dataset collected by Gunst et al. containing 2,513 panoramic radiographs with known third molar developmental stages of Belgian Caucasian men and women. This subset was randomly split in a training set to develop a classical regression analysis and a Bayesian model for the multivariate distribution of the third molar developmental stages conditional on age and in a test set to assess the performance of both models. The aim of this study was to verify if the Bayesian approach differentiates the age of maturity more precisely and removes the bias, which disadvantages the systematically overestimated young individuals. The Bayesian model offers the discrimination of subjects being older than 18 years more appropriate and produces more meaningful prediction intervals but does not strongly outperform the classical approaches.
Robust estimation of event-related potentials via particle filter.
Fukami, Tadanori; Watanabe, Jun; Ishikawa, Fumito
2016-03-01
In clinical examinations and brain-computer interface (BCI) research, a short electroencephalogram (EEG) measurement time is ideal. The use of event-related potentials (ERPs) relies on both estimation accuracy and processing time. We tested a particle filter that uses a large number of particles to construct a probability distribution. We constructed a simple model for recording EEG comprising three components: ERPs approximated via a trend model, background waves constructed via an autoregressive model, and noise. We evaluated the performance of the particle filter based on mean squared error (MSE), P300 peak amplitude, and latency. We then compared our filter with the Kalman filter and a conventional simple averaging method. To confirm the efficacy of the filter, we used it to estimate ERP elicited by a P300 BCI speller. A 400-particle filter produced the best MSE. We found that the merit of the filter increased when the original waveform already had a low signal-to-noise ratio (SNR) (i.e., the power ratio between ERP and background EEG). We calculated the amount of averaging necessary after applying a particle filter that produced a result equivalent to that associated with conventional averaging, and determined that the particle filter yielded a maximum 42.8% reduction in measurement time. The particle filter performed better than both the Kalman filter and conventional averaging for a low SNR in terms of both MSE and P300 peak amplitude and latency. For EEG data produced by the P300 speller, we were able to use our filter to obtain ERP waveforms that were stable compared with averages produced by a conventional averaging method, irrespective of the amount of averaging. We confirmed that particle filters are efficacious in reducing the measurement time required during simulations with a low SNR. Additionally, particle filters can perform robust ERP estimation for EEG data produced via a P300 speller. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Directory of Open Access Journals (Sweden)
Mehmet Siraç Özerdem
2017-04-01
Full Text Available Determining the soil moisture in agricultural fields is a significant parameter to use irrigation systems efficiently. In contrast to standard soil moisture measurements, good results might be acquired in a shorter time over large areas by remote sensing tools. In order to estimate the soil moisture over vegetated agricultural areas, a relationship between Radarsat-2 data and measured ground soil moistures was established by polarimetric decomposition models and a generalized regression neural network (GRNN. The experiments were executed over two agricultural sites on the Tigris Basin, Turkey. The study consists of four phases. In the first stage, Radarsat-2 data were acquired on different dates and in situ measurements were implemented simultaneously. In the second phase, the Radarsat-2 data were pre-processed and the GPS coordinates of the soil sample points were imported to this data. Then the standard sigma backscattering coefficients with the Freeman–Durden and H/A/α polarimetric decomposition models were employed for feature extraction and a feature vector with four sigma backscattering coefficients (σhh, σhv, σvh, and σvv and six polarimetric decomposition parameters (entropy, anisotropy, alpha angle, volume scattering, odd bounce, and double bounce were generated for each pattern. In the last stage, GRNN was used to estimate the regional soil moisture with the aid of feature vectors. The results indicated that radar is a strong remote sensing tool for soil moisture estimation, with mean absolute errors around 2.31 vol %, 2.11 vol %, and 2.10 vol % for Datasets 1–3, respectively; and 2.46 vol %, 2.70 vol %, 7.09 vol %, and 5.70 vol % on Datasets 1 & 2, 2 & 3, 1 & 3, and 1 & 2 & 3, respectively.
Matson, Johnny L.; Kozlowski, Alison M.
2010-01-01
Autistic regression is one of the many mysteries in the developmental course of autism and pervasive developmental disorders not otherwise specified (PDD-NOS). Various definitions of this phenomenon have been used, further clouding the study of the topic. Despite this problem, some efforts at establishing prevalence have been made. The purpose of…
Estimates of Leaf Relative Water Content from Optical Polarization Measurements
Dahlgren, R. P.; Vanderbilt, V. C.; Daughtry, C. S. T.
2017-12-01
Remotely sensing the water status of plant canopies remains a long term goal of remote sensing research. Existing approaches to remotely sensing canopy water status, such as the Crop Water Stress Index (CWSI) and the Equivalent Water Thickness (EWT), have limitations. The CWSI, based upon remotely sensing canopy radiant temperature in the thermal infrared spectral region, does not work well in humid regions, requires estimates of the vapor pressure deficit near the canopy during the remote sensing over-flight and, once stomata close, provides little information regarding the canopy water status. The EWT is based upon the physics of water-light interaction in the 900-2000nm spectral region, not plant physiology. Our goal, development of a remote sensing technique for estimating plant water status based upon measurements in the VIS/NIR spectral region, would potentially provide remote sensing access to plant dehydration physiology - to the cellular photochemistry and structural changes associated with water deficits in leaves. In this research, we used optical, crossed polarization filters to measure the VIS/NIR light reflected from the leaf interior, R, as well as the leaf transmittance, T, for 78 corn (Zea mays) and soybean (Glycine max) leaves having relative water contents (RWC) between 0.60 and 0.98. Our results show that as RWC decreases R increases while T decreases. Our results tie R and T changes in the VIS/NIR to leaf physiological changes - linking the light scattered out of the drying leaf interior to its relative water content and to changes in leaf cellular structure and pigments. Our results suggest remotely sensing the physiological water status of a single leaf - and perhaps of a plant canopy - might be possible in the future.
Logistic regression for dichotomized counts.
Preisser, John S; Das, Kalyan; Benecha, Habtamu; Stamm, John W
2016-12-01
Sometimes there is interest in a dichotomized outcome indicating whether a count variable is positive or zero. Under this scenario, the application of ordinary logistic regression may result in efficiency loss, which is quantifiable under an assumed model for the counts. In such situations, a shared-parameter hurdle model is investigated for more efficient estimation of regression parameters relating to overall effects of covariates on the dichotomous outcome, while handling count data with many zeroes. One model part provides a logistic regression containing marginal log odds ratio effects of primary interest, while an ancillary model part describes the mean count of a Poisson or negative binomial process in terms of nuisance regression parameters. Asymptotic efficiency of the logistic model parameter estimators of the two-part models is evaluated with respect to ordinary logistic regression. Simulations are used to assess the properties of the models with respect to power and Type I error, the latter investigated under both misspecified and correctly specified models. The methods are applied to data from a randomized clinical trial of three toothpaste formulations to prevent incident dental caries in a large population of Scottish schoolchildren. © The Author(s) 2014.
Directory of Open Access Journals (Sweden)
Javier Hernandez
2015-02-01
Full Text Available Plant breeding based on grain yield (GY is an expensive and time-consuming method, so new indirect estimation techniques to evaluate the performance of crops represent an alternative method to improve grain yield. The present study evaluated the ability of canopy reflectance spectroscopy at the range from 350 to 2500 nm to predict GY in a large panel (368 genotypes of wheat (Triticum aestivum L. through multivariate ridge regression models. Plants were treated under three water regimes in the Mediterranean conditions of central Chile: severe water stress (SWS, rain fed, mild water stress (MWS; one irrigation event around booting and full irrigation (FI with mean GYs of 1655, 4739, and 7967 kg∙ha−1, respectively. Models developed from reflectance data during anthesis and grain filling under all water regimes explained between 77% and 91% of the GY variability, with the highest values in SWS condition. When individual models were used to predict yield in the rest of the trials assessed, models fitted during anthesis under MWS performed best. Combined models using data from different water regimes and each phenological stage were used to predict grain yield, and the coefficients of determination (R2 increased to 89.9% and 92.0% for anthesis and grain filling, respectively. The model generated during anthesis in MWS was the best at predicting yields when it was applied to other conditions. Comparisons against conventional reflectance indices were made, showing lower predictive abilities. It was concluded that a Ridge Regression Model using a data set based on spectral reflectance at anthesis or grain filling represents an effective method to predict grain yield in genotypes under different water regimes.
Estimation of Relative Economic Weights of Hanwoo Carcass Traits Based on Carcass Market Price
Choy, Yun Ho; Park, Byoung Ho; Choi, Tae Jung; Choi, Jae Gwan; Cho, Kwang Hyun; Lee, Seung Soo; Choi, You Lim; Koh, Kyung Chul; Kim, Hyo Sun
2012-01-01
The objective of this study was to estimate economic weights of Hanwoo carcass traits that can be used to build economic selection indexes for selection of seedstocks. Data from carcass measures for determining beef yield and quality grades were collected and provided by the Korean Institute for Animal Products Quality Evaluation (KAPE). Out of 1,556,971 records, 476,430 records collected from 13 abattoirs from 2008 to 2010 after deletion of outlying observations were used to estimate relative economic weights of bid price per kg carcass weight on cold carcass weight (CW), eye muscle area (EMA), backfat thickness (BF) and marbling score (MS) and the phenotypic relationships among component traits. Price of carcass tended to increase linearly as yield grades or quality grades, in marginal or in combination, increased. Partial regression coefficients for MS, EMA, BF, and for CW in original scales were +948.5 won/score, +27.3 won/cm2, −95.2 won/mm and +7.3 won/kg when all three sex categories were taken into account. Among four grade determining traits, relative economic weight of MS was the greatest. Variations in partial regression coefficients by sex categories were great but the trends in relative weights for each carcass measures were similar. Relative economic weights of four traits in integer values when standardized measures were fit into covariance model were +4:+1:−1:+1 for MS:EMA:BF:CW. Further research is required to account for the cost of production per unit carcass weight or per unit production under different economic situations. PMID:25049531
Directory of Open Access Journals (Sweden)
Matthias Schmid
Full Text Available Regression analysis with a bounded outcome is a common problem in applied statistics. Typical examples include regression models for percentage outcomes and the analysis of ratings that are measured on a bounded scale. In this paper, we consider beta regression, which is a generalization of logit models to situations where the response is continuous on the interval (0,1. Consequently, beta regression is a convenient tool for analyzing percentage responses. The classical approach to fit a beta regression model is to use maximum likelihood estimation with subsequent AIC-based variable selection. As an alternative to this established - yet unstable - approach, we propose a new estimation technique called boosted beta regression. With boosted beta regression estimation and variable selection can be carried out simultaneously in a highly efficient way. Additionally, both the mean and the variance of a percentage response can be modeled using flexible nonlinear covariate effects. As a consequence, the new method accounts for common problems such as overdispersion and non-binomial variance structures.
Report on estimated nuclear energy related cost for fiscal 1991
International Nuclear Information System (INIS)
1991-01-01
The report first describes major actions planned to be taken in Japan in fiscal 1991 in the field of nuclear energy utilization. Major activities to be made for comprehensive strengthening of safety assurance measures are described, focusing on improvement of nuclear energy related safety regulations, promotion of research for safety assurance, improvement and strengthening of disaster prevention measures, environmental radioactivity surveys, control of exposure of workers engaged in radioactivity related jobs, etc. The report then describes actions required for the establishment of a nuclear fuel cycle, focusing on the procurement of uranium resources, establishment of a uranium enrichment process, reprocessing of spent fuel, application of recovered uranium, etc. Other activities are required for the development of new type reactors, effective application of plutonium, development of basic techniques, international contributions, cooperation with the public. Then, the report summarizes estimated costs required for the activities to be performed by the Japan Atomic Energy Research Institute, Power Reactor and Nuclear Fuel Development Corporation, National Institute of Radiological Sciences, Institute of Physical and Chemical Research. (N.K.)
Olive, David J
2017-01-01
This text covers both multiple linear regression and some experimental design models. The text uses the response plot to visualize the model and to detect outliers, does not assume that the error distribution has a known parametric distribution, develops prediction intervals that work when the error distribution is unknown, suggests bootstrap hypothesis tests that may be useful for inference after variable selection, and develops prediction regions and large sample theory for the multivariate linear regression model that has m response variables. A relationship between multivariate prediction regions and confidence regions provides a simple way to bootstrap confidence regions. These confidence regions often provide a practical method for testing hypotheses. There is also a chapter on generalized linear models and generalized additive models. There are many R functions to produce response and residual plots, to simulate prediction intervals and hypothesis tests, to detect outliers, and to choose response trans...
SEPARATION PHENOMENA LOGISTIC REGRESSION
Directory of Open Access Journals (Sweden)
Ikaro Daniel de Carvalho Barreto
2014-03-01
Full Text Available This paper proposes an application of concepts about the maximum likelihood estimation of the binomial logistic regression model to the separation phenomena. It generates bias in the estimation and provides different interpretations of the estimates on the different statistical tests (Wald, Likelihood Ratio and Score and provides different estimates on the different iterative methods (Newton-Raphson and Fisher Score. It also presents an example that demonstrates the direct implications for the validation of the model and validation of variables, the implications for estimates of odds ratios and confidence intervals, generated from the Wald statistics. Furthermore, we present, briefly, the Firth correction to circumvent the phenomena of separation.
Callén, M S; López, J M; Mastral, A M
2010-08-15
The estimation of benzo(a)pyrene (BaP) concentrations in ambient air is very important from an environmental point of view especially with the introduction of the Directive 2004/107/EC and due to the carcinogenic character of this pollutant. A sampling campaign of particulate matter less or equal than 10 microns (PM10) carried out during 2008-2009 in four locations of Spain was collected to determine experimentally BaP concentrations by gas chromatography mass-spectrometry mass-spectrometry (GC-MS-MS). Multivariate linear regression models (MLRM) were used to predict BaP air concentrations in two sampling places, taking PM10 and meteorological variables as possible predictors. The model obtained with data from two sampling sites (all sites model) (R(2)=0.817, PRESS/SSY=0.183) included the significant variables like PM10, temperature, solar radiation and wind speed and was internally and externally validated. The first validation was performed by cross validation and the last one by BaP concentrations from previous campaigns carried out in Zaragoza from 2001-2004. The proposed model constitutes a first approximation to estimate BaP concentrations in urban atmospheres with very good internal prediction (Q(CV)(2)=0.813, PRESS/SSY=0.187) and with the maximal external prediction for the 2001-2002 campaign (Q(ext)(2)=0.679 and PRESS/SSY=0.321) versus the 2001-2004 campaign (Q(ext)(2)=0.551, PRESS/SSY=0.449). Copyright 2010 Elsevier B.V. All rights reserved.
International Nuclear Information System (INIS)
Callen, M.S.; Lopez, J.M.; Mastral, A.M.
2010-01-01
The estimation of benzo(a)pyrene (BaP) concentrations in ambient air is very important from an environmental point of view especially with the introduction of the Directive 2004/107/EC and due to the carcinogenic character of this pollutant. A sampling campaign of particulate matter less or equal than 10 microns (PM10) carried out during 2008-2009 in four locations of Spain was collected to determine experimentally BaP concentrations by gas chromatography mass-spectrometry mass-spectrometry (GC-MS-MS). Multivariate linear regression models (MLRM) were used to predict BaP air concentrations in two sampling places, taking PM10 and meteorological variables as possible predictors. The model obtained with data from two sampling sites (all sites model) (R 2 = 0.817, PRESS/SSY = 0.183) included the significant variables like PM10, temperature, solar radiation and wind speed and was internally and externally validated. The first validation was performed by cross validation and the last one by BaP concentrations from previous campaigns carried out in Zaragoza from 2001-2004. The proposed model constitutes a first approximation to estimate BaP concentrations in urban atmospheres with very good internal prediction (Q CV 2 =0.813, PRESS/SSY = 0.187) and with the maximal external prediction for the 2001-2002 campaign (Q ext 2 =0.679 and PRESS/SSY = 0.321) versus the 2001-2004 campaign (Q ext 2 =0.551, PRESS/SSY = 0.449).
Directory of Open Access Journals (Sweden)
Chun-Yuan Yeh
2017-09-01
Full Text Available Abstract Background European Union public healthcare expenditure on treating smoking and attributable diseases is estimated at over €25bn annually. The reduction of tobacco consumption has thus become one of the major social policies of the EU. This study investigates the effects of price hikes on cigarette consumption, tobacco tax revenues and smoking-caused deaths in 28 EU countries. Methods Employing panel data for the years 2005 to 2014 from Euromonitor International, the World Bank and the World Health Organization, we used income as a threshold variable and applied threshold regression modelling to estimate the elasticity of cigarette prices and to simulate the effect of price fluctuations. Results The results showed that there was an income threshold effect on cigarette prices in the 28 EU countries that had a gross national income (GNI per capita lower than US$5418, with a maximum cigarette price elasticity of −1.227. The results of the simulated analysis showed that a rise of 10% in cigarette price would significantly reduce cigarette consumption as well the total death toll caused by smoking in all the observed countries, but would be most effective in Bulgaria and Romania, followed by Latvia and Poland. Additionally, an increase in the number of MPOWER tobacco control policies at the highest level of achievment would help reduce cigarette consumption. Conclusions It is recommended that all EU countries levy higher tobacco taxes to increase cigarette prices, and thus in effect reduce cigarette consumption. The subsequent increase in tobacco tax revenues would be instrumental in covering expenditures related to tobacco prevention and control programs.
Yeh, Chun-Yuan; Schafferer, Christian; Lee, Jie-Min; Ho, Li-Ming; Hsieh, Chi-Jung
2017-09-21
European Union public healthcare expenditure on treating smoking and attributable diseases is estimated at over €25bn annually. The reduction of tobacco consumption has thus become one of the major social policies of the EU. This study investigates the effects of price hikes on cigarette consumption, tobacco tax revenues and smoking-caused deaths in 28 EU countries. Employing panel data for the years 2005 to 2014 from Euromonitor International, the World Bank and the World Health Organization, we used income as a threshold variable and applied threshold regression modelling to estimate the elasticity of cigarette prices and to simulate the effect of price fluctuations. The results showed that there was an income threshold effect on cigarette prices in the 28 EU countries that had a gross national income (GNI) per capita lower than US$5418, with a maximum cigarette price elasticity of -1.227. The results of the simulated analysis showed that a rise of 10% in cigarette price would significantly reduce cigarette consumption as well the total death toll caused by smoking in all the observed countries, but would be most effective in Bulgaria and Romania, followed by Latvia and Poland. Additionally, an increase in the number of MPOWER tobacco control policies at the highest level of achievment would help reduce cigarette consumption. It is recommended that all EU countries levy higher tobacco taxes to increase cigarette prices, and thus in effect reduce cigarette consumption. The subsequent increase in tobacco tax revenues would be instrumental in covering expenditures related to tobacco prevention and control programs.
Directory of Open Access Journals (Sweden)
Apif M. Hajji
2017-09-01
Full Text Available Heavy duty diesel (HDD construction equipment which includes bulldozer is important in infrastructure development. This equipment consumes large amount of diesel fuel and emits high level of carbon dioxide (CO2. The total emissions are dependent upon the fuel use, and the fuel use is dependent upon the productivity of the equipment. This paper proposes a methodology and tool for estimating CO2 emissions from bulldozer based on the productivity rate. The methodology is formulated by using the result of multiple linear regressions (MLR of CAT’s data for obtaining the productivity model and combined with the EPA’s NONROAD model. The emission factors from NONROAD model were used to quantify the CO2 emissions. To display the function of the model, a case study and sensitivity analysis for a bulldozer’s activity is also presented. MLR results indicate that the productivity model generated from CAT’s data can be used as the basis for quantifying the total CO2 emissions for an earthwork activity.
Applied regression analysis a research tool
Pantula, Sastry; Dickey, David
1998-01-01
Least squares estimation, when used appropriately, is a powerful research tool. A deeper understanding of the regression concepts is essential for achieving optimal benefits from a least squares analysis. This book builds on the fundamentals of statistical methods and provides appropriate concepts that will allow a scientist to use least squares as an effective research tool. Applied Regression Analysis is aimed at the scientist who wishes to gain a working knowledge of regression analysis. The basic purpose of this book is to develop an understanding of least squares and related statistical methods without becoming excessively mathematical. It is the outgrowth of more than 30 years of consulting experience with scientists and many years of teaching an applied regression course to graduate students. Applied Regression Analysis serves as an excellent text for a service course on regression for non-statisticians and as a reference for researchers. It also provides a bridge between a two-semester introduction to...
Alternative Methods of Regression
Birkes, David
2011-01-01
Of related interest. Nonlinear Regression Analysis and its Applications Douglas M. Bates and Donald G. Watts ".an extraordinary presentation of concepts and methods concerning the use and analysis of nonlinear regression models.highly recommend[ed].for anyone needing to use and/or understand issues concerning the analysis of nonlinear regression models." --Technometrics This book provides a balance between theory and practice supported by extensive displays of instructive geometrical constructs. Numerous in-depth case studies illustrate the use of nonlinear regression analysis--with all data s
Marginal longitudinal semiparametric regression via penalized splines
Al Kadiri, M.
2010-08-01
We study the marginal longitudinal nonparametric regression problem and some of its semiparametric extensions. We point out that, while several elaborate proposals for efficient estimation have been proposed, a relative simple and straightforward one, based on penalized splines, has not. After describing our approach, we then explain how Gibbs sampling and the BUGS software can be used to achieve quick and effective implementation. Illustrations are provided for nonparametric regression and additive models.
Marginal longitudinal semiparametric regression via penalized splines
Al Kadiri, M.; Carroll, R.J.; Wand, M.P.
2010-01-01
We study the marginal longitudinal nonparametric regression problem and some of its semiparametric extensions. We point out that, while several elaborate proposals for efficient estimation have been proposed, a relative simple and straightforward one, based on penalized splines, has not. After describing our approach, we then explain how Gibbs sampling and the BUGS software can be used to achieve quick and effective implementation. Illustrations are provided for nonparametric regression and additive models.
Li, Saijiao; He, Aiyan; Yang, Jing; Yin, TaiLang; Xu, Wangming
2011-01-01
To investigate factors that can affect compliance with treatment of polycystic ovary syndrome (PCOS) in infertile patients and to provide a basis for clinical treatment, specialist consultation and health education. Patient compliance was assessed via a questionnaire based on the Morisky-Green test and the treatment principles of PCOS. Then interviews were conducted with 99 infertile patients diagnosed with PCOS at Renmin Hospital of Wuhan University in China, from March to September 2009. Finally, these data were analyzed using logistic regression analysis. Logistic regression analysis revealed that a total of 23 (25.6%) of the participants showed good compliance. Factors that significantly (p < 0.05) affected compliance with treatment were the patient's body mass index, convenience of medical treatment and concerns about adverse drug reactions. Patients who are obese, experience inconvenient medical treatment or are concerned about adverse drug reactions are more likely to exhibit noncompliance. Treatment education and intervention aimed at these patients should be strengthened in the clinic to improve treatment compliance. Further research is needed to better elucidate the compliance behavior of patients with PCOS.
Directory of Open Access Journals (Sweden)
Koyin Chang
2013-09-01
Full Text Available To understand the impact of drinking and driving laws on drinking and driving fatality rates, this study explored the different effects these laws have on areas with varying severity rates for drinking and driving. Unlike previous studies, this study employed quantile regression analysis. Empirical results showed that policies based on local conditions must be used to effectively reduce drinking and driving fatality rates; that is, different measures should be adopted to target the specific conditions in various regions. For areas with low fatality rates (low quantiles, people’s habits and attitudes toward alcohol should be emphasized instead of transportation safety laws because “preemptive regulations” are more effective. For areas with high fatality rates (or high quantiles, “ex-post regulations” are more effective, and impact these areas approximately 0.01% to 0.05% more than they do areas with low fatality rates.
The HINTS is designed to produce reliable estimates at the national and regional levels. GIS maps using HINTS data have been used to provide a visual representation of possible geographic relationships in HINTS cancer-related variables.
The relative efficiency of three methods of estimating herbage mass ...
African Journals Online (AJOL)
The methods involved were randomly placed circular quadrats; randomly placed narrow strips; and disc meter sampling. Disc meter and quadrat sampling appear to be more efficient than strip sampling. In a subsequent small plot grazing trial the estimates of herbage mass, using the disc meter, had a consistent precision ...
Factoring vs linear modeling in rate estimation: a simulation study of relative accuracy.
Maldonado, G; Greenland, S
1998-07-01
A common strategy for modeling dose-response in epidemiology is to transform ordered exposures and covariates into sets of dichotomous indicator variables (that is, to factor the variables). Factoring tends to increase estimation variance, but it also tends to decrease bias and thus may increase or decrease total accuracy. We conducted a simulation study to examine the impact of factoring on the accuracy of rate estimation. Factored and unfactored Poisson regression models were fit to follow-up study datasets that were randomly generated from 37,500 population model forms that ranged from subadditive to supramultiplicative. In the situations we examined, factoring sometimes substantially improved accuracy relative to fitting the corresponding unfactored model, sometimes substantially decreased accuracy, and sometimes made little difference. The difference in accuracy between factored and unfactored models depended in a complicated fashion on the difference between the true and fitted model forms, the strength of exposure and covariate effects in the population, and the study size. It may be difficult in practice to predict when factoring is increasing or decreasing accuracy. We recommend, therefore, that the strategy of factoring variables be supplemented with other strategies for modeling dose-response.
Directory of Open Access Journals (Sweden)
Chang-Qing Duan
2008-11-01
Full Text Available Color is one of the key characteristics used to evaluate the sensory quality of red wine, and anthocyanins are the main contributors to color. Monomeric anthocyanins and CIELAB color values were investigated by HPLC-MS and spectrophotometry during fermentation of Cabernet Sauvignon red wine, and principal component regression (PCR, a statistical tool, was used to establish a linkage between the detected anthocyanins and wine coloring. The results showed that 14 monomeric anthocyanins could be identified in wine samples, and all of these anthocyanins were negatively correlated with the L*, b* and H*ab values, but positively correlated with a* and C*ab values. On an equal concentration basis for each detected anthocyanin, cyanidin-3-O-glucoside (Cy3-glu had the most influence on CIELAB color value, while malvidin 3-O-glucoside (Mv3-glu had the least. The color values of various monomeric anthocyanins were influenced by their structures, substituents on the B-ring, acyl groups on the glucoside and the molecular steric structure. This work develops a statistical method for evaluating correlation between wine color and monomeric anthocyanins, and also provides a basis for elucidating the effect of intramolecular copigmentation on wine coloring.
Ridge Regression Signal Processing
Kuhl, Mark R.
1990-01-01
The introduction of the Global Positioning System (GPS) into the National Airspace System (NAS) necessitates the development of Receiver Autonomous Integrity Monitoring (RAIM) techniques. In order to guarantee a certain level of integrity, a thorough understanding of modern estimation techniques applied to navigational problems is required. The extended Kalman filter (EKF) is derived and analyzed under poor geometry conditions. It was found that the performance of the EKF is difficult to predict, since the EKF is designed for a Gaussian environment. A novel approach is implemented which incorporates ridge regression to explain the behavior of an EKF in the presence of dynamics under poor geometry conditions. The basic principles of ridge regression theory are presented, followed by the derivation of a linearized recursive ridge estimator. Computer simulations are performed to confirm the underlying theory and to provide a comparative analysis of the EKF and the recursive ridge estimator.
Effects of exposure estimation errors on estimated exposure-response relations for PM2.5.
Cox, Louis Anthony Tony
2018-07-01
Associations between fine particulate matter (PM2.5) exposure concentrations and a wide variety of undesirable outcomes, from autism and auto theft to elderly mortality, suicide, and violent crime, have been widely reported. Influential articles have argued that reducing National Ambient Air Quality Standards for PM2.5 is desirable to reduce these outcomes. Yet, other studies have found that reducing black smoke and other particulate matter by as much as 70% and dozens of micrograms per cubic meter has not detectably affected all-cause mortality rates even after decades, despite strong, statistically significant positive exposure concentration-response (C-R) associations between them. This paper examines whether this disconnect between association and causation might be explained in part by ignored estimation errors in estimated exposure concentrations. We use EPA air quality monitor data from the Los Angeles area of California to examine the shapes of estimated C-R functions for PM2.5 when the true C-R functions are assumed to be step functions with well-defined response thresholds. The estimated C-R functions mistakenly show risk as smoothly increasing with concentrations even well below the response thresholds, thus incorrectly predicting substantial risk reductions from reductions in concentrations that do not affect health risks. We conclude that ignored estimation errors obscure the shapes of true C-R functions, including possible thresholds, possibly leading to unrealistic predictions of the changes in risk caused by changing exposures. Instead of estimating improvements in public health per unit reduction (e.g., per 10 µg/m 3 decrease) in average PM2.5 concentrations, it may be essential to consider how interventions change the distributions of exposure concentrations. Copyright © 2018 Elsevier Inc. All rights reserved.
Improving Relative Combat Power Estimation: The Road to Victory
2014-06-13
was unthinkable before. Napoleon Bonaparte achieved a superior warfighting system compared to his opponents, which resulted in SOF. Napoleon’s...observations about combat power estimation and force empoloyment, remain valid. Napoleon also offered thoughts about combat power and superiority whe he...force. However, Napoleon did not think one- sidedly about the problem. He also said: “The moral is to the physical as three to one.”11 This dual
Patient absorbed radiation doses estimation related to irradiation anatomy
International Nuclear Information System (INIS)
Soares, Flavio Augusto Penna; Soares, Amanda Anastacio; Kahl, Gabrielly Gomes
2014-01-01
Developed a direct equation to estimate the absorbed dose to the patient in x-ray examinations, using electric, geometric parameters and filtering combined with data from irradiated anatomy. To determine the absorbed dose for each examination, the entrance skin dose (ESD) is adjusted to the thickness of the patient's specific anatomy. ESD is calculated from the estimated KERMA greatness in the air. Beer-Lambert equations derived from power data mass absorption coefficients obtained from the NIST / USA, were developed for each tissue: bone, muscle, fat and skin. Skin thickness was set at 2 mm and the bone was estimated in the central ray of the site, in the anteroposterior view. Because they are similar in density and attenuation coefficients, muscle and fat are treated as a single tissue. For evaluation of the full equations, we chose three different anatomies: chest, hand and thigh. Although complex in its shape, the equations simplify direct determination of absorbed dose from the characteristics of the equipment and patient. The input data is inserted at a single time and total absorbed dose (mGy) is calculated instantly. The average error, when compared with available data, is less than 5% in any combination of device data and exams. In calculating the dose for an exam and patient, the operator can choose the variables that will deposit less radiation to the patient through the prior analysis of each combination of variables, using the ALARA principle in routine diagnostic radiology sector
Directory of Open Access Journals (Sweden)
Mok Tik
2014-06-01
Full Text Available This study formulates regression of vector data that will enable statistical analysis of various geodetic phenomena such as, polar motion, ocean currents, typhoon/hurricane tracking, crustal deformations, and precursory earthquake signals. The observed vector variable of an event (dependent vector variable is expressed as a function of a number of hypothesized phenomena realized also as vector variables (independent vector variables and/or scalar variables that are likely to impact the dependent vector variable. The proposed representation has the unique property of solving the coefficients of independent vector variables (explanatory variables also as vectors, hence it supersedes multivariate multiple regression models, in which the unknown coefficients are scalar quantities. For the solution, complex numbers are used to rep- resent vector information, and the method of least squares is deployed to estimate the vector model parameters after transforming the complex vector regression model into a real vector regression model through isomorphism. Various operational statistics for testing the predictive significance of the estimated vector parameter coefficients are also derived. A simple numerical example demonstrates the use of the proposed vector regression analysis in modeling typhoon paths.
DEFF Research Database (Denmark)
Ozenne, Brice; Sørensen, Anne Lyngholm; Scheike, Thomas
2017-01-01
In the presence of competing risks a prediction of the time-dynamic absolute risk of an event can be based on cause-specific Cox regression models for the event and the competing risks (Benichou and Gail, 1990). We present computationally fast and memory optimized C++ functions with an R interface...... for predicting the covariate specific absolute risks, their confidence intervals, and their confidence bands based on right censored time to event data. We provide explicit formulas for our implementation of the estimator of the (stratified) baseline hazard function in the presence of tied event times. As a by...... functionals. The software presented here is implemented in the riskRegression package....
METHODOLOGY RELATED TO ESTIMATION OF INVESTMENT APPEAL OF RURAL SETTLEMENTS
Directory of Open Access Journals (Sweden)
A. S. Voshev
2010-03-01
Full Text Available Conditions for production activity vary considerably from region to region, from area to area, from settlement to settlement. In this connection, investors are challenged to choose an optimum site for a new enterprise. To make the decision, investors follow such references as: investment potential and risk level; their interrelation determines investment appeal of a country, region, area, city or rural settlement. At present Russia faces a problem of «black boxes» represented by a lot of rural settlements. No effective and suitable techniques of quantitative estimation of investment potential, rural settlement risks and systems to make the given information accessible for potential investors exist until now.
Beelen, R.M.J.; Voogt, M.; Duyzer, J.; Zandveld, P.; Hoek, G.
2010-01-01
The performance of a Land Use Regression (LUR) model and a dispersion model (URBIS - URBis Information System) was compared in a Dutch urban area. For the Rijnmond area, i.e. Rotterdam and surroundings, nitrogen dioxide (NO2) concentrations for 2001 were estimated for nearly 70 000 centroids of a
Uncertainty in estimating and mitigating industrial related GHG emissions
International Nuclear Information System (INIS)
El-Fadel, M.; Zeinati, M.; Ghaddar, N.; Mezher, T.
2001-01-01
Global climate change has been one of the challenging environmental concerns facing policy makers in the past decade. The characterization of the wide range of greenhouse gas emissions sources and sinks as well as their behavior in the atmosphere remains an on-going activity in many countries. Lebanon, being a signatory to the Framework Convention on Climate Change, is required to submit and regularly update a national inventory of greenhouse gas emissions sources and removals. Accordingly, an inventory of greenhouse gases from various sectors was conducted following the guidelines set by the United Nations Intergovernmental Panel on Climate Change (IPCC). The inventory indicated that the industrial sector contributes about 29% to the total greenhouse gas emissions divided between industrial processes and energy requirements at 12 and 17%, respectively. This paper describes major mitigation scenarios to reduce emissions from this sector based on associated technical, economic, environmental, and social characteristics. Economic ranking of these scenarios was conducted and uncertainty in emission factors used in the estimation process was emphasized. For this purpose, theoretical and experimental emission factors were used as alternatives to default factors recommended by the IPCC and the significance of resulting deviations in emission estimation is presented. (author)
Heritability estimates for yield and related traits in bread wheat
International Nuclear Information System (INIS)
Din, R.; Jehan, S.; Ibraullah, A.
2009-01-01
A set of 22 experimental wheat lines along with four check cultivars were evaluated in in-irrigated and unirrgated environments with objectives to determine genetic and phenotypic variation and heritability estimates for yield and its traits- The two environments were statistically at par for physiological maturity, plant height, spikes m/sub -2/. spike lets spike/sup -1/ and 1000-grain weight. Highly significant genetic variability existed among wheat lines (P < 0.0 I) in the combined analysis across two test environments for traits except 1000- grain weight. Genotypes x environment interactions were non-significant for traits indicating consistent performance of lines in two test environments. However lines and check cultivars were two to five days early in maturity under unirrigated environment. Plant height, spikes m/sup -2/ and 1000-grain weight also reduced under unirrigated environments. Genetic variances were greater than Environmental variances for most of traits- Heritability estimates were of higher magnitude (0.74 to 0.96) for plant height, medium (0.31 to 0.56) for physiological maturity. spikelets spike/sup -1/ (unirrigated) and 1000-grain weight, and low for spikes m/sup -2/. (author)
Miller, Arthur L; Weakley, Andrew Todd; Griffiths, Peter R; Cauda, Emanuele G; Bayman, Sean
2017-05-01
In order to help reduce silicosis in miners, the National Institute for Occupational Health and Safety (NIOSH) is developing field-portable methods for measuring airborne respirable crystalline silica (RCS), specifically the polymorph α-quartz, in mine dusts. In this study we demonstrate the feasibility of end-of-shift measurement of α-quartz using a direct-on-filter (DoF) method to analyze coal mine dust samples deposited onto polyvinyl chloride filters. The DoF method is potentially amenable for on-site analyses, but deviates from the current regulatory determination of RCS for coal mines by eliminating two sample preparation steps: ashing the sampling filter and redepositing the ash prior to quantification by Fourier transform infrared (FT-IR) spectrometry. In this study, the FT-IR spectra of 66 coal dust samples from active mines were used, and the RCS was quantified by using: (1) an ordinary least squares (OLS) calibration approach that utilizes standard silica material as done in the Mine Safety and Health Administration's P7 method; and (2) a partial least squares (PLS) regression approach. Both were capable of accounting for kaolinite, which can confound the IR analysis of silica. The OLS method utilized analytical standards for silica calibration and kaolin correction, resulting in a good linear correlation with P7 results and minimal bias but with the accuracy limited by the presence of kaolinite. The PLS approach also produced predictions well-correlated to the P7 method, as well as better accuracy in RCS prediction, and no bias due to variable kaolinite mass. Besides decreased sensitivity to mineral or substrate confounders, PLS has the advantage that the analyst is not required to correct for the presence of kaolinite or background interferences related to the substrate, making the method potentially viable for automated RCS prediction in the field. This study demonstrated the efficacy of FT-IR transmission spectrometry for silica determination in
Relative estimation of the mineral ages using uranium migration
International Nuclear Information System (INIS)
Danis, A.
1990-01-01
Using the uranium fission track micro mapping technique the correlation between the age and uranium migration from inclusions was studied. It is shown that during geological time, as function of the mineral, its age and its uranium migration speed, the pattern of the track, clusters corresponding to the uranium inclusions got a typical feature. Thus for a bulk polished geological sample it is possible to establish an age succession of the constituent minerals as a function of the track cluster patterns. Also, it is shown that knowing the migration speed of the uranium in a mineral it is possible to estimate the age of this mineral by measuring the migration distance on the micro mapping. (Author)
Lee, Michael T.; Asquith, William H.; Oden, Timothy D.
2012-01-01
In December 2005, the U.S. Geological Survey (USGS), in cooperation with the City of Houston, Texas, began collecting discrete water-quality samples for nutrients, total organic carbon, bacteria (Escherichia coli and total coliform), atrazine, and suspended sediment at two USGS streamflow-gaging stations that represent watersheds contributing to Lake Houston (08068500 Spring Creek near Spring, Tex., and 08070200 East Fork San Jacinto River near New Caney, Tex.). Data from the discrete water-quality samples collected during 2005–9, in conjunction with continuously monitored real-time data that included streamflow and other physical water-quality properties (specific conductance, pH, water temperature, turbidity, and dissolved oxygen), were used to develop regression models for the estimation of concentrations of water-quality constituents of substantial source watersheds to Lake Houston. The potential explanatory variables included discharge (streamflow), specific conductance, pH, water temperature, turbidity, dissolved oxygen, and time (to account for seasonal variations inherent in some water-quality data). The response variables (the selected constituents) at each site were nitrite plus nitrate nitrogen, total phosphorus, total organic carbon, E. coli, atrazine, and suspended sediment. The explanatory variables provide easily measured quantities to serve as potential surrogate variables to estimate concentrations of the selected constituents through statistical regression. Statistical regression also facilitates accompanying estimates of uncertainty in the form of prediction intervals. Each regression model potentially can be used to estimate concentrations of a given constituent in real time. Among other regression diagnostics, the diagnostics used as indicators of general model reliability and reported herein include the adjusted R-squared, the residual standard error, residual plots, and p-values. Adjusted R-squared values for the Spring Creek models ranged
Polynomial regression analysis and significance test of the regression function
International Nuclear Information System (INIS)
Gao Zhengming; Zhao Juan; He Shengping
2012-01-01
In order to analyze the decay heating power of a certain radioactive isotope per kilogram with polynomial regression method, the paper firstly demonstrated the broad usage of polynomial function and deduced its parameters with ordinary least squares estimate. Then significance test method of polynomial regression function is derived considering the similarity between the polynomial regression model and the multivariable linear regression model. Finally, polynomial regression analysis and significance test of the polynomial function are done to the decay heating power of the iso tope per kilogram in accord with the authors' real work. (authors)
Nonlinear Forecasting With Many Predictors Using Kernel Ridge Regression
DEFF Research Database (Denmark)
Exterkate, Peter; Groenen, Patrick J.F.; Heij, Christiaan
This paper puts forward kernel ridge regression as an approach for forecasting with many predictors that are related nonlinearly to the target variable. In kernel ridge regression, the observed predictor variables are mapped nonlinearly into a high-dimensional space, where estimation of the predi...
Statistical analysis of sediment toxicity by additive monotone regression splines
Boer, de W.J.; Besten, den P.J.; Braak, ter C.J.F.
2002-01-01
Modeling nonlinearity and thresholds in dose-effect relations is a major challenge, particularly in noisy data sets. Here we show the utility of nonlinear regression with additive monotone regression splines. These splines lead almost automatically to the estimation of thresholds. We applied this
Analytical Estimation of Water-Oil Relative Permeabilities through Fractures
Directory of Open Access Journals (Sweden)
Saboorian-Jooybari Hadi
2016-05-01
Full Text Available Modeling multiphase flow through fractures is a key issue for understanding flow mechanism and performance prediction of fractured petroleum reservoirs, geothermal reservoirs, underground aquifers and carbon-dioxide sequestration. One of the most challenging subjects in modeling of fractured petroleum reservoirs is quantifying fluids competition for flow in fracture network (relative permeability curves. Unfortunately, there is no standard technique for experimental measurement of relative permeabilities through fractures and the existing methods are very expensive, time consuming and erroneous. Although, several formulations were presented to calculate fracture relative permeability curves in the form of linear and power functions of flowing fluids saturation, it is still unclear what form of relative permeability curves must be used for proper modeling of flow through fractures and consequently accurate reservoir simulation. Basically, the classic linear relative permeability (X-type curves are used in almost all of reservoir simulators. In this work, basic fluid flow equations are combined to develop a new simple analytical model for water-oil two phase flow in a single fracture. The model gives rise to simple analytic formulations for fracture relative permeabilities. The model explicitly proves that water-oil relative permeabilities in fracture network are functions of fluids saturation, viscosity ratio, fluids density, inclination of fracture plane from horizon, pressure gradient along fracture and rock matrix wettability, however they were considered to be only functions of saturations in the classic X-type and power (Corey [35] and Honarpour et al. [28, 29] models. Eventually, validity of the proposed formulations is checked against literature experimental data. The proposed fracture relative permeability functions have several advantages over the existing ones. Firstly, they are explicit functions of the parameters which are known for
Directory of Open Access Journals (Sweden)
Mir Mustafizur Rahman
2014-11-01
Full Text Available Thermal Infrared (TIR remote sensing images of urban environments are increasingly available from airborne and satellite platforms. However, limited access to high-spatial resolution (H-res: ~1 m TIR satellite images requires the use of TIR airborne sensors for mapping large complex urban surfaces, especially at micro-scales. A critical limitation of such H-res mapping is the need to acquire a large scene composed of multiple flight lines and mosaic them together. This results in the same scene components (e.g., roads, buildings, green space and water exhibiting different temperatures in different flight lines. To mitigate these effects, linear relative radiometric normalization (RRN techniques are often applied. However, the Earth’s surface is composed of features whose thermal behaviour is characterized by complexity and non-linearity. Therefore, we hypothesize that non-linear RRN techniques should demonstrate increased radiometric agreement over similar linear techniques. To test this hypothesis, this paper evaluates four (linear and non-linear RRN techniques, including: (i histogram matching (HM; (ii pseudo-invariant feature-based polynomial regression (PIF_Poly; (iii no-change stratified random sample-based linear regression (NCSRS_Lin; and (iv no-change stratified random sample-based polynomial regression (NCSRS_Poly; two of which (ii and iv are newly proposed non-linear techniques. When applied over two adjacent flight lines (~70 km2 of TABI-1800 airborne data, visual and statistical results show that both new non-linear techniques improved radiometric agreement over the previously evaluated linear techniques, with the new fully-automated method, NCSRS-based polynomial regression, providing the highest improvement in radiometric agreement between the master and the slave images, at ~56%. This is ~5% higher than the best previously evaluated linear technique (NCSRS-based linear regression.
A Seemingly Unrelated Poisson Regression Model
King, Gary
1989-01-01
This article introduces a new estimator for the analysis of two contemporaneously correlated endogenous event count variables. This seemingly unrelated Poisson regression model (SUPREME) estimator combines the efficiencies created by single equation Poisson regression model estimators and insights from "seemingly unrelated" linear regression models.
Granato, Gregory E.
2006-01-01
The Kendall-Theil Robust Line software (KTRLine-version 1.0) is a Visual Basic program that may be used with the Microsoft Windows operating system to calculate parameters for robust, nonparametric estimates of linear-regression coefficients between two continuous variables. The KTRLine software was developed by the U.S. Geological Survey, in cooperation with the Federal Highway Administration, for use in stochastic data modeling with local, regional, and national hydrologic data sets to develop planning-level estimates of potential effects of highway runoff on the quality of receiving waters. The Kendall-Theil robust line was selected because this robust nonparametric method is resistant to the effects of outliers and nonnormality in residuals that commonly characterize hydrologic data sets. The slope of the line is calculated as the median of all possible pairwise slopes between points. The intercept is calculated so that the line will run through the median of input data. A single-line model or a multisegment model may be specified. The program was developed to provide regression equations with an error component for stochastic data generation because nonparametric multisegment regression tools are not available with the software that is commonly used to develop regression models. The Kendall-Theil robust line is a median line and, therefore, may underestimate total mass, volume, or loads unless the error component or a bias correction factor is incorporated into the estimate. Regression statistics such as the median error, the median absolute deviation, the prediction error sum of squares, the root mean square error, the confidence interval for the slope, and the bias correction factor for median estimates are calculated by use of nonparametric methods. These statistics, however, may be used to formulate estimates of mass, volume, or total loads. The program is used to read a two- or three-column tab-delimited input file with variable names in the first row and
Directory of Open Access Journals (Sweden)
Stefan J. Teipel
2015-01-01
Penalized regression yielded more parsimonious models than unpenalized stepwise regression for the integration of multiregional and multimodal imaging information. The advantage of penalized regression was particularly strong with a high number of collinear predictors.
Thevissen, P. W.; FIEUWS, Steffen; Willems, G.
2010-01-01
Dental age estimation methods based on the radiologically detected third molar developmental stages are implemented in forensic age assessments to discriminate between juveniles and adults considering the judgment of young unaccompanied asylum seekers. Accurate and unbiased age estimates combined with appropriate quantified uncertainties are the required properties for accurate forensic reporting. In this study, a subset of 910 individuals uniformly distributed in age between 16 and 22 years ...
New Estimates of Numerical Values Related to a Simplex
Directory of Open Access Journals (Sweden)
Mikhail V. Nevskii
2017-01-01
if \\(\\xi_n=n.\\ This statement is valid only in one direction. There exists a simplex \\(S\\subset Q_5\\ such that the boundary of the simplex \\(5S\\ contains all the vertices of the cube \\(Q_5\\. We describe a one-parameter family of simplices contained in \\(Q_5\\ with the property \\(\\alpha(S=\\xi(S=5.\\ These simplices were found with the use of numerical and symbolic computations. %Numerical experiments allow to discover Another new result is an inequality \\(\\xi_6\\ <6.0166\\. %Прежняя оценка имела вид \\(6\\leq \\xi_6\\leq 6.6\\. We also systematize some of our estimates of numbers \\(\\xi_n\\, \\(\\theta_n\\, \\(\\varkappa_n\\ derived by~now. The symbol \\(\\theta_n\\ denotes the minimal norm of interpolation projection on the space of linear functions of \\(n\\ variables as~an~operator from \\(C(Q_n\\ to~\\(C(Q_n\\.
Directory of Open Access Journals (Sweden)
Mónica Bocco
2010-09-01
Full Text Available The incident solar radiation on soil is an important variable used in agricultural applications; it is also relevant in hydrology, meteorology and soil physics, among others. To estimate this variable, empirical models have been developed using several parameters and, recently, prognostic and prediction models based on artificial intelligence techniques such as neural networks. The aim of this work was to develop linear models and neural networks, multilayer perceptron, to estimate daily global solar radiation and compare their efficiency in its application to a region of the Province of Salta, Argentina. Relative sunshine duration, maximum and minimum temperature, rainfall, binary rainfall and extraterrestrial solar radiation data for the period 1996-2002, were used. All data were supplied by Experimental Station Salta, Instituto Nacional de Tecnología Agropecuaria (INTA, Argentina. For both, neural networks models and linear regressions, three alternative combinations of meteorological parameters were considered. Good results with both prediction methods were obtained, with root mean square error (RMSE values between 1.99 and 1.66 MJ m-2 d-1 for linear regressions and neural networks, and coefficients of correlation (r² between 0.88 and 0.92, respectively. Even though neural networks and linear regression models can be used to predict the daily global solar radiation appropriately, neural networks produced better estimates.La radiación solar incidente en el suelo es una variable importante usada en aplicaciones agronómicas, además es relevante en hidrología, meteorología y física del suelo, entre otros. Para estimarla se han desarrollado modelos empíricos que utilizan distintos parámetros meteorológicos y, recientemente, modelos de pronóstico y predicción basados en técnicas de inteligencia artificial tales como redes neuronales. El objetivo de este trabajo fue desarrollar modelos lineales y de redes neuronales, del tipo perceptr
Multiple linear regression analysis
Edwards, T. R.
1980-01-01
Program rapidly selects best-suited set of coefficients. User supplies only vectors of independent and dependent data and specifies confidence level required. Program uses stepwise statistical procedure for relating minimal set of variables to set of observations; final regression contains only most statistically significant coefficients. Program is written in FORTRAN IV for batch execution and has been implemented on NOVA 1200.
Cross-property relations and permeability estimation in model porous media
International Nuclear Information System (INIS)
Schwartz, L.M.; Martys, N.; Bentz, D.P.; Garboczi, E.J.; Torquato, S.
1993-01-01
Results from a numerical study examining cross-property relations linking fluid permeability to diffusive and electrical properties are presented. Numerical solutions of the Stokes equations in three-dimensional consolidated granular packings are employed to provide a basis of comparison between different permeability estimates. Estimates based on the Λ parameter (a length derived from electrical conduction) and on d c (a length derived from immiscible displacement) are found to be considerably more reliable than estimates based on rigorous permeability bounds related to pore space diffusion. We propose two hybrid relations based on diffusion which provide more accurate estimates than either of the rigorous permeability bounds
Linear regression in astronomy. I
Isobe, Takashi; Feigelson, Eric D.; Akritas, Michael G.; Babu, Gutti Jogesh
1990-01-01
Five methods for obtaining linear regression fits to bivariate data with unknown or insignificant measurement errors are discussed: ordinary least-squares (OLS) regression of Y on X, OLS regression of X on Y, the bisector of the two OLS lines, orthogonal regression, and 'reduced major-axis' regression. These methods have been used by various researchers in observational astronomy, most importantly in cosmic distance scale applications. Formulas for calculating the slope and intercept coefficients and their uncertainties are given for all the methods, including a new general form of the OLS variance estimates. The accuracy of the formulas was confirmed using numerical simulations. The applicability of the procedures is discussed with respect to their mathematical properties, the nature of the astronomical data under consideration, and the scientific purpose of the regression. It is found that, for problems needing symmetrical treatment of the variables, the OLS bisector performs significantly better than orthogonal or reduced major-axis regression.
Ramoelo, A.; Skidmore, A.K.; Cho, M.A.; Mathieu, R.; Heitkonig, I.M.A.; Dudeni-Tlhone, N.; Schlerf, M.; Prins, H.H.T.
2013-01-01
Grass nitrogen (N) and phosphorus (P) concentrations are direct indicators of rangeland quality and provide imperative information for sound management of wildlife and livestock. It is challenging to estimate grass N and P concentrations using remote sensing in the savanna ecosystems. These areas
CSIR Research Space (South Africa)
Ramoelo, Abel
2013-06-01
Full Text Available in situ hyperspectral and environmental variables yielded the highest grass N and P estimation accuracy (R2 = 0.81, root mean square error (RMSE) = 0.08, and R2 = 0.80, RMSE = 0.03, respectively) as compared to using remote sensing variables only...
Nonparametric Mixture of Regression Models.
Huang, Mian; Li, Runze; Wang, Shaoli
2013-07-01
Motivated by an analysis of US house price index data, we propose nonparametric finite mixture of regression models. We study the identifiability issue of the proposed models, and develop an estimation procedure by employing kernel regression. We further systematically study the sampling properties of the proposed estimators, and establish their asymptotic normality. A modified EM algorithm is proposed to carry out the estimation procedure. We show that our algorithm preserves the ascent property of the EM algorithm in an asymptotic sense. Monte Carlo simulations are conducted to examine the finite sample performance of the proposed estimation procedure. An empirical analysis of the US house price index data is illustrated for the proposed methodology.
Bayesian ARTMAP for regression.
Sasu, L M; Andonie, R
2013-10-01
Bayesian ARTMAP (BA) is a recently introduced neural architecture which uses a combination of Fuzzy ARTMAP competitive learning and Bayesian learning. Training is generally performed online, in a single-epoch. During training, BA creates input data clusters as Gaussian categories, and also infers the conditional probabilities between input patterns and categories, and between categories and classes. During prediction, BA uses Bayesian posterior probability estimation. So far, BA was used only for classification. The goal of this paper is to analyze the efficiency of BA for regression problems. Our contributions are: (i) we generalize the BA algorithm using the clustering functionality of both ART modules, and name it BA for Regression (BAR); (ii) we prove that BAR is a universal approximator with the best approximation property. In other words, BAR approximates arbitrarily well any continuous function (universal approximation) and, for every given continuous function, there is one in the set of BAR approximators situated at minimum distance (best approximation); (iii) we experimentally compare the online trained BAR with several neural models, on the following standard regression benchmarks: CPU Computer Hardware, Boston Housing, Wisconsin Breast Cancer, and Communities and Crime. Our results show that BAR is an appropriate tool for regression tasks, both for theoretical and practical reasons. Copyright © 2013 Elsevier Ltd. All rights reserved.
Directory of Open Access Journals (Sweden)
Leszek Klukowski
2012-01-01
Full Text Available This paper presents a review of results of the author in the area of estimation of the relations of equivalence, tolerance and preference within a finite set based on multiple, independent (in a stochastic way pairwise comparisons with random errors, in binary and multivalent forms. These estimators require weaker assumptions than those used in the literature on the subject. Estimates of the relations are obtained based on solutions to problems from discrete optimization. They allow application of both types of comparisons - binary and multivalent (this fact relates to the tolerance and preference relations. The estimates can be verified in a statistical way; in particular, it is possible to verify the type of the relation. The estimates have been applied by the author to problems regarding forecasting, financial engineering and bio-cybernetics. (original abstract
Directory of Open Access Journals (Sweden)
Hossein Mojaddadi Rizeei
2018-01-01
Full Text Available The current study proposes a new method for oil palm age estimation and counting. A support vector machine algorithm (SVM of object-based image analysis (OBIA was implemented for oil palm counting. It was integrated with height model and multiregression methods to accurately estimate the age of trees based on their heights in five different plantation blocks. Multiregression and multi-kernel size models were examined over five different oil palm plantation blocks to achieve the most optimized model for age estimation. The sensitivity analysis was conducted on four SVM kernel types (sigmoid (SIG, linear (LN, radial basis function (RBF, and polynomial (PL with associated parameters (threshold values, gamma γ, and penalty factor (c to obtain the optimal OBIA classification approaches for each plantation block. Very high-resolution imageries of WorldView-3 (WV-3 and light detection and range (LiDAR were used for oil palm detection and age assessment. The results of oil palm detection had an overall accuracy of 98.27%, 99.48%, 99.28%, 99.49%, and 97.49% for blocks A, B, C, D, and E, respectively. Moreover, the accuracy of age estimation analysis showed 90.1% for 3-year-old, 87.9% for 4-year-old, 88.0% for 6-year-old, 87.6% for 8-year-old, 79.1% for 9-year-old, and 76.8% for 22-year-old trees. Overall, the study revealed that remote sensing techniques can be useful to monitor and detect oil palm plantation for sustainable agricultural management.
Subset selection in regression
Miller, Alan
2002-01-01
Originally published in 1990, the first edition of Subset Selection in Regression filled a significant gap in the literature, and its critical and popular success has continued for more than a decade. Thoroughly revised to reflect progress in theory, methods, and computing power, the second edition promises to continue that tradition. The author has thoroughly updated each chapter, incorporated new material on recent developments, and included more examples and references. New in the Second Edition:A separate chapter on Bayesian methodsComplete revision of the chapter on estimationA major example from the field of near infrared spectroscopyMore emphasis on cross-validationGreater focus on bootstrappingStochastic algorithms for finding good subsets from large numbers of predictors when an exhaustive search is not feasible Software available on the Internet for implementing many of the algorithms presentedMore examplesSubset Selection in Regression, Second Edition remains dedicated to the techniques for fitting...
Differentiating regressed melanoma from regressed lichenoid keratosis.
Chan, Aegean H; Shulman, Kenneth J; Lee, Bonnie A
2017-04-01
Distinguishing regressed lichen planus-like keratosis (LPLK) from regressed melanoma can be difficult on histopathologic examination, potentially resulting in mismanagement of patients. We aimed to identify histopathologic features by which regressed melanoma can be differentiated from regressed LPLK. Twenty actively inflamed LPLK, 12 LPLK with regression and 15 melanomas with regression were compared and evaluated by hematoxylin and eosin staining as well as Melan-A, microphthalmia transcription factor (MiTF) and cytokeratin (AE1/AE3) immunostaining. (1) A total of 40% of regressed melanomas showed complete or near complete loss of melanocytes within the epidermis with Melan-A and MiTF immunostaining, while 8% of regressed LPLK exhibited this finding. (2) Necrotic keratinocytes were seen in the epidermis in 33% regressed melanomas as opposed to all of the regressed LPLK. (3) A dense infiltrate of melanophages in the papillary dermis was seen in 40% of regressed melanomas, a feature not seen in regressed LPLK. In summary, our findings suggest that a complete or near complete loss of melanocytes within the epidermis strongly favors a regressed melanoma over a regressed LPLK. In addition, necrotic epidermal keratinocytes and the presence of a dense band-like distribution of dermal melanophages can be helpful in differentiating these lesions. © 2016 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Quantile regression theory and applications
Davino, Cristina; Vistocco, Domenico
2013-01-01
A guide to the implementation and interpretation of Quantile Regression models This book explores the theory and numerous applications of quantile regression, offering empirical data analysis as well as the software tools to implement the methods. The main focus of this book is to provide the reader with a comprehensivedescription of the main issues concerning quantile regression; these include basic modeling, geometrical interpretation, estimation and inference for quantile regression, as well as issues on validity of the model, diagnostic tools. Each methodological aspect is explored and
Callén Romero, Mª Soledad; López Sebastián, José Manuel; Mastral Lamarca, Ana María
2010-01-01
The estimation of benzo(a)pyrene (BaP) concentrations in ambient air is very important from an environmental point of view especially with the introduction of the Directive 2004/107/EC and due to the carcinogenic character of this pollutant. A sampling campaign of particulate matter less or equal than 10 microns (PM10) carried out during 2008-2009 in four locations of Spain was collected to determine experimentally BaP concentrations by gas chromatography-mass spectrometry-mass spectrometry (...
Adaptive metric kernel regression
DEFF Research Database (Denmark)
Goutte, Cyril; Larsen, Jan
2000-01-01
Kernel smoothing is a widely used non-parametric pattern recognition technique. By nature, it suffers from the curse of dimensionality and is usually difficult to apply to high input dimensions. In this contribution, we propose an algorithm that adapts the input metric used in multivariate...... regression by minimising a cross-validation estimate of the generalisation error. This allows to automatically adjust the importance of different dimensions. The improvement in terms of modelling performance is illustrated on a variable selection task where the adaptive metric kernel clearly outperforms...
Adaptive Metric Kernel Regression
DEFF Research Database (Denmark)
Goutte, Cyril; Larsen, Jan
1998-01-01
Kernel smoothing is a widely used nonparametric pattern recognition technique. By nature, it suffers from the curse of dimensionality and is usually difficult to apply to high input dimensions. In this paper, we propose an algorithm that adapts the input metric used in multivariate regression...... by minimising a cross-validation estimate of the generalisation error. This allows one to automatically adjust the importance of different dimensions. The improvement in terms of modelling performance is illustrated on a variable selection task where the adaptive metric kernel clearly outperforms the standard...
Directory of Open Access Journals (Sweden)
J. B. Liu
2016-11-01
Full Text Available Forty-eight barrows with an average initial body weight of 25.5±0.3 kg were assigned to 6 dietary treatments arranged in a 3×2 factorial of 3 graded levels of P at 1.42, 2.07, or 2.72 g/kg, and 2 levels of casein at 0 or 50 g/kg to compare the estimates of true total tract digestibility (TTTD of P in soybean meal (SBM for pigs fed diets with or without casein supplementation. The SBM is the only source of P in diets without casein, and in the diet with added casein, 1.0 to 2.4 g/kg of total dietary P was supplied by SBM as dietary level of SBM increased. The experiment consisted of a 5-d adjustment period and a 5-d total collection period with ferric oxide as a maker to indicate the initiation and termination of fecal collection. There were interactive effects of casein supplementation and total dietary P level on the apparent total tract digestibility (ATTD and retention of P (p0.05. In summary, our results demonstrate that the estimates of TTTD of P in SBM for pigs were not affected by constant casein inclusion in the basal diets.
Saputro, Dewi Retno Sari; Widyaningsih, Purnami
2017-08-01
In general, the parameter estimation of GWOLR model uses maximum likelihood method, but it constructs a system of nonlinear equations, making it difficult to find the solution. Therefore, an approximate solution is needed. There are two popular numerical methods: the methods of Newton and Quasi-Newton (QN). Newton's method requires large-scale time in executing the computation program since it contains Jacobian matrix (derivative). QN method overcomes the drawback of Newton's method by substituting derivative computation into a function of direct computation. The QN method uses Hessian matrix approach which contains Davidon-Fletcher-Powell (DFP) formula. The Broyden-Fletcher-Goldfarb-Shanno (BFGS) method is categorized as the QN method which has the DFP formula attribute of having positive definite Hessian matrix. The BFGS method requires large memory in executing the program so another algorithm to decrease memory usage is needed, namely Low Memory BFGS (LBFGS). The purpose of this research is to compute the efficiency of the LBFGS method in the iterative and recursive computation of Hessian matrix and its inverse for the GWOLR parameter estimation. In reference to the research findings, we found out that the BFGS and LBFGS methods have arithmetic operation schemes, including O(n2) and O(nm).
Regression: The Apple Does Not Fall Far From the Tree.
Vetter, Thomas R; Schober, Patrick
2018-05-15
Researchers and clinicians are frequently interested in either: (1) assessing whether there is a relationship or association between 2 or more variables and quantifying this association; or (2) determining whether 1 or more variables can predict another variable. The strength of such an association is mainly described by the correlation. However, regression analysis and regression models can be used not only to identify whether there is a significant relationship or association between variables but also to generate estimations of such a predictive relationship between variables. This basic statistical tutorial discusses the fundamental concepts and techniques related to the most common types of regression analysis and modeling, including simple linear regression, multiple regression, logistic regression, ordinal regression, and Poisson regression, as well as the common yet often underrecognized phenomenon of regression toward the mean. The various types of regression analysis are powerful statistical techniques, which when appropriately applied, can allow for the valid interpretation of complex, multifactorial data. Regression analysis and models can assess whether there is a relationship or association between 2 or more observed variables and estimate the strength of this association, as well as determine whether 1 or more variables can predict another variable. Regression is thus being applied more commonly in anesthesia, perioperative, critical care, and pain research. However, it is crucial to note that regression can identify plausible risk factors; it does not prove causation (a definitive cause and effect relationship). The results of a regression analysis instead identify independent (predictor) variable(s) associated with the dependent (outcome) variable. As with other statistical methods, applying regression requires that certain assumptions be met, which can be tested with specific diagnostics.
Messier, K. P.; Serre, M. L.
2015-12-01
Radon (222Rn) is a naturally occurring chemically inert, colorless, and odorless radioactive gas produced from the decay of uranium (238U), which is ubiquitous in rocks and soils worldwide. Exposure to 222Rn is likely the second leading cause of lung cancer after cigarette smoking via inhalation; however, exposure through untreated groundwater is also a contributing factor to both inhalation and ingestion routes. A land use regression (LUR) model for groundwater 222Rn with anisotropic geological and 238U based explanatory variables is developed, which helps elucidate the factors contributing to elevated 222Rn across North Carolina. Geological and uranium based variables are constructed in elliptical buffers surrounding each observation such that they capture the lateral geometric anisotropy present in groundwater 222Rn. Moreover, geological features are defined at three different geological spatial scales to allow the model to distinguish between large area and small area effects of geology on groundwater 222Rn. The LUR is also integrated into the Bayesian Maximum Entropy (BME) geostatistical framework to increase accuracy and produce a point-level LUR-BME model of groundwater 222Rn across North Carolina including prediction uncertainty. The LUR-BME model of groundwater 222Rn results in a leave-one out cross-validation of 0.46 (Pearson correlation coefficient= 0.68), effectively predicting within the spatial covariance range. Modeled results of 222Rn concentrations show variability among Intrusive Felsic geological formations likely due to average bedrock 238U defined on the basis of overlying stream-sediment 238U concentrations that is a widely distributed consistently analyzed point-source data.
Directory of Open Access Journals (Sweden)
Mohamed Gar Alalm
2018-05-01
Full Text Available Chlorothalonil is a pesticide that can contaminate water bodies, detriment aquatic organisms, and cause cancers of the forestomach and kidney. In this study, a powdered activated carbon prepared from casuarina wood was used for the adsorption of chlorothalonil from aqueous solutions. Based on Scanning Electron microscopy and Fourier Transform Infrared Spectroscopy analyses, the adsorbent material comprised pores and multiple functional groups that favored the entrapment of chlorothalonil onto its surface. At initial chlorothalonil concentration of 480 mg L−1, the equilibrium uptake capacity was 187 mg g−1 at pH: 7, adsorbent dosage: 0.5 g L−1, contact time: 40 min, and room temperature (25 ± 4 °C. The kinetic and isotherm studies indicated that the rate constant of pseudo-second-order model (k2 was 0.003 g mg−1 min−1, and the monolayer adsorption capacity was 192 mg g−1. Results from a quadratic model demonstrated that the plot of adsorption capacity versus pH, chlorothalonil concentration, adsorbent dosage, and contact time caused quadratic-concave, linear-up, flat, and quadratic-linear concave up curves, respectively. An artificial neural network with a structure of 4–5–1 was able to predict the adsorption capacity (R2: 0.982, and the sensitivity analysis using connection weights showed that pH was the most influential factor. An economic estimation using amortization and operating costs revealed that an adsorption unit subjected to 100 m3 d−1 containing chlorothalonil concentration of 250 ± 50 mg L−1 could cost 1.18 $ m−3. Keywords: Activated carbon, Artificial neural network, Chlorothalonil pesticide, Cost estimation, Kinetics and isotherms
Institute of Scientific and Technical Information of China (English)
高鸿云; 冯金英; 徐俊冕; 郑士俊
2001-01-01
Objective: To identify the related psychosocial risk factors of emotional disorders in children. Methods:To use case-control approach in which. Diagnosis was made by clinical interview according to ICD-10 criteria. Eighty eight cases and controls separately filled out general condition inventory. The results were put into Logistic regression model for analysis. Results: The children with timid personality, without kindergarten education, or with parents who were administrative or technical personnel, were apt to have emotional disorders. The children who were usually counseled by their mothers had less emotional disorders than those were beaten. Conclusion: The emotional disorders were the results of multiple factors. Prevention of children's emotional disorders should be focused on the children's personality and family education.
Alexandrowicz, Rainer W; Jahn, Rebecca; Friedrich, Fabian; Unger, Anne
2016-06-01
Various studies have shown that caregiving relatives of schizophrenic patients are at risk of suffering from depression. These studies differ with respect to the applied statistical methods, which could influence the findings. Therefore, the present study analyzes to which extent different methods may cause differing results. The present study contrasts by means of one data set the results of three different modelling approaches, Rasch Modelling (RM), Structural Equation Modelling (SEM), and Linear Regression Modelling (LRM). The results of the three models varied considerably, reflecting the different assumptions of the respective models. Latent trait models (i. e., RM and SEM) generally provide more convincing results by correcting for measurement error and the RM specifically proves superior for it treats ordered categorical data most adequately.
Oscillation estimates relative to p-homogeneous forms and Kato measures data
Directory of Open Access Journals (Sweden)
Marco Biroli
2006-11-01
Full Text Available We state pointwise estimate for the positive subsolutions associated to a p-homogeneous form and nonnegative Radon measures data. As a by-product we establish an oscillation’s estimate for the solutions relative to Kato measures data.
Estimates of the relative specific yield of aquifers from geo-electrical ...
African Journals Online (AJOL)
This paper discusses a method of estimating aquifer specific yield based on surface resistivity sounding measurements supplemented with data on water conductivity. The practical aim of the method is to suggest a parallel low cost method of estimating aquifer properties. The starting point is the Archie's law, which relates ...
An estimator for the relative entropy rate of path measures for stochastic differential equations
Energy Technology Data Exchange (ETDEWEB)
Opper, Manfred, E-mail: manfred.opper@tu-berlin.de
2017-02-01
We address the problem of estimating the relative entropy rate (RER) for two stochastic processes described by stochastic differential equations. For the case where the drift of one process is known analytically, but one has only observations from the second process, we use a variational bound on the RER to construct an estimator.
Regression in organizational leadership.
Kernberg, O F
1979-02-01
The choice of good leaders is a major task for all organizations. Inforamtion regarding the prospective administrator's personality should complement questions regarding his previous experience, his general conceptual skills, his technical knowledge, and the specific skills in the area for which he is being selected. The growing psychoanalytic knowledge about the crucial importance of internal, in contrast to external, object relations, and about the mutual relationships of regression in individuals and in groups, constitutes an important practical tool for the selection of leaders.
Tong, Xuming; Chen, Jinghang; Miao, Hongyu; Li, Tingting; Zhang, Le
2015-01-01
Agent-based models (ABM) and differential equations (DE) are two commonly used methods for immune system simulation. However, it is difficult for ABM to estimate key parameters of the model by incorporating experimental data, whereas the differential equation model is incapable of describing the complicated immune system in detail. To overcome these problems, we developed an integrated ABM regression model (IABMR). It can combine the advantages of ABM and DE by employing ABM to mimic the multi-scale immune system with various phenotypes and types of cells as well as using the input and output of ABM to build up the Loess regression for key parameter estimation. Next, we employed the greedy algorithm to estimate the key parameters of the ABM with respect to the same experimental data set and used ABM to describe a 3D immune system similar to previous studies that employed the DE model. These results indicate that IABMR not only has the potential to simulate the immune system at various scales, phenotypes and cell types, but can also accurately infer the key parameters like DE model. Therefore, this study innovatively developed a complex system development mechanism that could simulate the complicated immune system in detail like ABM and validate the reliability and efficiency of model like DE by fitting the experimental data. PMID:26535589
Directory of Open Access Journals (Sweden)
Feng Qi
2014-10-01
Full Text Available The authors find the absolute monotonicity and complete monotonicity of some functions involving trigonometric functions and related to estimates the lower bounds of the first eigenvalue of Laplace operator on Riemannian manifolds.
Significance of relative velocity in drag force or drag power estimation for a tethered float
Digital Repository Service at National Institute of Oceanography (India)
Vethamony, P.; Sastry, J.S.
There is difference in opinion regarding the use of relative velocity instead of particle velocity alone in the estimation of drag force or power. In the present study, a tethered spherical float which undergoes oscillatory motion in regular waves...
ESTIMATION OF THE KNOWLEDGE SPILLOVER EFFECTS BETWEEN FIRMS IN BIO-RELATED INDUSTRIES
Kim, Hanho; Kim, Jae-Kyung
2005-01-01
Knowledge spillover is a kind of externality originating from imperfect appropriation of R&D performances, which implies that the knowledge created by one agent could be transmitted to other related agents by affecting their R&D or other economic performances. For the estimation of knowledge spillover effects based on firm-level patent data between firms in bio-related industries, patents production function, as a proxy of knowledge production function, is formulated and estimated. Knowledge ...
Unbalanced Regressions and the Predictive Equation
DEFF Research Database (Denmark)
Osterrieder, Daniela; Ventosa-Santaulària, Daniel; Vera-Valdés, J. Eduardo
Predictive return regressions with persistent regressors are typically plagued by (asymptotically) biased/inconsistent estimates of the slope, non-standard or potentially even spurious statistical inference, and regression unbalancedness. We alleviate the problem of unbalancedness in the theoreti......Predictive return regressions with persistent regressors are typically plagued by (asymptotically) biased/inconsistent estimates of the slope, non-standard or potentially even spurious statistical inference, and regression unbalancedness. We alleviate the problem of unbalancedness...
A gentle introduction to quantile regression for ecologists
Cade, B.S.; Noon, B.R.
2003-01-01
Quantile regression is a way to estimate the conditional quantiles of a response variable distribution in the linear model that provides a more complete view of possible causal relationships between variables in ecological processes. Typically, all the factors that affect ecological processes are not measured and included in the statistical models used to investigate relationships between variables associated with those processes. As a consequence, there may be a weak or no predictive relationship between the mean of the response variable (y) distribution and the measured predictive factors (X). Yet there may be stronger, useful predictive relationships with other parts of the response variable distribution. This primer relates quantile regression estimates to prediction intervals in parametric error distribution regression models (eg least squares), and discusses the ordering characteristics, interval nature, sampling variation, weighting, and interpretation of the estimates for homogeneous and heterogeneous regression models.
Linear regression and the normality assumption.
Schmidt, Amand F; Finan, Chris
2017-12-16
Researchers often perform arbitrary outcome transformations to fulfill the normality assumption of a linear regression model. This commentary explains and illustrates that in large data settings, such transformations are often unnecessary, and worse may bias model estimates. Linear regression assumptions are illustrated using simulated data and an empirical example on the relation between time since type 2 diabetes diagnosis and glycated hemoglobin levels. Simulation results were evaluated on coverage; i.e., the number of times the 95% confidence interval included the true slope coefficient. Although outcome transformations bias point estimates, violations of the normality assumption in linear regression analyses do not. The normality assumption is necessary to unbiasedly estimate standard errors, and hence confidence intervals and P-values. However, in large sample sizes (e.g., where the number of observations per variable is >10) violations of this normality assumption often do not noticeably impact results. Contrary to this, assumptions on, the parametric model, absence of extreme observations, homoscedasticity, and independency of the errors, remain influential even in large sample size settings. Given that modern healthcare research typically includes thousands of subjects focusing on the normality assumption is often unnecessary, does not guarantee valid results, and worse may bias estimates due to the practice of outcome transformations. Copyright © 2017 Elsevier Inc. All rights reserved.
Satellite rainfall retrieval by logistic regression
Chiu, Long S.
1986-01-01
The potential use of logistic regression in rainfall estimation from satellite measurements is investigated. Satellite measurements provide covariate information in terms of radiances from different remote sensors.The logistic regression technique can effectively accommodate many covariates and test their significance in the estimation. The outcome from the logistical model is the probability that the rainrate of a satellite pixel is above a certain threshold. By varying the thresholds, a rainrate histogram can be obtained, from which the mean and the variant can be estimated. A logistical model is developed and applied to rainfall data collected during GATE, using as covariates the fractional rain area and a radiance measurement which is deduced from a microwave temperature-rainrate relation. It is demonstrated that the fractional rain area is an important covariate in the model, consistent with the use of the so-called Area Time Integral in estimating total rain volume in other studies. To calibrate the logistical model, simulated rain fields generated by rainfield models with prescribed parameters are needed. A stringent test of the logistical model is its ability to recover the prescribed parameters of simulated rain fields. A rain field simulation model which preserves the fractional rain area and lognormality of rainrates as found in GATE is developed. A stochastic regression model of branching and immigration whose solutions are lognormally distributed in some asymptotic limits has also been developed.
Stanišić Stojić, Svetlana; Stanišić, Nemanja; Stojić, Andreja
2016-07-11
To propose a new method for including the cumulative mid-term effects of air pollution in the traditional Poisson regression model and compare the temperature-related mortality risk estimates, before and after including air pollution data. The analysis comprised a total of 56,920 residents aged 65 years or older who died from circulatory and respiratory diseases in Belgrade, Serbia, and daily mean PM10, NO2, SO2 and soot concentrations obtained for the period 2009-2014. After accounting for the cumulative effects of air pollutants, the risk associated with cold temperatures was significantly lower and the overall temperature-attributable risk decreased from 8.80 to 3.00 %. Furthermore, the optimum range of temperature, within which no excess temperature-related mortality is expected to occur, was very broad, between -5 and 21 °C, which differs from the previous findings that most of the attributable deaths were associated with mild temperatures. These results suggest that, in polluted areas of developing countries, most of the mortality risk, previously attributed to cold temperatures, can be explained by the mid-term effects of air pollution. The results also showed that the estimated relative importance of PM10 was the smallest of four examined pollutant species, and thus, including PM10 data only is clearly not the most effective way to control for the effects of air pollution.
From Rasch scores to regression
DEFF Research Database (Denmark)
Christensen, Karl Bang
2006-01-01
Rasch models provide a framework for measurement and modelling latent variables. Having measured a latent variable in a population a comparison of groups will often be of interest. For this purpose the use of observed raw scores will often be inadequate because these lack interval scale propertie....... This paper compares two approaches to group comparison: linear regression models using estimated person locations as outcome variables and latent regression models based on the distribution of the score....
Background stratified Poisson regression analysis of cohort data.
Richardson, David B; Langholz, Bryan
2012-03-01
Background stratified Poisson regression is an approach that has been used in the analysis of data derived from a variety of epidemiologically important studies of radiation-exposed populations, including uranium miners, nuclear industry workers, and atomic bomb survivors. We describe a novel approach to fit Poisson regression models that adjust for a set of covariates through background stratification while directly estimating the radiation-disease association of primary interest. The approach makes use of an expression for the Poisson likelihood that treats the coefficients for stratum-specific indicator variables as 'nuisance' variables and avoids the need to explicitly estimate the coefficients for these stratum-specific parameters. Log-linear models, as well as other general relative rate models, are accommodated. This approach is illustrated using data from the Life Span Study of Japanese atomic bomb survivors and data from a study of underground uranium miners. The point estimate and confidence interval obtained from this 'conditional' regression approach are identical to the values obtained using unconditional Poisson regression with model terms for each background stratum. Moreover, it is shown that the proposed approach allows estimation of background stratified Poisson regression models of non-standard form, such as models that parameterize latency effects, as well as regression models in which the number of strata is large, thereby overcoming the limitations of previously available statistical software for fitting background stratified Poisson regression models.
Background stratified Poisson regression analysis of cohort data
International Nuclear Information System (INIS)
Richardson, David B.; Langholz, Bryan
2012-01-01
Background stratified Poisson regression is an approach that has been used in the analysis of data derived from a variety of epidemiologically important studies of radiation-exposed populations, including uranium miners, nuclear industry workers, and atomic bomb survivors. We describe a novel approach to fit Poisson regression models that adjust for a set of covariates through background stratification while directly estimating the radiation-disease association of primary interest. The approach makes use of an expression for the Poisson likelihood that treats the coefficients for stratum-specific indicator variables as 'nuisance' variables and avoids the need to explicitly estimate the coefficients for these stratum-specific parameters. Log-linear models, as well as other general relative rate models, are accommodated. This approach is illustrated using data from the Life Span Study of Japanese atomic bomb survivors and data from a study of underground uranium miners. The point estimate and confidence interval obtained from this 'conditional' regression approach are identical to the values obtained using unconditional Poisson regression with model terms for each background stratum. Moreover, it is shown that the proposed approach allows estimation of background stratified Poisson regression models of non-standard form, such as models that parameterize latency effects, as well as regression models in which the number of strata is large, thereby overcoming the limitations of previously available statistical software for fitting background stratified Poisson regression models. (orig.)
Fawzy, Manal; Nasr, Mahmoud; Adel, Samar; Helmi, Shacker
2018-03-21
This study investigated the application of Potamogeton pectinatus for Ni(II)-ions biosorption from aqueous solutions. FTIR spectra showed that the functional groups of -OH, C-H, -C = O, and -COO- could form an organometallic complex with Ni(II)-ions on the biomaterial surface. SEM/EDX analysis indicated that the voids on the biosorbent surface were blocked due to Ni(II)-ions uptake via an ion exchange mechanism. For Ni(II)-ions of 50 mg/L, the adsorption efficiency recorded 63.4% at pH: 5, biosorbent dosage: 10 g/L, and particle-diameter: 0.125-0.25 mm within 180 minutes. A quadratic model depicted that the plot of removal efficiency against pH or contact time caused quadratic-linear concave up curves, whereas the curve of initial Ni(II)-ions was quadratic-linear convex down. Artificial neural network with a structure of 5 - 6 - 1 was able to predict the adsorption efficiency (R 2 : 0.967). The relative importance of inputs was: initial Ni(II)-ions > pH > contact time > biosorbent dosage > particle-size. Freundlich isotherm described well the adsorption mechanism (R 2 : 0.974), which indicated a multilayer adsorption onto energetically heterogeneous surfaces. The net cost of using P. pectinatus for the removal of Ni(II)-ions (4.25 ± 1.26 mg/L) from real industrial effluents within 30 minutes was 3.4 $USD/m 3 .
Fridman, M; Hodgkins, P S; Kahle, J S; Erder, M H
2015-06-01
There are few approved therapies for adults with attention-deficit/hyperactivity disorder (ADHD) in Europe. Lisdexamfetamine (LDX) is an effective treatment for ADHD; however, no clinical trials examining the efficacy of LDX specifically in European adults have been conducted. Therefore, to estimate the efficacy of LDX in European adults we performed a meta-regression of existing clinical data. A systematic review identified US- and Europe-based randomized efficacy trials of LDX, atomoxetine (ATX), or osmotic-release oral system methylphenidate (OROS-MPH) in children/adolescents and adults. A meta-regression model was then fitted to the published/calculated effect sizes (Cohen's d) using medication, geographical location, and age group as predictors. The LDX effect size in European adults was extrapolated from the fitted model. Sensitivity analyses performed included using adult-only studies and adding studies with placebo designs other than a standard pill-placebo design. Twenty-two of 2832 identified articles met inclusion criteria. The model-estimated effect size of LDX for European adults was 1.070 (95% confidence interval: 0.738, 1.401), larger than the 0.8 threshold for large effect sizes. The overall model fit was adequate (80%) and stable in the sensitivity analyses. This model predicts that LDX may have a large treatment effect size in European adults with ADHD. Copyright © 2015 Elsevier Masson SAS. All rights reserved.
Gibbons, S. J.; Pabian, F.; Näsholm, S. P.; Kværna, T.; Mykkeltveit, S.
2017-01-01
Declared North Korean nuclear tests in 2006, 2009, 2013 and 2016 were observed seismically at regional and teleseismic distances. Waveform similarity allows the events to be located relatively with far greater accuracy than the absolute locations can be determined from seismic data alone. There is now significant redundancy in the data given the large number of regional and teleseismic stations that have recorded multiple events, and relative location estimates can be confirmed independently by performing calculations on many mutually exclusive sets of measurements. Using a 1-D global velocity model, the distances between the events estimated using teleseismic P phases are found to be approximately 25 per cent shorter than the distances between events estimated using regional Pn phases. The 2009, 2013 and 2016 events all take place within 1 km of each other and the discrepancy between the regional and teleseismic relative location estimates is no more than about 150 m. The discrepancy is much more significant when estimating the location of the more distant 2006 event relative to the later explosions with regional and teleseismic estimates varying by many hundreds of metres. The relative location of the 2006 event is challenging given the smaller number of observing stations, the lower signal-to-noise ratio and significant waveform dissimilarity at some regional stations. The 2006 event is however highly significant in constraining the absolute locations in the terrain at the Punggye-ri test-site in relation to observed surface infrastructure. For each seismic arrival used to estimate the relative locations, we define a slowness scaling factor which multiplies the gradient of seismic traveltime versus distance, evaluated at the source, relative to the applied 1-D velocity model. A procedure for estimating correction terms which reduce the double-difference time residual vector norms is presented together with a discussion of the associated uncertainty. The modified
Calibrated Tully-Fisher relations for improved estimates of disc rotation velocities
Reyes, R.; Mandelbaum, R.; Gunn, J. E.; Pizagno II, Jim; Lackner, C. N.
2011-01-01
In this paper, we derive scaling relations between photometric observable quantities and disc galaxy rotation velocity V-rot or Tully-Fisher relations (TFRs). Our methodology is dictated by our purpose of obtaining purely photometric, minimal-scatter estimators of V-rot applicable to large galaxy
Linear regression in astronomy. II
Feigelson, Eric D.; Babu, Gutti J.
1992-01-01
A wide variety of least-squares linear regression procedures used in observational astronomy, particularly investigations of the cosmic distance scale, are presented and discussed. The classes of linear models considered are (1) unweighted regression lines, with bootstrap and jackknife resampling; (2) regression solutions when measurement error, in one or both variables, dominates the scatter; (3) methods to apply a calibration line to new data; (4) truncated regression models, which apply to flux-limited data sets; and (5) censored regression models, which apply when nondetections are present. For the calibration problem we develop two new procedures: a formula for the intercept offset between two parallel data sets, which propagates slope errors from one regression to the other; and a generalization of the Working-Hotelling confidence bands to nonstandard least-squares lines. They can provide improved error analysis for Faber-Jackson, Tully-Fisher, and similar cosmic distance scale relations.
Polylinear regression analysis in radiochemistry
International Nuclear Information System (INIS)
Kopyrin, A.A.; Terent'eva, T.N.; Khramov, N.N.
1995-01-01
A number of radiochemical problems have been formulated in the framework of polylinear regression analysis, which permits the use of conventional mathematical methods for their solution. The authors have considered features of the use of polylinear regression analysis for estimating the contributions of various sources to the atmospheric pollution, for studying irradiated nuclear fuel, for estimating concentrations from spectral data, for measuring neutron fields of a nuclear reactor, for estimating crystal lattice parameters from X-ray diffraction patterns, for interpreting data of X-ray fluorescence analysis, for estimating complex formation constants, and for analyzing results of radiometric measurements. The problem of estimating the target parameters can be incorrect at certain properties of the system under study. The authors showed the possibility of regularization by adding a fictitious set of data open-quotes obtainedclose quotes from the orthogonal design. To estimate only a part of the parameters under consideration, the authors used incomplete rank models. In this case, it is necessary to take into account the possibility of confounding estimates. An algorithm for evaluating the degree of confounding is presented which is realized using standard software or regression analysis
International Nuclear Information System (INIS)
Vincent, C.H.
1982-01-01
Bayes' principle is applied to the differential counting measurement of a positive quantity in which the statistical errors are not necessarily small in relation to the true value of the quantity. The methods of estimation derived are found to give consistent results and to avoid the anomalous negative estimates sometimes obtained by conventional methods. One of the methods given provides a simple means of deriving the required estimates from conventionally presented results and appears to have wide potential applications. Both methods provide the actual posterior probability distribution of the quantity to be measured. A particularly important potential application is the correction of counts on low radioacitvity samples for background. (orig.)
Augé, Robert M; Toler, Heather D; Saxton, Arnold M
2016-01-01
Arbuscular mycorrhizal (AM) symbiosis often stimulates gas exchange rates of the host plant. This may relate to mycorrhizal effects on host nutrition and growth rate, or the influence may occur independently of these. Using meta-regression, we tested the strength of the relationship between AM-induced increases in gas exchange, and AM size and leaf mineral effects across the literature. With only a few exceptions, AM stimulation of carbon exchange rate (CER), stomatal conductance (g s), and transpiration rate (E) has been significantly associated with mycorrhizal stimulation of shoot dry weight, leaf phosphorus, leaf nitrogen:phosphorus ratio, and percent root colonization. The sizeable mycorrhizal stimulation of CER, by 49% over all studies, has been about twice as large as the mycorrhizal stimulation of g s and E (28 and 26%, respectively). CER has been over twice as sensitive as g s and four times as sensitive as E to mycorrhizal colonization rates. The AM-induced stimulation of CER increased by 19% with each AM-induced doubling of shoot size; the AM effect was about half as large for g s and E. The ratio of leaf N to leaf P has been more closely associated with mycorrhizal influence on leaf gas exchange than leaf P alone. The mycorrhizal influence on CER has declined markedly over the 35 years of published investigations.
Directory of Open Access Journals (Sweden)
Robert M. Augé
2016-07-01
Full Text Available Arbuscular mycorrhizal (AM symbiosis often stimulates gas exchange rates of the host plant. This may relate to mycorrhizal effects on host nutrition and growth rate, or the influence may occur independently of these. Using meta-regression, we tested the strength of the relationship between AM-induced increases in gas exchange, and AM size and leaf mineral effects across the literature. With only a few exceptions, AM stimulation of carbon exchange rate (CER, stomatal conductance (gs and transpiration rate (E has been significantly associated with mycorrhizal stimulation of shoot dry weight, leaf phosphorus, leaf nitrogen: phosphorus ratio and percent root colonization. The sizeable mycorrhizal stimulation of CER, by 49% over all studies, has been about twice as large as the mycorrhizal stimulation of gs and E (28% and 26%, respectively. Carbon exchange rate has been over twice as sensitive as gs and four times as sensitive as E to mycorrhizal colonization rates. The AM-induced stimulation of CER increased by 19% with each AM-induced doubling of shoot size; the AM effect was about half as large for gs and E. The ratio of leaf N to leaf P has been more closely associated with mycorrhizal influence on leaf gas exchange than leaf P alone. The mycorrhizal influence on CER has declined markedly over the 35 years of published investigations.
Directory of Open Access Journals (Sweden)
Boris Kauhl
2016-11-01
Full Text Available Abstract Background The provision of general practitioners (GPs in Germany still relies mainly on the ratio of inhabitants to GPs at relatively large scales and barely accounts for an increased prevalence of chronic diseases among the elderly and socially underprivileged populations. Type 2 Diabetes Mellitus (T2DM is one of the major cost-intensive diseases with high rates of potentially preventable complications. Provision of healthcare and access to preventive measures is necessary to reduce the burden of T2DM. However, current studies on the spatial variation of T2DM in Germany are mostly based on survey data, which do not only underestimate the true prevalence of T2DM, but are also only available on large spatial scales. The aim of this study is therefore to analyse the spatial distribution of T2DM at fine geographic scales and to assess location-specific risk factors based on data of the AOK health insurance. Methods To display the spatial heterogeneity of T2DM, a bivariate, adaptive kernel density estimation (KDE was applied. The spatial scan statistic (SaTScan was used to detect areas of high risk. Global and local spatial regression models were then constructed to analyze socio-demographic risk factors of T2DM. Results T2DM is especially concentrated in rural areas surrounding Berlin. The risk factors for T2DM consist of proportions of 65–79 year olds, 80 + year olds, unemployment rate among the 55–65 year olds, proportion of employees covered by mandatory social security insurance, mean income tax, and proportion of non-married couples. However, the strength of the association between T2DM and the examined socio-demographic variables displayed strong regional variations. Conclusion The prevalence of T2DM varies at the very local level. Analyzing point data on T2DM of northeastern Germany’s largest health insurance provider thus allows very detailed, location-specific knowledge about increased medical needs. Risk factors
Linear Regression Models for Estimating True Subsurface ...
Indian Academy of Sciences (India)
47
The objective is to minimize the processing time and computer memory required. 10 to carry out inversion .... to the mainland by two long bridges. .... term. In this approach, the model converges when the squared sum of the differences. 143.
Chen, Wansu; Shi, Jiaxiao; Qian, Lei; Azen, Stanley P
2014-06-26
To estimate relative risks or risk ratios for common binary outcomes, the most popular model-based methods are the robust (also known as modified) Poisson and the log-binomial regression. Of the two methods, it is believed that the log-binomial regression yields more efficient estimators because it is maximum likelihood based, while the robust Poisson model may be less affected by outliers. Evidence to support the robustness of robust Poisson models in comparison with log-binomial models is very limited. In this study a simulation was conducted to evaluate the performance of the two methods in several scenarios where outliers existed. The findings indicate that for data coming from a population where the relationship between the outcome and the covariate was in a simple form (e.g. log-linear), the two models yielded comparable biases and mean square errors. However, if the true relationship contained a higher order term, the robust Poisson models consistently outperformed the log-binomial models even when the level of contamination is low. The robust Poisson models are more robust (or less sensitive) to outliers compared to the log-binomial models when estimating relative risks or risk ratios for common binary outcomes. Users should be aware of the limitations when choosing appropriate models to estimate relative risks or risk ratios.