WorldWideScience

Sample records for linear regression filter

  1. Linear Regression Based Real-Time Filtering

    Misel Batmend

    2013-01-01

    This paper introduces a real-time filtering method based on a line fitted by linear least squares. The method can be used when the filtered signal is linear; this constraint narrows the range of potential applications. Its advantage over the Kalman filter is that it is computationally less expensive. The paper further deals with the application of the introduced method to filtering data used to evaluate the position of engraved material with respect to an engraving machine. The filter was implemented in the control system of a CNC engraving machine. Experiments showing its performance are included.
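
    A minimal sketch of how such a filter might look, assuming a fixed-length sliding window and an ordinary least-squares line refitted on every new sample. The class name `LineFitFilter`, the `window` parameter, and the sample data are illustrative assumptions; the abstract does not give the paper's actual implementation.

```python
from collections import deque


class LineFitFilter:
    """Sliding-window filter: fit a least-squares line to the recent
    samples and report the value of that line at the newest time stamp."""

    def __init__(self, window=8):
        # (t, y) pairs; the oldest pair is dropped automatically when full.
        self.samples = deque(maxlen=window)

    def update(self, t, y):
        self.samples.append((t, y))
        n = len(self.samples)
        if n < 2:
            return y  # a line cannot be fitted to a single point
        mean_t = sum(p[0] for p in self.samples) / n
        mean_y = sum(p[1] for p in self.samples) / n
        sxx = sum((p[0] - mean_t) ** 2 for p in self.samples)
        sxy = sum((p[0] - mean_t) * (p[1] - mean_y) for p in self.samples)
        if sxx == 0.0:
            return mean_y  # all time stamps identical: fall back to the mean
        slope = sxy / sxx
        intercept = mean_y - slope * mean_t
        return intercept + slope * t


# Hypothetical noisy samples of a signal that is roughly y = t.
f = LineFitFilter(window=5)
noisy = [(0, 0.1), (1, 0.9), (2, 2.2), (3, 2.8), (4, 4.1)]
filtered = [f.update(t, y) for t, y in noisy]
```

    In this sketch each update costs a number of arithmetic operations proportional to the window length and needs no matrix inversion.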

  2. Linear regression

    Olive, David J

    2017-01-01

    This text covers both multiple linear regression and some experimental design models. The text uses the response plot to visualize the model and to detect outliers, does not assume that the error distribution has a known parametric distribution, develops prediction intervals that work when the error distribution is unknown, suggests bootstrap hypothesis tests that may be useful for inference after variable selection, and develops prediction regions and large sample theory for the multivariate linear regression model that has m response variables. A relationship between multivariate prediction regions and confidence regions provides a simple way to bootstrap confidence regions. These confidence regions often provide a practical method for testing hypotheses. There is also a chapter on generalized linear models and generalized additive models. There are many R functions to produce response and residual plots, to simulate prediction intervals and hypothesis tests, to detect outliers, and to choose response trans...

  3. A Cross-Domain Collaborative Filtering Algorithm Based on Feature Construction and Locally Weighted Linear Regression.

    Yu, Xu; Lin, Jun-Yu; Jiang, Feng; Du, Jun-Wei; Han, Ji-Zhong

    2018-01-01

    Cross-domain collaborative filtering (CDCF) solves the sparsity problem by transferring rating knowledge from auxiliary domains. Obviously, different auxiliary domains have different importance to the target domain. However, previous works cannot effectively evaluate the significance of different auxiliary domains. To overcome this drawback, we propose a cross-domain collaborative filtering algorithm based on Feature Construction and Locally Weighted Linear Regression (FCLWLR). We first construct features in different domains and use these features to represent different auxiliary domains. Thus the weight computation across different domains can be converted into the weight computation across different features. Then we combine the features in the target domain and in the auxiliary domains and convert the cross-domain recommendation problem into a regression problem. Finally, we employ a Locally Weighted Linear Regression (LWLR) model to solve the regression problem. As LWLR is a nonparametric regression method, it can effectively avoid the underfitting or overfitting problems that occur in parametric regression methods. We conduct extensive experiments to show that the proposed FCLWLR algorithm is effective in addressing the data sparsity problem by transferring the useful knowledge from the auxiliary domains, as compared to many state-of-the-art single-domain or cross-domain CF methods.
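
    The LWLR step can be illustrated with a generic sketch. It assumes a single input feature and a Gaussian kernel with bandwidth `tau`; both are simplifications chosen here for illustration, and the function name `lwlr_predict` is hypothetical. The feature construction used by FCLWLR is not reproduced.

```python
import math


def lwlr_predict(x_query, xs, ys, tau=1.0):
    """Locally weighted linear regression for one feature.

    A separate weighted least-squares line is fitted for each query
    point; training points near x_query receive the largest weights.
    """
    # Gaussian kernel weights centred on the query point.
    w = [math.exp(-((x - x_query) ** 2) / (2.0 * tau ** 2)) for x in xs]
    sw = sum(w)  # assumed > 0, i.e. the query is not extremely far from all data
    mx = sum(wi * x for wi, x in zip(w, xs)) / sw
    my = sum(wi * y for wi, y in zip(w, ys)) / sw
    sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
    sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
    slope = sxy / sxx if sxx > 0.0 else 0.0
    return my + slope * (x_query - mx)


# Hypothetical data following a curve that no single global line fits well.
xs = [i * 0.5 for i in range(21)]   # 0.0, 0.5, ..., 10.0
ys = [x * x for x in xs]
local_estimate = lwlr_predict(5.0, xs, ys, tau=0.5)
```

    Because a new line is fitted around every query point, no single global parameter vector is learned, which is the sense in which the method is nonparametric.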

  4. A Cross-Domain Collaborative Filtering Algorithm Based on Feature Construction and Locally Weighted Linear Regression

    Xu Yu

    2018-01-01

    Cross-domain collaborative filtering (CDCF) solves the sparsity problem by transferring rating knowledge from auxiliary domains. Obviously, different auxiliary domains have different importance to the target domain. However, previous works cannot effectively evaluate the significance of different auxiliary domains. To overcome this drawback, we propose a cross-domain collaborative filtering algorithm based on Feature Construction and Locally Weighted Linear Regression (FCLWLR). We first construct features in different domains and use these features to represent different auxiliary domains. Thus the weight computation across different domains can be converted into the weight computation across different features. Then we combine the features in the target domain and in the auxiliary domains and convert the cross-domain recommendation problem into a regression problem. Finally, we employ a Locally Weighted Linear Regression (LWLR) model to solve the regression problem. As LWLR is a nonparametric regression method, it can effectively avoid the underfitting or overfitting problems that occur in parametric regression methods. We conduct extensive experiments to show that the proposed FCLWLR algorithm is effective in addressing the data sparsity problem by transferring the useful knowledge from the auxiliary domains, as compared to many state-of-the-art single-domain or cross-domain CF methods.

  5. Method validation using weighted linear regression models for quantification of UV filters in water samples.

    da Silva, Claudia Pereira; Emídio, Elissandro Soares; de Marchi, Mary Rosa Rodrigues

    2015-01-01

    This paper describes the validation of a method consisting of solid-phase extraction followed by gas chromatography-tandem mass spectrometry for the analysis of the ultraviolet (UV) filters benzophenone-3, ethylhexyl salicylate, ethylhexyl methoxycinnamate and octocrylene. The method validation criteria included evaluation of selectivity, analytical curve, trueness, precision, limits of detection and limits of quantification. The non-weighted linear regression model has traditionally been used for calibration, but it is not necessarily the optimal model in all cases. Because the assumption of homoscedasticity was not met for the analytical data in this work, a weighted least squares linear regression was used for the calibration method. The evaluated analytical parameters were satisfactory for the analytes and showed recoveries at four fortification levels between 62% and 107%, with relative standard deviations less than 14%. The detection limits ranged from 7.6 to 24.1 ng L⁻¹. The proposed method was used to determine the amount of UV filters in water samples from water treatment plants in Araraquara and Jau in São Paulo, Brazil. Copyright © 2014 Elsevier B.V. All rights reserved.
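
    A weighted least-squares calibration line can be sketched as follows. The weighting scheme 1/x², the concentrations, and the responses are assumptions made for illustration only; the abstract does not state which weights the authors chose, and `weighted_line_fit` is a hypothetical helper name.

```python
def weighted_line_fit(x, y, w):
    """Return (intercept, slope) minimising sum(w_i * (y_i - a - b*x_i)**2)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw   # weighted mean of x
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw   # weighted mean of y
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return intercept, slope


# Hypothetical calibration standards (concentration, instrument response).
conc = [10.0, 50.0, 100.0, 500.0, 1000.0]
resp = [21.0, 98.0, 205.0, 1040.0, 1930.0]

# Assumed weights 1/x^2: low-concentration standards count for more,
# which counteracts a response variance that grows with concentration.
weights = [1.0 / c ** 2 for c in conc]
intercept_w, slope_w = weighted_line_fit(conc, resp, weights)

# Unweighted fit for comparison (all weights equal).
intercept_u, slope_u = weighted_line_fit(conc, resp, [1.0] * len(conc))
```

    Comparing the two fits shows how strongly the highest standards steer the unweighted line when the data are heteroscedastic.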

  6. Applied linear regression

    Weisberg, Sanford

    2013-01-01

    Praise for the Third Edition: "...this is an excellent book which could easily be used as a course text..." (International Statistical Institute). The Fourth Edition of Applied Linear Regression provides a thorough update of the basic theory and methodology of linear regression modeling. Demonstrating the practical applications of linear regression analysis techniques, the Fourth Edition uses interesting, real-world exercises and examples. Stressing central concepts such as model building, understanding parameters, assessing fit and reliability, and drawing conclusions, the new edition illus

  7. Recursive Algorithm For Linear Regression

    Varanasi, S. V.

    1988-01-01

    Order of model determined easily. Linear-regression algorithm includes recursive equations for coefficients of model of increased order. Algorithm eliminates duplicative calculations and facilitates search for minimum order of linear-regression model that fits set of data satisfactorily.

  8. Multiple linear regression analysis

    Edwards, T. R.

    1980-01-01

    Program rapidly selects best-suited set of coefficients. User supplies only vectors of independent and dependent data and specifies confidence level required. Program uses stepwise statistical procedure for relating minimal set of variables to set of observations; final regression contains only most statistically significant coefficients. Program is written in FORTRAN IV for batch execution and has been implemented on NOVA 1200.

  9. Linear Regression Analysis

    Seber, George A F

    2012-01-01

    Concise, mathematically clear, and comprehensive treatment of the subject. Expanded coverage of diagnostics and methods of model fitting. Requires no specialized knowledge beyond a good grasp of matrix algebra and some acquaintance with straight-line regression and simple analysis of variance models. More than 200 problems throughout the book plus outline solutions for the exercises. This revision has been extensively class-tested.

  10. Advanced statistics: linear regression, part I: simple linear regression.

    Marill, Keith A

    2004-01-01

    Simple linear regression is a mathematical technique used to model the relationship between a single independent predictor variable and a single dependent outcome variable. In this, the first of a two-part series exploring concepts in linear regression analysis, the four fundamental assumptions and the mechanics of simple linear regression are reviewed. The most common technique used to derive the regression line, the method of least squares, is described. The reader will be acquainted with other important concepts in simple linear regression, including: variable transformations, dummy variables, relationship to inference testing, and leverage. Simplified clinical examples with small datasets and graphic models are used to illustrate the points. This will provide a foundation for the second article in this series: a discussion of multiple linear regression, in which there are multiple predictor variables.

  11. Regression filter for signal resolution

    Matthes, W.

    1975-01-01

    The problem considered is that of resolving a measured pulse height spectrum of a material mixture, e.g. a gamma ray spectrum or a Raman spectrum, into a weighted sum of the spectra of the individual constituents. The model on which the analytical formulation is based is described. The problem reduces to that of a multiple linear regression. A stepwise linear regression procedure was constructed. The efficiency of this method was then tested by turning the procedure into a computer programme, which was used to unfold test spectra obtained by mixing some spectra from a library of arbitrarily chosen spectra and adding a noise component. (U.K.)

  12. Linear regression in astronomy. II

    Feigelson, Eric D.; Babu, Gutti J.

    1992-01-01

    A wide variety of least-squares linear regression procedures used in observational astronomy, particularly investigations of the cosmic distance scale, are presented and discussed. The classes of linear models considered are (1) unweighted regression lines, with bootstrap and jackknife resampling; (2) regression solutions when measurement error, in one or both variables, dominates the scatter; (3) methods to apply a calibration line to new data; (4) truncated regression models, which apply to flux-limited data sets; and (5) censored regression models, which apply when nondetections are present. For the calibration problem we develop two new procedures: a formula for the intercept offset between two parallel data sets, which propagates slope errors from one regression to the other; and a generalization of the Working-Hotelling confidence bands to nonstandard least-squares lines. They can provide improved error analysis for Faber-Jackson, Tully-Fisher, and similar cosmic distance scale relations.

  13. Advanced statistics: linear regression, part II: multiple linear regression.

    Marill, Keith A

    2004-01-01

    The applications of simple linear regression in medical research are limited, because in most situations, there are multiple relevant predictor variables. Univariate statistical techniques such as simple linear regression use a single predictor variable, and they often may be mathematically correct but clinically misleading. Multiple linear regression is a mathematical technique used to model the relationship between multiple independent predictor variables and a single dependent outcome variable. It is used in medical research to model observational data, as well as in diagnostic and therapeutic studies in which the outcome is dependent on more than one factor. Although the technique generally is limited to data that can be expressed with a linear function, it benefits from a well-developed mathematical framework that yields unique solutions and exact confidence intervals for regression coefficients. Building on Part I of this series, this article acquaints the reader with some of the important concepts in multiple regression analysis. These include multicollinearity, interaction effects, and an expansion of the discussion of inference testing, leverage, and variable transformations to multivariate models. Examples from the first article in this series are expanded on using a primarily graphic, rather than mathematical, approach. The importance of the relationships among the predictor variables and the dependence of the multivariate model coefficients on the choice of these variables are stressed. Finally, concepts in regression model building are discussed.

  14. Correlation and simple linear regression.

    Zou, Kelly H; Tuncali, Kemal; Silverman, Stuart G

    2003-06-01

    In this tutorial article, the concepts of correlation and regression are reviewed and demonstrated. The authors review and compare two correlation coefficients, the Pearson correlation coefficient and the Spearman rho, for measuring linear and nonlinear relationships between two continuous variables. In the case of measuring the linear relationship between a predictor and an outcome variable, simple linear regression analysis is conducted. These statistical concepts are illustrated by using a data set from published literature to assess a computed tomography-guided interventional technique. These statistical methods are important for exploring the relationships between variables and can be applied to many radiologic studies.
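
    The difference between the two coefficients can be seen in a short sketch. It computes Pearson's coefficient directly and Spearman's rho as Pearson's coefficient of the ranks, with tied values given their average rank. The helper names and the data are illustrative assumptions, not taken from the article.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rho: Pearson correlation of the rank-transformed data."""
    return pearson(ranks(x), ranks(y))


# Hypothetical monotonic but nonlinear relationship.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [v ** 3 for v in x]
r_pearson = pearson(x, y)     # below 1 because the relation is curved
r_spearman = spearman(x, y)   # equals 1 up to rounding: ordering is preserved
```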

  15. Linear regression in astronomy. I

    Isobe, Takashi; Feigelson, Eric D.; Akritas, Michael G.; Babu, Gutti Jogesh

    1990-01-01

    Five methods for obtaining linear regression fits to bivariate data with unknown or insignificant measurement errors are discussed: ordinary least-squares (OLS) regression of Y on X, OLS regression of X on Y, the bisector of the two OLS lines, orthogonal regression, and 'reduced major-axis' regression. These methods have been used by various researchers in observational astronomy, most importantly in cosmic distance scale applications. Formulas for calculating the slope and intercept coefficients and their uncertainties are given for all the methods, including a new general form of the OLS variance estimates. The accuracy of the formulas was confirmed using numerical simulations. The applicability of the procedures is discussed with respect to their mathematical properties, the nature of the astronomical data under consideration, and the scientific purpose of the regression. It is found that, for problems needing symmetrical treatment of the variables, the OLS bisector performs significantly better than orthogonal or reduced major-axis regression.
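
    The five slope estimators can be sketched in a few lines. The closed forms below are the standard textbook expressions in terms of the centred sums Sxx, Syy, and Sxy. The uncertainty formulas discussed in the paper are not reproduced, and the function name `five_slopes` is a hypothetical choice. The sketch assumes Sxy is nonzero.

```python
import math


def five_slopes(x, y):
    """Return {method: (slope, intercept)} for five symmetric/asymmetric fits."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sign = math.copysign(1.0, sxy)

    b1 = sxy / sxx                      # OLS of Y on X
    b2 = syy / sxy                      # OLS of X on Y, written as dY/dX
    # Line bisecting the angle between the two OLS lines.
    b3 = (b1 * b2 - 1.0 + math.sqrt((1.0 + b1 ** 2) * (1.0 + b2 ** 2))) / (b1 + b2)
    # Orthogonal regression: minimises perpendicular distances.
    d = b2 - 1.0 / b1
    b4 = 0.5 * (d + sign * math.sqrt(4.0 + d ** 2))
    # Reduced major axis: geometric mean of the two OLS slopes.
    b5 = sign * math.sqrt(b1 * b2)

    slopes = {
        "ols_y_on_x": b1,
        "ols_x_on_y": b2,
        "bisector": b3,
        "orthogonal": b4,
        "reduced_major_axis": b5,
    }
    # Every one of these lines passes through the centroid (mx, my).
    return {name: (b, my - b * mx) for name, b in slopes.items()}


# Hypothetical scattered data.
fits = five_slopes([1.0, 2.0, 3.0, 4.0, 5.0], [1.2, 1.9, 3.2, 3.8, 5.1])
```

    For positively correlated data the slope of Y on X is the shallowest and that of X on Y the steepest, with the reduced major-axis slope between them; the five estimates coincide only when the points are exactly collinear.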

  16. Quantum algorithm for linear regression

    Wang, Guoming

    2017-07-01

    We present a quantum algorithm for fitting a linear regression model to a given data set using the least-squares approach. Unlike previous algorithms, which yield a quantum state encoding the optimal parameters, our algorithm outputs these numbers in classical form. So by running it once, one completely determines the fitted model and can then use it to make predictions on new data at little cost. Moreover, our algorithm works in the standard oracle model, and can handle data sets with nonsparse design matrices. It runs in time $\mathrm{poly}(\log_2 N, d, \kappa, 1/\epsilon)$, where $N$ is the size of the data set, $d$ is the number of adjustable parameters, $\kappa$ is the condition number of the design matrix, and $\epsilon$ is the desired precision in the output. We also show that the polynomial dependence on $d$ and $\kappa$ is necessary. Thus, our algorithm cannot be significantly improved. Furthermore, we also give a quantum algorithm that estimates the quality of the least-squares fit (without computing its parameters explicitly). This algorithm runs faster than the one for finding this fit, and can be used to check whether the given data set qualifies for linear regression in the first place.

  17. Taking into account latency, amplitude, and morphology: improved estimation of single-trial ERPs by wavelet filtering and multiple linear regression.

    Hu, L; Liang, M; Mouraux, A; Wise, R G; Hu, Y; Iannetti, G D

    2011-12-01

    Across-trial averaging is a widely used approach to enhance the signal-to-noise ratio (SNR) of event-related potentials (ERPs). However, across-trial variability of ERP latency and amplitude may contain physiologically relevant information that is lost by across-trial averaging. Hence, we aimed to develop a novel method that uses 1) wavelet filtering (WF) to enhance the SNR of ERPs and 2) a multiple linear regression with a dispersion term (MLR(d)) that takes into account shape distortions to estimate the single-trial latency and amplitude of ERP peaks. Using simulated ERP data sets containing different levels of noise, we provide evidence that, compared with other approaches, the proposed WF+MLR(d) method yields the most accurate estimate of single-trial ERP features. When applied to a real laser-evoked potential data set, the WF+MLR(d) approach provides reliable estimation of single-trial latency, amplitude, and morphology of ERPs and thereby allows performing meaningful correlations at single-trial level. We obtained three main findings. First, WF significantly enhances the SNR of single-trial ERPs. Second, MLR(d) effectively captures and measures the variability in the morphology of single-trial ERPs, thus providing an accurate and unbiased estimate of their peak latency and amplitude. Third, intensity of pain perception significantly correlates with the single-trial estimates of N2 and P2 amplitude. These results indicate that WF+MLR(d) can be used to explore the dynamics between different ERP features, behavioral variables, and other neuroimaging measures of brain activity, thus providing new insights into the functional significance of the different brain processes underlying the brain responses to sensory stimuli.

  18. Regularized Label Relaxation Linear Regression.

    Fang, Xiaozhao; Xu, Yong; Li, Xuelong; Lai, Zhihui; Wong, Wai Keung; Fang, Bingwu

    2018-04-01

    Linear regression (LR) and some of its variants have been widely used for classification problems. Most of these methods assume that during the learning phase, the training samples can be exactly transformed into a strict binary label matrix, which has too little freedom to fit the labels adequately. To address this problem, in this paper, we propose a novel regularized label relaxation LR method, which has the following notable characteristics. First, the proposed method relaxes the strict binary label matrix into a slack variable matrix by introducing a nonnegative label relaxation matrix into LR, which provides more freedom to fit the labels and simultaneously enlarges the margins between different classes as much as possible. Second, the proposed method constructs the class compactness graph based on manifold learning and uses it as the regularization item to avoid the problem of overfitting. The class compactness graph is used to ensure that the samples sharing the same labels can be kept close after they are transformed. Two different algorithms, which are, respectively, based on -norm and -norm loss functions are devised. These two algorithms have compact closed-form solutions in each iteration so that they are easily implemented. Extensive experiments show that these two algorithms outperform the state-of-the-art algorithms in terms of the classification accuracy and running time.

  19. Quantized, piecewise linear filter network

    Sørensen, John Aasted

    1993-01-01

    A quantization based piecewise linear filter network is defined. A method for the training of this network based on local approximation in the input space is devised. The training is carried out by repeatedly alternating between vector quantization of the training set into quantization classes...... and equalization of the quantization classes linear filter mean square training errors. The equalization of the mean square training errors is carried out by adapting the boundaries between neighbor quantization classes such that the differences in mean square training errors are reduced...

  20. Aspects of robust linear regression

    Davies, P.L.

    1993-01-01

    Section 1 of the paper contains a general discussion of robustness. In Section 2 the influence function of the Hampel-Rousseeuw least median of squares estimator is derived. Linearly invariant weak metrics are constructed in Section 3. It is shown in Section 4 that $S$-estimators satisfy an exact

  1. Vectorization of linear discrete filtering algorithms

    Schiess, J. R.

    1977-01-01

    Linear filters, including the conventional Kalman filter and versions of square root filters devised by Potter and Carlson, are studied for potential application on streaming computers. The square root filters are known to maintain a positive definite covariance matrix in cases in which the Kalman filter diverges due to ill-conditioning of the matrix. Vectorization of the filters is discussed, and comparisons are made of the number of operations and storage locations required by each filter. The Carlson filter is shown to be the most efficient of the filters on the Control Data STAR-100 computer.

  2. [From clinical judgment to linear regression model.]

    Palacios-Cruz, Lino; Pérez, Marcela; Rivas-Ruiz, Rodolfo; Talavera, Juan O

    2013-01-01

    When we think about mathematical models, such as the linear regression model, we think that these terms are only used by those engaged in research, a notion that is far from the truth. Legendre described the first mathematical model in 1805, and Galton introduced the formal term in 1886. Linear regression is one of the most commonly used regression models in clinical practice. It is useful to predict or show the relationship between two or more variables as long as the dependent variable is quantitative and has a normal distribution. Stated another way, the regression is used to predict a measure based on the knowledge of at least one other variable. The first objective of linear regression is to determine the slope or inclination of the regression line Y = a + bX, where "a" is the intercept or regression constant, equivalent to the value of "Y" when "X" equals 0, and "b" (also called the slope) indicates the increase or decrease in "Y" that occurs when the variable "X" increases or decreases by one unit. In the regression line, "b" is called the regression coefficient. The coefficient of determination ($R^2$) indicates the importance of the independent variables in the outcome.
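
    The quantities named above (intercept a, slope b, and the coefficient of determination) can be computed with a short least-squares sketch. The function name and the data are illustrative assumptions.

```python
def simple_linear_regression(x, y):
    """Fit Y = a + b*X by least squares; return (a, b, r_squared)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx                 # change in Y per one-unit change in X
    a = mean_y - b * mean_x       # value of Y when X equals 0
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot   # share of the variance in Y explained
    return a, b, r_squared


# Hypothetical measurements.
a, b, r2 = simple_linear_regression([1.0, 2.0, 3.0, 4.0, 5.0],
                                    [2.1, 3.9, 6.2, 7.8, 10.1])
```

    An R² close to 1 means the fitted line accounts for most of the variation in the outcome; a value near 0 means X explains little of it.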

  3. Determination of regression laws: Linear and nonlinear

    Onishchenko, A.M.

    1994-01-01

    A detailed mathematical determination of regression laws is presented in the article. Particular emphasis is placed on determining the laws of $X_j$ on $X_l$ to account for source nuclei decay and detector errors in nuclear physics instrumentation. Both linear and nonlinear relations are presented. Linearization of 19 functions is tabulated, including graph, relation, variable substitution, obtained linear function, and remarks. 6 refs., 1 tab

  4. Discriminative Elastic-Net Regularized Linear Regression.

    Zhang, Zheng; Lai, Zhihui; Xu, Yong; Shao, Ling; Wu, Jian; Xie, Guo-Sen

    2017-03-01

    In this paper, we aim at learning compact and discriminative linear regression models. Linear regression has been widely used in different problems. However, most of the existing linear regression methods exploit the conventional zero-one matrix as the regression targets, which greatly narrows the flexibility of the regression model. Another major limitation of these methods is that the learned projection matrix fails to precisely project the image features to the target space due to their weak discriminative capability. To this end, we present an elastic-net regularized linear regression (ENLR) framework, and develop two robust linear regression models which possess the following special characteristics. First, our methods exploit two particular strategies to enlarge the margins of different classes by relaxing the strict binary targets into a more feasible variable matrix. Second, a robust elastic-net regularization of singular values is introduced to enhance the compactness and effectiveness of the learned projection matrix. Third, the resulting optimization problem of ENLR has a closed-form solution in each iteration, which can be solved efficiently. Finally, rather than directly exploiting the projection matrix for recognition, our methods employ the transformed features as the new discriminative representations to make the final image classification. Compared with the traditional linear regression model and some of its variants, our method is much more accurate in image classification. Extensive experiments conducted on publicly available data sets demonstrate that the proposed framework can outperform the state-of-the-art methods. The MATLAB codes of our methods are available at http://www.yongxu.org/lunwen.html.

  5. Piecewise linear regression splines with hyperbolic covariates

    Cologne, John B.; Sposto, Richard

    1992-09-01

    Consider the problem of fitting a curve to data that exhibit a multiphase linear response with smooth transitions between phases. We propose substituting hyperbolas as covariates in piecewise linear regression splines to obtain curves that are smoothly joined. The method provides an intuitive and easy way to extend the two-phase linear hyperbolic response model of Griffiths and Miller and Watts and Bacon to accommodate more than two linear segments. The resulting regression spline with hyperbolic covariates may be fit by nonlinear regression methods to estimate the degree of curvature between adjoining linear segments. The added complexity of fitting nonlinear, as opposed to linear, regression models is not great. The extra effort is particularly worthwhile when investigators are unwilling to assume that the slope of the response changes abruptly at the join points. We can also estimate the join points (the values of the abscissas where the linear segments would intersect if extrapolated) if their number and approximate locations may be presumed known. An example using data on changing age at menarche in a cohort of Japanese women illustrates the use of the method for exploratory data analysis. (author)

  6. Removing Malmquist bias from linear regressions

    Verter, Frances

    1993-01-01

    Malmquist bias is present in all astronomical surveys where sources are observed above an apparent brightness threshold. Those sources which can be detected at progressively larger distances are progressively more limited to the intrinsically luminous portion of the true distribution. This bias does not distort any of the measurements, but distorts the sample composition. We have developed the first treatment to correct for Malmquist bias in linear regressions of astronomical data. A demonstration of the corrected linear regression that is computed in four steps is presented.

  7. Compact Spectrometers Based on Linear Variable Filters

    National Aeronautics and Space Administration — Demonstrate a linear-variable spectrometer with an H2RG array. Linear Variable Filter (LVF) spectrometers provide attractive resource benefits – high optical...

  8. Finite Algorithms for Robust Linear Regression

    Madsen, Kaj; Nielsen, Hans Bruun

    1990-01-01

    The Huber M-estimator for robust linear regression is analyzed. Newton type methods for solution of the problem are defined and analyzed, and finite convergence is proved. Numerical experiments with a large number of test problems demonstrate efficiency and indicate that this kind of approach may...
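
    For orientation, a Huber M-estimate of a straight line can be sketched with iteratively reweighted least squares (IRLS). This is a swapped-in, simpler technique and not the Newton-type method analyzed in the paper. The threshold `delta` is treated as a fixed tuning constant with no scale estimation, and all names and data are illustrative assumptions.

```python
def _weighted_line(x, y, w):
    """Weighted least-squares line; returns (intercept, slope)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def huber_line_fit(x, y, delta=1.0, iterations=50):
    """Huber M-estimate of (intercept, slope) by IRLS.

    Residuals within +/- delta keep full weight (quadratic loss);
    larger residuals get weight delta/|r| (linear loss), which bounds
    the influence of outliers.
    """
    w = [1.0] * len(x)          # first pass is ordinary least squares
    a, b = _weighted_line(x, y, w)
    for _ in range(iterations):
        residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
        w = [1.0 if abs(r) <= delta else delta / abs(r) for r in residuals]
        a, b = _weighted_line(x, y, w)
    return a, b


# Hypothetical data on the line y = 2x + 1 with one gross outlier.
x = [float(i) for i in range(10)]
y = [2.0 * xi + 1.0 for xi in x]
y[9] = 100.0
ols_intercept, ols_slope = _weighted_line(x, y, [1.0] * len(x))
huber_intercept, huber_slope = huber_line_fit(x, y, delta=1.0)
```

    On such data the ordinary least-squares slope is pulled far from 2 by the single outlier, while the Huber estimate stays close to it.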

  9. Multiple Linear Regression: A Realistic Reflector.

    Nutt, A. T.; Batsell, R. R.

    Examples of the use of Multiple Linear Regression (MLR) techniques are presented. This is done to show how MLR aids data processing and decision-making by providing the decision-maker with freedom in phrasing questions and by accurately reflecting the data on hand. A brief overview of the rationale underlying MLR is given, some basic definitions…

  10. Controlling attribute effect in linear regression

    Calders, Toon; Karim, Asim A.; Kamiran, Faisal; Ali, Wasif Mohammad; Zhang, Xiangliang

    2013-01-01

    In data mining we often have to learn from biased data, because, for instance, data comes from different batches or there was a gender or racial bias in the collection of social data. In some applications it may be necessary to explicitly control this bias in the models we learn from the data. This paper is the first to study learning linear regression models under constraints that control the biasing effect of a given attribute such as gender or batch number. We show how propensity modeling can be used for factoring out the part of the bias that can be justified by externally provided explanatory attributes. Then we analytically derive linear models that minimize squared error while controlling the bias by imposing constraints on the mean outcome or residuals of the models. Experiments with discrimination-aware crime prediction and batch effect normalization tasks show that the proposed techniques are successful in controlling attribute effects in linear regression models. © 2013 IEEE.

  11. Controlling attribute effect in linear regression

    Calders, Toon

    2013-12-01

    In data mining we often have to learn from biased data, because, for instance, data comes from different batches or there was a gender or racial bias in the collection of social data. In some applications it may be necessary to explicitly control this bias in the models we learn from the data. This paper is the first to study learning linear regression models under constraints that control the biasing effect of a given attribute such as gender or batch number. We show how propensity modeling can be used for factoring out the part of the bias that can be justified by externally provided explanatory attributes. Then we analytically derive linear models that minimize squared error while controlling the bias by imposing constraints on the mean outcome or residuals of the models. Experiments with discrimination-aware crime prediction and batch effect normalization tasks show that the proposed techniques are successful in controlling attribute effects in linear regression models. © 2013 IEEE.

  12. Post-processing through linear regression

    van Schaeybroeck, B.; Vannitsem, S.

    2011-03-01

    Various post-processing techniques are compared for both deterministic and ensemble forecasts, all based on linear regression between forecast data and observations. In order to evaluate the quality of the regression methods, three criteria are proposed, related to the effective correction of forecast error, the optimal variability of the corrected forecast and multicollinearity. The regression schemes under consideration include the ordinary least-square (OLS) method, a new time-dependent Tikhonov regularization (TDTR) method, the total least-square method, a new geometric-mean regression (GM), a recently introduced error-in-variables (EVMOS) method and, finally, a "best member" OLS method. The advantages and drawbacks of each method are clarified. These techniques are applied in the context of the Lorenz '63 system, whose model version is affected by both initial condition and model errors. For short forecast lead times, the number and choice of predictors play an important role. In contrast to the other techniques, GM degrades when the number of predictors increases. At intermediate lead times, linear regression is unable to provide corrections to the forecast and can sometimes degrade the performance (GM and the best member OLS with noise). At long lead times the regression schemes (EVMOS, TDTR) that yield the correct variability and the largest correlation between ensemble error and spread should be preferred.

  13. Post-processing through linear regression

    B. Van Schaeybroeck

    2011-03-01

    Full Text Available Various post-processing techniques are compared for both deterministic and ensemble forecasts, all based on linear regression between forecast data and observations. In order to evaluate the quality of the regression methods, three criteria are proposed, related to the effective correction of forecast error, the optimal variability of the corrected forecast and multicollinearity. The regression schemes under consideration include the ordinary least-square (OLS) method, a new time-dependent Tikhonov regularization (TDTR) method, the total least-square method, a new geometric-mean regression (GM), a recently introduced error-in-variables (EVMOS) method and, finally, a "best member" OLS method. The advantages and drawbacks of each method are clarified.

    These techniques are applied in the context of the 63 Lorenz system, whose model version is affected by both initial condition and model errors. For short forecast lead times, the number and choice of predictors play an important role. In contrast to the other techniques, GM degrades when the number of predictors increases. At intermediate lead times, linear regression is unable to provide corrections to the forecast and can sometimes degrade the performance (GM and the best member OLS with noise). At long lead times, the regression schemes (EVMOS, TDTR) that yield the correct variability and the largest correlation between ensemble error and spread should be preferred.

  14. Linear regression and the normality assumption.

    Schmidt, Amand F; Finan, Chris

    2017-12-16

    Researchers often perform arbitrary outcome transformations to fulfill the normality assumption of a linear regression model. This commentary explains and illustrates that in large data settings, such transformations are often unnecessary and, worse, may bias model estimates. Linear regression assumptions are illustrated using simulated data and an empirical example on the relation between time since type 2 diabetes diagnosis and glycated hemoglobin levels. Simulation results were evaluated on coverage; i.e., the number of times the 95% confidence interval included the true slope coefficient. Although outcome transformations bias point estimates, violations of the normality assumption in linear regression analyses do not. The normality assumption is necessary to unbiasedly estimate standard errors, and hence confidence intervals and P-values. However, in large sample sizes (e.g., where the number of observations per variable is >10) violations of this normality assumption often do not noticeably impact results. In contrast, assumptions about the parametric model, the absence of extreme observations, homoscedasticity, and independence of the errors remain influential even in large sample size settings. Given that modern healthcare research typically includes thousands of subjects, focusing on the normality assumption is often unnecessary, does not guarantee valid results and, worse, may bias estimates due to the practice of outcome transformations. Copyright © 2017 Elsevier Inc. All rights reserved.
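
    The coverage criterion described above can be sketched as a small simulation. This is an illustrative setup, not the authors' simulation design: the sample size, number of replications, true slope, error distribution (a shifted exponential, which is skewed and has mean zero) and the use of the normal quantile 1.96 are all assumptions chosen for the example.

```python
import math
import random

def ols_slope_ci(x, y, z=1.96):
    """Return (slope, lower, upper) for a simple linear regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)  # standard error of the slope
    return slope, slope - z * se, slope + z * se

def coverage(n=200, reps=500, true_slope=0.5, seed=1):
    """Fraction of replications whose nominal 95% interval contains the true slope."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        x = [rng.uniform(0, 10) for _ in range(n)]
        # Non-normal errors: exponential(1) shifted to have mean zero.
        y = [1.0 + true_slope * xi + (rng.expovariate(1.0) - 1.0) for xi in x]
        _, lower, upper = ols_slope_ci(x, y)
        hits += lower <= true_slope <= upper
    return hits / reps

cov = coverage()
print(f"coverage of the nominal 95% interval: {cov:.3f}")
```

    With a few hundred observations the empirical coverage should stay close to the nominal level despite the skewed errors; rerunning with a much smaller n is a simple way to see when the approximation starts to matter.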

  15. Signal Enhancement with Variable Span Linear Filters

    Benesty, Jacob; Christensen, Mads Græsbøll; Jensen, Jesper Rindom

    This book introduces readers to the novel concept of variable span speech enhancement filters, and demonstrates how it can be used for effective noise reduction in various ways. Further, the book provides the accompanying Matlab code, allowing readers to easily implement the main ideas discussed. Variable span filters combine the ideas of optimal linear filters with those of subspace methods, as they involve the joint diagonalization of the correlation matrices of the desired signal and the noise. The book shows how some well-known filter designs, e.g. the minimum distortion, maximum signal-to-noise ratio, Wiener, and tradeoff filters (including their new generalizations) can be obtained using the variable span filter framework. It then illustrates how the variable span filters can be applied in various contexts, namely in single-channel STFT-based enhancement, in multichannel enhancement in both...

  16. Signal enhancement with variable span linear filters

    Benesty, Jacob; Jensen, Jesper R

    2016-01-01

    This book introduces readers to the novel concept of variable span speech enhancement filters, and demonstrates how it can be used for effective noise reduction in various ways. Further, the book provides the accompanying Matlab code, allowing readers to easily implement the main ideas discussed. Variable span filters combine the ideas of optimal linear filters with those of subspace methods, as they involve the joint diagonalization of the correlation matrices of the desired signal and the noise. The book shows how some well-known filter designs, e.g. the minimum distortion, maximum signal-to-noise ratio, Wiener, and tradeoff filters (including their new generalizations) can be obtained using the variable span filter framework. It then illustrates how the variable span filters can be applied in various contexts, namely in single-channel STFT-based enhancement, in multichannel enhancement in both the time and STFT domains, and, lastly, in time-domain binaural enhancement. In these contexts, the properties of ...

  17. Neutrosophic Correlation and Simple Linear Regression

    A. A. Salama

    2014-09-01

    Full Text Available Since the world is full of indeterminacy, the neutrosophics found their place into contemporary research. The fundamental concepts of neutrosophic set, introduced by Smarandache. Recently, Salama et al., introduced the concept of correlation coefficient of neutrosophic data. In this paper, we introduce and study the concepts of correlation and correlation coefficient of neutrosophic data in probability spaces and study some of their properties. Also, we introduce and study the neutrosophic simple linear regression model. Possible applications to data processing are touched upon.

  18. Signal Enhancement with Variable Span Linear Filters

    Benesty, Jacob; Christensen, Mads Græsbøll; Jensen, Jesper Rindom

    . Variable span filters combine the ideas of optimal linear filters with those of subspace methods, as they involve the joint diagonalization of the correlation matrices of the desired signal and the noise. The book shows how some well-known filter designs, e.g. the minimum distortion, maximum signal...... the time and STFT domains, and, lastly, in time-domain binaural enhancement. In these contexts, the properties of these filters are analyzed in terms of their noise reduction capabilities and desired signal distortion, and the analyses are validated and further explored in simulations....

  19. Linear regression crash prediction models : issues and proposed solutions.

    2010-05-01

    The paper develops a linear regression model approach that can be applied to crash data to predict vehicle crashes. The proposed approach involves novice data aggregation to satisfy linear regression assumptions; namely error structure normality ...

  20. Suppression Situations in Multiple Linear Regression

    Shieh, Gwowen

    2006-01-01

    This article proposes alternative expressions for the two most prevailing definitions of suppression without resorting to the standardized regression modeling. The formulation provides a simple basis for the examination of their relationship. For the two-predictor regression, the author demonstrates that the previous results in the literature are…

  1. Two Paradoxes in Linear Regression Analysis

    FENG, Ge; PENG, Jing; TU, Dongke; ZHENG, Julia Z.; FENG, Changyong

    2016-01-01

    Summary: Regression is one of the favorite tools in applied statistics. However, misuse and misinterpretation of results from regression analysis are common in biomedical research. In this paper we use statistical theory and simulation studies to clarify some paradoxes around this popular statistical method. In particular, we show that a widely used model selection procedure employed in many publications in top medical journals is wrong. Formal procedures based on solid statistical theory should be used in model selection. PMID:28638214

  2. Fuzzy multiple linear regression: A computational approach

    Juang, C. H.; Huang, X. H.; Fleming, J. W.

    1992-01-01

    This paper presents a new computational approach for performing fuzzy regression. In contrast to Bardossy's approach, the new approach, while dealing with fuzzy variables, closely follows the conventional regression technique. In this approach, treatment of fuzzy input is more 'computational' than 'symbolic.' The following sections first outline the formulation of the new approach, then deal with the implementation and computational scheme, and this is followed by examples to illustrate the new procedure.

  3. Time signal filtering by relative neighborhood graph localized linear approximation

    Sørensen, John Aasted

    1994-01-01

    A time signal filtering algorithm based on the relative neighborhood graph (RNG) used for localization of linear filters is proposed. The filter is constructed from a training signal during two stages. During the first stage an RNG is constructed. During the second stage, localized linear filters...

  4. Linear theory for filtering nonlinear multiscale systems with model error.

    Berry, Tyrus; Harlim, John

    2014-07-08

    procedure, simultaneously produce accurate filtering and equilibrium statistical prediction. In contrast, an offline estimation technique based on a linear regression, which fits the parameters to a training dataset without using the filter, yields filter estimates which are worse than the observations or even divergent when the slow variables are not fully observed. This finding does not imply that all offline methods are inherently inferior to the online method for nonlinear estimation problems; it only suggests that an ideal estimation technique should estimate all parameters simultaneously, whether it is online or offline.

  5. Augmenting Data with Published Results in Bayesian Linear Regression

    de Leeuw, Christiaan; Klugkist, Irene

    2012-01-01

    In most research, linear regression analyses are performed without taking into account published results (i.e., reported summary statistics) of similar previous studies. Although the prior density in Bayesian linear regression could accommodate such prior knowledge, formal models for doing so are absent from the literature. The goal of this…

  6. A test for the parameters of multiple linear regression models ...

    A test for the parameters of multiple linear regression models is developed for conducting tests simultaneously on all the parameters of multiple linear regression models. The test is robust relative to the assumptions of homogeneity of variances and absence of serial correlation of the classical F-test. Under certain null and ...

  7. Who Will Win?: Predicting the Presidential Election Using Linear Regression

    Lamb, John H.

    2007-01-01

    This article outlines a linear regression activity that engages learners, uses technology, and fosters cooperation. Students generated least-squares linear regression equations using TI-83 Plus™ graphing calculators, Microsoft© Excel, and paper-and-pencil calculations using derived normal equations to predict the 2004 presidential election.…

  8. Predicting respiratory tumor motion with multi-dimensional adaptive filters and support vector regression

    Riaz, Nadeem; Wiersma, Rodney; Mao Weihua; Xing Lei; Shanker, Piyush; Gudmundsson, Olafur; Widrow, Bernard

    2009-01-01

    Intra-fraction tumor tracking methods can improve radiation delivery during radiotherapy sessions. Image acquisition for tumor tracking and subsequent adjustment of the treatment beam with gating or beam tracking introduces time latency and necessitates predicting the future position of the tumor. This study evaluates the use of multi-dimensional linear adaptive filters and support vector regression to predict the motion of lung tumors tracked at 30 Hz. We expand on the prior work of other groups who have looked at adaptive filters by using a general framework of a multiple-input single-output (MISO) adaptive system that uses multiple correlated signals to predict the motion of a tumor. We compare the performance of these two novel methods to conventional methods like linear regression and single-input, single-output adaptive filters. At 400 ms latency the average root-mean-square-errors (RMSEs) for the 14 treatment sessions studied using no prediction, linear regression, single-output adaptive filter, MISO and support vector regression are 2.58, 1.60, 1.58, 1.71 and 1.26 mm, respectively. At 1 s, the RMSEs are 4.40, 2.61, 3.34, 2.66 and 1.93 mm, respectively. We find that support vector regression most accurately predicts the future tumor position of the methods studied and can provide a RMSE of less than 2 mm at 1 s latency. Also, a multi-dimensional adaptive filter framework provides improved performance over single-dimension adaptive filters. Work is underway to combine these two frameworks to improve performance.

  9. Use of probabilistic weights to enhance linear regression myoelectric control.

    Smith, Lauren H; Kuiken, Todd A; Hargrove, Levi J

    2015-12-01

    Clinically available prostheses for transradial amputees do not allow simultaneous myoelectric control of degrees of freedom (DOFs). Linear regression methods can provide simultaneous myoelectric control, but frequently also result in difficulty with isolating individual DOFs when desired. This study evaluated the potential of using probabilistic estimates of categories of gross prosthesis movement, which are commonly used in classification-based myoelectric control, to enhance linear regression myoelectric control. Gaussian models were fit to electromyogram (EMG) feature distributions for three movement classes at each DOF (no movement, or movement in either direction) and used to weight the output of linear regression models by the probability that the user intended the movement. Eight able-bodied and two transradial amputee subjects worked in a virtual Fitts' law task to evaluate differences in controllability between linear regression and probability-weighted regression for an intramuscular EMG-based three-DOF wrist and hand system. Real-time and offline analyses in able-bodied subjects demonstrated that probability weighting improved performance during single-DOF tasks (p < 0.05) by preventing extraneous movement at additional DOFs. Similar results were seen in experiments with two transradial amputees. Though goodness-of-fit evaluations suggested that the EMG feature distributions showed some deviations from the Gaussian, equal-covariance assumptions used in this experiment, the assumptions were sufficiently met to provide improved performance compared to linear regression control. Use of probability weights can improve the ability to isolate individual DOFs during linear regression myoelectric control, while maintaining the ability to simultaneously control multiple DOFs.

  10. A family of quantization based piecewise linear filter networks

    Sørensen, John Aasted

    1992-01-01

    A family of quantization-based piecewise linear filter networks is proposed. For stationary signals, a filter network from this family is a generalization of the classical Wiener filter with an input signal and a desired response. The construction of the filter network is based on quantization of the input signal x(n) into quantization classes. A linear filter is associated with each quantization class. The filtering at time n is carried out by the filter belonging to the actual quantization class of x(n) and the filters belonging to the neighbor quantization classes of x(n) (regularization). This construction leads to a three-layer filter network. The first layer consists of the quantization class filters for the input signal. The second layer carries out the regularization between neighbor quantization classes, and the third layer constitutes a decision of quantization class from where the resulting...

  11. Distributed Monitoring of the R2 Statistic for Linear Regression

    National Aeronautics and Space Administration — The problem of monitoring a multivariate linear regression model is relevant in studying the evolving relationship between a set of input variables (features) and...

  12. Identification of Influential Points in a Linear Regression Model

    Jan Grosz

    2011-03-01

    Full Text Available The article deals with the detection and identification of influential points in the linear regression model. Three methods of detection of outliers and leverage points are described. These procedures can also be used for one-sample (independent) datasets. This paper briefly describes theoretical aspects of several robust methods as well. Robust statistics is a powerful tool to increase the reliability and accuracy of statistical modelling and data analysis. A simulation model of the simple linear regression is presented.

  13. Learning a Nonnegative Sparse Graph for Linear Regression.

    Fang, Xiaozhao; Xu, Yong; Li, Xuelong; Lai, Zhihui; Wong, Wai Keung

    2015-09-01

    Previous graph-based semisupervised learning (G-SSL) methods have the following drawbacks: 1) they usually predefine the graph structure and then use it to perform label prediction, which cannot guarantee an overall optimum and 2) they only focus on the label prediction or the graph structure construction but are not competent in handling new samples. To this end, a novel nonnegative sparse graph (NNSG) learning method was first proposed. Then, both the label prediction and projection learning were integrated into linear regression. Finally, the linear regression and graph structure learning were unified within the same framework to overcome these two drawbacks. Therefore, a novel method, named learning a NNSG for linear regression, was presented, in which the linear regression and graph learning were simultaneously performed to guarantee an overall optimum. In the learning process, the label information can be accurately propagated via the graph structure so that the linear regression can learn a discriminative projection to better fit sample labels and accurately classify new samples. An effective algorithm was designed to solve the corresponding optimization problem with fast convergence. Furthermore, NNSG provides a unified perspective for a number of graph-based learning methods and linear regression methods. The experimental results showed that NNSG can obtain very high classification accuracy and greatly outperforms conventional G-SSL methods, especially some conventional graph construction methods.

  14. Teaching the Concept of Breakdown Point in Simple Linear Regression.

    Chan, Wai-Sum

    2001-01-01

    Most introductory textbooks on simple linear regression analysis mention the fact that extreme data points have a great influence on ordinary least-squares regression estimation; however, not many textbooks provide a rigorous mathematical explanation of this phenomenon. Suggests a way to fill this gap by teaching students the concept of breakdown…

  15. Testing hypotheses for differences between linear regression lines

    Stanley J. Zarnoch

    2009-01-01

    Five hypotheses are identified for testing differences between simple linear regression lines. The distinctions between these hypotheses are based on a priori assumptions and illustrated with full and reduced models. The contrast approach is presented as an easy and complete method for testing for overall differences between the regressions and for making pairwise...

  16. Noise Reduction with Optimal Variable Span Linear Filters

    Jensen, Jesper Rindom; Benesty, Jacob; Christensen, Mads Græsbøll

    2016-01-01

    In this paper, the problem of noise reduction is addressed as a linear filtering problem in a novel way by using concepts from subspace-based enhancement methods, resulting in variable span linear filters. This is done by forming the filter coefficients as linear combinations of a number...... included in forming the filter. Using these concepts, a number of different filter designs are considered, like minimum distortion, Wiener, maximum SNR, and tradeoff filters. Interestingly, all these can be expressed as special cases of variable span filters. We also derive expressions for the speech...... demonstrate the advantages and properties of the variable span filter designs, and their potential performance gain compared to widely used speech enhancement methods....

  17. Evaluation of Linear Regression Simultaneous Myoelectric Control Using Intramuscular EMG.

    Smith, Lauren H; Kuiken, Todd A; Hargrove, Levi J

    2016-04-01

    The objective of this study was to evaluate the ability of linear regression models to decode patterns of muscle coactivation from intramuscular electromyogram (EMG) and provide simultaneous myoelectric control of a virtual 3-DOF wrist/hand system. Performance was compared to the simultaneous control of conventional myoelectric prosthesis methods using intramuscular EMG (parallel dual-site control), an approach that requires users to independently modulate individual muscles in the residual limb, which can be challenging for amputees. Linear regression control was evaluated in eight able-bodied subjects during a virtual Fitts' law task and was compared to performance of eight subjects using parallel dual-site control. An offline analysis also evaluated how different types of training data affected prediction accuracy of linear regression control. The two control systems demonstrated similar overall performance; however, the linear regression method demonstrated improved performance for targets requiring use of all three DOFs, whereas parallel dual-site control demonstrated improved performance for targets that required use of only one DOF. Subjects using linear regression control could more easily activate multiple DOFs simultaneously, but often experienced unintended movements when trying to isolate individual DOFs. Offline analyses also suggested that the method used to train linear regression systems may influence controllability. Linear regression myoelectric control using intramuscular EMG provided an alternative to parallel dual-site control for 3-DOF simultaneous control at the wrist and hand. The two methods demonstrated different strengths in controllability, highlighting the tradeoff between providing simultaneous control and the ability to isolate individual DOFs when desired.

  18. Estimating monotonic rates from biological data using local linear regression.

    Olito, Colin; White, Craig R; Marshall, Dustin J; Barneche, Diego R

    2017-03-01

    Accessing many fundamental questions in biology begins with empirical estimation of simple monotonic rates of underlying biological processes. Across a variety of disciplines, ranging from physiology to biogeochemistry, these rates are routinely estimated from non-linear and noisy time series data using linear regression and ad hoc manual truncation of non-linearities. Here, we introduce the R package LoLinR, a flexible toolkit to implement local linear regression techniques to objectively and reproducibly estimate monotonic biological rates from non-linear time series data, and demonstrate possible applications using metabolic rate data. LoLinR provides methods to easily and reliably estimate monotonic rates from time series data in a way that is statistically robust, facilitates reproducible research and is applicable to a wide variety of research disciplines in the biological sciences. © 2017. Published by The Company of Biologists Ltd.

  19. Comparison of Classical Linear Regression and Orthogonal Regression According to the Sum of Squares Perpendicular Distances

    KELEŞ, Taliha; ALTUN, Murat

    2016-01-01

    Regression analysis is a statistical technique for investigating and modeling the relationship between variables. The purpose of this study was the trivial presentation of the equation for orthogonal regression (OR) and the comparison of classical linear regression (CLR) and OR techniques with respect to the sum of squared perpendicular distances. For that purpose, the analyses were shown by an example. It was found that the sum of squared perpendicular distances of OR is smaller. Thus, it wa...
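
    The comparison described above can be sketched on a small made-up data set. The closed-form slope used for orthogonal regression is the standard total-least-squares expression for one predictor with equal error variances; the function names and data are illustrative assumptions, and the sketch assumes a non-zero cross-product sum.

```python
import math

def centered_sums(x, y):
    """Return means and centered sums of squares and cross-products."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return mx, my, sxx, syy, sxy

def fit_clr(x, y):
    """Classical linear regression: minimizes vertical squared distances."""
    mx, my, sxx, _, sxy = centered_sums(x, y)
    b = sxy / sxx
    return my - b * mx, b

def fit_or(x, y):
    """Orthogonal regression: minimizes perpendicular squared distances."""
    mx, my, sxx, syy, sxy = centered_sums(x, y)
    b = (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4 * sxy ** 2)) / (2 * sxy)
    return my - b * mx, b

def vertical_ss(x, y, a, b):
    """Sum of squared vertical distances to the line y = a + b*x."""
    return sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))

def perpendicular_ss(x, y, a, b):
    """Sum of squared perpendicular distances to the line y = a + b*x."""
    return vertical_ss(x, y, a, b) / (1 + b * b)

# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 2.9, 4.2, 4.8, 6.1]

a_clr, b_clr = fit_clr(x, y)
a_or, b_or = fit_or(x, y)

perp_clr = perpendicular_ss(x, y, a_clr, b_clr)
perp_or = perpendicular_ss(x, y, a_or, b_or)

print(f"CLR: y = {a_clr:.4f} + {b_clr:.4f}x, perpendicular SS = {perp_clr:.6f}")
print(f"OR:  y = {a_or:.4f} + {b_or:.4f}x, perpendicular SS = {perp_or:.6f}")
```

    Each fit wins under its own criterion: the orthogonal line has the smaller perpendicular sum of squares, while the classical line has the smaller vertical sum of squares.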

  20. Supervised scale-regularized linear convolutionary filters

    Loog, Marco; Lauze, Francois Bernard

    2017-01-01

    also be solved relatively efficiently. All in all, the idea is to properly control the scale of a trained filter, which we solve by introducing a specific regularization term into the overall objective function. We demonstrate, on an artificial filter learning problem, the capabilities of our basic...

  1. Use of probabilistic weights to enhance linear regression myoelectric control

    Smith, Lauren H.; Kuiken, Todd A.; Hargrove, Levi J.

    2015-12-01

    Objective. Clinically available prostheses for transradial amputees do not allow simultaneous myoelectric control of degrees of freedom (DOFs). Linear regression methods can provide simultaneous myoelectric control, but frequently also result in difficulty with isolating individual DOFs when desired. This study evaluated the potential of using probabilistic estimates of categories of gross prosthesis movement, which are commonly used in classification-based myoelectric control, to enhance linear regression myoelectric control. Approach. Gaussian models were fit to electromyogram (EMG) feature distributions for three movement classes at each DOF (no movement, or movement in either direction) and used to weight the output of linear regression models by the probability that the user intended the movement. Eight able-bodied and two transradial amputee subjects worked in a virtual Fitts’ law task to evaluate differences in controllability between linear regression and probability-weighted regression for an intramuscular EMG-based three-DOF wrist and hand system. Main results. Real-time and offline analyses in able-bodied subjects demonstrated that probability weighting improved performance during single-DOF tasks (p < 0.05) by preventing extraneous movement at additional DOFs. Similar results were seen in experiments with two transradial amputees. Though goodness-of-fit evaluations suggested that the EMG feature distributions showed some deviations from the Gaussian, equal-covariance assumptions used in this experiment, the assumptions were sufficiently met to provide improved performance compared to linear regression control. Significance. Use of probability weights can improve the ability to isolate individual DOFs during linear regression myoelectric control, while maintaining the ability to simultaneously control multiple DOFs.

  2. Decomposition of ECG by linear filtering.

    Murthy, I S; Niranjan, U C

    1992-01-01

    A simple method is developed for the delineation of a given electrocardiogram (ECG) signal into its component waves. The properties of discrete cosine transform (DCT) are exploited for the purpose. The transformed signal is convolved with appropriate filters and the component waves are obtained by computing the inverse transform (IDCT) of the filtered signals. The filters are derived from the time signal itself. Analysis of continuous strips of ECG signals with various arrhythmias showed that the performance of the method is satisfactory both qualitatively and quantitatively. The small amplitude P wave usually had a high percentage rms difference (PRD) compared to the other large component waves.

  3. Biostatistics Series Module 6: Correlation and Linear Regression.

    Hazra, Avijit; Gogtay, Nithya

    2016-01-01

    Correlation and linear regression are the most commonly used techniques for quantifying the association between two numeric variables. Correlation quantifies the strength of the linear relationship between paired variables, expressing this as a correlation coefficient. If both variables x and y are normally distributed, we calculate Pearson's correlation coefficient (r). If the normality assumption is not met for one or both variables in a correlation analysis, a rank correlation coefficient, such as Spearman's rho (ρ), may be calculated. A hypothesis test of correlation tests whether the linear relationship between the two variables holds in the underlying population, in which case it returns a P < 0.05. A 95% confidence interval of the correlation coefficient can also be calculated for an idea of the correlation in the population. The value r² denotes the proportion of the variability of the dependent variable y that can be attributed to its linear relation with the independent variable x and is called the coefficient of determination. Linear regression is a technique that attempts to link two correlated variables x and y in the form of a mathematical equation (y = a + bx), such that given the value of one variable the other may be predicted. In general, the method of least squares is applied to obtain the equation of the regression line. Correlation and linear regression analysis are based on certain assumptions pertaining to the data sets. If these assumptions are not met, misleading conclusions may be drawn. The first assumption is that of linear relationship between the two variables. A scatter plot is essential before embarking on any correlation-regression analysis to show that this is indeed the case. Outliers or clustering within data sets can distort the correlation coefficient value. Finally, it is vital to remember that though strong correlation can be a pointer toward causation, the two are not synonymous.
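
    The quantities named in this abstract (Pearson's r, the coefficient of determination, and the least-squares line y = a + bx) can be computed directly from their definitions. The sketch below uses a tiny made-up data set and hypothetical function names; it is an illustration of the formulas, not code from the article.

```python
import math

def pearson_and_line(x, y):
    """Return (r, a, b): Pearson's r and the least-squares line y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    r = sxy / math.sqrt(sxx * syy)   # strength of the linear relationship
    b = sxy / sxx                    # least-squares slope
    a = my - b * mx                  # least-squares intercept
    return r, a, b

# Hypothetical paired measurements, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

r, a, b = pearson_and_line(x, y)
r_squared = r * r  # proportion of the variability of y attributed to x

# Cross-check r^2 against 1 - RSS/TSS for the fitted line.
mean_y = sum(y) / len(y)
rss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
tss = sum((yi - mean_y) ** 2 for yi in y)

print(f"r = {r:.3f}, r^2 = {r_squared:.3f}")
print(f"regression line: y = {a:.3f} + {b:.3f}x")
print(f"1 - RSS/TSS = {1 - rss / tss:.3f}")
```

    The final line checks the interpretation given in the abstract: for a least-squares line, r² coincides with the share of the total variability of y that the fitted line accounts for.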

  4. Linear regression methods according to objective functions

    Yasemin Sisman; Sebahattin Bektas

    2012-01-01

    The aim of the study is to explain the parameter estimation methods and the regression analysis. The simple linear regression methods grouped according to the objective function are introduced. The numerical solution is achieved for the simple linear regression methods according to objective function of Least Squares and the Least Absolute Value adjustment methods. The success of the applied methods is analyzed using their objective function values.
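
    A small sketch can make the two objective functions concrete. It is not the authors' numerical procedure: the least-absolute-value fit below uses a brute-force search that relies on the known property that, for a straight line, some optimal solution passes through two of the data points. The data (with one deliberate outlier) and all function names are assumptions made for illustration.

```python
from itertools import combinations

def sum_squares(x, y, a, b):
    """Least Squares objective: sum of squared residuals."""
    return sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))

def sum_abs(x, y, a, b):
    """Least Absolute Value objective: sum of absolute residuals."""
    return sum(abs(yi - a - b * xi) for xi, yi in zip(x, y))

def fit_least_squares(x, y):
    """Closed-form least-squares line y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum(
        (xi - mx) ** 2 for xi in x
    )
    return my - b * mx, b

def fit_least_absolute(x, y):
    """Brute-force LAV fit: try every line through two points with distinct x."""
    best = None
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        if x1 == x2:
            continue  # a vertical line cannot be written as y = a + b*x
        b = (y2 - y1) / (x2 - x1)
        a = y1 - b * x1
        cost = sum_abs(x, y, a, b)
        if best is None or cost < best[0]:
            best = (cost, a, b)
    return best[1], best[2]

# Hypothetical data with an outlier in the last observation.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.1, 2.0, 2.9, 4.2, 5.0, 12.0]

a_ls, b_ls = fit_least_squares(x, y)
a_lav, b_lav = fit_least_absolute(x, y)

for name, a, b in (("LS", a_ls, b_ls), ("LAV", a_lav, b_lav)):
    print(
        f"{name}: y = {a:.3f} + {b:.3f}x, "
        f"sum of squares = {sum_squares(x, y, a, b):.3f}, "
        f"sum of absolute values = {sum_abs(x, y, a, b):.3f}"
    )
```

    Comparing the two printed rows shows each method winning on its own objective, and the LAV line being pulled less by the outlier than the LS line.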

  5. Regression of non-linear coupling of noise in LIGO detectors

    Da Silva Costa, C. F.; Billman, C.; Effler, A.; Klimenko, S.; Cheng, H.-P.

    2018-03-01

    In 2015, after their upgrade, the advanced Laser Interferometer Gravitational-Wave Observatory (LIGO) detectors started acquiring data. The effort to improve their sensitivity has never stopped since then. The goal to achieve design sensitivity is challenging. Environmental and instrumental noise couple to the detector output with different, linear and non-linear, coupling mechanisms. The noise regression method we use is based on the Wiener–Kolmogorov filter, which uses witness channels to make noise predictions. We present here how this method helped to determine complex non-linear noise couplings in the output mode cleaner and in the mirror suspension system of the LIGO detector.

  6. Optimal choice of basis functions in the linear regression analysis

    Khotinskij, A.M.

    1988-01-01

    The problem of the optimal choice of basis functions in linear regression analysis is investigated. A step algorithm is suggested, together with an estimate of its efficiency that holds true for a finite number of measurements. Conditions that make the probability of a correct choice close to 1 are formulated. Application of the step algorithm to the analysis of decay curves is substantiated. 8 refs

  7. Common pitfalls in statistical analysis: Linear regression analysis

    Rakesh Aggarwal

    2017-01-01

    Full Text Available In a previous article in this series, we explained correlation analysis which describes the strength of relationship between two continuous variables. In this article, we deal with linear regression analysis which predicts the value of one continuous variable from another. We also discuss the assumptions and pitfalls associated with this analysis.

  8. How Robust Is Linear Regression with Dummy Variables?

    Blankmeyer, Eric

    2006-01-01

    Researchers in education and the social sciences make extensive use of linear regression models in which the dependent variable is continuous-valued while the explanatory variables are a combination of continuous-valued regressors and dummy variables. The dummies partition the sample into groups, some of which may contain only a few observations.…

  9. On the null distribution of Bayes factors in linear regression

    We show that under the null, the 2 log (Bayes factor) is asymptotically distributed as a weighted sum of chi-squared random variables with a shifted mean. This claim holds for Bayesian multi-linear regression with a family of conjugate priors, namely, the normal-inverse-gamma prior, the g-prior, and...

  10. Fitting program for linear regressions according to Mahon (1996)

    2018-01-09

    This program takes the users' Input data and fits a linear regression to it using the prescription presented by Mahon (1996). Compared to the commonly used York fit, this method has the correct prescription for measurement error propagation. This software should facilitate the proper fitting of measurements with a simple Interface.

  11. Data Transformations for Inference with Linear Regression: Clarifications and Recommendations

    Pek, Jolynn; Wong, Octavia; Wong, C. M.

    2017-01-01

    Data transformations have been promoted as a popular and easy-to-implement remedy to address the assumption of normally distributed errors (in the population) in linear regression. However, the application of data transformations introduces non-ignorable complexities which should be fully appreciated before their implementation. This paper adds to…

  12. Linear regression and sensitivity analysis in nuclear reactor design

    Kumar, Akansha; Tsvetkov, Pavel V.; McClarren, Ryan G.

    2015-01-01

    Highlights: • Presented a benchmark for the applicability of linear regression to complex systems. • Applied linear regression to a nuclear reactor power system. • Performed neutronics, thermal–hydraulics, and energy conversion using Brayton’s cycle for the design of a GCFBR. • Performed detailed sensitivity analysis to a set of parameters in a nuclear reactor power system. • Modeled and developed reactor design using MCNP, regression using R, and thermal–hydraulics in Java. - Abstract: The paper presents a general strategy applicable for sensitivity analysis (SA), and uncertainty quantification analysis (UA) of parameters related to a nuclear reactor design. This work also validates the use of linear regression (LR) for predictive analysis in a nuclear reactor design. The analysis helps to determine the parameters on which a LR model can be fit for predictive analysis. For those parameters, a regression surface is created based on trial data and predictions are made using this surface. A general strategy of SA to determine and identify the influential parameters that affect the operation of the reactor is mentioned. Identification of design parameters and validation of linearity assumption for the application of LR of reactor design based on a set of tests is performed. The testing methods used to determine the behavior of the parameters can be used as a general strategy for UA, and SA of nuclear reactor models, and thermal hydraulics calculations. A design of a gas cooled fast breeder reactor (GCFBR), with thermal–hydraulics, and energy transfer has been used for the demonstration of this method. MCNP6 is used to simulate the GCFBR design, and perform the necessary criticality calculations. Java is used to build and run input samples, and to extract data from the output files of MCNP6, and R is used to perform regression analysis and other multivariate variance, and analysis of the collinearity of data

  13. Direction of Effects in Multiple Linear Regression Models.

    Wiedermann, Wolfgang; von Eye, Alexander

    2015-01-01

    Previous studies analyzed asymmetric properties of the Pearson correlation coefficient using higher than second order moments. These asymmetric properties can be used to determine the direction of dependence in a linear regression setting (i.e., establish which of two variables is more likely to be on the outcome side) within the framework of cross-sectional observational data. Extant approaches are restricted to the bivariate regression case. The present contribution extends the direction of dependence methodology to a multiple linear regression setting by analyzing distributional properties of residuals of competing multiple regression models. It is shown that, under certain conditions, the third central moments of estimated regression residuals can be used to decide upon direction of effects. In addition, three different approaches for statistical inference are discussed: a combined D'Agostino normality test, a skewness difference test, and a bootstrap difference test. Type I error and power of the procedures are assessed using Monte Carlo simulations, and an empirical example is provided for illustrative purposes. In the discussion, issues concerning the quality of psychological data, possible extensions of the proposed methods to the fourth central moment of regression residuals, and potential applications are addressed.

  14. SPLINE LINEAR REGRESSION USED FOR EVALUATING FINANCIAL ASSETS

    Liviu GEAMBAŞU

    2010-12-01

    Full Text Available One of the most important preoccupations of financial market participants was and still is the problem of determining more precisely the trend of financial asset prices. To solve this problem, many scientific papers were written and many mathematical and statistical models were developed in order to better determine the financial asset price trend. If until recently the simple linear models were largely used due to their ease of use, the financial crises that affected the world economy starting with 2008 highlighted the necessity of adapting the mathematical models to variation of the economy. A simple-to-use model that is nevertheless adapted to the realities of economic life is the spline linear regression. This type of regression keeps the continuity of the regression function, but splits the studied data into intervals with homogeneous characteristics. The characteristics of each interval are highlighted, and also the evolution of the market over all the intervals, resulting in reduced standard errors. The first objective of the article is the theoretical presentation of the spline linear regression, also referring to scientific national and international papers related to this subject. The second objective is applying the theoretical model to data from the Bucharest Stock Exchange
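
    The continuity property described above can be sketched with a hinge (truncated linear) basis, one common way to write a continuous piecewise-linear regression. This is an assumption about the formulation, not the article's own code; the knot position, function names and data are made up.

```python
def solve(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))) / M[i][i]
    return x


def fit_linear_spline(x, y, knots):
    """Least-squares fit of y = b0 + b1*x + sum_k c_k * max(0, x - knot_k).

    Each hinge term is zero left of its knot, so the fitted function stays
    continuous while its slope changes by c_k at knot_k.
    """
    rows = [[1.0, xi] + [max(0.0, xi - k) for k in knots] for xi in x]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve(xtx, xty)


# Made-up series: slope 2 up to x = 4, then slope -1 afterwards.
x = [float(i) for i in range(10)]
y = [1.0 + 2.0 * xi - 3.0 * max(0.0, xi - 4.0) for xi in x]

coef = fit_linear_spline(x, y, knots=[4.0])
print(coef)  # intercept, initial slope, slope change at the knot
```

    In practice the knots would be chosen from the data (for example at dates where market behaviour changes); here a single knot is fixed by assumption.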

  15. Simple and multiple linear regression: sample size considerations.

    Hanley, James A

    2016-11-01

    The suggested "two subjects per variable" (2SPV) rule of thumb in the Austin and Steyerberg article is a chance to bring out some long-established and quite intuitive sample size considerations for both simple and multiple linear regression. This article distinguishes two of the major uses of regression models that imply very different sample size considerations, neither served well by the 2SPV rule. The first is etiological research, which contrasts mean Y levels at differing "exposure" (X) values and thus tends to focus on a single regression coefficient, possibly adjusted for confounders. The second research genre guides clinical practice. It addresses Y levels for individuals with different covariate patterns or "profiles." It focuses on the profile-specific (mean) Y levels themselves, estimating them via linear compounds of regression coefficients and covariates. By drawing on long-established closed-form variance formulae that lie beneath the standard errors in multiple regression, and by rearranging them for heuristic purposes, one arrives at quite intuitive sample size considerations for both research genres. Copyright © 2016 Elsevier Inc. All rights reserved.

  16. Optimal linear filtering of Poisson process with dead time

    Glukhova, E.V.

    1993-01-01

    The paper presents a derivation of an integral equation defining the impulsed transient of optimum linear filtering for evaluation of the intensity of the fluctuating Poisson process with allowance for dead time of transducers

  17. Non-linear and signal energy optimal asymptotic filter design

    Josef Hrusak

    2003-10-01

    Full Text Available The paper studies some connections between the main results of the well known Wiener-Kalman-Bucy stochastic approach to filtering problems based mainly on the linear stochastic estimation theory and emphasizing the optimality aspects of the achieved results and the classical deterministic frequency domain linear filters such as Chebyshev, Butterworth, Bessel, etc. A new non-stochastic but not necessarily deterministic (possibly non-linear) alternative approach called asymptotic filtering based mainly on the concepts of signal power, signal energy and a system equivalence relation plays an important role in the presentation. Filtering error invariance and convergence aspects are emphasized in the approach. It is shown that introducing the signal power as the quantitative measure of energy dissipation makes it possible to achieve reasonable results from the optimality point of view as well. The property of structural energy dissipativeness is one of the most important and fundamental features of resulting filters. Therefore, it is natural to call them asymptotic filters. The notion of the asymptotic filter is carried in the paper as a proper tool in order to unify stochastic and non-stochastic, linear and nonlinear approaches to signal filtering.

  18. Implementing fuzzy polynomial interpolation (FPI) and fuzzy linear regression (LFR)

    Maria Cristina Floreno

    1996-05-01

    Full Text Available This paper presents some preliminary results arising within a general framework concerning the development of software tools for fuzzy arithmetic. The program is in a preliminary stage. What has been already implemented consists of a set of routines for elementary operations, optimized functions evaluation, interpolation and regression. Some of these have been applied to real problems. This paper describes a prototype of a library in C++ for polynomial interpolation of fuzzifying functions, a set of routines in FORTRAN for fuzzy linear regression and a program with graphical user interface allowing the use of such routines.

  19. Stochastic development regression on non-linear manifolds

    Kühnel, Line; Sommer, Stefan Horst

    2017-01-01

    We introduce a regression model for data on non-linear manifolds. The model describes the relation between a set of manifold valued observations, such as shapes of anatomical objects, and Euclidean explanatory variables. The approach is based on stochastic development of Euclidean diffusion...... processes to the manifold. Defining the data distribution as the transition distribution of the mapped stochastic process, parameters of the model, the non-linear analogue of design matrix and intercept, are found via maximum likelihood. The model is intrinsically related to the geometry encoded...

  20. Computer software for linear and nonlinear regression in organic NMR

    Canto, Eduardo Leite do; Rittner, Roberto

    1991-01-01

    Calculations involving two-variable linear regressions require specific procedures generally not familiar to chemists. To meet the need for fast and efficient handling of NMR data, a self-explanatory and PC-portable software package has been developed, which allows the user to produce and use tables recorded on diskette, containing chemical shifts or any other substituent physical-chemical measurements and constants (σ_T, σ^o_R, E_s, ...)

  1. Multicollinearity in applied economics research and the Bayesian linear regression

    EISENSTAT, Eric

    2016-01-01

    This article revises the popular issue of collinearity amongst explanatory variables in the context of a multiple linear regression analysis, particularly in empirical studies within social science related fields. Some important interpretations and explanations are highlighted from the econometrics literature with respect to the effects of multicollinearity on statistical inference, as well as the general shortcomings of the once fervent search for methods intended to detect and mitigate thes...

  2. Extending the linear model with R generalized linear, mixed effects and nonparametric regression models

    Faraway, Julian J

    2005-01-01

    Linear models are central to the practice of statistics and form the foundation of a vast range of statistical methodologies. Julian J. Faraway's critically acclaimed Linear Models with R examined regression and analysis of variance, demonstrated the different methods available, and showed in which situations each one applies. Following in those footsteps, Extending the Linear Model with R surveys the techniques that grow from the regression model, presenting three extensions to that framework: generalized linear models (GLMs), mixed effect models, and nonparametric regression models. The author's treatment is thoroughly modern and covers topics that include GLM diagnostics, generalized linear mixed models, trees, and even the use of neural networks in statistics. To demonstrate the interplay of theory and practice, throughout the book the author weaves the use of the R software environment to analyze the data of real examples, providing all of the R commands necessary to reproduce the analyses. All of the ...

  3. Establishment of regression dependences. Linear and nonlinear dependences

    Onishchenko, A.M.

    1994-01-01

    The main problems of determining linear and 19 types of nonlinear regression dependences are discussed in full. It is taken into consideration that total dispersions are the sum of measurement dispersions and parameter variation dispersions themselves. Approaches to the determination of all dispersions are described. It is shown that the least squares fit gives inconsistent estimates for industrial objects and processes. Correction methods that take into account comparable measurement errors for both variables make it possible to obtain consistent estimates of the regression equation parameters. The condition under which applying the correction technique is expedient is given. The technique for determining nonlinear regression dependences, taking into account the dependence form and comparable errors of both variables, is described. 6 refs., 1 tab

  4. Return-Volatility Relationship: Insights from Linear and Non-Linear Quantile Regression

    D.E. Allen (David); A.K. Singh (Abhay); R.J. Powell (Robert); M.J. McAleer (Michael); J. Taylor (James); L. Thomas (Lyn)

    2013-01-01

    The purpose of this paper is to examine the asymmetric relationship between price and implied volatility and the associated extreme quantile dependence using linear and non linear quantile regression approach. Our goal in this paper is to demonstrate that the relationship between the

  5. Single image super-resolution using locally adaptive multiple linear regression.

    Yu, Soohwan; Kang, Wonseok; Ko, Seungyong; Paik, Joonki

    2015-12-01

    This paper presents a regularized superresolution (SR) reconstruction method using locally adaptive multiple linear regression to overcome the limitation of spatial resolution of digital images. In order to make the SR problem better-posed, the proposed method incorporates the locally adaptive multiple linear regression into the regularization process as a local prior. The local regularization prior assumes that the target high-resolution (HR) pixel is generated by a linear combination of similar pixels in differently scaled patches and optimum weight parameters. In addition, we adapt a modified version of the nonlocal means filter as a smoothness prior to utilize the patch redundancy. Experimental results show that the proposed algorithm better restores HR images than existing state-of-the-art methods in the sense of the most objective measures in the literature.

  6. Using the Ridge Regression Procedures to Estimate the Multiple Linear Regression Coefficients

    Gorgees, Hazim Mansoor; Mahdi, Fatimah Assim

    2018-05-01

    This article is concerned with comparing the performance of different types of ordinary ridge regression estimators that have already been proposed to estimate the regression parameters when near-exact linear relationships among the explanatory variables are present. For this situation we employ the data obtained from tagi gas filling company during the period (2008-2010). The main result we reached is that the method based on the condition number performs better than the other methods, since it has a smaller mean square error (MSE) than the other stated methods.
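
    For orientation, the ordinary ridge estimator has the closed form (X'X + kI)^(-1) X'y. The sketch below applies it to made-up, nearly collinear, centered data. The ridge parameter k is simply passed in; the rules for choosing k that the article compares (including the condition-number-based one) are not reproduced here, and all names are assumptions.

```python
def solve(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))) / M[i][i]
    return x


def ridge(X, y, k):
    """Ordinary ridge estimator (X'X + k*I)^(-1) X'y; k = 0 gives ordinary least squares.

    Assumes the columns of X and the response y are already centered,
    so no intercept term is fitted.
    """
    p = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) + (k if i == j else 0.0) for j in range(p)]
           for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    return solve(xtx, xty)


# Made-up, nearly collinear predictors (second column is a perturbed copy of the first).
x1 = [-2.0, -1.0, 0.0, 1.0, 2.0]
x2 = [-1.9, -1.1, 0.0, 1.1, 1.9]
X = [[a, b] for a, b in zip(x1, x2)]
y = [a + b for a, b in zip(x1, x2)]   # true coefficients are (1, 1)

beta_ols = ridge(X, y, k=0.0)
beta_ridge = ridge(X, y, k=1.0)
print(beta_ols, beta_ridge)
```

    Increasing k shrinks the coefficient vector and trades some bias for lower variance, which is why ridge estimators are usually compared by mean square error, as in the article.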

  7. On macroeconomic values investigation using fuzzy linear regression analysis

    Richard Pospíšil

    2017-06-01

    Full Text Available The theoretical background for abstract formalization of the vague phenomenon of complex systems is the fuzzy set theory. In the paper, vague data is defined as specialized fuzzy sets (fuzzy numbers), and a fuzzy linear regression model is described as a fuzzy function with fuzzy numbers as vague parameters. To identify the fuzzy coefficients of the model, the genetic algorithm is used. The linear approximation of the vague function together with its possibility area is analytically and graphically expressed. A suitable application is performed in the tasks of the time series fuzzy regression analysis. The time-trend and seasonal cycles including their possibility areas are calculated and expressed. The examples are presented from the economy field, namely the time-development of unemployment, agricultural production and construction respectively between 2009 and 2011 in the Czech Republic. The results are shown in the form of the fuzzy regression models of variables of time series. For the period 2009-2011, the analysis assumptions about seasonal behaviour of variables and the relationship between them were confirmed; in 2010, the system behaved fuzzier and the relationships between the variables were vaguer, which has many causes, from the different elasticity of demand, through state interventions, to globalization and transnational impacts.

  8. BRGLM, Interactive Linear Regression Analysis by Least Square Fit

    Ringland, J.T.; Bohrer, R.E.; Sherman, M.E.

    1985-01-01

    1 - Description of program or function: BRGLM is an interactive program written to fit general linear regression models by least squares and to provide a variety of statistical diagnostic information about the fit. Stepwise and all-subsets regression can be carried out also. There are facilities for interactive data management (e.g. setting missing value flags, data transformations) and tools for constructing design matrices for the more commonly-used models such as factorials, cubic Splines, and auto-regressions. 2 - Method of solution: The least squares computations are based on the orthogonal (QR) decomposition of the design matrix obtained using the modified Gram-Schmidt algorithm. 3 - Restrictions on the complexity of the problem: The current release of BRGLM allows maxima of 1000 observations, 99 variables, and 3000 words of main memory workspace. For a problem with N observations and P variables, the number of words of main memory storage required is MAX(N*(P+6), N*P+P*P+3*N, and 3*P*P+6*N). Any linear model may be fit although the in-memory workspace will have to be increased for larger problems
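
    The method of solution and the storage formula above can be sketched as follows. This is an independent illustration of least squares via a modified Gram-Schmidt QR decomposition, not BRGLM's source code; the function names and test data are assumptions, and the design matrix is assumed to have full column rank.

```python
import math


def mgs_qr(A):
    """Modified Gram-Schmidt QR of an m-by-n matrix with full column rank.

    Returns Q as a list of n orthonormal columns and R as an n-by-n
    upper-triangular matrix such that A = Q R.
    """
    m, n = len(A), len(A[0])
    V = [[A[i][j] for i in range(m)] for j in range(n)]  # working copies of the columns
    Q = []
    R = [[0.0] * n for _ in range(n)]
    for j in range(n):
        R[j][j] = math.sqrt(sum(t * t for t in V[j]))
        q = [t / R[j][j] for t in V[j]]
        Q.append(q)
        # Orthogonalize the remaining columns against q right away (the "modified" step).
        for k in range(j + 1, n):
            R[j][k] = sum(a * b for a, b in zip(q, V[k]))
            V[k] = [b - R[j][k] * a for a, b in zip(q, V[k])]
    return Q, R


def lstsq_qr(A, y):
    """Least-squares coefficients from R beta = Q'y by back substitution."""
    Q, R = mgs_qr(A)
    n = len(R)
    c = [sum(qi * yi for qi, yi in zip(q, y)) for q in Q]
    beta = [0.0] * n
    for i in reversed(range(n)):
        beta[i] = (c[i] - sum(R[i][k] * beta[k] for k in range(i + 1, n))) / R[i][i]
    return beta


def workspace_words(n_obs, n_vars):
    """Main-memory words required, per the formula quoted in the abstract."""
    return max(n_obs * (n_vars + 6),
               n_obs * n_vars + n_vars * n_vars + 3 * n_obs,
               3 * n_vars * n_vars + 6 * n_obs)


# Made-up design matrix (intercept column plus one regressor) and response y = 1 + 2x.
A = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 3.0, 5.0, 7.0]
beta = lstsq_qr(A, y)
print(beta, workspace_words(10, 2))
```

    Working from the QR factors avoids forming X'X explicitly, which generally behaves better numerically than solving the normal equations when the design matrix is ill-conditioned.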

  9. A comparison of random forest regression and multiple linear regression for prediction in neuroscience.

    Smith, Paul F; Ganesh, Siva; Liu, Ping

    2013-10-30

    Regression is a common statistical tool for prediction in neuroscience. However, linear regression is by far the most common form of regression used, with regression trees receiving comparatively little attention. In this study, the results of conventional multiple linear regression (MLR) were compared with those of random forest regression (RFR), in the prediction of the concentrations of 9 neurochemicals in the vestibular nucleus complex and cerebellum that are part of the l-arginine biochemical pathway (agmatine, putrescine, spermidine, spermine, l-arginine, l-ornithine, l-citrulline, glutamate and γ-aminobutyric acid (GABA)). The R² values for the MLRs were higher than the proportion of variance explained values for the RFRs: 6/9 of them were ≥ 0.70 compared to 4/9 for RFRs. Even the variables that had the lowest R² values for the MLRs, e.g. ornithine (0.50) and glutamate (0.61), had much lower proportion of variance explained values for the RFRs (0.27 and 0.49, respectively). The RSE values for the MLRs were lower than those for the RFRs in all but two cases. In general, MLRs seemed to be superior to the RFRs in terms of predictive value and error. In the case of this data set, MLR appeared to be superior to RFR in terms of its explanatory value and error. This result suggests that MLR may have advantages over RFR for prediction in neuroscience with this kind of data set, but that RFR can still have good predictive value in some cases. Copyright © 2013 Elsevier B.V. All rights reserved.

  10. Relative Importance for Linear Regression in R: The Package relaimpo

    Ulrike Gromping

    2006-09-01

    Full Text Available Relative importance is a topic that has seen a lot of interest in recent years, particularly in applied work. The R package relaimpo implements six different metrics for assessing relative importance of regressors in the linear model, two of which are recommended - averaging over orderings of regressors and a newly proposed metric (Feldman 2005) called pmvd. Apart from delivering the metrics themselves, relaimpo also provides (exploratory) bootstrap confidence intervals. This paper offers a brief tutorial introduction to the package. The methods and relaimpo’s functionality are illustrated using the data set swiss that is generally available in R. The paper targets readers who have a basic understanding of multiple linear regression. For the background of more advanced aspects, references are provided.

  11. Stochastic development regression on non-linear manifolds

    Kühnel, Line; Sommer, Stefan Horst

    2017-01-01

    We introduce a regression model for data on non-linear manifolds. The model describes the relation between a set of manifold valued observations, such as shapes of anatomical objects, and Euclidean explanatory variables. The approach is based on stochastic development of Euclidean diffusion...... processes to the manifold. Defining the data distribution as the transition distribution of the mapped stochastic process, parameters of the model, the non-linear analogue of design matrix and intercept, are found via maximum likelihood. The model is intrinsically related to the geometry encoded...... in the connection of the manifold. We propose an estimation procedure which applies the Laplace approximation of the likelihood function. A simulation study of the performance of the model is performed and the model is applied to a real dataset of Corpus Callosum shapes....

  12. Linear filtering applied to Monte Carlo criticality calculations

    Morrison, G.W.; Pike, D.H.; Petrie, L.M.

    1975-01-01

    A significant improvement in the acceleration of the convergence of the eigenvalue computed by Monte Carlo techniques has been developed by applying linear filtering theory to Monte Carlo calculations for multiplying systems. A Kalman filter was applied to a KENO Monte Carlo calculation of an experimental critical system consisting of eight interacting units of fissile material. A comparison of the filter estimate and the Monte Carlo realization was made. The Kalman filter converged in five iterations to 0.9977. After 95 iterations, the average k-eff from the Monte Carlo calculation was 0.9981. This demonstrates that the Kalman filter has the potential of reducing the calculational effort of multiplying systems. Other examples and results are discussed
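
    A minimal sketch of the idea follows, assuming the simplest possible model: a constant k-eff observed through noisy per-generation estimates with a known measurement variance. The paper's actual state model, the KENO data and the reported values are not reproduced; the function name, parameters and numbers below are made up.

```python
def kalman_constant(measurements, meas_var, init_est, init_var, process_var=0.0):
    """Scalar Kalman filter for a (nearly) constant state.

    Returns the sequence of filtered estimates and the final error variance.
    """
    x, p = init_est, init_var
    history = []
    for z in measurements:
        p = p + process_var          # predict: the state is assumed constant
        gain = p / (p + meas_var)    # Kalman gain
        x = x + gain * (z - x)       # update with the new measurement
        p = (1.0 - gain) * p
        history.append(x)
    return history, p


# Made-up per-generation multiplication factor estimates.
generation_keff = [1.00, 0.98, 1.02, 1.00]
estimates, final_var = kalman_constant(
    generation_keff, meas_var=1e-4, init_est=0.0, init_var=1e6
)
print(estimates, final_var)
```

    With no process noise and a very uncertain prior, the filtered estimate reduces to the running mean of the measurements. Any gain over plain averaging would therefore have to come from the richer model used in the paper, which this sketch does not attempt to capture.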

  13. A brief overview of speech enhancement with linear filtering

    Benesty, Jacob; Christensen, Mads Græsbøll; Jensen, Jesper Rindom

    2014-01-01

    In this paper, we provide an overview of some recently introduced principles and ideas for speech enhancement with linear filtering and explore how these are related and how they can be used in various applications. This is done in a general framework where the speech enhancement problem is stated......-to-noise ratio (SNR), and Wiener filters are derived from the conventional speech enhancement approach and the recently introduced orthogonal decomposition approach. For each of the filters, we derive their properties in terms of output SNR and speech distortion. We then demonstrate how the ideas can be applied...

  14. Linear filtering of systems with memory and application to finance

    2006-01-01

    Full Text Available We study the linear filtering problem for systems driven by continuous Gaussian processes V(1) and V(2) with memory described by two parameters. The processes V(j) have the virtue that they possess stationary increments and simple semimartingale representations simultaneously. They allow for straightforward parameter estimations. After giving the semimartingale representations of V(j) by innovation theory, we derive Kalman-Bucy-type filtering equations for the systems. We apply the result to the optimal portfolio problem for an investor with partial observations. We illustrate the tractability of the filtering algorithm by numerical implementations.

  15. Robust linear registration of CT images using random regression forests

    Konukoglu, Ender; Criminisi, Antonio; Pathak, Sayan; Robertson, Duncan; White, Steve; Haynor, David; Siddiqui, Khan

    2011-03-01

    Global linear registration is a necessary first step for many different tasks in medical image analysis. Comparing longitudinal studies [1], cross-modality fusion [2], and many other applications depend heavily on the success of the automatic registration. The robustness and efficiency of this step is crucial as it affects all subsequent operations. Most common techniques cast the linear registration problem as the minimization of a global energy function based on the image intensities. Although these algorithms have proved useful, their robustness in fully automated scenarios is still an open question. In fact, the optimization step often gets caught in local minima yielding unsatisfactory results. Recent algorithms constrain the space of registration parameters by exploiting implicit or explicit organ segmentations, thus increasing robustness [4,5]. In this work we propose a novel robust algorithm for automatic global linear image registration. Our method uses random regression forests to estimate posterior probability distributions for the locations of anatomical structures - represented as axis aligned bounding boxes [6]. These posterior distributions are later integrated in a global linear registration algorithm. The biggest advantage of our algorithm is that it does not require pre-defined segmentations or regions. Yet it yields robust registration results. We compare the robustness of our algorithm with that of the state of the art Elastix toolbox [7]. Validation is performed via 1464 pair-wise registrations in a database of very diverse 3D CT images. We show that our method decreases the "failure" rate of the global linear registration from 12.5% (Elastix) to only 1.9%.

  16. Non-linear DSGE Models, The Central Difference Kalman Filter, and The Mean Shifted Particle Filter

    Andreasen, Martin Møller

    This paper shows how non-linear DSGE models with potential non-normal shocks can be estimated by Quasi-Maximum Likelihood based on the Central Difference Kalman Filter (CDKF). The advantage of this estimator is that evaluating the quasi log-likelihood function only takes a fraction of a second....... The second contribution of this paper is to derive a new particle filter which we term the Mean Shifted Particle Filter (MSPFb). We show that the MSPFb outperforms the standard Particle Filter by delivering more precise state estimates, and in general the MSPFb has lower Monte Carlo variation in the reported...

  17. Estimating Loess Plateau Average Annual Precipitation with Multiple Linear Regression Kriging and Geographically Weighted Regression Kriging

    Qiutong Jin

    2016-06-01

    Full Text Available Estimating the spatial distribution of precipitation is an important and challenging task in hydrology, climatology, ecology, and environmental science. In order to generate a highly accurate distribution map of average annual precipitation for the Loess Plateau in China, multiple linear regression Kriging (MLRK) and geographically weighted regression Kriging (GWRK) methods were employed using precipitation data from the period 1980–2010 from 435 meteorological stations. The predictors in regression Kriging were selected by stepwise regression analysis from many auxiliary environmental factors, such as elevation (DEM), normalized difference vegetation index (NDVI), solar radiation, slope, and aspect. All predictor distribution maps had a 500 m spatial resolution. Validation precipitation data from 130 hydrometeorological stations were used to assess the prediction accuracies of the MLRK and GWRK approaches. Results showed that both prediction maps with a 500 m spatial resolution interpolated by MLRK and GWRK had a high accuracy and captured detailed spatial distribution data; however, MLRK produced a lower prediction error and a higher variance explanation than GWRK, although the differences were small, in contrast to conclusions from similar studies.

  18. Group Lifting Structures For Multirate Filter Banks, II: Linear Phase Filter Banks

    Brislawn, Christopher M [Los Alamos National Laboratory

    2008-01-01

    The theory of group lifting structures is applied to linear phase lifting factorizations for the two nontrivial classes of two-channel linear phase perfect reconstruction filter banks, the whole- and half-sample symmetric classes. Group lifting structures defined for the reversible and irreversible classes of whole- and half-sample symmetric filter banks are shown to satisfy the hypotheses of the uniqueness theorem for group lifting structures. It follows that linear phase lifting factorizations of whole- and half-sample symmetric filter banks are therefore independent of the factorization methods used to compute them. These results cover the specification of user-defined whole-sample symmetric filter banks in Part 2 of the ISO JPEG 2000 standard.

  19. Comparison of Linear and Non-linear Regression Analysis to Determine Pulmonary Pressure in Hyperthyroidism.

    Scarneciu, Camelia C; Sangeorzan, Livia; Rus, Horatiu; Scarneciu, Vlad D; Varciu, Mihai S; Andreescu, Oana; Scarneciu, Ioan

    2017-01-01

    This study aimed at assessing the incidence of pulmonary hypertension (PH) at newly diagnosed hyperthyroid patients and at finding a simple model showing the complex functional relation between pulmonary hypertension in hyperthyroidism and the factors causing it. The 53 hyperthyroid patients (H-group) were evaluated mainly by using an echocardiographical method and compared with 35 euthyroid (E-group) and 25 healthy people (C-group). In order to identify the factors causing pulmonary hypertension the statistical method of comparing the values of arithmetical means is used. The functional relation between the two random variables (PAPs and each of the factors determining it within our research study) can be expressed by linear or non-linear function. By applying the linear regression method described by a first-degree equation the line of regression (linear model) has been determined; by applying the non-linear regression method described by a second degree equation, a parabola-type curve of regression (non-linear or polynomial model) has been determined. We made the comparison and the validation of these two models by calculating the determination coefficient (criterion 1), the comparison of residuals (criterion 2), application of AIC criterion (criterion 3) and use of F-test (criterion 4). From the H-group, 47% have pulmonary hypertension completely reversible when obtaining euthyroidism. The factors causing pulmonary hypertension were identified: previously known- level of free thyroxin, pulmonary vascular resistance, cardiac output; new factors identified in this study- pretreatment period, age, systolic blood pressure. According to the four criteria and to the clinical judgment, we consider that the polynomial model (graphically parabola- type) is better than the linear one. 
The better model showing the functional relation between the pulmonary hypertension in hyperthyroidism and the factors identified in this study is given by a polynomial equation of second
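    The four-criterion comparison described in this abstract can be sketched in a few lines. The following is a minimal illustration, not the authors' analysis: the data are synthetic (the study's PAPs measurements are not reproduced in this record), the helper names `solve`, `polyfit`, `rss` and `compare_models` are hypothetical, and the AIC is taken in its common Gaussian least-squares form n·ln(RSS/n) + 2k, which is an assumption about the variant used.

```python
import math

def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def polyfit(xs, ys, degree):
    """Least-squares polynomial coefficients (lowest power first) via normal equations."""
    p = degree + 1
    ata = [[sum(x ** (i + j) for x in xs) for j in range(p)] for i in range(p)]
    aty = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(p)]
    return solve(ata, aty)

def rss(xs, ys, coef):
    """Residual sum of squares of the fitted polynomial."""
    return sum((y - sum(c * x ** k for k, c in enumerate(coef))) ** 2
               for x, y in zip(xs, ys))

def compare_models(xs, ys):
    """Fit degree-1 and degree-2 models and report R^2, AIC and the partial F statistic."""
    n = len(xs)
    mean_y = sum(ys) / n
    tss = sum((y - mean_y) ** 2 for y in ys)
    report = {}
    for name, degree in (("linear", 1), ("quadratic", 2)):
        coef = polyfit(xs, ys, degree)
        r = rss(xs, ys, coef)
        k = degree + 1  # number of fitted coefficients
        report[name] = {"coef": coef, "rss": r, "r2": 1 - r / tss,
                        "aic": n * math.log(r / n) + 2 * k}
    lin, quad = report["linear"], report["quadratic"]
    # Partial F statistic for adding the squared term (one extra parameter).
    report["f_stat"] = (lin["rss"] - quad["rss"]) / (quad["rss"] / (n - 3))
    return report

# Synthetic example: a curved relation with small deterministic perturbations.
xs = list(range(10))
noise = [0.3, -0.2, 0.1, -0.4, 0.2, 0.0, -0.1, 0.3, -0.3, 0.1]
ys = [30 + 0.5 * x + 0.8 * x * x + e for x, e in zip(xs, noise)]
report = compare_models(xs, ys)
for name in ("linear", "quadratic"):
    print(name, round(report[name]["r2"], 4), round(report[name]["aic"], 2))
print("F statistic:", round(report["f_stat"], 1))
```

    A higher determination coefficient, a lower AIC and a large F statistic all point toward the second-degree model for data of this shape; the residual comparison (criterion 2) would normally be done by plotting the residuals of each fit.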

  20. High-throughput quantitative biochemical characterization of algal biomass by NIR spectroscopy; multiple linear regression and multivariate linear regression analysis.

    Laurens, L M L; Wolfrum, E J

    2013-12-18

    One of the challenges associated with microalgal biomass characterization and the comparison of microalgal strains and conversion processes is the rapid determination of the composition of algae. We have developed and applied a high-throughput screening technology based on near-infrared (NIR) spectroscopy for the rapid and accurate determination of algal biomass composition. We show that NIR spectroscopy can accurately predict the full composition using multivariate linear regression analysis of varying lipid, protein, and carbohydrate content of algal biomass samples from three strains. We also demonstrate a high quality of predictions of an independent validation set. A high-throughput 96-well configuration for spectroscopy gives equally good prediction relative to a ring-cup configuration, and thus, spectra can be obtained from as little as 10-20 mg of material. We found that lipids exhibit a dominant, distinct, and unique fingerprint in the NIR spectrum that allows for the use of single and multiple linear regression of respective wavelengths for the prediction of the biomass lipid content. This is not the case for carbohydrate and protein content, and thus, the use of multivariate statistical modeling approaches remains necessary.

  1. Convergence diagnostics for Eigenvalue problems with linear regression model

    Shi, Bo; Petrovic, Bojan

    2011-01-01

    Although the Monte Carlo method has been extensively used for criticality/Eigenvalue problems, a reliable, robust, and efficient convergence diagnostics method is still desired. Most methods are based on integral parameters (multiplication factor, entropy) and either condense the local distribution information into a single value (e.g., entropy) or even disregard it. We propose to employ the detailed cycle-by-cycle local flux evolution obtained by using mesh tally mechanism to assess the source and flux convergence. By applying a linear regression model to each individual mesh in a mesh tally for convergence diagnostics, a global convergence criterion can be obtained. We exemplify this method on two problems and obtain promising diagnostics results. (author)
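    One way to read the per-mesh idea above is to fit a straight line to each mesh cell's tally as a function of cycle number and to treat a slope indistinguishable from zero as a sign of convergence. The sketch below makes that reading concrete under stated assumptions: the tally histories are invented, the function names are hypothetical, and the global rule (every mesh slope within `t_crit` standard errors of zero) is an assumed criterion, not necessarily the one used by the authors.

```python
import math

def slope_and_stderr(ys):
    """OLS slope of ys against cycle index 0..n-1, with its standard error."""
    n = len(ys)
    xs = range(n)
    mx = (n - 1) / 2
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    stderr = math.sqrt(rss / (n - 2) / sxx)
    return slope, stderr

def converged(mesh_history, t_crit=2.0):
    """mesh_history maps a mesh id to its list of per-cycle tallies.

    Assumed global criterion: all meshes have |slope| <= t_crit * stderr.
    """
    flags = {}
    for mesh_id, ys in mesh_history.items():
        slope, se = slope_and_stderr(ys)
        flags[mesh_id] = abs(slope) <= t_crit * se
    return all(flags.values()), flags

# Invented tallies: mesh "a" fluctuates around a constant, mesh "b" still drifts.
history = {
    "a": [1.0, 1.02, 0.98, 1.01, 0.99, 1.0, 1.02, 0.98, 1.01, 0.99],
    "b": [0.5 + 0.05 * i + (0.01 if i % 2 else -0.01) for i in range(10)],
}
all_ok, flags = converged(history)
print(all_ok, flags)
```

    Because the test is applied mesh by mesh, a single drifting region is enough to flag the run as not yet converged, which is the kind of local information an integral parameter such as entropy would average away.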

  2. All-Pass Filter Based Linear Voltage Controlled Quadrature Oscillator

    Koushick Mathur

    2017-01-01

    Full Text Available A linear voltage controlled quadrature oscillator implemented from a first-order electronically tunable all-pass filter (ETAF) is presented. The active element is a commercially available current feedback amplifier (AD844) in conjunction with the relatively new Multiplication Mode Current Conveyor (MMCC) device. Electronic tunability is obtained by the control node voltage (V) of the MMCC. Effects of the device nonidealities, namely, the parasitic capacitors and the roll-off poles of the port-transfer ratios of the device, are shown to be negligible, even though the usable high-frequency ranges are constrained by these imperfections. Subsequently the filter is looped with an electronically tunable integrator (ETI) to implement the quadrature oscillator (QO). Experimental responses on the voltage tunable phase of the filter and the linear-tuning law of the quadrature oscillator up to 9.9 MHz at low THD are verified by simulation and hardware tests.

  3. Geographically weighted linear regression in a GIS environment

    Luís Eduardo Ximenes Carvalho

    2009-10-01

    Full Text Available

    This article addresses theoretical considerations and results of the implementation, in a GIS environment, of a confirmatory spatial statistics model, geographically weighted linear regression (GWR), which is not available in open-source environments. The theoretical aspects of this local spatial regression model are discussed at length because the existing literature is scarce. The GWR model was implemented in the GISDK programming language of the GIS-T TransCAD, making comprehensive use of the data manipulation tools, georeferenced data handling and spatial analysis routines available in GIS platforms. In the end, it is hoped that an important tool has been developed, even if only partially, that will contribute to the understanding and refinement of the modeling of geographic phenomena so widely analyzed in Transportation Planning studies.

  4. Modeling Pan Evaporation for Kuwait by Multiple Linear Regression

    Almedeij, Jaber

    2012-01-01

    Evaporation is an important parameter for many projects related to hydrology and water resources systems. This paper constitutes the first study conducted in Kuwait to obtain empirical relations for the estimation of daily and monthly pan evaporation as functions of available meteorological data of temperature, relative humidity, and wind speed. The data used here for the modeling are daily measurements of substantial continuity coverage, within a period of 17 years between January 1993 and December 2009, which can be considered representative of the desert climate of the urban zone of the country. Multiple linear regression technique is used with a procedure of variable selection for fitting the best model forms. The correlations of evaporation with temperature and relative humidity are also transformed in order to linearize the existing curvilinear patterns of the data by using power and exponential functions, respectively. The evaporation models suggested with the best variable combinations were shown to produce results that are in a reasonable agreement with observation values. PMID:23226984

  5. Kalman filtering for time-delayed linear systems

    LU Xiao; WANG Wei

    2006-01-01

    This paper studies linear minimum variance estimation for discrete-time systems. A simple approach to the problem is presented by developing a re-organized innovation analysis for systems with instantaneous and double time-delayed measurements. It is shown that the derived estimator involves solving three different standard Kalman filtering problems with the same dimension as the original system. The obtained results form the basis for solving some complicated problems such as H∞ fixed-lag smoothing, preview control, and H∞ filtering and control with time delays.

  6. Enhancement of Visual Field Predictions with Pointwise Exponential Regression (PER) and Pointwise Linear Regression (PLR).

    Morales, Esteban; de Leon, John Mark S; Abdollahi, Niloufar; Yu, Fei; Nouri-Mahdavi, Kouros; Caprioli, Joseph

    2016-03-01

    The study was conducted to evaluate threshold smoothing algorithms to enhance prediction of the rates of visual field (VF) worsening in glaucoma. We studied 798 patients with primary open-angle glaucoma and 6 or more years of follow-up who underwent 8 or more VF examinations. Thresholds at each VF location for the first 4 years or first half of the follow-up time (whichever was greater) were smoothed with clusters defined by the nearest neighbor (NN), Garway-Heath, Glaucoma Hemifield Test (GHT), and weighting by the correlation of rates at all other VF locations. Thresholds were regressed with a pointwise exponential regression (PER) model and a pointwise linear regression (PLR) model. Smaller root mean square error (RMSE) values of the differences between the observed and the predicted thresholds at last two follow-ups indicated better model predictions. The mean (SD) follow-up times for the smoothing and prediction phase were 5.3 (1.5) and 10.5 (3.9) years. The mean RMSE values for the PER and PLR models were unsmoothed data, 6.09 and 6.55; NN, 3.40 and 3.42; Garway-Heath, 3.47 and 3.48; GHT, 3.57 and 3.74; and correlation of rates, 3.59 and 3.64. Smoothed VF data predicted better than unsmoothed data. Nearest neighbor provided the best predictions; PER also predicted consistently more accurately than PLR. Smoothing algorithms should be used when forecasting VF results with PER or PLR. The application of smoothing algorithms on VF data can improve forecasting in VF points to assist in treatment decisions.
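    The comparison of the two pointwise models by RMSE can be illustrated for a single visual field location. This is a sketch under assumptions, not the study's procedure: the threshold series is synthetic and built to decay exponentially, the exponential model is taken as y = exp(a + b·t) fitted as a straight line through log(y), and the smoothing step is omitted. The function names are hypothetical.

```python
import math

def fit_line(ts, ys):
    """Ordinary least-squares intercept and slope of ys against ts."""
    n = len(ts)
    mt = sum(ts) / n
    my = sum(ys) / n
    b = (sum((t - mt) * (y - my) for t, y in zip(ts, ys))
         / sum((t - mt) ** 2 for t in ts))
    return my - b * mt, b

def plr_predict(ts, ys, t_new):
    """Pointwise linear regression: y = a + b * t."""
    a, b = fit_line(ts, ys)
    return [a + b * t for t in t_new]

def per_predict(ts, ys, t_new):
    """Pointwise exponential regression (assumed form): y = exp(a + b * t).

    Fitted on log(y), so every threshold must be positive.
    """
    a, b = fit_line(ts, [math.log(y) for y in ys])
    return [math.exp(a + b * t) for t in t_new]

def rmse(observed, predicted):
    """Root mean square error between observed and predicted thresholds."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted))
                     / len(observed))

# Synthetic thresholds (dB) at one location: visits every 6 months for 4 years.
ts_fit = [0.5 * i for i in range(9)]
ys_fit = [30 * math.exp(-0.08 * t) for t in ts_fit]
# Two later follow-ups used only for evaluating the forecasts.
t_new = [9.0, 10.0]
observed = [30 * math.exp(-0.08 * t) for t in t_new]

rmse_plr = rmse(observed, plr_predict(ts_fit, ys_fit, t_new))
rmse_per = rmse(observed, per_predict(ts_fit, ys_fit, t_new))
print("PLR RMSE:", round(rmse_plr, 3), "PER RMSE:", round(rmse_per, 3))
```

    Since the synthetic series is exponential by construction, PER wins here by design; with real data the ranking has to be established empirically, as the study does across 798 patients.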

  7. Characteristics and Properties of a Simple Linear Regression Model

    Kowal Robert

    2016-12-01

    Full Text Available A simple linear regression model is one of the pillars of classic econometrics. Despite the passage of time, it continues to raise interest both from the theoretical side as well as from the application side. One of the many fundamental questions in the model concerns determining derivative characteristics and studying the properties existing in their scope, referring to the first of these aspects. The literature of the subject provides several classic solutions in that regard. In the paper, a completely new design is proposed, based on the direct application of variance and its properties, resulting from the non-correlation of certain estimators with the mean, within the scope of which some fundamental dependencies of the model characteristics are obtained in a much more compact manner. The apparatus allows for a simple and uniform demonstration of multiple dependencies and fundamental properties in the model, and it does it in an intuitive manner. The results were obtained in a classic, traditional area, where everything, as it might seem, has already been thoroughly studied and discovered.

  8. Exhaustive Search for Sparse Variable Selection in Linear Regression

    Igarashi, Yasuhiko; Takenaka, Hikaru; Nakanishi-Ohno, Yoshinori; Uemura, Makoto; Ikeda, Shiro; Okada, Masato

    2018-04-01

    We propose a K-sparse exhaustive search (ES-K) method and a K-sparse approximate exhaustive search method (AES-K) for selecting variables in linear regression. With these methods, K-sparse combinations of variables are tested exhaustively assuming that the optimal combination of explanatory variables is K-sparse. By collecting the results of exhaustively computing ES-K, various approximate methods for selecting sparse variables can be summarized as density of states. With this density of states, we can compare different methods for selecting sparse variables such as relaxation and sampling. For large problems where the combinatorial explosion of explanatory variables is crucial, the AES-K method enables density of states to be effectively reconstructed by using the replica-exchange Monte Carlo method and the multiple histogram method. Applying the ES-K and AES-K methods to type Ia supernova data, we confirmed the conventional understanding in astronomy when an appropriate K is given beforehand. However, we found it difficult to determine K from the data. Using virtual measurement and analysis, we argue that this is caused by data shortage.
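    The exhaustive part of ES-K amounts to fitting every combination of K explanatory variables and ranking the fits. The sketch below shows that idea on a tiny invented data set; it is not the authors' implementation, the function names are hypothetical, the model is assumed to have no intercept, and the residual sum of squares is used as the score (the paper may use a different criterion). The approximate AES-K variant with replica-exchange Monte Carlo is not covered.

```python
from itertools import combinations

def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def rss_for_subset(X, y, subset):
    """Least-squares fit of y on the chosen columns (no intercept); returns the RSS."""
    cols = list(subset)
    ata = [[sum(row[i] * row[j] for row in X) for j in cols] for i in cols]
    aty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in cols]
    beta = solve(ata, aty)
    return sum((yi - sum(b * row[c] for b, c in zip(beta, cols))) ** 2
               for row, yi in zip(X, y))

def exhaustive_search(X, y, k):
    """Score every k-variable combination and return (RSS, subset) pairs, best first."""
    p = len(X[0])
    return sorted((rss_for_subset(X, y, s), s) for s in combinations(range(p), k))

# Invented data: 8 samples, 5 candidate variables; only columns 0 and 3 matter.
X = [
    [1, 0, 2, -1, 1],
    [0, 1, -1, 2, 0],
    [2, -1, 0, 1, -2],
    [-1, 2, 1, 0, 1],
    [1, 1, -2, -1, 0],
    [-2, 0, 1, 1, 2],
    [0, -2, -1, 2, -1],
    [1, -1, 1, -2, 1],
]
noise = [0.1, -0.1, 0.05, 0.0, -0.05, 0.1, -0.1, 0.05]
y = [2 * row[0] - 3 * row[3] + e for row, e in zip(X, noise)]

ranked = exhaustive_search(X, y, k=2)
best_rss, best_subset = ranked[0]
print("best subset:", best_subset, "RSS:", round(best_rss, 4))
```

    The full sorted list of scores is a small-scale analogue of the density of states mentioned in the abstract. The number of combinations grows combinatorially with the number of variables, which is why an approximate sampler is needed for large problems.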

  9. Human visual modeling and image deconvolution by linear filtering

    Larminat, P. de; Barba, D.; Gerber, R.; Ronsin, J.

    1978-01-01

    The problem is the numerical restoration of images degraded by passing through a known and spatially invariant linear system, and by the addition of a stationary noise. We propose an improvement of the Wiener filter to allow the restoration of such images. This improvement reduces the main drawbacks of the classical Wiener filter: the voluminous data processing and the lack of consideration of the characteristics of vision that condition the observer's perception of the restored image. In the first section, we describe the structure of the visual detection system and a method for modelling it. In the second section, we explain a restoration method by Wiener filtering that takes the visual properties into account and that can be adapted to the local properties of the image. The results obtained on TV images and scintigrams (images obtained by a gamma camera) are then discussed.

  10. Weibull and lognormal Taguchi analysis using multiple linear regression

    Piña-Monarrez, Manuel R.; Ortiz-Yañez, Jesús F.

    2015-01-01

    The paper provides reliability practitioners with a method (1) to estimate the robust Weibull family when the Taguchi method (TM) is applied, (2) to estimate the normal operational Weibull family in an accelerated life testing (ALT) analysis to give confidence to the extrapolation and (3) to perform the ANOVA analysis on both the robust and the normal operational Weibull family. On the other hand, because the Weibull distribution neither has the normal additive property nor has a direct relationship with the normal parameters (µ, σ), in this paper, the issues of estimating a Weibull family by using a design of experiment (DOE) are first addressed by using an L_9(3^4) orthogonal array (OA) in both the TM and in the Weibull proportional hazard model approach (WPHM). Then, by using the Weibull/Gumbel and the lognormal/normal relationships and multiple linear regression, the direct relationships between the Weibull and the lifetime parameters are derived and used to formulate the proposed method. Moreover, since the derived direct relationships always hold, the method is generalized to the lognormal and ALT analysis. Finally, the method’s efficiency is shown through its application to the used OA and to a set of ALT data. - Highlights: • It gives the statistical relations and steps to use the Taguchi Method (TM) to analyze Weibull data. • It gives the steps to determine the unknown Weibull family to both the robust TM setting and the normal ALT level. • It gives a method to determine the expected lifetimes and to perform its ANOVA analysis in TM and ALT analysis. • It gives a method to give confidence to the extrapolation in an ALT analysis by using the Weibull family of the normal level.

  11. EPMLR: sequence-based linear B-cell epitope prediction method using multiple linear regression.

    Lian, Yao; Ge, Meng; Pan, Xian-Ming

    2014-12-19

    B-cell epitopes have been studied extensively due to their immunological applications, such as peptide-based vaccine development, antibody production, and disease diagnosis and therapy. Despite several decades of research, the accurate prediction of linear B-cell epitopes has remained a challenging task. In this work, based on the antigen's primary sequence information, a novel linear B-cell epitope prediction model was developed using the multiple linear regression (MLR). A 10-fold cross-validation test on a large non-redundant dataset was performed to evaluate the performance of our model. To alleviate the problem caused by the noise of negative dataset, 300 experiments utilizing 300 sub-datasets were performed. We achieved overall sensitivity of 81.8%, precision of 64.1% and area under the receiver operating characteristic curve (AUC) of 0.728. We have presented a reliable method for the identification of linear B cell epitope using antigen's primary sequence information. Moreover, a web server EPMLR has been developed for linear B-cell epitope prediction: http://www.bioinfo.tsinghua.edu.cn/epitope/EPMLR/ .

  12. Filtering Non-Linear Transfer Functions on Surfaces.

    Heitz, Eric; Nowrouzezahrai, Derek; Poulin, Pierre; Neyret, Fabrice

    2014-07-01

    Applying non-linear transfer functions and look-up tables to procedural functions (such as noise), surface attributes, or even surface geometry are common strategies used to enhance visual detail. Their simplicity and ability to mimic a wide range of realistic appearances have led to their adoption in many rendering problems. As with any textured or geometric detail, proper filtering is needed to reduce aliasing when viewed across a range of distances, but accurate and efficient transfer function filtering remains an open problem for several reasons: transfer functions are complex and non-linear, especially when mapped through procedural noise and/or geometry-dependent functions, and the effects of perspective and masking further complicate the filtering over a pixel's footprint. We accurately solve this problem by computing and sampling from specialized filtering distributions on the fly, yielding very fast performance. We investigate the case where the transfer function to filter is a color map applied to (macroscale) surface textures (like noise), as well as color maps applied according to (microscale) geometric details. We introduce a novel representation of a (potentially modulated) color map's distribution over pixel footprints using Gaussian statistics and, in the more complex case of high-resolution color mapped microsurface details, our filtering is view- and light-dependent, and capable of correctly handling masking and occlusion effects. Our approach can be generalized to filter other physical-based rendering quantities. We propose an application to shading with irradiance environment maps over large terrains. Our framework is also compatible with the case of transfer functions used to warp surface geometry, as long as the transformations can be represented with Gaussian statistics, leading to proper view- and light-dependent filtering results. 
Our results match ground truth and our solution is well suited to real-time applications, requires only a few

  13. Implicit collinearity effect in linear regression: Application to basal ...

    Collinearity of predictor variables is a severe problem in the least square regression analysis. It contributes to the instability of regression coefficients and leads to a wrong prediction accuracy. Despite these problems, studies are conducted with a large number of observed and derived variables linked with a response ...

  14. An ensemble Kalman filter for statistical estimation of physics constrained nonlinear regression models

    Harlim, John; Mahdi, Adam; Majda, Andrew J.

    2014-01-01

    A central issue in contemporary science is the development of nonlinear data driven statistical–dynamical models for time series of noisy partial observations from nature or a complex model. It has been established recently that ad-hoc quadratic multi-level regression models can have finite-time blow-up of statistical solutions and/or pathological behavior of their invariant measure. Recently, a new class of physics constrained nonlinear regression models were developed to ameliorate this pathological behavior. Here a new finite ensemble Kalman filtering algorithm is developed for estimating the state, the linear and nonlinear model coefficients, the model and the observation noise covariances from available partial noisy observations of the state. Several stringent tests and applications of the method are developed here. In the most complex application, the perfect model has 57 degrees of freedom involving a zonal (east–west) jet, two topographic Rossby waves, and 54 nonlinearly interacting Rossby waves; the perfect model has significant non-Gaussian statistics in the zonal jet with blocked and unblocked regimes and a non-Gaussian skewed distribution due to interaction with the other 56 modes. We only observe the zonal jet contaminated by noise and apply the ensemble filter algorithm for estimation. Numerically, we find that a three dimensional nonlinear stochastic model with one level of memory mimics the statistical effect of the other 56 modes on the zonal jet in an accurate fashion, including the skew non-Gaussian distribution and autocorrelation decay. On the other hand, a similar stochastic model with zero memory levels fails to capture the crucial non-Gaussian behavior of the zonal jet from the perfect 57-mode model

  15. Least Squares Adjustment: Linear and Nonlinear Weighted Regression Analysis

    Nielsen, Allan Aasbjerg

    2007-01-01

    This note primarily describes the mathematics of least squares regression analysis as it is often used in geodesy including land surveying and satellite positioning applications. In these fields regression is often termed adjustment. The note also contains a couple of typical land surveying...... and satellite positioning application examples. In these application areas we are typically interested in the parameters in the model typically 2- or 3-D positions and not in predictive modelling which is often the main concern in other regression analysis applications. Adjustment is often used to obtain...... the clock error) and to obtain estimates of the uncertainty with which the position is determined. Regression analysis is used in many other fields of application both in the natural, the technical and the social sciences. Examples may be curve fitting, calibration, establishing relationships between...

  16. Railway Crossing Risk Area Detection Using Linear Regression and Terrain Drop Compensation Techniques

    Chen, Wen-Yuan; Wang, Mei; Fu, Zhou-Xing

    2014-01-01

    Most railway accidents happen at railway crossings. Therefore, how to detect humans or objects present in the risk area of a railway crossing and thus prevent accidents are important tasks. In this paper, three strategies are used to detect the risk area of a railway crossing: (1) we use a terrain drop compensation (TDC) technique to solve the problem of the concavity of railway crossings; (2) we use a linear regression technique to predict the position and length of an object from image processing; (3) we have developed a novel strategy called calculating local maximum Y-coordinate object points (CLMYOP) to obtain the ground points of the object. In addition, image preprocessing is also applied to filter out the noise and successfully improve the object detection. From the experimental results, it is demonstrated that our scheme is an effective and corrective method for the detection of railway crossing risk areas. PMID:24936948

  17. Railway Crossing Risk Area Detection Using Linear Regression and Terrain Drop Compensation Techniques

    Wen-Yuan Chen

    2014-06-01

    Full Text Available Most railway accidents happen at railway crossings. Therefore, how to detect humans or objects present in the risk area of a railway crossing and thus prevent accidents are important tasks. In this paper, three strategies are used to detect the risk area of a railway crossing: (1) we use a terrain drop compensation (TDC) technique to solve the problem of the concavity of railway crossings; (2) we use a linear regression technique to predict the position and length of an object from image processing; (3) we have developed a novel strategy called calculating local maximum Y-coordinate object points (CLMYOP) to obtain the ground points of the object. In addition, image preprocessing is also applied to filter out the noise and successfully improve the object detection. From the experimental results, it is demonstrated that our scheme is an effective and corrective method for the detection of railway crossing risk areas.

  18. Linear regression models for quantitative assessment of left ...

    Changes in left ventricular structures and function have been reported in cardiomyopathies. No prediction models have been established in this environment. This study established regression models for prediction of left ventricular structures in normal subjects. A sample of normal subjects was drawn from a large urban ...

  19. Linearity and Misspecification Tests for Vector Smooth Transition Regression Models

    Teräsvirta, Timo; Yang, Yukai

    The purpose of the paper is to derive Lagrange multiplier and Lagrange multiplier type specification and misspecification tests for vector smooth transition regression models. We report results from simulation studies in which the size and power properties of the proposed asymptotic tests in small...

  20. Using multiple linear regression techniques to quantify carbon ...

    Fallow ecosystems provide a significant carbon stock that can be quantified for inclusion in the accounts of global carbon budgets. Process and statistical models of productivity, though useful, are often technically rigid as the conditions for their application are not easy to satisfy. Multiple regression techniques have been ...

  1. Interpreting Multiple Linear Regression: A Guidebook of Variable Importance

    Nathans, Laura L.; Oswald, Frederick L.; Nimon, Kim

    2012-01-01

    Multiple regression (MR) analyses are commonly employed in social science fields. It is also common for interpretation of results to typically reflect overreliance on beta weights, often resulting in very limited interpretations of variable importance. It appears that few researchers employ other methods to obtain a fuller understanding of what…

  2. Testing for marginal linear effects in quantile regression

    Wang, Huixia Judy

    2017-10-23

    The paper develops a new marginal testing procedure to detect significant predictors that are associated with the conditional quantiles of a scalar response. The idea is to fit the marginal quantile regression on each predictor one at a time, and then to base the test on the t-statistics that are associated with the most predictive predictors. A resampling method is devised to calibrate this test statistic, which has non-regular limiting behaviour due to the selection of the most predictive variables. Asymptotic validity of the procedure is established in a general quantile regression setting in which the marginal quantile regression models can be misspecified. Even though a fixed dimension is assumed to derive the asymptotic results, the test proposed is applicable and computationally feasible for large dimensional predictors. The method is more flexible than existing marginal screening test methods based on mean regression and has the added advantage of being robust against outliers in the response. The approach is illustrated by using an application to a human immunodeficiency virus drug resistance data set.

  3. Testing for marginal linear effects in quantile regression

    Wang, Huixia Judy; McKeague, Ian W.; Qian, Min

    2017-01-01

    The paper develops a new marginal testing procedure to detect significant predictors that are associated with the conditional quantiles of a scalar response. The idea is to fit the marginal quantile regression on each predictor one at a time, and then to base the test on the t-statistics that are associated with the most predictive predictors. A resampling method is devised to calibrate this test statistic, which has non-regular limiting behaviour due to the selection of the most predictive variables. Asymptotic validity of the procedure is established in a general quantile regression setting in which the marginal quantile regression models can be misspecified. Even though a fixed dimension is assumed to derive the asymptotic results, the test proposed is applicable and computationally feasible for large dimensional predictors. The method is more flexible than existing marginal screening test methods based on mean regression and has the added advantage of being robust against outliers in the response. The approach is illustrated by using an application to a human immunodeficiency virus drug resistance data set.

  4. Generalised Partially Linear Regression with Misclassified Data and an Application to Labour Market Transitions

    Dlugosz, Stephan; Mammen, Enno; Wilke, Ralf

    We consider the semiparametric generalised linear regression model which has mainstream empirical models such as the (partially) linear mean regression, logistic and multinomial regression as special cases. As an extension to related literature we allow a misclassified covariate to be interacted...

  5. Comparison between Linear and Nonlinear Regression in a Laboratory Heat Transfer Experiment

    Gonçalves, Carine Messias; Schwaab, Marcio; Pinto, José Carlos

    2013-01-01

    In order to interpret laboratory experimental data, undergraduate students are used to perform linear regression through linearized versions of nonlinear models. However, the use of linearized models can lead to statistically biased parameter estimates. Even so, it is not an easy task to introduce nonlinear regression and show for the students…

  6. Variable selection in multiple linear regression: The influence of ...

    provide an indication of whether the fit of the selected model improves or ... and calculate M(−i); quantify the influence of case i in terms of a function, f(•), of M and ..... [21] Venter JH & Snyman JLJ, 1997, Linear model selection based on risk ...

  7. An introduction to using Bayesian linear regression with clinical data.

    Baldwin, Scott A; Larson, Michael J

    2017-11-01

    Statistical training in psychology focuses on frequentist methods. Bayesian methods are an alternative to standard frequentist methods. This article provides researchers with an introduction to fundamental ideas in Bayesian modeling. We use data from an electroencephalogram (EEG) and anxiety study to illustrate Bayesian models. Specifically, the models examine the relationship between error-related negativity (ERN), a particular event-related potential, and trait anxiety. Methodological topics covered include: how to set up a regression model in a Bayesian framework, specifying priors, examining convergence of the model, visualizing and interpreting posterior distributions, interval estimates, expected and predicted values, and model comparison tools. We also discuss situations where Bayesian methods can outperform frequentist methods as well as how to specify more complicated regression models. Finally, we conclude with recommendations about reporting guidelines for those using Bayesian methods in their own research. We provide data and R code for replicating our analyses. Copyright © 2017 Elsevier Ltd. All rights reserved.

  8. Electricity consumption forecasting in Italy using linear regression models

    Bianco, Vincenzo; Manca, Oronzio; Nardini, Sergio [DIAM, Seconda Universita degli Studi di Napoli, Via Roma 29, 81031 Aversa (CE) (Italy)

    2009-09-15

    The influence of economic and demographic variables on the annual electricity consumption in Italy has been investigated with the intention to develop a long-term consumption forecasting model. The time period considered for the historical data is from 1970 to 2007. Different regression models were developed, using historical electricity consumption, gross domestic product (GDP), gross domestic product per capita (GDP per capita) and population. A first part of the paper considers the estimation of GDP, price and GDP per capita elasticities of domestic and non-domestic electricity consumption. The domestic and non-domestic short run price elasticities are found to be both approximately equal to -0.06, while long run elasticities are equal to -0.24 and -0.09, respectively. On the contrary, the elasticities of GDP and GDP per capita present higher values. In the second part of the paper, different regression models, based on co-integrated or stationary data, are presented. Different statistical tests are employed to check the validity of the proposed models. A comparison with national forecasts, based on complex econometric models, such as Markal-Time, was performed, showing that the developed regressions are congruent with the official projections, with deviations of ±1% for the best case and ±11% for the worst. These deviations are to be considered acceptable in relation to the time span taken into account. (author)

  9. Electricity consumption forecasting in Italy using linear regression models

    Bianco, Vincenzo; Manca, Oronzio; Nardini, Sergio

    2009-01-01

    The influence of economic and demographic variables on the annual electricity consumption in Italy has been investigated with the intention to develop a long-term consumption forecasting model. The time period considered for the historical data is from 1970 to 2007. Different regression models were developed, using historical electricity consumption, gross domestic product (GDP), gross domestic product per capita (GDP per capita) and population. A first part of the paper considers the estimation of GDP, price and GDP per capita elasticities of domestic and non-domestic electricity consumption. The domestic and non-domestic short run price elasticities are found to be both approximately equal to -0.06, while long run elasticities are equal to -0.24 and -0.09, respectively. On the contrary, the elasticities of GDP and GDP per capita present higher values. In the second part of the paper, different regression models, based on co-integrated or stationary data, are presented. Different statistical tests are employed to check the validity of the proposed models. A comparison with national forecasts, based on complex econometric models, such as Markal-Time, was performed, showing that the developed regressions are congruent with the official projections, with deviations of ±1% for the best case and ±11% for the worst. These deviations are to be considered acceptable in relation to the time span taken into account. (author)

  10. Relative Importance for Linear Regression in R: The Package relaimpo

    Groemping, Ulrike

    2006-01-01

    Relative importance is a topic that has seen a lot of interest in recent years, particularly in applied work. The R package relaimpo implements six different metrics for assessing relative importance of regressors in the linear model, two of which are recommended - averaging over orderings of regressors and a newly proposed metric (Feldman 2005) called pmvd. Apart from delivering the metrics themselves, relaimpo also provides (exploratory) bootstrap confidence intervals. This paper offers a b...

  11. Implementation of non-linear filters for iterative penalized maximum likelihood image reconstruction

    Liang, Z.; Gilland, D.; Jaszczak, R.; Coleman, R.

    1990-01-01

    In this paper, the authors report on the implementation of six edge-preserving, noise-smoothing, non-linear filters applied in image space for iterative penalized maximum-likelihood (ML) SPECT image reconstruction. The non-linear smoothing filters implemented were the median filter, the E 6 filter, the sigma filter, the edge-line filter, the gradient-inverse filter, and the 3-point edge filter with gradient-inverse weight. A 3 x 3 window was used for all these filters. The best image obtained, by viewing the profiles through the image in terms of noise-smoothing, edge-sharpening, and contrast, was the one smoothed with the 3-point edge filter. The computation time for the smoothing was less than 1% of one iteration, and the memory space for the smoothing was negligible. These images were compared with the results obtained using Bayesian analysis.

  12. Non-linear auto-regressive models for cross-frequency coupling in neural time series

    Tallot, Lucille; Grabot, Laetitia; Doyère, Valérie; Grenier, Yves; Gramfort, Alexandre

    2017-01-01

    We address the issue of reliably detecting and quantifying cross-frequency coupling (CFC) in neural time series. Based on non-linear auto-regressive models, the proposed method provides a generative and parametric model of the time-varying spectral content of the signals. As this method models the entire spectrum simultaneously, it avoids the pitfalls related to incorrect filtering or the use of the Hilbert transform on wide-band signals. As the model is probabilistic, it also provides a score of the model “goodness of fit” via the likelihood, enabling easy and legitimate model selection and parameter comparison; this data-driven feature is unique to our model-based approach. Using three datasets obtained with invasive neurophysiological recordings in humans and rodents, we demonstrate that these models are able to replicate previous results obtained with other metrics, but also reveal new insights such as the influence of the amplitude of the slow oscillation. Using simulations, we demonstrate that our parametric method can reveal neural couplings with shorter signals than non-parametric methods. We also show how the likelihood can be used to find optimal filtering parameters, suggesting new properties on the spectrum of the driving signal, but also to estimate the optimal delay between the coupled signals, enabling a directionality estimation in the coupling. PMID:29227989

  13. The microcomputer scientific software series 2: general linear model--regression.

    Harold M. Rauscher

    1983-01-01

    The general linear model regression (GLMR) program provides the microcomputer user with a sophisticated regression analysis capability. The output provides a regression ANOVA table, estimators of the regression model coefficients, their confidence intervals, confidence intervals around the predicted Y-values, residuals for plotting, a check for multicollinearity, a...

  14. Least-Squares Linear Regression and Schrodinger's Cat: Perspectives on the Analysis of Regression Residuals.

    Hecht, Jeffrey B.

    The analysis of regression residuals and detection of outliers are discussed, with emphasis on determining how deviant an individual data point must be to be considered an outlier and the impact that multiple suspected outlier data points have on the process of outlier determination and treatment. Only bivariate (one dependent and one independent)…

  15. Two biased estimation techniques in linear regression: Application to aircraft

    Klein, Vladislav

    1988-01-01

    Several ways for detection and assessment of collinearity in measured data are discussed. Because data collinearity usually results in poor least squares estimates, two estimation techniques which can limit a damaging effect of collinearity are presented. These two techniques, the principal components regression and mixed estimation, belong to a class of biased estimation techniques. Detection and assessment of data collinearity and the two biased estimation techniques are demonstrated in two examples using flight test data from longitudinal maneuvers of an experimental aircraft. The eigensystem analysis and parameter variance decomposition appeared to be a promising tool for collinearity evaluation. The biased estimators had far better accuracy than the results from the ordinary least squares technique.

  16. Detection of epistatic effects with logic regression and a classical linear regression model.

    Malina, Magdalena; Ickstadt, Katja; Schwender, Holger; Posch, Martin; Bogdan, Małgorzata

    2014-02-01

    To locate multiple interacting quantitative trait loci (QTL) influencing a trait of interest within experimental populations, usually methods as the Cockerham's model are applied. Within this framework, interactions are understood as the part of the joined effect of several genes which cannot be explained as the sum of their additive effects. However, if a change in the phenotype (as disease) is caused by Boolean combinations of genotypes of several QTLs, this Cockerham's approach is often not capable to identify them properly. To detect such interactions more efficiently, we propose a logic regression framework. Even though with the logic regression approach a larger number of models has to be considered (requiring more stringent multiple testing correction) the efficient representation of higher order logic interactions in logic regression models leads to a significant increase of power to detect such interactions as compared to a Cockerham's approach. The increase in power is demonstrated analytically for a simple two-way interaction model and illustrated in more complex settings with simulation study and real data analysis.

  17. A simplified procedure of linear regression in a preliminary analysis

    Silvia Facchinetti

    2013-05-01

    Full Text Available The analysis of a large statistical data-set can be led by the study of a particularly interesting variable Y (the regressed variable) and an explicative variable X, chosen among the remaining variables, conjointly observed. The study gives a simplified procedure to obtain the functional link of the variables y=y(x) by a partition of the data-set into m subsets, in which the observations are synthesized by location indices (mean or median) of X and Y. Polynomial models for y(x) of order r are considered to verify the characteristics of the given procedure; in particular we assume r = 1 and 2. The distributions of the parameter estimators are obtained by simulation, when the fitting is done for m = r + 1. Comparisons of the results, in terms of distribution and efficiency, are made with the results obtained by the ordinary least squares method. The study also gives some considerations on the consistency of the estimated parameters obtained by the given procedure.
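
    A minimal sketch of the procedure for the simplest case, r = 1 and m = r + 1 = 2, is given below. It assumes the partition is made by sorting the observations on X and splitting them into a lower and an upper half; the abstract does not specify the partition rule, so this choice, like the function name, is hypothetical. Each subset is reduced to one point by a location index, and the line through the two points gives the parameter estimates.

```python
from statistics import mean, median


def two_group_line(x, y, location=mean):
    """Fit y = a + b*x through the location points of two subsets.

    The observations are sorted by x and split into a lower and an upper
    half (an assumed partition rule). Each half is summarised by `location`
    (mean or median) applied separately to its x values and its y values.
    Returns (intercept, slope).
    """
    pairs = sorted(zip(x, y))
    half = len(pairs) // 2
    lower, upper = pairs[:half], pairs[half:]
    x1 = location([p[0] for p in lower])
    y1 = location([p[1] for p in lower])
    x2 = location([p[0] for p in upper])
    y2 = location([p[1] for p in upper])
    slope = (y2 - y1) / (x2 - x1)
    return y1 - slope * x1, slope


# Synthetic illustration only: noise-free points on y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.0 + 2.0 * v for v in xs]
print(two_group_line(xs, ys))
print(two_group_line(xs, ys, location=median))
```

    With m = r + 1 the polynomial interpolates the subset summaries exactly, so no minimisation is needed; the cost is the loss of efficiency relative to ordinary least squares that the abstract sets out to quantify.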

  18. Identifying predictors of physics item difficulty: A linear regression approach

    Mesic, Vanes; Muratovic, Hasnija

    2011-06-01

    Large-scale assessments of student achievement in physics are often approached with an intention to discriminate students based on the attained level of their physics competencies. Therefore, for purposes of test design, it is important that items display an acceptable discriminatory behavior. To that end, it is recommended to avoid extraordinary difficult and very easy items. Knowing the factors that influence physics item difficulty makes it possible to model the item difficulty even before the first pilot study is conducted. Thus, by identifying predictors of physics item difficulty, we can improve the test-design process. Furthermore, we get additional qualitative feedback regarding the basic aspects of student cognitive achievement in physics that are directly responsible for the obtained, quantitative test results. In this study, we conducted a secondary analysis of data that came from two large-scale assessments of student physics achievement at the end of compulsory education in Bosnia and Herzegovina. Foremost, we explored the concept of “physics competence” and performed a content analysis of 123 physics items that were included within the above-mentioned assessments. Thereafter, an item database was created. Items were described by variables which reflect some basic cognitive aspects of physics competence. For each of the assessments, Rasch item difficulties were calculated in separate analyses. In order to make the item difficulties from different assessments comparable, a virtual test equating procedure had to be implemented. Finally, a regression model of physics item difficulty was created. It has been shown that 61.2% of item difficulty variance can be explained by factors which reflect the automaticity, complexity, and modality of the knowledge structure that is relevant for generating the most probable correct solution, as well as by the divergence of required thinking and interference effects between intuitive and formal physics knowledge

  19. Identifying predictors of physics item difficulty: A linear regression approach

    Hasnija Muratovic

    2011-06-01

    Full Text Available Large-scale assessments of student achievement in physics are often approached with an intention to discriminate students based on the attained level of their physics competencies. Therefore, for purposes of test design, it is important that items display an acceptable discriminatory behavior. To that end, it is recommended to avoid extraordinary difficult and very easy items. Knowing the factors that influence physics item difficulty makes it possible to model the item difficulty even before the first pilot study is conducted. Thus, by identifying predictors of physics item difficulty, we can improve the test-design process. Furthermore, we get additional qualitative feedback regarding the basic aspects of student cognitive achievement in physics that are directly responsible for the obtained, quantitative test results. In this study, we conducted a secondary analysis of data that came from two large-scale assessments of student physics achievement at the end of compulsory education in Bosnia and Herzegovina. Foremost, we explored the concept of “physics competence” and performed a content analysis of 123 physics items that were included within the above-mentioned assessments. Thereafter, an item database was created. Items were described by variables which reflect some basic cognitive aspects of physics competence. For each of the assessments, Rasch item difficulties were calculated in separate analyses. In order to make the item difficulties from different assessments comparable, a virtual test equating procedure had to be implemented. Finally, a regression model of physics item difficulty was created. It has been shown that 61.2% of item difficulty variance can be explained by factors which reflect the automaticity, complexity, and modality of the knowledge structure that is relevant for generating the most probable correct solution, as well as by the divergence of required thinking and interference effects between intuitive and formal

  20. Noise Reduction of Measurement Data using Linear Digital Filters

    Hitzmann B.

    2007-12-01

    Full Text Available In this paper Butterworth, Chebyshev (Type I and II) and Elliptic digital filters are designed for signal noise reduction. On-line data measurements of substrate concentration from E. coli fed-batch cultivation process are used. Application of the designed filters leads to a successful noise reduction of on-line glucose measurements. The digital filters presented here are simple, easy to implement and effective - the used filters allow for a smart compromise between signal information and noise corruption.

  1. Multivariate Linear Regression and CART Regression Analysis of TBM Performance at Abu Hamour Phase-I Tunnel

    Jakubowski, J.; Stypulkowski, J. B.; Bernardeau, F. G.

    2017-12-01

    The first phase of the Abu Hamour drainage and storm tunnel was completed in early 2017. The 9.5 km long, 3.7 m diameter tunnel was excavated with two Earth Pressure Balance (EPB) Tunnel Boring Machines from Herrenknecht. TBM operation processes were monitored and recorded by Data Acquisition and Evaluation System. The authors coupled collected TBM drive data with available information on rock mass properties, cleansed, completed with secondary variables and aggregated by weeks and shifts. Correlations and descriptive statistics charts were examined. Multivariate Linear Regression and CART regression tree models linking TBM penetration rate (PR), penetration per revolution (PPR) and field penetration index (FPI) with TBM operational and geotechnical characteristics were performed for the conditions of the weak/soft rock of Doha. Both regression methods are interpretable and the data were screened with different computational approaches allowing enriched insight. The primary goal of the analysis was to investigate empirical relations between multiple explanatory and responding variables, to search for best subsets of explanatory variables and to evaluate the strength of linear and non-linear relations. For each of the penetration indices, a predictive model coupling both regression methods was built and validated. The resultant models appeared to be stronger than constituent ones and indicated an opportunity for more accurate and robust TBM performance predictions.

  2. Time-resolved flow reconstruction with indirect measurements using regression models and Kalman-filtered POD ROM

    Leroux, Romain; Chatellier, Ludovic; David, Laurent

    2018-01-01

    This article is devoted to the estimation of time-resolved particle image velocimetry (TR-PIV) flow fields using time-resolved point measurements of a voltage signal obtained by hot-film anemometry. A multiple linear regression model is first defined to map the TR-PIV flow fields onto the voltage signal. Due to the high temporal resolution of the signal acquired by the hot-film sensor, the estimates of the TR-PIV flow fields are obtained with a multiple linear regression method called orthonormalized partial least squares regression (OPLSR). Subsequently, this model is incorporated as the observation equation in an ensemble Kalman filter (EnKF) applied on a proper orthogonal decomposition reduced-order model to stabilize it while reducing the effects of the hot-film sensor noise. This method is assessed for the reconstruction of the flow around a NACA0012 airfoil at a Reynolds number of 1000 and an angle of attack of 20°. Comparisons with multi-time delay-modified linear stochastic estimation show that both the OPLSR and EnKF combined with OPLSR are more accurate as they produce a much lower relative estimation error, and provide a faithful reconstruction of the time evolution of the velocity flow fields.

  3. Comparison between linear and non-parametric regression models for genome-enabled prediction in wheat.

    Pérez-Rodríguez, Paulino; Gianola, Daniel; González-Camacho, Juan Manuel; Crossa, José; Manès, Yann; Dreisigacker, Susanne

    2012-12-01

    In genome-enabled prediction, parametric, semi-parametric, and non-parametric regression models have been used. This study assessed the predictive ability of linear and non-linear models using dense molecular markers. The linear models were linear on marker effects and included the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B. The non-linear models (this refers to non-linearity on markers) were reproducing kernel Hilbert space (RKHS) regression, Bayesian regularized neural networks (BRNN), and radial basis function neural networks (RBFNN). These statistical models were compared using 306 elite wheat lines from CIMMYT genotyped with 1717 diversity array technology (DArT) markers and two traits, days to heading (DTH) and grain yield (GY), measured in each of 12 environments. It was found that the three non-linear models had better overall prediction accuracy than the linear regression specification. Results showed a consistent superiority of RKHS and RBFNN over the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B models.

  4. Trend Estimation and Regression Analysis in Climatological Time Series: An Application of Structural Time Series Models and the Kalman Filter.

    Visser, H.; Molenaar, J.

    1995-05-01

    The detection of trends in climatological data has become central to the discussion on climate change due to the enhanced greenhouse effect. To prove detection, a method is needed (i) to make inferences on significant rises or declines in trends, (ii) to take into account natural variability in climate series, and (iii) to compare output from GCMs with the trends in observed climate data. To meet these requirements, flexible mathematical tools are needed. A structural time series model is proposed with which a stochastic trend, a deterministic trend, and regression coefficients can be estimated simultaneously. The stochastic trend component is described using the class of ARIMA models. The regression component is assumed to be linear. However, the regression coefficients corresponding with the explanatory variables may be time dependent to validate this assumption. The mathematical technique used to estimate this trend-regression model is the Kalman filter. The main features of the filter are discussed. Examples of trend estimation are given using annual mean temperatures at a single station in the Netherlands (1706-1990) and annual mean temperatures at Northern Hemisphere land stations (1851-1990). The inclusion of explanatory variables is shown by regressing the latter temperature series on four variables: Southern Oscillation index (SOI), volcanic dust index (VDI), sunspot numbers (SSN), and a simulated temperature signal, induced by increasing greenhouse gases (GHG). In all analyses, the influence of SSN on global temperatures is found to be negligible. The correlations between temperatures and SOI and VDI appear to be negative. For SOI, this correlation is significant, but for VDI it is not, probably because of a lack of volcanic eruptions during the sample period. The relation between temperatures and GHG is positive, which is in agreement with the hypothesis of a warming climate because of increasing levels of greenhouse gases. The prediction performance of

  5. Finite-Time H∞ Filtering for Linear Continuous Time-Varying Systems with Uncertain Observations

    Huihong Zhao

    2012-01-01

    Full Text Available This paper is concerned with the finite-time H∞ filtering problem for linear continuous time-varying systems with uncertain observations and ℒ2-norm bounded noise. The design of finite-time H∞ filter is equivalent to the problem that a certain indefinite quadratic form has a minimum and the filter is such that the minimum is positive. The quadratic form is related to a Krein state-space model according to the Krein space linear estimation theory. By using the projection theory in Krein space, the finite-time H∞ filtering problem is solved. A numerical example is given to illustrate the performance of the H∞ filter.

  6. SOME STATISTICAL ISSUES RELATED TO MULTIPLE LINEAR REGRESSION MODELING OF BEACH BACTERIA CONCENTRATIONS

    As a fast and effective technique, the multiple linear regression (MLR) method has been widely used in modeling and prediction of beach bacteria concentrations. Among previous works on this subject, however, several issues were insufficiently or inconsistently addressed. Those is...

  7. Predicting Fuel Ignition Quality Using 1H NMR Spectroscopy and Multiple Linear Regression

    Abdul Jameel, Abdul Gani; Naser, Nimal; Emwas, Abdul-Hamid M.; Dooley, Stephen; Sarathy, Mani

    2016-01-01

    An improved model for the prediction of ignition quality of hydrocarbon fuels has been developed using 1H nuclear magnetic resonance (NMR) spectroscopy and multiple linear regression (MLR) modeling. Cetane number (CN) and derived cetane number (DCN

  8. How to deal with continuous and dichotomic outcomes in epidemiological research: linear and logistic regression analyses

    Tripepi, Giovanni; Jager, Kitty J.; Stel, Vianda S.; Dekker, Friedo W.; Zoccali, Carmine

    2011-01-01

    Because of some limitations of stratification methods, epidemiologists frequently use multiple linear and logistic regression analyses to address specific epidemiological questions. If the dependent variable is a continuous one (for example, systolic pressure and serum creatinine), the researcher

  9. Analysis of γ spectra in airborne radioactivity measurements using multiple linear regressions

    Bao Min; Shi Quanlin; Zhang Jiamei

    2004-01-01

    This paper describes the calculation of the net peak counts of the nuclide 137Cs at 662 keV in γ spectra from airborne radioactivity measurements using multiple linear regression. A mathematical model is built by analyzing every factor that contributes to the Cs peak counts in the spectra, and a multiple linear regression function is established. The calculation uses stepwise regression, and insignificant factors are eliminated by an F test. The regression results and their uncertainties are calculated by least squares estimation, from which the Cs net peak counts and their uncertainty are obtained. Analysis results for an experimental spectrum are presented, and the influence of energy shift and energy resolution on the results is discussed. Compared with the spectrum stripping method, the multiple linear regression method does not need stripping ratios, its result depends only on the counts in the Cs peak, and the calculated uncertainty is reduced. (authors)

  10. Robust output feedback H-infinity control and filtering for uncertain linear systems

    Chang, Xiao-Heng

    2014-01-01

    "Robust Output Feedback H-infinity Control and Filtering for Uncertain Linear Systems" discusses new and meaningful findings on robust output feedback H-infinity control and filtering for uncertain linear systems, presenting a number of useful and less conservative design results based on the linear matrix inequality (LMI) technique. Though primarily intended for graduate students in control and filtering, the book can also serve as a valuable reference work for researchers wishing to explore the area of robust H-infinity control and filtering of uncertain systems. Dr. Xiao-Heng Chang is a Professor at the College of Engineering, Bohai University, China.

  11. Applications of Kalman filters based on non-linear functions to numerical weather predictions

    G. Galanis

    2006-10-01

    Full Text Available This paper investigates the use of non-linear functions in classical Kalman filter algorithms on the improvement of regional weather forecasts. The main aim is the implementation of non linear polynomial mappings in a usual linear Kalman filter in order to simulate better non linear problems in numerical weather prediction. In addition, the optimal order of the polynomials applied for such a filter is identified. This work is based on observations and corresponding numerical weather predictions of two meteorological parameters characterized by essential differences in their evolution in time, namely, air temperature and wind speed. It is shown that in both cases, a polynomial of low order is adequate for eliminating any systematic error, while higher order functions lead to instabilities in the filtered results having, at the same time, trivial contribution to the sensitivity of the filter. It is further demonstrated that the filter is independent of the time period and the geographic location of application.

  12. Applications of Kalman filters based on non-linear functions to numerical weather predictions

    G. Galanis

    2006-10-01

    Full Text Available This paper investigates the use of non-linear functions in classical Kalman filter algorithms on the improvement of regional weather forecasts. The main aim is the implementation of non linear polynomial mappings in a usual linear Kalman filter in order to simulate better non linear problems in numerical weather prediction. In addition, the optimal order of the polynomials applied for such a filter is identified. This work is based on observations and corresponding numerical weather predictions of two meteorological parameters characterized by essential differences in their evolution in time, namely, air temperature and wind speed. It is shown that in both cases, a polynomial of low order is adequate for eliminating any systematic error, while higher order functions lead to instabilities in the filtered results having, at the same time, trivial contribution to the sensitivity of the filter. It is further demonstrated that the filter is independent of the time period and the geographic location of application.

  13. Do clinical and translational science graduate students understand linear regression? Development and early validation of the REGRESS quiz.

    Enders, Felicity

    2013-12-01

    Although regression is widely used for reading and publishing in the medical literature, no instruments were previously available to assess students' understanding. The goal of this study was to design and assess such an instrument for graduate students in Clinical and Translational Science and Public Health. A 27-item REsearch on Global Regression Expectations in StatisticS (REGRESS) quiz was developed through an iterative process. Consenting students taking a course on linear regression in a Clinical and Translational Science program completed the quiz pre- and postcourse. Student results were compared to practicing statisticians with a master's or doctoral degree in statistics or a closely related field. Fifty-two students responded precourse, 59 postcourse, and 22 practicing statisticians completed the quiz. The mean (SD) score was 9.3 (4.3) for students precourse and 19.0 (3.5) postcourse (P REGRESS quiz was internally reliable (Cronbach's alpha 0.89). The initial validation is quite promising with statistically significant and meaningful differences across time and study populations. Further work is needed to validate the quiz across multiple institutions. © 2013 Wiley Periodicals, Inc.

  14. Improving sub-pixel imperviousness change prediction by ensembling heterogeneous non-linear regression models

    Drzewiecki Wojciech

    2016-12-01

    Full Text Available In this work nine non-linear regression models were compared for sub-pixel impervious surface area mapping from Landsat images. The comparison was done in three study areas both for accuracy of imperviousness coverage evaluation in individual points in time and accuracy of imperviousness change assessment. The performance of individual machine learning algorithms (Cubist, Random Forest, stochastic gradient boosting of regression trees, k-nearest neighbors regression, random k-nearest neighbors regression, Multivariate Adaptive Regression Splines, averaged neural networks, and support vector machines with polynomial and radial kernels) was also compared with the performance of heterogeneous model ensembles constructed from the best models trained using particular techniques.

  15. A Technique of Fuzzy C-Mean in Multiple Linear Regression Model toward Paddy Yield

    Syazwan Wahab, Nur; Saifullah Rusiman, Mohd; Mohamad, Mahathir; Amira Azmi, Nur; Che Him, Norziha; Ghazali Kamardan, M.; Ali, Maselan

    2018-04-01

    In this paper, we propose a hybrid model which is a combination of a multiple linear regression model and the fuzzy c-means method. This research involved a relationship between 20 variates of the top soil that are analyzed prior to planting of paddy yields at standard fertilizer rates. Data used were from the multi-location trials for rice carried out by MARDI at major paddy granaries in Peninsular Malaysia during the period from 2009 to 2012. Missing observations were estimated using mean estimation techniques. The data were analyzed using a multiple linear regression model and a combination of the multiple linear regression model and the fuzzy c-means method. Analyses of normality and multicollinearity indicate that the data are normally scattered without multicollinearity among independent variables. Fuzzy c-means analysis clusters the paddy yield into two clusters before the multiple linear regression model is applied. The comparison between the two methods indicates that the hybrid of the multiple linear regression model and the fuzzy c-means method outperforms the multiple linear regression model, with a lower mean square error.

  16. A simple linear regression method for quantitative trait loci linkage analysis with censored observations.

    Anderson, Carl A; McRae, Allan F; Visscher, Peter M

    2006-07-01

    Standard quantitative trait loci (QTL) mapping techniques commonly assume that the trait is both fully observed and normally distributed. When considering survival or age-at-onset traits these assumptions are often incorrect. Methods have been developed to map QTL for survival traits; however, they are both computationally intensive and not available in standard genome analysis software packages. We propose a grouped linear regression method for the analysis of continuous survival data. Using simulation we compare this method to both the Cox and Weibull proportional hazards models and a standard linear regression method that ignores censoring. The grouped linear regression method is of equivalent power to both the Cox and Weibull proportional hazards methods and is significantly better than the standard linear regression method when censored observations are present. The method is also robust to the proportion of censored individuals and the underlying distribution of the trait. On the basis of linear regression methodology, the grouped linear regression model is computationally simple and fast and can be implemented readily in freely available statistical software.

  17. Transmission of linear regression patterns between time series: from relationship in time series to complex networks.

    Gao, Xiangyun; An, Haizhong; Fang, Wei; Huang, Xuan; Li, Huajiao; Zhong, Weiqiong; Ding, Yinghui

    2014-07-01

    The linear regression parameters between two time series can be different under different lengths of observation period. If we study the whole period by the sliding window of a short period, the change of the linear regression parameters is a process of dynamic transmission over time. We tackle fundamental research that presents a simple and efficient computational scheme: a linear regression patterns transmission algorithm, which transforms linear regression patterns into directed and weighted networks. The linear regression patterns (nodes) are defined by the combination of intervals of the linear regression parameters and the results of the significance testing under different sizes of the sliding window. The transmissions between adjacent patterns are defined as edges, and the weights of the edges are the frequency of the transmissions. The major patterns, the distance, and the medium in the process of the transmission can be captured. The statistical results of weighted out-degree and betweenness centrality are mapped on timelines, which shows the features of the distribution of the results. Many measurements in different areas that involve two related time series variables could take advantage of this algorithm to characterize the dynamic relationships between the time series from a new perspective.
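
    The algorithm described above can be sketched as follows, under simplifying assumptions. A pattern is defined here only by the interval into which the window's regression slope falls; the paper also combines intervals of the other regression parameters with significance-test results, which this sketch omits. The function names, the interval width and the synthetic series are hypothetical.

```python
import math
from collections import Counter


def ols_slope(x, y):
    """Least-squares slope of y on x (x must not be constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sxx


def slope_pattern(slope, width):
    """Label a slope by the index k of its interval [k*width, (k+1)*width)."""
    return math.floor(slope / width)


def transmission_network(x, y, window, width=0.6):
    """Return (patterns, edges) for a window sliding one step at a time.

    `patterns` lists the pattern label of each window in time order.
    `edges` maps (from_pattern, to_pattern) to the number of times that
    transition occurs between adjacent windows, i.e. the edge weight of a
    directed, weighted network whose nodes are the patterns.
    """
    patterns = [
        slope_pattern(ols_slope(x[i:i + window], y[i:i + window]), width)
        for i in range(len(x) - window + 1)
    ]
    edges = Counter(zip(patterns, patterns[1:]))
    return patterns, edges


# Synthetic illustration only: the slope of y on x changes from 1 to 2.
xs = [float(i) for i in range(10)]
ys = [v if v <= 4 else 4.0 + 2.0 * (v - 4.0) for v in xs]
patterns, edges = transmission_network(xs, ys, window=3)
print(patterns)
print(dict(edges))
```

    Self-loops such as a pattern followed by itself are kept as edges here; weighted out-degree and betweenness centrality, which the abstract maps onto timelines, could then be computed from `edges` with any graph library.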

  18. The number of subjects per variable required in linear regression analyses

    P.C. Austin (Peter); E.W. Steyerberg (Ewout)

    2015-01-01

    Objectives: To determine the number of independent variables that can be included in a linear regression model. Study Design and Setting: We used a series of Monte Carlo simulations to examine the impact of the number of subjects per variable (SPV) on the accuracy of estimated regression

  19. Tightness of M-estimators for multiple linear regression in time series

    Johansen, Søren; Nielsen, Bent

    We show tightness of a general M-estimator for multiple linear regression in time series. The positive criterion function for the M-estimator is assumed lower semi-continuous and sufficiently large for large argument: Particular cases are the Huber-skip and quantile regression. Tightness requires...

  20. Piecewise linear regression techniques to analyze the timing of head coach dismissals in Dutch soccer clubs

    Schryver, T. de; Eisinga, R.

    2010-01-01

    The key question in research on dismissals of head coaches in sports clubs is not whether they should happen but when they will happen. This paper applies piecewise linear regression to advance our understanding of the timing of head coach dismissals. Essentially, the regression sacrifices degrees

  1. Computational Tools for Probing Interactions in Multiple Linear Regression, Multilevel Modeling, and Latent Curve Analysis

    Preacher, Kristopher J.; Curran, Patrick J.; Bauer, Daniel J.

    2006-01-01

    Simple slopes, regions of significance, and confidence bands are commonly used to evaluate interactions in multiple linear regression (MLR) models, and the use of these techniques has recently been extended to multilevel or hierarchical linear modeling (HLM) and latent curve analysis (LCA). However, conducting these tests and plotting the…

  2. Investigation of linear regression of EPR dosimetric signal of the man tooth enamel

    Pivovarov, S.P.; Rukhin, A.B.; Zhakparov, R.K.; Vasilevskaya, L.A.

    2001-01-01

    The experimental relations of the EPR radiation signal in samples of human tooth enamel from three donors of different ages are examined for doses up to 1350 Gy. Linear regression is applicable to all of them. The considerable errors leading to apparent non-linearity are mostly eliminated. (author)

  3. Genomic prediction based on data from three layer lines using non-linear regression models

    Huang, H.; Windig, J.J.; Vereijken, A.; Calus, M.P.L.

    2014-01-01

    Background - Most studies on genomic prediction with reference populations that include multiple lines or breeds have used linear models. Data heterogeneity due to using multiple populations may conflict with model assumptions used in linear regression methods. Methods - In an attempt to alleviate

  4. Multiple linear regression and regression with time series error models in forecasting PM10 concentrations in Peninsular Malaysia.

    Ng, Kar Yong; Awang, Norhashidah

    2018-01-06

    Frequent haze occurrences in Malaysia have made the management of PM10 (particulate matter with aerodynamic diameter less than 10 μm) pollution a critical task. This requires knowledge of the factors associated with PM10 variation and good forecasts of PM10 concentrations. Hence, this paper demonstrates the prediction of 1-day-ahead daily average PM10 concentrations based on predictor variables including meteorological parameters and gaseous pollutants. Three different models were built: a multiple linear regression (MLR) model with lagged predictor variables (MLR1), an MLR model with lagged predictor variables and PM10 concentrations (MLR2), and a regression with time series error (RTSE) model. The findings revealed that humidity, temperature, wind speed, wind direction, carbon monoxide and ozone were the main factors explaining the PM10 variation in Peninsular Malaysia. Comparison among the three models showed that the MLR2 model was on the same level as the RTSE model in terms of forecasting accuracy, while the MLR1 model was the worst.

  5. Modeling Fire Occurrence at the City Scale: A Comparison between Geographically Weighted Regression and Global Linear Regression.

    Song, Chao; Kwan, Mei-Po; Zhu, Jiping

    2017-04-08

    An increasing number of fires are occurring with the rapid development of cities, resulting in increased risk for human beings and the environment. This study compares geographically weighted regression-based models, including geographically weighted regression (GWR) and geographically and temporally weighted regression (GTWR), which integrates spatial and temporal effects and global linear regression models (LM) for modeling fire risk at the city scale. The results show that the road density and the spatial distribution of enterprises have the strongest influences on fire risk, which implies that we should focus on areas where roads and enterprises are densely clustered. In addition, locations with a large number of enterprises have fewer fire ignition records, probably because of strict management and prevention measures. A changing number of significant variables across space indicate that heterogeneity mainly exists in the northern and eastern rural and suburban areas of Hefei city, where human-related facilities or road construction are only clustered in the city sub-centers. GTWR can capture small changes in the spatiotemporal heterogeneity of the variables while GWR and LM cannot. An approach that integrates space and time enables us to better understand the dynamic changes in fire risk. Thus governments can use the results to manage fire safety at the city scale.

  6. Filter Selection for Optimizing the Spectral Sensitivity of Broadband Multispectral Cameras Based on Maximum Linear Independence.

    Li, Sui-Xian

    2018-05-07

    Previous research has shown the effectiveness of selecting filter sets from among a large set of commercial broadband filters by a vector analysis method based on maximum linear independence (MLI). However, the traditional MLI approach is suboptimal because the first filter of the selected filter set must be predefined as the one with the maximum ℓ₂ norm among all available filters. An exhaustive imaging simulation with every single filter serving as the first filter is conducted to investigate the features of the most competent filter set. From the simulation, the characteristics of the most competent filter set are discovered. Besides minimization of the condition number, the geometric features of the best-performing filter set comprise a distinct transmittance peak along the wavelength axis of the first filter, a generally uniform distribution for the peaks of the filters and substantial overlaps of the transmittance curves of the adjacent filters. Therefore, the best-performing filter sets can be recognized intuitively by simple vector analysis and just a few experimental verifications. A practical two-step framework for selecting the optimal filter set is recommended, which guarantees a significant enhancement of the performance of the systems. This work should be useful for optimizing the spectral sensitivity of broadband multispectral imaging sensors.

  7. Filter Selection for Optimizing the Spectral Sensitivity of Broadband Multispectral Cameras Based on Maximum Linear Independence

    Sui-Xian Li

    2018-05-01

    Previous research has shown the effectiveness of selecting filter sets from among a large set of commercial broadband filters by a vector analysis method based on maximum linear independence (MLI). However, the traditional MLI approach is suboptimal because the first filter of the selected filter set must be predefined as the one with the maximum ℓ2 norm among all available filters. An exhaustive imaging simulation with every single filter serving as the first filter is conducted to investigate the features of the most competent filter set. From the simulation, the characteristics of the most competent filter set are discovered. Besides minimization of the condition number, the geometric features of the best-performing filter set comprise a distinct transmittance peak along the wavelength axis of the first filter, a generally uniform distribution for the peaks of the filters and substantial overlaps of the transmittance curves of the adjacent filters. Therefore, the best-performing filter sets can be recognized intuitively by simple vector analysis and just a few experimental verifications. A practical two-step framework for selecting the optimal filter set is recommended, which guarantees a significant enhancement of the performance of the systems. This work should be useful for optimizing the spectral sensitivity of broadband multispectral imaging sensors.

  8. OPLS statistical model versus linear regression to assess sonographic predictors of stroke prognosis.

    Vajargah, Kianoush Fathi; Sadeghi-Bazargani, Homayoun; Mehdizadeh-Esfanjani, Robab; Savadi-Oskouei, Daryoush; Farhoudi, Mehdi

    2012-01-01

    The objective of the present study was to assess the comparable applicability of orthogonal projections to latent structures (OPLS) statistical model vs traditional linear regression in order to investigate the role of trans cranial doppler (TCD) sonography in predicting ischemic stroke prognosis. The study was conducted on 116 ischemic stroke patients admitted to a specialty neurology ward. The Unified Neurological Stroke Scale was used once for clinical evaluation on the first week of admission and again six months later. All data was primarily analyzed using simple linear regression and later considered for multivariate analysis using PLS/OPLS models through the SIMCA P+12 statistical software package. The linear regression analysis results used for the identification of TCD predictors of stroke prognosis were confirmed through the OPLS modeling technique. Moreover, in comparison to linear regression, the OPLS model appeared to have higher sensitivity in detecting the predictors of ischemic stroke prognosis and detected several more predictors. Applying the OPLS model made it possible to use both single TCD measures/indicators and arbitrarily dichotomized measures of TCD single vessel involvement as well as the overall TCD result. In conclusion, the authors recommend PLS/OPLS methods as complementary rather than alternative to the available classical regression models such as linear regression.

  9. Fuzzy Linear Regression for the Time Series Data which is Fuzzified with SMRGT Method

    Seçil YALAZ

    2016-10-01

    Our work on regression and classification provides a new contribution to the analysis of time series, which have been used in many areas for years. In time series regression applications, when convergence cannot be obtained with the methods used to correct autocorrelation, the analysis either fails or forces a change in the model's degree. Changing the model's degree may not be desirable in every situation. In our study, recommended for these situations, time series data were fuzzified using the simple membership function and fuzzy rule generation technique (SMRGT), and an equation for estimating future values was created by applying the fuzzy least squares regression (FLSR) method, which is a simple linear regression method, to these data. Although SMRGT has been successful in determining the flow discharge in open channels and can be used confidently for flow discharge modeling in open canals, as well as in pipe flow with some modifications, there is no evidence on whether this technique is successful in fuzzy linear regression modeling. Therefore, to address the lack of such modeling, a new hybrid model is described in this study. In conclusion, to demonstrate our method's efficiency, classical linear regression for time series data and linear regression for fuzzy time series data were applied to two different data sets, and the performances of these two approaches were compared using different measures.

  10. An improved multiple linear regression and data analysis computer program package

    Sidik, S. M.

    1972-01-01

    NEWRAP, an improved version of a previous multiple linear regression program called RAPIER, CREDUC, and CRSPLT, allows for a complete regression analysis including cross plots of the independent and dependent variables, correlation coefficients, regression coefficients, analysis of variance tables, t-statistics and their probability levels, rejection of independent variables, plots of residuals against the independent and dependent variables, and a canonical reduction of quadratic response functions useful in optimum seeking experimentation. A major improvement over RAPIER is that all regression calculations are done in double precision arithmetic.

  11. A comparative study of Kalman filter and Linear Matrix Inequality based H infinity filter for SPND delay compensation

    Tamboli, P.K.; Duttagupta, Siddhartha P.; Roy, Kallol

    2016-01-01

    Highlights: • Derivation for delay compensation algorithm using recursive Kalman filter. • Derivation for delay compensation algorithm using Linear Matrix Inequality based H infinity filter. • Process modeling suitable for delay compensation. • Dynamic tuning of the delay compensation algorithm for both Kalman and H infinity filter. • Simulations and trade-off curve for Kalman and H infinity filter. - Abstract: This paper deals with delay compensation of vanadium Self Powered Neutron Detectors (SPNDs) using a Linear Matrix Inequality (LMI) based H-infinity filtering method and compares the results with the Kalman filtering method. The entire study is established upon the framework of neutron flux estimation in a large core Pressurized Heavy Water Reactor (PHWR) in which delayed SPNDs such as vanadium SPNDs are used as in-core flux monitoring detectors. The use of vanadium SPNDs is limited to 3-D flux mapping, despite their better signal-to-noise ratio compared with other prompt SPNDs, because of the small prompt component in their signal. The use of an appropriate delay compensation technique has always been considered an effective strategy for building a prompt and accurate estimate of the neutron flux. We also indicate the noise-response trade-off curve for both techniques. Since all delay compensation algorithms suffer from noise amplification, we propose an efficient adaptive parameter tuning technique for improving the performance of the filtering algorithm against noise in the measurement.

  12. The research of radar target tracking observed information linear filter method

    Chen, Zheng; Zhao, Xuanzhi; Zhang, Wen

    2018-05-01

    To address the low precision, or even divergence, caused by the nonlinear observation equation in radar target tracking, a new filtering algorithm is proposed in this paper. In this algorithm, local linearization is carried out on the observed distance and angle data separately. The Kalman filter is then applied to the linearized data. After the filtered data are obtained, a mapping operation provides the a posteriori estimate of the target state. A large number of simulation results show that this algorithm solves the above problems effectively and that its performance is better than that of the traditional filtering algorithm for nonlinear dynamic systems.

  13. Generation of Long Waves using Non-Linear Digital Filters

    Høgedal, Michael; Frigaard, Peter

    1994-01-01

    transform of the 1st order surface elevation and subsequently inverse Fourier transformed. Hence, the methods are unsuitable for real-time applications, for example where white noise is filtered digitally to obtain a wave spectrum with built-in stochastic variability. In the present paper an approximative...... method for including the correct 2nd order bound terms in such applications is presented. The technique utilizes non-linear digital filters fitted to the appropriate transfer function is derived only for bounded 2nd order subharmonics, as they laboratory experiments generally are considered the most...

  14. A Monte Carlo simulation study comparing linear regression, beta regression, variable-dispersion beta regression and fractional logit regression at recovering average difference measures in a two sample design.

    Meaney, Christopher; Moineddin, Rahim

    2014-01-24

    In biomedical research, response variables are often encountered which have bounded support on the open unit interval--(0,1). Traditionally, researchers have attempted to estimate covariate effects on these types of response data using linear regression. Alternative modelling strategies may include: beta regression, variable-dispersion beta regression, and fractional logit regression models. This study employs a Monte Carlo simulation design to compare the statistical properties of the linear regression model to that of the more novel beta regression, variable-dispersion beta regression, and fractional logit regression models. In the Monte Carlo experiment we assume a simple two sample design. We assume observations are realizations of independent draws from their respective probability models. The randomly simulated draws from the various probability models are chosen to emulate average proportion/percentage/rate differences of pre-specified magnitudes. Following simulation of the experimental data we estimate average proportion/percentage/rate differences. We compare the estimators in terms of bias, variance, type-1 error and power. Estimates of Monte Carlo error associated with these quantities are provided. If response data are beta distributed with constant dispersion parameters across the two samples, then all models are unbiased and have reasonable type-1 error rates and power profiles. If the response data in the two samples have different dispersion parameters, then the simple beta regression model is biased. When the sample size is small (N0 = N1 = 25) linear regression has superior type-1 error rates compared to the other models. Small sample type-1 error rates can be improved in beta regression models using bias correction/reduction methods. In the power experiments, variable-dispersion beta regression and fractional logit regression models have slightly elevated power compared to linear regression models. Similar results were observed if the
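
    The type-1 error part of such a simulation can be sketched for the linear regression case alone. This is a minimal sketch under stated assumptions, not the authors' code: both groups are drawn from the same beta distribution (the shape parameters 2 and 5 are hypothetical), and the regression of the response on a group indicator is evaluated through the equivalent pooled two-sample t statistic. The beta and fractional logit models are not covered.

```python
import math
import random

N_PER_GROUP = 25          # N0 = N1 = 25, as in the small-sample setting above
T_CRIT_DF48 = 2.0106      # two-sided 5% critical value, Student's t, 48 df


def pooled_t(sample0, sample1):
    """t statistic for the group coefficient in y ~ group (equal variances)."""
    n0, n1 = len(sample0), len(sample1)
    m0 = sum(sample0) / n0
    m1 = sum(sample1) / n1
    ss0 = sum((v - m0) ** 2 for v in sample0)
    ss1 = sum((v - m1) ** 2 for v in sample1)
    pooled_var = (ss0 + ss1) / (n0 + n1 - 2)
    return (m1 - m0) / math.sqrt(pooled_var * (1 / n0 + 1 / n1))


def type1_error(n_reps=2000, a=2.0, b=5.0, seed=1):
    """Share of replications rejecting a true null of no group difference."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_reps):
        g0 = [rng.betavariate(a, b) for _ in range(N_PER_GROUP)]
        g1 = [rng.betavariate(a, b) for _ in range(N_PER_GROUP)]
        if abs(pooled_t(g0, g1)) > T_CRIT_DF48:
            rejections += 1
    return rejections / n_reps


rate = type1_error()
```

    A rejection rate close to the nominal 0.05 would indicate that the linear model holds its size in this setting; the critical value is tied to 48 degrees of freedom and would need changing for other group sizes.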

  15. Linear filtering applied to safeguards of nuclear material

    Pike, D.H.; Morrison, G.W.; Holland, C.W.

    1975-01-01

    In regard to the problem of nuclear materials theft or diversion in the fuel cycle, a method is needed to detect continual thefts of relatively small amounts of material. It is suggested that Kalman filtering techniques be used. A hypothetical material flow situation is used to illustrate the technique; losses could be detected in as few as 5 months. (DLC)
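
    How a Kalman filter might flag a small continual loss can be sketched with a scalar example. The model here is an assumption made for illustration, not the one in the report: the state is a constant per-period loss, each period's material-balance discrepancy is a noisy measurement of it, and a loss is declared once the estimate exceeds a chosen multiple of its standard deviation. All names and numbers are hypothetical.

```python
import math


def estimate_loss(discrepancies, meas_var, prior_mean=0.0, prior_var=1e6):
    """Scalar Kalman filter for a constant per-period loss.

    Returns a list of (estimate, variance) pairs, one per period.
    """
    x, p = prior_mean, prior_var
    history = []
    for z in discrepancies:
        gain = p / (p + meas_var)      # Kalman gain
        x = x + gain * (z - x)         # measurement update of the estimate
        p = (1.0 - gain) * p           # measurement update of the variance
        history.append((x, p))
    return history


def first_detection(history, n_sigma):
    """First period (1-based) where the estimate exceeds n_sigma std devs."""
    for period, (x, p) in enumerate(history, start=1):
        if x > n_sigma * math.sqrt(p):
            return period
    return None


# Twelve periods with a steady discrepancy of 1 unit, measurement variance 1.
history = estimate_loss([1.0] * 12, meas_var=1.0)
detected_at = first_detection(history, n_sigma=2.5)
```

    With a diffuse prior the estimate behaves like a running mean whose variance shrinks roughly as the measurement variance divided by the number of periods, which is why a small loss becomes detectable only after several periods.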

  16. Generation of Long Waves using Non-Linear Digital Filters

    Høgedal, Michael; Frigaard, Peter; Christensen, Morten

    1994-01-01

    transform of the 1st order surface elevation and subsequently inverse Fourier transformed. Hence, the methods are unsuitable for real-time applications, for example where white noise is filtered digitally to obtain a wave spectrum with built-in stochastic variability. In the present paper an approximative...

  17. Use of multiple linear regression and logistic regression models to investigate changes in birthweight for term singleton infants in Scotland.

    Bonellie, Sandra R

    2012-10-01

    To illustrate the use of regression and logistic regression models to investigate changes over time in size of babies, particularly in relation to social deprivation, age of the mother and smoking. Mean birthweight has been found to be increasing in many countries in recent years, but there is still a group of babies who are born with low birthweights. Population-based retrospective cohort study. Multiple linear regression and logistic regression models are used to analyse data on term 'singleton births' from Scottish hospitals between 1994 and 2003. Mothers who smoke are shown to give birth to lighter babies on average, a difference of approximately 0.57 standard deviations lower (95% confidence interval 0.55-0.58) when adjusted for sex and parity. These mothers are also more likely to have babies that are low birthweight (odds ratio 3.46, 95% confidence interval 3.30-3.63) compared with non-smokers. Low birthweight is 30% more likely where the mother lives in the most deprived areas compared with the least deprived (odds ratio 1.30, 95% confidence interval 1.21-1.40). Smoking during pregnancy is shown to have a detrimental effect on the size of infants at birth. This effect explains some, though not all, of the observed socioeconomic birthweight. It also explains much of the observed birthweight differences by the age of the mother. Identifying mothers at greater risk of having a low birthweight baby has important implications for the care and advice this group receives. © 2012 Blackwell Publishing Ltd.

  18. Treating experimental data of inverse kinetic method by unitary linear regression analysis

    Zhao Yusen; Chen Xiaoliang

    2009-01-01

    The theory of treating experimental data from the inverse kinetic method by unitary linear regression analysis is described. Both the reactivity and the effective neutron source intensity can be calculated by this method. A computer code was written based on the inverse kinetic method and unitary linear regression analysis. The data of the zero power facility BFS-1 in Russia were processed and the results were compared. The results show that the reactivity and the effective neutron source intensity can be obtained correctly by treating experimental data from the inverse kinetic method with unitary linear regression analysis, and that the precision of the reactivity measurement is improved. The central element efficiency can be calculated from the reactivity. The results also show that the effect of an external neutron source on the reactivity measurement should be considered when the reactor power is low and the intensity of the external neutron source is strong. (authors)

  19. Reducing false-positive incidental findings with ensemble genotyping and logistic regression based variant filtering methods.

    Hwang, Kyu-Baek; Lee, In-Hee; Park, Jin-Ho; Hambuch, Tina; Choe, Yongjoon; Kim, MinHyeok; Lee, Kyungjoon; Song, Taemin; Neu, Matthew B; Gupta, Neha; Kohane, Isaac S; Green, Robert C; Kong, Sek Won

    2014-08-01

    As whole genome sequencing (WGS) uncovers variants associated with rare and common diseases, an immediate challenge is to minimize false-positive findings due to sequencing and variant calling errors. False positives can be reduced by combining results from orthogonal sequencing methods, but this is costly. Here, we present variant filtering approaches using logistic regression (LR) and ensemble genotyping to minimize false positives without sacrificing sensitivity. We evaluated the methods using paired WGS datasets of an extended family prepared using two sequencing platforms and a validated set of variants in NA12878. Using LR or ensemble genotyping based filtering, false-negative rates were significantly reduced by 1.1- to 17.8-fold at the same levels of false discovery rates (5.4% for heterozygous and 4.5% for homozygous single nucleotide variants (SNVs); 30.0% for heterozygous and 18.7% for homozygous insertions; 25.2% for heterozygous and 16.6% for homozygous deletions) compared to the filtering based on genotype quality scores. Moreover, ensemble genotyping excluded > 98% (105,080 of 107,167) of false positives while retaining > 95% (897 of 937) of true positives in de novo mutation (DNM) discovery in NA12878, and performed better than a consensus method using two sequencing platforms. Our proposed methods were effective in prioritizing phenotype-associated variants, and ensemble genotyping would be essential to minimize false-positive DNM candidates. © 2014 WILEY PERIODICALS, INC.

  20. Eigenvector Spatial Filtering Regression Modeling of Ground PM2.5 Concentrations Using Remotely Sensed Data

    Jingyi Zhang

    2018-06-01

    This paper proposes a regression model using the Eigenvector Spatial Filtering (ESF) method to estimate ground PM2.5 concentrations. Covariates are derived from remotely sensed data including aerosol optical depth, normalized difference vegetation index, surface temperature, air pressure, relative humidity, height of planetary boundary layer and digital elevation model. In addition, cultural variables such as factory densities and road densities are also used in the model. With the Yangtze River Delta region as the study area, we constructed ESF-based Regression (ESFR) models at different time scales, using data for the period between December 2015 and November 2016. We found that the ESFR models effectively filtered spatial autocorrelation in the OLS residuals and resulted in increases in the goodness-of-fit metrics as well as reductions in residual standard errors and cross-validation errors, compared to the classic OLS models. The annual ESFR model explained 70% of the variability in PM2.5 concentrations, 16.7% more than the non-spatial OLS model. With the ESFR models, we performed detailed analyses on the spatial and temporal distributions of PM2.5 concentrations in the study area. The model predictions are lower than ground observations but match the general trend. The experiment shows that ESFR provides a promising approach to PM2.5 analysis and prediction.

  1. Eigenvector Spatial Filtering Regression Modeling of Ground PM2.5 Concentrations Using Remotely Sensed Data.

    Zhang, Jingyi; Li, Bin; Chen, Yumin; Chen, Meijie; Fang, Tao; Liu, Yongfeng

    2018-06-11

    This paper proposes a regression model using the Eigenvector Spatial Filtering (ESF) method to estimate ground PM2.5 concentrations. Covariates are derived from remotely sensed data including aerosol optical depth, normalized difference vegetation index, surface temperature, air pressure, relative humidity, height of planetary boundary layer and digital elevation model. In addition, cultural variables such as factory densities and road densities are also used in the model. With the Yangtze River Delta region as the study area, we constructed ESF-based Regression (ESFR) models at different time scales, using data for the period between December 2015 and November 2016. We found that the ESFR models effectively filtered spatial autocorrelation in the OLS residuals and resulted in increases in the goodness-of-fit metrics as well as reductions in residual standard errors and cross-validation errors, compared to the classic OLS models. The annual ESFR model explained 70% of the variability in PM2.5 concentrations, 16.7% more than the non-spatial OLS model. With the ESFR models, we performed detailed analyses on the spatial and temporal distributions of PM2.5 concentrations in the study area. The model predictions are lower than ground observations but match the general trend. The experiment shows that ESFR provides a promising approach to PM2.5 analysis and prediction.

  2. A primer for biomedical scientists on how to execute model II linear regression analysis.

    Ludbrook, John

    2012-04-01

    1. There are two very different ways of executing linear regression analysis. One is Model I, when the x-values are fixed by the experimenter. The other is Model II, in which the x-values are free to vary and are subject to error. 2. I have received numerous complaints from biomedical scientists that they have great difficulty in executing Model II linear regression analysis. This may explain the results of a Google Scholar search, which showed that the authors of articles in journals of physiology, pharmacology and biochemistry rarely use Model II regression analysis. 3. I repeat my previous arguments in favour of using least products linear regression analysis for Model II regressions. I review three methods for executing ordinary least products (OLP) and weighted least products (WLP) regression analysis: (i) scientific calculator and/or computer spreadsheet; (ii) specific purpose computer programs; and (iii) general purpose computer programs. 4. Using a scientific calculator and/or computer spreadsheet, it is easy to obtain correct values for OLP slope and intercept, but the corresponding 95% confidence intervals (CI) are inaccurate. 5. Using specific purpose computer programs, the freeware computer program smatr gives the correct OLP regression coefficients and obtains 95% CI by bootstrapping. In addition, smatr can be used to compare the slopes of OLP lines. 6. When using general purpose computer programs, I recommend the commercial programs systat and Statistica for those who regularly undertake linear regression analysis and I give step-by-step instructions in the Supplementary Information as to how to use loss functions. © 2011 The Author. Clinical and Experimental Pharmacology and Physiology. © 2011 Blackwell Publishing Asia Pty Ltd.
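
    The point estimates for ordinary least products (OLP) regression, also known as geometric mean or reduced major axis regression, are simple enough to sketch directly. The slope is the ratio of the standard deviations of y and x, carrying the sign of their covariance. The function name is hypothetical, and the sketch does not compute the bootstrapped confidence intervals that the abstract attributes to smatr.

```python
import math


def olp_fit(x, y):
    """Ordinary least products (geometric mean) regression of y on x.

    Returns (slope, intercept). The slope is sign(Sxy) * sqrt(Syy / Sxx);
    the line passes through the point of means. The result is not
    meaningful when the covariance is exactly zero.
    """
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = math.copysign(math.sqrt(syy / sxx), sxy)
    intercept = my - slope * mx
    return slope, intercept


# Ordinary least squares on the same data gives a slope of Sxy / Sxx = 0.8;
# OLP does not attenuate the slope toward zero when x is subject to error.
slope, intercept = olp_fit([1, 2, 3, 4], [1, 3, 2, 4])
```

    Because the OLP slope depends only on the two standard deviations and the sign of the correlation, regressing x on y yields the reciprocal slope, which is the symmetry that makes it suitable for Model II problems.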

  3. Modelling subject-specific childhood growth using linear mixed-effect models with cubic regression splines.

    Grajeda, Laura M; Ivanescu, Andrada; Saito, Mayuko; Crainiceanu, Ciprian; Jaganath, Devan; Gilman, Robert H; Crabtree, Jean E; Kelleher, Dermott; Cabrera, Lilia; Cama, Vitaliano; Checkley, William

    2016-01-01

    Childhood growth is a cornerstone of pediatric research. Statistical models need to consider individual trajectories to adequately describe growth outcomes. Specifically, well-defined longitudinal models are essential to characterize both population and subject-specific growth. Linear mixed-effect models with cubic regression splines can account for the nonlinearity of growth curves and provide reasonable estimators of population and subject-specific growth, velocity and acceleration. We provide a stepwise approach that builds from simple to complex models, and account for the intrinsic complexity of the data. We start with standard cubic splines regression models and build up to a model that includes subject-specific random intercepts and slopes and residual autocorrelation. We then compared cubic regression splines vis-à-vis linear piecewise splines, and with varying number of knots and positions. Statistical code is provided to ensure reproducibility and improve dissemination of methods. Models are applied to longitudinal height measurements in a cohort of 215 Peruvian children followed from birth until their fourth year of life. Unexplained variability, as measured by the variance of the regression model, was reduced from 7.34 when using ordinary least squares to 0.81 (p linear mixed-effect models with random slopes and a first order continuous autoregressive error term. There was substantial heterogeneity in both the intercept (p modeled with a first order continuous autoregressive error term as evidenced by the variogram of the residuals and by a lack of association among residuals. The final model provides a parametric linear regression equation for both estimation and prediction of population- and individual-level growth in height. We show that cubic regression splines are superior to linear regression splines for the case of a small number of knots in both estimation and prediction with the full linear mixed effect model (AIC 19,352 vs. 19

  4. The Relationship between Economic Growth and Money Laundering – a Linear Regression Model

    Daniel Rece

    2009-09-01

    This study provides an overview of the relationship between economic growth and money laundering, modeled by a least squares function. The report statistically analyzes data collected from the USA, Russia, Romania and eleven other European countries, yielding a linear regression model. The study illustrates that 23.7% of the total variance in the regressand (level of money laundering) is “explained” by the linear regression model. In our opinion, this model will provide critical auxiliary judgment and decision support for anti-money laundering service systems.
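
    Read as a coefficient of determination (an assumption, since the abstract does not name the statistic), the 23.7% figure corresponds to the following, where y_i are the observed levels of money laundering, \hat{y}_i the fitted values and \bar{y} the sample mean:

```latex
R^2 \;=\; 1 - \frac{\sum_i \left(y_i - \hat{y}_i\right)^2}{\sum_i \left(y_i - \bar{y}\right)^2} \;=\; 0.237
```

    On this reading, the remaining 76.3% of the variance in the regressand is left unexplained by the fitted line.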

  5. Comparing Consider-Covariance Analysis with Sigma-Point Consider Filter and Linear-Theory Consider Filter Formulations

    Lisano, Michael E.

    2007-01-01

    Recent literature in applied estimation theory reflects growing interest in the sigma-point (also called unscented) formulation for optimal sequential state estimation, often describing performance comparisons with extended Kalman filters as applied to specific dynamical problems [cf. 1, 2, 3]. Favorable attributes of sigma-point filters are described as including a lower expected error for nonlinear, even non-differentiable, dynamical systems, and a straightforward formulation not requiring derivation or implementation of any partial derivative Jacobian matrices. These attributes are particularly attractive, e.g. in terms of enabling simplified code architecture and streamlined testing, in the formulation of estimators for nonlinear spaceflight mechanics systems, such as filter software onboard deep-space robotic spacecraft. As presented in [4], the Sigma-Point Consider Filter (SPCF) algorithm extends the sigma-point filter algorithm to the problem of consider covariance analysis. Considering parameters in a dynamical system, while estimating its state, provides an upper bound on the estimated state covariance, which is viewed as a conservative approach to designing estimators for problems of general guidance, navigation and control. This is because, whether a parameter in the system model is observable or not, error in the knowledge of the value of a non-estimated parameter will increase the actual uncertainty of the estimated state of the system beyond the level formally indicated by the covariance of an estimator that neglects errors or uncertainty in that parameter. The equations for SPCF covariance evolution are obtained in a fashion similar to the derivation approach taken with standard (i.e. linearized or extended) consider parameterized Kalman filters (cf. [5]). While in [4] the SPCF and linear-theory consider filter (LTCF) were applied to an illustrative linear dynamics/linear measurement problem, the present work examines the SPCF as applied to

  6. On Optimal Linear Filtering of Speech for Near-End Listening Enhancement

    Taal, Cees H.; Jensen, Jesper; Leijon, Arne

    2013-01-01

    In this letter the focus is on linear filtering of speech before degradation due to additive background noise. The goal is to design the filter such that the speech intelligibility index (SII) is maximized when the speech is played back in a known noisy environment. Moreover, a power constraint i...

  7. The number of subjects per variable required in linear regression analyses.

    Austin, Peter C; Steyerberg, Ewout W

    2015-06-01

    To determine the number of independent variables that can be included in a linear regression model. We used a series of Monte Carlo simulations to examine the impact of the number of subjects per variable (SPV) on the accuracy of estimated regression coefficients and standard errors, on the empirical coverage of estimated confidence intervals, and on the accuracy of the estimated R² of the fitted model. A minimum of approximately two SPV tended to result in estimation of regression coefficients with relative bias of less than 10%. Furthermore, with this minimum number of SPV, the standard errors of the regression coefficients were accurately estimated and estimated confidence intervals had approximately the advertised coverage rates. A much higher number of SPV were necessary to minimize bias in estimating the model R², although adjusted R² estimates behaved well. The bias in estimating the model R² statistic was inversely proportional to the magnitude of the proportion of variation explained by the population regression model. Linear regression models require only two SPV for adequate estimation of regression coefficients, standard errors, and confidence intervals. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.

  8. Using the classical linear regression model in analysis of the dependences of conveyor belt life

    Miriam Andrejiová

    2013-12-01

    Full Text Available The paper deals with the classical linear regression model of the dependence of conveyor belt life on some selected parameters: thickness of paint layer, width and length of the belt, conveyor speed and quantity of transported material. The first part of the article is about regression model design, point and interval estimation of parameters, verification of statistical significance of the model, and about the parameters of the proposed regression model. The second part of the article deals with identification of influential and extreme values that can have an impact on estimation of regression model parameters. The third part focuses on assumptions of the classical regression model, i.e. on verification of independence assumptions, normality and homoscedasticity of residuals.

  9. Linear filtering in three-dimensional depiction of radiographic data

    Gorbunov, V.I.; Popov, A.A.; Stoyanov, A.K.

    1978-01-01

    The radiography process is discussed from the standpoint of linear system theory. Requirements for the pulse reaction type are formulated for the equivalent schemes of holography pseudonoise tomosynthesis in radiography. Experimental data are given.

  10. A Simple and Convenient Method of Multiple Linear Regression to Calculate Iodine Molecular Constants

    Cooper, Paul D.

    2010-01-01

    A new procedure using a student-friendly least-squares multiple linear-regression technique utilizing a function within Microsoft Excel is described that enables students to calculate molecular constants from the vibronic spectrum of iodine. This method is advantageous pedagogically as it calculates molecular constants for ground and excited…

  11. Analysis of interactive fixed effects dynamic linear panel regression with measurement error

    Nayoung Lee; Hyungsik Roger Moon; Martin Weidner

    2011-01-01

    This paper studies a simple dynamic panel linear regression model with interactive fixed effects in which the variable of interest is measured with error. To estimate the dynamic coefficient, we consider the least-squares minimum distance (LS-MD) estimation method.

  12. An Introduction to Graphical and Mathematical Methods for Detecting Heteroscedasticity in Linear Regression.

    Thompson, Russel L.

    Homoscedasticity is an important assumption of linear regression. This paper explains what it is and why it is important to the researcher. Graphical and mathematical methods for testing the homoscedasticity assumption are demonstrated. Sources of homoscedasticity and types of homoscedasticity are discussed, and methods for correction are…

  13. INTRODUCTION TO A COMBINED MULTIPLE LINEAR REGRESSION AND ARMA MODELING APPROACH FOR BEACH BACTERIA PREDICTION

    Due to the complexity of the processes contributing to beach bacteria concentrations, many researchers rely on statistical modeling, among which multiple linear regression (MLR) modeling is most widely used. Despite its ease of use and interpretation, there may be time dependence...

  14. Application of range-test in multiple linear regression analysis in ...

    Application of range-test in multiple linear regression analysis in the presence of outliers is studied in this paper. First, the plot of the explanatory variables (i.e. Administration, Social/Commercial, Economic services and Transfer) on the dependent variable (i.e. GDP) was done to identify the statistical trend over the years.

  15. [Prediction model of health workforce and beds in county hospitals of Hunan by multiple linear regression].

    Ling, Ru; Liu, Jiawang

    2011-12-01

    To construct a prediction model for health workforce and hospital beds in county hospitals of Hunan by multiple linear regression. We surveyed 16 counties in Hunan with stratified random sampling according to uniform questionnaires, and multiple linear regression analysis with 20 quotas selected by literature review was done. Independent variables in the multiple linear regression model on medical personnel in county hospitals included the counties' urban residents' income, crude death rate, medical beds, business occupancy, professional equipment value, the number of devices valued above 10 000 yuan, fixed assets, long-term debt, medical income, medical expenses, outpatient and emergency visits, hospital visits, actual available bed days, and utilization rate of hospital beds. Independent variables in the multiple linear regression model on county hospital beds included the population aged 65 and above in the counties, disposable income of urban residents, medical personnel of medical institutions in the county area, business occupancy, the total value of professional equipment, fixed assets, long-term debt, medical income, medical expenses, outpatient and emergency visits, hospital visits, actual available bed days, utilization rate of hospital beds, and length of hospitalization. The prediction model shows good explanatory power and fit, and may be used for short- and mid-term forecasting.

  16. Calculation of U, Ra, Th and K contents in uranium ore by multiple linear regression method

    Lin Chao; Chen Yingqiang; Zhang Qingwen; Tan Fuwen; Peng Guanghui

    1991-01-01

    A multiple linear regression method was used to compute γ spectra of uranium ore samples and to calculate contents of U, Ra, Th, and K. In comparison with the inverse matrix method, its advantage is that no standard samples of pure U, Ra, Th and K are needed for obtaining response coefficients

  17. Comparing Regression Coefficients between Nested Linear Models for Clustered Data with Generalized Estimating Equations

    Yan, Jun; Aseltine, Robert H., Jr.; Harel, Ofer

    2013-01-01

    Comparing regression coefficients between models when one model is nested within another is of great practical interest when two explanations of a given phenomenon are specified as linear models. The statistical problem is whether the coefficients associated with a given set of covariates change significantly when other covariates are added into…

  18. Bayesian linear regression : different conjugate models and their (in)sensitivity to prior-data conflict

    Walter, G.M.; Augustin, Th.; Kneib, Thomas; Tutz, Gerhard

    2010-01-01

    The paper is concerned with Bayesian analysis under prior-data conflict, i.e. the situation when observed data are rather unexpected under the prior (and the sample size is not large enough to eliminate the influence of the prior). Two approaches for Bayesian linear regression modeling based on

  19. A unified framework for testing in the linear regression model under unknown order of fractional integration

    Christensen, Bent Jesper; Kruse, Robinson; Sibbertsen, Philipp

    We consider hypothesis testing in a general linear time series regression framework when the possibly fractional order of integration of the error term is unknown. We show that the approach suggested by Vogelsang (1998a) for the case of integer integration does not apply to the case of fractional...

  20. Alpins and thibos vectorial astigmatism analyses: proposal of a linear regression model between methods

    Giuliano de Oliveira Freitas

    2013-10-01

    Full Text Available PURPOSE: To determine linear regression models between Alpins descriptive indices and Thibos astigmatic power vectors (APV), assessing the validity and strength of such correlations. METHODS: This case series prospectively assessed 62 eyes of 31 consecutive cataract patients with preoperative corneal astigmatism between 0.75 and 2.50 diopters in both eyes. Patients were randomly assigned to two phacoemulsification groups: one assigned to receive AcrySof® Toric intraocular lens (IOL) in both eyes and another assigned to have AcrySof Natural IOL associated with limbal relaxing incisions, also in both eyes. All patients were reevaluated postoperatively at 6 months, when refractive astigmatism analysis was performed using both Alpins and Thibos methods. The ratio between Thibos postoperative APV and preoperative APV (APVratio) and its linear regression to Alpins percentage of success of astigmatic surgery, percentage of astigmatism corrected and percentage of astigmatism reduction at the intended axis were assessed. RESULTS: Significant negative correlation between the ratio of post- and preoperative Thibos APV (APVratio) and Alpins percentage of success (%Success) was found (Spearman's ρ = -0.93); linear regression is given by the following equation: %Success = (-APVratio + 1.00) x 100. CONCLUSION: The linear regression we found between APVratio and %Success permits a validated mathematical inference concerning the overall success of astigmatic surgery.
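
    The regression line quoted in this abstract can be evaluated directly. The sketch below assumes the equation is read as %Success = (1.00 - APVratio) x 100; the function name and the sample ratios are hypothetical and are not taken from the study.

```python
def percent_success(apv_ratio: float) -> float:
    """Evaluate the quoted regression line, read as (1.00 - APVratio) * 100."""
    return (1.00 - apv_ratio) * 100.0


# Hypothetical ratios of postoperative to preoperative Thibos APV.
for ratio in (0.0, 0.25, 1.0):
    print(f"APVratio={ratio:.2f} -> %Success={percent_success(ratio):.1f}")
```

    Under this reading, a ratio of 0 (no residual astigmatic power vector) maps to 100% success and a ratio of 1 (no change) maps to 0%.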

  1. Power properties of invariant tests for spatial autocorrelation in linear regression

    Martellosio, F.

    2006-01-01

    Many popular tests for residual spatial autocorrelation in the context of the linear regression model belong to the class of invariant tests. This paper derives a number of exact properties of the power function of such tests. In particular, we extend the work of Krämer (2005, Journal of Statistical

  2. Estimate the contribution of incubation parameters influence egg hatchability using multiple linear regression analysis.

    Khalil, Mohamed H; Shebl, Mostafa K; Kosba, Mohamed A; El-Sabrout, Karim; Zaki, Nesma

    2016-08-01

    This research was conducted to determine the most affecting parameters on hatchability of indigenous and improved local chickens' eggs. Five parameters were studied (fertility, early and late embryonic mortalities, shape index, egg weight, and egg weight loss) on four strains, namely Fayoumi, Alexandria, Matrouh, and Montazah. Multiple linear regression was performed on the studied parameters to determine the most influencing one on hatchability. The results showed significant differences in commercial and scientific hatchability among strains. Alexandria strain has the highest significant commercial hatchability (80.70%). Regarding the studied strains, highly significant differences in hatching chick weight among strains were observed. Using multiple linear regression analysis, fertility made the greatest percent contribution (71.31%) to hatchability, and the lowest percent contributions were made by shape index and egg weight loss. A prediction of hatchability using multiple regression analysis could be a good tool to improve hatchability percentage in chickens.

  3. truncSP: An R Package for Estimation of Semi-Parametric Truncated Linear Regression Models

    Maria Karlsson

    2014-05-01

    Full Text Available Problems with truncated data occur in many areas, complicating estimation and inference. Regarding linear regression models, the ordinary least squares estimator is inconsistent and biased for these types of data and is therefore unsuitable for use. Alternative estimators, designed for the estimation of truncated regression models, have been developed. This paper presents the R package truncSP. The package contains functions for the estimation of semi-parametric truncated linear regression models using three different estimators: the symmetrically trimmed least squares, quadratic mode, and left truncated estimators, all of which have been shown to have good asymptotic and finite sample properties. The package also provides functions for the analysis of the estimated models. Data from the environmental sciences are used to illustrate the functions in the package.

  4. Genomic prediction based on data from three layer lines using non-linear regression models.

    Huang, Heyun; Windig, Jack J; Vereijken, Addie; Calus, Mario P L

    2014-11-06

    Most studies on genomic prediction with reference populations that include multiple lines or breeds have used linear models. Data heterogeneity due to using multiple populations may conflict with model assumptions used in linear regression methods. In an attempt to alleviate potential discrepancies between assumptions of linear models and multi-population data, two types of alternative models were used: (1) a multi-trait genomic best linear unbiased prediction (GBLUP) model that modelled trait by line combinations as separate but correlated traits and (2) non-linear models based on kernel learning. These models were compared to conventional linear models for genomic prediction for two lines of brown layer hens (B1 and B2) and one line of white hens (W1). The three lines each had 1004 to 1023 training and 238 to 240 validation animals. Prediction accuracy was evaluated by estimating the correlation between observed phenotypes and predicted breeding values. When the training dataset included only data from the evaluated line, non-linear models yielded at best a similar accuracy as linear models. In some cases, when adding a distantly related line, the linear models showed a slight decrease in performance, while non-linear models generally showed no change in accuracy. When only information from a closely related line was used for training, linear models and non-linear radial basis function (RBF) kernel models performed similarly. The multi-trait GBLUP model took advantage of the estimated genetic correlations between the lines. Combining linear and non-linear models improved the accuracy of multi-line genomic prediction. Linear models and non-linear RBF models performed very similarly for genomic prediction, despite the expectation that non-linear models could deal better with the heterogeneous multi-population data. This heterogeneity of the data can be overcome by modelling trait by line combinations as separate but correlated traits, which avoids the occasional

  5. Predicting recovery of cognitive function soon after stroke: differential modeling of logarithmic and linear regression.

    Suzuki, Makoto; Sugimura, Yuko; Yamada, Sumio; Omori, Yoshitsugu; Miyamoto, Masaaki; Yamamoto, Jun-ichi

    2013-01-01

    Cognitive disorders in the acute stage of stroke are common and are important independent predictors of adverse outcome in the long term. Despite the impact of cognitive disorders on both patients and their families, it is still difficult to predict the extent or duration of cognitive impairments. The objective of the present study was, therefore, to provide data on predicting the recovery of cognitive function soon after stroke by differential modeling with logarithmic and linear regression. This study included two rounds of data collection comprising 57 stroke patients enrolled in the first round for the purpose of identifying the time course of cognitive recovery in the early-phase group data, and 43 stroke patients in the second round for the purpose of ensuring that the correlation of the early-phase group data applied to the prediction of each individual's degree of cognitive recovery. In the first round, Mini-Mental State Examination (MMSE) scores were assessed 3 times during hospitalization, and the scores were regressed on the logarithm of time and on linear time. In the second round, calculations of MMSE scores were made for the first two scoring times after admission to tailor the structures of logarithmic and linear regression formulae to fit an individual's degree of functional recovery. The time course of early-phase recovery for cognitive functions resembled both logarithmic and linear functions. However, MMSE scores sampled at two baseline points based on logarithmic regression modeling could predict cognitive recovery more accurately than could linear regression modeling (logarithmic modeling, R² = 0.676, P ...). Logarithmic modeling based on MMSE scores could accurately predict the recovery of cognitive function soon after the occurrence of stroke. This logarithmic modeling with mathematical procedures is simple enough to be adopted in daily clinical practice.
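
    Comparing a logarithmic and a linear time course comes down to two least-squares fits, one on log(time) and one on time. The following is a minimal sketch; the day values and scores are synthetic and generated from a logarithmic curve, so the better logarithmic fit is by construction and says nothing about the study's data.

```python
import math


def fit_line(x, y):
    """Ordinary least squares fit y = a + b*x; returns (a, b, r_squared)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return a, b, 1.0 - ss_res / ss_tot


# Synthetic, hypothetical data: days since admission and MMSE-like scores
# generated from score = 15 + 4 * ln(day).
days = [3, 10, 17, 24, 31, 45]
scores = [15 + 4 * math.log(d) for d in days]

a_log, b_log, r2_log = fit_line([math.log(d) for d in days], scores)
a_lin, b_lin, r2_lin = fit_line(days, scores)

print(f"logarithmic: score = {a_log:.2f} + {b_log:.2f} * ln(day), R^2 = {r2_log:.3f}")
print(f"linear:      score = {a_lin:.2f} + {b_lin:.2f} * day,     R^2 = {r2_lin:.3f}")
```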

  6. Linear regression metamodeling as a tool to summarize and present simulation model results.

    Jalal, Hawre; Dowd, Bryan; Sainfort, François; Kuntz, Karen M

    2013-10-01

    Modelers lack a tool to systematically and clearly present complex model results, including those from sensitivity analyses. The objective was to propose linear regression metamodeling as a tool to increase transparency of decision analytic models and better communicate their results. We used a simplified cancer cure model to demonstrate our approach. The model computed the lifetime cost and benefit of 3 treatment options for cancer patients. We simulated 10,000 cohorts in a probabilistic sensitivity analysis (PSA) and regressed the model outcomes on the standardized input parameter values in a set of regression analyses. We used the regression coefficients to describe measures of sensitivity analyses, including threshold and parameter sensitivity analyses. We also compared the results of the PSA to deterministic full-factorial and one-factor-at-a-time designs. The regression intercept represented the estimated base-case outcome, and the other coefficients described the relative parameter uncertainty in the model. We defined simple relationships that compute the average and incremental net benefit of each intervention. Metamodeling produced outputs similar to traditional deterministic 1-way or 2-way sensitivity analyses but was more reliable since it used all parameter values. Linear regression metamodeling is a simple, yet powerful, tool that can assist modelers in communicating model characteristics and sensitivity analyses.
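
    The core step described here, regressing simulated outcomes on standardized input parameters, can be sketched in a few lines. Everything in the example is an assumption made for illustration: the two-parameter `toy_model`, the input distributions, and the sample size are hypothetical and are not the cancer cure model from the paper.

```python
import random
import statistics


def solve(a, b):
    """Solve the square system a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) beta = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def standardize(values):
    """Return z-scores together with the sample mean and standard deviation."""
    mean, sd = statistics.mean(values), statistics.stdev(values)
    return [(v - mean) / sd for v in values], mean, sd


def toy_model(p_cure, cost):
    """Hypothetical decision model: net benefit of one treatment."""
    return 100000 * p_cure - cost


random.seed(0)
n = 2000  # hypothetical number of PSA draws
p_cure = [random.gauss(0.6, 0.05) for _ in range(n)]
cost = [random.gauss(20000, 3000) for _ in range(n)]
outcome = [toy_model(p, c) for p, c in zip(p_cure, cost)]

z_p, _, sd_p = standardize(p_cure)
z_c, _, sd_c = standardize(cost)
rows = [[1.0, zp, zc] for zp, zc in zip(z_p, z_c)]
intercept, coef_p, coef_c = ols(rows, outcome)

print(f"intercept (mean outcome): {intercept:.1f}")
print(f"change per 1 SD of p_cure: {coef_p:.1f}")
print(f"change per 1 SD of cost:   {coef_c:.1f}")
```

    Because the inputs are standardized, the intercept equals the mean simulated outcome and each coefficient is the change in outcome per standard deviation of that input, which is what makes the coefficients directly comparable across parameters.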

  7. Improving sub-pixel imperviousness change prediction by ensembling heterogeneous non-linear regression models

    Drzewiecki, Wojciech

    2016-12-01

    In this work nine non-linear regression models were compared for sub-pixel impervious surface area mapping from Landsat images. The comparison was done in three study areas both for accuracy of imperviousness coverage evaluation in individual points in time and accuracy of imperviousness change assessment. The performance of individual machine learning algorithms (Cubist, Random Forest, stochastic gradient boosting of regression trees, k-nearest neighbors regression, random k-nearest neighbors regression, Multivariate Adaptive Regression Splines, averaged neural networks, and support vector machines with polynomial and radial kernels) was also compared with the performance of heterogeneous model ensembles constructed from the best models trained using particular techniques. The results proved that in case of sub-pixel evaluation the most accurate prediction of change may not necessarily be based on the most accurate individual assessments. When single methods are considered, based on obtained results Cubist algorithm may be advised for Landsat based mapping of imperviousness for single dates. However, Random Forest may be endorsed when the most reliable evaluation of imperviousness change is the primary goal. It gave lower accuracies for individual assessments, but better prediction of change due to more correlated errors of individual predictions. Heterogeneous model ensembles performed for individual time points assessments at least as well as the best individual models. In case of imperviousness change assessment the ensembles always outperformed single model approaches. It means that it is possible to improve the accuracy of sub-pixel imperviousness change assessment using ensembles of heterogeneous non-linear regression models.

  8. Analysis of dental caries using generalized linear and count regression models

    Javali M. Phil

    2013-11-01

    Full Text Available Generalized linear models (GLM) are a generalization of linear regression models that allow regression models to be fitted to response data following a general exponential family, in all the sciences and especially the medical and dental sciences. They are a flexible and widely used class of models that can accommodate such response variables. Count data are frequently characterized by overdispersion and excess zeros. Zero-inflated count models provide a parsimonious yet powerful way to model this type of situation. Such models assume that the data are a mixture of two separate data generation processes: one generates only zeros, and the other is either a Poisson or a negative binomial data-generating process. Zero-inflated count regression models such as the zero-inflated Poisson (ZIP) and zero-inflated negative binomial (ZINB) regression models have been used to handle dental caries count data with many zeros. We present an evaluation framework for assessing the suitability of applying the GLM, Poisson, NB, ZIP and ZINB models to a dental caries data set where the count data may exhibit evidence of many zeros and over-dispersion. Estimation of the model parameters using the method of maximum likelihood is provided. Based on the Vuong test statistic and the goodness of fit measure for dental caries data, the NB and ZINB regression models perform better than other count regression models.
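
    The two-process mixture described in this abstract has a simple probability mass function in the Poisson case. The sketch below writes it out; the parameter names `lam` (Poisson mean) and `pi` (probability of a structural zero) and the numeric values are illustrative assumptions, not estimates from the dental caries data.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Probability of count k under a Poisson distribution with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def zip_pmf(k: int, lam: float, pi: float) -> float:
    """Zero-inflated Poisson: a structural zero with probability pi, else Poisson(lam)."""
    base = (1.0 - pi) * poisson_pmf(k, lam)
    return pi + base if k == 0 else base


lam, pi = 2.0, 0.3  # hypothetical parameter values
print("P(0) Poisson:", round(poisson_pmf(0, lam), 4))
print("P(0) ZIP:    ", round(zip_pmf(0, lam, pi), 4))
print("ZIP mean:    ", round(sum(k * zip_pmf(k, lam, pi) for k in range(60)), 4))
```

    The zero-inflated model puts extra mass at zero while shrinking the mean to (1 - pi) * lam, which is how it absorbs the excess zeros that a plain Poisson fit cannot.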

  9. Implementation of linear filters for iterative penalized maximum likelihood SPECT reconstruction

    Liang, Z.

    1991-01-01

    This paper reports on six low-pass linear filters applied in frequency space, implemented for iterative penalized maximum-likelihood (ML) SPECT image reconstruction. The filters implemented were the Shepp-Logan filter, the Butterworth filter, the Gaussian filter, the Hann filter, the Parzen filter, and the Lagrange filter. The low-pass filtering was applied in frequency space to projection data for the initial estimate and to the difference of projection data and reprojected data for higher order approximations. The projection data were acquired experimentally from a chest phantom consisting of non-uniform attenuating media. All the filters could effectively remove the noise and edge artifacts associated with the ML approach if the frequency cutoff was properly chosen. Improved performance of the Parzen and Lagrange filters relative to the others was observed. The best image, judged by viewing its profiles in terms of noise-smoothing, edge-sharpening, and contrast, was the one obtained with the Parzen filter. However, the Lagrange filter has the potential to consider the characteristics of the detector response function.
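
    To illustrate what a frequency cutoff means for low-pass filters of this kind, the sketch below evaluates common textbook gain curves for three of the named filters. These definitions, the filter order, and the cutoff value are assumptions; the paper's exact parameterizations may differ.

```python
import math


def butterworth_gain(f: float, cutoff: float, order: int = 4) -> float:
    """Butterworth low-pass magnitude response; the gain is 1/sqrt(2) at the cutoff."""
    return 1.0 / math.sqrt(1.0 + (f / cutoff) ** (2 * order))


def hann_gain(f: float, cutoff: float) -> float:
    """Hann window in frequency space: raised cosine that reaches zero at the cutoff."""
    if abs(f) >= cutoff:
        return 0.0
    return 0.5 * (1.0 + math.cos(math.pi * f / cutoff))


def gaussian_gain(f: float, sigma: float) -> float:
    """Gaussian low-pass gain with width sigma (no sharp cutoff)."""
    return math.exp(-0.5 * (f / sigma) ** 2)


cutoff = 0.25  # hypothetical cutoff, in cycles per pixel
for f in (0.0, 0.125, 0.25, 0.5):
    print(
        f"f={f:.3f}  butterworth={butterworth_gain(f, cutoff):.3f}  "
        f"hann={hann_gain(f, cutoff):.3f}  gaussian={gaussian_gain(f, cutoff):.3f}"
    )
```

    Each gain would multiply the corresponding frequency component of the projection data; a lower cutoff suppresses more noise but also removes more edge detail.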

  10. Fault prediction for nonlinear stochastic system with incipient faults based on particle filter and nonlinear regression.

    Ding, Bo; Fang, Huajing

    2017-05-01

    This paper is concerned with fault prediction for nonlinear stochastic systems with incipient faults. Based on the particle filter and a reasonable assumption about the incipient faults, a modified fault estimation algorithm is proposed, and the system state is estimated simultaneously. Based on the modified fault estimation, an intuitive fault detection strategy is introduced. Once an incipient fault is detected, its parameters are identified by a nonlinear regression method. Then, based on the estimated parameters, the future fault signal can be predicted. Finally, the effectiveness of the proposed method is verified by simulations of the Three-tank system. Copyright © 2017 ISA. Published by Elsevier Ltd. All rights reserved.

  11. Multivariate regression analysis for determining short-term values of radon and its decay products from filter measurements

    Kraut, W.; Schwarz, W.; Wilhelm, A.

    1994-01-01

    A multivariate regression analysis is applied to decay measurements of α- or β-filter activity. Activity concentrations for Po-218, Pb-214 and Bi-214, or for the Rn-222 equilibrium equivalent concentration, are obtained explicitly. The regression analysis properly takes into account the variances of the measured count rates and their influence on the resulting activity concentrations. (orig.)

  12. Evaluation of linear regression techniques for atmospheric applications: the importance of appropriate weighting

    C. Wu

    2018-03-01

    Full Text Available Linear regression techniques are widely used in atmospheric science, but they are often improperly applied due to lack of consideration or inappropriate handling of measurement uncertainty. In this work, numerical experiments are performed to evaluate the performance of five linear regression techniques, significantly extending previous works by Chu and Saylor. The five techniques are ordinary least squares (OLS), Deming regression (DR), orthogonal distance regression (ODR), weighted ODR (WODR), and York regression (YR). We first introduce a new data generation scheme that employs the Mersenne twister (MT) pseudorandom number generator. The numerical simulations are also improved by (a) refining the parameterization of nonlinear measurement uncertainties, (b) inclusion of a linear measurement uncertainty, and (c) inclusion of WODR for comparison. Results show that DR, WODR and YR produce an accurate slope, but the intercept by WODR and YR is overestimated and the degree of bias is more pronounced with a low R² XY dataset. The importance of a proper weighting parameter λ in DR is investigated by sensitivity tests, and it is found that an improper λ in DR can lead to a bias in both the slope and intercept estimation. Because the λ calculation depends on the actual form of the measurement error, it is essential to determine the exact form of measurement error in the XY data during the measurement stage. If a priori error in one of the variables is unknown, or the measurement error described cannot be trusted, DR, WODR and YR can provide the least biases in slope and intercept among all tested regression techniques. For these reasons, DR, WODR and YR are recommended for atmospheric studies when both X and Y data have measurement errors. An Igor Pro-based program (Scatter Plot) was developed to facilitate the implementation of error-in-variables regressions.

  13. Evaluation of linear regression techniques for atmospheric applications: the importance of appropriate weighting

    Wu, Cheng; Zhen Yu, Jian

    2018-03-01

    Linear regression techniques are widely used in atmospheric science, but they are often improperly applied due to lack of consideration or inappropriate handling of measurement uncertainty. In this work, numerical experiments are performed to evaluate the performance of five linear regression techniques, significantly extending previous works by Chu and Saylor. The five techniques are ordinary least squares (OLS), Deming regression (DR), orthogonal distance regression (ODR), weighted ODR (WODR), and York regression (YR). We first introduce a new data generation scheme that employs the Mersenne twister (MT) pseudorandom number generator. The numerical simulations are also improved by (a) refining the parameterization of nonlinear measurement uncertainties, (b) inclusion of a linear measurement uncertainty, and (c) inclusion of WODR for comparison. Results show that DR, WODR and YR produce an accurate slope, but the intercept by WODR and YR is overestimated and the degree of bias is more pronounced with a low R² XY dataset. The importance of a proper weighting parameter λ in DR is investigated by sensitivity tests, and it is found that an improper λ in DR can lead to a bias in both the slope and intercept estimation. Because the λ calculation depends on the actual form of the measurement error, it is essential to determine the exact form of measurement error in the XY data during the measurement stage. If a priori error in one of the variables is unknown, or the measurement error described cannot be trusted, DR, WODR and YR can provide the least biases in slope and intercept among all tested regression techniques. For these reasons, DR, WODR and YR are recommended for atmospheric studies when both X and Y data have measurement errors. An Igor Pro-based program (Scatter Plot) was developed to facilitate the implementation of error-in-variables regressions.
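
    Deming regression has a closed-form slope once the ratio of error variances is fixed, which makes the role of the weighting parameter easy to see. In the sketch below, `delta` is taken as the variance of the errors in Y divided by the variance of the errors in X; that convention is an assumption and may not match the paper's definition of λ. The data are hypothetical, and the formula requires a nonzero sample covariance.

```python
import math


def deming_fit(x, y, delta=1.0):
    """Deming regression slope and intercept.

    delta is the assumed ratio var(errors in y) / var(errors in x);
    delta = 1 gives orthogonal regression.
    """
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x) / (n - 1)
    syy = sum((yi - my) ** 2 for yi in y) / (n - 1)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / (n - 1)
    diff = syy - delta * sxx
    slope = (diff + math.sqrt(diff ** 2 + 4.0 * delta * sxy ** 2)) / (2.0 * sxy)
    return slope, my - slope * mx


def ols_fit(x, y):
    """Ordinary least squares slope and intercept (errors assumed in y only)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum(
        (xi - mx) ** 2 for xi in x
    )
    return slope, my - slope * mx


# Hypothetical measurements with noise in both variables.
x = [1.1, 1.9, 3.2, 3.8, 5.1]
y = [3.2, 4.8, 7.1, 9.3, 10.9]

print("OLS:            ", ols_fit(x, y))
print("Deming, delta=1:", deming_fit(x, y, delta=1.0))
print("Deming, delta=4:", deming_fit(x, y, delta=4.0))
```

    Changing `delta` shifts the fitted line between the two extremes of regressing Y on X and X on Y, which is why a poorly chosen weighting parameter can bias both slope and intercept.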

  14. Linear Multivariable Regression Models for Prediction of Eddy Dissipation Rate from Available Meteorological Data

    McKissick, Burnell T. (Technical Monitor); Plassman, Gerald E.; Mall, Gerald H.; Quagliano, John R.

    2005-01-01

    Linear multivariable regression models for predicting day and night Eddy Dissipation Rate (EDR) from available meteorological data sources are defined and validated. Model definition is based on a combination of 1997-2000 Dallas/Fort Worth (DFW) data sources, EDR from Aircraft Vortex Spacing System (AVOSS) deployment data, and regression variables primarily from corresponding Automated Surface Observation System (ASOS) data. Model validation is accomplished through EDR predictions on a similar combination of 1994-1995 Memphis (MEM) AVOSS and ASOS data. Model forms include an intercept plus a single term of fixed optimal power for each of these regression variables: 30-minute forward averaged mean and variance of near-surface wind speed and temperature, variance of wind direction, and a discrete cloud cover metric. Distinct day and night models, regressing on EDR and the natural log of EDR respectively, yield best performance and avoid model discontinuity over day/night data boundaries.

  15. A method for fitting regression splines with varying polynomial order in the linear mixed model.

    Edwards, Lloyd J; Stewart, Paul W; MacDougall, James E; Helms, Ronald W

    2006-02-15

    The linear mixed model has become a widely used tool for longitudinal analysis of continuous variables. The use of regression splines in these models offers the analyst additional flexibility in the formulation of descriptive analyses, exploratory analyses and hypothesis-driven confirmatory analyses. We propose a method for fitting piecewise polynomial regression splines with varying polynomial order in the fixed effects and/or random effects of the linear mixed model. The polynomial segments are explicitly constrained by side conditions for continuity and some smoothness at the points where they join. By using a reparameterization of this explicitly constrained linear mixed model, an implicitly constrained linear mixed model is constructed that simplifies implementation of fixed-knot regression splines. The proposed approach is relatively simple, handles splines in one variable or multiple variables, and can be easily programmed using existing commercial software such as SAS or S-plus. The method is illustrated using two examples: an analysis of longitudinal viral load data from a study of subjects with acute HIV-1 infection and an analysis of 24-hour ambulatory blood pressure profiles.

  16. Support Vector Regression-Based Adaptive Divided Difference Filter for Nonlinear State Estimation Problems

    Hongjian Wang

    2014-01-01

    We present a support vector regression-based adaptive divided difference filter (SVRADDF) algorithm for improving the low state estimation accuracy of nonlinear systems, which are typically affected by large initial estimation errors and imprecise prior knowledge of process and measurement noises. The derivative-free SVRADDF algorithm is significantly simpler to compute than other methods and is implemented using only functional evaluations. The SVRADDF algorithm involves the use of the theoretical and actual covariance of the innovation sequence. Support vector regression (SVR) is employed to generate the adaptive factor to tune the noise covariance at each sampling instant when the measurement update step executes, which improves the algorithm’s robustness. The performance of the proposed algorithm is evaluated by estimating states for (i) an underwater nonmaneuvering target bearing-only tracking system and (ii) maneuvering target bearing-only tracking in an air-traffic control system. The simulation results show that the proposed SVRADDF algorithm exhibits better performance when compared with a traditional DDF algorithm.

  17. Linear regression analysis: part 14 of a series on evaluation of scientific publications.

    Schneider, Astrid; Hommel, Gerhard; Blettner, Maria

    2010-11-01

    Regression analysis is an important statistical method for the analysis of medical data. It enables the identification and characterization of relationships among multiple factors. It also enables the identification of prognostically relevant risk factors and the calculation of risk scores for individual prognostication. This article is based on selected textbooks of statistics, a selective review of the literature, and our own experience. After a brief introduction of the uni- and multivariable regression models, illustrative examples are given to explain what the important considerations are before a regression analysis is performed, and how the results should be interpreted. The reader should then be able to judge whether the method has been used correctly and interpret the results appropriately. The performance and interpretation of linear regression analysis are subject to a variety of pitfalls, which are discussed here in detail. The reader is made aware of common errors of interpretation through practical examples. Both the opportunities for applying linear regression analysis and its limitations are presented.

  18. Linear Regression on Sparse Features for Single-Channel Speech Separation

    Schmidt, Mikkel N.; Olsson, Rasmus Kongsgaard

    2007-01-01

    In this work we address the problem of separating multiple speakers from a single microphone recording. We formulate a linear regression model for estimating each speaker based on features derived from the mixture. The employed feature representation is a sparse, non-negative encoding of the speech mixture in terms of pre-learned speaker-dependent dictionaries. Previous work has shown that this feature representation by itself provides some degree of separation. We show that the performance is significantly improved when regression analysis is performed on the sparse, non-negative features, both...

  19. Linear regression based on Minimum Covariance Determinant (MCD) and TELBS methods on the productivity of phytoplankton

    Gusriani, N.; Firdaniza

    2018-03-01

    The existence of outliers on multiple linear regression analysis causes the Gaussian assumption to be unfulfilled. If the Least Square method is forcedly used on these data, it will produce a model that cannot represent most data. For that, we need a robust regression method against outliers. This paper will compare the Minimum Covariance Determinant (MCD) method and the TELBS method on secondary data on the productivity of phytoplankton, which contains outliers. Based on the robust determinant coefficient value, MCD method produces a better model compared to TELBS method.

  20. Prediction of Mind-Wandering with Electroencephalogram and Non-linear Regression Modeling.

    Kawashima, Issaku; Kumano, Hiroaki

    2017-01-01

    Mind-wandering (MW), task-unrelated thought, has been examined by researchers in an increasing number of articles using models to predict whether subjects are in MW, using numerous physiological variables. However, these models are not applicable in general situations. Moreover, they output only binary classification. The current study suggests that the combination of electroencephalogram (EEG) variables and non-linear regression modeling can be a good indicator of MW intensity. We recorded EEGs of 50 subjects during the performance of a Sustained Attention to Response Task, including a thought sampling probe that inquired the focus of attention. We calculated the power and coherence value and prepared 35 patterns of variable combinations and applied Support Vector machine Regression (SVR) to them. Finally, we chose four SVR models: two of them non-linear models and the others linear models; two of the four models are composed of a limited number of electrodes to satisfy model usefulness. Examination using the held-out data indicated that all models had robust predictive precision and provided significantly better estimations than a linear regression model using single electrode EEG variables. Furthermore, in limited electrode condition, non-linear SVR model showed significantly better precision than linear SVR model. The method proposed in this study helps investigations into MW in various little-examined situations. Further, by measuring MW with a high temporal resolution EEG, unclear aspects of MW, such as time series variation, are expected to be revealed. Furthermore, our suggestion that a few electrodes can also predict MW contributes to the development of neuro-feedback studies.

  1. Prediction of Mind-Wandering with Electroencephalogram and Non-linear Regression Modeling

    Issaku Kawashima

    2017-07-01

    Mind-wandering (MW), task-unrelated thought, has been examined by researchers in an increasing number of articles using models to predict whether subjects are in MW, using numerous physiological variables. However, these models are not applicable in general situations. Moreover, they output only binary classification. The current study suggests that the combination of electroencephalogram (EEG) variables and non-linear regression modeling can be a good indicator of MW intensity. We recorded EEGs of 50 subjects during the performance of a Sustained Attention to Response Task, including a thought sampling probe that inquired the focus of attention. We calculated the power and coherence value and prepared 35 patterns of variable combinations and applied Support Vector machine Regression (SVR) to them. Finally, we chose four SVR models: two of them non-linear models and the others linear models; two of the four models are composed of a limited number of electrodes to satisfy model usefulness. Examination using the held-out data indicated that all models had robust predictive precision and provided significantly better estimations than a linear regression model using single electrode EEG variables. Furthermore, in limited electrode condition, non-linear SVR model showed significantly better precision than linear SVR model. The method proposed in this study helps investigations into MW in various little-examined situations. Further, by measuring MW with a high temporal resolution EEG, unclear aspects of MW, such as time series variation, are expected to be revealed. Furthermore, our suggestion that a few electrodes can also predict MW contributes to the development of neuro-feedback studies.

  2. Error analysis of dimensionless scaling experiments with multiple points using linear regression

    Guercan, Oe.D.; Vermare, L.; Hennequin, P.; Bourdelle, C.

    2010-01-01

    A general method of error estimation in the case of multiple point dimensionless scaling experiments, using linear regression and standard error propagation, is proposed. The method reduces to the previous result of Cordey (2009 Nucl. Fusion 49 052001) in the case of a two-point scan. On the other hand, if the points follow a linear trend, it explains how the estimated error decreases as more points are added to the scan. Based on the analytical expression that is derived, it is argued that for a low number of points, adding points to the ends of the scanned range, rather than the middle, results in a smaller error estimate. (letter)
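
    The two qualitative claims above (the error shrinks as points are added, and end points help more than middle points) can be checked with the textbook standard error of an ordinary least squares slope, sigma / sqrt(sum((x - mean(x))^2)). The sketch below is only that textbook formula under the assumption of equal, independent errors `sigma` on y; it is not the analytical expression derived in the letter, and the scan positions are made-up illustrative values.

```python
import math


def slope_standard_error(x, sigma):
    """Standard error of the OLS slope, assuming equal independent y-errors."""
    n = len(x)
    mean_x = sum(x) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sigma / math.sqrt(sxx)


sigma = 1.0  # hypothetical measurement error on each y value

two_point = [0.0, 1.0]               # original two-point scan
ends_added = [0.0, 0.0, 1.0, 1.0]    # two extra points at the ends
middle_added = [0.0, 0.5, 0.5, 1.0]  # two extra points in the middle

for name, scan in [("two-point", two_point),
                   ("ends added", ends_added),
                   ("middle added", middle_added)]:
    print(name, slope_standard_error(scan, sigma))
```

    In this toy case the points added at the middle of the range do not change the sum of squared deviations of x, so they leave the slope error unchanged, while the points added at the ends reduce it.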

  3. Kalman filter with a linear state model for PDR+WLAN positioning and its application to assisting a particle filter

    Raitoharju, Matti; Nurminen, Henri; Piché, Robert

    2015-12-01

    Indoor positioning based on wireless local area network (WLAN) signals is often enhanced using pedestrian dead reckoning (PDR) based on an inertial measurement unit. The state evolution model in PDR is usually nonlinear. We present a new linear state evolution model for PDR. In simulated-data and real-data tests of tightly coupled WLAN-PDR positioning, the positioning accuracy with this linear model is better than with the traditional models when the initial heading is not known, which is a common situation. The proposed method is computationally light and is also suitable for smoothing. Furthermore, we present modifications to WLAN positioning based on Gaussian coverage areas and show how a Kalman filter using the proposed model can be used for integrity monitoring and (re)initialization of a particle filter.

  4. Dynamic Optimization for IPS2 Resource Allocation Based on Improved Fuzzy Multiple Linear Regression

    Maokuan Zheng

    2017-01-01

    The study mainly focuses on resource allocation optimization for industrial product-service systems (IPS2). The development of IPS2 leads to sustainable economy by introducing cooperative mechanisms apart from commodity transaction. The randomness and fluctuation of service requests from customers lead to the volatility of IPS2 resource utilization ratio. Three basic rules for resource allocation optimization are put forward to improve system operation efficiency and cut unnecessary costs. An approach based on fuzzy multiple linear regression (FMLR) is developed, which integrates the strength and concision of multiple linear regression in data fitting and factor analysis and the merit of fuzzy theory in dealing with uncertain or vague problems, which helps reduce those costs caused by unnecessary resource transfer. The iteration mechanism is introduced in the FMLR algorithm to improve forecasting accuracy. A case study of human resource allocation optimization in construction machinery industry is implemented to test and verify the proposed model.

  5. COLOR IMAGE RETRIEVAL BASED ON FEATURE FUSION THROUGH MULTIPLE LINEAR REGRESSION ANALYSIS

    K. Seetharaman

    2015-08-01

    This paper proposes a novel technique based on feature fusion using multiple linear regression analysis, and the least-square estimation method is employed to estimate the parameters. The given input query image is segmented into various regions according to the structure of the image. The color and texture features are extracted on each region of the query image, and the features are fused together using the multiple linear regression model. The estimated parameters of the model, which is built on these features, form a vector called a feature vector. The Canberra distance measure is adopted to compare the feature vectors of the query and target images. The F-measure is applied to evaluate the performance of the proposed technique. The obtained results show that the proposed technique is comparable to the other existing techniques.

  6. BFLCRM: A BAYESIAN FUNCTIONAL LINEAR COX REGRESSION MODEL FOR PREDICTING TIME TO CONVERSION TO ALZHEIMER'S DISEASE.

    Lee, Eunjee; Zhu, Hongtu; Kong, Dehan; Wang, Yalin; Giovanello, Kelly Sullivan; Ibrahim, Joseph G

    2015-12-01

    The aim of this paper is to develop a Bayesian functional linear Cox regression model (BFLCRM) with both functional and scalar covariates. This new development is motivated by establishing the likelihood of conversion to Alzheimer's disease (AD) in 346 patients with mild cognitive impairment (MCI) enrolled in the Alzheimer's Disease Neuroimaging Initiative 1 (ADNI-1) and the early markers of conversion. These 346 MCI patients were followed over 48 months, with 161 MCI participants progressing to AD at 48 months. The functional linear Cox regression model was used to establish that functional covariates including hippocampus surface morphology and scalar covariates including brain MRI volumes, cognitive performance (ADAS-Cog), and APOE status can accurately predict time to onset of AD. Posterior computation proceeds via an efficient Markov chain Monte Carlo algorithm. A simulation study is performed to evaluate the finite sample performance of BFLCRM.

  7. Inverse estimation of multiple muscle activations based on linear logistic regression.

    Sekiya, Masashi; Tsuji, Toshiaki

    2017-07-01

    This study deals with a technology to estimate muscle activity from movement data using a statistical model. A linear regression (LR) model and artificial neural networks (ANN) have been known as statistical models for such use. Although ANN has a high estimation capability, in clinical applications a shortage of data often degrades its performance. On the other hand, the LR model has a limitation in generalization performance. We therefore propose a muscle activity estimation method that improves the generalization performance through the use of a linear logistic regression model. The proposed method was compared with the LR model and ANN in a verification experiment with 7 participants. As a result, the proposed method showed better generalization performance than the conventional methods in various tasks.

  8. User's Guide to the Weighted-Multiple-Linear Regression Program (WREG version 1.0)

    Eng, Ken; Chen, Yin-Yu; Kiang, Julie E.

    2009-01-01

    Streamflow is not measured at every location in a stream network. Yet hydrologists, State and local agencies, and the general public still seek to know streamflow characteristics, such as mean annual flow or flood flows with different exceedance probabilities, at ungaged basins. The goals of this guide are to introduce and familiarize the user with the weighted multiple-linear regression (WREG) program, and to also provide the theoretical background for program features. The program is intended to be used to develop a regional estimation equation for streamflow characteristics that can be applied at an ungaged basin, or to improve the corresponding estimate at continuous-record streamflow gages with short records. The regional estimation equation results from a multiple-linear regression that relates the observable basin characteristics, such as drainage area, to streamflow characteristics.
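
    A weighted multiple-linear regression of the kind described above can be sketched by solving the weighted normal equations (X^T W X) b = X^T W y. This is a generic illustration, not the WREG program or its interface: the function names, the toy data, and the weights (which could stand for something like record length at each gage) are all hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("normal equations are singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def weighted_least_squares(rows, y, weights):
    """Return [intercept, b1, b2, ...] minimizing sum(w * residual**2)."""
    design = [[1.0] + list(r) for r in rows]  # prepend intercept column
    p = len(design[0])
    xtwx = [[sum(w * d[i] * d[j] for d, w in zip(design, weights))
             for j in range(p)] for i in range(p)]
    xtwy = [sum(w * d[i] * yi for d, yi, w in zip(design, y, weights))
            for i in range(p)]
    return solve_linear_system(xtwx, xtwy)


# Hypothetical basin characteristics (two predictors) and a response
# generated exactly from y = 1 + 2*x1 + 3*x2.
rows = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
y = [1.0, 3.0, 4.0, 6.0, 8.0]
weights = [1.0, 2.0, 1.0, 0.5, 3.0]  # hypothetical per-site weights
print(weighted_least_squares(rows, y, weights))
```

    Because the toy response is noise-free, any positive weights return the same coefficients; with real data the weights shift the fit toward the sites considered more reliable.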

  9. Alzheimer's Disease Detection by Pseudo Zernike Moment and Linear Regression Classification.

    Wang, Shui-Hua; Du, Sidan; Zhang, Yin; Phillips, Preetha; Wu, Le-Nan; Chen, Xian-Qing; Zhang, Yu-Dong

    2017-01-01

    This study presents an improved method based on "Gorji et al. Neuroscience. 2015" by introducing a relatively new classifier: linear regression classification. Our method selects one axial slice from a 3D brain image and employs the pseudo Zernike moment with a maximum order of 15 to extract 256 features from each image. Finally, linear regression classification is used as the classifier. The proposed approach obtains an accuracy of 97.51%, a sensitivity of 96.71%, and a specificity of 97.73%. Our method performs better than Gorji's approach and five other state-of-the-art approaches.

  10. On the Relationship Between Confidence Sets and Exchangeable Weights in Multiple Linear Regression.

    Pek, Jolynn; Chalmers, R Philip; Monette, Georges

    2016-01-01

    When statistical models are employed to provide a parsimonious description of empirical relationships, the extent to which strong conclusions can be drawn rests on quantifying the uncertainty in parameter estimates. In multiple linear regression (MLR), regression weights carry two kinds of uncertainty represented by confidence sets (CSs) and exchangeable weights (EWs). Confidence sets quantify uncertainty in estimation whereas the set of EWs quantify uncertainty in the substantive interpretation of regression weights. As CSs and EWs share certain commonalities, we clarify the relationship between these two kinds of uncertainty about regression weights. We introduce a general framework describing how CSs and the set of EWs for regression weights are estimated from the likelihood-based and Wald-type approach, and establish the analytical relationship between CSs and sets of EWs. With empirical examples on posttraumatic growth of caregivers (Cadell et al., 2014; Schneider, Steele, Cadell & Hemsworth, 2011) and on graduate grade point average (Kuncel, Hezlett & Ones, 2001), we illustrate the usefulness of CSs and EWs for drawing strong scientific conclusions. We discuss the importance of considering both CSs and EWs as part of the scientific process, and provide an Online Appendix with R code for estimating Wald-type CSs and EWs for k regression weights.

  11. MULTIPLE LINEAR REGRESSION ANALYSIS FOR PREDICTION OF BOILER LOSSES AND BOILER EFFICIENCY

    Chayalakshmi C.L

    2018-01-01

    Calculation of boiler efficiency is essential if its parameters need to be controlled for either maintaining or enhancing its efficiency. But determination of boiler efficiency using conventional method is time consuming and very expensive. Hence, it is not recommended to find boiler efficiency frequently. The work presented in this paper deals with establishing the statistical mo...

  12. A Simple Linear Regression Method for Quantitative Trait Loci Linkage Analysis With Censored Observations

    Anderson, Carl A.; McRae, Allan F.; Visscher, Peter M.

    2006-01-01

    Standard quantitative trait loci (QTL) mapping techniques commonly assume that the trait is both fully observed and normally distributed. When considering survival or age-at-onset traits these assumptions are often incorrect. Methods have been developed to map QTL for survival traits; however, they are both computationally intensive and not available in standard genome analysis software packages. We propose a grouped linear regression method for the analysis of continuous survival data. Using...

  13. The detection of influential subsets in linear regression using an influence matrix

    Peña, Daniel; Yohai, Víctor J.

    1991-01-01

    This paper presents a new method to identify influential subsets in linear regression problems. The procedure uses the eigenstructure of an influence matrix which is defined as the matrix of uncentered covariance of the effect on the whole data set of deleting each observation, normalized to include the univariate Cook's statistics in the diagonal. It is shown that points in an influential subset will appear with large weight in at least one of the eigenvector linked to the largest eigenvalue...

  14. USE OF THE SIMPLE LINEAR REGRESSION MODEL IN MACRO-ECONOMICAL ANALYSES

    Constantin ANGHELACHE

    2011-10-01

    The article presents the fundamental aspects of linear regression as a toolbox which can be used in macroeconomic analyses. The article describes the estimation of the parameters, the statistical tests used, and homoscedasticity and heteroscedasticity. The use of econometric instruments in macroeconomics is an important factor that guarantees the quality of the models, analyses, results and possible interpretations that can be drawn at this level.

  15. The regression-calibration method for fitting generalized linear models with additive measurement error

    James W. Hardin; Henrik Schmeidiche; Raymond J. Carroll

    2003-01-01

    This paper discusses and illustrates the method of regression calibration. This is a straightforward technique for fitting models with additive measurement error. We present this discussion in terms of generalized linear models (GLMs) following the notation defined in Hardin and Carroll (2003). Discussion will include specified measurement error, measurement error estimated by replicate error-prone proxies, and measurement error estimated by instrumental variables. The discussion focuses on s...

  16. Applications of Kalman Filtering to nuclear material control. [Kalman filtering and linear smoothing for detecting nuclear material losses

    Pike, D.H.; Morrison, G.W.; Westley, G.W.

    1977-10-01

    The feasibility of using modern state estimation techniques (specifically Kalman Filtering and Linear Smoothing) to detect losses of material from material balance areas is evaluated. It is shown that state estimation techniques are not only feasible but in most situations are superior to existing methods of analysis. The various techniques compared include Kalman Filtering, linear smoothing, standard control charts, and average cumulative summation (CUSUM) charts. Analysis results indicated that the standard control chart is the least effective method for detecting regularly occurring losses. An improvement in the detection capability over the standard control chart can be realized by use of the CUSUM chart. Even more sensitivity in the ability to detect losses can be realized by use of the Kalman Filter and the linear smoother. It was found that the error-covariance matrix can be used to establish limits of error for state estimates. It is shown that state estimation techniques represent a feasible and desirable method of theft detection. The technique is usually more sensitive than the CUSUM chart in detecting losses. One kind of loss which is difficult to detect using state estimation techniques is a single isolated loss. State estimation procedures are predicated on dynamic models and are well-suited for detecting losses which occur regularly over several accounting periods. A single isolated loss does not conform to this basic assumption and is more difficult to detect.
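
    The contrast between a standard control chart and a CUSUM chart for small, regularly occurring losses can be shown with a short sketch. It uses a one-sided tabular CUSUM, which is an assumed variant and may differ from the "average cumulative summation" chart the report evaluates. The reference value `k`, the decision limit `h`, and the noise-free loss series are hypothetical, and values are in units of the measurement standard deviation.

```python
def shewhart_first_alarm(values, limit=3.0):
    """Index of the first value beyond +/- limit, or None if none occurs."""
    for i, v in enumerate(values):
        if abs(v) > limit:
            return i
    return None


def cusum_first_alarm(values, k=0.25, h=4.0):
    """Index at which a one-sided tabular CUSUM first exceeds h, else None."""
    s = 0.0
    for i, v in enumerate(values):
        # Accumulate only the part of each value above the reference k.
        s = max(0.0, s + v - k)
        if s > h:
            return i
    return None


# Hypothetical material-balance discrepancies: a steady loss of 0.5 sigma
# in every accounting period (noise omitted to keep the example exact).
losses = [0.5] * 30

print(shewhart_first_alarm(losses))  # None: no single period exceeds 3 sigma
print(cusum_first_alarm(losses))     # 16: the accumulated excess passes h
```

    A single isolated loss behaves differently: it contributes to the cumulative sum only once, which matches the abstract's remark that such losses are harder to detect with methods built around persistent behavior.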

  17. A dynamic particle filter-support vector regression method for reliability prediction

    Wei, Zhao; Tao, Tao; ZhuoShu, Ding; Zio, Enrico

    2013-01-01

    Support vector regression (SVR) has been applied to time series prediction and some works have demonstrated the feasibility of its use to forecast system reliability. For accuracy of reliability forecasting, the selection of SVR's parameters is important. The existing research works on SVR's parameters selection divide the example dataset into training and test subsets, and tune the parameters on the training data. However, these fixed parameters can lead to poor prediction capabilities if the data of the test subset differ significantly from those of training. Differently, the novel method proposed in this paper uses particle filtering to estimate the SVR model parameters according to the whole measurement sequence up to the last observation instance. By treating the SVR training model as the observation equation of a particle filter, our method allows updating the SVR model parameters dynamically when a new observation comes. Because of the adaptability of the parameters to dynamic data pattern, the new PF–SVR method has superior prediction performance over that of standard SVR. Four application results show that PF–SVR is more robust than SVR to the decrease of the number of training data and the change of initial SVR parameter values. Also, even if there are trends in the test data different from those in the training data, the method can capture the changes, correct the SVR parameters and obtain good predictions. -- Highlights: •A dynamic PF–SVR method is proposed to predict the system reliability. •The method can adjust the SVR parameters according to the change of data. •The method is robust to the size of training data and initial parameter values. •Some cases based on both artificial and real data are studied. •PF–SVR shows superior prediction performance over standard SVR

  18. Comparison of l₁-Norm SVR and Sparse Coding Algorithms for Linear Regression.

    Zhang, Qingtian; Hu, Xiaolin; Zhang, Bo

    2015-08-01

    Support vector regression (SVR) is a popular function estimation technique based on Vapnik's concept of support vector machine. Among many variants, the l1-norm SVR is known to be good at selecting useful features when the features are redundant. Sparse coding (SC) is a technique widely used in many areas and a number of efficient algorithms are available. Both l1-norm SVR and SC can be used for linear regression. In this brief, the close connection between the l1-norm SVR and SC is revealed and some typical algorithms are compared for linear regression. The results show that the SC algorithms outperform the Newton linear programming algorithm, an efficient l1-norm SVR algorithm, in efficiency. The algorithms are then used to design the radial basis function (RBF) neural networks. Experiments on some benchmark data sets demonstrate the high efficiency of the SC algorithms. In particular, one of the SC algorithms, the orthogonal matching pursuit is two orders of magnitude faster than a well-known RBF network designing algorithm, the orthogonal least squares algorithm.

  19. Privacy-Preserving Distributed Linear Regression on High-Dimensional Data

    Gascón Adrià

    2017-10-01

    We propose privacy-preserving protocols for computing linear regression models, in the setting where the training dataset is vertically distributed among several parties. Our main contribution is a hybrid multi-party computation protocol that combines Yao’s garbled circuits with tailored protocols for computing inner products. Like many machine learning tasks, building a linear regression model involves solving a system of linear equations. We conduct a comprehensive evaluation and comparison of different techniques for securely performing this task, including a new Conjugate Gradient Descent (CGD) algorithm. This algorithm is suitable for secure computation because it uses an efficient fixed-point representation of real numbers while maintaining accuracy and convergence rates comparable to what can be obtained with a classical solution using floating point numbers. Our technique improves on Nikolaenko et al.’s method for privacy-preserving ridge regression (S&P 2013), and can be used as a building block in other analyses. We implement a complete system and demonstrate that our approach is highly scalable, solving data analysis problems with one million records and one hundred features in less than one hour of total running time.

  20. LINEAR REGRESSION MODEL ESTIMATION FOR RIGHT CENSORED DATA

    Ersin Yılmaz

    2016-05-01

    In this study, we first define right-censored data. Briefly, right-censored data are observations whose values above a certain limit are not recorded exactly, which may be related to the measuring device. We then use a response variable obtained from right-censored explanatory variables and estimate the linear regression model. Because the data are censored, Kaplan-Meier weights are used to estimate the model; with these weights, the regression model is consistent and unbiased. A semiparametric regression method is also available for censored data, and it likewise gives useful results. This study may also be useful for health studies, since censored data are commonly encountered in medical research.

  1. [Multiple linear regression analysis of X-ray measurement and WOMAC scores of knee osteoarthritis].

    Ma, Yu-Feng; Wang, Qing-Fu; Chen, Zhao-Jun; Du, Chun-Lin; Li, Jun-Hai; Huang, Hu; Shi, Zong-Ting; Yin, Yue-Shan; Zhang, Lei; A-Di, Li-Jiang; Dong, Shi-Yu; Wu, Ji

    2012-05-01

    To perform multiple linear regression analysis of X-ray measurements and WOMAC scores of knee osteoarthritis, and to analyze their relationship with clinical and biomechanical concepts. From March 2011 to July 2011, 140 patients (250 knees) were reviewed, including 132 left knees and 118 right knees; patients ranged in age from 40 to 71 years, with an average of 54.68 years. The MB-RULER measurement software was applied to measure the femoral angle, tibial angle, femorotibial angle, and joint gap angle from antero-posterior (AP) and lateral X-rays. The WOMAC scores were also collected. Multiple regression equations were then applied for the linear regression analysis of the correlation between the X-ray measurements and WOMAC scores. There was statistical significance in the regression equation of AP X-ray values and WOMAC scores (P value lost in extraction), but not in the regression equation of lateral X-ray values and WOMAC scores (P>0.05). 1) X-ray measurement of the knee joint can reflect the WOMAC scores to a certain extent. 2) It is necessary to measure the X-ray mechanical axis of the knee, which is important for diagnosis and treatment of osteoarthritis. 3) The correlation between the tibial angle and joint gap angle on antero-posterior X-ray and WOMAC scores is significant, which can be used to assess the functional recovery of patients before and after treatment.

  2. Using the fuzzy linear regression method to benchmark the energy efficiency of commercial buildings

    Chung, William

    2012-01-01

    Highlights: ► Fuzzy linear regression method is used for developing benchmarking systems. ► The systems can be used to benchmark energy efficiency of commercial buildings. ► The resulting benchmarking model can be used by public users. ► The resulting benchmarking model can capture the fuzzy nature of input–output data. -- Abstract: Benchmarking systems from a sample of reference buildings need to be developed to conduct benchmarking processes for the energy efficiency of commercial buildings. However, not all benchmarking systems can be adopted by public users (i.e., other non-reference building owners) because of the different methods in developing such systems. An approach for benchmarking the energy efficiency of commercial buildings using statistical regression analysis to normalize other factors, such as management performance, was developed in a previous work. However, the field data given by experts can be regarded as a distribution of possibility. Thus, the previous work may not be adequate to handle such fuzzy input–output data. Consequently, a number of fuzzy structures cannot be fully captured by statistical regression analysis. This present paper proposes the use of fuzzy linear regression analysis to develop a benchmarking process, the resulting model of which can be used by public users. An illustrative example is given as well.

  3. A highly linear fully integrated powerline filter for biopotential acquisition systems.

    Alzaher, Hussain A; Tasadduq, Noman; Mahnashi, Yaqub

    2013-10-01

    Powerline interference is one of the most dominant problems in detection and processing of biopotential signals. This work presents a new fully integrated notch filter exhibiting high linearity and low power consumption. High filter linearity is preserved utilizing active-RC approach while IC implementation is achieved through replacing passive resistors by R-2R ladders achieving area saving of approximately 120 times. The filter design is optimized for low power operation using an efficient circuit topology and an ultra-low power operational amplifier. Fully differential implementation of the proposed filter shows notch depth of 43 dB (78 dB for 4th-order) with THD of better than -70 dB while consuming about 150 nW from 1.5 V supply.

  4. A highly linear baseband Gm-C filter for WLAN application

    Lijun, Yang; Zheng, Gong; Yin, Shi; Zhiming, Chen

    2011-09-01

    A low voltage, highly linear transconductance-C (Gm-C) low-pass filter for wireless local area network (WLAN) transceiver application is proposed. This transmitter (Tx) filter adopts a 9.8 MHz 3rd-order Chebyshev low pass prototype and achieves 35 dB stop-band attenuation at 30 MHz frequency. By utilizing pseudo-differential linear-region MOS transconductors, the filter IIP3 is measured to be as high as 9.5 dBm. Fabricated in a 0.35 μm standard CMOS technology, the proposed filter chip occupies a 0.41 × 0.17 mm² die area and consumes 3.36 mA from a 3.3-V power supply.

  5. A highly linear baseband Gm-C filter for WLAN application

    Yang Lijun; Chen Zhiming; Gong Zheng; Shi Yin

    2011-01-01

    A low voltage, highly linear transconductance-C (Gm-C) low-pass filter for wireless local area network (WLAN) transceiver application is proposed. This transmitter (Tx) filter adopts a 9.8 MHz 3rd-order Chebyshev low pass prototype and achieves 35 dB stop-band attenuation at 30 MHz frequency. By utilizing pseudo-differential linear-region MOS transconductors, the filter IIP3 is measured to be as high as 9.5 dBm. Fabricated in a 0.35 μm standard CMOS technology, the proposed filter chip occupies a 0.41 × 0.17 mm² die area and consumes 3.36 mA from a 3.3-V power supply. (semiconductor integrated circuits)

  6. A highly linear baseband Gm-C filter for WLAN application

    Yang Lijun; Chen Zhiming [Department of Electronic Engineering, Xi'an University of Technology, Xi'an 710048 (China)]; Gong Zheng; Shi Yin, E-mail: ljyang@sci-inc.com.cn [Suzhou-CAS Semiconductors Integrated Technology Research Center, Suzhou 215021 (China)]

    2011-09-15

    A low voltage, highly linear transconductance-C (Gm-C) low-pass filter for wireless local area network (WLAN) transceiver application is proposed. This transmitter (Tx) filter adopts a 9.8 MHz 3rd-order Chebyshev low pass prototype and achieves 35 dB stop-band attenuation at 30 MHz frequency. By utilizing pseudo-differential linear-region MOS transconductors, the filter IIP3 is measured to be as high as 9.5 dBm. Fabricated in a 0.35 μm standard CMOS technology, the proposed filter chip occupies a 0.41 × 0.17 mm² die area and consumes 3.36 mA from a 3.3-V power supply. (semiconductor integrated circuits)

  7. Evaluation of accuracy of linear regression models in predicting urban stormwater discharge characteristics.

    Madarang, Krish J; Kang, Joo-Hyon

    2014-06-01

    Stormwater runoff has been identified as a source of pollution for the environment, especially for receiving waters. In order to quantify and manage the impacts of stormwater runoff on the environment, predictive models and mathematical models have been developed. Predictive tools such as regression models have been widely used to predict stormwater discharge characteristics. Storm event characteristics, such as antecedent dry days (ADD), have been related to response variables, such as pollutant loads and concentrations. However, whether ADD should be considered an important variable in predicting stormwater discharge characteristics has been controversial among many studies. In this study, we examined the accuracy of general linear regression models in predicting discharge characteristics of roadway runoff. A total of 17 storm events were monitored in two highway segments, located in Gwangju, Korea. Data from the monitoring were used to calibrate the United States Environmental Protection Agency's Storm Water Management Model (SWMM). The calibrated SWMM was run for 55 storm events, and the results of total suspended solid (TSS) discharge loads and event mean concentrations (EMC) were extracted. From these data, linear regression models were developed. R² and p-values of the regression of ADD for both TSS loads and EMCs were investigated. Results showed that pollutant loads were better predicted than pollutant EMC in the multiple regression models. Regression may not provide the true effect of site-specific characteristics, due to uncertainty in the data. Copyright © 2014 The Research Centre for Eco-Environmental Sciences, Chinese Academy of Sciences. Published by Elsevier B.V. All rights reserved.

  8. Hippocampal atrophy and developmental regression as first sign of linear scleroderma "en coup de sabre".

    Verhelst, Helene E; Beele, Hilde; Joos, Rik; Vanneuville, Benedicte; Van Coster, Rudy N

    2008-11-01

    An 8-year-old girl with linear scleroderma "en coup de sabre" is reported who, at preschool age, presented with intractable simple partial seizures more than 1 year before skin lesions were first noticed. MRI revealed hippocampal atrophy, contralateral to the seizures and ipsilateral to the skin lesions. In the following months, a mental and motor regression was noticed. Cerebral CT scan showed multiple foci of calcifications in the affected hemisphere. In previously reported patients the skin lesions preceded the neurological signs. To the best of our knowledge, hippocampal atrophy has not previously been reported as a presenting symptom of linear scleroderma. Linear scleroderma should be included in the differential diagnosis in patients with unilateral hippocampal atrophy even when the typical skin lesions are not present.

  9. Lifted linear phase filter banks and the polyphase-with-advance representation

    Brislawn, C. M. (Christopher M.); Wohlberg, B. E. (Brendt E.)

    2004-01-01

    A matrix theory is developed for the noncausal polyphase-with-advance representation that underlies the theory of lifted perfect reconstruction filter banks and wavelet transforms as developed by Sweldens and Daubechies. This theory provides the fundamental lifting methodology employed in the ISO/IEC JPEG-2000 still image coding standard, which the authors helped to develop. Lifting structures for polyphase-with-advance filter banks are depicted in Figure 1. In the analysis bank of Figure 1(a), the first lifting step updates x_0 with a filtered version of x_1 and the second step updates x_1 with a filtered version of x_0; gain factors 1/K and K normalize the lowpass- and highpass-filtered output subbands. Each of these steps is inverted by the corresponding operations in the synthesis bank shown in Figure 1(b). Lifting steps correspond to upper- or lower-triangular matrices, S_i(z), in a cascade-form decomposition of the polyphase analysis matrix, H_a(z). Lifting structures can also be implemented reversibly (i.e., losslessly in fixed-precision arithmetic) by rounding the lifting updates to integer values. Our treatment of the polyphase-with-advance representation develops an extensive matrix algebra framework that goes far beyond the results of. Specifically, we focus on analyzing and implementing linear phase two-channel filter banks via linear phase lifting cascade schemes. Whole-sample symmetric (WS) and half-sample symmetric (HS) linear phase filter banks are characterized completely in terms of the polyphase-with-advance representation. The theory benefits significantly from a number of new group-theoretic structures arising in the polyphase-with-advance matrix algebra from the lifting factorization of linear phase filter banks.
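
    The two lifting steps and their reversible (integer-rounded) form described in this abstract can be sketched as below. This is an illustrative toy, not the filter bank from the paper: the filter taps (a two-tap average with floor rounding), the clamped boundary handling, and the names `analyze` and `synthesize` are assumptions, and the gain factors 1/K and K are omitted. The input is assumed to have even length.

```python
def analyze(signal):
    """Two integer lifting steps on the polyphase components x0 (even) and x1 (odd)."""
    x0, x1 = list(signal[0::2]), list(signal[1::2])
    n = len(x0)
    # Step 1: update x0 with a filtered, floor-rounded version of x1.
    for i in range(n):
        x0[i] += (x1[max(i - 1, 0)] + x1[i]) // 4
    # Step 2: update x1 with a filtered, floor-rounded version of the new x0.
    for i in range(n):
        x1[i] -= (x0[i] + x0[min(i + 1, n - 1)]) // 2
    return x0, x1


def synthesize(x0, x1):
    """Invert the lifting steps in reverse order with opposite signs."""
    x0, x1 = list(x0), list(x1)
    n = len(x0)
    # Undo step 2: x0 is unchanged since step 2, so the same rounded value is removed.
    for i in range(n):
        x1[i] += (x0[i] + x0[min(i + 1, n - 1)]) // 2
    # Undo step 1 using the restored x1.
    for i in range(n):
        x0[i] -= (x1[max(i - 1, 0)] + x1[i]) // 4
    out = [0] * (2 * n)
    out[0::2], out[1::2] = x0, x1
    return out


samples = [3, 7, 1, 8, 2, 9, 4, 6]
low, high = analyze(samples)
restored = synthesize(low, high)
```

    Each step changes one channel using only the other channel, so the inverse can recompute exactly the same rounded update and subtract it, which is why reconstruction is lossless in integer arithmetic.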

  10. Structural Shielding Design of a 6 MV Flattening Filter Free Linear Accelerator: Indian Scenario

    Mishra, Bibekananda; Selvam, T. Palani; Sharma, P. K. Dash

    2017-01-01

    Detailed structural shielding of primary and secondary barriers for a 6 MV medical linear accelerator (LINAC) operated with flattening filter (FF) and flattening filter free (FFF) modes are calculated. The calculations have been carried out by two methods, one using the approach given in National Council on Radiation Protection (NCRP) Report No. 151 and the other based on the monitor units (MUs) delivered in clinical practice. Radiation survey of the installations was also carried out. NCRP a...

  11. Design of Filter for a Class of Switched Linear Neutral Systems

    Caiyun Wu

    2013-01-01

    Full Text Available This paper is concerned with the filtering problem for a class of switched linear neutral systems with time-varying delays. The time-varying delays appear not only in the state but also in the state derivatives. Based on the average dwell time approach and the piecewise Lyapunov functional technique, sufficient conditions are proposed for the exponential stability of the filtering error dynamic system. Then, the corresponding solvability condition for a desired filter satisfying a weighted performance is established. All the conditions obtained are delay-dependent. Finally, two numerical examples are given to illustrate the effectiveness of the proposed theory.

  12. Computer software for linear and nonlinear regression in organic NMR; Programa de computador para regressao linear e nao linear em R.M.N. organica

    Canto, Eduardo Leite do; Rittner, Roberto [Universidade Estadual de Campinas, SP (Brazil). Inst. de Quimica

    1992-12-31

    Calculations involving two-variable linear regressions require specific procedures that are generally unfamiliar to chemists. To meet the need for fast and efficient handling of NMR data, a self-explanatory, PC-portable program has been developed that allows the user to produce and use tables recorded on diskette, containing chemical shifts or any other substituent physicochemical measurements and constants (σ_T, σ°_R, E_s, ...). 9 refs., 1 fig.

  13. Significance tests to determine the direction of effects in linear regression models.

    Wiedermann, Wolfgang; Hagmann, Michael; von Eye, Alexander

    2015-02-01

    Previous studies have discussed asymmetric interpretations of the Pearson correlation coefficient and have shown that higher moments can be used to decide on the direction of dependence in the bivariate linear regression setting. The current study extends this approach by illustrating that the third moment of regression residuals may also be used to derive conclusions concerning the direction of effects. Assuming non-normally distributed variables, it is shown that the distribution of residuals of the correctly specified regression model (e.g., Y is regressed on X) is more symmetric than the distribution of residuals of the competing model (i.e., X is regressed on Y). Based on this result, 4 one-sample tests are discussed which can be used to decide which variable is more likely to be the response and which one is more likely to be the explanatory variable. A fifth significance test is proposed based on the differences of skewness estimates, which leads to a more direct test of a hypothesis that is compatible with direction of dependence. A Monte Carlo simulation study was performed to examine the behaviour of the procedures under various degrees of associations, sample sizes, and distributional properties of the underlying population. An empirical example is given which illustrates the application of the tests in practice. © 2014 The British Psychological Society.

  14. Improving the Prediction of Total Surgical Procedure Time Using Linear Regression Modeling.

    Edelman, Eric R; van Kuijk, Sander M J; Hamaekers, Ankie E W; de Korte, Marcel J M; van Merode, Godefridus G; Buhre, Wolfgang F F A

    2017-01-01

    For efficient utilization of operating rooms (ORs), accurate schedules of assigned block time and sequences of patient cases need to be made. The quality of these planning tools is dependent on the accurate prediction of total procedure time (TPT) per case. In this paper, we attempt to improve the accuracy of TPT predictions by using linear regression models based on estimated surgeon-controlled time (eSCT) and other variables relevant to TPT. We extracted data from a Dutch benchmarking database of all surgeries performed in six academic hospitals in The Netherlands from 2012 till 2016. The final dataset consisted of 79,983 records, describing 199,772 h of total OR time. Potential predictors of TPT that were included in the subsequent analysis were eSCT, patient age, type of operation, American Society of Anesthesiologists (ASA) physical status classification, and type of anesthesia used. First, we computed the predicted TPT based on a previously described fixed ratio model for each record, multiplying eSCT by 1.33. This number is based on the research performed by van Veen-Berkx et al., which showed that 33% of SCT is generally a good approximation of anesthesia-controlled time (ACT). We then systematically tested all possible linear regression models to predict TPT using eSCT in combination with the other available independent variables. In addition, all regression models were again tested without eSCT as a predictor to predict ACT separately (which leads to TPT by adding SCT). TPT was most accurately predicted using a linear regression model based on the independent variables eSCT, type of operation, ASA classification, and type of anesthesia. This model performed significantly better than the fixed ratio model and the method of predicting ACT separately. Making use of these more accurate predictions in planning and sequencing algorithms may enable an increase in utilization of ORs, leading to significant financial and productivity related benefits.
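
    The comparison between the fixed ratio model (TPT = 1.33 × eSCT) and a fitted regression can be sketched as follows. This is a simplified illustration under stated assumptions: it uses eSCT as the only predictor, whereas the study's best model also included type of operation, ASA classification, and type of anesthesia; the case durations are invented toy numbers, not study data; and the helper names are hypothetical.

```python
FIXED_RATIO = 1.33  # from the abstract: TPT approximated as 1.33 * eSCT


def fixed_ratio_tpt(esct_minutes):
    """Fixed ratio prediction of total procedure time."""
    return FIXED_RATIO * esct_minutes


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def mean_squared_error(predicted, actual):
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)


# Invented example cases (minutes), for illustration only.
esct = [30, 60, 90, 120, 180]
tpt = [55, 92, 128, 170, 245]

intercept, slope = fit_line(esct, tpt)
ratio_mse = mean_squared_error([fixed_ratio_tpt(e) for e in esct], tpt)
ols_mse = mean_squared_error([intercept + slope * e for e in esct], tpt)
```

    On the data it was fitted to, the least squares line can never have a larger mean squared error than the fixed ratio line, so a fair comparison of the two approaches needs held-out cases.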

  15. Improving the Prediction of Total Surgical Procedure Time Using Linear Regression Modeling

    Eric R. Edelman

    2017-06-01

    Full Text Available For efficient utilization of operating rooms (ORs), accurate schedules of assigned block time and sequences of patient cases need to be made. The quality of these planning tools is dependent on the accurate prediction of total procedure time (TPT) per case. In this paper, we attempt to improve the accuracy of TPT predictions by using linear regression models based on estimated surgeon-controlled time (eSCT) and other variables relevant to TPT. We extracted data from a Dutch benchmarking database of all surgeries performed in six academic hospitals in The Netherlands from 2012 till 2016. The final dataset consisted of 79,983 records, describing 199,772 h of total OR time. Potential predictors of TPT that were included in the subsequent analysis were eSCT, patient age, type of operation, American Society of Anesthesiologists (ASA) physical status classification, and type of anesthesia used. First, we computed the predicted TPT based on a previously described fixed ratio model for each record, multiplying eSCT by 1.33. This number is based on the research performed by van Veen-Berkx et al., which showed that 33% of SCT is generally a good approximation of anesthesia-controlled time (ACT). We then systematically tested all possible linear regression models to predict TPT using eSCT in combination with the other available independent variables. In addition, all regression models were again tested without eSCT as a predictor to predict ACT separately (which leads to TPT by adding SCT). TPT was most accurately predicted using a linear regression model based on the independent variables eSCT, type of operation, ASA classification, and type of anesthesia. This model performed significantly better than the fixed ratio model and the method of predicting ACT separately. Making use of these more accurate predictions in planning and sequencing algorithms may enable an increase in utilization of ORs, leading to significant financial and productivity related

  16. Bivariate least squares linear regression: Towards a unified analytic formalism. I. Functional models

    Caimmi, R.

    2011-08-01

    Concerning bivariate least squares linear regression, the classical approach pursued for functional models in earlier attempts (York, 1966, 1969) is reviewed using a new formalism in terms of deviation (matrix) traces which, for unweighted data, reduce to usual quantities leaving aside an unessential (but dimensional) multiplicative factor. Within the framework of classical error models, the dependent variable relates to the independent variable according to the usual additive model. The classes of linear models considered are regression lines in the general case of correlated errors in X and in Y for weighted data, and in the opposite limiting situations of (i) uncorrelated errors in X and in Y, and (ii) completely correlated errors in X and in Y. The special case of (C) generalized orthogonal regression is considered in detail together with well known subcases, namely: (Y) errors in X negligible (ideally null) with respect to errors in Y; (X) errors in Y negligible (ideally null) with respect to errors in X; (O) genuine orthogonal regression; (R) reduced major-axis regression. In the limit of unweighted data, the results determined for functional models are compared with their counterparts related to extreme structural models, i.e. the instrumental scatter is negligible (ideally null) with respect to the intrinsic scatter (Isobe et al., 1990; Feigelson and Babu, 1992). While regression line slope and intercept estimators for functional and structural models necessarily coincide, the contrary holds for related variance estimators even if the residuals obey a Gaussian distribution, with the exception of Y models. An example of astronomical application is considered, concerning the [O/H]-[Fe/H] empirical relations deduced from five samples related to different stars and/or different methods of oxygen abundance determination. For selected samples and assigned methods, different regression models yield consistent results within the errors (∓ σ) for both

  17. Improvement of Storm Forecasts Using Gridded Bayesian Linear Regression for Northeast United States

    Yang, J.; Astitha, M.; Schwartz, C. S.

    2017-12-01

    Bayesian linear regression (BLR) is a post-processing technique in which regression coefficients are derived and used to correct raw forecasts based on pairs of observation-model values. This study presents the development and application of a gridded Bayesian linear regression (GBLR) as a new post-processing technique to improve numerical weather prediction (NWP) of rain and wind storm forecasts over northeast United States. Ten controlled variables produced from ten ensemble members of the National Center for Atmospheric Research (NCAR) real-time prediction system are used for a GBLR model. In the GBLR framework, leave-one-storm-out cross-validation is utilized to study the performances of the post-processing technique in a database composed of 92 storms. To estimate the regression coefficients of the GBLR, optimization procedures that minimize the systematic and random error of predicted atmospheric variables (wind speed, precipitation, etc.) are implemented for the modeled-observed pairs of training storms. The regression coefficients calculated for meteorological stations of the National Weather Service are interpolated back to the model domain. An analysis of forecast improvements based on error reductions during the storms will demonstrate the value of GBLR approach. This presentation will also illustrate how the variances are optimized for the training partition in GBLR and discuss the verification strategy for grid points where no observations are available. The new post-processing technique is successful in improving wind speed and precipitation storm forecasts using past event-based data and has the potential to be implemented in real-time.

  18. Building a new predictor for multiple linear regression technique-based corrective maintenance turnaround time.

    Cruz, Antonio M; Barr, Cameron; Puñales-Pozo, Elsa

    2008-01-01

    The main goals of this research were to build a predictor for a turnaround time (TAT) indicator in order to estimate its values, and to use a numerical clustering technique to find possible causes of undesirable TAT values. The following stages were used: domain understanding, data characterisation and sample reduction, and insight characterisation. A multiple linear regression predictor of the TAT indicator and clustering techniques were used to improve corrective maintenance task efficiency in a clinical engineering department (CED). The indicator being studied was turnaround time (TAT). Multiple linear regression was used to build a predictive TAT value model. The variables contributing to this model were clinical engineering department response time (CE(rt), 0.415 positive coefficient), stock service response time (Stock(rt), 0.734 positive coefficient), priority level (0.21 positive coefficient) and service time (0.06 positive coefficient). The regression process showed heavy reliance on Stock(rt), CE(rt) and priority, in that order. Clustering techniques revealed the main causes of high TAT values. This examination has provided a means for analysing current technical service quality and effectiveness. In doing so, it has demonstrated a process for identifying areas and methods of improvement and a model against which to analyse these methods' effectiveness.

  19. Linear and evolutionary polynomial regression models to forecast coastal dynamics: Comparison and reliability assessment

    Bruno, Delia Evelina; Barca, Emanuele; Goncalves, Rodrigo Mikosz; de Araujo Queiroz, Heithor Alexandre; Berardi, Luigi; Passarella, Giuseppe

    2018-01-01

    In this paper, the Evolutionary Polynomial Regression data modelling strategy has been applied to study small scale, short-term coastal morphodynamics, given its capability for treating a wide database of known information, non-linearly. Simple linear and multilinear regression models were also applied to achieve a balance between the computational load and reliability of estimations of the three models. In fact, even though it is easy to imagine that the more complex the model, the more the prediction improves, sometimes a "slight" worsening of estimations can be accepted in exchange for the time saved in data organization and computational load. The models' outcomes were validated through a detailed statistical error analysis, which revealed a slightly better estimation by the polynomial model than by the multilinear model, as expected. On the other hand, even though the data organization was identical for the two models, the multilinear one required a simpler simulation setting and a faster run time. Finally, the most reliable evolutionary polynomial regression model was used to make some conjectures about how uncertainty increases as the extrapolation time of the estimation is extended. The overlapping rate between the confidence band of the mean of the known coast position and the prediction band of the estimated position can be a good index of the weakness in producing reliable estimations when the extrapolation time increases too much. The proposed models and tests have been applied to a coastal sector located near Torre Colimena in the Apulia region, southern Italy.

  20. A Linear Regression Model for Global Solar Radiation on Horizontal Surfaces at Warri, Nigeria

    Michael S. Okundamiya

    2013-10-01

    Full Text Available The growing anxiety about the negative effects of fossil fuels on the environment and the global emission reduction targets call for a more extensive use of renewable energy alternatives. Efficient solar energy utilization is an essential solution to the high atmospheric pollution caused by fossil fuel combustion. Global solar radiation (GSR) data, which are useful for the design and evaluation of solar energy conversion systems, are not measured at the forty-five meteorological stations in Nigeria. The dearth of measured solar radiation data calls for accurate estimation. This study proposed a temperature-based linear regression for predicting the monthly average daily GSR on horizontal surfaces at Warri (latitude 5.02°N and longitude 7.88°E), an oil city located in the south-south geopolitical zone of Nigeria. The proposed model is analyzed based on five statistical indicators (coefficient of correlation, coefficient of determination, mean bias error, root mean square error, and t-statistic), and compared with the existing sunshine-based model for the same study. The results indicate that the proposed temperature-based linear regression model could replace the existing sunshine-based model for generating global solar radiation data. Keywords: air temperature; empirical model; global solar radiation; regression analysis; renewable energy; Warri
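
    The five statistical indicators named in this abstract can be computed as in the sketch below. The abstract does not give their formulas, so the definitions here are assumptions: the Pearson correlation, the coefficient of determination as 1 - SS_res/SS_tot, and the t-statistic in the form commonly used in solar radiation model evaluation, t = sqrt((n - 1) MBE² / (RMSE² - MBE²)). The function name and the sample values are hypothetical.

```python
import math


def evaluation_indicators(measured, predicted):
    """Return r, R^2, MBE, RMSE and t-statistic for paired values."""
    n = len(measured)
    errors = [p - m for p, m in zip(predicted, measured)]
    mbe = sum(errors) / n                              # mean bias error
    rmse = math.sqrt(sum(e * e for e in errors) / n)   # root mean square error

    m_bar = sum(measured) / n
    p_bar = sum(predicted) / n
    s_mp = sum((m - m_bar) * (p - p_bar) for m, p in zip(measured, predicted))
    s_mm = sum((m - m_bar) ** 2 for m in measured)
    s_pp = sum((p - p_bar) ** 2 for p in predicted)

    r = s_mp / math.sqrt(s_mm * s_pp)                  # coefficient of correlation
    r2 = 1.0 - sum(e * e for e in errors) / s_mm       # coefficient of determination
    # Undefined when every error is identical (RMSE^2 equals MBE^2).
    t_stat = math.sqrt((n - 1) * mbe ** 2 / (rmse ** 2 - mbe ** 2))
    return {"r": r, "r2": r2, "mbe": mbe, "rmse": rmse, "t": t_stat}


# Invented values for illustration only (not data from the study).
measured = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 12.0, 15.0, 16.0]
stats = evaluation_indicators(measured, predicted)
```

    A smaller t-statistic indicates that the bias is small relative to the scatter of the errors, which is why it is often reported alongside MBE and RMSE.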

  1. Multiple regression technique for Pth degree polynominals with and without linear cross products

    Davis, J. W.

    1973-01-01

    A multiple regression technique was developed by which the nonlinear behavior of specified independent variables can be related to a given dependent variable. The polynomial expression can be of Pth degree and can incorporate N independent variables. Two cases are treated such that mathematical models can be studied both with and without linear cross products. The resulting surface fits can be used to summarize trends for a given phenomenon and provide a mathematical relationship for subsequent analysis. To implement this technique, separate computer programs, which evaluate the various constants in the model regression equation, were developed for the case without linear cross products and for the case incorporating such cross products. In addition, the significance of the estimated regression equation is considered, and the standard deviation, the F statistic, the maximum absolute percent error, and the average of the absolute values of the percent error are evaluated. The computer programs and their manner of utilization are described. Sample problems are included to illustrate the use and capability of the technique; they show the output formats and typical plots comparing computer results to each set of input data.
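
    The two model forms described in this abstract can be sketched with a least squares fit on an explicit design matrix. This is a modern illustration, not the original computer programs: it assumes "linear cross products" means pairwise products x_i·x_j of the first-degree terms, uses NumPy's `numpy.linalg.lstsq`, and the function names and synthetic data are hypothetical.

```python
from itertools import combinations

import numpy as np


def design_matrix(X, degree, cross_products=False):
    """Columns: 1, then x_j, x_j^2, ..., x_j^P per variable, then optional x_i*x_j."""
    X = np.asarray(X, dtype=float)
    n_samples, n_vars = X.shape
    columns = [np.ones(n_samples)]
    for j in range(n_vars):
        for p in range(1, degree + 1):
            columns.append(X[:, j] ** p)
    if cross_products:
        for i, j in combinations(range(n_vars), 2):
            columns.append(X[:, i] * X[:, j])
    return np.column_stack(columns)


def fit_polynomial(X, y, degree, cross_products=False):
    """Least squares coefficients and root mean square residual."""
    A = design_matrix(X, degree, cross_products)
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    residual = y - A @ coef
    return coef, float(np.sqrt(np.mean(residual ** 2)))


# Synthetic data on a 4 x 4 grid; the response includes a cross-product term.
X = np.array([(a, b) for a in range(4) for b in range(4)], dtype=float)
y = 1.0 + 2.0 * X[:, 0] + 3.0 * X[:, 1] ** 2 + 0.5 * X[:, 0] * X[:, 1]

coef_plain, rms_plain = fit_polynomial(X, y, degree=2)
coef_cross, rms_cross = fit_polynomial(X, y, degree=2, cross_products=True)
```

    Because the synthetic response contains an x_1·x_2 term, only the model with cross products reproduces it exactly; the model without them leaves a systematic residual.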

  2. Research on the multiple linear regression in non-invasive blood glucose measurement.

    Zhu, Jianming; Chen, Zhencheng

    2015-01-01

    A non-invasive blood glucose measurement sensor and the data processing algorithm based on the metabolic energy conservation (MEC) method are presented in this paper. The physiological parameters of the human fingertip can be measured by various sensing modalities, and the blood glucose value can be evaluated from the physiological parameters by multiple linear regression analysis. Five methods in multiple linear regression (enter, remove, forward, backward and stepwise) were compared, and the backward method had the best performance. The best correlation coefficient was 0.876 with a standard error of the estimate of 0.534, and the significance was 0.012 (sig. <…), indicating that the regression equation was valid. The Clarke error grid analysis was performed to compare the MEC method with the hexokinase method, using 200 data points. The correlation coefficient R was 0.867 and all of the points were located in Zone A and Zone B, which shows that the MEC method provides a feasible and valid way for non-invasive blood glucose measurement.

  3. Generating linear regression model to predict motor functions by use of laser range finder during TUG.

    Adachi, Daiki; Nishiguchi, Shu; Fukutani, Naoto; Hotta, Takayuki; Tashiro, Yuto; Morino, Saori; Shirooka, Hidehiko; Nozaki, Yuma; Hirata, Hinako; Yamaguchi, Moe; Yorozu, Ayanori; Takahashi, Masaki; Aoyama, Tomoki

    2017-05-01

    The purpose of this study was to investigate which spatial and temporal parameters of the Timed Up and Go (TUG) test are associated with motor function in elderly individuals. This study included 99 community-dwelling women aged 72.9 ± 6.3 years. Step length, step width, single support time, variability of the aforementioned parameters, gait velocity, cadence, reaction time from starting signal to first step, and minimum distance between the foot and a marker placed 3 m in front of the chair were measured using our analysis system. The 10-m walk test, five times sit-to-stand (FTSTS) test, and one-leg standing (OLS) test were used to assess motor function. Stepwise multivariate linear regression analysis was used to determine which TUG test parameters were associated with each motor function test. Finally, we calculated a predictive model for each motor function test using each regression coefficient. In stepwise linear regression analysis, step length and cadence were significantly associated with the 10-m walk test, FTSTS and OLS test. Reaction time was associated with the FTSTS test, and step width was associated with the OLS test. Each predictive model showed a strong correlation with the 10-m walk test and OLS test (P<…) … motor function test. Moreover, the TUG test time, regarded as a measure of lower extremity function and mobility, has strong predictive ability in each motor function test. Copyright © 2017 The Japanese Orthopaedic Association. Published by Elsevier B.V. All rights reserved.

  4. Partitioning of late gestation energy expenditure in ewes using indirect calorimetry and a linear regression approach

    Kiani, Alishir; Chwalibog, André; Nielsen, Mette O

    2007-01-01

    Late gestation energy expenditure (EE(gest)) originates from energy expenditure (EE) of development of conceptus (EE(conceptus)) and EE of homeorhetic adaptation of metabolism (EE(homeorhetic)). Even though EE(gest) is relatively easy to quantify, its partitioning is problematic. In the present study metabolizable energy (ME) intake ranges for twin-bearing ewes were 220-440, 350-700, and 350-900 kJ per metabolic body weight (W^0.75) at weeks seven, five, and two pre-partum, respectively. Indirect calorimetry and a linear regression approach were used to quantify EE(gest) and then partition it into EE(conceptus) and EE(homeorhetic). Energy expenditure of basal metabolism of the non-gravid tissues (EE(bmng)), derived from the intercept of the linear regression equation of retained energy [kJ/W^0.75] and ME intake [kJ/W^0.75], was 298 [kJ/W^0.75]. Values of the intercepts of the regression equations at week seven...

  5. Robust best linear estimation for regression analysis using surrogate and instrumental variables.

    Wang, C Y

    2012-04-01

    We investigate methods for regression analysis when covariates are measured with errors. In a subset of the whole cohort, a surrogate variable is available for the true unobserved exposure variable. The surrogate variable satisfies the classical measurement error model, but it may not have repeated measurements. In addition to the surrogate variables that are available among the subjects in the calibration sample, we assume that there is an instrumental variable (IV) that is available for all study subjects. An IV is correlated with the unobserved true exposure variable and hence can be useful in the estimation of the regression coefficients. We propose a robust best linear estimator that uses all the available data, which is the most efficient among a class of consistent estimators. The proposed estimator is shown to be consistent and asymptotically normal under very weak distributional assumptions. For Poisson or linear regression, the proposed estimator is consistent even if the measurement error from the surrogate or IV is heteroscedastic. Finite-sample performance of the proposed estimator is examined and compared with other estimators via intensive simulation studies. The proposed method and other methods are applied to a bladder cancer case-control study.

  6. Predicting recycling behaviour: Comparison of a linear regression model and a fuzzy logic model.

    Vesely, Stepan; Klöckner, Christian A; Dohnal, Mirko

    2016-03-01

    In this paper we demonstrate that fuzzy logic can provide a better tool for predicting recycling behaviour than the customarily used linear regression. To show this, we take a set of empirical data on recycling behaviour (N=664), which we randomly divide into two halves. The first half is used to estimate a linear regression model of recycling behaviour, and to develop a fuzzy logic model of recycling behaviour. As the first comparison, the fit of both models to the data included in estimation of the models (N=332) is evaluated. As the second comparison, predictive accuracy of both models for "new" cases (hold-out data not included in building the models, N=332) is assessed. In both cases, the fuzzy logic model significantly outperforms the regression model in terms of fit. To conclude, when accurate predictions of recycling and possibly other environmental behaviours are needed, fuzzy logic modelling seems to be a promising technique. Copyright © 2015 Elsevier Ltd. All rights reserved.

  7. Distributed Monitoring of the R^2 Statistic for Linear Regression

    Bhaduri, Kanishka; Das, Kamalika; Giannella, Chris R.

    2011-01-01

    The problem of monitoring a multivariate linear regression model is relevant in studying the evolving relationship between a set of input variables (features) and one or more dependent target variables. This problem becomes challenging for large scale data in a distributed computing environment when only a subset of instances is available at individual nodes and the local data changes frequently. Data centralization and periodic model recomputation can add high overhead to tasks like anomaly detection in such dynamic settings. Therefore, the goal is to develop techniques for monitoring and updating the model over the union of all nodes' data in a communication-efficient fashion. Correctness guarantees on such techniques are also often highly desirable, especially in safety-critical application scenarios. In this paper we develop DReMo, a distributed algorithm with very low resource overhead, for monitoring the quality of a regression model in terms of its coefficient of determination (R^2 statistic). When the nodes collectively determine that R^2 has dropped below a fixed threshold, the linear regression model is recomputed via a network-wide convergecast and the updated model is broadcast back to all nodes. We show empirically, using both synthetic and real data, that our proposed method is highly communication-efficient and scalable, and also provide theoretical guarantees on correctness.
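
    The threshold rule described in this abstract can be illustrated with a minimal, centralized sketch. It does not reproduce DReMo's distributed protocol (no convergecast, no per-node state). The function names `fit_line`, `r_squared` and `monitor`, and the single-predictor model, are assumptions made only for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def r_squared(model, xs, ys):
    """Coefficient of determination of `model` on the data (xs, ys)."""
    a, b = model
    my = sum(ys) / len(ys)
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def monitor(model, xs, ys, threshold):
    """Refit only when R^2 on the current data falls below the threshold.

    Returns (model, refit_flag). This is a single-machine stand-in for
    the network-wide recomputation mentioned in the abstract.
    """
    if r_squared(model, xs, ys) < threshold:
        return fit_line(xs, ys), True
    return model, False
```

    Checking R^2 before refitting means the model is recomputed only when its quality has actually degraded, which is the motivation the abstract gives for monitoring instead of periodic recomputation.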

  8. Single Image Super-Resolution Using Global Regression Based on Multiple Local Linear Mappings.

    Choi, Jae-Seok; Kim, Munchurl

    2017-03-01

    Super-resolution (SR) has become more vital, because of its capability to generate high-quality ultra-high definition (UHD) high-resolution (HR) images from low-resolution (LR) input images. Conventional SR methods entail high computational complexity, which makes them difficult to be implemented for up-scaling of full-high-definition input images into UHD-resolution images. Nevertheless, our previous super-interpolation (SI) method showed a good compromise between Peak-Signal-to-Noise Ratio (PSNR) performances and computational complexity. However, since SI only utilizes simple linear mappings, it may fail to precisely reconstruct HR patches with complex texture. In this paper, we present a novel SR method, which inherits the large-to-small patch conversion scheme from SI but uses global regression based on local linear mappings (GLM). Thus, our new SR method is called GLM-SI. In GLM-SI, each LR input patch is divided into 25 overlapped subpatches. Next, based on the local properties of these subpatches, 25 different local linear mappings are applied to the current LR input patch to generate 25 HR patch candidates, which are then regressed into one final HR patch using a global regressor. The local linear mappings are learned cluster-wise in our off-line training phase. The main contribution of this paper is as follows: Previously, linear-mapping-based conventional SR methods, including SI only used one simple yet coarse linear mapping to each patch to reconstruct its HR version. On the contrary, for each LR input patch, our GLM-SI is the first to apply a combination of multiple local linear mappings, where each local linear mapping is found according to local properties of the current LR patch. Therefore, it can better approximate nonlinear LR-to-HR mappings for HR patches with complex texture. Experiment results show that the proposed GLM-SI method outperforms most of the state-of-the-art methods, and shows comparable PSNR performance with much lower

  9. An evaluation of bias in propensity score-adjusted non-linear regression models.

    Wan, Fei; Mitra, Nandita

    2018-03-01

    Propensity score methods are commonly used to adjust for observed confounding when estimating the conditional treatment effect in observational studies. One popular method, covariate adjustment of the propensity score in a regression model, has been empirically shown to be biased in non-linear models. However, no compelling underlying theoretical reason has been presented. We propose a new framework to investigate bias and consistency of propensity score-adjusted treatment effects in non-linear models that uses a simple geometric approach to forge a link between the consistency of the propensity score estimator and the collapsibility of non-linear models. Under this framework, we demonstrate that adjustment of the propensity score in an outcome model results in the decomposition of observed covariates into the propensity score and a remainder term. Omission of this remainder term from a non-collapsible regression model leads to biased estimates of the conditional odds ratio and conditional hazard ratio, but not for the conditional rate ratio. We further show, via simulation studies, that the bias in these propensity score-adjusted estimators increases with larger treatment effect size, larger covariate effects, and increasing dissimilarity between the coefficients of the covariates in the treatment model versus the outcome model.

  10. A note on the use of multiple linear regression in molecular ecology.

    Frasier, Timothy R

    2016-03-01

    Multiple linear regression analyses (also often referred to as generalized linear models--GLMs, or generalized linear mixed models--GLMMs) are widely used in the analysis of data in molecular ecology, often to assess the relative effects of genetic characteristics on individual fitness or traits, or how environmental characteristics influence patterns of genetic differentiation. However, the coefficients resulting from multiple regression analyses are sometimes misinterpreted, which can lead to incorrect interpretations and conclusions within individual studies, and can propagate to wider-spread errors in the general understanding of a topic. The primary issue revolves around the interpretation of coefficients for independent variables when interaction terms are also included in the analyses. In this scenario, the coefficients associated with each independent variable are often interpreted as the independent effect of each predictor variable on the predicted variable. However, this interpretation is incorrect. The correct interpretation is that these coefficients represent the effect of each predictor variable on the predicted variable when all other predictor variables are zero. This difference may sound subtle, but the ramifications cannot be overstated. Here, my goals are to raise awareness of this issue, to demonstrate and emphasize the problems that can result and to provide alternative approaches for obtaining the desired information. © 2015 John Wiley & Sons Ltd.
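
    The point about interaction terms can be made concrete with a small sketch. It assumes a two-predictor model y = b0 + b1*x1 + b2*x2 + b3*x1*x2; the coefficient values and the names `predict` and `effect_of_x1` are hypothetical and are not taken from the paper.

```python
def predict(coef, x1, x2):
    """Prediction from y = b0 + b1*x1 + b2*x2 + b3*x1*x2."""
    b0, b1, b2, b3 = coef
    return b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2


def effect_of_x1(coef, x2):
    """Change in the prediction per unit increase in x1, at a fixed x2."""
    _, b1, _, b3 = coef
    return b1 + b3 * x2


# Hypothetical fitted coefficients (b0, b1, b2, b3), for illustration only.
coef = (1.0, 2.0, 0.5, -1.0)

# b1 equals the effect of x1 only when x2 is zero ...
effect_at_zero = effect_of_x1(coef, 0.0)
# ... and the effect can even change sign at other values of x2.
effect_at_three = effect_of_x1(coef, 3.0)
```

    In this sketch the effect of x1 is b1 + b3*x2, so reading b1 alone as "the" effect of x1 is only valid at x2 = 0, which may lie outside the observed range of the data.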

  11. Weighted functional linear regression models for gene-based association analysis.

    Belonogova, Nadezhda M; Svishcheva, Gulnara R; Wilson, James F; Campbell, Harry; Axenovich, Tatiana I

    2018-01-01

    Functional linear regression models are effectively used in gene-based association analysis of complex traits. These models combine information about individual genetic variants, taking into account their positions and reducing the influence of noise and/or observation errors. To increase the power of methods, where several differently informative components are combined, weights are introduced to give the advantage to more informative components. Allele-specific weights have been introduced to collapsing and kernel-based approaches to gene-based association analysis. Here we have for the first time introduced weights to functional linear regression models adapted for both independent and family samples. Using data simulated on the basis of GAW17 genotypes and weights defined by allele frequencies via the beta distribution, we demonstrated that type I errors correspond to declared values and that increasing the weights of causal variants allows the power of functional linear models to be increased. We applied the new method to real data on blood pressure from the ORCADES sample. Five of the six known genes with P models. Moreover, we found an association between diastolic blood pressure and the VMP1 gene (P = 8.18×10^-6), when we used a weighted functional model. For this gene, the unweighted functional and weighted kernel-based models had P = 0.004 and 0.006, respectively. The new method has been implemented in the program package FREGAT, which is freely available at https://cran.r-project.org/web/packages/FREGAT/index.html.

  12. Time-Frequency Analysis of Non-Stationary Biological Signals with Sparse Linear Regression Based Fourier Linear Combiner

    Yubo Wang

    2017-06-01

    It is often difficult to analyze biological signals because of their nonlinear and non-stationary characteristics. This necessitates the usage of time-frequency decomposition methods for analyzing the subtle changes in these signals that are often connected to an underlying phenomenon. This paper presents a new approach to analyze the time-varying characteristics of such signals by employing a simple truncated Fourier series model, namely the band-limited multiple Fourier linear combiner (BMFLC). In contrast to the earlier designs, we first identified the sparsity imposed on the signal model in order to reformulate the model to a sparse linear regression model. The coefficients of the proposed model are then estimated by a convex optimization algorithm. The performance of the proposed method was analyzed with benchmark test signals. An energy ratio metric is employed to quantify the spectral performance and results show that the proposed method Sparse-BMFLC has high mean energy (0.9976) ratio and outperforms existing methods such as short-time Fourier transform (STFT), continuous Wavelet transform (CWT) and BMFLC Kalman Smoother. Furthermore, the proposed method provides an overall 6.22% in reconstruction error.

  13. Time-Frequency Analysis of Non-Stationary Biological Signals with Sparse Linear Regression Based Fourier Linear Combiner.

    Wang, Yubo; Veluvolu, Kalyana C

    2017-06-14

    It is often difficult to analyze biological signals because of their nonlinear and non-stationary characteristics. This necessitates the usage of time-frequency decomposition methods for analyzing the subtle changes in these signals that are often connected to an underlying phenomenon. This paper presents a new approach to analyze the time-varying characteristics of such signals by employing a simple truncated Fourier series model, namely the band-limited multiple Fourier linear combiner (BMFLC). In contrast to the earlier designs, we first identified the sparsity imposed on the signal model in order to reformulate the model to a sparse linear regression model. The coefficients of the proposed model are then estimated by a convex optimization algorithm. The performance of the proposed method was analyzed with benchmark test signals. An energy ratio metric is employed to quantify the spectral performance and results show that the proposed method Sparse-BMFLC has high mean energy (0.9976) ratio and outperforms existing methods such as short-time Fourier transform (STFT), continuous Wavelet transform (CWT) and BMFLC Kalman Smoother. Furthermore, the proposed method provides an overall 6.22% in reconstruction error.

  14. Adaptive Linear and Normalized Combination of Radial Basis Function Networks for Function Approximation and Regression

    Yunfeng Wu

    2014-01-01

    This paper presents a novel adaptive linear and normalized combination (ALNC) method that can be used to combine the component radial basis function networks (RBFNs) to implement better function approximation and regression tasks. The optimization of the fusion weights is obtained by solving a constrained quadratic programming problem. According to the instantaneous errors generated by the component RBFNs, the ALNC is able to perform the selective ensemble of multiple learners by adaptively adjusting the fusion weights from one instance to another. The results of the experiments on eight synthetic function approximation and six benchmark regression data sets show that the ALNC method can effectively help the ensemble system achieve a higher accuracy (measured in terms of mean-squared error) and better fidelity (characterized by normalized correlation coefficient) of approximation, in relation to the popular simple average, weighted average, and the Bagging methods.

  15. Multivariate sparse group lasso for the multivariate multiple linear regression with an arbitrary group structure.

    Li, Yanming; Nan, Bin; Zhu, Ji

    2015-06-01

    We propose a multivariate sparse group lasso variable selection and estimation method for data with high-dimensional predictors as well as high-dimensional response variables. The method is carried out through a penalized multivariate multiple linear regression model with an arbitrary group structure for the regression coefficient matrix. It suits many biology studies well in detecting associations between multiple traits and multiple predictors, with each trait and each predictor embedded in some biological functional groups such as genes, pathways or brain regions. The method is able to effectively remove unimportant groups as well as unimportant individual coefficients within important groups, particularly for large p small n problems, and is flexible in handling various complex group structures such as overlapping or nested or multilevel hierarchical structures. The method is evaluated through extensive simulations with comparisons to the conventional lasso and group lasso methods, and is applied to an eQTL association study. © 2015, The International Biometric Society.

  16. Linear and support vector regressions based on geometrical correlation of data

    Kaijun Wang

    2007-10-01

    Linear regression (LR) and support vector regression (SVR) are widely used in data analysis. Geometrical correlation learning (GcLearn) was proposed recently to improve the predictive ability of LR and SVR through mining and using correlations between data of a variable (inner correlation). This paper theoretically analyzes prediction performance of the GcLearn method and proves that GcLearn LR and SVR will have better prediction performance than traditional LR and SVR for prediction tasks when good inner correlations are obtained and predictions by traditional LR and SVR are far away from their neighbor training data under inner correlation. This gives the applicable condition of the GcLearn method.

  17. Radioligand assays - methods and applications. IV. Uniform regression of hyperbolic and linear radioimmunoassay calibration curves

    Keilacker, H; Becker, G; Ziegler, M; Gottschling, H D [Zentralinstitut fuer Diabetes, Karlsburg (German Democratic Republic)]

    1980-10-01

    In order to handle all types of radioimmunoassay (RIA) calibration curves obtained in the authors' laboratory in the same way, they tried to find a non-linear expression for their regression which allows calibration curves with different degrees of curvature to be fitted. Considering the two boundary cases of the incubation protocol they derived a hyperbolic inverse regression function: $x = a_1 y + a_0 + a_{-1} y^{-1}$, where $x$ is the total concentration of antigen, the $a_i$ are constants, and $y$ is the specifically bound radioactivity. An RIA evaluation procedure based on this function is described providing a fitted inverse RIA calibration curve and some statistical quality parameters. The latter are of an order which is normal for RIA systems. There is an excellent agreement between fitted and experimentally obtained calibration curves having a different degree of curvature.
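
    Because the inverse regression function is linear in its three constants, it can be fitted by ordinary least squares on the regressors y, 1 and 1/y. The sketch below assumes an unweighted fit, which the abstract does not specify; the names `solve_linear`, `fit_inverse_curve` and `concentration` are hypothetical.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_inverse_curve(ys, xs):
    """Unweighted least-squares fit of x = a1*y + a0 + am1/y.

    ys: bound radioactivity of the standards (must be non-zero)
    xs: known antigen concentrations of the standards
    Returns (a1, a0, am1).
    """
    rows = [(y, 1.0, 1.0 / y) for y in ys]
    # Normal equations (R^T R) a = R^T x for the 3-column design matrix R.
    A = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    b = [sum(r[i] * x for r, x in zip(rows, xs)) for i in range(3)]
    return tuple(solve_linear(A, b))


def concentration(params, y):
    """Read a concentration off the fitted inverse calibration curve."""
    a1, a0, am1 = params
    return a1 * y + a0 + am1 / y
```

    Setting a_{-1} = 0 reduces the curve to a straight line, which matches the abstract's aim of handling linear and hyperbolic calibration curves with one expression.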

  18. A computer tool for a minimax criterion in binary response and heteroscedastic simple linear regression models.

    Casero-Alonso, V; López-Fidalgo, J; Torsney, B

    2017-01-01

    Binary response models are used in many real applications. For these models the Fisher information matrix (FIM) is proportional to the FIM of a weighted simple linear regression model. The same is also true when the weight function has a finite integral. Thus, optimal designs for one binary model are also optimal for the corresponding weighted linear regression model. The main objective of this paper is to provide a tool for the construction of MV-optimal designs, minimizing the maximum of the variances of the estimates, for a general design space. MV-optimality is a potentially difficult criterion because of its nondifferentiability at equal variance designs. A methodology for obtaining MV-optimal designs where the design space is a compact interval [a, b] will be given for several standard weight functions. The methodology will allow us to build a user-friendly computer tool based on Mathematica to compute MV-optimal designs. Some illustrative examples will show a representation of MV-optimal designs in the Euclidean plane, taking a and b as the axes. The applet will be explained using two relevant models. In the first one the case of a weighted linear regression model is considered, where the weight function is directly chosen from a typical family. In the second example a binary response model is assumed, where the probability of the outcome is given by a typical probability distribution. Practitioners can use the provided applet to identify the solution and to know the exact support points and design weights. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  19. Kalman filtering and smoothing for linear wave equations with model error

    Lee, Wonjung; McDougall, D; Stuart, A M

    2011-01-01

    Filtering is a widely used methodology for the incorporation of observed data into time-evolving systems. It provides an online approach to state estimation inverse problems when data are acquired sequentially. The Kalman filter plays a central role in many applications because it is exact for linear systems subject to Gaussian noise, and because it forms the basis for many approximate filters which are used in high-dimensional systems. The aim of this paper is to study the effect of model error on the Kalman filter, in the context of linear wave propagation problems. A consistency result is proved when no model error is present, showing recovery of the true signal in the large data limit. This result, however, is not robust: it is also proved that arbitrarily small model error can lead to inconsistent recovery of the signal in the large data limit. If the model error is in the form of a constant shift to the velocity, the filtering and smoothing distributions only recover a partial Fourier expansion, a phenomenon related to aliasing. On the other hand, for a class of wave velocity model errors which are time dependent, it is possible to recover the filtering distribution exactly, but not the smoothing distribution. Numerical results are presented which corroborate the theory, and also propose a computational approach which overcomes the inconsistency in the presence of model error, by relaxing the model

  20. Healthcare Expenditures Associated with Depression Among Individuals with Osteoarthritis: Post-Regression Linear Decomposition Approach.

    Agarwal, Parul; Sambamoorthi, Usha

    2015-12-01

    Depression is common among individuals with osteoarthritis and leads to increased healthcare burden. The objective of this study was to examine excess total healthcare expenditures associated with depression among individuals with osteoarthritis in the US. Adults with self-reported osteoarthritis (n = 1881) were identified using data from the 2010 Medical Expenditure Panel Survey (MEPS). Among those with osteoarthritis, chi-square tests and ordinary least square regressions (OLS) were used to examine differences in healthcare expenditures between those with and without depression. Post-regression linear decomposition technique was used to estimate the relative contribution of different constructs of the Anderson's behavioral model, i.e., predisposing, enabling, need, personal healthcare practices, and external environment factors, to the excess expenditures associated with depression among individuals with osteoarthritis. All analysis accounted for the complex survey design of MEPS. Depression coexisted among 20.6 % of adults with osteoarthritis. The average total healthcare expenditures were $13,684 among adults with depression compared to $9284 among those without depression. Multivariable OLS regression revealed that adults with depression had 38.8 % higher healthcare expenditures (p regression linear decomposition analysis indicated that 50 % of differences in expenditures among adults with and without depression can be explained by differences in need factors. Among individuals with coexisting osteoarthritis and depression, excess healthcare expenditures associated with depression were mainly due to comorbid anxiety, chronic conditions and poor health status. These expenditures may potentially be reduced by providing timely intervention for need factors or by providing care under a collaborative care model.

  1. Tutorial on Biostatistics: Linear Regression Analysis of Continuous Correlated Eye Data.

    Ying, Gui-Shuang; Maguire, Maureen G; Glynn, Robert; Rosner, Bernard

    2017-04-01

    To describe and demonstrate appropriate linear regression methods for analyzing correlated continuous eye data. We describe several approaches to regression analysis involving both eyes, including mixed effects and marginal models under various covariance structures to account for inter-eye correlation. We demonstrate, with SAS statistical software, applications in a study comparing baseline refractive error between one eye with choroidal neovascularization (CNV) and the unaffected fellow eye, and in a study determining factors associated with visual field in the elderly. When refractive error from both eyes were analyzed with standard linear regression without accounting for inter-eye correlation (adjusting for demographic and ocular covariates), the difference between eyes with CNV and fellow eyes was 0.15 diopters (D; 95% confidence interval, CI -0.03 to 0.32D, p = 0.10). Using a mixed effects model or a marginal model, the estimated difference was the same but with narrower 95% CI (0.01 to 0.28D, p = 0.03). Standard regression for visual field data from both eyes provided biased estimates of standard error (generally underestimated) and smaller p-values, while analysis of the worse eye provided larger p-values than mixed effects models and marginal models. In research involving both eyes, ignoring inter-eye correlation can lead to invalid inferences. Analysis using only right or left eyes is valid, but decreases power. Worse-eye analysis can provide less power and biased estimates of effect. Mixed effects or marginal models using the eye as the unit of analysis should be used to appropriately account for inter-eye correlation and maximize power and precision.

  2. Face Hallucination with Linear Regression Model in Semi-Orthogonal Multilinear PCA Method

    Asavaskulkiet, Krissada

    2018-04-01

    In this paper, we propose a new face hallucination technique: face image reconstruction in HSV color space with a semi-orthogonal multilinear principal component analysis (SO-MPCA) method. This novel hallucination technique can perform directly from tensors via tensor-to-vector projection by imposing the orthogonality constraint in only one mode. In our experiments, we use facial images from the FERET database to test our hallucination approach, which is demonstrated by extensive experiments with high-quality hallucinated color faces. The experimental results clearly demonstrate that we can generate photorealistic color face images by using the SO-MPCA subspace with a linear regression model.

  3. Estimating integrated variance in the presence of microstructure noise using linear regression

    Holý, Vladimír

    2017-07-01

    Using financial high-frequency data for estimation of integrated variance of asset prices is beneficial but with increasing number of observations so-called microstructure noise occurs. This noise can significantly bias the realized variance estimator. We propose a method for estimation of the integrated variance robust to microstructure noise as well as for testing the presence of the noise. Our method utilizes linear regression in which realized variances estimated from different data subsamples act as dependent variable while the number of observations act as explanatory variable. We compare proposed estimator with other methods on simulated data for several microstructure noise structures.
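
    The regression idea in this abstract can be sketched under a common simplifying assumption that is not stated in the abstract itself: with independent noise of variance ω², the expected realized variance from n observations is approximately IV + 2nω². The intercept of a line fitted to (n, RV) pairs then estimates the integrated variance, and half the slope estimates the noise variance. All function names below are hypothetical, and the paper's actual estimator may differ.

```python
def realized_variance(log_prices, step=1):
    """Sum of squared returns using every `step`-th observation.

    Returns (number_of_returns, realized_variance).
    """
    sub = log_prices[::step]
    returns = [b - a for a, b in zip(sub, sub[1:])]
    return len(returns), sum(r * r for r in returns)


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def estimate_integrated_variance(counts, rvs):
    """Regress realized variances on observation counts.

    Assumes E[RV_n] = IV + 2*n*omega2 (independent noise).
    Returns (iv_estimate, omega2_estimate).
    """
    intercept, slope = fit_line(counts, rvs)
    return intercept, slope / 2.0
```

    Under the same assumption, a slope that is not distinguishable from zero would suggest that no microstructure noise is present, which corresponds to the testing use mentioned in the abstract.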

  4. Application of genetic algorithm - multiple linear regressions to predict the activity of RSK inhibitors

    Avval Zhila Mohajeri

    2015-01-01

    This paper deals with developing a linear quantitative structure-activity relationship (QSAR) model for predicting the RSK inhibition activity of some new compounds. A dataset consisting of 62 pyrazino [1,2-α] indole, diazepino [1,2-α] indole, and imidazole derivatives with known inhibitory activities was used. The multiple linear regression (MLR) technique, combined with the stepwise (SW) and genetic algorithm (GA) methods as variable selection tools, was employed. To further check the stability, robustness and predictability of the proposed models, internal and external validation techniques were used. Comparison of the results obtained indicates that the GA-MLR model is superior to the SW-MLR model and that it is applicable for designing novel RSK inhibitors.

  5. Optimal Linear Filters for Pulse Height Measurements in the Presence of Noise

    Nygaard, K.

    1966-07-01

    For measurements of nuclear pulse height spectra a linear filter is used between the pulse amplifier and the pulse height recorder so as to improve the signal/noise ratio. The problem of finding the optimal filter is investigated with emphasis on technical realizability. The maximum available signal/noise ratio is theoretically calculated on the basis of all the information which can be found in the output of the pulse amplifier, and on an assumed a priori knowledge of the pulse time of arrival. It is then shown that the maximum available signal/noise ratio can be obtained with practical measurements without any a priori knowledge of pulse time of arrival, and a general description of the optimal linear filter is given. The solution is unique, technically realizable, and based solely on data (noise power spectrum and pulse shape) which can be measured at the output terminals of the pulse amplifier used

  6. Microstrip linear phase low pass filter based on defected ground structures for partial response modulation

    Cimoli, Bruno; Johansen, Tom Keinicke; Olmos, Juan Jose Vegas

    2018-01-01

    We report a high performance linear phase low pass filter (LPF) designed for partial response (PR) modulations. For the implementation, we adopted microstrip technology and a variant of the standard stepped‐impedance technique. Defected ground structures (DGS) are used for increasing the characteristic impedance of transmission lines. Experimental results prove that the proposed filter can successfully modulate a non‐return‐to‐zero (NRZ) signal into a five‐level PR one.

  7. Time-dependent switched discrete-time linear systems control and filtering

    Zhang, Lixian; Shi, Peng; Lu, Qiugang

    2016-01-01

    This book focuses on the basic control and filtering synthesis problems for discrete-time switched linear systems under time-dependent switching signals. Chapter 1, as an introduction of the book, gives the backgrounds and motivations of switched systems, the definitions of the typical time-dependent switching signals, the differences and links to other types of systems with hybrid characteristics and a literature review mainly on the control and filtering for the underlying systems. By summarizing the multiple Lyapunov-like functions (MLFs) approach in which different requirements on comparisons of Lyapunov function values at switching instants, a series of methodologies are developed for the issues on stability and stabilization, and l2-gain performance or tube-based robustness for l∞ disturbance, respectively, in Chapters 2 and 3. Chapters 4 and 5 are devoted to the control and filtering problems for the time-dependent switched linear systems with either polytopic uncertainties or measurable time-varying...

  8. Weighted H∞ Filtering for a Class of Switched Linear Systems with Additive Time-Varying Delays

    Li-li Li

    2015-01-01

    This paper is concerned with the problem of weighted H∞ filtering for a class of switched linear systems with two additive time-varying delays, which represent a general class of switched time-delay systems with strong practical background. Combining the average dwell time (ADT) technique with piecewise Lyapunov functionals, sufficient conditions are established to guarantee the exponential stability and weighted H∞ performance for the filtering error systems. The parameters of the designed switched filters are obtained by solving linear matrix inequalities (LMIs). A modification of the Jensen integral inequality is exploited to derive results with less theoretical conservatism and computational complexity. Finally, two examples are given to demonstrate the effectiveness of the proposed method.

  9. Optimal Linear Filters for Pulse Height Measurements in the Presence of Noise

    Nygaard, K

    1966-07-15

    For measurements of nuclear pulse height spectra a linear filter is used between the pulse amplifier and the pulse height recorder so as to improve the signal/noise ratio. The problem of finding the optimal filter is investigated with emphasis on technical realizability. The maximum available signal/noise ratio is theoretically calculated on the basis of all the information which can be found in the output of the pulse amplifier, and on an assumed a priori knowledge of the pulse time of arrival. It is then shown that the maximum available signal/noise ratio can be obtained with practical measurements without any a priori knowledge of pulse time of arrival, and a general description of the optimal linear filter is given. The solution is unique, technically realizable, and based solely on data (noise power spectrum and pulse shape) which can be measured at the output terminals of the pulse amplifier used.

  10. Ion beam properties after mass filtering with a linear radiofrequency quadrupole

    Ferrer, R.; Kwiatkowski, A.A.; Bollen, G.; Lincoln, D.L.; Morrissey, D.J.; Pang, G.K.; Ringle, R.; Savory, J.; Schwarz, S.

    2014-01-01

    The properties of ion beams passing through a linear radiofrequency quadrupole mass filter were investigated with special attention to their dependence on the mass resolving power. Experimentally, an increase of the transverse emittance was observed as the mass-to-charge selectivity of the mass filter was raised. The experimental behavior was confirmed by beam transport simulations. -- Highlights: • The ion-optical properties of a Quadrupole Mass Filter (QMF) are presented. • Measured beam emittances follow a trend to larger values for smaller A/Q ratios and increasing mass resolution. • The experimental behavior was confirmed by beam transport simulations. • The use of a QMF for mass filtering comes at the cost of emittance growth of the ion beam

  11. Estimation of time-varying reactivity by the H∞ optimal linear filter

    Suzuki, Katsuo; Shimazaki, Junya; Watanabe, Koiti

    1995-01-01

    The problem of estimating the time-varying net reactivity from flux measurements is solved for a point reactor kinetics model using a linear filtering technique in an H∞ setting. In order to use this technique, an appropriate dynamical model of the reactivity is constructed that can be embedded into the reactor model as one of its variables. A filter, which minimizes the H∞ norm of the estimation error power spectrum, operates on neutron density measurements corrupted by noise and provides an estimate of the dynamic net reactivity. Computer simulations are performed to reveal the basic characteristics of the H∞ optimal filter. The results of the simulation indicate that the filter can be used to determine the time-varying reactivity from neutron density measurements that have been corrupted by noise.

  12. Non-linear DSGE Models and The Central Difference Kalman Filter

    Andreasen, Martin Møller

    This paper introduces a Quasi Maximum Likelihood (QML) approach based on the Central Difference Kalman Filter (CDKF) to estimate non-linear DSGE models with potentially non-Gaussian shocks. We argue that this estimator can be expected to be consistent and asymptotically normal for DSGE models...

  13. Estimating traffic volume on Wyoming low volume roads using linear and logistic regression methods

    Dick Apronti

    2016-12-01

    Traffic volume is an important parameter in most transportation planning applications. Low volume roads make up about 69% of road miles in the United States. Estimating traffic on the low volume roads is a cost-effective alternative to taking traffic counts. This is because traditional traffic counts are expensive and impractical for low priority roads. The purpose of this paper is to present the development of two alternative means of cost-effectively estimating traffic volumes for low volume roads in Wyoming and to make recommendations for their implementation. The study methodology involves reviewing existing studies, identifying data sources, and carrying out the model development. The utility of the models developed was then verified by comparing actual traffic volumes to those predicted by the model. The study resulted in two regression models that are inexpensive and easy to implement. The first regression model was a linear regression model that utilized pavement type, access to highways, predominant land use types, and population to estimate traffic volume. In verifying the model, an R² value of 0.64 and a root mean square error of 73.4% were obtained. The second model was a logistic regression model that identified the level of traffic on roads using five thresholds or levels. The logistic regression model was verified by estimating traffic volume thresholds and determining the percentage of roads that were accurately classified as belonging to the given thresholds. For the five thresholds, the percentage of roads classified correctly ranged from 79% to 88%. In conclusion, the verification of the models indicated both model types to be useful for accurate and cost-effective estimation of traffic volumes for low volume Wyoming roads. The models developed were recommended for use in traffic volume estimations for low volume roads in pavement management and environmental impact assessment studies.

  14. The maximally achievable accuracy of linear optimal regulators and linear optimal filters

    Kwakernaak, H.; Sivan, Raphael

    1972-01-01

    A linear system with a quadratic cost function, which is a weighted sum of the integral square regulation error and the integral square input, is considered. What happens to the integral square regulation error as the relative weight of the integral square input reduces to zero is investigated. In

  15. Introduction to statistical modelling 2: categorical variables and interactions in linear regression.

    Lunt, Mark

    2015-07-01

    In the first article in this series we explored the use of linear regression to predict an outcome variable from a number of predictive factors. It assumed that the predictive factors were measured on an interval scale. However, this article shows how categorical variables can also be included in a linear regression model, enabling predictions to be made separately for different groups and allowing for testing the hypothesis that the outcome differs between groups. The use of interaction terms to measure whether the effect of a particular predictor variable differs between groups is also explained. An alternative approach to testing the difference between groups of the effect of a given predictor, which consists of measuring the effect in each group separately and seeing whether the statistical significance differs between the groups, is shown to be misleading. © The Author 2013. Published by Oxford University Press on behalf of the British Society for Rheumatology. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
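
    The idea of coding a categorical variable as a dummy column and adding an interaction term can be sketched as follows. This is a minimal illustration, not the article's own analysis: the data are made up, and the helper names `solve` and `ols` are hypothetical. A two-level group is coded as 0/1, and the product of the predictor and the dummy lets the slope differ between groups.

```python
def solve(a, b):
    """Solve the square system a @ x = b by Gauss-Jordan elimination
    with partial pivoting (small helper, assumed for this sketch)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


# Synthetic data (made up for illustration): group 0 follows y = 1 + 2x,
# group 1 follows y = 3 + 5x.
x = [0.0, 1.0, 2.0, 3.0, 0.0, 1.0, 2.0, 3.0]
group = [0, 0, 0, 0, 1, 1, 1, 1]
y = [1 + 2 * xi if g == 0 else 3 + 5 * xi for xi, g in zip(x, group)]

# Design matrix columns: intercept, x, group dummy, x * group interaction.
design = [[1.0, xi, float(g), xi * g] for xi, g in zip(x, group)]
b0, b1, b2, b3 = ols(design, y)

# b0, b1: intercept and slope in the reference group (group 0).
# b2: difference in intercept between groups; b3: difference in slope.
print(b0, b1, b2, b3)
```

    In this parameterisation the interaction coefficient is itself the estimate of how much the effect of the predictor differs between the groups, which is why a single test on that coefficient is preferable to comparing significance levels from two separate fits.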

  16. Reduction of interferences in graphite furnace atomic absorption spectrometry by multiple linear regression modelling

    Grotti, Marco; Abelmoschi, Maria Luisa; Soggia, Francesco; Tiberiade, Christian; Frache, Roberto

    2000-12-01

    The multivariate effects of Na, K, Mg and Ca as nitrates on the electrothermal atomisation of manganese, cadmium and iron were studied by multiple linear regression modelling. Since the models proved to efficiently predict the effects of the considered matrix elements in a wide range of concentrations, they were applied to correct the interferences occurring in the determination of trace elements in seawater after pre-concentration of the analytes. In order to obtain a statistically significant number of samples, a large volume of the certified seawater reference materials CASS-3 and NASS-3 was treated with Chelex-100 resin; then, the chelating resin was separated from the solution, divided into several sub-samples, each of them was eluted with nitric acid and analysed by electrothermal atomic absorption spectrometry (for trace element determinations) and inductively coupled plasma optical emission spectrometry (for matrix element determinations). To minimise any other systematic error besides that due to matrix effects, accuracy of the pre-concentration step and contamination levels of the procedure were checked by inductively coupled plasma mass spectrometric measurements. Analytical results obtained by applying the multiple linear regression models were compared with those obtained with other calibration methods, such as external calibration using acid-based standards, external calibration using matrix-matched standards and the analyte addition technique. Empirical models proved to efficiently reduce interferences occurring in the analysis of real samples, allowing an improvement of accuracy better than for other calibration methods.

  17. A Feature-Free 30-Disease Pathological Brain Detection System by Linear Regression Classifier.

    Chen, Yi; Shao, Ying; Yan, Jie; Yuan, Ti-Fei; Qu, Yanwen; Lee, Elizabeth; Wang, Shuihua

    2017-01-01

    The number of Alzheimer's disease patients is increasing rapidly every year. Scholars tend to use computer vision methods to develop automatic diagnosis systems. In 2015, Gorji et al. proposed a novel method using pseudo Zernike moments. They tested four classifiers: a learning vector quantization neural network, and pattern recognition neural networks trained by Levenberg-Marquardt, by resilient backpropagation, and by scaled conjugate gradient. This study presents an improved method by introducing a relatively new classifier: linear regression classification. Our method selects one axial slice from a 3D brain image and employs pseudo Zernike moments with a maximum order of 15 to extract 256 features from each image. Finally, linear regression classification is used as the classifier. The proposed approach obtains an accuracy of 97.51%, a sensitivity of 96.71%, and a specificity of 97.73%. Our method performs better than Gorji's approach and five other state-of-the-art approaches. Therefore, it can be used to detect Alzheimer's disease. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.

  18. Linear Regression between CIE-Lab Color Parameters and Organic Matter in Soils of Tea Plantations

    Chen, Yonggen; Zhang, Min; Fan, Dongmei; Fan, Kai; Wang, Xiaochang

    2018-02-01

    To quantify the relationship between the soil organic matter and color parameters using the CIE-Lab system, 62 soil samples (0-10 cm, Ferralic Acrisols) from tea plantations were collected from southern China. After air-drying and sieving, numerical color information and reflectance spectra of soil samples were measured under laboratory conditions using an UltraScan VIS (HunterLab) spectrophotometer equipped with CIE-Lab color models. We found that soil total organic carbon (TOC) and nitrogen (TN) contents were negatively correlated with the L* value (lightness) ( r = -0.84 and -0.80, respectively), a* value (correlation coefficient r = -0.51 and -0.46, respectively) and b* value ( r = -0.76 and -0.70, respectively). There were also linear regressions between TOC and TN contents with the L* value and b* value. Results showed that color parameters from a spectrophotometer equipped with CIE-Lab color models can predict TOC contents well for soils in tea plantations. The linear regression model between color values and soil organic carbon contents showed it can be used as a rapid, cost-effective method to evaluate content of soil organic matter in Chinese tea plantations.

  19. Multivariate linear regression of high-dimensional fMRI data with multiple target variables.

    Valente, Giancarlo; Castellanos, Agustin Lage; Vanacore, Gianluca; Formisano, Elia

    2014-05-01

    Multivariate regression is increasingly used to study the relation between fMRI spatial activation patterns and experimental stimuli or behavioral ratings. With linear models, informative brain locations are identified by mapping the model coefficients. This is a central aspect in neuroimaging, as it provides the sought-after link between the activity of neuronal populations and subject's perception, cognition or behavior. Here, we show that mapping of informative brain locations using multivariate linear regression (MLR) may lead to incorrect conclusions and interpretations. MLR algorithms for high dimensional data are designed to deal with targets (stimuli or behavioral ratings, in fMRI) separately, and the predictive map of a model integrates information deriving from both neural activity patterns and experimental design. Not accounting explicitly for the presence of other targets whose associated activity spatially overlaps with the one of interest may lead to predictive maps of troublesome interpretation. We propose a new model that can correctly identify the spatial patterns associated with a target while achieving good generalization. For each target, the training is based on an augmented dataset, which includes all remaining targets. The estimation on such datasets produces both maps and interaction coefficients, which are then used to generalize. The proposed formulation is independent of the regression algorithm employed. We validate this model on simulated fMRI data and on a publicly available dataset. Results indicate that our method achieves high spatial sensitivity and good generalization and that it helps disentangle specific neural effects from interaction with predictive maps associated with other targets. Copyright © 2013 Wiley Periodicals, Inc.

  20. Two-Sample Tests for High-Dimensional Linear Regression with an Application to Detecting Interactions.

    Xia, Yin; Cai, Tianxi; Cai, T Tony

    2018-01-01

    Motivated by applications in genomics, we consider in this paper global and multiple testing for the comparisons of two high-dimensional linear regression models. A procedure for testing the equality of the two regression vectors globally is proposed and shown to be particularly powerful against sparse alternatives. We then introduce a multiple testing procedure for identifying unequal coordinates while controlling the false discovery rate and false discovery proportion. Theoretical justifications are provided to guarantee the validity of the proposed tests and optimality results are established under sparsity assumptions on the regression coefficients. The proposed testing procedures are easy to implement. Numerical properties of the procedures are investigated through simulation and data analysis. The results show that the proposed tests maintain the desired error rates under the null and have good power under the alternative at moderate sample sizes. The procedures are applied to the Framingham Offspring study to investigate the interactions between smoking and cardiovascular related genetic mutations important for an inflammation marker.

  1. Synthesis of linear regression coefficients by recovering the within-study covariance matrix from summary statistics.

    Yoneoka, Daisuke; Henmi, Masayuki

    2017-06-01

    Recently, the number of regression models has dramatically increased in several academic fields. However, within the context of meta-analysis, synthesis methods for such models have not developed at a commensurate pace. One of the difficulties hindering the development is the disparity in sets of covariates among literature models. If the sets of covariates differ across models, interpretation of coefficients will differ, thereby making it difficult to synthesize them. Moreover, previous synthesis methods for regression models, such as multivariate meta-analysis, often have problems because the covariance matrix of coefficients (i.e. within-study correlations) or individual patient data are not necessarily available. This study, therefore, gives a brief explanation of a method to synthesize linear regression models under different covariate sets by using a generalized least squares method involving bias correction terms. In particular, we also propose an approach to recover (at most) three correlations of covariates, which is required for the calculation of the bias term without individual patient data. Copyright © 2016 John Wiley & Sons, Ltd.

  2. Soil moisture estimation using multi linear regression with terraSAR-X data

    G. García

    2016-06-01

    The first five centimeters of soil form an interface where the main heat flux exchanges between the land surface and the atmosphere occur. Besides ground measurements, remote sensing has proven to be an excellent tool for monitoring spatially and temporally distributed data of the most relevant Earth surface parameters, including soil parameters. Indeed, active microwave sensors (Synthetic Aperture Radar - SAR) offer the opportunity to monitor soil moisture (HS) at global, regional and local scales by monitoring the processes involved. Several inversion algorithms that derive geophysical information such as HS from SAR data have been developed. Many of them use electromagnetic models for simulating the backscattering coefficient and are based on statistical techniques, such as neural networks, inversion methods and regression models. Recent studies have shown that simple multiple regression techniques yield satisfactory results. The geophysical variables involved in these methodologies describe the soil structure, microwave characteristics and land use. Therefore, in this paper we aim to develop a multiple linear regression model to estimate HS in flat agricultural regions using TerraSAR-X satellite data and data from a ground weather station. The results show that the backscatter, the precipitation and the relative humidity are the explanatory variables of HS. The model yielded an RMSE of 5.4 and an R² of about 0.6.

  3. Uncertainty of pesticide residue concentration determined from ordinary and weighted linear regression curve.

    Yolci Omeroglu, Perihan; Ambrus, Árpad; Boyacioglu, Dilek

    2018-03-28

    Determination of pesticide residues is based on calibration curves constructed for each batch of analysis. Calibration standard solutions are prepared from a known amount of reference material at different concentration levels covering the concentration range of the analyte in the analysed samples. In the scope of this study, the applicability of both ordinary linear and weighted linear regression (OLR and WLR) for pesticide residue analysis was investigated. We used 782 multipoint calibration curves obtained for 72 different analytical batches with high-pressure liquid chromatography equipped with an ultraviolet detector, and gas chromatography with electron capture, nitrogen phosphorus or mass spectrophotometer detectors. Quality criteria of the linear curves including regression coefficient, standard deviation of relative residuals and deviation of back calculated concentrations were calculated both for WLR and OLR methods. Moreover, the relative uncertainty of the predicted analyte concentration was estimated for both methods. It was concluded that calibration curve based on WLR complies with all the quality criteria set by international guidelines compared to those calculated with OLR. It means that all the data fit well with WLR for pesticide residue analysis. It was estimated that, regardless of the actual concentration range of the calibration, relative uncertainty at the lowest calibrated level ranged between 0.3% and 113.7% for OLR and between 0.2% and 22.1% for WLR. At or above 1/3 of the calibrated range, uncertainty of calibration curve ranged between 0.1% and 16.3% for OLR and 0% and 12.2% for WLR, and therefore, the two methods gave comparable results.
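
    The contrast between ordinary and weighted calibration lines can be sketched as below. This is an illustrative sketch only: the calibration points are invented, the 1/x² weighting is one common choice and is an assumption here (the abstract does not state which weights were used), and the function names are hypothetical.

```python
def fit_line(x, y, w=None):
    """Return (slope, intercept) minimising sum(w * (y - slope*x - intercept)**2).

    With w=None every point gets weight 1, i.e. ordinary least squares.
    """
    if w is None:
        w = [1.0] * len(x)
    sw = sum(w)
    sx = sum(wi * xi for wi, xi in zip(w, x))
    sy = sum(wi * yi for wi, yi in zip(w, y))
    sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    slope = (sw * sxy - sx * sy) / (sw * sxx - sx * sx)
    intercept = (sy - slope * sx) / sw
    return slope, intercept


def back_calc_error(x, y, slope, intercept):
    """Relative deviation of back-calculated concentrations from nominal."""
    return [abs((yi - intercept) / slope - xi) / xi for xi, yi in zip(x, y)]


# Invented calibration data: nominal concentrations and detector responses
# whose scatter grows with concentration (heteroscedastic).
conc = [0.01, 0.05, 0.1, 0.5, 1.0]
resp = [1.05, 4.9, 10.2, 49.0, 103.0]

olr = fit_line(conc, resp)                                  # ordinary fit
wlr = fit_line(conc, resp, w=[1.0 / c ** 2 for c in conc])  # assumed 1/x^2 weights

err_olr = back_calc_error(conc, resp, *olr)
err_wlr = back_calc_error(conc, resp, *wlr)
print("lowest level, OLR:", err_olr[0], "WLR:", err_wlr[0])
```

    In the unweighted fit the high-concentration standards dominate the sum of squares, so a small shift in the fitted intercept can translate into a large relative error at the lowest level; down-weighting the high standards reduces that effect in this made-up data set.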

  4. Linear filters as a method of real-time prediction of geomagnetic activity

    McPherron, R.L.; Baker, D.N.; Bargatze, L.F.

    1985-01-01

    Important factors controlling geomagnetic activity include the solar wind velocity, the strength of the interplanetary magnetic field (IMF), and the field orientation. Because these quantities change so much in transit through the solar wind, real-time monitoring immediately upstream of the earth provides the best input for any technique of real-time prediction. One such technique is linear prediction filtering which utilizes past histories of the input and output of a linear system to create a time-invariant filter characterizing the system. Problems of nonlinearity or temporal changes of the system can be handled by appropriate choice of input parameters and piecewise approximation in various ranges of the input. We have created prediction filters for all the standard magnetic indices and tested their efficiency. The filters show that the initial response of the magnetosphere to a southward turning of the IMF peaks in 20 minutes and then again in 55 minutes. After a northward turning, auroral zone indices and the midlatitude ASYM index return to background within 2 hours, while Dst decays exponentially with a time constant of about 8 hours. This paper describes a simple, real-time system utilizing these filters which could predict a substantial fraction of the variation in magnetic activity indices 20 to 50 minutes in advance
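
    A linear prediction filter of the kind described, estimated from past input and output histories, can be sketched as a least-squares fit of a finite impulse response. The sketch below is not the authors' procedure: the input series is random synthetic data, the three-tap "true" response is invented, and the helper names are hypothetical.

```python
import random


def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def apply_fir(h, u):
    """Output y[t] = sum_k h[k] * u[t - k], treating u as zero before t = 0."""
    return [sum(h[k] * u[t - k] for k in range(len(h)) if t - k >= 0)
            for t in range(len(u))]


def fit_fir(u, y, n_taps):
    """Least-squares estimate of a time-invariant filter from input/output history."""
    # Each row holds the current and previous n_taps - 1 input samples.
    rows = [[u[t - k] for k in range(n_taps)] for t in range(n_taps - 1, len(u))]
    target = y[n_taps - 1:]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(n_taps)]
           for i in range(n_taps)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, target)) for i in range(n_taps)]
    return solve(xtx, xty)


# Synthetic stand-ins: u plays the role of a solar wind input series,
# y the role of a magnetic index driven by it through an invented response.
rng = random.Random(0)
u = [rng.uniform(-1.0, 1.0) for _ in range(200)]
true_h = [0.5, 0.3, 0.2]
y = apply_fir(true_h, u)

h_est = fit_fir(u, y, n_taps=3)
print(h_est)
```

    Once the coefficients are estimated, convolving them with new input samples gives the predicted output; peaks in the estimated response at particular lags correspond to delayed responses of the system to the input.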

  5. Heteroscedasticity as a Basis of Direction Dependence in Reversible Linear Regression Models.

    Wiedermann, Wolfgang; Artner, Richard; von Eye, Alexander

    2017-01-01

    Heteroscedasticity is a well-known issue in linear regression modeling. When heteroscedasticity is observed, researchers are advised to remedy possible model misspecification of the explanatory part of the model (e.g., considering alternative functional forms and/or omitted variables). The present contribution discusses another source of heteroscedasticity in observational data: Directional model misspecifications in the case of nonnormal variables. Directional misspecification refers to situations where alternative models are equally likely to explain the data-generating process (e.g., x → y versus y → x). It is shown that the homoscedasticity assumption is likely to be violated in models that erroneously treat true nonnormal predictors as response variables. Recently, Direction Dependence Analysis (DDA) has been proposed as a framework to empirically evaluate the direction of effects in linear models. The present study links the phenomenon of heteroscedasticity with DDA and describes visual diagnostics and nine homoscedasticity tests that can be used to make decisions concerning the direction of effects in linear models. Results of a Monte Carlo simulation that demonstrate the adequacy of the approach are presented. An empirical example is provided, and applicability of the methodology in cases of violated assumptions is discussed.

  6. Study of 1D complex resistivity inversion using digital linear filter technique; Linear filter ho wo mochiita fukusohi teiko no gyakukaisekiho no kento

    Sakurai, K; Shima, H [OYO Corp., Tokyo (Japan)

    1996-10-01

    This paper proposes a modeling method of one-dimensional complex resistivity using linear filter technique which has been extended to the complex resistivity. In addition, a numerical test of inversion was conducted using the monitoring results, to discuss the measured frequency band. Linear filter technique is a method by which theoretical potential can be calculated for stratified structures, and it is widely used for the one-dimensional analysis of dc electrical exploration. The modeling can be carried out only using values of complex resistivity without using values of potential. In this study, a bipolar method was employed as a configuration of electrodes. The numerical test of one-dimensional complex resistivity inversion was conducted using the formulated modeling. A three-layered structure model was used as a numerical model. A multi-layer structure with a thickness of 5 m was analyzed on the basis of apparent complex resistivity calculated from the model. From the results of numerical test, it was found that both the chargeability and the time constant agreed well with those of the original model. A trade-off was observed between the chargeability and the time constant at the stage of convergence. 3 refs., 9 figs., 1 tab.

  7. A land use regression model for ambient ultrafine particles in Montreal, Canada: A comparison of linear regression and a machine learning approach.

    Weichenthal, Scott; Ryswyk, Keith Van; Goldstein, Alon; Bagg, Scott; Shekkarizfard, Maryam; Hatzopoulou, Marianne

    2016-04-01

    Existing evidence suggests that ambient ultrafine particles (UFPs) (regression model for UFPs in Montreal, Canada using mobile monitoring data collected from 414 road segments during the summer and winter months between 2011 and 2012. Two different approaches were examined for model development including standard multivariable linear regression and a machine learning approach (kernel-based regularized least squares (KRLS)) that learns the functional form of covariate impacts on ambient UFP concentrations from the data. The final models included parameters for population density, ambient temperature and wind speed, land use parameters (park space and open space), length of local roads and rail, and estimated annual average NOx emissions from traffic. The final multivariable linear regression model explained 62% of the spatial variation in ambient UFP concentrations whereas the KRLS model explained 79% of the variance. The KRLS model performed slightly better than the linear regression model when evaluated using an external dataset (R² = 0.58 vs. 0.55) or a cross-validation procedure (R² = 0.67 vs. 0.60). In general, our findings suggest that the KRLS approach may offer modest improvements in predictive performance compared to standard multivariable linear regression models used to estimate spatial variations in ambient UFPs. However, differences in predictive performance were not statistically significant when evaluated using the cross-validation procedure. Crown Copyright © 2015. Published by Elsevier Inc. All rights reserved.

  8. Estimating leaf photosynthetic pigments information by stepwise multiple linear regression analysis and a leaf optical model

    Liu, Pudong; Shi, Runhe; Wang, Hong; Bai, Kaixu; Gao, Wei

    2014-10-01

    Leaf pigments are key elements for plant photosynthesis and growth. Traditional manual sampling of these pigments is labor-intensive and costly, and it has difficulty capturing their temporal and spatial characteristics. The aim of this work is to estimate photosynthetic pigments at large scale by remote sensing. For this purpose, inverse models were proposed with the aid of stepwise multiple linear regression (SMLR) analysis. Furthermore, a leaf radiative transfer model (i.e. the PROSPECT model) was employed to simulate the leaf reflectance at wavelengths from 400 to 780 nm at 1 nm intervals, and these values were then treated as data from remote sensing observations. Meanwhile, simulated chlorophyll concentration (Cab), carotenoid concentration (Car) and their ratio (Cab/Car) were each taken as the target to build a regression model. In this study, a total of 4000 samples were simulated via PROSPECT with different Cab, Car and leaf mesophyll structures; 70% of these samples were used for training and the remaining 30% for model validation. Reflectance (r) and its mathematical transformations (1/r and log (1/r)) were each used to build regression models. Results showed fair agreement between pigments and simulated reflectance, with all adjusted coefficients of determination (R2) larger than 0.8 when 6 wavebands were selected to build the SMLR model. The largest values of R2 for Cab, Car and Cab/Car are 0.8845, 0.876 and 0.8765, respectively. Meanwhile, mathematical transformations of reflectance showed little influence on regression accuracy. We concluded that it was feasible to estimate chlorophyll, carotenoids and their ratio with a statistical model based on leaf reflectance data.
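
    The 70/30 split and the adjusted coefficient of determination mentioned above can be sketched as follows. This is a generic illustration under stated assumptions, not the study's code: the function names are hypothetical, the sample list is a placeholder for simulated spectra, and the adjustment uses the standard formula 1 - (1 - R2)(n - 1)/(n - p - 1) for n samples and p predictors.

```python
import random


def train_validation_split(samples, train_fraction=0.7, seed=0):
    """Shuffle a copy of the samples and split it into training and validation parts."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n_samples, n_predictors):
    """Penalise R2 for the number of predictors (e.g. selected wavebands)."""
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)


# Placeholder sample identifiers standing in for 4000 simulated leaf spectra.
train, valid = train_validation_split(list(range(4000)))
print(len(train), len(valid))

# Example of the adjustment for a model with 6 predictors fitted on the training set.
print(adjusted_r_squared(0.9, n_samples=len(train), n_predictors=6))
```

    With thousands of samples and only a handful of selected wavebands, the adjustment changes R2 very little; it matters more when the number of candidate predictors is large relative to the sample size.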

  9. Causal correlation of foliar biochemical concentrations with AVIRIS spectra using forced entry linear regression

    Dawson, Terence P.; Curran, Paul J.; Kupiec, John A.

    1995-01-01

    link between wavelengths chosen by stepwise regression and the biochemical of interest, and this in turn has cast doubts on the use of imaging spectrometry for the estimation of foliar biochemical concentrations at sites distant from the training sites. To investigate this problem, an analysis was conducted on the variation in canopy biochemical concentrations and reflectance spectra using forced entry linear regression.

  10. Analysis of positron lifetime spectra using quantified maximum entropy and a general linear filter

    Shukla, A.; Peter, M.; Hoffmann, L.

    1993-01-01

    Two new approaches are used to analyze positron annihilation lifetime spectra. A general linear filter is designed to filter the noise from lifetime data. The quantified maximum entropy method is used to solve the inverse problem of finding the lifetimes and intensities present in data. We determine optimal values of parameters needed for fitting using Bayesian methods. Estimates of errors are provided. We present results on simulated and experimental data with extensive tests to show the utility of this method and compare it with other existing methods. (orig.)

  11. Alternate MIMO AF relaying networks with interference alignment: Spectral efficient protocol and linear filter design

    Park, Kihong

    2013-02-01

    In this paper, we study a two-hop relaying network consisting of one source, one destination, and three amplify-and-forward (AF) relays with multiple antennas. To compensate for the capacity prelog factor loss of 1/2 due to the half-duplex relaying, alternate transmission is performed among three relays, and the inter-relay interference due to the alternate relaying is aligned to make additional degrees of freedom. In addition, suboptimal linear filter designs at the nodes are proposed to maximize the achievable sum rate for different fading scenarios when the destination utilizes a minimum mean-square error filter. © 1967-2012 IEEE.

  12. Boosted regression trees, multivariate adaptive regression splines and their two-step combinations with multiple linear regression or partial least squares to predict blood-brain barrier passage: a case study.

    Deconinck, E; Zhang, M H; Petitet, F; Dubus, E; Ijjaali, I; Coomans, D; Vander Heyden, Y

    2008-02-18

    The use of some unconventional non-linear modeling techniques, i.e. classification and regression trees and multivariate adaptive regression splines-based methods, was explored to model the blood-brain barrier (BBB) passage of drugs and drug-like molecules. The data set contains BBB passage values for 299 structural and pharmacological diverse drugs, originating from a structured knowledge-based database. Models were built using boosted regression trees (BRT) and multivariate adaptive regression splines (MARS), as well as their respective combinations with stepwise multiple linear regression (MLR) and partial least squares (PLS) regression in two-step approaches. The best models were obtained using combinations of MARS with either stepwise MLR or PLS. It could be concluded that the use of combinations of a linear with a non-linear modeling technique results in some improved properties compared to the individual linear and non-linear models and that, when the use of such a combination is appropriate, combinations using MARS as non-linear technique should be preferred over those with BRT, due to some serious drawbacks of the BRT approaches.

  13. Daily Suspended Sediment Discharge Prediction Using Multiple Linear Regression and Artificial Neural Network

    Uca; Toriman, Ekhwan; Jaafar, Othman; Maru, Rosmini; Arfan, Amal; Saleh Ahmar, Ansari

    2018-01-01

    Prediction of suspended sediment discharge in a catchment area is very important because it can be used to evaluate the erosion hazard, to manage water resources, water quality and hydrology projects (dams, reservoirs, and irrigation), and to determine the extent of the damage that has occurred in the catchment. Multiple linear regression analysis and artificial neural networks can be used to predict the amount of daily suspended sediment discharge. The regression analysis uses the least squares method, whereas the artificial neural networks use a Radial Basis Function (RBF) and a feedforward multilayer perceptron with three learning algorithms, namely Levenberg-Marquardt (LM), Scaled Conjugate Gradient (SCG) and Broyden-Fletcher-Goldfarb-Shanno Quasi-Newton (BFGS). The number of neurons in the hidden layer ranges from three to sixteen, while the output layer has only one neuron because there is only one output target. In terms of the mean absolute error (MAE), root mean square error (RMSE), coefficient of determination (R2) and coefficient of efficiency (CE), the multiple linear regression (MLRg) Model 2 (6 independent input variables) has the lowest MAE and RMSE (0.0000002 and 13.6039) and the highest R2 and CE (0.9971 and 0.9971). Compared with LM, SCG and RBF, the BFGS model with structure 3-7-1 is better and more accurate for predicting suspended sediment discharge in the Jenderam catchment. Its performance values in the testing process, MAE and RMSE (13.5769 and 17.9011), are the smallest, while R2 and CE (0.9999 and 0.9998) are the highest, compared with the other BFGS Quasi-Newton models (6-3-1, 9-10-1 and 12-12-1). Based on the performance statistics, MLRg, LM, SCG, BFGS and RBF are suitable and accurate for prediction by modeling the non-linear complex behavior of suspended sediment responses to rainfall, water depth and discharge. In the comparison between the artificial neural network (ANN) and MLRg, the MLRg Model 2 accurately predicts suspended sediment discharge (kg

  14. Recursive least squares method of regression coefficients estimation as a special case of Kalman filter

    Borodachev, S. M.

    2016-06-01

    A simple derivation of the recursive least squares (RLS) equations is given as a special case of Kalman filter estimation of a constant system state under changing observation conditions. A numerical example illustrates the application of RLS to the multicollinearity problem.
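
    The correspondence described in this abstract can be illustrated with a minimal sketch. It assumes the textbook form of the Kalman measurement update applied to a state that never changes (identity transition, no process noise), with each regressor row acting as the time-varying observation matrix. The function names, the diffuse prior variance and the toy data are illustrative assumptions and are not taken from the article.

```python
def rls_update(theta, P, x, y, r=1.0):
    """One Kalman-style measurement update for a constant state theta.

    theta: current coefficient estimate (list of length p)
    P:     current covariance of the estimate (p x p list of lists, symmetric)
    x:     regressor row for this observation (acts as the observation matrix)
    y:     observed response
    r:     assumed observation-noise variance
    """
    p = len(theta)
    Px = [sum(P[i][j] * x[j] for j in range(p)) for i in range(p)]  # P x
    s = r + sum(x[i] * Px[i] for i in range(p))                     # innovation variance
    gain = [Px[i] / s for i in range(p)]                            # Kalman gain
    innovation = y - sum(x[i] * theta[i] for i in range(p))
    theta_new = [theta[i] + gain[i] * innovation for i in range(p)]
    # P - K x^T P, using x^T P = (P x)^T because P is symmetric
    P_new = [[P[i][j] - gain[i] * Px[j] for j in range(p)] for i in range(p)]
    return theta_new, P_new


def rls_fit(X, y, prior_var=1e6, r=1.0):
    """Process observations one at a time, starting from a diffuse prior."""
    p = len(X[0])
    theta = [0.0] * p
    P = [[prior_var if i == j else 0.0 for j in range(p)] for i in range(p)]
    for xi, yi in zip(X, y):
        theta, P = rls_update(theta, P, xi, yi, r)
    return theta


# Toy, noise-free data on the line y = 2 + 3 t (made up for illustration).
X = [[1.0, float(t)] for t in range(6)]
y = [2.0 + 3.0 * t for t in range(6)]
theta = rls_fit(X, y)  # intercept and slope end up close to 2 and 3
```

    With a large prior variance the recursion approaches the ordinary batch least squares solution; a smaller prior variance acts like a ridge penalty, which is one common way such a recursion is stabilised when regressors are nearly collinear.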

  15. Carbon 13 nuclear magnetic resonance chemical shifts empiric calculations of polymers by multi linear regression and molecular modeling

    Da Silva Pinto, P.S.; Eustache, R.P.; Audenaert, M.; Bernassau, J.M.

    1996-01-01

    This work deals with the empirical calculation of carbon-13 nuclear magnetic resonance chemical shifts by multiple linear regression and molecular modeling. Multiple linear regression is one way to obtain an equation able to describe the behaviour of the chemical shift for the molecules in the database (rigid molecules with carbons). The methodology consists of defining structural descriptor parameters that can be related to the known carbon-13 chemical shifts of these molecules. Linear regression is then used to determine the significant parameters of the equation, which can be extrapolated to molecules that resemble those of the database. (O.L.). 20 refs., 4 figs., 1 tab

  16. Analysis of the Covered Electrode Welding Process Stability on the Basis of Linear Regression Equation

    Słania J.

    2014-10-01

    Full Text Available The article presents the process of production of coated electrodes and their welding properties. The factors concerning the welding properties and the currently applied method of assessing them are given. The testing methodology, based on measuring and recording instantaneous values of welding current and welding arc voltage, is discussed. An algorithm for creating the reference database of the expert system, which aids the assessment of covered electrode welding properties, is shown. The stability of voltage–current characteristics is discussed. Statistical factors of the instantaneous welding current and welding arc voltage waveforms used for determining welding process stability are presented. The results for the welding properties of coated electrodes are compared. The article presents the results of linear regression as well as the impact of the independent variables on the welding process performance. Finally, the conclusions drawn from the research are given.

  17. Generalized Partially Linear Regression with Misclassified Data and an Application to Labour Market Transitions

    Dlugosz, Stephan; Mammen, Enno; Wilke, Ralf

    2017-01-01

    Large data sets that originate from administrative or operational activity are increasingly used for statistical analysis as they often contain very precise information and a large number of observations. But there is evidence that some variables can be subject to severe misclassification...... or contain missing values. Given the size of the data, a flexible semiparametric misclassification model would be good choice but their use in practise is scarce. To close this gap a semiparametric model for the probability of observing labour market transitions is estimated using a sample of 20 m...... observations from Germany. It is shown that estimated marginal effects of a number of covariates are sizeably affected by misclassification and missing values in the analysis data. The proposed generalized partially linear regression extends existing models by allowing a misclassified discrete covariate...

  18. hMuLab: A Biomedical Hybrid MUlti-LABel Classifier Based on Multiple Linear Regression.

    Wang, Pu; Ge, Ruiquan; Xiao, Xuan; Zhou, Manli; Zhou, Fengfeng

    2017-01-01

    Many biomedical classification problems are multi-label by nature, e.g., a gene involved in a variety of functions and a patient with multiple diseases. The majority of existing classification algorithms assumes each sample with only one class label, and the multi-label classification problem remains to be a challenge for biomedical researchers. This study proposes a novel multi-label learning algorithm, hMuLab, by integrating both feature-based and neighbor-based similarity scores. The multiple linear regression modeling techniques make hMuLab capable of producing multiple label assignments for a query sample. The comparison results over six commonly-used multi-label performance measurements suggest that hMuLab performs accurately and stably for the biomedical datasets, and may serve as a complement to the existing literature.

  19. Linear regression models and k-means clustering for statistical analysis of fNIRS data.

    Bonomini, Viola; Zucchelli, Lucia; Re, Rebecca; Ieva, Francesca; Spinelli, Lorenzo; Contini, Davide; Paganoni, Anna; Torricelli, Alessandro

    2015-02-01

    We propose a new algorithm, based on a linear regression model, to statistically estimate the hemodynamic activations in fNIRS data sets. The main concern guiding the algorithm development was the minimization of assumptions and approximations made on the data set for the application of statistical tests. Further, we propose a K-means method to cluster fNIRS data (i.e. channels) as activated or not activated. The methods were validated both on simulated and in vivo fNIRS data. A time domain (TD) fNIRS technique was preferred because of its high performances in discriminating cortical activation and superficial physiological changes. However, the proposed method is also applicable to continuous wave or frequency domain fNIRS data sets.

  20. Multiple Linear Regression Model Based on Neural Network and Its Application in the MBR Simulation

    Chunqing Li

    2012-01-01

    Full Text Available Computer simulation of the membrane bioreactor (MBR) has become the research focus of MBR simulation. To compensate for defects such as long test periods, high cost, and invisible sealed equipment, this paper builds on an in-depth study of the mathematical model of the MBR, combined with neural network theory, and proposes a three-dimensional simulation system for MBR wastewater treatment with fast speed, high efficiency, and good visualization. The system is developed with hybrid programming in the VC++ programming language and OpenGL, using a neural-network-based multifactor linear regression model of the factors affecting MBR membrane flux and applying the modeling methods of integers instead of floats and quad-tree recursion. The experiments show that the three-dimensional simulation system, using the above models and methods, offers inspiration and a reference for future research and application of MBR simulation technology.

  1. Predicting Fuel Ignition Quality Using 1H NMR Spectroscopy and Multiple Linear Regression

    Abdul Jameel, Abdul Gani

    2016-09-14

    An improved model for the prediction of ignition quality of hydrocarbon fuels has been developed using 1H nuclear magnetic resonance (NMR) spectroscopy and multiple linear regression (MLR) modeling. Cetane number (CN) and derived cetane number (DCN) of 71 pure hydrocarbons and 54 hydrocarbon blends were utilized as a data set to study the relationship between ignition quality and molecular structure. CN and DCN are functional equivalents and collectively referred to as D/CN, herein. The effect of molecular weight and weight percent of structural parameters such as paraffinic CH3 groups, paraffinic CH2 groups, paraffinic CH groups, olefinic CH–CH2 groups, naphthenic CH–CH2 groups, and aromatic C–CH groups on D/CN was studied. A particular emphasis on the effect of branching (i.e., methyl substitution) on the D/CN was studied, and a new parameter denoted as the branching index (BI) was introduced to quantify this effect. A new formula was developed to calculate the BI of hydrocarbon fuels using 1H NMR spectroscopy. Multiple linear regression (MLR) modeling was used to develop an empirical relationship between D/CN and the eight structural parameters. This was then used to predict the DCN of many hydrocarbon fuels. The developed model has a high correlation coefficient (R2 = 0.97) and was validated with experimentally measured DCN of twenty-two real fuel mixtures (e.g., gasolines and diesels) and fifty-nine blends of known composition, and the predicted values matched well with the experimental data.

  2. Standardizing effect size from linear regression models with log-transformed variables for meta-analysis.

    Rodríguez-Barranco, Miguel; Tobías, Aurelio; Redondo, Daniel; Molina-Portillo, Elena; Sánchez, María José

    2017-03-17

    Meta-analysis is very useful to summarize the effect of a treatment or a risk factor for a given disease. Often studies report results based on log-transformed variables in order to satisfy the principal assumptions of a linear regression model. If this is the case for some, but not all studies, the effects need to be homogenized. We derived a set of formulae to transform absolute changes into relative ones, and vice versa, to allow including all results in a meta-analysis. We applied our procedure to all possible combinations of log-transformed independent or dependent variables. We also evaluated it in a simulation based on two variables either normally or asymmetrically distributed. In all the scenarios, and based on different change criteria, the effect size estimated by the derived set of formulae was equivalent to the real effect size. To avoid biased estimates of the effect, this procedure should be used with caution in the case of independent variables with asymmetric distributions that significantly differ from the normal distribution. We illustrate this procedure with an application to a meta-analysis on the potential effects on neurodevelopment in children exposed to arsenic and manganese. The procedure proposed has been shown to be valid and capable of expressing the effect size of a linear regression model based on different change criteria in the variables. Homogenizing the results from different studies beforehand allows them to be combined in a meta-analysis, independently of whether the transformations had been performed on the dependent and/or independent variables.
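
    As a rough illustration of the kind of conversion this abstract refers to, the sketch below applies the standard textbook interpretations of a regression slope when the dependent variable, the independent variable, or both are natural-log transformed. These are not the article's own formulae, which the abstract does not reproduce; the function names and default change sizes are assumptions made for the example.

```python
import math


def log_y_effect(beta, delta_x=1.0):
    """log(y) ~ x: relative change in y for an absolute change delta_x in x."""
    return math.exp(beta * delta_x) - 1.0


def log_x_effect(beta, rel_x=0.10):
    """y ~ log(x): absolute change in y for a relative change rel_x in x."""
    return beta * math.log(1.0 + rel_x)


def log_log_effect(beta, rel_x=0.10):
    """log(y) ~ log(x): relative change in y for a relative change rel_x in x."""
    return (1.0 + rel_x) ** beta - 1.0


# Hypothetical slope: a study reports beta = 0.05 for log(y) regressed on x.
relative_change = log_y_effect(0.05)  # about a 5.1 % increase in y per unit of x
```

    Expressing every study's slope on one common scale (for example, relative change in the outcome per fixed absolute change in exposure) is what lets effects from differently transformed models be pooled.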

  3. Multiple linear combination (MLC) regression tests for common variants adapted to linkage disequilibrium structure.

    Yoo, Yun Joo; Sun, Lei; Poirier, Julia G; Paterson, Andrew D; Bull, Shelley B

    2017-02-01

    By jointly analyzing multiple variants within a gene, instead of one at a time, gene-based multiple regression can improve power, robustness, and interpretation in genetic association analysis. We investigate multiple linear combination (MLC) test statistics for analysis of common variants under realistic trait models with linkage disequilibrium (LD) based on HapMap Asian haplotypes. MLC is a directional test that exploits LD structure in a gene to construct clusters of closely correlated variants recoded such that the majority of pairwise correlations are positive. It combines variant effects within the same cluster linearly, and aggregates cluster-specific effects in a quadratic sum of squares and cross-products, producing a test statistic with reduced degrees of freedom (df) equal to the number of clusters. By simulation studies of 1000 genes from across the genome, we demonstrate that MLC is a well-powered and robust choice among existing methods across a broad range of gene structures. Compared to minimum P-value, variance-component, and principal-component methods, the mean power of MLC is never much lower than that of other methods, and can be higher, particularly with multiple causal variants. Moreover, the variation in gene-specific MLC test size and power across 1000 genes is less than that of other methods, suggesting it is a complementary approach for discovery in genome-wide analysis. The cluster construction of the MLC test statistics helps reveal within-gene LD structure, allowing interpretation of clustered variants as haplotypic effects, while multiple regression helps to distinguish direct and indirect associations. © 2016 The Authors Genetic Epidemiology Published by Wiley Periodicals, Inc.

  4. A SOCIOLOGICAL ANALYSIS OF THE CHILDBEARING COEFFICIENT IN THE ALTAI REGION BASED ON METHOD OF FUZZY LINEAR REGRESSION

    Sergei Vladimirovich Varaksin

    2017-06-01

    Full Text Available Purpose. Construction of a mathematical model of the dynamics of childbearing in the Altai region in 2000–2016, and analysis of the dynamics of birth rates for multiple age categories of women of childbearing age. Methodology. An auxiliary element of the analysis is the construction of linear mathematical models of the dynamics of childbearing using the fuzzy linear regression method based on fuzzy numbers. Fuzzy linear regression is considered as an alternative to standard statistical linear regression for short time series and an unknown distribution law. The parameters of the fuzzy linear and standard statistical regressions for the childbearing time series were determined using the built-in algorithm of the MatLab language. The method of fuzzy linear regression has not yet been used in sociological research. Results. Conclusions are drawn about the socio-demographic changes in society, the high efficiency of the demographic policy of the leadership of the region and the country, and the applicability of the method of fuzzy linear regression for sociological analysis.

  5. Direct integral linear least square regression method for kinetic evaluation of hepatobiliary scintigraphy

    Shuke, Noriyuki

    1991-01-01

    In hepatobiliary scintigraphy, kinetic model analysis, which provides kinetic parameters such as the hepatic extraction or excretion rate, has been used for quantitative evaluation of liver function. In this analysis, unknown model parameters are usually determined by the nonlinear least squares regression method (NLS method), which requires iterative calculation and initial estimates of the unknown parameters. As a simple alternative to the NLS method, the direct integral linear least squares regression method (DILS method), which can determine model parameters by a simple calculation without initial estimates, is proposed, and its applicability to the analysis of hepatobiliary scintigraphy was tested. To see whether the DILS method could determine model parameters as well as the NLS method, and to determine an appropriate weight for the DILS method, simulated theoretical data based on prefixed parameters were fitted to a 1-compartment model using both the DILS method with various weightings and the NLS method. The parameter values obtained were then compared with the prefixed values used for data generation. The effect of various weights on the error of the parameter estimates was examined, and the inverse of time was found to be the weight that minimized the error. With this weight, the DILS method gave parameter values close to those obtained by the NLS method, and both sets of parameter values were very close to the prefixed values. With appropriate weighting, the DILS method could provide reliable parameter estimates that are relatively insensitive to data noise. In conclusion, the DILS method could be used as a simple alternative to the NLS method, providing reliable parameter estimates. (author)

  6. Neck-focused panic attacks among Cambodian refugees; a logistic and linear regression analysis.

    Hinton, Devon E; Chhean, Dara; Pich, Vuth; Um, Khin; Fama, Jeanne M; Pollack, Mark H

    2006-01-01

    Consecutive Cambodian refugees attending a psychiatric clinic were assessed for the presence and severity of current--i.e., at least one episode in the last month--neck-focused panic. Among the whole sample (N=130), in a logistic regression analysis, the Anxiety Sensitivity Index (ASI; odds ratio=3.70) and the Clinician-Administered PTSD Scale (CAPS; odds ratio=2.61) significantly predicted the presence of current neck panic (NP). Among the neck panic patients (N=60), in the linear regression analysis, NP severity was significantly predicted by NP-associated flashbacks (beta=.42), NP-associated catastrophic cognitions (beta=.22), and CAPS score (beta=.28). Further analysis revealed the effect of the CAPS score to be significantly mediated (Sobel test [Baron, R. M., & Kenny, D. A. (1986). The moderator-mediator variable distinction in social psychological research: conceptual, strategic, and statistical considerations. Journal of Personality and Social Psychology, 51, 1173-1182]) by both NP-associated flashbacks and catastrophic cognitions. In the care of traumatized Cambodian refugees, NP severity, as well as NP-associated flashbacks and catastrophic cognitions, should be specifically assessed and treated.

  7. QSAR Study of Insecticides of Phthalamide Derivatives Using Multiple Linear Regression and Artificial Neural Network Methods

    Adi Syahputra

    2014-03-01

    Full Text Available The quantitative structure-activity relationship (QSAR) for 21 insecticides of phthalamides containing hydrazone (PCH) was studied using multiple linear regression (MLR), principal component regression (PCR) and artificial neural network (ANN). Five descriptors were included in the model for MLR and ANN analysis, and five latent variables obtained from principal component analysis (PCA) were used in PCR analysis. Calculation of descriptors was performed using the semi-empirical PM6 method. ANN analysis was found to be the superior statistical technique compared to the other methods and gave a good correlation between descriptors and activity (r2 = 0.84). Based on the obtained model, we have successfully designed some new insecticides with higher predicted activity than those of previously synthesized compounds, e.g. 2-(decalinecarbamoyl-5-chloro-N’-((5-methylthiophen-2-ylmethylene benzohydrazide, 2-(decalinecarbamoyl-5-chloro-N’-((thiophen-2-yl-methylene benzohydrazide and 2-(decaline carbamoyl-N’-(4-fluorobenzylidene-5-chlorobenzohydrazide with predicted log LC50 of 1.640, 1.672, and 1.769 respectively.

  8. Bayesian linear regression with skew-symmetric error distributions with applications to survival analysis

    Rubio, Francisco J.

    2016-02-09

    We study Bayesian linear regression models with skew-symmetric scale mixtures of normal error distributions. These kinds of models can be used to capture departures from the usual assumption of normality of the errors in terms of heavy tails and asymmetry. We propose a general noninformative prior structure for these regression models and show that the corresponding posterior distribution is proper under mild conditions. We extend these propriety results to cases where the response variables are censored. The latter scenario is of interest in the context of accelerated failure time models, which are relevant in survival analysis. We present a simulation study that demonstrates good frequentist properties of the posterior credible intervals associated with the proposed priors. This study also sheds some light on the trade-off between increased model flexibility and the risk of over-fitting. We illustrate the performance of the proposed models with real data. Although we focus on models with univariate response variables, we also present some extensions to the multivariate case in the Supporting Information.

  9. A simplified calculation procedure for mass isotopomer distribution analysis (MIDA) based on multiple linear regression.

    Fernández-Fernández, Mario; Rodríguez-González, Pablo; García Alonso, J Ignacio

    2016-10-01

    We have developed a novel, rapid and easy calculation procedure for Mass Isotopomer Distribution Analysis based on multiple linear regression which allows the simultaneous calculation of the precursor pool enrichment and the fraction of newly synthesized labelled proteins (fractional synthesis) using linear algebra. To test this approach, we used the peptide RGGGLK as a model tryptic peptide containing three subunits of glycine. We selected glycine labelled in two 13C atoms (13C2-glycine) as labelled amino acid to demonstrate that spectral overlap is not a problem in the proposed methodology. The developed methodology was tested first in vitro by changing the precursor pool enrichment from 10 to 40% of 13C2-glycine. Secondly, a simulated in vivo synthesis of proteins was designed by combining the natural abundance RGGGLK peptide and 10 or 20% 13C2-glycine at 1:1, 1:3 and 3:1 ratios. Precursor pool enrichments and fractional synthesis values were calculated with satisfactory precision and accuracy using a simple spreadsheet. This novel approach can provide a relatively rapid and easy means to measure protein turnover based on stable isotope tracers. Copyright © 2016 John Wiley & Sons, Ltd.

  10. Describing Growth Pattern of Bali Cows Using Non-linear Regression Models

    Mohd. Hafiz A.W

    2016-12-01

    Full Text Available The objective of this study was to evaluate the best fit non-linear regression model to describe the growth pattern of Bali cows. Estimates of asymptotic mature weight, rate of maturing and constant of integration were derived from Brody, von Bertalanffy, Gompertz and Logistic models which were fitted to cross-sectional data of body weight taken from 74 Bali cows raised in MARDI Research Station Muadzam Shah Pahang. Coefficient of determination (R2) and residual mean squares (MSE) were used to determine the best fit model in describing the growth pattern of Bali cows. Von Bertalanffy model was the best model among the four growth functions evaluated to determine the mature weight of Bali cattle as shown by the highest R2 and lowest MSE values (0.973 and 601.9, respectively), followed by Gompertz (0.972 and 621.2, respectively), Logistic (0.971 and 648.4, respectively) and Brody (0.932 and 660.5, respectively) models. The correlation between rate of maturing and mature weight was found to be negative in the range of -0.170 to -0.929 for all models, indicating that animals of heavier mature weight had lower rate of maturing. The use of non-linear model could summarize the weight-age relationship into several biologically interpreted parameters compared to the entire lifespan weight-age data points that are difficult and time consuming to interpret.
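
    For orientation, the four growth functions named in this abstract are commonly written with three parameters: A (asymptotic mature weight), B (constant of integration) and k (rate of maturing). The sketch below uses those common parameterisations together with the two fit criteria mentioned in the abstract. The exact forms fitted in the study are not given here, so the formulas, the parameter values and the toy weights are assumptions for illustration only, not the study's data.

```python
import math


def brody(t, A, B, k):
    return A * (1.0 - B * math.exp(-k * t))


def von_bertalanffy(t, A, B, k):
    return A * (1.0 - B * math.exp(-k * t)) ** 3


def gompertz(t, A, B, k):
    return A * math.exp(-B * math.exp(-k * t))


def logistic(t, A, B, k):
    return A / (1.0 + B * math.exp(-k * t))


def r2_and_mse(observed, predicted, n_params=3):
    """Coefficient of determination and residual mean square (SS_res / (n - p))."""
    n = len(observed)
    mean_obs = sum(observed) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot, ss_res / (n - n_params)


# Made-up ages (months) and weights (kg) generated from a Gompertz curve.
ages = [0, 6, 12, 24, 36, 60]
weights = [gompertz(t, 300.0, 2.5, 0.08) for t in ages]

# Score a candidate parameter set against the toy data.
predicted = [gompertz(t, 300.0, 2.5, 0.08) for t in ages]
r2, mse = r2_and_mse(weights, predicted)
```

    In practice the parameters of each function would be estimated with a non-linear least squares routine, and the four fitted curves ranked by R2 and MSE as the abstract describes.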

  11. Association of footprint measurements with plantar kinetics: a linear regression model.

    Fascione, Jeanna M; Crews, Ryan T; Wrobel, James S

    2014-03-01

    The use of foot measurements to classify morphology and interpret foot function remains one of the focal concepts of lower-extremity biomechanics. However, only 27% to 55% of midfoot variance in foot pressures has been determined in the most comprehensive models. We investigated whether dynamic walking footprint measurements are associated with inter-individual foot loading variability. Thirty individuals (15 men and 15 women; mean ± SD age, 27.17 ± 2.21 years) walked at a self-selected speed over an electronic pedography platform using the midgait technique. Kinetic variables (contact time, peak pressure, pressure-time integral, and force-time integral) were collected for six masked regions. Footprints were digitized for area and linear boundaries using digital photo planimetry software. Six footprint measurements were determined: contact area, footprint index, arch index, truncated arch index, Chippaux-Smirak index, and Staheli index. Linear regression analysis with a Bonferroni adjustment was performed to determine the association between the footprint measurements and each of the kinetic variables. The findings demonstrate that a relationship exists between increased midfoot contact and increased kinetic values in respective locations. Many of these variables produced large effect sizes while describing 38% to 71% of the common variance of select plantar kinetic variables in the medial midfoot region. In addition, larger footprints were associated with larger kinetic values at the medial heel region and both masked forefoot regions. Dynamic footprint measurements are associated with dynamic plantar loading kinetics, with emphasis on the midfoot region.

  12. Feedback Linearization Control of a Shunt Active Power Filter Using a Fuzzy Controller

    Tianhua Li

    2013-09-01

    Full Text Available In this paper, a novel feedback linearization based sliding mode controlled parallel active power filter using a fuzzy controller is presented in a three-phase three-wire grid. A feedback linearization control with fuzzy parameter self-tuning is used to implement the DC side voltage regulation while a novel integral sliding mode controller is applied to reduce the total harmonic distortion of the supply current. Since traditional unit synchronous sinusoidal signal calculation methods are not applicable when the supply voltage contains harmonics, a novel unit synchronous sinusoidal signal computing method based on synchronous frame transforming theory is presented to overcome this disadvantage. The simulation results verify that the DC side voltage is very stable for the given value and responds quickly to the external disturbance. A comparison is also made to show the advantages of the novel unit sinusoidal signal calculating method and the super harmonic treatment property of the designed active power filter.

  13. Studies in astronomical time series analysis. IV - Modeling chaotic and random processes with linear filters

    Scargle, Jeffrey D.

    1990-01-01

    While chaos arises only in nonlinear systems, standard linear time series models are nevertheless useful for analyzing data from chaotic processes. This paper introduces such a model, the chaotic moving average. This time-domain model is based on the theorem that any chaotic process can be represented as the convolution of a linear filter with an uncorrelated process called the chaotic innovation. A technique, minimum phase-volume deconvolution, is introduced to estimate the filter and innovation. The algorithm measures the quality of a model using the volume covered by the phase-portrait of the innovation process. Experiments on synthetic data demonstrate that the algorithm accurately recovers the parameters of simple chaotic processes. Though tailored for chaos, the algorithm can detect both chaos and randomness, distinguish them from each other, and separate them if both are present. It can also recover nonminimum-delay pulse shapes in non-Gaussian processes, both random and chaotic.

  14. A simple bias correction in linear regression for quantitative trait association under two-tail extreme selection.

    Kwan, Johnny S H; Kung, Annie W C; Sham, Pak C

    2011-09-01

    Selective genotyping can increase power in quantitative trait association. One example of selective genotyping is two-tail extreme selection, but simple linear regression analysis gives a biased genetic effect estimate. Here, we present a simple correction for the bias.

  15. Using the Coefficient of Determination R² to Test the Significance of Multiple Linear Regression

    Quinino, Roberto C.; Reis, Edna A.; Bessegato, Lupercio F.

    2013-01-01

    This article proposes the use of the coefficient of determination as a statistic for hypothesis testing in multiple linear regression based on distributions acquired by beta sampling. (Contains 3 figures.)
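
    One way to picture such a test is sketched below. It relies on the standard result that, with normal errors and all slope coefficients equal to zero, R² from a regression with an intercept, k predictors and n observations follows a Beta(k/2, (n - k - 1)/2) distribution, and it approximates the p-value by drawing from that beta distribution. Whether this matches the article's exact procedure is an assumption; the function name, number of draws and seed are illustrative.

```python
import random


def r2_pvalue(r2_obs, n, k, draws=20000, seed=0):
    """Monte Carlo p-value for H0: all k slope coefficients are zero.

    Assumes R^2 ~ Beta(k/2, (n - k - 1)/2) under H0 (normal errors,
    model with intercept). Returns the share of simulated R^2 values
    at least as large as the observed one.
    """
    rng = random.Random(seed)
    a, b = k / 2.0, (n - k - 1) / 2.0
    exceed = sum(rng.betavariate(a, b) >= r2_obs for _ in range(draws))
    return exceed / draws


# Hypothetical example: n = 30 observations, k = 3 predictors.
p_small_r2 = r2_pvalue(0.10, n=30, k=3)
p_large_r2 = r2_pvalue(0.50, n=30, k=3)
```

    A small p-value indicates that an R² this large would be unlikely if none of the predictors were related to the response, which is the same question the usual overall F test answers.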

  16. Comparison of the dosimetric parameters in linear accelerators with flattening filter-free (FFF) and flattening filter (FF)

    Souza, Anderson S.; Rostelato, Maria Elisa C.M.; Zeituni, Carlos A.; Moura, Eduardo S.; Rodrigues, Bruna T.; Souza, Daiane C.; Tiezzi, Rodrigo; Souza, Carla D.; Melo, Emerson R.; Camargo, Anderson R.; Batista, Talita Q.

    2015-01-01

    This paper discusses the main differences in dosimetric parameters between FFF and FF linacs. Dosimetric measurements were acquired on a Varian TrueBeam linac and a Varian 23EX. The dose measurements were carried out in a Blue water phantom with waterproof ionization chambers, a Farmer ionization chamber (0.6 cm³) and an Exradin A1SL (0.053 cm³), for field sizes of 5 x 5, 8 x 8, 10 x 10, 15 x 15 and 30 x 30 cm². The energy used in this work was 6 MV, in FFF and FF modes. Percent Depth Dose (PDD) was the dosimetric parameter evaluated, using a fixed source-to-surface distance of 100 cm. One depth was used for the measurements, 10 cm (central axis) from the water surface. The 6 MV FFF beam was less penetrating than the 6 MV FF beam. This is because removing the flattening filter leaves more lower-energy photons on the central axis. The field sizes were equivalent for both FFF and FF. The main advantage of operating linear accelerators without a flattening filter is the high dose rates delivered during treatment. High dose rates could reduce the patient treatment time and may be beneficial for some treatment techniques such as IMRT and SRT. (author)

  17. Novel Automatic Filter-Class Feature Selection for Machine Learning Regression

    Wollsen, Morten Gill; Hallam, John; Jørgensen, Bo Nørregaard

    2017-01-01

    With the increased focus on application of Big Data in all sectors of society, the performance of machine learning becomes essential. Efficient machine learning depends on efficient feature selection algorithms. Filter feature selection algorithms are model-free and therefore very fast, but require...... model in the feature selection process. PCA is often used in machine learning literature and can be considered the default feature selection method. RDESF outperformed PCA in both experiments in both prediction error and computational speed. RDESF is a new step into filter-based automatic feature...

  18. Evaluation of non-linear adaptive smoothing filter by digital phantom

    Sato, Kazuhiro; Ishiya, Hiroki; Oshita, Ryosuke; Yanagawa, Isao; Goto, Mitsunori; Mori, Issei

    2008-01-01

    As a result of the development of multi-slice CT, diagnoses based on three-dimensional reconstruction images and multi-planar reconstruction have spread. For these applications, which require high z-resolution, thin slice imaging is essential. However, because z-resolution is always based on a trade-off with image noise, thin slice imaging is necessarily accompanied by an increase in noise level. To improve the quality of thin slice images, a non-linear adaptive smoothing filter has been developed, and is being widely applied to clinical use. We developed a digital bar pattern phantom for the purpose of evaluating the effect of this filter and attempted evaluation from an addition image of the bar pattern phantom and the image of the water phantom. The effect of this filter was changed in a complex manner by the contrast and spatial frequency of the original image. We have confirmed the reduced effect of image noise in the low frequency component of the image, but decreased contrast or increased quantity of noise in the image of the high frequency component. This result represents the effect of change in the adaptation of this filter. The digital phantom was useful for this evaluation, but to understand the total effect of filtering, much improvement of the shape of the digital phantom is required. (author)

  19. Effects of noise, nonlinear processing, and linear filtering on perceived music quality.

    Arehart, Kathryn H; Kates, James M; Anderson, Melinda C

    2011-03-01

    The purpose of this study was to determine the relative impact of different forms of hearing aid signal processing on quality ratings of music. Music quality was assessed using a rating scale for three types of music: orchestral classical music, jazz instrumental, and a female vocalist. The music stimuli were subjected to a wide range of simulated hearing aid processing conditions including, (1) noise and nonlinear processing, (2) linear filtering, and (3) combinations of noise, nonlinear, and linear filtering. Quality ratings were measured in a group of 19 listeners with normal hearing and a group of 15 listeners with sensorineural hearing impairment. Quality ratings in both groups were generally comparable, were reliable across test sessions, were impacted more by noise and nonlinear signal processing than by linear filtering, and were significantly affected by the genre of music. The average quality ratings for music were reasonably well predicted by the hearing aid speech quality index (HASQI), but additional work is needed to optimize the index to the wide range of music genres and processing conditions included in this study.

  20. Deconvolution of Defocused Image with Multivariate Local Polynomial Regression and Iterative Wiener Filtering in DWT Domain

    Liyun Su

    2010-01-01

    obtaining the point spread function (PSF) parameter, iterative Wiener filter is adopted to complete the restoration. We experimentally illustrate its performance on simulated data and real blurred image. Results show that the proposed PSF parameter estimation technique and the image restoration method are effective.

  1. Short-term wind speed prediction using an unscented Kalman filter based state-space support vector regression approach

    Chen, Kuilin; Yu, Jie

    2014-01-01

    Highlights: • A novel hybrid modeling method is proposed for short-term wind speed forecasting. • Support vector regression model is constructed to formulate nonlinear state-space framework. • Unscented Kalman filter is adopted to recursively update states under random uncertainty. • The new SVR–UKF approach is compared to several conventional methods for short-term wind speed prediction. • The proposed method demonstrates higher prediction accuracy and reliability. - Abstract: Accurate wind speed forecasting is becoming increasingly important to improve and optimize renewable wind power generation. Particularly, reliable short-term wind speed prediction can enable model predictive control of wind turbines and real-time optimization of wind farm operation. However, this task remains challenging due to the strong stochastic nature and dynamic uncertainty of wind speed. In this study, unscented Kalman filter (UKF) is integrated with support vector regression (SVR) based state-space model in order to precisely update the short-term estimation of wind speed sequence. In the proposed SVR–UKF approach, support vector regression is first employed to formulate a nonlinear state-space model and then unscented Kalman filter is adopted to perform dynamic state estimation recursively on wind sequence with stochastic uncertainty. The novel SVR–UKF method is compared with artificial neural networks (ANNs), SVR, autoregressive (AR) and autoregressive integrated with Kalman filter (AR-Kalman) approaches for predicting short-term wind speed sequences collected from three sites in Massachusetts, USA. The forecasting results indicate that the proposed method has much better performance in both one-step-ahead and multi-step-ahead wind speed predictions than the other approaches across all the locations

  2. Monopole and dipole estimation for multi-frequency sky maps by linear regression

    Wehus, I. K.; Fuskeland, U.; Eriksen, H. K.; Banday, A. J.; Dickinson, C.; Ghosh, T.; Górski, K. M.; Lawrence, C. R.; Leahy, J. P.; Maino, D.; Reich, P.; Reich, W.

    2017-01-01

    We describe a simple but efficient method for deriving a consistent set of monopole and dipole corrections for multi-frequency sky map data sets, allowing robust parametric component separation with the same data set. The computational core of this method is linear regression between pairs of frequency maps, often called T-T plots. Individual contributions from monopole and dipole terms are determined by performing the regression locally in patches on the sky, while the degeneracy between different frequencies is lifted whenever the dominant foreground component exhibits a significant spatial spectral index variation. Based on this method, we present two different, but each internally consistent, sets of monopole and dipole coefficients for the nine-year WMAP, Planck 2013, SFD 100 μm, Haslam 408 MHz and Reich & Reich 1420 MHz maps. The two sets have been derived with different analysis assumptions and data selection, and provide an estimate of residual systematic uncertainties. In general, our values are in good agreement with previously published results. Among the most notable results are a relative dipole between the WMAP and Planck experiments of 10-15μK (depending on frequency), an estimate of the 408 MHz map monopole of 8.9 ± 1.3 K, and a non-zero dipole in the 1420 MHz map of 0.15 ± 0.03 K pointing towards Galactic coordinates (l,b) = (308°,-36°) ± 14°. These values represent the sum of any instrumental and data processing offsets, as well as any Galactic or extra-Galactic component that is spectrally uniform over the full sky.
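
    The T-T plot regression described above can be sketched on synthetic data. This is only a minimal illustration, not the authors' pipeline: it assumes two maps that share one foreground signal, handles monopole offsets only (no dipole), and uses hypothetical names and values (`OFFSET_A`, `OFFSET_B`, the spectral ratios 0.5 and 0.8). Within one patch the fitted intercept equals m_b - slope * m_a, a single combination of the two offsets; fitting a second patch with a different slope is what lifts that degeneracy.

```python
import numpy as np

# Hypothetical true offsets, used only to generate the synthetic maps.
OFFSET_A, OFFSET_B = 0.3, -0.2

def make_patch(ratio, rng, n=500):
    """Two synthetic maps of one sky patch sharing a foreground signal."""
    signal = rng.uniform(1.0, 5.0, size=n)
    map_a = signal + OFFSET_A
    map_b = ratio * signal + OFFSET_B + rng.normal(0.0, 0.01, size=n)
    return map_a, map_b

def tt_fit(map_a, map_b):
    """T-T plot fit: map_b ~ slope * map_a + intercept."""
    slope, intercept = np.polyfit(map_a, map_b, deg=1)
    return slope, intercept

def solve_offsets(slopes, intercepts):
    """Solve intercept_k = m_b - slope_k * m_a for (m_a, m_b) by least squares."""
    design = np.column_stack([-np.asarray(slopes), np.ones(len(slopes))])
    (m_a, m_b), *_ = np.linalg.lstsq(design, np.asarray(intercepts), rcond=None)
    return m_a, m_b

rng = np.random.default_rng(0)
# Two patches whose foreground has different spectral ratios between the maps.
fits = [tt_fit(*make_patch(ratio, rng)) for ratio in (0.5, 0.8)]
slopes, intercepts = zip(*fits)
m_a, m_b = solve_offsets(slopes, intercepts)
print(f"recovered offsets: m_a={m_a:.3f}, m_b={m_b:.3f}")
```

    If both patches had the same slope, the design matrix would be rank-deficient and the two offsets could not be separated, which mirrors the abstract's remark about needing spatial variation in the spectral index.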

  3. Modeling of Soil Aggregate Stability using Support Vector Machines and Multiple Linear Regression

    Ali Asghar Besalatpour

    2016-02-01

    Full Text Available Introduction: Soil aggregate stability is a key factor in soil resistivity to mechanical stresses, including the impacts of rainfall and surface runoff, and thus to water erosion (Canasveras et al., 2010). Various indicators have been proposed to characterize and quantify soil aggregate stability, for example percentage of water-stable aggregates (WSA), mean weight diameter (MWD), geometric mean diameter (GMD) of aggregates, and water-dispersible clay (WDC) content (Calero et al., 2008). Unfortunately, the experimental methods available to determine these indicators are laborious, time-consuming and difficult to standardize (Canasveras et al., 2010). Therefore, it would be advantageous if aggregate stability could be predicted indirectly from more easily available data (Besalatpour et al., 2014). The main objective of this study is to investigate the potential use of the support vector machines (SVMs) method for estimating soil aggregate stability (as quantified by GMD) as compared to the multiple linear regression approach. Materials and Methods: The study area was part of the Bazoft watershed (31° 37′ to 32° 39′ N and 49° 34′ to 50° 32′ E), which is located in the northern part of the Karun river basin in central Iran. A total of 160 soil samples were collected from the top 5 cm of soil surface. Some easily available characteristics including topographic, vegetation, and soil properties were used as inputs. Soil organic matter (SOM) content was determined by the Walkley-Black method (Nelson & Sommers, 1986). Particle size distribution in the soil samples (clay, silt, sand, fine sand, and very fine sand) was measured using the procedure described by Gee & Bauder (1986) and calcium carbonate equivalent (CCE) content was determined by the back-titration method (Nelson, 1982). The modified Kemper & Rosenau (1986) method was used to determine wet-aggregate stability (GMD). The topographic attributes of elevation, slope, and aspect were characterized using a 20-m

  4. Auto Regressive Moving Average (ARMA) Modeling Method for Gyro Random Noise Using a Robust Kalman Filter

    Huang, Lei

    2015-01-01

    To solve the problem in which the conventional ARMA modeling methods for gyro random noise require a large number of samples and converge slowly, an ARMA modeling method using robust Kalman filtering is developed. The ARMA model parameters are employed as state arguments. Unknown time-varying estimators of observation noise are used to achieve the estimated mean and variance of the observation noise. Using the robust Kalman filtering, the ARMA model parameters are estimated accurately. The developed ARMA modeling method has the advantages of rapid convergence and high accuracy. Thus, the required sample size is reduced. It can be applied to modeling applications for gyro random noise in which a fast and accurate ARMA modeling method is required. PMID:26437409
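
    The idea of treating model parameters as the Kalman state can be sketched as follows. This is a simplified illustration under stated assumptions, not the method of the paper: it uses a plain (non-robust) Kalman filter with a known observation-noise variance, and a pure AR(2) model rather than a full ARMA model. The function name `kalman_ar_fit` and all numeric values are hypothetical.

```python
import numpy as np

def kalman_ar_fit(y, order=2, obs_var=1.0, process_var=0.0):
    """Estimate AR coefficients by treating them as the Kalman state.

    State model:  theta_t = theta_{t-1} + w_t,  w_t ~ N(0, process_var * I)
    Observation:  y_t = h_t . theta_t + v_t,    h_t = [y_{t-1}, ..., y_{t-order}]
    """
    theta = np.zeros(order)          # initial coefficient estimate
    P = np.eye(order) * 1e3          # large initial uncertainty
    for t in range(order, len(y)):
        h = y[t - order:t][::-1]     # most recent sample first
        P = P + process_var * np.eye(order)   # predict step (random walk)
        innovation = y[t] - h @ theta
        s = h @ P @ h + obs_var      # innovation variance (scalar)
        gain = P @ h / s             # Kalman gain
        theta = theta + gain * innovation
        P = P - np.outer(gain, h) @ P
    return theta

# Synthetic AR(2) series with known (assumed) coefficients.
rng = np.random.default_rng(1)
true_phi = np.array([0.6, -0.3])
y = np.zeros(5000)
for t in range(2, len(y)):
    y[t] = true_phi @ y[t - 2:t][::-1] + rng.normal()

theta = kalman_ar_fit(y, order=2)
print("estimated AR coefficients:", np.round(theta, 3))
```

    With `process_var=0` the recursion reduces to recursive least squares; a positive value would let the coefficients drift over time. The robust variant described in the abstract additionally estimates the observation-noise mean and variance online, which this sketch does not attempt.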

  5. [Comparison of application of Cochran-Armitage trend test and linear regression analysis for rate trend analysis in epidemiology study].

    Wang, D Z; Wang, C; Shen, C F; Zhang, Y; Zhang, H; Song, G D; Xue, X D; Xu, Z L; Zhang, S; Jiang, G H

    2017-05-10

    We described the time trend of the acute myocardial infarction (AMI) incidence rate in Tianjin from 1999 to 2013 with the Cochran-Armitage trend (CAT) test and linear regression analysis, and compared the results. Based on the actual population, the CAT test had much stronger statistical power than linear regression analysis for both the overall incidence trend and the age-specific incidence trends (Cochran-Armitage trend P value < linear regression P value). The statistical power of the CAT test decreased, while the result of linear regression analysis remained the same, when the population size was reduced by 100 times and the AMI incidence rate remained unchanged. The two statistical methods each have advantages and disadvantages. It is necessary to choose the statistical method according to the fitting degree of the data, or to comprehensively analyze the results of both methods.
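
    The dependence of the CAT test on population size can be illustrated with a small sketch. It uses the standard Cochran-Armitage statistic with a normal approximation; the yearly case counts and populations are hypothetical and are not taken from the Tianjin data. The same incidence rates are evaluated at two population sizes that differ by a factor of 100.

```python
import math
import numpy as np

def cochran_armitage(cases, totals, scores):
    """Cochran-Armitage trend test; returns (z, two-sided p-value)."""
    cases, totals, scores = (np.asarray(a, dtype=float) for a in (cases, totals, scores))
    n_all = totals.sum()
    p_bar = cases.sum() / n_all                      # pooled rate under H0
    t_stat = np.sum(scores * (cases - totals * p_bar))
    var = p_bar * (1.0 - p_bar) * (
        np.sum(totals * scores**2) - np.sum(totals * scores) ** 2 / n_all
    )
    z = t_stat / math.sqrt(var)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))     # normal approximation
    return z, p_value

years = np.arange(5)                                 # scores 0..4
cases_small = np.array([50, 53, 53, 57, 58])         # hypothetical counts
pop_small = np.full(5, 10_000)
cases_large, pop_large = cases_small * 100, pop_small * 100  # same rates

z_small, p_small = cochran_armitage(cases_small, pop_small, years)
z_large, p_large = cochran_armitage(cases_large, pop_large, years)

# A regression on the yearly rates sees identical inputs in both cases.
slope_small = np.polyfit(years, cases_small / pop_small, 1)[0]
slope_large = np.polyfit(years, cases_large / pop_large, 1)[0]

print(f"CAT small population: z={z_small:.2f}, p={p_small:.3f}")
print(f"CAT large population: z={z_large:.2f}, p={p_large:.2e}")
print(f"regression slopes: {slope_small:.6f} vs {slope_large:.6f}")
```

    Because the CAT statistic scales with the square root of the total population at fixed rates, shrinking the population by 100 shrinks z by 10, while a regression fitted to the rates alone is unaffected. This is consistent with the behaviour the abstract reports.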

  6. Development of statistical linear regression model for metals from transportation land uses.

    Maniquiz, Marla C; Lee, Soyoung; Lee, Eunju; Kim, Lee-Hyung

    2009-01-01

    Transportation land uses with impervious surfaces, such as highways, parking lots, roads, and bridges, are recognized as highly polluted non-point sources (NPSs) in urban areas. Many pollutants from urban transportation accumulate on the paved surfaces during dry periods and are washed off during a storm. In Korea, the identification and monitoring of NPSs still represent a great challenge. Since 2004, the Ministry of Environment (MOE) has been engaged in several research and monitoring projects to develop stormwater management policies and treatment systems for future implementation. The data from 131 storm events between May 2004 and September 2008 at eleven sites were analyzed to identify correlations between particulates and metals, and to develop a simple linear regression (SLR) model to estimate event mean concentration (EMC). Results indicate that there was no significant relationship between metals and TSS EMC. However, the SLR estimation models, although not providing useful results, are valuable indicators of the high uncertainties that NPS pollution possesses. Therefore, long-term monitoring employing proper methods and precise statistical analysis of the data should be undertaken to eliminate these uncertainties.

  7. Water quality control in Third River Reservoir (Argentina) using geographical information systems and linear regression models

    Claudia Ledesma

    2013-08-01

    Full Text Available Water quality is traditionally monitored and evaluated based upon field data collected at limited locations. The storage capacity of reservoirs is reduced by deposits of suspended matter. The major factors affecting surface water quality are suspended sediments, chlorophyll and nutrients. Modeling and monitoring the biogeochemical status of reservoirs can be done through data from remote sensors. Since the improvement of sensors’ spatial and spectral resolutions, satellites have been used to monitor the interior areas of bodies of water. Water quality parameters, such as chlorophyll-a concentration and Secchi disk depth, were found to have a high correlation with transformed spectral variables derived from bands 1, 2, 3 and 4 of the LANDSAT 5TM satellite. We created models of estimated responses in regard to values of chlorophyll-a. To do so, we used population models of single and multiple linear regression, whose parameters are associated with the reflectance data of bands 2 and 4 of the sub-image of the satellite, as well as the data of chlorophyll-a obtained in 25 selected stations. According to the physico-chemical analyses performed, the characteristics of the water in the Rio Tercero reservoir correspond to somewhat hard freshwater with calcium bicarbonate. The water was classified as usable as a source of plant treatment, excellent for irrigation because of its low salinity and low residual sodium carbonate content, but unsuitable for animal consumption because of its low salt content.

  8. An Application of Robust Method in Multiple Linear Regression Model toward Credit Card Debt

    Amira Azmi, Nur; Saifullah Rusiman, Mohd; Khalid, Kamil; Roslan, Rozaini; Sufahani, Suliadi; Mohamad, Mahathir; Salleh, Rohayu Mohd; Hamzah, Nur Shamsidah Amir

    2018-04-01

    Credit cards are a convenient alternative to cash or cheques, and they are an essential component of electronic and internet commerce. In this study, the researchers attempt to determine the relationship between credit card debt and demographic variables such as age, household income, education level, years with current employer, years at current address, debt to income ratio and other debt, and to identify which variables are significant. The data cover information on 850 customers. Three methods are applied to the credit card debt data: multiple linear regression (MLR) models, MLR models with the least quartile difference (LQD) method and MLR models with the mean absolute deviation method. After comparing the three methods, it is found that the MLR model with the LQD method is the best model, with the lowest mean square error (MSE). The final model shows that years with current employer, years at current address, household income in thousands and debt to income ratio are positively associated with the amount of credit debt, while age, level of education and other debt are negatively associated with it. This study may serve as a reference for banks using robust methods, so that they can better understand their options and choose the one best aligned with their goals for inference regarding credit card debt.

  9. Performance Prediction Modelling for Flexible Pavement on Low Volume Roads Using Multiple Linear Regression Analysis

    C. Makendran

    2015-01-01

    Full Text Available Prediction models for low volume village roads in India are developed to evaluate the progression of different types of distress such as roughness, cracking, and potholes. Even though the Government of India invests a large amount of money in road construction every year, poor control over the quality of road construction and its subsequent maintenance is leading to faster road deterioration. In this regard, it is essential that scientific maintenance procedures be developed on the basis of the performance of low volume flexible pavements. Considering the above, an attempt has been made in this research to develop prediction models to understand the progression of roughness, cracking, and potholes in flexible pavements exposed to little or no routine maintenance. Distress data were collected from low volume rural roads covering about 173 stretches spread across Tamil Nadu state in India. Based on the collected data, distress prediction models have been developed using multiple linear regression analysis. Further, the models have been validated using independent field data. It can be concluded that the models developed in this study can serve as useful tools for practicing engineers maintaining flexible pavements on low volume roads.

  10. A consensus successive projections algorithm--multiple linear regression method for analyzing near infrared spectra.

    Liu, Ke; Chen, Xiaojing; Li, Limin; Chen, Huiling; Ruan, Xiukai; Liu, Wenbin

    2015-02-09

    The successive projections algorithm (SPA) is widely used to select variables for multiple linear regression (MLR) modeling. However, SPA used only once may not obtain all the useful information of the full spectra, because the number of selected variables cannot exceed the number of calibration samples in the SPA algorithm. Therefore, the SPA-MLR method risks the loss of useful information. To make full use of the useful information in the spectra, a new method named "consensus SPA-MLR" (C-SPA-MLR) is proposed herein. This method is the combination of the consensus strategy and the SPA-MLR method. In the C-SPA-MLR method, SPA-MLR is used to construct member models with different subsets of variables, which are selected from the remaining variables iteratively. A consensus prediction is obtained by combining the predictions of the member models. The proposed method is evaluated by analyzing the near infrared (NIR) spectra of corn and diesel. The C-SPA-MLR method showed better prediction performance than the SPA-MLR and full-spectra PLS methods. Moreover, these results could serve as a reference for combining the consensus strategy with other variable selection methods when analyzing NIR spectra and other spectroscopic techniques. Copyright © 2014 Elsevier B.V. All rights reserved.

  11. [Multiple linear regression and ROC curve analysis of the factors of lumbar spine bone mineral density].

    Zhang, Xiaodong; Zhao, Yinxia; Hu, Shaoyong; Hao, Shuai; Yan, Jiewen; Zhang, Lingyan; Zhao, Jing; Li, Shaolin

    2015-09-01

    To investigate the correlation between lumbar vertebra bone mineral density (BMD) and age, gender, height, weight, body mass index, waistline, hipline, bone marrow and abdomen fat, and to explore the key factor affecting BMD. A total of 72 cases were randomly recruited. All subjects underwent a spectroscopic examination of the third lumbar vertebra with the single-voxel method at 1.5T MR. Lipid fractions (FF%) were measured. Quantitative CT was also performed to obtain the BMD of L3 and the corresponding abdominal subcutaneous adipose tissue (SAT) and visceral adipose tissue (VAT). The statistical analysis was performed with SPSS 19.0. Multiple linear regression showed that, of the factors examined, only age and FF% had a significant effect (P<0.05); the others did not (P>0.05). The correlations of age and FF% with BMD were negative and statistically significant (r=-0.830, -0.521, P<0.05). ROC curve analysis showed that the sensitivity and specificity of predicting osteoporosis were 81.8% and 86.9% with an age threshold of 58.5 years, and 90.9% and 55.7% with an FF% threshold of 52.8%. Lumbar vertebra BMD was significantly and negatively correlated with age and bone marrow FF%, but it was not significantly correlated with gender, height, weight, BMI, waistline, hipline, SAT or VAT. Age was the critical factor.

  12. Prediction of Depression in Cancer Patients With Different Classification Criteria, Linear Discriminant Analysis versus Logistic Regression.

    Shayan, Zahra; Mohammad Gholi Mezerji, Naser; Shayan, Leila; Naseri, Parisa

    2015-11-03

    Logistic regression (LR) and linear discriminant analysis (LDA) are two popular statistical models for prediction of group membership. Although they are very similar, LDA makes more assumptions about the data. When categorical and continuous variables are used simultaneously, the optimal choice between the two models is questionable. In most studies, classification error (CE) is used to discriminate between subjects in several groups, but this index is not suitable for predicting the accuracy of the outcome. The present study compared LR and LDA models using classification indices. This cross-sectional study selected 243 cancer patients. Sample sets of different sizes (n = 50, 100, 150, 200, 220) were randomly selected and the CE, B, and Q classification indices were calculated by the LR and LDA models. CE revealed a lack of superiority for one model over the other, but the results showed that LR performed better than LDA for the B and Q indices in all situations. No significant effect of sample size on CE was noted for selection of an optimal model. Assessment of the accuracy of prediction of real data indicated that the B and Q indices are appropriate for selection of an optimal model. The results of this study showed that LR performs better in some cases and LDA in others when based on CE. The CE index is not appropriate for classification, although the B and Q indices performed better and offered more efficient criteria for comparison and discrimination between groups.

  13. An Ionospheric Index Model based on Linear Regression and Neural Network Approaches

    Tshisaphungo, Mpho; McKinnell, Lee-Anne; Bosco Habarulema, John

    2017-04-01

    The ionosphere is well known to reflect radio wave signals in the high frequency (HF) band due to the presence of electrons and ions within the region. To optimise the use of long distance HF communications, it is important to understand the drivers of ionospheric storms and accurately predict the propagation conditions, especially during disturbed days. This paper presents the development of an ionospheric storm-time index over the South African region for HF communication users. The model will provide a valuable tool to measure the complex ionospheric behaviour in an operational space weather monitoring and forecasting environment. The development of the ionospheric storm-time index is based on data from a single ionosonde station at Grahamstown (33.3°S, 26.5°E), South Africa. Critical frequency of the F2 layer (foF2) measurements for the period 1996-2014 were considered for this study. The model was developed based on linear regression and neural network approaches. In this talk, validation results for low, medium and high solar activity periods will be discussed to demonstrate the model's performance.

  14. Time series linear regression of half-hourly radon levels in a residence

    Hull, D.A.

    1990-01-01

    This paper uses time series linear regression modelling to assess the impact of temperature and pressure differences on the radon measured in the basement and in the basement drain of a research house in the Princeton area of New Jersey. The models examine half-hour averages of several climate and house parameters for several periods of up to 11 days. The drain radon concentrations follow a strong diurnal pattern that shifts 12 hours in phase between the summer and the fall seasons. This shift can be linked both to the change in temperature differences between seasons and to an experiment which involved sealing the connection between the drain and the basement. We have found that both the basement and the drain radon concentrations are correlated to basement-outdoor and soil-outdoor temperature differences (the coefficient of determination varies between 0.6 and 0.8). The statistical models for the summer periods clearly describe a physical system where the basement drain pumps radon in during the night and sucks radon out during the day

  15. A linear regression approach to evaluate the green supply chain management impact on industrial organizational performance.

    Mumtaz, Ubaidullah; Ali, Yousaf; Petrillo, Antonella

    2018-05-15

    The increase in environmental pollution is one of the most important topics in today's world. In this context, industrial activities can pose a significant threat to the environment. To manage problems associated with industrial activities, several methods, techniques and approaches have been developed. Green supply chain management (GSCM) is considered one of the most important "environmental management approaches". In developing countries such as Pakistan the implementation of GSCM practices is still in its initial stages. Lack of knowledge about its effects on economic performance is the reason why industries fear implementing these practices. The aim of this research is to assess the effects of GSCM practices on organizational performance in Pakistan. The GSCM practices considered are: internal practices, external practices, investment recovery and eco-design. The performance parameters considered are: environmental pollution, operational cost and organizational flexibility. A set of hypotheses proposes the effect of each GSCM practice on the performance parameters. Factor analysis and linear regression are used to analyze the survey data of Pakistani industries in order to test these hypotheses. The findings of this research indicate a decrease in environmental pollution and operational cost with the implementation of GSCM practices, whereas organizational flexibility has not improved for Pakistani industries. These results aim to help managers regarding their decision to implement GSCM practices in the industrial sector of Pakistan. Copyright © 2017 Elsevier B.V. All rights reserved.

  16. Influence of plant root morphology and tissue composition on phenanthrene uptake: Stepwise multiple linear regression analysis

    Zhan, Xinhua; Liang, Xiao; Xu, Guohua; Zhou, Lixiang

    2013-01-01

    Polycyclic aromatic hydrocarbons (PAHs) are contaminants that reside mainly in surface soils. Dietary intake of plant-based foods can make a major contribution to total PAH exposure. Little information is available on the relationship between root morphology and plant uptake of PAHs. An understanding of plant root morphologic and compositional factors that affect root uptake of contaminants is important and can inform both agricultural (chemical contamination of crops) and engineering (phytoremediation) applications. Five crop plant species are grown hydroponically in solutions containing the PAH phenanthrene. Measurements are taken for 1) phenanthrene uptake, 2) root morphology – specific surface area, volume, surface area, tip number and total root length and 3) root tissue composition – water, lipid, protein and carbohydrate content. These factors are compared through Pearson's correlation and multiple linear regression analysis. The major factors which promote phenanthrene uptake are specific surface area and lipid content. -- Highlights: •There is no correlation between phenanthrene uptake and total root length, and water. •Specific surface area and lipid are the most crucial factors for phenanthrene uptake. •The contribution of specific surface area is greater than that of lipid. -- The contribution of specific surface area is greater than that of lipid in the two most important root morphological and compositional factors affecting phenanthrene uptake

  17. Forecasting on the total volumes of Malaysia's imports and exports by multiple linear regression

    Beh, W. L.; Yong, M. K. Au

    2017-04-01

    This study gives an insight into the importance of the macroeconomic variables that affect the total volumes of Malaysia's imports and exports, using multiple linear regression (MLR) analysis. The study uses quarterly data on the total volumes of Malaysia's imports and exports covering the period 2000-2015. The macroeconomic variables are limited to eleven: the exchange rate of US Dollar with Malaysia Ringgit (USD-MYR), exchange rate of China Yuan with Malaysia Ringgit (RMB-MYR), exchange rate of European Euro with Malaysia Ringgit (EUR-MYR), exchange rate of Singapore Dollar with Malaysia Ringgit (SGD-MYR), crude oil prices, gold prices, producer price index (PPI), interest rate, consumer price index (CPI), industrial production index (IPI) and gross domestic product (GDP). This study applies the Johansen co-integration test to investigate the relationship among the total volumes of Malaysia's imports and exports. The results show that crude oil prices, RMB-MYR, EUR-MYR and IPI play important roles in the total volumes of Malaysia's imports, while crude oil prices, USD-MYR and GDP play important roles in the total volumes of Malaysia's exports.

  18. A hybrid genetic algorithm and linear regression for prediction of NOx emission in power generation plant

    Bunyamin, Muhammad Afif; Yap, Keem Siah; Aziz, Nur Liyana Afiqah Abdul; Tiong, Sheih Kiong; Wong, Shen Yuong; Kamal, Md Fauzan

    2013-01-01

    This paper presents a new approach to gas emission estimation in a power generation plant using a hybrid of Genetic Algorithm (GA) and Linear Regression (LR), denoted GA-LR. LR models the relationship between an output dependent variable, y, and one or more explanatory variables or inputs, denoted x. It is able to estimate unknown model parameters from input data. GA, on the other hand, searches for the optimal solution until a specific termination criterion is met. For complex problems it provides a set of good solutions rather than a single optimal one. Thus, GA is widely used for feature selection. By combining LR and GA (GA-LR), the new technique is able to select the most important input features and to give more accurate predictions by minimizing the prediction error. It produces more consistent gas emission estimates, which may help in reducing pollution to the environment. In this paper, the study focuses on the prediction of nitrogen oxides (NOx). The results of the experiment are encouraging.

  19. Correction of TRMM 3B42V7 Based on Linear Regression Models over China

    Shaohua Liu

    2016-01-01

    Full Text Available High temporal-spatial precipitation is necessary for hydrological simulation and water resource management, and remotely sensed precipitation products (RSPPs) play a key role in supporting high temporal-spatial precipitation, especially in sparse gauge regions. TRMM 3B42V7 data (TRMM precipitation) is an essential RSPP outperforming other RSPPs. Yet the utilization of TRMM precipitation is still limited by the inaccuracy and low spatial resolution at regional scale. In this paper, linear regression models (LRMs) have been constructed to correct and downscale the TRMM precipitation based on the gauge precipitation at 2257 stations over China from 1998 to 2013. Then, the corrected TRMM precipitation was validated by gauge precipitation at 839 out of 2257 stations in 2014 at station and grid scales. The results show that both monthly and annual LRMs have obviously improved the accuracy of corrected TRMM precipitation with acceptable error, and monthly LRM performs slightly better than annual LRM in Mideastern China. Although the performance of corrected TRMM precipitation from the LRMs has been increased in Northwest China and Tibetan plateau, the error of corrected TRMM precipitation is still significant due to the large deviation between TRMM precipitation and low-density gauge precipitation.

  20. A non-linear algorithm for current signal filtering and peak detection in SiPM

    Putignano, M; Intermite, A; Welsch, C P

    2012-01-01

    Read-out of Silicon Photomultipliers is commonly achieved by means of charge integration, a method particularly susceptible to after-pulsing noise and not efficient for low level light signals. Current signal monitoring, characterized by easier electronic implementation and intrinsically faster than charge integration, is also more suitable for low level light signals and can potentially result in much decreased after-pulsing noise effects. However, its use is to date limited by the need to develop a suitable read-out algorithm for signal analysis and filtering, able to achieve current peak detection and measurement with the needed precision and accuracy. In this paper we present an original algorithm, based on a piecewise linear-fitting approach, to filter the noise of the current signal and hence efficiently identify and measure current peaks. The proposed algorithm is then compared with the optimal linear filtering algorithm for time-encoded peak detection, based on a moving average routine, and assessed in terms of accuracy, precision, and peak detection efficiency, demonstrating improvements of 1 to 2 orders of magnitude in all these quality factors.

  1. Fast Kalman-like filtering for large-dimensional linear and Gaussian state-space models

    Ait-El-Fquih, Boujemaa; Hoteit, Ibrahim

    2015-01-01

    This paper considers the filtering problem for linear and Gaussian state-space models with large dimensions, a setup in which the optimal Kalman Filter (KF) might not be applicable owing to the excessive cost of manipulating huge covariance matrices. Among the most popular alternatives that enable cheaper and reasonable computation is the Ensemble KF (EnKF), a Monte Carlo-based approximation. In this paper, we consider a class of a posteriori distributions with diagonal covariance matrices and propose fast approximate deterministic-based algorithms based on the Variational Bayesian (VB) approach. More specifically, we derive two iterative KF-like algorithms that differ in the way they operate between two successive filtering estimates; one involves a smoothing estimate and the other involves a prediction estimate. Despite its iterative nature, the prediction-based algorithm provides a computational cost that is, on the one hand, independent of the number of iterations in the limit of very large state dimensions, and on the other hand, always much smaller than the cost of the EnKF. The cost of the smoothing-based algorithm depends on the number of iterations that may, in some situations, make this algorithm slower than the EnKF. The performances of the proposed filters are studied and compared to those of the KF and EnKF through a numerical example.
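
    For orientation, the predict/update recursion of the standard Kalman filter can be sketched in the scalar case. This is a minimal illustration under an assumed one-dimensional model, not the variational algorithms proposed in the paper; the function name, parameters, and measurement values are hypothetical. In n dimensions the variance `p` becomes an n-by-n covariance matrix, and manipulating it is the cost the abstract refers to.

```python
def kalman_step(x_est, p_est, z, a, q, h, r):
    """One predict/update cycle of a scalar Kalman filter.

    Assumed model: x_k = a * x_{k-1} + w,  z_k = h * x_k + v,
    with process-noise variance q and measurement-noise variance r.
    """
    # Predict the state and its variance one step ahead.
    x_pred = a * x_est
    p_pred = a * p_est * a + q
    # Update with the new measurement z.
    gain = p_pred * h / (h * p_pred * h + r)
    x_new = x_pred + gain * (z - h * x_pred)
    p_new = (1.0 - gain * h) * p_pred
    return x_new, p_new


# Hypothetical noisy measurements of a roughly constant quantity.
x, p = 0.0, 1.0
for z in [1.2, 0.9, 1.1, 1.0, 0.95]:
    x, p = kalman_step(x, p, z, a=1.0, q=0.01, h=1.0, r=0.25)
```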

  2. Fast Kalman-like filtering for large-dimensional linear and Gaussian state-space models

    Ait-El-Fquih, Boujemaa

    2015-08-13

    This paper considers the filtering problem for linear and Gaussian state-space models with large dimensions, a setup in which the optimal Kalman Filter (KF) might not be applicable owing to the excessive cost of manipulating huge covariance matrices. Among the most popular alternatives that enable cheaper and reasonable computation is the Ensemble KF (EnKF), a Monte Carlo-based approximation. In this paper, we consider a class of a posteriori distributions with diagonal covariance matrices and propose fast approximate deterministic-based algorithms based on the Variational Bayesian (VB) approach. More specifically, we derive two iterative KF-like algorithms that differ in the way they operate between two successive filtering estimates; one involves a smoothing estimate and the other involves a prediction estimate. Despite its iterative nature, the prediction-based algorithm provides a computational cost that is, on the one hand, independent of the number of iterations in the limit of very large state dimensions, and on the other hand, always much smaller than the cost of the EnKF. The cost of the smoothing-based algorithm depends on the number of iterations that may, in some situations, make this algorithm slower than the EnKF. The performances of the proposed filters are studied and compared to those of the KF and EnKF through a numerical example.

  3. Dual linear structured support vector machine tracking method via scale correlation filter

    Li, Weisheng; Chen, Yanquan; Xiao, Bin; Feng, Chen

    2018-01-01

    Adaptive tracking-by-detection methods based on structured support vector machine (SVM) performed well on recent visual tracking benchmarks. However, these methods did not adopt an effective strategy of object scale estimation, which limits the overall tracking performance. We present a tracking method based on a dual linear structured support vector machine (DLSSVM) with a discriminative scale correlation filter. The collaborative tracker comprised of a DLSSVM model and a scale correlation filter obtains good results in tracking target position and scale estimation. The fast Fourier transform is applied for detection. Extensive experiments show that our tracking approach outperforms many popular top-ranking trackers. On a benchmark including 100 challenging video sequences, the average precision of the proposed method is 82.8%.

  4. Decentralized Observer with a Consensus Filter for Distributed Discrete-Time Linear Systems

    Acikmese, Behcet; Mandic, Milan

    2011-01-01

    This paper presents a decentralized observer with a consensus filter for the state observation of discrete-time linear distributed systems. In this setup, each agent in the distributed system has an observer with a model of the plant that utilizes the set of locally available measurements, which may not make the full plant state detectable. This lack of detectability is overcome by utilizing a consensus filter that blends the state estimate of each agent with its neighbors' estimates. We assume that the communication graph, as well as the sensing graph, is connected at all times. It is proven that the state estimates of the proposed observer asymptotically converge to the actual plant states under arbitrarily changing, but connected, communication and sensing topologies. As a byproduct of this research, we also obtained a result on the location of eigenvalues, the spectrum, of the Laplacian for a family of graphs with self-loops.
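
    The blending idea behind a consensus filter can be sketched with a generic averaging iteration, in which each agent nudges its estimate toward those of its neighbors. This is a textbook-style illustration under assumed values (a fixed three-agent path graph, scalar estimates, step size 0.3), not the observer design analyzed in the paper; all names are hypothetical.

```python
def consensus_step(estimates, neighbors, step=0.3):
    """Move each agent's estimate toward its neighbors' estimates."""
    return [
        x_i + step * sum(estimates[j] - x_i for j in neighbors[i])
        for i, x_i in enumerate(estimates)
    ]


# Hypothetical three-agent path graph: 0 -- 1 -- 2.
neighbors = {0: [1], 1: [0, 2], 2: [1]}
estimates = [1.0, 4.0, 10.0]
for _ in range(200):
    estimates = consensus_step(estimates, neighbors)
```

    On a connected undirected graph with a sufficiently small step, the estimates approach the average of the initial values, which is the sense in which neighbors' information compensates for what a single agent cannot observe.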

  5. Joint polarization tracking and channel equalization based on radius-directed linear Kalman filter

    Zhang, Qun; Yang, Yanfu; Zhong, Kangping; Liu, Jie; Wu, Xiong; Yao, Yong

    2018-01-01

    We propose a joint polarization tracking and channel equalization scheme based on radius-directed linear Kalman filter (RD-LKF) by introducing the butterfly finite-impulse-response (FIR) filter in our previously proposed RD-LKF method. Along with the fast polarization tracking, it can also simultaneously compensate the inter-symbol interference (ISI) effects including residual chromatic dispersion and polarization mode dispersion. Compared with the conventional radius-directed equalizer (RDE) algorithm, it is demonstrated experimentally that three times faster convergence speed, one order of magnitude better tracking capability, and better BER performance are obtained in a polarization division multiplexing 16 quadrature amplitude modulation system. Besides, the influences of the algorithm parameters on the convergence and the tracking performance are investigated by numerical simulation.

  6. A graphical method to evaluate spectral preprocessing in multivariate regression calibrations: example with Savitzky-Golay filters and partial least squares regression.

    Delwiche, Stephen R; Reeves, James B

    2010-01-01

    In multivariate regression analysis of spectroscopy data, spectral preprocessing is often performed to reduce unwanted background information (offsets, sloped baselines) or accentuate absorption features in intrinsically overlapping bands. These procedures, also known as pretreatments, are commonly smoothing operations or derivatives. While such operations are often useful in reducing the number of latent variables of the actual decomposition and lowering residual error, they also run the risk of misleading the practitioner into accepting calibration equations that are poorly adapted to samples outside of the calibration. The current study developed a graphical method to examine this effect on partial least squares (PLS) regression calibrations of near-infrared (NIR) reflection spectra of ground wheat meal with two analytes, protein content and sodium dodecyl sulfate sedimentation (SDS) volume (an indicator of the quantity of the gluten proteins that contribute to strong doughs). These two properties were chosen because of their differing abilities to be modeled by NIR spectroscopy: excellent for protein content, fair for SDS sedimentation volume. To further demonstrate the potential pitfalls of preprocessing, an artificial component, a randomly generated value, was included in PLS regression trials. Savitzky-Golay (digital filter) smoothing, first-derivative, and second-derivative preprocess functions (5 to 25 centrally symmetric convolution points, derived from quadratic polynomials) were applied to PLS calibrations of 1 to 15 factors. The results demonstrated the danger of an overreliance on preprocessing when (1) the number of samples used in a multivariate calibration is low (<50), (2) the spectral response of the analyte is weak, and (3) the goodness of the calibration is based on the coefficient of determination (R²) rather than a term based on residual error. The graphical method has application to the evaluation of other preprocess functions and various
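
    As a concrete instance of the smoothing functions mentioned in this record, the smallest window in the stated range (5 points, quadratic polynomial) can be sketched with the standard tabulated Savitzky-Golay weights. This is a minimal illustration only; the edge handling (copying the two points at each end) and the test signal are assumptions, and the derivative filters used in the study are not shown.

```python
# 5-point quadratic Savitzky-Golay smoothing weights (standard tabulated values).
SG5_WEIGHTS = [-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35]


def savgol5(signal):
    """Smooth interior points with a 5-point window; leave two edge points per side unchanged."""
    out = list(signal)
    for i in range(2, len(signal) - 2):
        window = signal[i - 2:i + 3]
        out[i] = sum(w * v for w, v in zip(SG5_WEIGHTS, window))
    return out


# A pure quadratic passes through unchanged, because the filter fits a local quadratic.
quadratic = [float(k * k) for k in range(8)]
smoothed = savgol5(quadratic)
```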

  7. Treatment vault shielding for a flattening filter-free medical linear accelerator

    Kry, Stephen F.; Howell, Rebecca M.; Polf, Jerimy; Mohan, Radhe; Vassiliev, Oleg N.

    2009-03-01

    The requirements for shielding a treatment vault with a Varian Clinac 2100 medical linear accelerator operated both with and without the flattening filter were assessed. Basic shielding parameters, such as primary beam tenth-value layers (TVLs), patient scatter fractions, and wall scatter fractions, were calculated using Monte Carlo simulations of 6, 10 and 18 MV beams. Relative integral target current requirements were determined from treatment planning studies of several disease sites with, and without, the flattening filter. The flattened beam shielding data were compared to data published in NCRP Report No. 151, and the unflattened beam shielding data were presented relative to the NCRP data. Finally, the shielding requirements for a typical treatment vault were determined for a single-energy (6 MV) linac and a dual-energy (6 MV/18 MV) linac. With the exception of large-angle patient scatter fractions and wall scatter fractions, the vault shielding parameters were reduced when the flattening filter was removed. Much of this reduction was consistent with the reduced average energy of the FFF beams. Primary beam TVLs were reduced by 12%, on average, and small-angle scatter fractions were reduced by up to 30%. Head leakage was markedly reduced because less integral target current was required to deliver the target dose. For the treatment vault examined in the current study, removal of the flattening filter reduced the required thickness of the primary and secondary barriers by 10-20%, corresponding to 18 m³ less concrete to shield the single-energy linac and 36 m³ less concrete to shield the dual-energy linac. Thus, a shielding advantage was found when the linac was operated without the flattening filter. This translates into a reduction in occupational exposure and/or the cost and space of shielding.

  8. Treatment vault shielding for a flattening filter-free medical linear accelerator

    Kry, Stephen F; Howell, Rebecca M; Polf, Jerimy; Mohan, Radhe; Vassiliev, Oleg N [Department of Radiation Physics, University of Texas M. D. Anderson Cancer Center, Houston, TX (United States)], E-mail: sfkry@mdanderson.org

    2009-03-07

    The requirements for shielding a treatment vault with a Varian Clinac 2100 medical linear accelerator operated both with and without the flattening filter were assessed. Basic shielding parameters, such as primary beam tenth-value layers (TVLs), patient scatter fractions, and wall scatter fractions, were calculated using Monte Carlo simulations of 6, 10 and 18 MV beams. Relative integral target current requirements were determined from treatment planning studies of several disease sites with, and without, the flattening filter. The flattened beam shielding data were compared to data published in NCRP Report No. 151, and the unflattened beam shielding data were presented relative to the NCRP data. Finally, the shielding requirements for a typical treatment vault were determined for a single-energy (6 MV) linac and a dual-energy (6 MV/18 MV) linac. With the exception of large-angle patient scatter fractions and wall scatter fractions, the vault shielding parameters were reduced when the flattening filter was removed. Much of this reduction was consistent with the reduced average energy of the FFF beams. Primary beam TVLs were reduced by 12%, on average, and small-angle scatter fractions were reduced by up to 30%. Head leakage was markedly reduced because less integral target current was required to deliver the target dose. For the treatment vault examined in the current study, removal of the flattening filter reduced the required thickness of the primary and secondary barriers by 10-20%, corresponding to 18 m³ less concrete to shield the single-energy linac and 36 m³ less concrete to shield the dual-energy linac. Thus, a shielding advantage was found when the linac was operated without the flattening filter. This translates into a reduction in occupational exposure and/or the cost and space of shielding.

  9. Treatment vault shielding for a flattening filter-free medical linear accelerator

    Kry, Stephen F; Howell, Rebecca M; Polf, Jerimy; Mohan, Radhe; Vassiliev, Oleg N

    2009-01-01

    The requirements for shielding a treatment vault with a Varian Clinac 2100 medical linear accelerator operated both with and without the flattening filter were assessed. Basic shielding parameters, such as primary beam tenth-value layers (TVLs), patient scatter fractions, and wall scatter fractions, were calculated using Monte Carlo simulations of 6, 10 and 18 MV beams. Relative integral target current requirements were determined from treatment planning studies of several disease sites with, and without, the flattening filter. The flattened beam shielding data were compared to data published in NCRP Report No. 151, and the unflattened beam shielding data were presented relative to the NCRP data. Finally, the shielding requirements for a typical treatment vault were determined for a single-energy (6 MV) linac and a dual-energy (6 MV/18 MV) linac. With the exception of large-angle patient scatter fractions and wall scatter fractions, the vault shielding parameters were reduced when the flattening filter was removed. Much of this reduction was consistent with the reduced average energy of the FFF beams. Primary beam TVLs were reduced by 12%, on average, and small-angle scatter fractions were reduced by up to 30%. Head leakage was markedly reduced because less integral target current was required to deliver the target dose. For the treatment vault examined in the current study, removal of the flattening filter reduced the required thickness of the primary and secondary barriers by 10-20%, corresponding to 18 m³ less concrete to shield the single-energy linac and 36 m³ less concrete to shield the dual-energy linac. Thus, a shielding advantage was found when the linac was operated without the flattening filter. This translates into a reduction in occupational exposure and/or the cost and space of shielding.

  10. Multiple Linear Regression for Reconstruction of Gene Regulatory Networks in Solving Cascade Error Problems

    Faridah Hani Mohamed Salleh

    2017-01-01

    Full Text Available Gene regulatory network (GRN) reconstruction is the process of identifying regulatory gene interactions from experimental data through computational analysis. One of the main reasons for the reduced performance of previous GRN methods had been inaccurate prediction of cascade motifs. Cascade error is defined as the wrong prediction of cascade motifs, where an indirect interaction is misinterpreted as a direct interaction. Despite the active research on various GRN prediction methods, the discussion on specific methods to solve problems related to cascade errors is still lacking. In fact, the experiments conducted by the past studies were not specifically geared towards proving the ability of GRN prediction methods in avoiding the occurrences of cascade errors. Hence, this research aims to propose Multiple Linear Regression (MLR) to infer GRN from gene expression data and to avoid wrongly inferring of an indirect interaction (A → B → C) as a direct interaction (A → C). Since the number of observations of the real experiment datasets was far less than the number of predictors, some predictors were eliminated by extracting the random subnetworks from global interaction networks via an established extraction method. In addition, the experiment was extended to assess the effectiveness of MLR in dealing with cascade error by using a novel experimental procedure that had been proposed in this work. The experiment revealed that the number of cascade errors had been very minimal. Apart from that, the Belsley collinearity test proved that multicollinearity did affect the datasets used in this experiment greatly. All the tested subnetworks obtained satisfactory results, with AUROC values above 0.5.

  11. Correlation of concentration of modified cassava flour for banana fritter flour using simple linear regression

    Herminiati, A.; Rahman, T.; Turmala, E.; Fitriany, C. G.

    2017-12-01

    The purpose of this study was to determine the correlation of different concentrations of modified cassava flour processed into banana fritter flour. The research method consisted of two stages: (1) determining the type of flour among cassava flour, modified cassava flour-A (produced using lactic acid bacteria), and modified cassava flour-B (produced using the autoclaving-cooling cycle method), followed by an organoleptic test and physicochemical analysis; (2) determining the correlation of the concentration of modified cassava flour for banana fritter flour, using a simple linear regression design. The factor was the concentration of modified cassava flour-B: (y1) 40%, (y2) 50%, and (y3) 60%. The responses in the study include physical analysis (whiteness of flour, water holding capacity-WHC, oil holding capacity-OHC), chemical analysis (moisture content, ash content, crude fiber content, starch content), and organoleptic properties (color, aroma, taste, texture). The results showed that the type of flour selected from the organoleptic test was modified cassava flour-B. Analysis of modified cassava flour-B gave a whiteness of 60.42%; WHC 41.17%; OHC 21.15%; moisture content 4.4%; ash content 1.75%; crude fiber content 1.86%; starch content 67.31%. The concentration of modified cassava flour-B correlates with the whiteness of flour, WHC, OHC, moisture content, ash content, crude fiber content, and starch content. The concentration of modified cassava flour-B does not affect the color, aroma, taste, and texture.

  12. Multiple Linear Regression and Artificial Neural Network to Predict Blood Glucose in Overweight Patients.

    Wang, J; Wang, F; Liu, Y; Xu, J; Lin, H; Jia, B; Zuo, W; Jiang, Y; Hu, L; Lin, F

    2016-01-01

    Overweight individuals are at higher risk of developing type II diabetes than the general population. We conducted this study to analyze the correlation between blood glucose and biochemical parameters, and developed a blood glucose prediction model tailored to overweight patients. A total of 346 overweight Chinese patients aged 18-81 years were involved in this study. Their levels of fasting glucose (fs-GLU), blood lipids, and hepatic and renal functions were measured and analyzed by multiple linear regression (MLR). Based on the MLR results, we developed a back propagation artificial neural network (BP-ANN) model by selecting tansig as the transfer function of the hidden layer nodes, and purelin for the output layer nodes, with a training goal of 0.5×10^-5. There was a significant correlation between fs-GLU and age, BMI, and blood biochemical indexes (P<0.05). The results of MLR analysis indicated that age, fasting alanine transaminase (fs-ALT), blood urea nitrogen (fs-BUN), total protein (fs-TP), uric acid (fs-UA), and BMI are 6 independent variables related to fs-GLU. Based on these parameters, the BP-ANN model performed well and reached high prediction accuracy after 1,000 training epochs (R=0.9987). The level of fs-GLU was predictable using the proposed BP-ANN model based on 6 related parameters (age, fs-ALT, fs-BUN, fs-TP, fs-UA and BMI) in overweight patients. © Georg Thieme Verlag KG Stuttgart · New York.

  13. Multiple Linear Regression for Reconstruction of Gene Regulatory Networks in Solving Cascade Error Problems.

    Salleh, Faridah Hani Mohamed; Zainudin, Suhaila; Arif, Shereena M

    2017-01-01

    Gene regulatory network (GRN) reconstruction is the process of identifying regulatory gene interactions from experimental data through computational analysis. One of the main reasons for the reduced performance of previous GRN methods had been inaccurate prediction of cascade motifs. Cascade error is defined as the wrong prediction of cascade motifs, where an indirect interaction is misinterpreted as a direct interaction. Despite the active research on various GRN prediction methods, the discussion on specific methods to solve problems related to cascade errors is still lacking. In fact, the experiments conducted by the past studies were not specifically geared towards proving the ability of GRN prediction methods in avoiding the occurrences of cascade errors. Hence, this research aims to propose Multiple Linear Regression (MLR) to infer GRN from gene expression data and to avoid wrongly inferring of an indirect interaction (A → B → C) as a direct interaction (A → C). Since the number of observations of the real experiment datasets was far less than the number of predictors, some predictors were eliminated by extracting the random subnetworks from global interaction networks via an established extraction method. In addition, the experiment was extended to assess the effectiveness of MLR in dealing with cascade error by using a novel experimental procedure that had been proposed in this work. The experiment revealed that the number of cascade errors had been very minimal. Apart from that, the Belsley collinearity test proved that multicollinearity did affect the datasets used in this experiment greatly. All the tested subnetworks obtained satisfactory results, with AUROC values above 0.5.
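
    The cascade error defined above can be made concrete with a small sketch, assuming a toy cascade A → B → C in which C depends only on B. Regressing C on A alone suggests a direct A → C link, while multiple linear regression on A and B together assigns A a coefficient of zero. The data and names are hypothetical and are not drawn from the paper's datasets.

```python
def centered_sum(u, v):
    """Sum of products of deviations from the mean (an unnormalized covariance)."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    return sum((ui - mean_u) * (vi - mean_v) for ui, vi in zip(u, v))


# Hypothetical expression levels for a cascade A -> B -> C.
a = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
b = [1.0, 1.0, 4.0, 7.0, 7.0, 10.0]   # driven by A, plus other variation
c = [3.0 * bi for bi in b]            # driven only by B

s_aa, s_bb, s_ab = centered_sum(a, a), centered_sum(b, b), centered_sum(a, b)
s_ac, s_bc = centered_sum(a, c), centered_sum(b, c)

# Simple regression of C on A alone: A looks like a direct regulator of C.
slope_a_only = s_ac / s_aa

# Multiple regression of C on A and B (normal equations for two centered predictors).
det = s_aa * s_bb - s_ab ** 2
coef_a = (s_bb * s_ac - s_ab * s_bc) / det
coef_b = (s_aa * s_bc - s_ab * s_ac) / det
```

    The sketch only works because A and B are not perfectly collinear; as the abstract notes for its own datasets, strong multicollinearity makes such coefficients unstable.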

  14. Reflexion on linear regression trip production modelling method for ensuring good model quality

    Suprayitno, Hitapriya; Ratnasari, Vita

    2017-11-01

    Transport modelling is important. For certain cases, the conventional model still has to be used, for which a good trip production model is essential. A good model can only be obtained from a good sample. Two basic principles of good sampling are that the sample must represent the population characteristics and must produce an acceptable error at a certain confidence level. It seems that these principles are not yet well understood and used in trip production modelling. Therefore, it is necessary to investigate trip production modelling practice in Indonesia and to try to formulate a better modelling method for ensuring model quality. The research results are as follows. Statistics provides a method to calculate the span of a predicted value at a certain confidence level for linear regression, called the Confidence Interval of Predicted Value. Common modelling practice uses R² as the principal quality measure; sampling practice varies and does not always conform to the sampling principles. An experiment indicates that a small sample is already capable of giving an excellent R² value and that sample composition can significantly change the model. Hence, a good R² value does not, in fact, always mean good model quality. These findings lead to three basic ideas for ensuring good model quality: reformulating the quality measure, the calculation procedure, and the sampling method. The quality measure is defined as having a good R² value and a good Confidence Interval of Predicted Value. The calculation procedure must incorporate the statistical calculation method and the appropriate statistical tests. A good sampling method must incorporate random, well-distributed stratified sampling with a certain minimum number of samples. These three ideas need to be developed further and tested.
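
    For reference, the interval the abstract calls the Confidence Interval of Predicted Value is usually written, for simple linear regression with n observations, in the textbook form below. The notation is an assumption (the abstract gives no formula): x_0 is the predictor value of interest, s the residual standard error, and t the Student quantile with n - 2 degrees of freedom.

```latex
% Interval for a new observation at x_0 (prediction interval):
\hat{y}_0 \;\pm\; t_{1-\alpha/2,\;n-2}\; s\,
\sqrt{1 + \frac{1}{n} + \frac{(x_0-\bar{x})^2}{\sum_{i=1}^{n}(x_i-\bar{x})^2}},
\qquad
s^2 = \frac{1}{n-2}\sum_{i=1}^{n}\bigl(y_i-\hat{y}_i\bigr)^2 .

% Interval for the mean response at x_0 (drop the leading 1 under the root):
\hat{y}_0 \;\pm\; t_{1-\alpha/2,\;n-2}\; s\,
\sqrt{\frac{1}{n} + \frac{(x_0-\bar{x})^2}{\sum_{i=1}^{n}(x_i-\bar{x})^2}} .
```

    Both intervals widen as n shrinks and as x_0 moves away from the sample mean, which is why a small sample can show an excellent R² and still predict poorly.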

  15. Identifying keystone species in the human gut microbiome from metagenomic timeseries using sparse linear regression.

    Charles K Fisher

    Full Text Available Human associated microbial communities exert tremendous influence over human health and disease. With modern metagenomic sequencing methods it is now possible to follow the relative abundance of microbes in a community over time. These microbial communities exhibit rich ecological dynamics and an important goal of microbial ecology is to infer the ecological interactions between species directly from sequence data. Any algorithm for inferring ecological interactions must overcome three major obstacles: (1) a correlation between the abundances of two species does not imply that those species are interacting, (2) the sum constraint on the relative abundances obtained from metagenomic studies makes it difficult to infer the parameters in timeseries models, and (3) errors due to experimental uncertainty, or mis-assignment of sequencing reads into operational taxonomic units, bias inferences of species interactions due to a statistical problem called "errors-in-variables". Here we introduce an approach, Learning Interactions from MIcrobial Time Series (LIMITS), that overcomes these obstacles. LIMITS uses sparse linear regression with bootstrap aggregation to infer a discrete-time Lotka-Volterra model for microbial dynamics. We tested LIMITS on synthetic data and showed that it could reliably infer the topology of the inter-species ecological interactions. We then used LIMITS to characterize the species interactions in the gut microbiomes of two individuals and found that the interaction networks varied significantly between individuals. Furthermore, we found that the interaction networks of the two individuals are dominated by distinct "keystone species", Bacteroides fragilis and Bacteroides stercoris, that have a disproportionate influence on the structure of the gut microbiome even though they are only found in moderate abundance. Based on our results, we hypothesize that the abundances of certain keystone species may be responsible for individuality in

  16. H∞ Filtering for Dynamic Compensation of Self-Powered Neutron Detectors - A Linear Matrix Inequality Based Method -

    Park, M.G.; Kim, Y.H.; Cha, K.H.; Kim, M.K. [Korea Electric Power Research Institute, Taejon (Korea)]

    1999-07-01

    A method is described to develop an H∞ filtering method for the dynamic compensation of self-powered neutron detectors normally used for fixed incore instruments. An H∞ norm of the filter transfer matrix is used as the optimization criterion in the worst-case estimation error sense. Filter modeling is performed for both continuous- and discrete-time models. The filter gains are optimized in the sense of the noise attenuation level of the H∞ setting. By introducing the Bounded Real Lemma, the conventional algebraic Riccati inequalities are converted into Linear Matrix Inequalities (LMIs). Finally, the filter design problem is solved via the convex optimization framework using LMIs. The simulation results show that remarkable improvements are achieved in view of the filter response time and the filter design efficiency. (author). 15 refs., 4 figs., 3 tabs.
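
    The Bounded Real Lemma mentioned above has a standard continuous-time LMI form, reproduced here as background. The symbols are assumptions, since the abstract does not state them: (A, B, C, D) is a state-space realization of the transfer matrix G(s), and γ is the attenuation level. The paper's own filter-design LMIs are not shown.

```latex
% G(s) = C (sI - A)^{-1} B + D.
% A is Hurwitz and \|G\|_\infty < \gamma if and only if there exists P = P^{\top} \succ 0 with
\begin{bmatrix}
A^{\top}P + PA & PB & C^{\top} \\
B^{\top}P & -\gamma I & D^{\top} \\
C & D & -\gamma I
\end{bmatrix} \prec 0 .
```

    Because the inequality is linear in P (and in γ), minimizing γ subject to it is a convex problem, which is the property the abstract relies on.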

  17. A novel simple QSAR model for the prediction of anti-HIV activity using multiple linear regression analysis.

    Afantitis, Antreas; Melagraki, Georgia; Sarimveis, Haralambos; Koutentis, Panayiotis A; Markopoulos, John; Igglessi-Markopoulou, Olga

    2006-08-01

    A quantitative structure-activity relationship was obtained by applying Multiple Linear Regression Analysis to a series of 80 1-[2-hydroxyethoxy-methyl]-6-(phenylthio) thymine (HEPT) derivatives with significant anti-HIV activity. For the selection of the best among 37 different descriptors, the Elimination Selection Stepwise Regression Method (ES-SWR) was utilized. The resulting QSAR model (R²(CV) = 0.8160; S(PRESS) = 0.5680) proved to be very accurate both in training and predictive stages.
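
    The cross-validated statistics quoted in this record are typically derived from the leave-one-out PRESS (predicted residual sum of squares). The sketch below shows that calculation for a single-descriptor regression, as a minimal illustration: the paper uses multiple descriptors, the data here are hypothetical, and the exact normalization behind its S(PRESS) value is not stated in the abstract, so only PRESS and the cross-validated R² are computed.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def loo_press(x, y):
    """Leave-one-out PRESS: refit without each point, then predict that point."""
    press = 0.0
    for i in range(len(x)):
        intercept, slope = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (intercept + slope * x[i])) ** 2
    return press


def q_squared(x, y):
    """Cross-validated R^2, defined here as 1 - PRESS / total sum of squares."""
    mean_y = sum(y) / len(y)
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - loo_press(x, y) / ss_tot


# Hypothetical descriptor values and activities.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]
q2 = q_squared(x, y)
```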

  18. Target Response Adaptation for Correlation Filter Tracking

    Bibi, Adel Aamer; Mueller, Matthias; Ghanem, Bernard

    2016-01-01

    Most correlation filter (CF) based trackers utilize the circulant structure of the training data to learn a linear filter that best regresses this data to a hand-crafted target response. These circularly shifted patches are only approximations

  19. H-/H∞ structural damage detection filter design using an iterative linear matrix inequality approach

    Chen, B; Nagarajaiah, S

    2008-01-01

    The existence of damage in different members of a structure can be posed as a fault detection problem. It is also necessary to isolate structural members in which damage exists, which can be posed as a fault isolation problem. It is also important to detect the time instants of occurrence of the faults/damage. The structural damage detection filter developed in this paper is a model-based fault detection and isolation (FDI) observer suitable for detecting and isolating structural damage. In systems, possible faults, disturbances and noise are coupled together. When system disturbances and sensor noise cannot be decoupled from faults/damage, the detection filter needs to be designed to be robust to disturbances as well as sensitive to faults/damage. In this paper, a new H-/H∞ and iterative linear matrix inequality (LMI) technique is developed and a new stabilizing FDI filter is proposed, which bounds the H∞ norm of the transfer function from disturbances to the output residual and simultaneously does not degrade the component of the output residual due to damage. The reduced-order error dynamic system is adopted to form bilinear matrix inequalities (BMIs), then an iterative LMI algorithm is developed to solve the BMIs. The numerical example and experimental verification demonstrate that the proposed algorithm can successfully detect and isolate structural damage in the presence of measurement noise.

  20. Low-sensitivity H∞ filter design for linear delta operator systems with sampling time jitter

    Guo, Xiang-Gui; Yang, Guang-Hong

    2012-04-01

    This article is concerned with the problem of designing H∞ filters for a class of linear discrete-time systems with low sensitivity to sampling time jitter via the delta operator approach. The delta-domain model is used to avoid the inherent numerical ill-conditioning resulting from the use of the standard shift-domain model at high sampling rates. Based on the projection lemma in combination with the descriptor system approach often used to solve problems related to delay, a novel bounded real lemma with three slack variables for delta operator systems is presented. A sensitivity approach based on this novel lemma is proposed to mitigate the effects of sampling time jitter on system performance. Then, the problem of designing a low-sensitivity filter can be reduced to a convex optimisation problem. An important consideration in the design of correlation filters is the optimal trade-off between the standard H∞ criterion and the sensitivity of the transfer function with respect to sampling time jitter. Finally, a numerical example demonstrating the validity of the proposed design method is given.

  1. Beam Characterization of 10-MV Photon Beam from Medical Linear Accelerator without Flattening Filter.

    Shimozato, Tomohiro; Aoyama, Yuichi; Matsunaga, Takuma; Tabushi, Katsuyoshi

    2017-01-01

    This work investigated the dosimetric properties of a 10-MV photon beam emitted from a medical linear accelerator (linac) with no flattening filter (FF). The aim of this study is to analyze the radiation fluence and energy emitted from the flattening filter free (FFF) linac using Monte Carlo (MC) simulations. The FFF linac was created by removing the FF from a linac in clinical use. Measurements of the depth dose (DD) and the off-axis profile were performed using a three-dimensional water phantom with an ionization chamber. A MC simulation for a 10-MV photon beam from this FFF linac was performed using the BEAMnrc code. The off-axis profiles for the FFF linac exhibited a chevron-like distribution, and the dose outside the irradiation field was found to be lower for the FFF linac than for a linac with an FF (FF linac). The DD curves for the FFF linac included many contaminant electrons in the build-up region. Therefore, for clinical use, a metal filter is additionally required to reduce the effects of the electron contamination. The mean energy of the FFF linac was found to be lower than that of the FF linac owing to the absence of beam hardening caused by the FF.

  2. Evaluating Non-Linear Regression Models in Analysis of Persian Walnut Fruit Growth

    I. Karamatlou

    2016-02-01

    Full Text Available Introduction: Persian walnut (Juglans regia L.) is a large, wind-pollinated, monoecious, dichogamous, long-lived, perennial tree cultivated for its high quality wood and nuts throughout the temperate regions of the world. Growth model methodology has been widely used in the modeling of plant growth. Mathematical models are important tools to study plant growth and agricultural systems. These models can be applied for decision-making and designing management procedures in horticulture. Through growth analysis, planning for planting systems, fertilization, pruning operations, harvest time as well as obtaining economical yield can be more accessible. Non-linear models are more difficult to specify and estimate than linear models. This research aimed to study non-linear regression models based on data obtained from fruit weight, length and width. Selecting the best models to explain the inherent fruit growth pattern of Persian walnut was a further goal of this study. Materials and Methods: The experimental material comprised 14 Persian walnut genotypes propagated by seed, collected from a walnut orchard in Golestan province, Minoudasht region, Iran, at latitude 37°04'N; longitude 55°32'E; altitude 1060 m, in a silt loam soil type. These genotypes were selected as a representative sampling of the many walnut genotypes available throughout Northeastern Iran. The age range of walnut trees was 30 to 50 years. The annual mean temperature at the location is 16.3 °C, with annual mean rainfall of 690 mm. The data used here are averages of walnut fresh fruit measurements, recorded in gram/millimeter/day in 2011. According to the data distribution pattern, several equations have been proposed to describe sigmoidal growth patterns. Here, we used double-sigmoid and logistic–monomolecular models to evaluate fruit growth based on fruit weight and 4 different regression models including Richards, Gompertz, Logistic and Exponential growth for evaluation

  3. Trend analysis by a piecewise linear regression model applied to surface air temperatures in Southeastern Spain (1973–2014)

    Campra, Pablo; Morales, Maria

    2016-01-01

    The magnitude of the trends of environmental and climatic changes is mostly derived from the slopes of the linear trends using ordinary least-square fitting. An alternative flexible fitting model, piecewise regression, has been applied here to surface air temperature records in southeastern Spain for the recent warming period (1973–2014) to gain accuracy in the description of the inner structure of change, dividing the time series into linear segments with different slopes. Breakpoint y...
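
    The idea of dividing a series into linear segments with different slopes can be illustrated with a minimal sketch. It is not the authors' procedure (the abstract is cut off before the method is described); it assumes a single breakpoint, fits an independent least-squares line on each side, and picks the split by exhaustive search. The function names and the synthetic series are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares line; returns (intercept, slope, sse)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    return intercept, slope, sse


def best_breakpoint(x, y, min_pts=3):
    """Try every split index k and keep the one with the lowest total SSE."""
    best = None
    for k in range(min_pts, len(x) - min_pts + 1):
        _, s_left, e_left = fit_line(x[:k], y[:k])
        _, s_right, e_right = fit_line(x[k:], y[k:])
        if best is None or e_left + e_right < best[0]:
            best = (e_left + e_right, k, s_left, s_right)
    return best  # (total SSE, split index, left slope, right slope)


# Synthetic series (not real temperature data): flat, then rising.
x = list(range(10))
y = [0, 0, 0, 0, 0, 2, 3, 4, 5, 6]
total_sse, k, slope_left, slope_right = best_breakpoint(x, y)
```

    A single ordinary least-squares line through the same series would report one average slope and hide the change at the split, which is the motivation the abstract gives for the piecewise model.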

  4. Kalman filter-based tracking of moving objects using linear ultrasonic sensor array for road vehicles

    Li, Shengbo Eben; Li, Guofa; Yu, Jiaying; Liu, Chang; Cheng, Bo; Wang, Jianqiang; Li, Keqiang

    2018-01-01

    Detection and tracking of objects in the side-near-field has attracted much attention for the development of advanced driver assistance systems. This paper presents a cost-effective approach to track moving objects around vehicles using linearly arrayed ultrasonic sensors. To understand the detection characteristics of a single sensor, an empirical detection model was developed considering the shapes and surface materials of various detected objects. Eight sensors were arrayed linearly to expand the detection range for further application in traffic environment recognition. Two types of tracking algorithms, including an Extended Kalman filter (EKF) and an Unscented Kalman filter (UKF), for the sensor array were designed for dynamic object tracking. The ultrasonic sensor array was designed to have two types of fire sequences: mutual firing or serial firing. The effectiveness of the designed algorithms was verified in two typical driving scenarios: passing intersections with traffic sign poles or street lights, and overtaking another vehicle. Experimental results showed that both EKF and UKF had more precise tracking position and smaller RMSE (root mean square error) than a traditional triangular positioning method. The effectiveness also encourages the application of cost-effective ultrasonic sensors in the near-field environment perception in autonomous driving systems.

  5. Seasonal Variability of Aragonite Saturation State in the North Pacific Ocean Predicted by Multiple Linear Regression

    Kim, T. W.; Park, G. H.

    2014-12-01

    Seasonal variation of aragonite saturation state (Ωarag) in the North Pacific Ocean (NPO) was investigated, using multiple linear regression (MLR) models produced from the PACIFICA (Pacific Ocean interior carbon) dataset. Data within depth ranges of 50-1200 m were used to derive MLR models, and three parameters (potential temperature, nitrate, and apparent oxygen utilization (AOU)) were chosen as predictor variables because these parameters are associated with vertical mixing, DIC (dissolved inorganic carbon) removal and release which all affect Ωarag in the water column directly or indirectly. The PACIFICA dataset was divided into 5° × 5° grids, and an MLR model was produced in each grid, giving a total of 145 independent MLR models over the NPO. Mean RMSE (root mean square error) and r² (coefficient of determination) of all derived MLR models were approximately 0.09 and 0.96, respectively. Then the obtained MLR coefficients for each of the predictor variables and an intercept were interpolated over the study area, thereby making it possible to allocate MLR coefficients to data-sparse ocean regions. Predictability from the interpolated coefficients was evaluated using Hawaiian time-series data, and as a result the mean residual between measured and predicted Ωarag values was approximately 0.08, which is less than the mean RMSE of our MLR models. The interpolated MLR coefficients were combined with seasonal climatology of World Ocean Atlas 2013 (1° × 1°) to produce seasonal Ωarag distributions over various depths. Large seasonal variability in Ωarag was manifested in the mid-latitude Western NPO (24-40°N, 130-180°E) and low-latitude Eastern NPO (0-12°N, 115-150°W). In the Western NPO, seasonal fluctuations of water column stratification appeared to be responsible for the seasonal variation in Ωarag (~ 0.5 at 50 m) because it closely followed temperature variations in a layer of 0-75 m. In contrast, remineralization of organic matter was the main cause for the seasonal

  6. Multiple linear regression to estimate time-frequency electrophysiological responses in single trials.

    Hu, L; Zhang, Z G; Mouraux, A; Iannetti, G D

    2015-05-01

    Transient sensory, motor or cognitive events elicit not only phase-locked event-related potentials (ERPs) in the ongoing electroencephalogram (EEG), but also induce non-phase-locked modulations of ongoing EEG oscillations. These modulations can be detected when single-trial waveforms are analysed in the time-frequency domain, and consist of stimulus-induced decreases (event-related desynchronization, ERD) or increases (event-related synchronization, ERS) of synchrony in the activity of the underlying neuronal populations. ERD and ERS reflect changes in the parameters that control oscillations in neuronal networks and, depending on the frequency at which they occur, represent neuronal mechanisms involved in cortical activation, inhibition and binding. ERD and ERS are commonly estimated by averaging the time-frequency decomposition of single trials. However, their trial-to-trial variability, which can reflect physiologically important information, is lost by across-trial averaging. Here, we aim to (1) develop novel approaches to explore single-trial parameters (including latency, frequency and magnitude) of ERP/ERD/ERS; (2) disclose the relationship between estimated single-trial parameters and other experimental factors (e.g., perceived intensity). We found that (1) stimulus-elicited ERP/ERD/ERS can be correctly separated using principal component analysis (PCA) decomposition with Varimax rotation on the single-trial time-frequency distributions; (2) time-frequency multiple linear regression with dispersion term (TF-MLRd) enhances the signal-to-noise ratio of ERP/ERD/ERS in single trials, and provides an unbiased estimation of their latency, frequency, and magnitude at single-trial level; (3) these estimates can be meaningfully correlated with each other and with other experimental factors at single-trial level (e.g., perceived stimulus intensity and ERP magnitude). The methods described in this article allow exploring fully non-phase-locked stimulus-induced cortical

  7. Isolating and Examining Sources of Suppression and Multicollinearity in Multiple Linear Regression

    Beckstead, Jason W.

    2012-01-01

    The presence of suppression (and multicollinearity) in multiple regression analysis complicates interpretation of predictor-criterion relationships. The mathematical conditions that produce suppression in regression analysis have received considerable attention in the methodological literature but until now nothing in the way of an analytic…

  8. Toward Customer-Centric Organizational Science: A Common Language Effect Size Indicator for Multiple Linear Regressions and Regressions With Higher-Order Terms.

    Krasikova, Dina V; Le, Huy; Bachura, Eric

    2018-01-22

    To address a long-standing concern regarding a gap between organizational science and practice, scholars called for more intuitive and meaningful ways of communicating research results to users of academic research. In this article, we develop a common language effect size index (CLβ) that can help translate research results to practice. We demonstrate how CLβ can be computed and used to interpret the effects of continuous and categorical predictors in multiple linear regression models. We also elaborate on how the proposed CLβ index is computed and used to interpret interactions and nonlinear effects in regression models. In addition, we test the robustness of the proposed index to violations of normality and provide means for computing standard errors and constructing confidence intervals around its estimates. (PsycINFO Database Record (c) 2018 APA, all rights reserved).

  9. QSAR models for prediction study of HIV protease inhibitors using support vector machines, neural networks and multiple linear regression

    Rachid Darnag

    2017-02-01

    Full Text Available Support vector machines (SVM) represent one of the most promising Machine Learning (ML) tools that can be applied to develop predictive quantitative structure–activity relationship (QSAR) models using molecular descriptors. Multiple linear regression (MLR) and artificial neural networks (ANNs) were also utilized to construct quantitative linear and non-linear models to compare with the results obtained by SVM. The prediction results are in good agreement with the experimental value of HIV activity; also, the results reveal the superiority of the SVM over the MLR and ANN models. The contribution of each descriptor to the structure–activity relationships was evaluated.

  10. WekaPyScript: Classification, Regression, and Filter Schemes for WEKA Implemented in Python

    Christopher Beckham

    2016-08-01

    Full Text Available WekaPyScript is a package for the machine learning software WEKA that allows learning algorithms and preprocessing methods for classification and regression to be written in Python, as opposed to WEKA’s implementation language, Java. This opens up WEKA to Python’s machine learning and scientific computing ecosystem. Furthermore, due to Python’s minimalist syntax, learning algorithms and preprocessing methods can be prototyped easily and utilised from within WEKA. WekaPyScript works by running a local Python server using the host’s installation of Python; as a result, any libraries installed in the host installation can be leveraged when writing a script for WekaPyScript. Three example scripts (two learning algorithms and one preprocessing method) are presented.

  11. Improving model predictions for RNA interference activities that use support vector machine regression by combining and filtering features

    Peek Andrew S

    2007-06-01

    Full Text Available Abstract Background: RNA interference (RNAi) is a naturally occurring phenomenon that results in the suppression of a target RNA sequence utilizing a variety of possible methods and pathways. To dissect the factors that result in effective siRNA sequences a regression kernel Support Vector Machine (SVM) approach was used to quantitatively model RNA interference activities. Results: Eight overall feature mapping methods were compared in their abilities to build SVM regression models that predict published siRNA activities. The primary factors in predictive SVM models are position specific nucleotide compositions. The secondary factors are position independent sequence motifs (N-grams) and guide strand to passenger strand sequence thermodynamics. Finally, the factors that are least contributory but are still predictive of efficacy are measures of intramolecular guide strand secondary structure and target strand secondary structure. Of these, the site of the 5' most base of the guide strand is the most informative. Conclusion: The capacity of specific feature mapping methods and their ability to build predictive models of RNAi activity suggests a relative biological importance of these features. Some feature mapping methods are more informative in building predictive models and overall t-test filtering provides a method to remove some noisy features or make comparisons among datasets. Together, these features can yield predictive SVM regression models with increased predictive accuracy between predicted and observed activities both within datasets by cross validation, and between independently collected RNAi activity datasets. Feature filtering to remove features should be approached carefully in that it is possible to reduce feature set size without substantially reducing predictive models, but the features retained in the candidate models become increasingly distinct.
Software to perform feature prediction and SVM training and testing on nucleic acid

  12. Linear regressive model structures for estimation and prediction of compartmental diffusive systems

    Vries, D; Keesman, K.J.; Zwart, Heiko J.

    In input-output relations of (compartmental) diffusive systems, physical parameters appear non-linearly, resulting in the use of (constrained) non-linear parameter estimation techniques with its short-comings regarding global optimality and computational effort. Given a LTI system in state space

  13. Linear regressive model structures for estimation and prediction of compartmental diffusive systems

    Vries, D.; Keesman, K.J.; Zwart, H.

    2006-01-01

    Abstract In input-output relations of (compartmental) diffusive systems, physical parameters appear non-linearly, resulting in the use of (constrained) non-linear parameter estimation techniques with its short-comings regarding global optimality and computational effort. Given a LTI system in state

  14. A STATISTICAL ANALYSIS OF GDP AND FINAL CONSUMPTION USING SIMPLE LINEAR REGRESSION. THE CASE OF ROMANIA 1990–2010

    Aniela Balacescu; Marian Zaharia

    2011-01-01

    This paper aims to examine the causal relationship between GDP and final consumption. The authors used a linear regression model in which GDP is treated as the result (response) variable and final consumption as the factor (explanatory) variable. The analysis was carried out in Excel, a modern application for computation and statistical data analysis.

  15. A simple bias correction in linear regression for quantitative trait association under two-tail extreme selection

    Kwan, Johnny S. H.; Kung, Annie W. C.; Sham, Pak C.

    2011-01-01

    Selective genotyping can increase power in quantitative trait association. One example of selective genotyping is two-tail extreme selection, but simple linear regression analysis gives a biased genetic effect estimate. Here, we present a simple correction for the bias. © The Author(s) 2011.

  16. Estimation of error components in a multi-error linear regression model, with an application to track fitting

    Fruehwirth, R.

    1993-01-01

    We present an estimation procedure of the error components in a linear regression model with multiple independent stochastic error contributions. After solving the general problem we apply the results to the estimation of the actual trajectory in track fitting with multiple scattering. (orig.)

  17. The Prediction Properties of Inverse and Reverse Regression for the Simple Linear Calibration Problem

    Parker, Peter A.; Geoffrey, Vining G.; Wilson, Sara R.; Szarka, John L., III; Johnson, Nels G.

    2010-01-01

    The calibration of measurement systems is a fundamental but under-studied problem within industrial statistics. The origins of this problem go back to basic chemical analysis based on NIST standards. In today's world these issues extend to mechanical, electrical, and materials engineering. Often, these new scenarios do not provide "gold standards" such as the standard weights provided by NIST. This paper considers the classic "forward regression followed by inverse regression" approach. In this approach the initial experiment treats the "standards" as the regressor and the observed values as the response to calibrate the instrument. The analyst then must invert the resulting regression model in order to use the instrument to make actual measurements in practice. This paper compares this classical approach to "reverse regression," which treats the standards as the response and the observed measurements as the regressor in the calibration experiment. Such an approach is intuitively appealing because it avoids the need for the inverse regression. However, it also violates some of the basic regression assumptions.
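
    The two approaches compared in this abstract can be sketched in a few lines. This is only an illustration under simple assumptions (one regressor, ordinary least squares, point estimates only); the function names and the numbers are hypothetical and are not taken from the paper.

```python
def ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def classical_estimate(standards, readings, new_reading):
    """Forward regression (reading on standard), then invert the line."""
    a, b = ols(standards, readings)
    return (new_reading - a) / b


def reverse_estimate(standards, readings, new_reading):
    """Reverse regression: regress the standard on the reading directly."""
    c, d = ols(readings, standards)
    return c + d * new_reading


# Made-up calibration data for illustration only.
standards = [1.0, 2.0, 3.0, 4.0, 5.0]
readings = [1.2, 1.9, 3.3, 3.8, 5.1]
x_classical = classical_estimate(standards, readings, 5.0)
x_reverse = reverse_estimate(standards, readings, 5.0)
```

    Both fitted lines pass through the point of means, and the reverse slope is never steeper than the inverted forward slope, so the reverse estimate lies at least as close to the mean of the standards as the classical one. With perfectly linear data the two estimates coincide.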

  18. Use of empirical likelihood to calibrate auxiliary information in partly linear monotone regression models.

    Chen, Baojiang; Qin, Jing

    2014-05-10

    In statistical analysis, a regression model is needed if one is interested in finding the relationship between a response variable and covariates. When the response depends on the covariate, then it may also depend on a function of this covariate. If one has no knowledge of this functional form but expects it to be monotonically increasing or decreasing, then the isotonic regression model is preferable. Estimation of parameters for isotonic regression models is based on the pool-adjacent-violators algorithm (PAVA), where the monotonicity constraints are built in. With missing data, people often employ the augmented estimating method to improve estimation efficiency by incorporating auxiliary information through a working regression model. However, under the framework of the isotonic regression model, the PAVA does not work as the monotonicity constraints are violated. In this paper, we develop an empirical likelihood-based method for the isotonic regression model to incorporate the auxiliary information. Because the monotonicity constraints still hold, the PAVA can be used for parameter estimation. Simulation studies demonstrate that the proposed method can yield more efficient estimates, and in some situations, the efficiency improvement is substantial. We apply this method to a dementia study. Copyright © 2013 John Wiley & Sons, Ltd.
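
    For readers unfamiliar with the pool-adjacent-violators algorithm mentioned above, a minimal sketch of the standard weighted least-squares version follows. It covers only the basic non-decreasing fit, not the empirical likelihood extension proposed in the paper; the function name `pava` is an illustrative choice.

```python
def pava(y, w=None):
    """Least-squares non-decreasing fit to y via pool-adjacent-violators."""
    if w is None:
        w = [1.0] * len(y)
    # Each block stores [weighted mean, total weight, number of points].
    blocks = []
    for yi, wi in zip(y, w):
        blocks.append([float(yi), float(wi), 1])
        # Pool backwards while the last two blocks violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            wt = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / wt, wt, n1 + n2])
    # Expand the pooled blocks back to one fitted value per observation.
    fit = []
    for mean, _, n in blocks:
        fit.extend([mean] * n)
    return fit


print(pava([1, 3, 2, 4]))  # [1.0, 2.5, 2.5, 4.0]
```

    The violating pair (3, 2) is replaced by its mean, which is how the monotonicity constraint is built into the estimate.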

  19. Non-linear feedback control of the p53 protein-mdm2 inhibitor system using the derivative-free non-linear Kalman filter.

    Rigatos, Gerasimos G

    2016-06-01

    It is proven that the model of the p53-mdm2 protein synthesis loop is a differentially flat one and using a diffeomorphism (change of state variables) that is proposed by differential flatness theory it is shown that the protein synthesis model can be transformed into the canonical (Brunovsky) form. This enables the design of a feedback control law that maintains the concentration of the p53 protein at the desirable levels. To estimate the non-measurable elements of the state vector describing the p53-mdm2 system dynamics, the derivative-free non-linear Kalman filter is used. Moreover, to compensate for modelling uncertainties and external disturbances that affect the p53-mdm2 system, the derivative-free non-linear Kalman filter is re-designed as a disturbance observer. The derivative-free non-linear Kalman filter consists of the Kalman filter recursion applied on the linearised equivalent of the protein synthesis model together with an inverse transformation based on differential flatness theory that enables to retrieve estimates for the state variables of the initial non-linear model. The proposed non-linear feedback control and perturbations compensation method for the p53-mdm2 system can result in more efficient chemotherapy schemes where the infusion of medication will be better administered.

  20. Evaluation of a multiple linear regression model and SARIMA model in forecasting heat demand for district heating system

    Fang, Tingting; Lahdelma, Risto

    2016-01-01

    Highlights: • Social factor is considered for the linear regression models besides weather file. • Simultaneously optimize all the coefficients for linear regression models. • SARIMA combined with linear regression is used to forecast the heat demand. • The accuracy for both linear regression and time series models are evaluated. - Abstract: Forecasting heat demand is necessary for production and operation planning of district heating (DH) systems. In this study we first propose a simple regression model where the hourly outdoor temperature and wind speed forecast the heat demand. Weekly rhythm of heat consumption as a social component is added to the model to significantly improve the accuracy. The other type of model is the seasonal autoregressive integrated moving average (SARIMA) model with exogenous variables as a combination to take weather factors, and the historical heat consumption data as depending variables. One outstanding advantage of the model is that it pursues high accuracy for both long-term and short-term forecasts by considering both exogenous factors and time series. The forecasting performance of both linear regression models and time series model are evaluated based on real-life heat demand data for the city of Espoo in Finland by out-of-sample tests for the last 20 full weeks of the year. The results indicate that the proposed linear regression model (T168h) using 168-h demand pattern with midweek holidays classified as Saturdays or Sundays gives the highest accuracy and strong robustness among all the tested models based on the tested forecasting horizon and corresponding data. Considering the parsimony of the input, the ease of use and the high accuracy, the proposed T168h model is the best in practice. The heat demand forecasting model can also be developed for individual buildings if automated meter reading customer measurements are available. This would allow forecasting the heat demand based on more accurate heat consumption

  1. Noise Reduction and Gap Filling of fAPAR Time Series Using an Adapted Local Regression Filter

    Álvaro Moreno

    2014-08-01

    Full Text Available Time series of remotely sensed data are an important source of information for understanding land cover dynamics. In particular, the fraction of absorbed photosynthetic active radiation (fAPAR) is a key variable in the assessment of vegetation primary production over time. However, the fAPAR series derived from polar orbit satellites are not continuous and consistent in space and time. Filtering methods are thus required to fill in gaps and produce high-quality time series. This study proposes an adapted (iteratively reweighted) local regression filter (LOESS) and performs a benchmarking intercomparison with four popular and generally applicable smoothing methods: Double Logistic (DLOG), smoothing spline (SSP), Interpolation for Data Reconstruction (IDR) and adaptive Savitzky-Golay (ASG). This paper evaluates the main advantages and drawbacks of the considered techniques. The results have shown that ASG and the adapted LOESS perform better in recovering fAPAR time series over multiple controlled noisy scenarios. Both methods can robustly reconstruct the fAPAR trajectories, reducing the noise up to 80% in the worst simulation scenario, which might be attributed to the quality control (QC) MODIS information incorporated into these filtering algorithms, their flexibility and adaptation to the upper envelope. The adapted LOESS is particularly resistant to outliers. This method clearly outperforms the other considered methods to deal with the high presence of gaps and noise in satellite data records. The low RMSE and biases obtained with the LOESS method (|rMBE| < 8%; rRMSE < 20%) reveal an optimal reconstruction even in most extreme situations with long seasonal gaps. An example of application of the LOESS method to fill in invalid values in real MODIS images presenting persistent cloud and snow coverage is also shown. The LOESS approach is recommended in most remote sensing applications, such as gap-filling, cloud-replacement, and observing temporal
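
    As a rough illustration of how a local regression filter can both smooth a series and fill gaps, the sketch below evaluates a plain, single-pass LOESS (local linear fit with tricube weights) at arbitrary time points. It is not the adapted filter of the paper: the iterative reweighting, the use of quality-control flags and the adaptation to the upper envelope are all omitted, and the function name, the `span` parameter and the toy series are assumptions made for the example.

```python
def loess_point(x, y, x0, span=5):
    """Evaluate a locally weighted linear fit at x0 (plain LOESS, one pass)."""
    # Use the `span` observations closest to x0; gaps are simply absent.
    nearest = sorted(zip(x, y), key=lambda p: abs(p[0] - x0))[:span]
    # Bandwidth: distance to the farthest selected neighbour.
    h = max(abs(px - x0) for px, _ in nearest) or 1.0
    # Tricube weights fall smoothly from 1 at x0 to 0 at distance h.
    w = [(1.0 - (abs(px - x0) / h) ** 3) ** 3 for px, _ in nearest]
    if sum(w) == 0:
        # Every neighbour sits exactly at distance h: fall back to equal weights.
        w = [1.0] * len(nearest)
    sw = sum(w)
    mx = sum(wi * px for wi, (px, _) in zip(w, nearest)) / sw
    my = sum(wi * py for wi, (_, py) in zip(w, nearest)) / sw
    sxx = sum(wi * (px - mx) ** 2 for wi, (px, _) in zip(w, nearest))
    sxy = sum(wi * (px - mx) * (py - my) for wi, (px, py) in zip(w, nearest))
    slope = sxy / sxx if sxx > 0 else 0.0
    return my + slope * (x0 - mx)


# Toy series with a gap at t = 4 and t = 5 (values are made up).
t = [0, 1, 2, 3, 6, 7, 8, 9]
v = [1.0, 3.0, 5.0, 7.0, 13.0, 15.0, 17.0, 19.0]
filled = [loess_point(t, v, t0, span=4) for t0 in (4, 5)]
```

    Because the fit is evaluated at any requested time, the same routine serves for smoothing at observed dates and for filling missing ones; a robust variant would repeat the fit with weights reduced for large residuals.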

  2. Data-Based Control for Humanoid Robots Using Support Vector Regression, Fuzzy Logic, and Cubature Kalman Filter

    Liyang Wang

    2016-01-01

    Full Text Available Time-varying external disturbances cause instability of humanoid robots or even tip robots over. In this work, a trapezoidal fuzzy least squares support vector regression (TF-LSSVR)-based control system is proposed to learn the external disturbances and increase the zero-moment-point (ZMP) stability margin of humanoid robots. First, the humanoid states and the corresponding control torques of the joints for training the controller are collected by implementing simulation experiments. Secondly, a TF-LSSVR with a time-related trapezoidal fuzzy membership function (TFMF) is proposed to train the controller using the simulated data. Thirdly, the parameters of the proposed TF-LSSVR are updated using a cubature Kalman filter (CKF). Simulation results are provided. The proposed method is shown to be effective in learning and adapting to occasional external disturbances and ensuring the stability margin of the robot.

  3. Phase unwrapping algorithm using polynomial phase approximation and linear Kalman filter.

    Kulkarni, Rishikesh; Rastogi, Pramod

    2018-02-01

    A noise-robust phase unwrapping algorithm is proposed based on state space analysis and polynomial phase approximation using wrapped phase measurement. The true phase is approximated as a two-dimensional first order polynomial function within a small sized window around each pixel. The estimates of polynomial coefficients provide the measurement of phase and local fringe frequencies. A state space representation of spatial phase evolution and the wrapped phase measurement is considered with the state vector consisting of polynomial coefficients as its elements. Instead of using the traditional nonlinear Kalman filter for the purpose of state estimation, we propose to use the linear Kalman filter operating directly with the wrapped phase measurement. The adaptive window width is selected at each pixel based on the local fringe density to strike a balance between the computation time and the noise robustness. In order to retrieve the unwrapped phase, either a line-scanning approach or a quality guided strategy of pixel selection is used depending on the underlying continuous or discontinuous phase distribution, respectively. Simulation and experimental results are provided to demonstrate the applicability of the proposed method.

  4. Realisation and optical engineering of linear variable bandpass filters in nanoporous anodic alumina photonic crystals.

    Sukarno; Law, Cheryl Suwen; Santos, Abel

    2017-06-08

    We present the first realisation of linear variable bandpass filters in nanoporous anodic alumina (NAA-LVBPFs) photonic crystal structures. NAA gradient-index filters (NAA-GIFs) are produced by sinusoidal pulse anodisation and used as photonic crystal platforms to generate NAA-LVBPFs. The anodisation period of NAA-GIFs is modified from 650 to 850 s to systematically tune the characteristic photonic stopband of these photonic crystals across the UV-visible-NIR spectrum. Then, the nanoporous structure of NAA-GIFs is gradually widened along the surface under controlled conditions by wet chemical etching using a dip coating approach aiming to create NAA-LVBPFs with finely engineered optical properties. We demonstrate that the characteristic photonic stopband and the iridescent interferometric colour displayed by these photonic crystals can be tuned with precision across the surface of NAA-LVBPFs by adjusting the fabrication and etching conditions. Here, we envisage for the first time the combination of the anodisation period and etching conditions as a cost-competitive, facile, and versatile nanofabrication approach that enables the generation of a broad range of unique LVBPFs covering the spectral regions. These photonic crystal structures open new opportunities for multiple applications, including adaptive optics, hyperspectral imaging, fluorescence diagnostics, spectroscopy, and sensing.

  5. Structural shielding design of a 6 MV flattening filter free linear accelerator: Indian scenario

    Bibekananda Mishra

    2017-01-01

    Full Text Available Detailed structural shielding of primary and secondary barriers for a 6 MV medical linear accelerator (LINAC) operated with flattening filter (FF) and flattening filter free (FFF) modes are calculated. The calculations have been carried out by two methods, one using the approach given in National Council on Radiation Protection (NCRP) Report No. 151 and the other based on the monitor units (MUs) delivered in clinical practice. Radiation survey of the installations was also carried out. NCRP approach suggests that the primary and secondary barrier thicknesses are higher by 24% and 26%, respectively, for a LINAC operated in FF mode to that of a LINAC operated in both FF and FFF modes with an assumption that only 20% of the workload is shared in FFF mode. Primary and secondary barrier thicknesses calculated from MUs delivered on clinical practice method also show the same trend and are higher by 20% and 19%, respectively, for a LINAC operated in FF mode to that of a LINAC operated in both FF and FFF modes. Overall, the barrier thickness for a LINAC operated in FF mode is higher about 20% to that of a LINAC operated in both FF and FFF modes.

  6. Structural Shielding Design of a 6 MV Flattening Filter Free Linear Accelerator: Indian Scenario.

    Mishra, Bibekananda; Selvam, T Palani; Sharma, P K Dash

    2017-01-01

    Detailed structural shielding of primary and secondary barriers for a 6 MV medical linear accelerator (LINAC) operated with flattening filter (FF) and flattening filter free (FFF) modes are calculated. The calculations have been carried out by two methods, one using the approach given in National Council on Radiation Protection (NCRP) Report No. 151 and the other based on the monitor units (MUs) delivered in clinical practice. Radiation survey of the installations was also carried out. NCRP approach suggests that the primary and secondary barrier thicknesses are higher by 24% and 26%, respectively, for a LINAC operated in FF mode than for a LINAC operated in both FF and FFF modes, with an assumption that only 20% of the workload is shared in FFF mode. Primary and secondary barrier thicknesses calculated from MUs delivered in clinical practice also show the same trend and are higher by 20% and 19%, respectively, for a LINAC operated in FF mode than for a LINAC operated in both FF and FFF modes. Overall, the barrier thickness for a LINAC operated in FF mode is about 20% higher than that of a LINAC operated in both FF and FFF modes.

  7. Acquiring beam data for a flattening-filter free linear accelerator using organic scintillators

    Beierholm, A.R.; Behrens, C.F.; Hoffmann, L.; Andersen, C.E.

    2013-01-01

    Fibre-coupled organic scintillators have been proven a credible alternative to clinically implemented methods for radiotherapy dosimetry, primarily due to their water equivalence and good spatial resolution. Furthermore, the fast response of the scintillators can be exploited to perform time-resolved dosimetry on a highly detailed level. In this study, we present beam data for a Varian TrueBeam linear accelerator, which is capable of delivering flattening-filter free (FFF) clinical X-ray beams. The beam data have been acquired using an in-house developed dosimetry system based on fibre-coupled organic scintillators. The presented data exhibit high accuracy and precision when compared with data obtained using commercial dosimetry methods, and agree well with results published in the literature. Highlights: • A dosimetry system based on fibre-coupled organic scintillators is presented. • The system is used for radiotherapy beams with and without flattening filter. • Measurements show good agreement with various commercial dosimeters.

  8. Performance improvement of shunt active power filter based on non-linear least-square approach

    Terriche, Yacine

    2018-01-01

    Nowadays, the shunt active power filters (SAPFs) have become a popular solution for power quality issues. A crucial issue in controlling the SAPFs, which is highly correlated with their accuracy, flexibility and dynamic behavior, is generating the reference compensating current (RCC). The synchronous reference frame (SRF) approach is widely used for generating the RCC due to its simplicity and computation efficiency. However, the SRF approach needs precise information of the voltage phase, which becomes a challenge under adverse grid conditions. A typical solution to answer this need... This paper proposes an improved open loop strategy which is unconditionally stable and flexible. The proposed method, which is based on non-linear least square (NLS) approach, can extract the fundamental voltage and estimate its phase within only half cycle, even in the presence of odd harmonics and dc offset...

  9. Comparison of some biased estimation methods (including ordinary subset regression) in the linear model

    Sidik, S. M.

    1975-01-01

    Ridge, Marquardt's generalized inverse, shrunken, and principal components estimators are discussed in terms of the objectives of point estimation of parameters, estimation of the predictive regression function, and hypothesis testing. It is found that as the normal equations approach singularity, more consideration must be given to estimable functions of the parameters as opposed to estimation of the full parameter vector; that biased estimators all introduce constraints on the parameter space; that adoption of mean squared error as a criterion of goodness should be independent of the degree of singularity; and that ordinary least-squares subset regression is the best overall method.

  10. Straight line fitting and predictions: On a marginal likelihood approach to linear regression and errors-in-variables models

    Christiansen, Bo

    2015-04-01

    Linear regression methods are without doubt the most used approaches to describe and predict data in the physical sciences. They are often good first order approximations and they are in general easier to apply and interpret than more advanced methods. However, even the properties of univariate regression can lead to debate over the appropriateness of various models, as witnessed by the recent discussion about climate reconstruction methods. Before linear regression is applied, important choices have to be made regarding the origins of the noise terms and regarding which of the two variables under consideration should be treated as the independent variable. These decisions are often not easy to make but they may have a considerable impact on the results. We seek to give a unified probabilistic (Bayesian with flat priors) treatment of univariate linear regression and prediction by taking, as starting point, the general errors-in-variables model (Christiansen, J. Clim., 27, 2014-2031, 2014). Other versions of linear regression can be obtained as limits of this model. We derive the likelihood of the model parameters and predictands of the general errors-in-variables model by marginalizing over the nuisance parameters. The resulting likelihood is relatively simple and easy to analyze and calculate. The well known unidentifiability of the errors-in-variables model is manifested as the absence of a well-defined maximum in the likelihood. However, this does not mean that probabilistic inference cannot be made; the marginal likelihoods of model parameters and the predictands have, in general, well-defined maxima. We also include a probabilistic version of classical calibration and show how it is related to the errors-in-variables model. The results are illustrated by an example from the coupling between the lower stratosphere and the troposphere in the Northern Hemisphere winter.

  11. Endogenous glucose production from infancy to adulthood: a non-linear regression model

    Huidekoper, Hidde H.; Ackermans, Mariëtte T.; Ruiter, An F. C.; Sauerwein, Hans P.; Wijburg, Frits A.

    2014-01-01

    To construct a regression model for endogenous glucose production (EGP) as a function of age, and compare this with glucose supplementation using commonly used dextrose-based saline solutions at fluid maintenance rate in children. A model was constructed based on EGP data, as quantified by

  12. Weighted linear regression using D2H and D2 as the independent variables

    Hans T. Schreuder; Michael S. Williams

    1998-01-01

    Several error structures for weighted regression equations used for predicting volume were examined for 2 large data sets of felled and standing loblolly pine trees (Pinus taeda L.). The generally accepted model with variance of error proportional to the value of the covariate squared ( D2H = diameter squared times height or D...

  13. Comparing Linear Discriminant Function with Logistic Regression for the Two-Group Classification Problem.

    Fan, Xitao; Wang, Lin

    The Monte Carlo study compared the performance of predictive discriminant analysis (PDA) and that of logistic regression (LR) for the two-group classification problem. Prior probabilities were used for classification, but the cost of misclassification was assumed to be equal. The study used a fully crossed three-factor experimental design (with…

  14. NetRaVE: constructing dependency networks using sparse linear regression

    Phatak, A.; Kiiveri, H.; Clemmensen, Line Katrine Harder

    2010-01-01

    NetRaVE is a small suite of R functions for generating dependency networks using sparse regression methods. Such networks provide an alternative to interpreting 'top n lists' of genes arising out of an analysis of microarray data, and they provide a means of organizing and visualizing the resulting...

  15. Third-Order Elliptic Lowpass Filter for Multi-Standard Baseband Chain Using Highly Linear Digitally Programmable OTA

    Elamien, Mohamed B.; Mahmoud, Soliman A.

    2018-03-01

    In this paper, a third-order elliptic lowpass filter is designed using a highly linear digitally programmable balanced OTA. The filter exhibits a cutoff frequency tuning range from 2.2 MHz to 7.1 MHz; thus, it covers the W-CDMA, UMTS, and DVB-H standards. The programmability concept in the filter is achieved by using a digitally programmable operational transconductance amplifier (DPOTA). The DPOTA employs three linearization techniques: source degeneration, a double differential pair, and adaptive biasing. Two current division networks (CDNs) are used to control the value of the transconductance. For the DPOTA, the third-order harmonic distortion (HD3) remains below -65 dB up to 0.4 V differential input voltage at 1.2 V supply voltage. The DPOTA and the filter are designed and simulated in 90 nm CMOS technology with the LTspice simulator.

  16. FIRE: an SPSS program for variable selection in multiple linear regression analysis via the relative importance of predictors.

    Lorenzo-Seva, Urbano; Ferrando, Pere J

    2011-03-01

    We provide an SPSS program that implements currently recommended techniques and recent developments for selecting variables in multiple linear regression analysis via the relative importance of predictors. The approach consists of: (1) optimally splitting the data for cross-validation, (2) selecting the final set of predictors to be retained in the equation regression, and (3) assessing the behavior of the chosen model using standard indices and procedures. The SPSS syntax, a short manual, and data files related to this article are available as supplemental materials from brm.psychonomic-journals.org/content/supplemental.

  17. Isotherms and thermodynamics by linear and non-linear regression analysis for the sorption of methylene blue onto activated carbon: Comparison of various error functions

    Kumar, K. Vasanth; Porkodi, K.; Rocha, F.

    2008-01-01

    A comparison of linear and non-linear regression method in selecting the optimum isotherm was made to the experimental equilibrium data of methylene blue sorption by activated carbon. The r² was used to select the best fit linear theoretical isotherm. In the case of non-linear regression method, six error functions, namely coefficient of determination (r²), hybrid fractional error function (HYBRID), Marquardt's percent standard deviation (MPSD), average relative error (ARE), sum of the errors squared (ERRSQ) and sum of the absolute errors (EABS) were used to predict the parameters involved in the two and three parameter isotherms and also to predict the optimum isotherm. For two parameter isotherm, MPSD was found to be the best error function in minimizing the error distribution between the experimental equilibrium data and predicted isotherms. In the case of three parameter isotherm, r² was found to be the best error function to minimize the error distribution structure between experimental equilibrium data and theoretical isotherms. The present study showed that the size of the error function alone is not a deciding factor to choose the optimum isotherm. In addition to the size of error function, the theory behind the predicted isotherm should be verified with the help of experimental data while selecting the optimum isotherm. A coefficient of non-determination, K², was explained and was found to be very useful in identifying the best error function while selecting the optimum isotherm.
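
    The error functions named in this abstract can be sketched as follows. The abstract does not give their formulas, so the definitions below are the forms commonly quoted in the sorption literature and should be read as assumptions; the function name `error_functions`, its arguments and the sample data are hypothetical.

```python
import math

def error_functions(q_exp, q_calc, n_params):
    """Return ERRSQ, EABS, ARE, HYBRID and MPSD for measured vs. predicted uptakes.

    Assumed (commonly used) definitions; n_params is the number of isotherm parameters.
    """
    n = len(q_exp)
    dof = n - n_params  # degrees of freedom used by HYBRID and MPSD
    resid = [e - c for e, c in zip(q_exp, q_calc)]
    errsq = sum(r * r for r in resid)                       # sum of squared errors
    eabs = sum(abs(r) for r in resid)                       # sum of absolute errors
    are = 100.0 / n * sum(abs(r / e) for r, e in zip(resid, q_exp))
    hybrid = 100.0 / dof * sum(r * r / e for r, e in zip(resid, q_exp))
    mpsd = 100.0 * math.sqrt(sum((r / e) ** 2 for r, e in zip(resid, q_exp)) / dof)
    return {"ERRSQ": errsq, "EABS": eabs, "ARE": are, "HYBRID": hybrid, "MPSD": mpsd}

# Made-up uptake values for a two-parameter isotherm, for illustration only.
scores = error_functions([10.0, 20.0, 40.0], [11.0, 18.0, 40.0], n_params=2)
```

    In a non-linear fit, one of these scores would be the objective passed to a numerical minimiser; the sketch only evaluates them for a fixed set of predictions.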

  18. Regression Is a Univariate General Linear Model Subsuming Other Parametric Methods as Special Cases.

    Vidal, Sherry

    Although the concept of the general linear model (GLM) has existed since the 1960s, other univariate analyses such as the t-test and the analysis of variance models have remained popular. The GLM produces an equation that minimizes the mean differences of independent variables as they are related to a dependent variable. From a computer printout…

  19. Model structure learning: A support vector machine approach for LPV linear-regression models

    Toth, R.; Laurain, V.; Zheng, W-X.; Poolla, K.

    2011-01-01

    Accurate parametric identification of Linear Parameter-Varying (LPV) systems requires an optimal prior selection of a set of functional dependencies for the parametrization of the model coefficients. Inaccurate selection leads to structural bias while over-parametrization results in a variance

  20. The use of linear programming techniques to design optimal digital filters for pulse shaping and channel equalization

    Houts, R. C.; Burlage, D. W.

    1972-01-01

    A time domain technique is developed to design finite-duration impulse response digital filters using linear programming. Two related applications of this technique in data transmission systems are considered. The first is the design of pulse shaping digital filters to generate or detect signaling waveforms transmitted over bandlimited channels that are assumed to have ideal low pass or bandpass characteristics. The second is the design of digital filters to be used as preset equalizers in cascade with channels that have known impulse response characteristics. Example designs are presented which illustrate that excellent waveforms can be generated with frequency-sampling filters and the ease with which digital transversal filters can be designed for preset equalization.

  1. The estimation and prediction of the inventories for the liquid and gaseous radwaste systems using the linear regression analysis

    Kim, J. Y.; Shin, C. H.; Kim, J. K.; Lee, J. K.; Park, Y. J.

    2003-01-01

    The variation trends of the inventories for the liquid radwaste system and for the radioactive gas released in containment, and their predicted values according to the operation histories of Yonggwang (YGN) 3 and 4, were analyzed by linear regression analysis. The results show that the inventories for those systems increase linearly with the operation histories, but the inventories released to the environment are considerably lower than the recommended values based on the FSAR suggestions. It is considered that some conservatism was present in the estimation methodology used in the preparation stage of the FSAR.

  2. Discussion on Regression Methods Based on Ensemble Learning and Applicability Domains of Linear Submodels.

    Kaneko, Hiromasa

    2018-02-26

    To develop a new ensemble learning method and construct highly predictive regression models in chemoinformatics and chemometrics, applicability domains (ADs) are introduced into the ensemble learning process of prediction. When estimating values of an objective variable using subregression models, only the submodels with ADs that cover a query sample, i.e., the sample is inside the model's AD, are used. By constructing submodels and changing a list of selected explanatory variables, the union of the submodels' ADs, which defines the overall AD, becomes large, and the prediction performance is enhanced for diverse compounds. By analyzing a quantitative structure-activity relationship data set and a quantitative structure-property relationship data set, it is confirmed that the ADs can be enlarged and the estimation performance of regression models is improved compared with traditional methods.

  3. Linear Regression with a Randomly Censored Covariate: Application to an Alzheimer's Study.

    Atem, Folefac D; Qian, Jing; Maye, Jacqueline E; Johnson, Keith A; Betensky, Rebecca A

    2017-01-01

    The association between maternal age of onset of dementia and amyloid deposition (measured by in vivo positron emission tomography (PET) imaging) in cognitively normal older offspring is of interest. In a regression model for amyloid, special methods are required due to the random right censoring of the covariate of maternal age of onset of dementia. Prior literature has proposed methods to address the problem of censoring due to assay limit of detection, but not random censoring. We propose imputation methods and a survival regression method that do not require parametric assumptions about the distribution of the censored covariate. Existing imputation methods address missing covariates, but not right censored covariates. In simulation studies, we compare these methods to the simple, but inefficient complete case analysis, and to thresholding approaches. We apply the methods to the Alzheimer's study.

  4. Linear discrete-time state space realization of a modified quadruple tank system with state estimation using Kalman filter

    Mohd. Azam, Sazuan Nazrah

    2017-01-01

    In this paper, we used the modified quadruple tank system that represents a multi-input-multi-output (MIMO) system as an example to present the realization of a linear discrete-time state space model and to obtain the state estimation using a Kalman filter in a methodical manner. First, an existing...... part of the Kalman filter is used to estimate the current state, based on the model and the measurements. The static and dynamic Kalman filters are compared and all results are demonstrated through simulations....
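
    The state estimation step mentioned in this abstract can be illustrated with a minimal scalar sketch. It is not the quadruple tank model from the paper: the one-state model x[k+1] = a*x[k] + w, y[k] = c*x[k] + v, the noise variances `q` and `r`, and the function name `kalman_step` are all assumptions chosen for illustration.

```python
def kalman_step(x_est, p_est, y, a=1.0, c=1.0, q=1e-4, r=0.1):
    """One predict/update cycle of a scalar (dynamic) Kalman filter.

    x_est, p_est: previous state estimate and its variance.
    y: new measurement. a, c, q, r: assumed model and noise parameters.
    """
    # Time update: propagate the estimate and its variance through the model.
    x_pred = a * x_est
    p_pred = a * p_est * a + q
    # Measurement update: the gain weighs the model against the measurement.
    gain = p_pred * c / (c * p_pred * c + r)
    x_new = x_pred + gain * (y - c * x_pred)
    p_new = (1.0 - gain * c) * p_pred
    return x_new, p_new

# Hypothetical run: a constant level of 5.0 observed without noise.
x, p = 0.0, 1.0
for _ in range(50):
    x, p = kalman_step(x, p, y=5.0)
```

    A static Kalman filter would replace the gain computed at each step with its precomputed steady-state value; the sketch above recomputes it, which corresponds to the dynamic variant.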

  5. SOCP relaxation bounds for the optimal subset selection problem applied to robust linear regression

    Flores, Salvador

    2015-01-01

    This paper deals with the problem of finding the globally optimal subset of h elements from a larger set of n elements in d space dimensions so as to minimize a quadratic criterion, with a special emphasis on applications to computing the Least Trimmed Squares Estimator (LTSE) for robust regression. The computation of the LTSE is a challenging subset selection problem involving a nonlinear program with continuous and binary variables, linked in a highly nonlinear fashion. The selection of a ...

  6. Comparing Machine Learning Classifiers and Linear/Logistic Regression to Explore the Relationship between Hand Dimensions and Demographic Characteristics.

    Miguel-Hurtado, Oscar; Guest, Richard; Stevenage, Sarah V; Neil, Greg J; Black, Sue

    2016-01-01

    Understanding the relationship between physiological measurements from human subjects and their demographic data is important within both the biometric and forensic domains. In this paper we explore the relationship between measurements of the human hand and a range of demographic features. We assess the ability of linear regression and machine learning classifiers to predict demographics from hand features, thereby providing evidence on both the strength of relationship and the key features underpinning this relationship. Our results show that we are able to predict sex, height, weight and foot size accurately within various data-range bin sizes, with machine learning classification algorithms out-performing linear regression in most situations. In addition, we identify the features used to provide these relationships applicable across multiple applications.

  7. Development of a Multiple Linear Regression Model to Forecast Facility Electrical Consumption at an Air Force Base.

    1981-09-01

    corresponds to the same square footage that consumed the electrical energy. 3. The basic assumptions of multiple linear regression, as enumerated in...7. Data related to the sample of bases is assumed to be representative of bases in the population. Limitations Basic limitations on this research were... Ratemaking --Overview. Rand Report R-5894, Santa Monica CA, May 1977. Chatterjee, Samprit, and Bertram Price. Regression Analysis by Example. New York: John

  8. Mathematical considerations regarding the stability of the trace element systems by linear regressions

    Mihai, Maria; Popescu, I.V.

    2002-01-01

    In this paper we present a mathematical model that would describe the stability and instability conditions, respectively of the organs of human body assumed as a living cybernetic system with feedback. We tested the theoretical model on the following trace elements: Mn, Zn and As. The trace elements were determined from the nose-pharyngeal carcinoma. We utilise the linear approximation to describe the dependencies between the trace elements determined in the hair of the patient. We present the results graphically. (authors)

  9. Lattice Designs in Standard and Simple Implicit Multi-linear Regression

    Wooten, Rebecca D.

    2016-01-01

    Statisticians generally use ordinary least squares to minimize the random error in a subject response with respect to independent explanatory variables. However, Wooten illustrates how ordinary least squares can be used to minimize the random error in the system without defining a subject response. Using lattice design, Wooten shows that non-response analysis is a superior alternative rotation of the pyramidal relationship between random variables and parameter estimates in multi-linear r...

  10. Modeling daily soil temperature over diverse climate conditions in Iran—a comparison of multiple linear regression and support vector regression techniques

    Delbari, Masoomeh; Sharifazari, Salman; Mohammadi, Ehsan

    2018-02-01

    The knowledge of soil temperature at different depths is important for the agricultural industry and for understanding climate change. The aim of this study is to evaluate the performance of a support vector regression (SVR)-based model in estimating daily soil temperature at 10, 30 and 100 cm depth under different climate conditions over Iran. The obtained results were compared to those obtained from a more classical multiple linear regression (MLR) model. The correlation sensitivity for the input combinations and periodicity effect were also investigated. Climatic data used as inputs to the models were minimum and maximum air temperature, solar radiation, relative humidity, dew point, and the atmospheric pressure (reduced to sea level), collected from five synoptic stations, Kerman, Ahvaz, Tabriz, Saghez, and Rasht, located respectively in the hyper-arid, arid, semi-arid, Mediterranean, and hyper-humid climate conditions. According to the results, the performance of both MLR and SVR models was quite good at the surface layer, i.e., 10-cm depth. However, SVR performed better than MLR in estimating soil temperature at deeper layers, especially at 100 cm depth. Moreover, both models performed better in humid climate conditions than in arid and hyper-arid areas. Further, adding a periodicity component into the modeling process considerably improved the models' performance, especially in the case of SVR.

  11. Modeling the frequency of opposing left-turn conflicts at signalized intersections using generalized linear regression models.

    Zhang, Xin; Liu, Pan; Chen, Yuguang; Bai, Lu; Wang, Wei

    2014-01-01

    The primary objective of this study was to identify whether the frequency of traffic conflicts at signalized intersections can be modeled. The opposing left-turn conflicts were selected for the development of conflict predictive models. Using data collected at 30 approaches at 20 signalized intersections, the underlying distributions of the conflicts under different traffic conditions were examined. Different conflict-predictive models were developed to relate the frequency of opposing left-turn conflicts to various explanatory variables. The models considered include a linear regression model, a negative binomial model, and separate models developed for four traffic scenarios. The prediction performance of different models was compared. The frequency of traffic conflicts follows a negative binomial distribution. The linear regression model is not appropriate for the conflict frequency data. In addition, drivers behaved differently under different traffic conditions. Accordingly, the effects of conflicting traffic volumes on conflict frequency vary across different traffic conditions. The occurrences of traffic conflicts at signalized intersections can be modeled using generalized linear regression models. The use of conflict predictive models has potential to expand the uses of surrogate safety measures in safety estimation and evaluation.

  12. Improving ASTER GDEM Accuracy Using Land Use-Based Linear Regression Methods: A Case Study of Lianyungang, East China

    Xiaoyan Yang

    2018-04-01

    The Advanced Spaceborne Thermal-Emission and Reflection Radiometer Global Digital Elevation Model (ASTER GDEM) is important to a wide range of geographical and environmental studies. Its accuracy, to some extent associated with land-use types reflecting topography, vegetation coverage, and human activities, impacts the results and conclusions of these studies. In order to improve the accuracy of ASTER GDEM prior to its application, we investigated ASTER GDEM errors based on individual land-use types and proposed two linear regression calibration methods, one considering only land use-specific errors and the other considering the impact of both land-use and topography. Our calibration methods were tested on the coastal prefectural city of Lianyungang in eastern China. Results indicate that (1) ASTER GDEM is highly accurate for rice, wheat, grass and mining lands but less accurate for scenic, garden, wood and bare lands; (2) despite improvements in ASTER GDEM2 accuracy, multiple linear regression calibration requires more data (topography) and a relatively complex calibration process; (3) simple linear regression calibration proves a practicable and simplified means to systematically investigate and improve the impact of land-use on ASTER GDEM accuracy. Our method is applicable to areas with detailed land-use data based on highly accurate field-based point-elevation measurements.

  13. A study on direct determination of uranium in ore by analyzing γ-ray spectrum with dual linear regression

    Liu Chunkui

    1996-01-01

    The method introduced is based on different energy of γ-ray emitted from radionuclide in the uranium-radium decay series in ore. The pulse counting rates of two spectra bands, i.e. $N_1$ (55∼193 keV) and $N_2$ (260∼1500 keV), are measured by portable type HYX-3 400-channel γ-ray spectrometer. On the other side, the uranium content ($Q_U$) is obtained by chemical analysis of channel sampling. Then the regression coefficients ($b_0$, $b_1$, $b_2$) can be determined through dual linear regression by using $Q_U$ and $N_1$, $N_2$. The direct determination of uranium can be made with the regression equation $Q_U = b_0 + b_1 N_1 + b_2 N_2$.
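
    The dual linear regression step in this abstract amounts to an ordinary least-squares fit with two predictors. The sketch below solves the normal equations in plain Python; the function names and the counting-rate and uranium-content values are hypothetical and synthetic, not data from the paper.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def fit_dual_regression(n1, n2, q_u):
    """Least-squares estimate of (b0, b1, b2) in Q_U = b0 + b1*N1 + b2*N2."""
    rows = [[1.0, v1, v2] for v1, v2 in zip(n1, n2)]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, q_u)) for i in range(3)]
    return solve_linear_system(xtx, xty)

# Synthetic calibration set generated from known coefficients (illustration only).
n1 = [120.0, 150.0, 200.0, 260.0, 310.0]
n2 = [80.0, 95.0, 140.0, 150.0, 210.0]
q_u = [0.01 + 0.002 * v1 - 0.001 * v2 for v1, v2 in zip(n1, n2)]
b0, b1, b2 = fit_dual_regression(n1, n2, q_u)
```

    Once the coefficients are calibrated against chemically analysed samples, a new uranium content would be estimated as `b0 + b1 * N1 + b2 * N2` from the two measured counting rates.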

  14. Data processing for potentiometric precipitation titration of mixtures of isovalent ions by linear regression analysis

    Mar'yanov, B.M.; Shumar, S.V.; Gavrilenko, M.A.

    1994-01-01

    A method for the computer processing of the curves of potentiometric differential titration using the precipitation reactions is developed. This method is based on transformation of the titration curve into a line of multiphase regression, whose parameters determine the equivalence points and the solubility products of the formed precipitates. The computational algorithm is tested using experimental curves for the titration of solutions containing Hg(2) and Cd(2) by the solution of sodium diethyldithiocarbamate. The random errors (RSD) for the titration of 1 × 10⁻⁴ M solutions are in the range of 3-6%. 7 refs.; 2 figs.; 1 tab

  15. Single camera multi-view anthropometric measurement of human height and mid-upper arm circumference using linear regression.

    Liu, Yingying; Sowmya, Arcot; Khamis, Heba

    2018-01-01

    Manually measured anthropometric quantities are used in many applications including human malnutrition assessment. Training is required to collect anthropometric measurements manually, which is not ideal in resource-constrained environments. Photogrammetric methods have been gaining attention in recent years, due to the availability and affordability of digital cameras. The primary goal is to demonstrate that height and mid-upper arm circumference (MUAC), both indicators of malnutrition, can be accurately estimated by applying linear regression to distance measurements from photographs of participants taken from five views, and to determine the optimal view combinations. A secondary goal is to observe the effect on estimate error of two approaches which reduce complexity of the setup, computational requirements and the expertise required of the observer. Thirty-one participants (11 female, 20 male; 18-37 years) were photographed from five views. Distances were computed using both camera calibration and reference object techniques from manually annotated photos. To estimate height, linear regression was applied to the distances between the top of the participant's head and the floor, as well as the height of a bounding box enclosing the participant's silhouette, which eliminates the need to identify the floor. To estimate MUAC, linear regression was applied to the mid-upper arm width. Estimates were computed for all view combinations and performance was compared to other photogrammetric methods from the literature: the linear distance method for height, and shape models for MUAC. The mean absolute difference (MAD) between the linear regression estimates and manual measurements was smaller compared to other methods. For the optimal view combinations (smallest MAD), the technical error of measurement and coefficient of reliability also indicate the linear regression methods are more reliable. The optimal view combination was the front and side views. When estimating height by linear...

  16. Predicting the aquatic toxicity mode of action using logistic regression and linear discriminant analysis.

    Ren, Y Y; Zhou, L C; Yang, L; Liu, P Y; Zhao, B W; Liu, H X

    2016-09-01

    The paper highlights the use of the logistic regression (LR) method in the construction of acceptable statistically significant, robust and predictive models for the classification of chemicals according to their aquatic toxic modes of action. Essentials accounting for a reliable model were all considered carefully. The model predictors were selected by stepwise forward discriminant analysis (LDA) from a combined pool of experimental data and chemical structure-based descriptors calculated by the CODESSA and DRAGON software packages. Model predictive ability was validated both internally and externally. The applicability domain was checked by the leverage approach to verify prediction reliability. The obtained models are simple and easy to interpret. In general, LR performs much better than LDA and seems to be more attractive for the prediction of the more toxic compounds, i.e. compounds that exhibit excess toxicity versus non-polar narcotic compounds and more reactive compounds versus less reactive compounds. In addition, model fit and regression diagnostics were assessed through the influence plot, which reflects the hat-values, studentized residuals, and Cook's distance statistics of each sample. Overdispersion was also checked for the LR model. The relationships between the descriptors and the aquatic toxic behaviour of compounds are also discussed.

  17. 10 km running performance predicted by a multiple linear regression model with allometrically adjusted variables.

    Abad, Cesar C C; Barros, Ronaldo V; Bertuzzi, Romulo; Gagliardi, João F L; Lima-Silva, Adriano E; Lambert, Mike I; Pires, Flavio O

    2016-06-01

    The aim of this study was to verify the power of VO2max, peak treadmill running velocity (PTV), and running economy (RE), unadjusted or allometrically adjusted, in predicting 10 km running performance. Eighteen male endurance runners performed: 1) an incremental test to exhaustion to determine VO2max and PTV; 2) a constant submaximal run at 12 km·h(-1) on an outdoor track for RE determination; and 3) a 10 km running race. Unadjusted (VO2max, PTV and RE) and adjusted variables (VO2max(0.72), PTV(0.72) and RE(0.60)) were investigated through independent multiple regression models to predict 10 km running race time. There were no significant correlations between 10 km running time and either the adjusted or unadjusted VO2max. Significant correlations (p 0.84 and power > 0.88. The allometrically adjusted predictive model was composed of PTV(0.72) and RE(0.60) and explained 83% of the variance in 10 km running time with a standard error of the estimate (SEE) of 1.5 min. The unadjusted model composed of PTV alone accounted for 72% of the variance in 10 km running time (SEE of 1.9 min). Both regression models provided powerful estimates of 10 km running time; however, the unadjusted PTV may provide an uncomplicated estimation.

  18. Adaptive methods for flood forecasting using linear regression models in the upper basin of Senegal River

    Sambou, Soussou

    2004-01-01

    In flood forecasting, large basins are often treated as hydrological systems with multiple inputs and one output. The inputs are hydrological variables such as rainfall, runoff and physical characteristics of the basin; the output is runoff. Inputs can be related to the output using deterministic, conceptual, or stochastic models. Rainfall-runoff models generally lack accuracy. Models based on physical hydrological processes, whether deterministic or conceptual, demand large amounts of data and are also very complex. Stochastic multiple input-output models, which use only historical records of hydrological variables, particularly runoff, are very popular among hydrologists for flood forecasting in large river basins. The approach is applied to the Senegal River upstream of Bakel, where the river is formed by the main branch, the Bafing, and two tributaries, the Bakoye and the Faleme; the Bafing is regulated by the Manantaly Dam. A model with three inputs and one output has been used for flood forecasting at Bakel. The influence of the forecast lead time, and of the three inputs taken separately, in pairs, and all together, has been assessed using a dimensionless variance as the quality criterion. Discrepancies generally occur between model output and observations; to bring the model into better agreement with current observations, we have compared four parameter updating procedures (recursive least squares, Kalman filtering, the stochastic gradient method and an iterative method) and an AR error forecasting model. A combination of these model updating procedures has been used in real-time flood forecasting.(Author)
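
    Of the updating procedures named in this abstract, recursive least squares is the easiest to illustrate. The following is a minimal sketch and not the author's implementation: it assumes a linear model with three inputs and one output, runs on synthetic noise-free data, and the names `rls_update` and `run_demo`, the forgetting factor `lam` and the initial covariance scale are hypothetical choices made for illustration.

```python
def rls_update(theta, P, x, y, lam=1.0):
    """One recursive-least-squares step for the linear model y ~ theta . x.

    theta: current parameter estimates (list of floats)
    P:     current inverse-correlation matrix (symmetric list of lists)
    x, y:  new input vector and observed output
    lam:   forgetting factor in (0, 1]; 1.0 weights all past data equally
    """
    n = len(x)
    Px = [sum(P[i][j] * x[j] for j in range(n)) for i in range(n)]
    denom = lam + sum(x[i] * Px[i] for i in range(n))
    gain = [v / denom for v in Px]
    error = y - sum(theta[i] * x[i] for i in range(n))  # forecast error before update
    theta_new = [theta[i] + gain[i] * error for i in range(n)]
    # P is symmetric, so x^T P equals (P x)^T and the update stays symmetric.
    P_new = [[(P[i][j] - gain[i] * Px[j]) / lam for j in range(n)]
             for i in range(n)]
    return theta_new, P_new


def run_demo():
    """Recover assumed weights from synthetic three-input data (illustrative only)."""
    true_theta = [0.5, 0.3, 0.2]  # made-up weights, not values from the paper
    theta = [0.0, 0.0, 0.0]
    # Large initial P means "no prior knowledge" about the parameters.
    P = [[1e6 if i == j else 0.0 for j in range(3)] for i in range(3)]
    for t in range(20):
        x = [1.0 + t % 3, 2.0 + (t * t) % 5, 3.0 + t % 7]
        y = sum(w * v for w, v in zip(true_theta, x))
        theta, P = rls_update(theta, P, x, y)
    return theta
```

    Each call folds one new observation into the estimate without refitting on the full history, which is why this family of methods suits real-time use. Setting `lam` slightly below 1 would discount older observations.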

  19. Application of empirical mode decomposition with local linear quantile regression in financial time series forecasting.

    Jaber, Abobaker M; Ismail, Mohd Tahir; Altaher, Alsaidi M

    2014-01-01

    This paper mainly forecasts the daily closing price of stock markets. We propose a two-stage technique that combines the empirical mode decomposition (EMD) with nonparametric methods of local linear quantile (LLQ). We use the proposed technique, EMD-LLQ, to forecast two stock index time series. Detailed experiments are implemented for the proposed method, in which EMD-LLQ, EMD, and Holt-Winter methods are compared. The proposed EMD-LLQ model is determined to be superior to the EMD and Holt-Winter methods in predicting the stock closing prices.

  20. Impact of a flattening filter free linear accelerator on structural shielding design

    Jank, Julia; Kragl, Gabriele; Georg, Dietmar; Medical University of Vienna

    2014-01-01

    Purpose: The present study aimed to assess the effects of a flattening filter free medical accelerator on structural shielding demands of a treatment vault of a medical linear accelerator. We tried to answer the question, to what extent the required thickness of the shielding barriers can be reduced if instead of the standard flattened photon beams unflattened ones are used. Material and Methods: We chose both an experimental as well as a theoretical approach. On the one hand we measured photon dose rates at protected places outside the treatment room and compared the obtained results for flattened and unflattened beams. On the other hand we complied with international guidelines for adequate treatment vault design and calculated the shielding barriers according to the therein given specifications. Measurements were performed with an Elekta Precise™ linac providing nominal photon energies of 6 and 10 MV. This machine underwent already earlier some modifications in order to be able to operate both with and without a flattening filter. Photon dose rates were measured with a LB133-1 dose rate meter manufactured by Berthold. To calculate the thickness of shielding barriers we referred to the Austrian standard ÖNORM S 5216 and to the US American NCRP Report No. 151. Results: We determined a substantial photon dose rate reduction for all measurement points and photon energies. For unflattened 6 MV beams a reduction factor ranging from 1.4 to 1.8 was identified. The corresponding values for unflattened 10 MV beams were 2.1 and 3.2. The performed shielding calculations indicated the same tendency: For all relevant radiation components we found a reduction in shielding thickness when unflattened beams were used. The required thickness of primary barriers was reduced up to 8.0%, the thickness of secondary barriers up to 11.4%, respectively. Conclusions: For an adequate dimensioning of treatment vault shielding barriers it is by no means irrelevant if the

  1. Impact of a flattening filter free linear accelerator on structural shielding design.

    Jank, Julia; Kragl, Gabriele; Georg, Dietmar

    2014-03-01

    The present study aimed to assess the effects of a flattening filter free medical accelerator on structural shielding demands of a treatment vault of a medical linear accelerator. We tried to answer the question, to what extent the required thickness of the shielding barriers can be reduced if instead of the standard flattened photon beams unflattened ones are used. We chose both an experimental as well as a theoretical approach. On the one hand we measured photon dose rates at protected places outside the treatment room and compared the obtained results for flattened and unflattened beams. On the other hand we complied with international guidelines for adequate treatment vault design and calculated the shielding barriers according to the therein given specifications. Measurements were performed with an Elekta Precise™ linac providing nominal photon energies of 6 and 10 MV. This machine underwent already earlier some modifications in order to be able to operate both with and without a flattening filter. Photon dose rates were measured with a LB133-1 dose rate meter manufactured by Berthold. To calculate the thickness of shielding barriers we referred to the Austrian standard ÖNORM S 5216 and to the US American NCRP Report No. 151. We determined a substantial photon dose rate reduction for all measurement points and photon energies. For unflattened 6 MV beams a reduction factor ranging from 1.4 to 1.8 was identified. The corresponding values for unflattened 10 MV beams were 2.1 and 3.2. The performed shielding calculations indicated the same tendency: For all relevant radiation components we found a reduction in shielding thickness when unflattened beams were used. The required thickness of primary barriers was reduced up to 8.0%, the thickness of secondary barriers up to 11.4%, respectively. For an adequate dimensioning of treatment vault shielding barriers it is by no means irrelevant if the accommodated linac operates with or without a flattening filter. The

  2. Impact of a flattening filter free linear accelerator on structural shielding design

    Jank, Julia [Klinikum - Klagenfurt am Woerthersee (Austria). Inst. fuer Strahlentherapie und Radioonkologie; Kragl, Gabriele [Medical University of Vienna/AKH Vienna (Austria). Div. Medical Radiation Physics; Georg, Dietmar [Medical University of Vienna/AKH Vienna (Austria). Div. Medical Radiation Physics; Medical University of Vienna (Austria). Christian Doppler Lab. for Medical Radiation Research for Radiation Oncology

    2014-04-01

    Purpose: The present study aimed to assess the effects of a flattening filter free medical accelerator on structural shielding demands of a treatment vault of a medical linear accelerator. We tried to answer the question, to what extent the required thickness of the shielding barriers can be reduced if instead of the standard flattened photon beams unflattened ones are used. Material and Methods: We chose both an experimental as well as a theoretical approach. On the one hand we measured photon dose rates at protected places outside the treatment room and compared the obtained results for flattened and unflattened beams. On the other hand we complied with international guidelines for adequate treatment vault design and calculated the shielding barriers according to the therein given specifications. Measurements were performed with an Elekta Precise™ linac providing nominal photon energies of 6 and 10 MV. This machine underwent already earlier some modifications in order to be able to operate both with and without a flattening filter. Photon dose rates were measured with a LB133-1 dose rate meter manufactured by Berthold. To calculate the thickness of shielding barriers we referred to the Austrian standard ÖNORM S 5216 and to the US American NCRP Report No. 151. Results: We determined a substantial photon dose rate reduction for all measurement points and photon energies. For unflattened 6 MV beams a reduction factor ranging from 1.4 to 1.8 was identified. The corresponding values for unflattened 10 MV beams were 2.1 and 3.2. The performed shielding calculations indicated the same tendency: For all relevant radiation components we found a reduction in shielding thickness when unflattened beams were used. The required thickness of primary barriers was reduced up to 8.0%, the thickness of secondary barriers up to 11.4%, respectively. Conclusions: For an adequate dimensioning of treatment vault shielding barriers it is by no means irrelevant if the

  3. Comparison of Adaline and Multiple Linear Regression Methods for Rainfall Forecasting

    Sutawinaya, IP; Astawa, INGA; Hariyanti, NKD

    2018-01-01

    Heavy rainfall can cause disasters, so forecasts of rainfall intensity are needed. The main cause of flooding is high rainfall intensity, which pushes rivers beyond their capacity and floods the surrounding area. Rainfall is a dynamic factor, which makes it an interesting subject of study. Methods available to support rainfall forecasting range from artificial intelligence (AI) to statistics. In this research, we used Adaline as the AI method and regression as the statistical method. The method that yields the more accurate forecast is taken to be the better one for forecasting rainfall. By comparing the two methods, we aimed to determine which is the best for rainfall forecasting here.

  4. The Systematic Bias of Ingestible Core Temperature Sensors Requires a Correction by Linear Regression.

    Hunt, Andrew P; Bach, Aaron J E; Borg, David N; Costello, Joseph T; Stewart, Ian B

    2017-01-01

    An accurate measure of core body temperature is critical for monitoring individuals, groups and teams undertaking physical activity in situations of high heat stress or prolonged cold exposure. This study examined the range in systematic bias of ingestible temperature sensors compared to a certified and traceable reference thermometer. A total of 119 ingestible temperature sensors were immersed in a circulated water bath at five water temperatures (TEMP A: 35.12 ± 0.60°C, TEMP B: 37.33 ± 0.56°C, TEMP C: 39.48 ± 0.73°C, TEMP D: 41.58 ± 0.97°C, and TEMP E: 43.47 ± 1.07°C) along with a certified traceable reference thermometer. Thirteen sensors (10.9%) demonstrated a systematic bias > ±0.1°C, of which 4 (3.3%) were > ± 0.5°C. Limits of agreement (95%) indicated that systematic bias would likely fall in the range of -0.14 to 0.26°C, highlighting that it is possible for temperatures measured between sensors to differ by more than 0.4°C. The proportion of sensors with systematic bias > ±0.1°C (10.9%) confirms that ingestible temperature sensors require correction to ensure their accuracy. An individualized linear correction achieved a mean systematic bias of 0.00°C, and limits of agreement (95%) to 0.00-0.00°C, with 100% of sensors achieving ±0.1°C accuracy. Alternatively, a generalized linear function (Corrected Temperature (°C) = 1.00375 × Sensor Temperature (°C) - 0.205549), produced as the average slope and intercept of a sub-set of 51 sensors and excluding sensors with accuracy outside ±0.5°C, reduced the systematic bias to < ±0.1°C in 98.4% of the remaining sensors (n = 64). In conclusion, these data show that using an uncalibrated ingestible temperature sensor may provide inaccurate data that still appears to be statistically, physiologically, and clinically meaningful. Correction of sensor temperature to a reference thermometer by linear function eliminates this systematic bias (individualized functions) or ensures systematic bias is within ±0.1°C in 98% of the sensors (generalized function).
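
    The generalized correction quoted in this abstract is a single linear function, so it can be applied in a few lines. The sketch below uses the slope and intercept exactly as stated; the function name `correct_sensor_temperature` and the example reading are hypothetical, and an individualized correction would need a slope and intercept fitted separately for each sensor.

```python
# Coefficients of the generalized correction as reported in the abstract:
# corrected (°C) = 1.00375 * sensor (°C) - 0.205549
GENERALIZED_SLOPE = 1.00375
GENERALIZED_INTERCEPT = -0.205549


def correct_sensor_temperature(sensor_temp_c,
                               slope=GENERALIZED_SLOPE,
                               intercept=GENERALIZED_INTERCEPT):
    """Apply a linear correction to a raw ingestible-sensor reading in °C.

    Pass a sensor-specific slope and intercept to apply an individualized
    correction instead of the generalized one.
    """
    return slope * sensor_temp_c + intercept


# Example with a made-up raw reading of 37.00 °C.
corrected = correct_sensor_temperature(37.0)
```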

  5. The Systematic Bias of Ingestible Core Temperature Sensors Requires a Correction by Linear Regression

    Andrew P. Hunt

    2017-04-01

    Full Text Available An accurate measure of core body temperature is critical for monitoring individuals, groups and teams undertaking physical activity in situations of high heat stress or prolonged cold exposure. This study examined the range in systematic bias of ingestible temperature sensors compared to a certified and traceable reference thermometer. A total of 119 ingestible temperature sensors were immersed in a circulated water bath at five water temperatures (TEMP A: 35.12 ± 0.60°C, TEMP B: 37.33 ± 0.56°C, TEMP C: 39.48 ± 0.73°C, TEMP D: 41.58 ± 0.97°C, and TEMP E: 43.47 ± 1.07°C) along with a certified traceable reference thermometer. Thirteen sensors (10.9%) demonstrated a systematic bias > ±0.1°C, of which 4 (3.3%) were > ± 0.5°C. Limits of agreement (95%) indicated that systematic bias would likely fall in the range of −0.14 to 0.26°C, highlighting that it is possible for temperatures measured between sensors to differ by more than 0.4°C. The proportion of sensors with systematic bias > ±0.1°C (10.9%) confirms that ingestible temperature sensors require correction to ensure their accuracy. An individualized linear correction achieved a mean systematic bias of 0.00°C, and limits of agreement (95%) to 0.00–0.00°C, with 100% of sensors achieving ±0.1°C accuracy. Alternatively, a generalized linear function (Corrected Temperature (°C) = 1.00375 × Sensor Temperature (°C) − 0.205549), produced as the average slope and intercept of a sub-set of 51 sensors and excluding sensors with accuracy outside ±0.5°C, reduced the systematic bias to < ±0.1°C in 98.4% of the remaining sensors (n = 64). In conclusion, these data show that using an uncalibrated ingestible temperature sensor may provide inaccurate data that still appears to be statistically, physiologically, and clinically meaningful. Correction of sensor temperature to a reference thermometer by linear function eliminates this systematic bias (individualized functions) or ensures

  6. Regularized iterative integration combined with non-linear diffusion filtering for phase-contrast x-ray computed tomography.

    Burger, Karin; Koehler, Thomas; Chabior, Michael; Allner, Sebastian; Marschner, Mathias; Fehringer, Andreas; Willner, Marian; Pfeiffer, Franz; Noël, Peter

    2014-12-29

    Phase-contrast x-ray computed tomography has a high potential to become clinically implemented because of its complementarity to conventional absorption-contrast. In this study, we investigate noise-reducing but resolution-preserving analytical reconstruction methods to improve differential phase-contrast imaging. First, we apply the non-linear Perona-Malik filter to phase-contrast data before or after filtered backprojection reconstruction. Secondly, the Hilbert kernel is replaced by regularized iterative integration followed by ramp-filtered backprojection as used for absorption-contrast imaging. Combining the Perona-Malik filter with this integration algorithm successfully reveals relevant sample features, as quantitatively confirmed by significantly increased structural similarity indices and contrast-to-noise ratios. With this concept, phase-contrast imaging can be performed at considerably lower dose.
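
    The Perona-Malik filter mentioned in this abstract smooths noise while leaving strong edges largely intact. The following is a minimal one-dimensional sketch of the general technique, not the authors' reconstruction pipeline: the exponential diffusivity, the parameters `kappa`, `dt` and `n_iter`, and the test signal are all assumptions chosen for illustration.

```python
import math


def perona_malik_1d(signal, n_iter=50, kappa=0.3, dt=0.2):
    """Explicit Perona-Malik diffusion on a 1D signal with no-flux boundaries.

    kappa sets the gradient size above which diffusion is suppressed;
    dt <= 0.5 keeps this explicit 1D scheme stable.
    """
    u = [float(v) for v in signal]
    n = len(u)
    for _ in range(n_iter):
        # Flux between neighbours i and i+1, using the diffusivity
        # g(d) = exp(-(d / kappa)^2): near 1 for small differences (noise),
        # near 0 for large differences (edges).
        flux = []
        for i in range(n - 1):
            d = u[i + 1] - u[i]
            flux.append(math.exp(-((d / kappa) ** 2)) * d)
        new_u = []
        for i in range(n):
            right = flux[i] if i < n - 1 else 0.0  # nothing crosses the right end
            left = flux[i - 1] if i > 0 else 0.0   # nothing crosses the left end
            new_u.append(u[i] + dt * (right - left))
        u = new_u
    return u


# A noisy step: the plateaus are smoothed while the jump is preserved.
noisy_step = [0.0, 0.05, -0.05, 0.03, -0.02, 1.0, 1.04, 0.96, 1.02, 0.98]
smoothed = perona_malik_1d(noisy_step)
```

    A linear low-pass filter applied to the same signal would blur the jump between the two plateaus; the gradient-dependent diffusivity is what avoids that here.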

  7. Trace analysis of acids and bases by conductometric titration with multiparametric non-linear regression.

    Coelho, Lúcia H G; Gutz, Ivano G R

    2006-03-15

    A chemometric method for the analysis of conductometric titration data was introduced to extend its applicability to lower concentrations and more complex acid-base systems. Auxiliary pH measurements were made during the titration to assist the calculation of the distribution of protonatable species on the basis of known or guessed equilibrium constants. Conductivity values of each ionized or ionizable species possibly present in the sample were introduced in a general equation where the only unknown parameters were the total concentrations of (conjugated) bases and of strong electrolytes not involved in acid-base equilibria. All these concentrations were adjusted by a multiparametric nonlinear regression (NLR) method, based on the Levenberg-Marquardt algorithm. This first conductometric titration method with NLR analysis (CT-NLR) was successfully applied to simulated conductometric titration data and to synthetic samples with multiple components at concentrations as low as those found in rainwater (approximately 10 micromol L(-1)). It was possible to resolve and quantify mixtures containing a strong acid, formic acid, acetic acid, ammonium ion, bicarbonate and inert electrolyte with accuracy of 5% or better.

  8. Relationships between the structure of wheat gluten and ACE inhibitory activity of hydrolysate: stepwise multiple linear regression analysis.

    Zhang, Yanyan; Ma, Haile; Wang, Bei; Qu, Wenjuan; Wali, Asif; Zhou, Cunshan

    2016-08-01

    Ultrasound pretreatment of wheat gluten (WG) before enzymolysis can improve the angiotensin converting enzyme (ACE) inhibitory activity of the hydrolysates by altering the structure of substrate proteins. Establishment of a relationship between the structure of WG and ACE inhibitory activity of the hydrolysates to judge the end point of the ultrasonic pretreatment is vital. The results of stepwise multiple linear regression (MLR) showed that the contents of free sulfhydryl, α-helix, disulfide bond, surface hydrophobicity and random coil were significantly correlated to ACE inhibitory activity of the hydrolysate, with standard partial regression coefficients of 3.729, -0.676, -0.252, 0.022 and 0.156, respectively. The R(2) of this model was 0.970. External validation showed that the stepwise MLR model could well predict the ACE inhibitory activity of hydrolysate based on the content of free sulfhydryl, α-helix, disulfide bond, surface hydrophobicity and random coil of WG before hydrolysis. A stepwise multiple linear regression model describing the quantitative relationships between the structure of WG and the ACE inhibitory activity of the hydrolysates was established. This model can be used to predict the endpoint of the ultrasonic pretreatment. © 2015 Society of Chemical Industry.

  9. Bayesian quantile regression-based partially linear mixed-effects joint models for longitudinal data with multiple features.

    Zhang, Hanze; Huang, Yangxin; Wang, Wei; Chen, Henian; Langland-Orban, Barbara

    2017-01-01

    In longitudinal AIDS studies, it is of interest to investigate the relationship between HIV viral load and CD4 cell counts, as well as the complicated time effect. Most common models for analyzing such complex longitudinal data are based on mean regression, which fails to provide efficient estimates in the presence of outliers and/or heavy tails. Quantile regression-based partially linear mixed-effects models, a special case of semiparametric models enjoying benefits of both parametric and nonparametric models, have the flexibility to monitor the viral dynamics nonparametrically and detect the varying CD4 effects parametrically at different quantiles of viral load. Meanwhile, it is critical to consider various data features of repeated measurements, including left-censoring due to a limit of detection, covariate measurement error, and asymmetric distribution. In this research, we first establish Bayesian joint models that account for all these data features simultaneously in the framework of quantile regression-based partially linear mixed-effects models. The proposed models are applied to analyze the Multicenter AIDS Cohort Study (MACS) data. Simulation studies are also conducted to assess the performance of the proposed methods under different scenarios.

  10. Stimulated Emission Computed Tomography (NSECT) images enhancement using a linear filter in the frequency domain

    Viana, Rodrigo S.S.; Tardelli, Tiago C.; Yoriyaz, Helio, E-mail: hyoriyaz@ipen.b [Instituto de Pesquisas Energeticas e Nucleares (IPEN/CNEN-SP), Sao Paulo, SP (Brazil); Jackowski, Marcel P., E-mail: mjack@ime.usp.b [University of Sao Paulo (USP), SP (Brazil). Dept. of Computer Science

    2011-07-01

    In recent years, a new technique for in vivo spectrographic imaging of stable isotopes was presented as Neutron Stimulated Emission Computed Tomography (NSECT). In this technique, a fast neutron beam stimulates stable nuclei in a sample, which emit characteristic gamma radiation. The photon energy is unique and is used to identify the emitting nuclei. The emitted gamma energy spectra can be used for reconstruction of the target tissue image and for determination of the tissue elemental composition. Due to the stochastic nature of the photon emission process in irradiated tissue, one of the most suitable algorithms for tomographic reconstruction is the Expectation-Maximization (E-M) algorithm, since its formulation simultaneously considers the probabilities of photon emission and detection. However, a disadvantage of this algorithm is the introduction of noise in the reconstructed image as the number of iterations increases. This increase can be caused either by features of the algorithm itself or by the low sampling rate of projections used for tomographic reconstruction. In this work, a linear filter in the frequency domain was used in order to improve the quality of the reconstructed images. (author)

  11. Stimulated Emission Computed Tomography (NSECT) images enhancement using a linear filter in the frequency domain

    Viana, Rodrigo S.S.; Tardelli, Tiago C.; Yoriyaz, Helio; Jackowski, Marcel P.

    2011-01-01

    In recent years, a new technique for in vivo spectrographic imaging of stable isotopes was presented as Neutron Stimulated Emission Computed Tomography (NSECT). In this technique, a fast neutron beam stimulates stable nuclei in a sample, which emit characteristic gamma radiation. The photon energy is unique and is used to identify the emitting nuclei. The emitted gamma energy spectra can be used for reconstruction of the target tissue image and for determination of the tissue elemental composition. Due to the stochastic nature of the photon emission process in irradiated tissue, one of the most suitable algorithms for tomographic reconstruction is the Expectation-Maximization (E-M) algorithm, since its formulation simultaneously considers the probabilities of photon emission and detection. However, a disadvantage of this algorithm is the introduction of noise in the reconstructed image as the number of iterations increases. This increase can be caused either by features of the algorithm itself or by the low sampling rate of projections used for tomographic reconstruction. In this work, a linear filter in the frequency domain was used in order to improve the quality of the reconstructed images. (author)

  12. Spatial filtering self-velocimeter for vehicle application using a CMOS linear image sensor

    He, Xin; Zhou, Jian; Nie, Xiaoming; Long, Xingwu

    2015-03-01

    The idea of using a spatial filtering velocimeter (SFV) to measure the velocity of a vehicle for an inertial navigation system is put forward. The presented SFV is based on a CMOS linear image sensor with a high-speed data rate, large pixel size, and built-in timing generator. These advantages make the image sensor suitable to measure vehicle velocity. The power spectrum of the output signal is obtained by fast Fourier transform and is corrected by a frequency spectrum correction algorithm. This velocimeter was used to measure the velocity of a conveyor belt driven by a rotary table and the measurement uncertainty is ~0.54%. Furthermore, it was also installed on a vehicle together with a laser Doppler velocimeter (LDV) to measure self-velocity. The measurement result of the designed SFV is compared with that of the LDV. It is shown that the measurement result of the SFV is coincident with that of the LDV. Therefore, the designed SFV is suitable for a vehicle self-contained inertial navigation system.

  13. Static Hyperspectral Fluorescence Imaging of Viscous Materials Based on a Linear Variable Filter Spectrometer

    Alexander W. Koch

    2013-09-01

    Full Text Available This paper presents a low-cost hyperspectral measurement setup in a new application based on fluorescence detection in the visible (Vis) wavelength range. The aim of the setup is to take hyperspectral fluorescence images of viscous materials. Based on these images, fluorescent and non-fluorescent impurities in the viscous materials can be detected. For the illumination of the measurement object, a narrow-band high-power light-emitting diode (LED) with a center wavelength of 370 nm was used. The low-cost acquisition unit for the imaging consists of a linear variable filter (LVF) and a complementary metal oxide semiconductor (CMOS) 2D sensor array. The translucent wavelength range of the LVF is from 400 nm to 700 nm. For the confirmation of the concept, static measurements of fluorescent viscous materials with a non-fluorescent impurity have been performed and analyzed. With the presented setup, measurement surfaces in the micrometer range can be provided. The measurable minimum particle size of the impurities is in the nanometer range. The recording rate for the measurements depends on the exposure time of the used CMOS 2D sensor array and has been found to be in the microsecond range.

  14. Implementasi Data Mining Estimasi Ketersediaan Lahan Pembuangan Sampah menggunakan Algoritma Simple Linear Regression

    Robi Yanto

    2018-04-01

    Full Text Available High levels of consumption by the public are directly proportional to rising waste production. One problem behind high waste production is low public awareness of waste management. This is a problem faced in large cities. Waste has a negative impact on existing natural conditions, namely air, water and soil pollution, which makes the environment unhealthy. Waste management activities, through outreach on the 3R program (Reduce, Reuse, Recycle), have had a maximal impact on public awareness of the importance of a healthy environment. Population growth in turn increases waste production, so sufficient landfill land is needed over the long term. To address this problem, data on the estimated long-term availability of landfill land were analyzed using data mining techniques. The data mining analysis using the simple linear regression algorithm, taking into account population growth of 201484 people from 2018 to 2025, shows that the increase in waste from 2018 to 2025 is 36,052.326 tons. Consequently, of the 30000 m2 of land, only 5,965.1 m2 of landfill land remains available through 2025.
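
    The simple linear regression named in this record fits a straight line through paired observations by ordinary least squares. The sketch below shows that calculation in general form; the function names and the population and waste figures are hypothetical placeholders, not data from the paper.

```python
def fit_simple_linear_regression(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(intercept, slope, x):
    """Evaluate the fitted line at x."""
    return intercept + slope * x


# Placeholder data: population (thousands) against waste (thousand tons).
population = [100.0, 110.0, 120.0, 130.0]
waste = [20.0, 22.0, 24.0, 26.0]
intercept, slope = fit_simple_linear_regression(population, waste)
forecast = predict(intercept, slope, 150.0)  # waste expected at 150 thousand people
```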

  15. A componential model of human interaction with graphs: 1. Linear regression modeling

    Gillan, Douglas J.; Lewis, Robert

    1994-01-01

    Task analyses served as the basis for developing the Mixed Arithmetic-Perceptual (MA-P) model, which proposes (1) that people interacting with common graphs to answer common questions apply a set of component processes (searching for indicators, encoding the value of indicators, performing arithmetic operations on the values, making spatial comparisons among indicators, and responding); and (2) that the type of graph and user's task determine the combination and order of the components applied (i.e., the processing steps). Two experiments investigated the prediction that response time will be linearly related to the number of processing steps according to the MA-P model. Subjects used line graphs, scatter plots, and stacked bar graphs to answer comparison questions and questions requiring arithmetic calculations. A one-parameter version of the model (with equal weights for all components) and a two-parameter version (with different weights for arithmetic and nonarithmetic processes) accounted for 76%-85% of individual subjects' variance in response time and 61%-68% of the variance taken across all subjects. The discussion addresses possible modifications in the MA-P model, alternative models, and design implications from the MA-P model.

  16. Effects of measurement errors on psychometric measurements in ergonomics studies: Implications for correlations, ANOVA, linear regression, factor analysis, and linear discriminant analysis.

    Liu, Yan; Salvendy, Gavriel

    2009-05-01

    This paper aims to demonstrate the effects of measurement errors on psychometric measurements in ergonomics studies. A variety of sources can cause random measurement errors in ergonomics studies and these errors can distort virtually every statistic computed and lead investigators to erroneous conclusions. The effects of measurement errors on the five most widely used statistical analysis tools have been discussed and illustrated: correlation; ANOVA; linear regression; factor analysis; linear discriminant analysis. It has been shown that measurement errors can greatly attenuate correlations between variables, reduce the statistical power of ANOVA, distort (overestimate, underestimate or even change the sign of) regression coefficients, underrate the explanation contributions of the most important factors in factor analysis and depreciate the significance of the discriminant function and the discrimination abilities of individual variables in discriminant analysis. The discussions will be restricted to subjective scales and survey methods and their reliability estimates. Other methods applied in ergonomics research, such as physical and electrophysiological measurements and chemical and biomedical analysis methods, also have issues of measurement errors, but they are beyond the scope of this paper. As there has been increasing interest in the development and testing of theories in ergonomics research, it has become very important for ergonomics researchers to understand the effects of measurement errors on their experiment results, which the authors believe is very critical to research progress in theory development and cumulative knowledge in the ergonomics field.
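
    The attenuation of correlations described in this abstract can be made concrete with the classical test theory relation, in which the observed correlation equals the true correlation multiplied by the square root of the product of the two scale reliabilities. The abstract does not state this formula, so the sketch below rests on that textbook assumption (random, mutually uncorrelated errors); the function names and example values are hypothetical.

```python
import math


def _check_reliability(value):
    if not 0.0 < value <= 1.0:
        raise ValueError("reliability must lie in (0, 1]")


def attenuated_correlation(true_r, reliability_x, reliability_y):
    """Correlation expected between two error-laden measures."""
    _check_reliability(reliability_x)
    _check_reliability(reliability_y)
    return true_r * math.sqrt(reliability_x * reliability_y)


def disattenuated_correlation(observed_r, reliability_x, reliability_y):
    """Estimate of the error-free correlation from an observed one."""
    _check_reliability(reliability_x)
    _check_reliability(reliability_y)
    return observed_r / math.sqrt(reliability_x * reliability_y)


# Two scales with reliability 0.8 each and a true correlation of 0.6
# would be expected to show an observed correlation of about 0.48.
observed = attenuated_correlation(0.6, 0.8, 0.8)
```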

  17. Genomic-Enabled Prediction Based on Molecular Markers and Pedigree Using the Bayesian Linear Regression Package in R

    Paulino Pérez

    2010-09-01

    Full Text Available The availability of dense molecular markers has made possible the use of genomic selection in plant and animal breeding. However, models for genomic selection pose several computational and statistical challenges and require specialized computer programs, not always available to the end user and not yet implemented in standard statistical software. The R package BLR (Bayesian Linear Regression) implements several statistical procedures (e.g., Bayesian Ridge Regression, Bayesian LASSO) in a unified framework that allows including marker genotypes and pedigree data jointly. This article describes the classes of models implemented in the BLR package and illustrates their use through examples. Some challenges faced when applying genomic-enabled selection, such as model choice, evaluation of predictive ability through cross-validation, and choice of hyper-parameters, are also addressed.

  18. pKa prediction for acidic phosphorus-containing compounds using multiple linear regression with computational descriptors.

    Yu, Donghai; Du, Ruobing; Xiao, Ji-Chang

    2016-07-05

    Ninety-six acidic phosphorus-containing molecules with pKa 1.88 to 6.26 were collected and divided into training and test sets by random sampling. Structural parameters were obtained by density functional theory calculation of the molecules. The relationship between the experimental pKa values and structural parameters was obtained by multiple linear regression fitting for the training set, and tested with the test set; the R² values were 0.974 and 0.966 for the training and test sets, respectively. This regression equation, which quantitatively describes the influence of structural parameters on pKa and can be used to predict pKa values of similar structures, is significant for the design of new acidic phosphorus-containing extractants. © 2016 Wiley Periodicals, Inc.

  19. Is it the intervention or the students? using linear regression to control for student characteristics in undergraduate STEM education research.

    Theobald, Roddy; Freeman, Scott

    2014-01-01

    Although researchers in undergraduate science, technology, engineering, and mathematics education are currently using several methods to analyze learning gains from pre- and posttest data, the most commonly used approaches have significant shortcomings. Chief among these is the inability to distinguish whether differences in learning gains are due to the effect of an instructional intervention or to differences in student characteristics when students cannot be assigned to control and treatment groups at random. Using pre- and posttest scores from an introductory biology course, we illustrate how the methods currently in wide use can lead to erroneous conclusions, and how multiple linear regression offers an effective framework for distinguishing the impact of an instructional intervention from the impact of student characteristics on test score gains. In general, we recommend that researchers always use student-level regression models that control for possible differences in student ability and preparation to estimate the effect of any nonrandomized instructional intervention on student performance.
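
    The contrast between comparing raw gains and fitting a student-level regression can be sketched as follows. The data are synthetic and noise-free, generated from an assumed model (posttest = 10 + 5 × treatment + 0.8 × pretest) in which the treated section happens to start with higher pretest scores; none of these numbers come from the study.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

# Hypothetical pretest scores; the treated section starts out stronger.
control_pre = [40.0, 50.0, 60.0, 70.0]
treated_pre = [60.0, 70.0, 80.0, 90.0]
students = [(p, 0.0) for p in control_pre] + [(p, 1.0) for p in treated_pre]

# Assumed data-generating model with a true treatment effect of 5 points.
post = [10.0 + 5.0 * t + 0.8 * p for p, t in students]

# Naive comparison: difference in mean raw gain between the two sections.
gains = [po - p for po, (p, t) in zip(post, students)]
treated_gain = sum(g for g, (p, t) in zip(gains, students) if t == 1.0) / len(treated_pre)
control_gain = sum(g for g, (p, t) in zip(gains, students) if t == 0.0) / len(control_pre)
naive_effect = treated_gain - control_gain

# Regression: posttest on an intercept, the treatment indicator, and the pretest score.
design = [[1.0, t, p] for p, t in students]
intercept, treatment_effect, pretest_slope = ols(design, post)
print(f"naive gain difference = {naive_effect:.2f}, "
      f"regression-adjusted effect = {treatment_effect:.2f}")
```

    In this constructed example the raw-gain comparison understates the effect because stronger students have less room to gain; with other data it could just as well overstate it.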

  20. Two-Stage Method Based on Local Polynomial Fitting for a Linear Heteroscedastic Regression Model and Its Application in Economics

    Liyun Su

    2012-01-01

    Full Text Available We extend local polynomial fitting to the linear heteroscedastic regression model. First, local polynomial fitting is applied to estimate the heteroscedastic function; then the coefficients of the regression model are obtained by the generalized least squares method. One noteworthy feature of our approach is that we avoid testing for heteroscedasticity by improving the traditional two-stage method. Because local polynomial estimation is a nonparametric technique, we do not need to know the form of the heteroscedastic function, and can therefore improve the estimation precision when that function is unknown. Furthermore, we focus on comparison of parameters and reach an optimal fitting. We also verify the asymptotic normality of the parameters based on numerical simulations. Finally, the approach is applied to a case in economics, which indicates that our method is effective in finite-sample situations.
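
    A minimal sketch of the two-stage idea follows, under simplifying assumptions that are not taken from the paper: a single predictor, a Gaussian kernel with a hand-picked bandwidth, and a degree-0 local fit (a kernel-weighted average of squared OLS residuals) standing in for the higher-order local polynomial estimator. The simulated data and all names are hypothetical.

```python
import math
import random

def weighted_line(x, y, w):
    """Weighted least-squares fit of y = b0 + b1 * x; returns (b0, b1)."""
    sw = sum(w)
    xm = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ym = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - xm) * (yi - ym) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xm) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return ym - slope * xm, slope

def kernel_variance(x, sq_resid, bandwidth):
    """Stage 1: smooth squared residuals with a Gaussian kernel (degree-0 local fit)."""
    estimates = []
    for x0 in x:
        k = [math.exp(-0.5 * ((xi - x0) / bandwidth) ** 2) for xi in x]
        estimates.append(sum(ki * ri for ki, ri in zip(k, sq_resid)) / sum(k))
    return estimates

# Simulated data: y = 2 + 3x + error, with error standard deviation equal to x.
rng = random.Random(1)
x = [0.1 + 1.9 * i / 199 for i in range(200)]
y = [2.0 + 3.0 * xi + xi * rng.gauss(0.0, 1.0) for xi in x]

# Preliminary OLS fit (all weights equal) to obtain residuals.
b0_ols, b1_ols = weighted_line(x, y, [1.0] * len(x))
sq_resid = [(yi - b0_ols - b1_ols * xi) ** 2 for xi, yi in zip(x, y)]

# Stage 1: nonparametric estimate of the variance function.
var_hat = kernel_variance(x, sq_resid, bandwidth=0.2)

# Stage 2: generalized (here, weighted) least squares with weights 1 / variance.
weights = [1.0 / v for v in var_hat]
b0_gls, b1_gls = weighted_line(x, y, weights)
print(f"OLS slope = {b1_ols:.3f}, two-stage slope = {b1_gls:.3f}")
```

    The bandwidth of 0.2 is arbitrary; in practice it would be chosen by cross-validation or a plug-in rule.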

  1. Comparison of multiple linear regression and artificial neural network in developing the objective functions of the orthopaedic screws.

    Hsu, Ching-Chi; Lin, Jinn; Chao, Ching-Kong

    2011-12-01

    Optimizing the orthopaedic screws can greatly improve their biomechanical performances. However, a methodical design optimization approach requires a long time to search the best design. Thus, the surrogate objective functions of the orthopaedic screws should be accurately developed. To our knowledge, there is no study to evaluate the strengths and limitations of the surrogate methods in developing the objective functions of the orthopaedic screws. Three-dimensional finite element models for both the tibial locking screws and the spinal pedicle screws were constructed and analyzed. Then, the learning data were prepared according to the arrangement of the Taguchi orthogonal array, and the verification data were selected with use of a randomized selection. Finally, the surrogate objective functions were developed by using either the multiple linear regression or the artificial neural network. The applicability and accuracy of those surrogate methods were evaluated and discussed. The multiple linear regression method could successfully construct the objective function of the tibial locking screws, but it failed to develop the objective function of the spinal pedicle screws. The artificial neural network method showed a greater capacity of prediction in developing the objective functions for the tibial locking screws and the spinal pedicle screws than the multiple linear regression method. The artificial neural network method may be a useful option for developing the objective functions of the orthopaedic screws with a greater structural complexity. The surrogate objective functions of the orthopaedic screws could effectively decrease the time and effort required for the design optimization process. Copyright © 2010 Elsevier Ireland Ltd. All rights reserved.

  2. Construction of multiple linear regression models using blood biomarkers for selecting against abdominal fat traits in broilers.

    Dong, J Q; Zhang, X Y; Wang, S Z; Jiang, X F; Zhang, K; Ma, G W; Wu, M Q; Li, H; Zhang, H

    2018-01-01

    Plasma very low-density lipoprotein (VLDL) can be used to select for low body fat or abdominal fat (AF) in broilers, but its correlation with AF is limited. We investigated whether any other biochemical indicator can be used in combination with VLDL for a better selective effect. Nineteen plasma biochemical indicators were measured in male chickens from the Northeast Agricultural University broiler lines divergently selected for AF content (NEAUHLF) in the fed state at 46 and 48 d of age. The average concentration of every parameter for the 2 d was used for statistical analysis. Levels of these 19 plasma biochemical parameters were compared between the lean and fat lines. The phenotypic correlations between these plasma biochemical indicators and AF traits were analyzed. Then, multiple linear regression models were constructed to select the best model for selecting against AF content, and the heritabilities of the plasma indicators contained in the best models were estimated. The results showed that 11 plasma biochemical indicators (triglycerides, total bile acid, total protein, globulin, albumin/globulin, aspartate transaminase, alanine transaminase, gamma-glutamyl transpeptidase, uric acid, creatinine, and VLDL) differed significantly between the lean and fat lines (P linear regression models based on albumin/globulin, VLDL, triglycerides, globulin, total bile acid, and uric acid, had higher R² (0.73) than the model based only on VLDL (0.21). The plasma parameters included in the best models had moderate heritability estimates (0.21 ≤ h² ≤ 0.43). These results indicate that these multiple linear regression models can be used to select for lean broiler chickens. © 2017 Poultry Science Association Inc.

  3. SU-G-BRA-08: Diaphragm Motion Tracking Based On KV CBCT Projections with a Constrained Linear Regression Optimization

    Wei, J [City College of New York, New York, NY (United States); Chao, M [The Mount Sinai Medical Center, New York, NY (United States)

    2016-06-15

    Purpose: To develop a novel strategy to extract the respiratory motion of the thoracic diaphragm from kilovoltage cone beam computed tomography (CBCT) projections by a constrained linear regression optimization technique. Methods: A parabolic function was identified as the geometric model and was employed to fit the shape of the diaphragm on the CBCT projections. The search was initialized by five manually placed seeds on a pre-selected projection image. Temporal redundancies (the enabling phenomenon in video compression and encoding techniques) inherent in the dynamic properties of the diaphragm motion were integrated with the geometrical shape of the diaphragm boundary and the associated algebraic constraint, which significantly reduced the search space of viable parabolic parameters; these parameters can be effectively optimized by a constrained linear regression approach on the subsequent projections. The algebraic constraints stipulating the kinetic range of the motion and the spatial constraint preventing any unphysical deviations made it possible to obtain the optimal contour of the diaphragm with minimal initialization. The algorithm was assessed with a fluoroscopic movie acquired at a fixed anterior-posterior direction and kilovoltage CBCT projection image sets from four lung and two liver patients. The automatic tracing by the proposed algorithm and manual tracking by a human operator were compared in both space and frequency domains. Results: The error between the estimated and manual detections for the fluoroscopic movie was 0.54mm with standard deviation (SD) of 0.45mm, while the average error for the CBCT projections was 0.79mm with SD of 0.64mm for all enrolled patients. The submillimeter accuracy outcome exhibits the promise of the proposed constrained linear regression approach to track the diaphragm motion on rotational projection images. Conclusion: The new algorithm will provide a potential solution to rendering diaphragm motion and ultimately

  4. SU-G-BRA-08: Diaphragm Motion Tracking Based On KV CBCT Projections with a Constrained Linear Regression Optimization

    Wei, J; Chao, M

    2016-01-01

    Purpose: To develop a novel strategy to extract the respiratory motion of the thoracic diaphragm from kilovoltage cone beam computed tomography (CBCT) projections by a constrained linear regression optimization technique. Methods: A parabolic function was identified as the geometric model and was employed to fit the shape of the diaphragm on the CBCT projections. The search was initialized by five manually placed seeds on a pre-selected projection image. Temporal redundancies (the enabling phenomenon in video compression and encoding techniques) inherent in the dynamic properties of the diaphragm motion were integrated with the geometrical shape of the diaphragm boundary and the associated algebraic constraint, which significantly reduced the search space of viable parabolic parameters; these parameters can be effectively optimized by a constrained linear regression approach on the subsequent projections. The algebraic constraints stipulating the kinetic range of the motion and the spatial constraint preventing any unphysical deviations made it possible to obtain the optimal contour of the diaphragm with minimal initialization. The algorithm was assessed with a fluoroscopic movie acquired at a fixed anterior-posterior direction and kilovoltage CBCT projection image sets from four lung and two liver patients. The automatic tracing by the proposed algorithm and manual tracking by a human operator were compared in both space and frequency domains. Results: The error between the estimated and manual detections for the fluoroscopic movie was 0.54mm with standard deviation (SD) of 0.45mm, while the average error for the CBCT projections was 0.79mm with SD of 0.64mm for all enrolled patients. The submillimeter accuracy outcome exhibits the promise of the proposed constrained linear regression approach to track the diaphragm motion on rotational projection images. Conclusion: The new algorithm will provide a potential solution to rendering diaphragm motion and ultimately

  5. Direct-on-Filter α-Quartz Estimation in Respirable Coal Mine Dust Using Transmission Fourier Transform Infrared Spectrometry and Partial Least Squares Regression.

    Miller, Arthur L; Weakley, Andrew Todd; Griffiths, Peter R; Cauda, Emanuele G; Bayman, Sean

    2017-05-01

    In order to help reduce silicosis in miners, the National Institute for Occupational Safety and Health (NIOSH) is developing field-portable methods for measuring airborne respirable crystalline silica (RCS), specifically the polymorph α-quartz, in mine dusts. In this study we demonstrate the feasibility of end-of-shift measurement of α-quartz using a direct-on-filter (DoF) method to analyze coal mine dust samples deposited onto polyvinyl chloride filters. The DoF method is potentially amenable to on-site analyses, but deviates from the current regulatory determination of RCS for coal mines by eliminating two sample preparation steps: ashing the sampling filter and redepositing the ash prior to quantification by Fourier transform infrared (FT-IR) spectrometry. In this study, the FT-IR spectra of 66 coal dust samples from active mines were used, and the RCS was quantified by using: (1) an ordinary least squares (OLS) calibration approach that utilizes standard silica material as done in the Mine Safety and Health Administration's P7 method; and (2) a partial least squares (PLS) regression approach. Both were capable of accounting for kaolinite, which can confound the IR analysis of silica. The OLS method utilized analytical standards for silica calibration and kaolin correction, resulting in a good linear correlation with P7 results and minimal bias but with the accuracy limited by the presence of kaolinite. The PLS approach also produced predictions well-correlated to the P7 method, as well as better accuracy in RCS prediction, and no bias due to variable kaolinite mass. Besides decreased sensitivity to mineral or substrate confounders, PLS has the advantage that the analyst is not required to correct for the presence of kaolinite or background interferences related to the substrate, making the method potentially viable for automated RCS prediction in the field. This study demonstrated the efficacy of FT-IR transmission spectrometry for silica determination in

  6. An Investigation of the Fit of Linear Regression Models to Data from an SAT® Validity Study. Research Report 2011-3

    Kobrin, Jennifer L.; Sinharay, Sandip; Haberman, Shelby J.; Chajewski, Michael

    2011-01-01

    This study examined the adequacy of a multiple linear regression model for predicting first-year college grade point average (FYGPA) using SAT® scores and high school grade point average (HSGPA). A variety of techniques, both graphical and statistical, were used to examine if it is possible to improve on the linear regression model. The results…

  7. U.S. Army Armament Research, Development and Engineering Center Grain Evaluation Software to Numerically Predict Linear Burn Regression for Solid Propellant Grain Geometries

    2017-10-01

    Brian... Distribution is unlimited. U.S. Army Armament Research, Development and Engineering Center, Munitions Engineering Technology Center, Picatinny...

  8. Relationship between rice yield and climate variables in southwest Nigeria using multiple linear regression and support vector machine analysis

    Oguntunde, Philip G.; Lischeid, Gunnar; Dietrich, Ottfried

    2018-03-01

    This study examines the variations of climate variables and rice yield and quantifies the relationships among them using multiple linear regression, principal component analysis, and support vector machine (SVM) analysis in southwest Nigeria. The climate and yield data cover a period of 36 years between 1980 and 2015. Similar to the observed decrease ( P 1 and explained 83.1% of the total variance of predictor variables. The SVM regression function using the scores of the first principal component explained about 75% of the variance in rice yield data and linear regression about 64%. SVM regression between annual solar radiation values and yield explained 67% of the variance. Only the first component of the principal component analysis (PCA) exhibited a clear long-term trend and sometimes short-term variance similar to that of rice yield. Short-term fluctuations of the scores of PC1 are closely coupled to those of rice yield during the 1986-1993 and the 2006-2013 periods, thereby revealing the inter-annual sensitivity of rice production to climate variability. Solar radiation stands out as the climate variable of highest influence on rice yield, and the influence was especially strong during monsoon and post-monsoon periods, which correspond to the vegetative, booting, flowering, and grain filling stages in the study area. The outcome is expected to provide a more in-depth, region-specific climate-rice linkage for screening of better cultivars that can positively respond to future climate fluctuations, as well as information that may help optimize planting dates for improved radiation use efficiency in the study area.

  9. Relationship between rice yield and climate variables in southwest Nigeria using multiple linear regression and support vector machine analysis.

    Oguntunde, Philip G; Lischeid, Gunnar; Dietrich, Ottfried

    2018-03-01

    This study examines the variations of climate variables and rice yield and quantifies the relationships among them using multiple linear regression, principal component analysis, and support vector machine (SVM) analysis in southwest Nigeria. The climate and yield data cover a period of 36 years between 1980 and 2015. Similar to the observed decrease (P  1 and explained 83.1% of the total variance of predictor variables. The SVM regression function using the scores of the first principal component explained about 75% of the variance in rice yield data and linear regression about 64%. SVM regression between annual solar radiation values and yield explained 67% of the variance. Only the first component of the principal component analysis (PCA) exhibited a clear long-term trend and sometimes short-term variance similar to that of rice yield. Short-term fluctuations of the scores of PC1 are closely coupled to those of rice yield during the 1986-1993 and the 2006-2013 periods, thereby revealing the inter-annual sensitivity of rice production to climate variability. Solar radiation stands out as the climate variable of highest influence on rice yield, and the influence was especially strong during monsoon and post-monsoon periods, which correspond to the vegetative, booting, flowering, and grain filling stages in the study area. The outcome is expected to provide a more in-depth, region-specific climate-rice linkage for screening of better cultivars that can positively respond to future climate fluctuations, as well as information that may help optimize planting dates for improved radiation use efficiency in the study area.

  10. Precision Interval Estimation of the Response Surface by Means of an Integrated Algorithm of Neural Network and Linear Regression

    Lo, Ching F.

    1999-01-01

    The integration of Radial Basis Function Networks and Back Propagation Neural Networks with Multiple Linear Regression has been accomplished to map nonlinear response surfaces over a wide range of independent variables in the process of the Modern Design of Experiments. The integrated method is capable of estimating precision intervals, including confidence and prediction intervals. The power of the method has been demonstrated by applying it to a set of wind tunnel test data in the construction of a response surface and the estimation of precision intervals.

  11. Fragility estimation for seismically isolated nuclear structures by high confidence low probability of failure values and bi-linear regression

    Carausu, A.

    1996-01-01

    A method for the fragility estimation of seismically isolated nuclear power plant structures is proposed. The relationship between the ground motion intensity parameter (e.g. peak ground velocity or peak ground acceleration) and the response of isolated structures is expressed in terms of a bi-linear regression line, whose coefficients are estimated by the least-squares method from available data on seismic input and structural response. The notion of a high confidence low probability of failure (HCLPF) value is also used for deriving compound fragility curves for coupled subsystems. (orig.)

  12. A multiple linear regression analysis of hot corrosion attack on a series of nickel base turbine alloys

    Barrett, C. A.

    1985-01-01

    Multiple linear regression analysis was used to determine an equation for estimating hot corrosion attack for a series of Ni-base cast turbine alloys. The U transform (i.e., $U = \arcsin\sqrt{\%A/100}$) was shown to give the best estimate of the dependent variable, y. A complete second-degree equation is described for the "centered" weight chemistries for the elements Cr, Al, Ti, Mo, W, Cb, Ta, and Co. In addition, linear terms for the minor elements C, B, and Zr were added for a basic 47-term equation. The best reduced equation was determined by the stepwise selection method with essentially 13 terms. The Cr term was found to be the most important, accounting for 60 percent of the explained variability in hot corrosion attack.

  13. Effective Surfactants Blend Concentration Determination for O/W Emulsion Stabilization by Two Nonionic Surfactants by Simple Linear Regression.

    Hassan, A K

    2015-01-01

    In this work, O/W emulsion sets were prepared using different concentrations of two nonionic surfactants. The two surfactants, Tween 80 (HLB = 15.0) and Span 80 (HLB = 4.3), were used in fixed proportions of 0.55:0.45, respectively, so the HLB value of the surfactant blend was fixed at 10.185. The surfactant blend concentration ranged from 3% to 19%. For each O/W emulsion set, the conductivity was measured at room temperature (25±2 °C) and at 40, 50, 60, 70 and 80 °C. Applying simple linear regression (least squares) to the temperature-conductivity data determines the effective surfactant blend concentration required for preparing the most stable O/W emulsion. These results were confirmed by physical stability centrifugation testing and phase inversion temperature range measurements. The results indicated that the most stable O/W emulsion has the strongest direct linear relationship between temperature and conductivity, and that this relationship is linear up to 80 °C. This work shows that the most stable O/W emulsion can be identified by finding the maximum R² value when simple linear regression is applied to the temperature-conductivity data up to 80 °C; in addition, the true maximum slope is given by the equation with the maximum R² value. Because conditions change in a more complex formulation, the method was verified by applying it to a more complex formulation, a 2% O/W miconazole nitrate cream, and the results indicate its reproducibility.
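
    The blend HLB arithmetic and the maximum-R² selection rule can be sketched as below. The HLB values and proportions are those quoted in the abstract; the conductivity readings and the three concentrations shown are hypothetical placeholders, not data from the study.

```python
def r_squared(x, y):
    """Coefficient of determination for a simple linear regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)

# Blend HLB as the proportion-weighted mean of the two surfactant HLB values.
hlb_blend = 0.55 * 15.0 + 0.45 * 4.3
print(f"blend HLB = {hlb_blend:.3f}")

temperatures_c = [25, 40, 50, 60, 70, 80]

# Hypothetical conductivity readings keyed by surfactant blend concentration (%).
conductivity = {
    3: [10.0, 14.0, 15.0, 19.0, 20.0, 30.0],
    11: [12.0, 18.0, 22.0, 26.0, 30.0, 34.0],
    19: [15.0, 16.0, 22.0, 23.0, 31.0, 33.0],
}

# Selection rule from the abstract: the concentration whose temperature-conductivity
# line has the largest R² is taken as the most stable emulsion.
r2_by_concentration = {c: r_squared(temperatures_c, v) for c, v in conductivity.items()}
best_concentration = max(r2_by_concentration, key=r2_by_concentration.get)
print(f"highest R² at {best_concentration}% surfactant blend")
```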

  14. Combined genetic algorithm and multiple linear regression (GA-MLR) optimizer: Application to multi-exponential fluorescence decay surface.

    Fisz, Jacek J

    2006-12-07

    An optimization approach based on the genetic algorithm (GA) combined with the multiple linear regression (MLR) method is discussed. The GA-MLR optimizer is designed for nonlinear least-squares problems in which the model functions are linear combinations of nonlinear functions. GA optimizes the nonlinear parameters, and the linear parameters are calculated from MLR. GA-MLR is an intuitive optimization approach and it exploits all advantages of the genetic algorithm technique. This optimization method results from an appropriate combination of two well-known optimization methods. The MLR method is embedded in the GA optimizer, and linear and nonlinear model parameters are optimized in parallel. The MLR method is the only strictly mathematical "tool" involved in GA-MLR. The GA-MLR approach considerably simplifies and accelerates the optimization process because the linear parameters are not fitted ones. Its properties are exemplified by the analysis of the kinetic biexponential fluorescence decay surface corresponding to a two-excited-state interconversion process. A short discussion of the variable projection (VP) algorithm, designed for the same class of optimization problems, is presented. VP is a very advanced mathematical formalism that involves the methods of nonlinear functionals, the algebra of linear projectors, and the formalism of Fréchet derivatives and pseudo-inverses. Additional explanatory comments are added on the application of the recently introduced GA-NR optimizer to the simultaneous recovery of linear and weakly nonlinear parameters occurring in the same optimization problem together with nonlinear parameters. The GA-NR optimizer combines the GA method with the NR method, in which the minimum-value condition for the quadratic approximation to χ², obtained from the Taylor series expansion of χ², is recovered by means of the Newton-Raphson algorithm. The application of the GA-NR optimizer to model functions which are multi-linear
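
    The split between nonlinear and linear parameters can be sketched for a biexponential decay. Note the substitution: a coarse grid search over the decay times stands in for the genetic algorithm here, so this illustrates only the embedded linear-regression step, not the GA-MLR optimizer itself. The synthetic decay curve, the candidate grid, and all names are hypothetical.

```python
import math

def fit_amplitudes(t, y, tau1, tau2):
    """Linear step: least-squares amplitudes for fixed decay times (2x2 normal equations)."""
    f1 = [math.exp(-ti / tau1) for ti in t]
    f2 = [math.exp(-ti / tau2) for ti in t]
    s11 = sum(a * a for a in f1)
    s12 = sum(a * b for a, b in zip(f1, f2))
    s22 = sum(b * b for b in f2)
    s1y = sum(a * yi for a, yi in zip(f1, y))
    s2y = sum(b * yi for b, yi in zip(f2, y))
    det = s11 * s22 - s12 * s12
    a1 = (s1y * s22 - s2y * s12) / det
    a2 = (s2y * s11 - s1y * s12) / det
    sse = sum((yi - a1 * u - a2 * v) ** 2 for yi, u, v in zip(y, f1, f2))
    return a1, a2, sse

# Synthetic noise-free decay: y = 3 exp(-t/1.0) + 1 exp(-t/4.0).
t = [0.2 * i for i in range(50)]
y = [3.0 * math.exp(-ti / 1.0) + 1.0 * math.exp(-ti / 4.0) for ti in t]

# Outer search over the nonlinear parameters (grid search in place of a GA).
# Only the decay times are searched; the amplitudes are never search variables.
candidates = [0.5 * k for k in range(1, 11)]
best = None
for i, tau1 in enumerate(candidates):
    for tau2 in candidates[i + 1:]:
        a1, a2, sse = fit_amplitudes(t, y, tau1, tau2)
        if best is None or sse < best[0]:
            best = (sse, tau1, tau2, a1, a2)

best_sse, best_tau1, best_tau2, best_a1, best_a2 = best
print(f"tau = ({best_tau1}, {best_tau2}), amplitudes = ({best_a1:.3f}, {best_a2:.3f})")
```

    Because the amplitudes are solved in closed form for every candidate pair of decay times, the outer search runs over two parameters rather than four, which is the simplification the abstract attributes to the embedded MLR step.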

  15. A linear 180 nm SOI CMOS antenna switch module using integrated passive device filters for cellular applications

    Jie, Cui; Lei, Chen; Peng, Zhao; Xu, Niu; Yi, Liu

    2014-06-01

    A broadband monolithic linear single pole, eight throw (SP8T) switch has been fabricated in 180 nm thin film silicon-on-insulator (SOI) CMOS technology with a quad-band GSM harmonic filter in integrated passive devices (IPD) technology, which is developed for cellular applications. The antenna switch module (ASM) features 1.2 dB insertion loss with filter on 2G bands and 0.4 dB insertion loss in 3G bands, less than -45 dB isolation and maximum -103 dB intermodulation distortion for mobile front ends by applying distributed architecture and adaptive supply voltage generator.

  16. A linear 180 nm SOI CMOS antenna switch module using integrated passive device filters for cellular applications

    Cui Jie; Chen Lei; Liu Yi; Zhao Peng; Niu Xu

    2014-01-01

    A broadband monolithic linear single pole, eight throw (SP8T) switch has been fabricated in 180 nm thin film silicon-on-insulator (SOI) CMOS technology with a quad-band GSM harmonic filter in integrated passive devices (IPD) technology, which is developed for cellular applications. The antenna switch module (ASM) features 1.2 dB insertion loss with filter on 2G bands and 0.4 dB insertion loss in 3G bands, less than −45 dB isolation and maximum −103 dB intermodulation distortion for mobile front ends by applying distributed architecture and adaptive supply voltage generator. (semiconductor integrated circuits)

  17. Plateletpheresis efficiency and mathematical correction of software-derived platelet yield prediction: A linear regression and ROC modeling approach.

    Jaime-Pérez, José Carlos; Jiménez-Castillo, Raúl Alberto; Vázquez-Hernández, Karina Elizabeth; Salazar-Riojas, Rosario; Méndez-Ramírez, Nereida; Gómez-Almaguer, David

    2017-10-01

    Advances in automated cell separators have improved the efficiency of plateletpheresis and the possibility of obtaining double products (DP). We assessed cell processor accuracy of predicted platelet (PLT) yields with the goal of a better prediction of DP collections. This retrospective proof-of-concept study included 302 plateletpheresis procedures performed on a Trima Accel v6.0 at the apheresis unit of a hematology department. Donor variables, software predicted yield and actual PLT yield were statistically evaluated. Software prediction was optimized by linear regression analysis and its optimal cut-off to obtain a DP assessed by receiver operating characteristic curve (ROC) modeling. Three hundred and two plateletpheresis procedures were performed; in 271 (89.7%) occasions, donors were men and in 31 (10.3%) women. Pre-donation PLT count had the best direct correlation with actual PLT yield (r = 0.486. P Simple correction derived from linear regression analysis accurately corrected this underestimation and ROC analysis identified a precise cut-off to reliably predict a DP. © 2016 Wiley Periodicals, Inc.

  18. Performance of an Axisymmetric Rocket Based Combined Cycle Engine During Rocket Only Operation Using Linear Regression Analysis

    Smith, Timothy D.; Steffen, Christopher J., Jr.; Yungster, Shaye; Keller, Dennis J.

    1998-01-01

    The all rocket mode of operation is shown to be a critical factor in the overall performance of a rocket based combined cycle (RBCC) vehicle. An axisymmetric RBCC engine was used to determine specific impulse efficiency values based upon both full flow and gas generator configurations. Design of experiments methodology was used to construct a test matrix and multiple linear regression analysis was used to build parametric models. The main parameters investigated in this study were: rocket chamber pressure, rocket exit area ratio, injected secondary flow, mixer-ejector inlet area, mixer-ejector area ratio, and mixer-ejector length-to-inlet diameter ratio. A perfect gas computational fluid dynamics analysis, using both the Spalart-Allmaras and k-omega turbulence models, was performed with the NPARC code to obtain values of vacuum specific impulse. Results from the multiple linear regression analysis showed that for both the full flow and gas generator configurations increasing mixer-ejector area ratio and rocket area ratio increase performance, while increasing mixer-ejector inlet area ratio and mixer-ejector length-to-diameter ratio decrease performance. Increasing injected secondary flow increased performance for the gas generator analysis, but was not statistically significant for the full flow analysis. Chamber pressure was found to be not statistically significant.

  19. pulver: an R package for parallel ultra-rapid p-value computation for linear regression interaction terms.

    Molnos, Sophie; Baumbach, Clemens; Wahl, Simone; Müller-Nurasyid, Martina; Strauch, Konstantin; Wang-Sattler, Rui; Waldenberger, Melanie; Meitinger, Thomas; Adamski, Jerzy; Kastenmüller, Gabi; Suhre, Karsten; Peters, Annette; Grallert, Harald; Theis, Fabian J; Gieger, Christian

    2017-09-29

    Genome-wide association studies allow us to understand the genetics of complex diseases. Human metabolism provides information about the disease-causing mechanisms, so it is usual to investigate the associations between genetic variants and metabolite levels. However, only considering genetic variants and their effects on one trait ignores the possible interplay between different "omics" layers. Existing tools only consider single-nucleotide polymorphism (SNP)-SNP interactions, and no practical tool is available for large-scale investigations of the interactions between pairs of arbitrary quantitative variables. We developed an R package called pulver to compute p-values for the interaction term in a very large number of linear regression models. Comparisons based on simulated data showed that pulver is much faster than the existing tools. This is achieved by using the correlation coefficient to test the null-hypothesis, which avoids the costly computation of inversions. Additional tricks are a rearrangement of the order, when iterating through the different "omics" layers, and implementing this algorithm in the fast programming language C++. Furthermore, we applied our algorithm to data from the German KORA study to investigate a real-world problem involving the interplay among DNA methylation, genetic variants, and metabolite levels. The pulver package is a convenient and rapid tool for screening huge numbers of linear regression models for significant interaction terms in arbitrary pairs of quantitative variables. pulver is written in R and C++, and can be downloaded freely from CRAN at https://cran.r-project.org/web/packages/pulver/ .

  20. Early Parallel Activation of Semantics and Phonology in Picture Naming: Evidence from a Multiple Linear Regression MEG Study.

    Miozzo, Michele; Pulvermüller, Friedemann; Hauk, Olaf

    2015-10-01

    The time course of brain activation during word production has become an area of increasingly intense investigation in cognitive neuroscience. The predominant view has been that semantic and phonological processes are activated sequentially, at about 150 and 200-400 ms after picture onset. Although evidence from prior studies has been interpreted as supporting this view, these studies were arguably not ideally suited to detect early brain activation of semantic and phonological processes. We here used a multiple linear regression approach to magnetoencephalography (MEG) analysis of picture naming in order to investigate early effects of variables specifically related to visual, semantic, and phonological processing. This was combined with distributed minimum-norm source estimation and region-of-interest analysis. Brain activation associated with visual image complexity appeared in occipital cortex at about 100 ms after picture presentation onset. At about 150 ms, semantic variables became physiologically manifest in left frontotemporal regions. In the same latency range, we found an effect of phonological variables in the left middle temporal gyrus. Our results demonstrate that multiple linear regression analysis is sensitive to early effects of multiple psycholinguistic variables in picture naming. Crucially, our results suggest that access to phonological information might begin in parallel with semantic processing around 150 ms after picture onset. © The Author 2014. Published by Oxford University Press.

  1. Multiple linear stepwise regression of liver lipid levels: proton MR spectroscopy study in vivo at 3.0 T

    Xu Li; Liang Changhong; Xiao Yuanqiu; Zhang Zhonglin

    2010-01-01

    Objective: To analyze the correlations between liver lipid level determined by in vivo liver ¹H-MRS at 3.0 T and influencing factors using multiple linear stepwise regression. Methods: The prospective study of liver ¹H-MRS was performed with a 3.0 T system and eight-channel torso phased-array coils using a PRESS sequence. Forty-four volunteers were enrolled in this study. Liver spectra were collected with a TR of 1500 ms, TE of 30 ms, volume of interest of 2 cm×2 cm×2 cm, and NSA of 64. The acquired raw proton MRS data were processed using the software program SAGE. For each MRS measurement, using water as the internal reference, the amplitude of the lipid signal was normalized to the sum of the signal from lipid and water to obtain percentage lipid within the liver. Descriptive statistics of height, weight, age, BMI, line width and water suppression were recorded, and Pearson analysis was applied to test their relationships. Multiple linear stepwise regression was used to build the statistical model for the prediction of liver lipid content. Results: Age (39.1±12.6) years, body weight (64.4±10.4) kg, BMI (23.3±3.1) kg/m², line width (18.9±4.4) and water suppression (90.7±6.5)% were significantly correlated with liver lipid content (0.00 to 0.96%, median 0.02%); the r values were 0.11, 0.44, 0.40, 0.52 and -0.73, respectively (P<0.05). However, only age, BMI, line width, and water suppression entered the multiple linear regression equation. The liver lipid content prediction equation was as follows: Y = 1.395 - (0.021×water suppression) + (0.022×BMI) + (0.014×line width) - (0.004×age); the coefficient of determination was 0.613 and the corrected coefficient of determination was 0.59. Conclusion: The regression model fitted well, since the variables age, BMI, line width, and water suppression can explain about 60% of liver lipid content changes. (authors)
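
    As a worked illustration, the prediction equation reported in this abstract can be evaluated directly. The coefficients below are copied from the abstract; the function name is hypothetical, and the inputs are assumed to be in the units the study reports (water suppression in %, BMI in kg/m², age in years, line width in the study's own unit).

```python
def predict_liver_lipid(water_suppression, bmi, line_width, age):
    """Evaluate the regression equation quoted in the abstract."""
    return (1.395
            - 0.021 * water_suppression
            + 0.022 * bmi
            + 0.014 * line_width
            - 0.004 * age)

# Evaluate at the sample means reported in the abstract.
example = predict_liver_lipid(water_suppression=90.7, bmi=23.3,
                              line_width=18.9, age=39.1)
```

    Evaluating at the sample means only shows how the terms combine; it is not a result reported by the authors.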

  2. A Differential 4-Path Highly Linear Widely Tunable On-Chip Band-Pass Filter

    Ghaffari, A.; Klumperink, Eric A.M.; Nauta, Bram

    2010-01-01

    A passive switched capacitor RF band-pass filter with clock controlled center frequency is realized in 65nm CMOS. An off-chip transformer which acts as a balun, improves filter-Q and realizes impedance matching. The differential architecture reduces clock-leakage and suppresses selectivity

  3. Ultrafast all-optical clock recovery based on phase-only linear optical filtering

    Maram, Reza; Kong, Deming; Galili, Michael

    2014-01-01

    We report on a novel technique for all-optical clock recovery from RZ OOK data based on phase-only filtering, significantly enhancing the recovered clock quality and energy-efficiency compared to the use of a Fabry-Perot filter....

  4. Substituting random forest for multiple linear regression improves binding affinity prediction of scoring functions: Cyscore as a case study.

    Li, Hongjian; Leung, Kwong-Sak; Wong, Man-Hon; Ballester, Pedro J

    2014-08-27

    State-of-the-art protein-ligand docking methods are generally limited by the traditionally low accuracy of their scoring functions, which are used to predict binding affinity and thus vital for discriminating between active and inactive compounds. Despite intensive research over the years, classical scoring functions have reached a plateau in their predictive performance. These assume a predetermined additive functional form for some sophisticated numerical features, and use standard multivariate linear regression (MLR) on experimental data to derive the coefficients. In this study we show that such a simple functional form is detrimental for the prediction performance of a scoring function, and replacing linear regression by machine learning techniques like random forest (RF) can improve prediction performance. We investigate the conditions of applying RF under various contexts and find that given sufficient training samples RF manages to comprehensively capture the non-linearity between structural features and measured binding affinities. Incorporating more structural features and training with more samples can both boost RF performance. In addition, we analyze the importance of structural features to binding affinity prediction using the RF variable importance tool. Lastly, we use Cyscore, a top performing empirical scoring function, as a baseline for comparison study. Machine-learning scoring functions are fundamentally different from classical scoring functions because the former circumvents the fixed functional form relating structural features with binding affinities. RF, but not MLR, can effectively exploit more structural features and more training samples, leading to higher prediction performance. The future availability of more X-ray crystal structures will further widen the performance gap between RF-based and MLR-based scoring functions. This further stresses the importance of substituting RF for MLR in scoring function development.

  5. The use of artificial neural networks and multiple linear regression to predict rate of medical waste generation

    Jahandideh, Sepideh; Jahandideh, Samad; Asadabadi, Ebrahim Barzegari; Askarian, Mehrdad; Movahedi, Mohammad Mehdi; Hosseini, Somayyeh; Jahandideh, Mina

    2009-01-01

    Prediction of the amount of hospital waste produced will be helpful in the storage, transportation and disposal stages of hospital waste management. Based on this fact, two predictor models, artificial neural networks (ANNs) and multiple linear regression (MLR), were applied to predict the rate of medical waste generation, both in total and by type (sharp, infectious and general). In this study, a 5-fold cross-validation procedure on a database containing a total of 50 hospitals of Fars province (Iran) was used to verify the performance of the models. Three performance measures, MAR, RMSE and R², were used to evaluate the performance of the models. The MLR, as a conventional model, obtained poor prediction performance measure values. However, MLR distinguished hospital capacity and bed occupancy as the more significant parameters. On the other hand, ANNs, a more powerful model that had not previously been introduced for predicting the rate of medical waste generation, showed high performance measure values, especially an R² value of 0.99, confirming the good fit of the data. Such satisfactory results could be attributed to the non-linear nature of ANNs in problem solving, which provides the opportunity for relating independent variables to dependent ones non-linearly. In conclusion, the obtained results showed that our ANN-based model approach is very promising and may play a useful role in developing a better cost-effective strategy for waste management in future.
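
    The evaluation setup described here (5-fold cross-validation scored with RMSE and R²) can be sketched as follows. This is a generic illustration, not the authors' code: the helper names are hypothetical, the folds are contiguous for simplicity (in practice the records would usually be shuffled first), and MAR is omitted because the abstract does not define it.

```python
import math

def k_fold_indices(n, k=5):
    """Split indices 0..n-1 into k contiguous folds of near-equal size."""
    folds, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# 50 hospitals split into 5 folds; each fold serves once as the test set.
folds = k_fold_indices(50, k=5)
```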

  6. Determination of DPPH Radical Oxidation Caused by Methanolic Extracts of Some Microalgal Species by Linear Regression Analysis of Spectrophotometric Measurements

    Ulf-Peter Hansen

    2007-10-01

    Full Text Available The demonstrated modified spectrophotometric method makes use of the 2,2-diphenyl-1-picrylhydrazyl (DPPH) radical and its specific absorbance properties. The absorbance decreases when the radical is reduced by antioxidants. In contrast to other investigations, the absorbance was measured at a wavelength of 550 nm. This wavelength enabled the measurements of the stable free DPPH radical without interference from microalgal pigments. This approach was applied to methanolic microalgae extracts for two different DPPH concentrations. The changes in absorbance measured vs. the concentration of the methanolic extract resulted in curves with a linear decrease ending in a saturation region. Linear regression analysis of the linear part of DPPH reduction versus extract concentration enabled the determination of the antioxidative potentials of the microalgae's methanolic extracts, which were independent of the employed DPPH concentrations. The resulting slopes showed significant differences (6 - 34 μmol DPPH g⁻¹ extract concentration) between the single different species of microalgae (Anabaena sp., Isochrysis galbana, Phaeodactylum tricornutum, Porphyridium purpureum, Synechocystis sp. PCC6803) in their ability to reduce the DPPH radical. The independency of the signal on the DPPH concentration is a valuable advantage over the determination of the EC50 value.
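
    The slope determination described in this abstract amounts to an ordinary least-squares line fitted only to the points before saturation. A minimal sketch follows; the data values and the cutoff that marks the end of the linear part are invented for illustration and do not come from the paper.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical measurements: extract concentration vs. amount of DPPH reduced.
concentration = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
dpph_reduced = [0.0, 10.0, 20.0, 30.0, 33.0, 34.0]  # saturates after 3.0

linear_cutoff = 3.0  # assumed end of the linear region
pairs = [(c, d) for c, d in zip(concentration, dpph_reduced) if c <= linear_cutoff]
slope, intercept = linear_fit([c for c, _ in pairs], [d for _, d in pairs])
```

    Restricting the fit to the linear region matters: including the saturated points would bias the slope downwards.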

  7. A step-by-step guide to non-linear regression analysis of experimental data using a Microsoft Excel spreadsheet.

    Brown, A M

    2001-06-01

    The objective of this present study was to introduce a simple, easily understood method for carrying out non-linear regression analysis based on user input functions. While it is relatively straightforward to fit data with simple functions such as linear or logarithmic functions, fitting data with more complicated non-linear functions is more difficult. Commercial specialist programmes are available that will carry out this analysis, but these programmes are expensive and are not intuitive to learn. An alternative method described here is to use the SOLVER function of the ubiquitous spreadsheet programme Microsoft Excel, which employs an iterative least squares fitting routine to produce the optimal goodness of fit between data and function. The intent of this paper is to lead the reader through an easily understood step-by-step guide to implementing this method, which can be applied to any function in the form y=f(x), and is well suited to fast, reliable analysis of data in all fields of biology.
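
    The idea of iteratively adjusting parameters until the sum of squared residuals between data and a user-supplied function y=f(x) is minimal can be sketched outside a spreadsheet. The code below is not Excel's SOLVER algorithm; it substitutes a very simple compass (pattern) search, chosen only because it is short and needs no derivatives. The function names, starting values and synthetic data are assumptions.

```python
import math

def fit_least_squares(f, xs, ys, start, step=1.0, tol=1e-9, max_iter=100000):
    """Minimize the sum of squared residuals by compass search over the parameters."""
    def sse(p):
        return sum((y - f(x, *p)) ** 2 for x, y in zip(xs, ys))

    params = list(start)
    best = sse(params)
    for _ in range(max_iter):
        if step < tol:
            break
        improved = False
        for i in range(len(params)):
            for delta in (step, -step):
                trial = params[:]
                trial[i] += delta
                s = sse(trial)
                if s < best:  # keep any move that lowers the residual sum
                    params, best, improved = trial, s, True
        if not improved:
            step /= 2.0  # refine the search once no axis move helps
    return params, best

def decay(x, a, b):
    """Example user function: exponential decay y = a * exp(-b * x)."""
    return a * math.exp(-b * x)

# Synthetic, noise-free data generated with a = 2.0 and b = 0.5.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [decay(x, 2.0, 0.5) for x in xs]
params, residual = fit_least_squares(decay, xs, ys, start=[1.0, 1.0])
```

    As with any iterative fit, the result can depend on the starting values, so sensible initial guesses remain important.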

  8. Robust Multiple Linear Regression.

    1982-12-01

    difficulty, but it might have more solutions corresponding to local minima. Influence Function of M-Estimates The influence function describes the effect...distribution function. In case of M-Estimates the influence function was found to be proportional to ψ and given as IC(x; F, T) = ψ(x, T(F)) ... F(dx)...where the inverse of any distribution function F is defined in the usual way as F⁻¹(s) = inf{x | F(x) > s}, 0 < s < 1. Influence Function of L-Estimates In a

  9. Multiple linear regressions

    Abstract. The predictive analysis based on quantitative structure activity relationships (QSAR) on benzim- ... could lead to treatment of obesity, diabetes and related conditions. ..... After discussing the physical and chemical meaning of the ...

  10. (Non) linear regression modelling

    Cizek, P.; Gentle, J.E.; Hardle, W.K.; Mori, Y.

    2012-01-01

    We will study causal relationships of a known form between random variables. Given a model, we distinguish one or more dependent (endogenous) variables Y = (Y1,…,Yl), l ∈ N, which are explained by a model, and independent (exogenous, explanatory) variables X = (X1,…,Xp),p ∈ N, which explain or

  11. Modeling the kinetics of essential oil hydrodistillation from juniper berries (Juniperus communis L.) using non-linear regression

    Radosavljević Dragana B.

    2017-01-01

    Full Text Available This paper presents kinetics modeling of essential oil hydrodistillation from juniper berries (Juniperus communis L.) by using a non-linear regression methodology. The proposed model has the polynomial-logarithmic form. The initial equation of the proposed non-linear model is q = q∞•(a•(log t)² + b•log t + c), and by substituting a1 = q∞•a, b1 = q∞•b and c1 = q∞•c, the final equation is obtained as q = a1•(log t)² + b1•log t + c1. In this equation q is the quantity of the obtained oil at time t, while a1, b1 and c1 are parameters to be determined for each sample. From the final equation it can be seen that the key parameter q∞, which presents the maximal oil quantity obtained after infinite time, is already included in parameters a1, b1 and c1. In this way, experimental determination of this parameter is avoided. Using the proposed model with parameters obtained by regression, the values of oil hydrodistillation in time are calculated for each sample and compared to the experimental values. In addition, two kinetic models previously proposed in literature were applied to the same experimental results. The developed model provided better agreements with the experimental values than the two, generally accepted kinetic models of this process. The average values of error measures (RSS, RSE, AIC and MRPD) obtained for our model (0.005; 0.017; –84.33; 1.65) were generally lower than the corresponding values of the other two models (0.025; 0.041; –53.20; 3.89) and (0.0035; 0.015; –86.83; 1.59). Also, parameter estimation for the proposed model was significantly simpler (maximum 2 iterations per sample) using the non-linear regression than that for the existing models (maximum 9 iterations per sample). [Project of the Serbian Ministry of Education, Science and Technological Development, Grant no. TR-35026]
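
    Because the final form of the model is a quadratic in log t, its three parameters can be estimated by least squares once u = log t is substituted. The sketch below shows one way to do that; it is not the authors' procedure. It assumes a base-10 logarithm (the abstract does not state the base) and uses made-up sample values.

```python
import math

def solve3(A, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    n = 3
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= factor * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def fit_log_quadratic(times, q):
    """Least-squares estimates of a1, b1, c1 in q = a1*(log t)^2 + b1*log t + c1."""
    u = [math.log10(t) for t in times]  # assumption: base-10 logarithm
    cols = [[ui ** 2 for ui in u], u, [1.0] * len(u)]
    A = [[sum(ci * cj for ci, cj in zip(cols[i], cols[j])) for j in range(3)]
         for i in range(3)]
    rhs = [sum(ci * qi for ci, qi in zip(cols[i], q)) for i in range(3)]
    return solve3(A, rhs)

# Hypothetical sample generated from a1 = -0.1, b1 = 0.5, c1 = 0.2.
times = [1.0, 10.0, 100.0, 1000.0]
q_obs = [-0.1 * math.log10(t) ** 2 + 0.5 * math.log10(t) + 0.2 for t in times]
a1, b1, c1 = fit_log_quadratic(times, q_obs)
```

    Note that q∞ never appears separately: only the products a1, b1 and c1 are estimated, which is the point made in the abstract.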

  12. Testbed for Multi-Wavelength Optical Code Division Multiplexing Based on Passive Linear Unitary Filters

    Yablonovitch, Eli

    2000-01-01

    .... The equipment purchased under this grant has permitted UCLA to purchase a number of broad-band optical components, including especially some unique code division multiplexing filters that permitted...

  13. Variance-to-mean method generalized by linear difference filter technique

    Hashimoto, Kengo; Ohsaki, Hiroshi; Horiguchi, Tetsuo; Yamane, Yoshihiro; Shiroya, Seiji

    1998-01-01

    The conventional variance-to-mean method (Feynman-α method) suffers seriously from divergence of the variance under transient conditions such as a reactor power drift. Strictly speaking, then, the use of the Feynman-α method is restricted to a steady state. To apply the method to more practical uses, it is desirable to overcome this difficulty. For this purpose, we propose the use of a higher-order difference filter technique to reduce the effect of the reactor power drift, and derive several new formulae that take the filtering into account. The capability of the proposed formulae was demonstrated through experiments in the Kyoto University Critical Assembly. The experimental results indicate that the divergence of the variance can be effectively suppressed by the filtering technique, and that a higher-order filter becomes necessary as the variation rate in power increases.
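
    The effect of a drift on the variance-to-mean ratio, and how differencing suppresses it, can be illustrated with a simplified sketch. This is not one of the formulae derived in the paper: it uses only a first-order difference and assumes counts in successive gates are uncorrelated, so that the mean squared difference is twice the variance. The function names and the synthetic count sequence are hypothetical.

```python
def feynman_y(counts):
    """Conventional estimate: Y = variance / mean - 1."""
    n = len(counts)
    mean = sum(counts) / n
    var = sum((c - mean) ** 2 for c in counts) / (n - 1)
    return var / mean - 1.0

def feynman_y_first_difference(counts):
    """Drift-suppressed estimate using first differences of successive gates.

    Assumes uncorrelated gates, for which E[(C[k+1] - C[k])^2] = 2 * variance.
    """
    mean = sum(counts) / len(counts)
    diffs = [b - a for a, b in zip(counts, counts[1:])]
    mean_sq_diff = sum(d * d for d in diffs) / len(diffs)
    return mean_sq_diff / (2.0 * mean) - 1.0

# Synthetic gate counts with a pure linear drift and no random fluctuation.
drifting = [100 + k for k in range(100)]
y_plain = feynman_y(drifting)
y_filtered = feynman_y_first_difference(drifting)
```

    For this drift-only sequence the conventional estimate is inflated by the trend, while the differenced estimate stays close to -1, the value expected when there is essentially no fluctuation about the trend.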

  14. Linear versus Nonlinear Filtering with Scale-Selective Corrections for Balanced Dynamics in a Simple Atmospheric Model

    Subramanian, Aneesh C.

    2012-11-01

    This paper investigates the role of the linear analysis step of the ensemble Kalman filters (EnKF) in disrupting the balanced dynamics in a simple atmospheric model and compares it to a fully nonlinear particle-based filter (PF). The filters have a very similar forecast step but the analysis step of the PF solves the full Bayesian filtering problem while the EnKF analysis only applies to Gaussian distributions. The EnKF is compared to two flavors of the particle filter with different sampling strategies, the sequential importance resampling filter (SIRF) and the sequential kernel resampling filter (SKRF). The model admits a chaotic vortical mode coupled to a comparatively fast gravity wave mode. It can also be configured either to evolve on a so-called slow manifold, where the fast motion is suppressed, or such that the fast-varying variables are diagnosed from the slow-varying variables as slaved modes. Identical twin experiments show that EnKF and PF capture the variables on the slow manifold well as the dynamics is very stable. PFs, especially the SKRF, capture slaved modes better than the EnKF, implying that a full Bayesian analysis estimates the nonlinear model variables better. The PFs perform significantly better in the fully coupled nonlinear model where fast and slow variables modulate each other. This suggests that the analysis step in the PFs maintains the balance in both variables much better than the EnKF. It is also shown that increasing the ensemble size generally improves the performance of the PFs but has less impact on the EnKF after a sufficient number of members have been used.
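
    The difference between the two analysis steps can be made concrete with a minimal sketch of a sequential importance resampling update for a scalar state. It is a generic textbook-style illustration, not the configuration used in the paper: the Gaussian observation error, the particle values and the function name are assumptions. Unlike an EnKF analysis, nothing here assumes that the particle distribution itself is Gaussian.

```python
import math
import random

def sir_update(particles, observation, obs_std, rng):
    """One analysis step: weight particles by likelihood, then resample."""
    # Likelihood of the observation given each particle (Gaussian error assumed).
    weights = [math.exp(-0.5 * ((observation - p) / obs_std) ** 2) for p in particles]
    total = sum(weights)
    weights = [w / total for w in weights]
    # Resample with replacement in proportion to the normalized weights.
    resampled = rng.choices(particles, weights=weights, k=len(particles))
    return weights, resampled

rng = random.Random(0)
particles = [-2.0, -1.0, 0.0, 1.0, 2.0]
weights, resampled = sir_update(particles, observation=0.9, obs_std=0.5, rng=rng)
```

    Particles far from the observation receive negligible weight and tend to disappear in the resampling, which is why particle filters generally need larger ensembles than the EnKF.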

  15. Linear versus Nonlinear Filtering with Scale-Selective Corrections for Balanced Dynamics in a Simple Atmospheric Model

    Subramanian, Aneesh C.; Hoteit, Ibrahim; Cornuelle, Bruce; Miller, Arthur J.; Song, Hajoon

    2012-01-01

    This paper investigates the role of the linear analysis step of the ensemble Kalman filters (EnKF) in disrupting the balanced dynamics in a simple atmospheric model and compares it to a fully nonlinear particle-based filter (PF). The filters have a very similar forecast step but the analysis step of the PF solves the full Bayesian filtering problem while the EnKF analysis only applies to Gaussian distributions. The EnKF is compared to two flavors of the particle filter with different sampling strategies, the sequential importance resampling filter (SIRF) and the sequential kernel resampling filter (SKRF). The model admits a chaotic vortical mode coupled to a comparatively fast gravity wave mode. It can also be configured either to evolve on a so-called slow manifold, where the fast motion is suppressed, or such that the fast-varying variables are diagnosed from the slow-varying variables as slaved modes. Identical twin experiments show that EnKF and PF capture the variables on the slow manifold well as the dynamics is very stable. PFs, especially the SKRF, capture slaved modes better than the EnKF, implying that a full Bayesian analysis estimates the nonlinear model variables better. The PFs perform significantly better in the fully coupled nonlinear model where fast and slow variables modulate each other. This suggests that the analysis step in the PFs maintains the balance in both variables much better than the EnKF. It is also shown that increasing the ensemble size generally improves the performance of the PFs but has less impact on the EnKF after a sufficient number of members have been used.

  16. Real-time prediction and gating of respiratory motion using an extended Kalman filter and Gaussian process regression

    Bukhari, W; Hong, S-M

    2015-01-01

    Motion-adaptive radiotherapy aims to deliver a conformal dose to the target tumour with minimal normal tissue exposure by compensating for tumour motion in real time. The prediction as well as the gating of respiratory motion have received much attention over the last two decades for reducing the targeting error of the treatment beam due to respiratory motion. In this article, we present a real-time algorithm for predicting and gating respiratory motion that utilizes a model-based and a model-free Bayesian framework by combining them in a cascade structure. The algorithm, named EKF-GPR+, implements a gating function without pre-specifying a particular region of the patient’s breathing cycle. The algorithm first employs an extended Kalman filter (LCM-EKF) to predict the respiratory motion and then uses a model-free Gaussian process regression (GPR) to correct the error of the LCM-EKF prediction. The GPR is a non-parametric Bayesian algorithm that yields predictive variance under Gaussian assumptions. The EKF-GPR+ algorithm utilizes the predictive variance from the GPR component to capture the uncertainty in the LCM-EKF prediction error and systematically identify breathing points with a higher probability of large prediction error in advance. This identification allows us to pause the treatment beam over such instances. EKF-GPR+ implements the gating function by using simple calculations based on the predictive variance with no additional detection mechanism. A sparse approximation of the GPR algorithm is employed to realize EKF-GPR+ in real time. Extensive numerical experiments are performed based on a large database of 304 respiratory motion traces to evaluate EKF-GPR+. The experimental results show that the EKF-GPR+ algorithm effectively reduces the prediction error in a root-mean-square (RMS) sense by employing the gating function, albeit at the cost of a reduced duty cycle. As an example, EKF-GPR+ reduces the patient-wise RMS error to 37%, 39% and 42

  17. Real-time prediction and gating of respiratory motion using an extended Kalman filter and Gaussian process regression

    Bukhari, W.; Hong, S.-M.

    2015-01-01

    Motion-adaptive radiotherapy aims to deliver a conformal dose to the target tumour with minimal normal tissue exposure by compensating for tumour motion in real time. The prediction as well as the gating of respiratory motion have received much attention over the last two decades for reducing the targeting error of the treatment beam due to respiratory motion. In this article, we present a real-time algorithm for predicting and gating respiratory motion that utilizes a model-based and a model-free Bayesian framework by combining them in a cascade structure. The algorithm, named EKF-GPR+, implements a gating function without pre-specifying a particular region of the patient’s breathing cycle. The algorithm first employs an extended Kalman filter (LCM-EKF) to predict the respiratory motion and then uses a model-free Gaussian process regression (GPR) to correct the error of the LCM-EKF prediction. The GPR is a non-parametric Bayesian algorithm that yields predictive variance under Gaussian assumptions. The EKF-GPR+ algorithm utilizes the predictive variance from the GPR component to capture the uncertainty in the LCM-EKF prediction error and systematically identify breathing points with a higher probability of large prediction error in advance. This identification allows us to pause the treatment beam over such instances. EKF-GPR+ implements the gating function by using simple calculations based on the predictive variance with no additional detection mechanism. A sparse approximation of the GPR algorithm is employed to realize EKF-GPR+ in real time. Extensive numerical experiments are performed based on a large database of 304 respiratory motion traces to evaluate EKF-GPR+. The experimental results show that the EKF-GPR+ algorithm effectively reduces the prediction error in a root-mean-square (RMS) sense by employing the gating function, albeit at the cost of a reduced duty cycle. As an example, EKF-GPR+ reduces the patient-wise RMS error to 37%, 39% and 42% in
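
    The gating idea described in this abstract, pausing the beam whenever the predictive variance signals a likely large error, can be sketched with a simple threshold rule. The rule, the threshold value and the variance values below are illustrative assumptions; the abstract does not specify how EKF-GPR+ maps predictive variance to a gating decision.

```python
def gate_beam(predictive_variances, threshold):
    """Return per-sample beam-on flags and the resulting duty cycle.

    The beam stays on only while the predictive variance is at or below
    the (hypothetical) threshold.
    """
    beam_on = [v <= threshold for v in predictive_variances]
    duty_cycle = sum(beam_on) / len(beam_on)
    return beam_on, duty_cycle

# Hypothetical predictive variances at four successive time points.
beam_on, duty_cycle = gate_beam([0.1, 0.5, 0.2, 0.9], threshold=0.3)
```

    Lowering the threshold excludes more uncertain samples, which is the trade-off the abstract refers to: a smaller error at the cost of a reduced duty cycle.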

  18. Real-time prediction and gating of respiratory motion using an extended Kalman filter and Gaussian process regression.

    Bukhari, W; Hong, S-M

    2015-01-07

    Motion-adaptive radiotherapy aims to deliver a conformal dose to the target tumour with minimal normal tissue exposure by compensating for tumour motion in real time. The prediction as well as the gating of respiratory motion have received much attention over the last two decades for reducing the targeting error of the treatment beam due to respiratory motion. In this article, we present a real-time algorithm for predicting and gating respiratory motion that utilizes a model-based and a model-free Bayesian framework by combining them in a cascade structure. The algorithm, named EKF-GPR(+), implements a gating function without pre-specifying a particular region of the patient's breathing cycle. The algorithm first employs an extended Kalman filter (LCM-EKF) to predict the respiratory motion and then uses a model-free Gaussian process regression (GPR) to correct the error of the LCM-EKF prediction. The GPR is a non-parametric Bayesian algorithm that yields predictive variance under Gaussian assumptions. The EKF-GPR(+) algorithm utilizes the predictive variance from the GPR component to capture the uncertainty in the LCM-EKF prediction error and systematically identify breathing points with a higher probability of large prediction error in advance. This identification allows us to pause the treatment beam over such instances. EKF-GPR(+) implements the gating function by using simple calculations based on the predictive variance with no additional detection mechanism. A sparse approximation of the GPR algorithm is employed to realize EKF-GPR(+) in real time. Extensive numerical experiments are performed based on a large database of 304 respiratory motion traces to evaluate EKF-GPR(+). The experimental results show that the EKF-GPR(+) algorithm effectively reduces the prediction error in a root-mean-square (RMS) sense by employing the gating function, albeit at the cost of a reduced duty cycle. As an example, EKF-GPR(+) reduces the patient-wise RMS error to 37%, 39% and

  19. Development of a new linearly variable edge filter (LVEF)-based compact slit-less mini-spectrometer

    Mahmoud, Khaled; Park, Seongchong; Lee, Dong-Hoon

    2018-02-01

    This paper presents the development of a compact charge-coupled device (CCD) spectrometer. We describe the design, concept and characterization of a VNIR linear variable edge filter (LVEF)-based mini-spectrometer. The new instrument has been realized for operation in the 300 nm to 850 nm wavelength range. The instrument consists of a linear variable edge filter in front of a CCD array. Small size, light weight and low cost can be achieved using linearly variable filters, with no need for the moving parts used for wavelength selection in commercially available spectrometers. This overview discusses the characteristics of the main components and the main concept, together with its main advantages and limitations. Experimental characteristics of the LVEFs are described. The mathematical approach to obtain the position-dependent slit function of the presented prototype spectrometer, and its numerical de-convolution solution for spectrum reconstruction, is described. The performance of our prototype instrument is demonstrated by measuring the spectrum of a reference light source.

  20. Testing the macroeconomic impact of the budget deficit in EU Member States using linear regression with fixed effects

    Dalian Marius DORAN

    2017-11-01

    Full Text Available The article examines the impact of the budget balance, whether surplus or deficit, on the main indicator characterizing the economic growth of a country, namely GDP, and on the inflation rate in the 27 European Union Member States and the United Kingdom. Panel data covering the period from 2001 to 2015 were used for this analysis. The method used is linear regression with fixed effects and Driscoll-Kraay standard errors. The dependent variables are the growth rate of real GDP and the inflation rate, and the independent variable is the budget balance (surplus or deficit). The results obtained with the econometric software Stata show a positive impact of the budget balance on growth in the European Union for the analyzed period.
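
    A fixed-effects regression with a single regressor can be sketched through the within transformation: each country's variables are demeaned, which removes the country-specific intercepts, and the slope is estimated from the pooled deviations. The sketch below is a generic illustration with invented numbers and a hypothetical function name; it does not reproduce the study's Stata analysis and does not compute Driscoll-Kraay standard errors.

```python
def within_slope(entities, x, y):
    """Fixed-effects (within) estimate of the slope of y on x."""
    groups = {}
    for e, xi, yi in zip(entities, x, y):
        groups.setdefault(e, []).append((xi, yi))
    sxy = sxx = 0.0
    for obs in groups.values():
        mean_x = sum(xi for xi, _ in obs) / len(obs)
        mean_y = sum(yi for _, yi in obs) / len(obs)
        for xi, yi in obs:
            # Deviations from the entity mean remove the entity intercept.
            sxy += (xi - mean_x) * (yi - mean_y)
            sxx += (xi - mean_x) ** 2
    return sxy / sxx

# Two hypothetical countries with different intercepts but a common slope of 2.
entities = ["A", "A", "A", "B", "B", "B"]
balance = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
growth = [10 + 2 * b for b in balance[:3]] + [50 + 2 * b for b in balance[3:]]
slope = within_slope(entities, balance, growth)
```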