WorldWideScience

Sample records for dependent variable regression

  1. Multivariate Regression with Monotone Missing Observation of the Dependent Variables

    NARCIS (Netherlands)

    Raats, V.M.; van der Genugten, B.B.; Moors, J.J.A.

    2002-01-01

    Multivariate regression is discussed, where the observations of the dependent variables are (monotone) missing completely at random; the explanatory variables are assumed to be completely observed. We discuss OLS-, GLS- and a certain form of E(stimated) GLS-estimation. It turns out that

  2. A Maximum Likelihood Method for Latent Class Regression Involving a Censored Dependent Variable.

    Science.gov (United States)

    Jedidi, Kamel; And Others

    1993-01-01

    A method is proposed to simultaneously estimate regression functions and subject membership in "k" latent classes or groups given a censored dependent variable for a cross-section of subjects. Maximum likelihood estimates are obtained using an EM algorithm. The method is illustrated through a consumer psychology application. (SLD)

  3. [Multiple dependent variables LS-SVM regression algorithm and its application in NIR spectral quantitative analysis].

    Science.gov (United States)

    An, Xin; Xu, Shuo; Zhang, Lu-Da; Su, Shi-Guang

    2009-01-01

    In the present paper, on the basis of the LS-SVM algorithm, we built a multiple-dependent-variable LS-SVM (MLS-SVM) regression model whose weights can be optimized, and gave the corresponding algorithm. Furthermore, we theoretically explained the relationship between MLS-SVM and LS-SVM. Sixty-four broomcorn samples were taken as experimental material, and the ratio of the modeling set to the predicting set was 51:13. We first selected five weight groups randomly and uniformly in the interval [0, 1], and then used the leave-one-out (LOO) rule to determine one appropriate weight group and the model parameters, including penalizing parameters and kernel parameters, according to the criterion of minimum average relative error. Then a multiple-dependent-variable quantitative analysis model was built with the NIR spectrum and used to analyze three chemical constituents simultaneously: protein, lysine and starch. Finally, the average relative errors between the actual values and the values predicted by the model for the three components in the predicting set were 1.65%, 6.47% and 1.37%, respectively, and the correlation coefficients were 0.9940, 0.8392 and 0.8825, respectively. For comparison, LS-SVM was also used, for which the average relative errors were 1.68%, 6.25% and 1.47%, respectively, and the correlation coefficients were 0.9941, 0.8310 and 0.8800, respectively. The MLS-SVM algorithm is thus comparable to the LS-SVM algorithm in modeling performance, and both give satisfactory results. The result shows that the model with the MLS-SVM algorithm is capable of synchronous multi-component NIR quantitative analysis. Thus the MLS-SVM algorithm offers a new multiple-dependent-variable quantitative analysis approach for chemometrics. In addition, the weights have a certain effect on the prediction performance of the MLS-SVM model, which is consistent with our intuition and is validated in this study. Therefore, it is necessary to optimize

  4. Do drug treatment variables predict cognitive performance in multidrug-treated opioid-dependent patients? A regression analysis study

    Directory of Open Access Journals (Sweden)

    Rapeli Pekka

    2012-11-01

    Background: Cognitive deficits and multiple psychoactive drug regimens are both common in patients treated for opioid dependence. Therefore, we examined whether the cognitive performance of patients in opioid-substitution treatment (OST) is associated with their drug treatment variables. Methods: Opioid-dependent patients (N = 104) who were treated with either buprenorphine or methadone (n = 52 in both groups) were given attention, working memory, verbal memory, and visual memory tests after they had been in treatment for a minimum of six months. Group-wise results were analysed by analysis of variance. Predictors of cognitive performance were examined by hierarchical regression analysis. Results: Buprenorphine-treated patients performed statistically significantly better in a simple reaction time test than methadone-treated ones. No other significant differences between groups in cognitive performance were found. In each OST drug group, approximately 10% of the attention performance could be predicted by drug treatment variables. Use of benzodiazepine (BZD) medication predicted about 10% of performance variance in working memory. Treatment with more than one other psychoactive drug (other than opioid or BZD) and frequent substance abuse during the past month predicted about 20% of verbal memory performance. Conclusions: Although this study does not prove a causal relationship between multiple prescription drug use and poor cognitive functioning, the results are relevant for psychosocial recovery, vocational rehabilitation, and psychological treatment of OST patients. Especially for patients with BZD treatment, other treatment options should be actively sought.

  5. Implementing Variable Selection Techniques in Regression.

    Science.gov (United States)

    Thayer, Jerome D.

    Variable selection techniques in stepwise regression analysis are discussed. In stepwise regression, variables are added or deleted from a model in sequence to produce a final "good" or "best" predictive model. Stepwise computer programs are discussed and four different variable selection strategies are described. These…

  6. Variable and subset selection in PLS regression

    DEFF Research Database (Denmark)

    Høskuldsson, Agnar

    2001-01-01

    The purpose of this paper is to present some useful methods for introductory analysis of variables and subsets in relation to PLS regression. We present here methods that are efficient in finding the appropriate variables or subset to use in the PLS regression. The general conclusion...... is that variable selection is important for successful analysis of chemometric data. An important aspect of the results presented is that lack of variable selection can spoil the PLS regression, and that cross-validation measures using a test set can show larger variation, when we use different subsets of X, than...

  7. Density dependence and climate effects in Rocky Mountain elk: an application of regression with instrumental variables for population time series with sampling error.

    Science.gov (United States)

    Creel, Scott; Creel, Michael

    2009-11-01

    1. Sampling error in annual estimates of population size creates two widely recognized problems for the analysis of population growth. First, if sampling error is mistakenly treated as process error, one obtains inflated estimates of the variation in true population trajectories (Staples, Taper & Dennis 2004). Second, treating sampling error as process error is thought to overestimate the importance of density dependence in population growth (Viljugrein et al. 2005; Dennis et al. 2006). 2. In ecology, state-space models are used to account for sampling error when estimating the effects of density and other variables on population growth (Staples et al. 2004; Dennis et al. 2006). In econometrics, regression with instrumental variables is a well-established method that addresses the problem of correlation between regressors and the error term, but requires fewer assumptions than state-space models (Davidson & MacKinnon 1993; Cameron & Trivedi 2005). 3. We used instrumental variables to account for sampling error and fit a generalized linear model to 472 annual observations of population size for 35 Elk Management Units in Montana, from 1928 to 2004. We compared this model with state-space models fit with the likelihood function of Dennis et al. (2006). We discuss the general advantages and disadvantages of each method. Briefly, regression with instrumental variables is valid with fewer distributional assumptions, but state-space models are more efficient when their distributional assumptions are met. 4. Both methods found that population growth was negatively related to population density and winter snow accumulation. Summer rainfall and wolf (Canis lupus) presence had much weaker effects on elk (Cervus elaphus) dynamics [though limitation by wolves is strong in some elk populations with well-established wolf populations (Creel et al. 2007; Creel & Christianson 2008)]. 5. Coupled with predictions for Montana from global and regional climate models, our results

  8. Variable Selection in Logistic Regression Model

    Institute of Scientific and Technical Information of China (English)

    ZHANG Shangli; ZHANG Lili; QIU Kuanmin; LU Ying; CAI Baigen

    2015-01-01

    Variable selection is one of the most important problems in pattern recognition. In linear regression models, many methods can solve this problem, such as the least absolute shrinkage and selection operator (LASSO) and many improved LASSO methods, but there are few variable selection methods for generalized linear models. We study the variable selection problem in the logistic regression model. We propose a new variable selection method, the logistic elastic net, and prove that it has a grouping effect, which means that strongly correlated predictors tend to be in or out of the model together. The logistic elastic net is particularly useful when the number of predictors (p) is much bigger than the number of observations (n). By contrast, the LASSO is not a very satisfactory variable selection method when p is much larger than n. The advantage and effectiveness of this method are demonstrated by real leukemia data and a simulation study.

  9. Regression analysis using dependent Polya trees.

    Science.gov (United States)

    Schörgendorfer, Angela; Branscum, Adam J

    2013-11-30

    Many commonly used models for linear regression analysis force overly simplistic shape and scale constraints on the residual structure of data. We propose a semiparametric Bayesian model for regression analysis that produces data-driven inference by using a new type of dependent Polya tree prior to model arbitrary residual distributions that are allowed to evolve across increasing levels of an ordinal covariate (e.g., time, in repeated measurement studies). By modeling residual distributions at consecutive covariate levels or time points using separate, but dependent Polya tree priors, distributional information is pooled while allowing for broad pliability to accommodate many types of changing residual distributions. We can use the proposed dependent residual structure in a wide range of regression settings, including fixed-effects and mixed-effects linear and nonlinear models for cross-sectional, prospective, and repeated measurement data. A simulation study illustrates the flexibility of our novel semiparametric regression model to accurately capture evolving residual distributions. In an application to immune development data on immunoglobulin G antibodies in children, our new model outperforms several contemporary semiparametric regression models based on a predictive model selection criterion. Copyright © 2013 John Wiley & Sons, Ltd.

  10. Logistic regression when binary predictor variables are highly correlated.

    Science.gov (United States)

    Barker, L; Brown, C

    Standard logistic regression can produce estimates having large mean square error when predictor variables are multicollinear. Ridge regression and principal components regression can reduce the impact of multicollinearity in ordinary least squares regression. Generalizations of these, applicable in the logistic regression framework, are alternatives to standard logistic regression. It is shown that estimates obtained via ridge and principal components logistic regression can have smaller mean square error than estimates obtained through standard logistic regression. Recommendations for choosing among standard, ridge and principal components logistic regression are developed. Published in 2001 by John Wiley & Sons, Ltd.

  11. A Spline Regression Model for Latent Variables

    Science.gov (United States)

    Harring, Jeffrey R.

    2014-01-01

    Spline (or piecewise) regression models have been used in the past to account for patterns in observed data that exhibit distinct phases. The changepoint or knot marking the shift from one phase to the other, in many applications, is an unknown parameter to be estimated. As an extension of this framework, this research considers modeling the…

  12. Binary outcome variables and logistic regression models

    Institute of Scientific and Technical Information of China (English)

    Xinhua LIU

    2011-01-01

    Biomedical researchers often study binary variables that indicate whether or not a specific event, such as remission of depression symptoms, occurs during the study period. The indicator variable Y takes two values, usually coded as one if the event (remission) is present and zero if the event is not present (non-remission). Let p be the probability that the event occurs (Y = 1); then 1 - p will be the probability that the event does not occur (Y = 0).
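
    The record above stops before the model itself is introduced. As a minimal sketch, assuming the standard logit link used in logistic regression, the quantities p and 1 - p relate to the odds and log-odds as follows. The function names and the value p = 0.25 are hypothetical and chosen only for illustration.

```python
import math

def logit(p):
    """Log-odds of an event with probability p, for 0 < p < 1."""
    return math.log(p / (1.0 - p))

def inv_logit(eta):
    """Map a linear predictor eta back to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))

p = 0.25                # hypothetical P(Y = 1), e.g. remission
odds = p / (1.0 - p)    # odds of the event against the non-event
eta = logit(p)          # log-odds, the scale on which the model is linear
```

    Applying inv_logit to eta recovers p, which is why coefficients estimated on the log-odds scale can be converted back to predicted probabilities.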

  13. Regression calibration with more surrogates than mismeasured variables

    KAUST Repository

    Kipnis, Victor

    2012-06-29

    In a recent paper (Weller EA, Milton DK, Eisen EA, Spiegelman D. Regression calibration for logistic regression with multiple surrogates for one exposure. Journal of Statistical Planning and Inference 2007; 137: 449-461), the authors discussed fitting logistic regression models when a scalar main explanatory variable is measured with error by several surrogates, that is, a situation with more surrogates than variables measured with error. They compared two methods of adjusting for measurement error using a regression calibration approximate model as if it were exact. One is the standard regression calibration approach consisting of substituting an estimated conditional expectation of the true covariate given observed data in the logistic regression. The other is a novel two-stage approach when the logistic regression is fitted to multiple surrogates, and then a linear combination of estimated slopes is formed as the estimate of interest. Applying estimated asymptotic variances for both methods in a single data set with some sensitivity analysis, the authors asserted superiority of their two-stage approach. We investigate this claim in some detail. A troubling aspect of the proposed two-stage method is that, unlike standard regression calibration and a natural form of maximum likelihood, the resulting estimates are not invariant to reparameterization of nuisance parameters in the model. We show, however, that, under the regression calibration approximation, the two-stage method is asymptotically equivalent to a maximum likelihood formulation, and is therefore in theory superior to standard regression calibration. However, our extensive finite-sample simulations in the practically important parameter space where the regression calibration model provides a good approximation failed to uncover such superiority of the two-stage method. We also discuss extensions to different data structures.

  14. Interpreting Multiple Linear Regression: A Guidebook of Variable Importance

    Science.gov (United States)

    Nathans, Laura L.; Oswald, Frederick L.; Nimon, Kim

    2012-01-01

    Multiple regression (MR) analyses are commonly employed in social science fields. It is also common for interpretation of results to typically reflect overreliance on beta weights, often resulting in very limited interpretations of variable importance. It appears that few researchers employ other methods to obtain a fuller understanding of what…

  15. Automatic identification of variables in epidemiological datasets using logic regression.

    Science.gov (United States)

    Lorenz, Matthias W; Abdi, Negin Ashtiani; Scheckenbach, Frank; Pflug, Anja; Bülbül, Alpaslan; Catapano, Alberico L; Agewall, Stefan; Ezhov, Marat; Bots, Michiel L; Kiechl, Stefan; Orth, Andreas

    2017-04-13

    For an individual participant data (IPD) meta-analysis, multiple datasets must be transformed into a consistent format, e.g. using uniform variable names. When large numbers of datasets have to be processed, this can be a time-consuming and error-prone task. Automated or semi-automated identification of variables can help to reduce the workload and improve the data quality. For semi-automation, high sensitivity in the recognition of matching variables is particularly important, because it allows creating software which for a target variable presents a choice of source variables, from which a user can choose the matching one, with only low risk of having missed a correct source variable. For each variable in a set of target variables, a number of simple rules were manually created. With logic regression, an optimal Boolean combination of these rules was searched for every target variable, using a random subset of a large database of epidemiological and clinical cohort data (construction subset). In a second subset of this database (validation subset), these optimal combinations of rules were validated. In the construction sample, 41 target variables were allocated on average with a positive predictive value (PPV) of 34%, and a negative predictive value (NPV) of 95%. In the validation sample, PPV was 33%, whereas NPV remained at 94%. In the construction sample, PPV was 50% or less in 63% of all variables, in the validation sample in 71% of all variables. We demonstrated that the application of logic regression in a complex data management task in large epidemiological IPD meta-analyses is feasible. However, the performance of the algorithm is poor, which may require backup strategies.

  16. Integrating models that depend on variable data

    Science.gov (United States)

    Banks, A. T.; Hill, M. C.

    2016-12-01

    Models of human-Earth systems are often developed with the goal of predicting the behavior of one or more dependent variables from multiple independent variables, processes, and parameters. Often dependent variable values range over many orders of magnitude, which complicates evaluation of the fit of the dependent variable values to observations. Many metrics and optimization methods have been proposed to address dependent variable variability, with little consensus being achieved. In this work, we evaluate two such methods: log transformation (based on the dependent variable being log-normally distributed with a constant variance) and error-based weighting (based on a multi-normal distribution with variances that tend to increase as the dependent variable value increases). Error-based weighting has the advantage of encouraging model users to carefully consider data errors, such as measurement and epistemic errors, while log-transformations can be a black box for typical users. Placing the log-transformation into the statistical perspective of error-based weighting has not formerly been considered, to the best of our knowledge. To make the evaluation as clear and reproducible as possible, we use multiple linear regression (MLR). Simulations are conducted with MATLAB. The example represents stream transport of nitrogen with up to eight independent variables. The single dependent variable in our example has values that range over 4 orders of magnitude. Results are applicable to any problem for which individual or multiple data types produce a large range of dependent variable values. For this problem, the log transformation produced good model fit, while some formulations of error-based weighting worked poorly. Results support previous suggestions that error-based weighting derived from a constant coefficient of variation overemphasizes low values and degrades model fit to high values. Applying larger weights to the high values is inconsistent with the log
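
    The two options compared in this record can be illustrated with a small sketch. It is not the authors' MATLAB code: it uses a single predictor instead of up to eight, the data are invented, and the helper weighted_line_fit and the value cv = 0.1 are assumptions made only for this example.

```python
import math

def weighted_line_fit(x, y, w):
    """Weighted least squares for y ~ a + b*x; returns (a, b)."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    b = sxy / sxx
    return ybar - b * xbar, b

# Hypothetical observations spanning about four orders of magnitude.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.2, 9.0, 110.0, 950.0, 10500.0]

# Option 1: log transformation, unit weights on log10(y).
log_y = [math.log10(v) for v in y]
a_log, b_log = weighted_line_fit(x, log_y, [1.0] * len(y))

# Option 2: error-based weighting on the raw scale, assuming a constant
# coefficient of variation cv, so sd_i = cv * y_i and w_i = 1 / sd_i**2.
cv = 0.1
w = [1.0 / (cv * v) ** 2 for v in y]
a_w, b_w = weighted_line_fit(x, y, w)
```

    With a constant coefficient of variation the weight on the smallest observation is many orders of magnitude larger than the weight on the largest one, which is the overemphasis of low values that the abstract describes.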

  17. Model and Variable Selection Procedures for Semiparametric Time Series Regression

    Directory of Open Access Journals (Sweden)

    Risa Kato

    2009-01-01

    Semiparametric regression models are very useful for time series analysis. They facilitate the detection of features resulting from external interventions. The complexity of semiparametric models poses new challenges for issues of nonparametric and parametric inference and model selection that frequently arise from time series data analysis. In this paper, we propose penalized least squares estimators which can simultaneously select significant variables and estimate unknown parameters. An innovative class of variable selection procedure is proposed to select significant variables and basis functions in a semiparametric model. The asymptotic normality of the resulting estimators is established. Information criteria for model selection are also proposed. We illustrate the effectiveness of the proposed procedures with numerical simulations.

  18. Integrated Multiscale Latent Variable Regression and Application to Distillation Columns

    Directory of Open Access Journals (Sweden)

    Muddu Madakyaru

    2013-01-01

    Proper control of distillation columns requires estimating some key variables that are challenging to measure online (such as compositions), which are usually estimated using inferential models. Commonly used inferential models include latent variable regression (LVR) techniques, such as principal component regression (PCR), partial least squares (PLS), and regularized canonical correlation analysis (RCCA). Unfortunately, measured practical data are usually contaminated with errors, which degrade the prediction abilities of inferential models. Therefore, noisy measurements need to be filtered to enhance the prediction accuracy of these models. Multiscale filtering has been shown to be a powerful feature extraction tool. In this work, the advantages of multiscale filtering are utilized to enhance the prediction accuracy of LVR models by developing an integrated multiscale LVR (IMSLVR) modeling algorithm that integrates modeling and feature extraction. The idea behind the IMSLVR modeling algorithm is to filter the process data at different decomposition levels, model the filtered data from each level, and then select the LVR model that optimizes a model selection criterion. The performance of the developed IMSLVR algorithm is illustrated using three examples, one using synthetic data, one using simulated distillation column data, and one using experimental packed bed distillation column data. All examples clearly demonstrate the effectiveness of the IMSLVR algorithm over the conventional methods.

  19. A-Collapsibility of Distribution Dependence and Quantile Regression Coefficients

    CERN Document Server

    Meerschaert, Mark M

    2010-01-01

    The Yule-Simpson paradox notes that an association between random variables can be reversed when averaged over a background variable. Cox and Wermuth (2003) introduced the concept of distribution dependence between two random variables X and Y, and developed two dependence conditions, each of which guarantees that reversal cannot occur. Ma, Xie and Geng (2006) studied the collapsibility of distribution dependence over a background variable W, under a rather strong homogeneity condition. Collapsibility ensures the association remains the same for conditional and marginal models, so that Yule-Simpson reversal cannot occur. In this paper, we investigate a more general condition for avoiding effect reversal: A-collapsibility. The conditions of Cox and Wermuth imply A-collapsibility, without assuming homogeneity. In fact, we show that, when W is a binary variable, collapsibility is equivalent to A-collapsibility plus homogeneity, and A-collapsibility is equivalent to the conditions of Cox and Wermuth. Recently, Co...

  20. Two-step variable selection in quantile regression models

    Directory of Open Access Journals (Sweden)

    FAN Yali

    2015-06-01

    We propose a two-step variable selection procedure for high dimensional quantile regressions, in which the dimension of the covariates, p_n, is much larger than the sample size n. In the first step, we apply an l1 penalty, and we demonstrate that the first-step penalized estimator with the LASSO penalty can reduce the model from an ultra-high dimensional one to a model whose size has the same order as that of the true model, and that the selected model can cover the true model. The second step excludes the remaining irrelevant covariates by applying the adaptive LASSO penalty to the reduced model obtained from the first step. Under some regularity conditions, we show that our procedure enjoys model selection consistency. We conduct a simulation study and a real data analysis to evaluate the finite sample performance of the proposed approach.

  1. Mixed-model Regression for Variable-star Photometry

    Science.gov (United States)

    Dose, Eric

    2016-05-01

    Mixed-model regression, a recent advance from social-science statistics, applies directly to reducing one night's photometric raw data, especially for variable stars in fields with multiple comparison stars. One regression model per filter/passband yields any or all of: transform values, extinction values, nightly zero-points, rapid zero-point fluctuations ("cirrus effect"), ensemble comparisons, vignette and gradient removal arising from incomplete flat-correction, check-star and target-star magnitudes, and specific indications of unusually large catalog magnitude errors. When images from several different fields of view are included, the models improve without complicating the calculations. The mixed-model approach is generally robust to outliers and missing data points, and it directly yields 14 diagnostic plots, used to monitor data set quality and/or residual systematic errors - these diagnostic plots may in fact turn out to be the prime advantage of this approach. Also presented is initial work on a split-annulus approach to sky background estimation, intended to address the sensitivity of photometric observations to noise within the sky-background annulus.

  2. Generalized linear models for categorical and continuous limited dependent variables

    CERN Document Server

    Smithson, Michael

    2013-01-01

    Introduction and Overview; The Nature of Limited Dependent Variables; Overview of GLMs; Estimation Methods and Model Evaluation; Organization of This Book; Discrete Variables; Binary Variables; Logistic Regression; The Binomial GLM; Estimation Methods and Issues; Analyses in R and Stata; Exercises; Nominal Polytomous Variables; Multinomial Logit Model; Conditional Logit and Choice Models; Multinomial Processing Tree Models; Estimation Methods and Model Evaluation; Analyses in R and Stata; Exercises; Ordinal Categorical Variables; Modeling Ordinal Variables: Common Practice versus Best Practice; Ordinal Model Alternatives; Cumulative Mod

  3. Measurement error in the explanatory variable of a binary regression: regression calibration and integrated conditional likelihood in studies of residential radon and lung cancer.

    Science.gov (United States)

    Fearn, T; Hill, D C; Darby, S C

    2008-05-30

    In epidemiology, one approach to investigating the dependence of disease risk on an explanatory variable in the presence of several confounding variables is by fitting a binary regression using a conditional likelihood, thus eliminating the nuisance parameters. When the explanatory variable is measured with error, the estimated regression coefficient is biased usually towards zero. Motivated by the need to correct for this bias in analyses that combine data from a number of case-control studies of lung cancer risk associated with exposure to residential radon, two approaches are investigated. Both employ the conditional distribution of the true explanatory variable given the measured one. The method of regression calibration uses the expected value of the true given measured variable as the covariate. The second approach integrates the conditional likelihood numerically by sampling from the distribution of the true given measured explanatory variable. The two approaches give very similar point estimates and confidence intervals not only for the motivating example but also for an artificial data set with known properties. These results and some further simulations that demonstrate correct coverage for the confidence intervals suggest that for studies of residential radon and lung cancer the regression calibration approach will perform very well, so that nothing more sophisticated is needed to correct for measurement error.
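
    The regression calibration step described above replaces the measured value with the expected true value given the measurement. A minimal sketch, assuming a classical additive error model W = X + U with X and U independent and normally distributed, is given below. This error model, the function name calibrate and the numbers are assumptions for illustration; the study itself may use a different measurement model.

```python
def calibrate(w, mu_x, var_x, var_u):
    """E[X | W = w] when W = X + U, X ~ N(mu_x, var_x), U ~ N(0, var_u)."""
    reliability = var_x / (var_x + var_u)   # share of Var(W) due to true X
    return mu_x + reliability * (w - mu_x)  # shrink w toward the mean of X

# Hypothetical values: true exposure has mean 5 and variance 3,
# measurement error has variance 1, and the measured value is 10.
x_hat = calibrate(10.0, mu_x=5.0, var_x=3.0, var_u=1.0)
```

    The calibrated value x_hat is then used as the covariate in the binary regression in place of the measured value. Because the reliability ratio is below one whenever var_u > 0, measured values are pulled toward the mean, which counteracts the bias toward zero in the estimated coefficient.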

  4. Regression Discontinuity Designs with Multiple Rating-Score Variables

    Science.gov (United States)

    Reardon, Sean F.; Robinson, Joseph P.

    2012-01-01

    In the absence of a randomized control trial, regression discontinuity (RD) designs can produce plausible estimates of the treatment effect on an outcome for individuals near a cutoff score. In the standard RD design, individuals with rating scores higher than some exogenously determined cutoff score are assigned to one treatment condition; those…

  5. No rationale for 1 variable per 10 events criterion for binary logistic regression analysis

    Directory of Open Access Journals (Sweden)

    Maarten van Smeden

    2016-11-01

    Background: Ten events per variable (EPV) is a widely advocated minimal criterion for sample size considerations in logistic regression analysis. Of three previous simulation studies that examined this minimal EPV criterion, only one supports the use of a minimum of 10 EPV. In this paper, we examine the reasons for substantial differences between these extensive simulation studies. Methods: The current study uses Monte Carlo simulations to evaluate small sample bias, coverage of confidence intervals and mean square error of logit coefficients. Logistic regression models fitted by maximum likelihood and a modified estimation procedure, known as Firth’s correction, are compared. Results: The results show that besides EPV, the problems associated with low EPV depend on other factors such as the total sample size. It is also demonstrated that simulation results can be dominated by even a few simulated data sets for which the prediction of the outcome by the covariates is perfect (‘separation’). We reveal that different approaches for identifying and handling separation lead to substantially different simulation results. We further show that Firth’s correction can be used to improve the accuracy of regression coefficients and alleviate the problems associated with separation. Conclusions: The current evidence supporting EPV rules for binary logistic regression is weak. Given our findings, there is an urgent need for new research to provide guidance for supporting sample size considerations for binary logistic regression analysis.
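
    The notion of separation mentioned above can be made concrete with a small check. The sketch below covers only the simplest case, a single covariate, and the function name separation_1d and the data are hypothetical. It is not the detection procedure used in the study, and separation by a combination of several covariates requires a more general test.

```python
def separation_1d(x, y):
    """Return True if one threshold on x splits y = 0 from y = 1 perfectly."""
    x0 = [xi for xi, yi in zip(x, y) if yi == 0]
    x1 = [xi for xi, yi in zip(x, y) if yi == 1]
    if not x0 or not x1:
        return True  # only one outcome class observed: degenerate case
    return max(x0) < min(x1) or max(x1) < min(x0)

# Hypothetical data: the first outcome vector is perfectly predicted by x,
# the second is not.
x = [1.0, 2.0, 3.0, 4.0]
separated = separation_1d(x, [0, 0, 1, 1])
overlapping = separation_1d(x, [0, 1, 0, 1])
```

    When such a check returns True, the maximum likelihood estimate of the slope does not exist as a finite value, which is why a few separated data sets can dominate simulation summaries.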

  6. Unitary Response Regression Models

    Science.gov (United States)

    Lipovetsky, S.

    2007-01-01

    The dependent variable in a regular linear regression is a numerical variable, and in a logistic regression it is a binary or categorical variable. In these models the dependent variable has varying values. However, there are problems yielding an identity output of a constant value which can also be modelled in a linear or logistic regression with…

  7. VARIABLE SELECTION BY PSEUDO WAVELETS IN HETEROSCEDASTIC REGRESSION MODELS INVOLVING TIME SERIES

    Institute of Scientific and Technical Information of China (English)

    2006-01-01

    A simple but efficient method has been proposed to select variables in heteroscedastic regression models. It is shown that the pseudo empirical wavelet coefficients corresponding to the significant explanatory variables in the regression models are clearly larger than those of the nonsignificant ones, on the basis of which a procedure is developed to select variables in regression models. The coefficients of the models are also estimated. All estimators are proved to be consistent.

  8. Standardizing effect size from linear regression models with log-transformed variables for meta-analysis.

    Science.gov (United States)

    Rodríguez-Barranco, Miguel; Tobías, Aurelio; Redondo, Daniel; Molina-Portillo, Elena; Sánchez, María José

    2017-03-17

    Meta-analysis is very useful to summarize the effect of a treatment or a risk factor for a given disease. Often studies report results based on log-transformed variables in order to meet the principal assumptions of a linear regression model. If this is the case for some, but not all studies, the effects need to be homogenized. We derived a set of formulae to transform absolute changes into relative ones, and vice versa, to allow including all results in a meta-analysis. We applied our procedure to all possible combinations of log-transformed independent or dependent variables. We also evaluated it in a simulation based on two variables either normally or asymmetrically distributed. In all the scenarios, and based on different change criteria, the effect size estimated by the derived set of formulae was equivalent to the real effect size. To avoid biased estimates of the effect, this procedure should be used with caution in the case of independent variables with asymmetric distributions that significantly differ from the normal distribution. We illustrate the procedure with a meta-analysis on the potential effects on neurodevelopment in children exposed to arsenic and manganese. The proposed procedure has been shown to be valid and capable of expressing the effect size of a linear regression model based on different change criteria in the variables. Homogenizing the results from different studies beforehand allows them to be combined in a meta-analysis, independently of whether the transformations had been performed on the dependent and/or independent variables.
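
    The abstract does not reproduce the derived formulae. As a rough illustration of what converting between absolute and relative changes involves, the sketch below uses the standard textbook interpretations of coefficients under natural-log transformations. These are assumptions for illustration, not necessarily the formulae of the paper, and the function names are hypothetical.

```python
import math

def pct_change_y_per_unit_x(beta):
    """Model log(y) = a + beta * x: percent change in y for a one-unit increase in x."""
    return (math.exp(beta) - 1.0) * 100.0

def abs_change_y_per_pct_x(beta, pct):
    """Model y = a + beta * log(x): absolute change in y when x increases by pct percent."""
    return beta * math.log(1.0 + pct / 100.0)

def pct_change_y_per_pct_x(beta, pct):
    """Model log(y) = a + beta * log(x): percent change in y when x increases by pct percent."""
    return ((1.0 + pct / 100.0) ** beta - 1.0) * 100.0

# Hypothetical coefficients, used only to show the calls.
print(pct_change_y_per_unit_x(0.05))
print(abs_change_y_per_pct_x(2.0, 10.0))
print(pct_change_y_per_pct_x(0.5, 10.0))
```

    Expressing every study on the same scale in this way (for example, percent change in the outcome per 10% increase in exposure) is what allows results from transformed and untransformed models to be pooled.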

  9. Cross Validation of Selection of Variables in Multiple Regression.

    Science.gov (United States)

    1979-12-01

    Bomber IBMNAV * BOMNAV Navigation-Cargo * * CARNAV Sensory-Fighter * SF FGTSEN Sensory - Bomber * SB BOMSEN Communication - Fighter IFGCOM CF FGTCOM...of Variables Variable No. Recode FGTNAV 1 0 LESS THAN 1 1 OR OVER BONNAV 2 0 LESS THAN S1 OR OVER CARNAV 3 0 LESS THAN S1 OR OVER FGTSEN 4 0 LESS THAN...cc x x x x x x x CARNAV X X X X X X x XMTR x X X X X x PD X x X X X X UP x- *Those which AID determined. 44 This value was lowered to 3 in the

  10. Energy-dependent variability from accretion flows

    OpenAIRE

    Zdziarski, Andrzej A.

    2005-01-01

    We develop a formalism to calculate energy-dependent fractional variability (rms) in accretion flows. We consider rms spectra resulting from radial dependencies of the level of local variability (as expected from propagation of disturbances in accretion flows) assuming the constant shape of the spectrum emitted at a given radius. We consider the cases when the variability of the flow is either coherent or incoherent between different radial zones. As example local emission, we consider blackb...

  11. Analyzing Regression-Discontinuity Designs with Multiple Assignment Variables: A Comparative Study of Four Estimation Methods

    Science.gov (United States)

    Wong, Vivian C.; Steiner, Peter M.; Cook, Thomas D.

    2013-01-01

    In a traditional regression-discontinuity design (RDD), units are assigned to treatment on the basis of a cutoff score and a continuous assignment variable. The treatment effect is measured at a single cutoff location along the assignment variable. This article introduces the multivariate regression-discontinuity design (MRDD), where multiple…

  12. The number of subjects per variable required in linear regression analyses

    NARCIS (Netherlands)

    P.C. Austin (Peter); E.W. Steyerberg (Ewout)

    2015-01-01

    Objectives: To determine the number of independent variables that can be included in a linear regression model. Study Design and Setting: We used a series of Monte Carlo simulations to examine the impact of the number of subjects per variable (SPV) on the accuracy of estimated regression c

  13. An improved strategy for regression of biophysical variables and Landsat ETM+ data.

    Science.gov (United States)

    Warren B. Cohen; Thomas K. Maiersperger; Stith T. Gower; David P. Turner

    2003-01-01

    Empirical models are important tools for relating field-measured biophysical variables to remote sensing data. Regression analysis has been a popular empirical method of linking these two types of data to provide continuous estimates for variables such as biomass, percent woody canopy cover, and leaf area index (LAI). Traditional methods of regression are not...

  14. Joint Bayesian variable and graph selection for regression models with network-structured predictors.

    Science.gov (United States)

    Peterson, Christine B; Stingo, Francesco C; Vannucci, Marina

    2016-03-30

    In this work, we develop a Bayesian approach to perform selection of predictors that are linked within a network. We achieve this by combining a sparse regression model relating the predictors to a response variable with a graphical model describing conditional dependencies among the predictors. The proposed method is well-suited for genomic applications because it allows the identification of pathways of functionally related genes or proteins that impact an outcome of interest. In contrast to previous approaches for network-guided variable selection, we infer the network among predictors using a Gaussian graphical model and do not assume that network information is available a priori. We demonstrate that our method outperforms existing methods in identifying network-structured predictors in simulation settings and illustrate our proposed model with an application to inference of proteins relevant to glioblastoma survival.

  15. A Monte Carlo simulation study comparing linear regression, beta regression, variable-dispersion beta regression and fractional logit regression at recovering average difference measures in a two sample design.

    Science.gov (United States)

    Meaney, Christopher; Moineddin, Rahim

    2014-01-24

    In biomedical research, response variables are often encountered which have bounded support on the open unit interval--(0,1). Traditionally, researchers have attempted to estimate covariate effects on these types of response data using linear regression. Alternative modelling strategies may include: beta regression, variable-dispersion beta regression, and fractional logit regression models. This study employs a Monte Carlo simulation design to compare the statistical properties of the linear regression model to that of the more novel beta regression, variable-dispersion beta regression, and fractional logit regression models. In the Monte Carlo experiment we assume a simple two sample design. We assume observations are realizations of independent draws from their respective probability models. The randomly simulated draws from the various probability models are chosen to emulate average proportion/percentage/rate differences of pre-specified magnitudes. Following simulation of the experimental data we estimate average proportion/percentage/rate differences. We compare the estimators in terms of bias, variance, type-1 error and power. Estimates of Monte Carlo error associated with these quantities are provided. If response data are beta distributed with constant dispersion parameters across the two samples, then all models are unbiased and have reasonable type-1 error rates and power profiles. If the response data in the two samples have different dispersion parameters, then the simple beta regression model is biased. When the sample size is small (N0 = N1 = 25) linear regression has superior type-1 error rates compared to the other models. Small sample type-1 error rates can be improved in beta regression models using bias correction/reduction methods. In the power experiments, variable-dispersion beta regression and fractional logit regression models have slightly elevated power compared to linear regression models. Similar results were observed if the

  16. Limited dependent variable models for panel data

    NARCIS (Netherlands)

    Charlier, E.

    1997-01-01

    Many economic phenomena require limited variable models for an appropriate treatment. In addition, panel data models allow the inclusion of unobserved individual-specific effects. These models are combined in this thesis. Distributional assumptions in the limited dependent variable models are

  17. Regressions during reading: The cost depends on the cause.

    Science.gov (United States)

    Eskenazi, Michael A; Folk, Jocelyn R

    2016-11-21

    The direction and duration of eye movements during reading are predominantly determined by cognitive and linguistic processing, but some low-level oculomotor effects also influence the duration and direction of eye movements. One such effect is inhibition of return (IOR), which results in an increased latency to return attention to a target that has been previously attended (Posner & Cohen, Attention and Performance X: Control of Language Processes, 32, 531-556, 1984). Although this is a low-level effect, it has also been found in the complex task of reading (Henderson & Luke, Psychonomic Bulletin & Review, 19(6), 1101-1107, 2012; Rayner, Juhasz, Ashby, & Clifton, Vision Research, 43(9), 1027-1034, 2003). The purpose of the current study was to isolate the potentially different causes of regressive eye movements: to adjust for oculomotor error and to assist with comprehension difficulties. We found that readers demonstrated an IOR effect when regressions were caused by oculomotor error, but not when regressions were caused by comprehension difficulties. The results suggest that IOR is primarily associated with low-level oculomotor control of eye movements, and that regressive eye movements that are controlled by comprehension processes are not subject to IOR effects. The results have implications for understanding the relationship between oculomotor and cognitive control of eye movements and for models of eye movement control.

  18. Evolution variable dependence of jet substructure

    CERN Document Server

    Sakaki, Yasuhito

    2015-01-01

    Studies on jet substructure have evolved significantly in recent years. Jet substructure is essentially determined by QCD radiations and non-perturbative effects. Predictions of jet substructure are usually different among Monte Carlo event generators, and are governed by the parton shower algorithm implemented. For leading logarithmic parton shower, even though one of the core variables is the evolution variable, its choice is not unique. We examine evolution variable dependence of the jet substructure by developing a parton shower generator that interpolates between different evolution variables using a parameter $\alpha$. Jet shape variables and associated jet rates for quark and gluon jets are used to demonstrate the $\alpha$-dependence of the jet substructure. We find angular ordered shower predicts wider jets, while relative transverse momentum ($p_{\bot}$) ordered shower predicts narrower jets. This is qualitatively in agreement with the missing phase space of $p_{\bot}$ ordered showers. Such differenc...

  19. A comparison of various methods for multivariate regression with highly collinear variables

    NARCIS (Netherlands)

    Kiers, Henk A.L.; Smilde, Age K.

    2007-01-01

    Regression tends to give very unstable and unreliable regression weights when predictors are highly collinear. Several methods have been proposed to counter this problem. A subset of these do so by finding components that summarize the information in the predictors and the criterion variables. The p

  20. The Detection and Interpretation of Interaction Effects between Continuous Variables in Multiple Regression.

    Science.gov (United States)

    Jaccard, James; And Others

    1990-01-01

    Issues in the detection and interpretation of interaction effects between quantitative variables in multiple regression analysis are discussed. Recent discussions associated with problems of multicollinearity are reviewed in the context of the conditional nature of multiple regression with product terms. (TJH)

  2. Multiple linear regression with correlations among the predictor variables. Theory and computer algorithm ridge (FORTRAN 77)

    Science.gov (United States)

    van Gaans, P. F. M.; Vriend, S. P.

    In geoscience, ridge regression is usually a more appropriate technique than ordinary least-squares regression, especially when the predictor variables are highly intercorrelated. A FORTRAN 77 program RIDGE for ridged multiple linear regression is presented. The theory of linear regression and ridge regression is treated, to allow for a careful interpretation of the results and to understand the structure of the program. The program gives various parameters to evaluate the extent of multicollinearity within a given regression problem, such as the correlation matrix, multiple correlations among the predictors, variance inflation factors, eigenvalues, condition number, and the determinant of the predictor correlation matrix. The best method for the optimum choice of the ridge parameter with ridge regression has not been established yet. Estimates of the ridge bias, ridged variance inflation factors, estimates, and norms for the ridge parameter therefore are given as output by RIDGE and should complement inspection of the ridge traces. Application within the earth sciences is discussed.
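
    The diagnostics listed in this abstract can be sketched in a few lines. The example below is written in Python rather than FORTRAN 77 and is not the RIDGE program; the function names and the synthetic data are hypothetical. The condition number is taken as the square root of the ratio of the largest to the smallest eigenvalue of the correlation matrix, which is one common definition among several.

```python
import numpy as np

def collinearity_diagnostics(X):
    """Multicollinearity measures computed from the predictor correlation matrix."""
    R = np.corrcoef(np.asarray(X, dtype=float), rowvar=False)
    eig = np.linalg.eigvalsh(R)
    return {
        "vif": np.diag(np.linalg.inv(R)),          # variance inflation factors
        "eigenvalues": eig,
        "condition_number": np.sqrt(eig.max() / eig.min()),
        "determinant": np.linalg.det(R),
    }

def ridge_coefficients(X, y, k):
    """Standardized ridge coefficients (R + k*I)^(-1) r for ridge parameter k."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)   # standardized predictors
    yc = y - y.mean()                                   # centered response
    R = Z.T @ Z / (n - 1)                               # predictor correlation matrix
    r = Z.T @ yc / (n - 1)
    return np.linalg.solve(R + k * np.eye(R.shape[0]), r)

# Synthetic data: x2 is almost a copy of x1, so the two are highly intercorrelated.
rng = np.random.default_rng(0)
x1 = rng.normal(size=50)
x2 = x1 + 0.05 * rng.normal(size=50)
x3 = rng.normal(size=50)
X = np.column_stack([x1, x2, x3])
y = x1 + x3 + rng.normal(size=50)

diag = collinearity_diagnostics(X)
print("VIF:", diag["vif"])
print("condition number:", diag["condition_number"])
# A simple ridge trace: coefficients for a few values of the ridge parameter.
for k in (0.0, 0.01, 0.1):
    print(k, ridge_coefficients(X, y, k))
```

    Setting k = 0 reproduces ordinary least squares on the standardized variables; increasing k shrinks the coefficient vector, and inspecting how the coefficients stabilise across k is the ridge-trace idea mentioned above.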

  3. Sparse Reduced-Rank Regression for Simultaneous Dimension Reduction and Variable Selection

    KAUST Repository

    Chen, Lisha

    2012-12-01

    The reduced-rank regression is an effective method in predicting multiple response variables from the same set of predictor variables. It reduces the number of model parameters and takes advantage of interrelations between the response variables and hence improves predictive accuracy. We propose to select relevant variables for reduced-rank regression by using a sparsity-inducing penalty. We apply a group-lasso type penalty that treats each row of the matrix of the regression coefficients as a group and show that this penalty satisfies certain desirable invariance properties. We develop two numerical algorithms to solve the penalized regression problem and establish the asymptotic consistency of the proposed method. In particular, the manifold structure of the reduced-rank regression coefficient matrix is considered and studied in our theoretical analysis. In our simulation study and real data analysis, the new method is compared with several existing variable selection methods for multivariate regression and exhibits competitive performance in prediction and variable selection. © 2012 American Statistical Association.

  4. Hot Resistance Estimation for Dry Type Transformer Using Multiple Variable Regression, Multiple Polynomial Regression and Soft Computing Techniques

    Directory of Open Access Journals (Sweden)

    M. Srinivasan

    2012-01-01

    Full Text Available Problem statement: This study presents a novel method for the determination of the average winding temperature rise of transformers under their predetermined field operating conditions. Rise in the winding temperature was determined from the estimated values of winding resistance during the heat run test conducted as per the IEC standard. Approach: The estimation of hot resistance was modeled using Multiple Variable Regression (MVR), Multiple Polynomial Regression (MPR) and soft computing techniques such as Artificial Neural Network (ANN) and Adaptive Neuro Fuzzy Inference System (ANFIS). The modeled hot resistance will help to find the load losses at any load situation without using a complicated measurement setup in transformers. Results: These techniques were applied to hot resistance estimation for a dry-type transformer using the input variables cold resistance, ambient temperature and temperature rise. The results are compared and they show good agreement between measured and computed values. Conclusion: The proposed methods are verified using experimental results obtained from a temperature rise test performed on a 55 kVA dry-type transformer.

  5. Improved Regression Analysis of Temperature-Dependent Strain-Gage Balance Calibration Data

    Science.gov (United States)

    Ulbrich, N.

    2015-01-01

    An improved approach is discussed that may be used to directly include first and second order temperature effects in the load prediction algorithm of a wind tunnel strain-gage balance. The improved approach was designed for the Iterative Method that fits strain-gage outputs as a function of calibration loads and uses a load iteration scheme during the wind tunnel test to predict loads from measured gage outputs. The improved approach assumes that the strain-gage balance is at a constant uniform temperature when it is calibrated and used. First, the method introduces a new independent variable for the regression analysis of the balance calibration data. The new variable is designed as the difference between the uniform temperature of the balance and a global reference temperature. This reference temperature should be the primary calibration temperature of the balance so that, if needed, a tare load iteration can be performed. Then, two temperature-dependent terms are included in the regression models of the gage outputs. They are the temperature difference itself and the square of the temperature difference. Simulated temperature-dependent data obtained from Triumph Aerospace's 2013 calibration of NASA's ARC-30K five component semi-span balance is used to illustrate the application of the improved approach.
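
    The two temperature-dependent regressors described here can be sketched as follows. This is a heavily simplified illustration with one gage and one load component, not the Iterative Method itself; the function name `fit_gage_output`, the units and the synthetic calibration data are all hypothetical.

```python
import numpy as np

def fit_gage_output(load, temp, output, ref_temp):
    """Least-squares fit of output = c0 + c1*load + c2*dT + c3*dT**2,

    where dT is the balance temperature minus the reference temperature.
    """
    load = np.asarray(load, dtype=float)
    dT = np.asarray(temp, dtype=float) - ref_temp
    A = np.column_stack([np.ones_like(load), load, dT, dT ** 2])
    coef, *_ = np.linalg.lstsq(A, np.asarray(output, dtype=float), rcond=None)
    return coef

# Synthetic, noise-free calibration data at three uniform balance temperatures.
load = np.tile(np.linspace(0.0, 100.0, 5), 3)
temp = np.repeat([60.0, 70.0, 80.0], 5)
ref_temp = 70.0                       # assumed primary calibration temperature
dT = temp - ref_temp
output = 0.2 + 1.5 * load + 0.03 * dT + 0.001 * dT ** 2

coef = fit_gage_output(load, temp, output, ref_temp)
print(coef)
```

    Measuring the temperature relative to the primary calibration temperature means both extra terms vanish at that temperature, so the fit reduces to the temperature-independent model there.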

  6. The Generalized Regression Discontinuity Design: Using Multiple Assignment Variables and Cutoffs to Estimate Treatment Effects

    Science.gov (United States)

    Wong, Vivian C.; Steiner, Peter M.; Cook, Thomas D.

    2009-01-01

    This paper introduces a generalization of the regression-discontinuity design (RDD). Traditionally, RDD is considered in a two-dimensional framework, with a single assignment variable and cutoff. Treatment effects are measured at a single location along the assignment variable. However, this represents a specialized (and straight-forward)…

  7. AN IMPROVED STRATEGY FOR REGRESSION OF BIOPHYSICAL VARIABLES AND LANDSAT ETM+ DATA. (R828309)

    Science.gov (United States)

    Empirical models are important tools for relating field-measured biophysical variables to remote sensing data. Regression analysis has been a popular empirical method of linking these two types of data to provide continuous estimates for variables such as biomass, percent wood...

  8. Energy-dependent variability from accretion flows

    CERN Document Server

    Zdziarski, A A

    2005-01-01

    We develop a formalism to calculate energy-dependent fractional variability (rms) in accretion flows. We consider rms spectra resulting from radial dependencies of the level of local variability (as expected from propagation of disturbances in accretion flows) assuming the constant shape of the spectrum emitted at a given radius. We consider the cases when the variability of the flow is either coherent or incoherent between different radial zones. As example local emission, we consider blackbody, Wien and thermal Comptonization spectra. In addition to numerical results, we present a number of analytical formulae for the resulting rms. We also find an analytical formula for the disc Wien spectrum, which we find to be a very good approximation to the disc blackbody. We compare our results to the rms spectrum observed in an ultrasoft state of GRS 1915+105.

  9. Causal relationship model between variables using linear regression to improve professional commitment of lecturer

    Science.gov (United States)

    Setyaningsih, S.

    2017-01-01

    Building a leading university requires lecturers who are professionally committed. Commitment is measured through willpower, loyalty, pride, and integrity as a professional lecturer. A sample of 135 of 337 university lecturers provided the data. Data were analyzed using validity and reliability tests and multiple linear regression. Many studies have found a link to the commitment of lecturers, but the underlying cause of the causal relationship is generally neglected. The results indicate that the professional commitment of lecturers is affected by the variables empowerment, academic culture, and trust. The relationship model between the variables is composed of three substructures. The first substructure consists of the endogenous variable professional commitment and three exogenous variables, namely academic culture, empowerment and trust, as well as the residual variable ɛy. The second substructure consists of one endogenous variable, trust, and two exogenous variables, namely empowerment and academic culture, as well as the residual variable ɛ3. The third substructure consists of one endogenous variable, academic culture, and one exogenous variable, empowerment, as well as the residual variable ɛ2. Multiple linear regression was used in the path model for each substructure. The results showed that the hypothesis was supported, and these findings provide empirical evidence that increasing the variables will increase the professional commitment of the lecturers.

  10. REGRESSION DEPENDENCE CONSTRUCTION METHODOLOGY FOR TRACTION CURVES USING LEAST SQUARE METHOD

    Directory of Open Access Journals (Sweden)

    V. Ravino

    2013-01-01

    Full Text Available The paper presents a methodology for constructing regression dependences for traction curves of various tractors under different operating conditions. The dependences are constructed with the help of Microsoft Excel.

  11. Comparison of objective Bayes factors for variable selection in parametric regression models for survival analysis.

    Science.gov (United States)

    Cabras, Stefano; Castellanos, Maria Eugenia; Perra, Silvia

    2014-11-20

    This paper considers the problem of selecting a set of regressors when the response variable is distributed according to a specified parametric model and observations are censored. Under a Bayesian perspective, the most widely used tools are Bayes factors (BFs), which are undefined when improper priors are used. In order to overcome this issue, fractional (FBF) and intrinsic (IBF) BFs have become common tools for model selection. Both depend on the size, Nt, of a minimal training sample (MTS), while the IBF also depends on the specific MTS used. In the case of regression with censored data, the definition of an MTS is problematic because only uncensored data can turn the improper prior into a proper posterior and also because full exploration of the space of the MTSs, which also includes censored observations, is needed to avoid bias in model selection. To address this concern, a sequential MTS was proposed, but it has the drawback of increasing the number of possible MTSs as Nt becomes random. For this reason, we explore the behaviour of the FBF, contextualizing its definition to censored data. We show that these are consistent, providing also the corresponding fractional prior. Finally, a large simulation study and an application to real data are used to compare IBF, FBF and the well-known Bayesian information criterion.

  12. Simultaneous estimation and variable selection in median regression using Lasso-type penalty.

    Science.gov (United States)

    Xu, Jinfeng; Ying, Zhiliang

    2010-06-01

    We consider median regression with a LASSO-type penalty term for variable selection. With a fixed number of variables in the regression model, a two-stage method is proposed for simultaneous estimation and variable selection where the degree of penalty is adaptively chosen. A Bayesian information criterion type approach is proposed and used to obtain a data-driven procedure which is proved to automatically select asymptotically optimal tuning parameters. It is shown that the resultant estimator achieves the so-called oracle property. The combination of median regression and the LASSO penalty is computationally easy to implement via standard linear programming. A random perturbation scheme can be used to obtain a simple estimator of the standard error. Simulation studies are conducted to assess the finite-sample performance of the proposed method. We illustrate the methodology with a real example.

  13. The number of subjects per variable required in linear regression analyses.

    Science.gov (United States)

    Austin, Peter C; Steyerberg, Ewout W

    2015-06-01

    To determine the number of independent variables that can be included in a linear regression model. We used a series of Monte Carlo simulations to examine the impact of the number of subjects per variable (SPV) on the accuracy of estimated regression coefficients and standard errors, on the empirical coverage of estimated confidence intervals, and on the accuracy of the estimated R² of the fitted model. A minimum of approximately two SPV tended to result in estimation of regression coefficients with relative bias of less than 10%. Furthermore, with this minimum number of SPV, the standard errors of the regression coefficients were accurately estimated and estimated confidence intervals had approximately the advertised coverage rates. A much higher number of SPV were necessary to minimize bias in estimating the model R², although adjusted R² estimates behaved well. The bias in estimating the model R² statistic was inversely proportional to the magnitude of the proportion of variation explained by the population regression model. Linear regression models require only two SPV for adequate estimation of regression coefficients, standard errors, and confidence intervals. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.
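
    Why the unadjusted R² needs many more subjects per variable than the coefficients do can be seen from two standard formulas, sketched below. The example is not taken from the paper: the function names are hypothetical, and the expected-value formula assumes a model with an intercept, normal errors and no true relationship between predictors and response.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R-squared for n subjects and p predictors (intercept not counted)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

def expected_r2_under_null(n, p):
    """Expected sample R-squared when the population R-squared is zero."""
    return p / (n - 1)

# Two subjects per variable: n = 20 subjects, p = 10 predictors.
n, p = 20, 10
r2_null = expected_r2_under_null(n, p)
print("SPV:", n / p)
print("expected R2 with no true effect:", r2_null)
print("adjusted R2 at that value:", adjusted_r2(r2_null, n, p))
```

    At two SPV the sample R² is expected to exceed one half even when the predictors explain nothing, while the adjusted statistic removes that optimism, which is consistent with the abstract's remark that adjusted R² estimates behaved well.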

  14. Heteroscedasticity as a Basis of Direction Dependence in Reversible Linear Regression Models.

    Science.gov (United States)

    Wiedermann, Wolfgang; Artner, Richard; von Eye, Alexander

    2017-01-01

    Heteroscedasticity is a well-known issue in linear regression modeling. When heteroscedasticity is observed, researchers are advised to remedy possible model misspecification of the explanatory part of the model (e.g., considering alternative functional forms and/or omitted variables). The present contribution discusses another source of heteroscedasticity in observational data: Directional model misspecifications in the case of nonnormal variables. Directional misspecification refers to situations where alternative models are equally likely to explain the data-generating process (e.g., x → y versus y → x). It is shown that the homoscedasticity assumption is likely to be violated in models that erroneously treat true nonnormal predictors as response variables. Recently, Direction Dependence Analysis (DDA) has been proposed as a framework to empirically evaluate the direction of effects in linear models. The present study links the phenomenon of heteroscedasticity with DDA and describes visual diagnostics and nine homoscedasticity tests that can be used to make decisions concerning the direction of effects in linear models. Results of a Monte Carlo simulation that demonstrate the adequacy of the approach are presented. An empirical example is provided, and applicability of the methodology in cases of violated assumptions is discussed.

  15. Prioritizing Highway Safety Manual's crash prediction variables using boosted regression trees.

    Science.gov (United States)

    Saha, Dibakar; Alluri, Priyanka; Gan, Albert

    2015-06-01

    The Highway Safety Manual (HSM) recommends using the empirical Bayes (EB) method with locally derived calibration factors to predict an agency's safety performance. However, the data needs for deriving these local calibration factors are significant, requiring very detailed roadway characteristics information. Many of the data variables identified in the HSM are currently unavailable in the states' databases. Moreover, the process of collecting and maintaining all the HSM data variables is cost-prohibitive. Prioritization of the variables based on their impact on crash predictions would, therefore, help to identify influential variables for which data could be collected and maintained for continued updates. This study aims to determine the impact of each independent variable identified in the HSM on crash predictions. A relatively recent data mining approach called boosted regression trees (BRT) is used to investigate the association between the variables and crash predictions. The BRT method can effectively handle different types of predictor variables, identify very complex and non-linear association among variables, and compute variable importance. Five years of crash data from 2008 to 2012 on two urban and suburban facility types, two-lane undivided arterials and four-lane divided arterials, were analyzed for estimating the influence of variables on crash predictions. Variables were found to exhibit non-linear and sometimes complex relationship to predicted crash counts. In addition, only a few variables were found to explain most of the variation in the crash data. Published by Elsevier Ltd.

  16. Solving the Omitted Variables Problem of Regression Analysis Using the Relative Vertical Position of Observations

    Directory of Open Access Journals (Sweden)

    Jonathan E. Leightner

    2012-01-01

    Full Text Available The omitted variables problem is one of the most serious problems in regression analysis. The standard approach to the omitted variables problem is to find instruments, or proxies, for the omitted variables, but this approach makes strong assumptions that are rarely met in practice. This paper introduces best projection reiterative truncated projected least squares (BP-RTPLS), the third generation of a technique that solves the omitted variables problem without using proxies or instruments. This paper presents a theoretical argument that BP-RTPLS produces unbiased reduced form estimates when there are omitted variables. This paper also provides simulation evidence that shows OLS produces between 250% and 2450% more errors than BP-RTPLS when there are omitted variables and when measurement and round-off error is 1 percent or less. In an example, the government spending multiplier is estimated using annual data for the USA between 1929 and 2010.

  17. Regression mixture models : Does modeling the covariance between independent variables and latent classes improve the results?

    NARCIS (Netherlands)

    Lamont, A.E.; Vermunt, J.K.; Van Horn, M.L.

    2016-01-01

    Regression mixture models are increasingly used as an exploratory approach to identify heterogeneity in the effects of a predictor on an outcome. In this simulation study, we tested the effects of violating an implicit assumption often made in these models; that is, independent variables in the

  18. Family background variables as instruments for education in income regressions: A Bayesian analysis

    NARCIS (Netherlands)

    L.F. Hoogerheide (Lennart); J.H. Block (Jörn); A.R. Thurik (Roy)

    2012-01-01

    The validity of family background variables instrumenting education in income regressions has been much criticized. In this paper, we use data from the 2004 German Socio-Economic Panel and Bayesian analysis to analyze to what degree violations of the strict validity assumption affect the

  19. Genetic Instrumental Variable (GIV) Regression: Explaining Socioeconomic and Health Outcomes in Non-Experimental Data

    NARCIS (Netherlands)

    T.A. DiPrete (Thomas); C. Burik (Casper); Ph.D. Koellinger (Philipp)

    2017-01-01

    We introduce Genetic Instrumental Variables (GIV) regression – a method to estimate causal effects in non-experimental data with many possible applications in the social sciences and epidemiology. In non-experimental data, genetic correlation between the outcome and the exposure of

  20. Analyzing Regression-Discontinuity Designs with Multiple Assignment Variables: A Comparative Study of Four Estimation Methods

    Science.gov (United States)

    Wong, Vivian C.; Steiner, Peter M.; Cook, Thomas D.

    2012-01-01

    In a traditional regression-discontinuity design (RDD), units are assigned to treatment and comparison conditions solely on the basis of a single cutoff score on a continuous assignment variable. The discontinuity in the functional form of the outcome at the cutoff represents the treatment effect, or the average treatment effect at the cutoff.…

  1. Point Estimates and Confidence Intervals for Variable Importance in Multiple Linear Regression

    Science.gov (United States)

    Thomas, D. Roland; Zhu, PengCheng; Decady, Yves J.

    2007-01-01

    The topic of variable importance in linear regression is reviewed, and a measure first justified theoretically by Pratt (1987) is examined in detail. Asymptotic variance estimates are used to construct individual and simultaneous confidence intervals for these importance measures. A simulation study of their coverage properties is reported, and an…

  2. Variables Associated with Communicative Participation in People with Multiple Sclerosis: A Regression Analysis

    Science.gov (United States)

    Baylor, Carolyn; Yorkston, Kathryn; Bamer, Alyssa; Britton, Deanna; Amtmann, Dagmar

    2010-01-01

    Purpose: To explore variables associated with self-reported communicative participation in a sample (n = 498) of community-dwelling adults with multiple sclerosis (MS). Method: A battery of questionnaires was administered online or on paper per participant preference. Data were analyzed using multiple linear backward stepwise regression. The…

  3. Does the Magnitude of the Link between Unemployment and Crime Depend on the Crime Level? A Quantile Regression Approach

    Directory of Open Access Journals (Sweden)

    Horst Entorf

    2015-07-01

    Full Text Available Two alternative hypotheses – referred to as opportunity- and stigma-based behavior – suggest that the magnitude of the link between unemployment and crime also depends on preexisting local crime levels. In order to analyze conjectured nonlinearities between both variables, we use quantile regressions applied to German district panel data. While both conventional OLS and quantile regressions confirm the positive link between unemployment and crime for property crimes, results for assault differ with respect to the method of estimation. Whereas conventional mean regressions do not show any significant effect (which would confirm the usual result found for violent crimes in the literature), quantile regression reveals that size and importance of the relationship are conditional on the crime rate. The partial effect is significantly positive for moderately low and median quantiles of local assault rates.

  4. Introduction to statistical modelling 2: categorical variables and interactions in linear regression.

    Science.gov (United States)

    Lunt, Mark

    2015-07-01

    In the first article in this series we explored the use of linear regression to predict an outcome variable from a number of predictive factors. It assumed that the predictive factors were measured on an interval scale. However, this article shows how categorical variables can also be included in a linear regression model, enabling predictions to be made separately for different groups and allowing for testing the hypothesis that the outcome differs between groups. The use of interaction terms to measure whether the effect of a particular predictor variable differs between groups is also explained. An alternative approach to testing the difference between groups of the effect of a given predictor, which consists of measuring the effect in each group separately and seeing whether the statistical significance differs between the groups, is shown to be misleading.

  5. Benford's Law and Continuous Dependent Random Variables

    CERN Document Server

    Becker, Thealexa; Miller, Steven J; Ronan, Ryan; Strauch, Frederick W

    2011-01-01

    Many systems exhibit a digit bias. For example, the first digit base 10 of the Fibonacci numbers, or of $2^n$, equals 1 not 10% or 11% of the time, as one would expect if all digits were equally likely, but about 30% of the time. This phenomenon, known as Benford's Law, has many applications, ranging from detecting tax fraud for the IRS to analyzing round-off errors in computer science. The central question is determining which data sets follow Benford's law. Inspired by natural processes such as particle decay, our work examines models for the decomposition of conserved quantities. We prove that in many instances the distribution of lengths of the resulting pieces converges to Benford behavior as the number of divisions grows. The main difficulty is that the resulting random variables are dependent, which we handle by a careful analysis of the dependencies and tools from Fourier analysis to obtain quantified convergence rates.
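
    The digit bias described in this abstract can be checked numerically. The following is a minimal sketch, not taken from the paper: it tabulates the leading digits of $2^n$ for n = 1 to 1000 and compares them with the Benford probabilities log10(1 + 1/d). The function names and the range of n are illustrative assumptions, and the sketch does not reproduce the paper's decomposition models.

```python
import math
from collections import Counter


def first_digit_frequencies(values):
    """Return the relative frequency of each leading digit 1..9."""
    counts = Counter(int(str(v)[0]) for v in values)
    total = sum(counts.values())
    return {d: counts.get(d, 0) / total for d in range(1, 10)}


def benford_probability(d):
    """Benford's Law probability that the leading digit equals d."""
    return math.log10(1 + 1 / d)


# Illustrative choice: the first 1000 powers of two.
powers_of_two = [2 ** n for n in range(1, 1001)]
observed = first_digit_frequencies(powers_of_two)

for d in range(1, 10):
    print(d, round(observed[d], 3), round(benford_probability(d), 3))
```

    The leading digit 1 appears in roughly 30% of these powers, close to log10(2), rather than the 11% expected under equally likely digits.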

  6. Independent, dependent, and other variables in healthcare and chaplaincy research.

    Science.gov (United States)

    Flannelly, Laura T; Flannelly, Kevin J; Jankowski, Katherine R B

    2014-01-01

    This article begins by defining the term variable and the terms independent variable and dependent variable, providing examples of each. It then proceeds to describe and discuss synonyms for the term independent variable, including treatment, intervention, predictor, and risk factor, and synonyms for the term dependent variable, such as response variables and outcomes. The article explains that the terms extraneous, nuisance, and confounding variables refer to any variable that can interfere with the ability to establish relationships between independent variables and dependent variables, and it describes ways to control for such confounds. It further explains that even though intervening, mediating, and moderating variables explicitly alter the relationship between independent variables and dependent variables, they help to explain the causal relationship between them. In addition, the article links terminology about variables with the concept of levels of measurement in research.

  7. Force calibration using errors-in-variables regression and Monte Carlo uncertainty evaluation

    Science.gov (United States)

    Bartel, Thomas; Stoudt, Sara; Possolo, Antonio

    2016-06-01

    An errors-in-variables regression method is presented as an alternative to the ordinary least-squares regression computation currently employed for determining the calibration function for force measuring instruments from data acquired during calibration. A Monte Carlo uncertainty evaluation for the errors-in-variables regression is also presented. The corresponding function (which we call measurement function, often called analysis function in gas metrology) necessary for the subsequent use of the calibrated device to measure force, and the associated uncertainty evaluation, are also derived from the calibration results. Comparisons are made, using real force calibration data, between the results from the errors-in-variables and ordinary least-squares analyses, as well as between the Monte Carlo uncertainty assessment and the conventional uncertainty propagation employed at the National Institute of Standards and Technology (NIST). The results show that the errors-in-variables analysis properly accounts for the uncertainty in the applied calibrated forces, and that the Monte Carlo method, owing to its intrinsic ability to model uncertainty contributions accurately, yields a better representation of the calibration uncertainty throughout the transducer’s force range than the methods currently in use. These improvements notwithstanding, the differences between the results produced by the current and by the proposed new methods generally are small because the relative uncertainties of the inputs are small and most contemporary load cells respond approximately linearly to such inputs. For this reason, there will be no compelling need to revise any of the force calibration reports previously issued by NIST.
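
    To illustrate how an errors-in-variables fit differs from ordinary least squares, the sketch below uses Deming regression, one simple errors-in-variables estimator. This is a swapped-in technique chosen for brevity and is not necessarily the method used in the paper; the function names, the error-variance ratio `delta`, and the toy numbers are all assumptions made for illustration.

```python
import math


def _moments(x, y):
    """Sample means, variances and covariance of paired data."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x) / (n - 1)
    syy = sum((yi - my) ** 2 for yi in y) / (n - 1)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / (n - 1)
    return mx, my, sxx, syy, sxy


def ols_fit(x, y):
    """Ordinary least squares: assumes x is measured without error."""
    mx, my, sxx, _, sxy = _moments(x, y)
    slope = sxy / sxx
    return slope, my - slope * mx


def deming_fit(x, y, delta=1.0):
    """Deming regression.

    delta is the assumed ratio var(error in y) / var(error in x).
    """
    mx, my, sxx, syy, sxy = _moments(x, y)
    if sxy == 0:
        raise ValueError("x and y are uncorrelated; slope is undefined")
    diff = syy - delta * sxx
    slope = (diff + math.sqrt(diff ** 2 + 4 * delta * sxy ** 2)) / (2 * sxy)
    return slope, my - slope * mx


# Synthetic toy readings (not real calibration data).
applied = [1.0, 2.1, 2.9, 4.2, 5.0]
response = [3.1, 4.9, 7.2, 8.8, 11.1]

print("OLS   :", ols_fit(applied, response))
print("Deming:", deming_fit(applied, response, delta=1.0))
```

    When the relative uncertainties of the inputs are small, the two fits differ only slightly, which is consistent with the abstract's remark that differences between the current and proposed methods are generally small.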

  8. EFFICIENT ESTIMATION OF FUNCTIONAL-COEFFICIENT REGRESSION MODELS WITH DIFFERENT SMOOTHING VARIABLES

    Institute of Scientific and Technical Information of China (English)

    Zhang Riquan; Li Guoying

    2008-01-01

    In this article, a procedure is defined for estimating the coefficient functions of functional-coefficient regression models with different smoothing variables in different coefficient functions. In the first step, the initial estimates of the coefficient functions are obtained by the local linear technique and the averaged method. In the second step, based on the initial estimates, the efficient estimates of the coefficient functions are proposed by a one-step back-fitting procedure. The efficient estimators share the same asymptotic normalities as the local linear estimators for the functional-coefficient models with a single smoothing variable in different functions. Two simulated examples show that the procedure is effective.

  9. Polychotomization of continuous variables in regression models based on the overall C index

    Directory of Open Access Journals (Sweden)

    Bax Leon

    2006-12-01

    Full Text Available Background: When developing multivariable regression models for diagnosis or prognosis, continuous independent variables can be categorized to make a prediction table instead of a prediction formula. Although many methods have been proposed to dichotomize prognostic variables, to date there has been no integrated method for polychotomization. The latter is necessary when dichotomization results in too much loss of information or when central values refer to normal states and more dispersed values refer to less preferable states, a situation that is not unusual in medical settings (e.g. body temperature, blood pressure). The goal of our study was to develop a theoretical and practical method for polychotomization. Methods: We used the overall discrimination index C, introduced by Harrell, as a measure of the predictive ability of an independent regressor variable and derived a method for polychotomization mathematically. Since the naïve application of our method, like some existing methods, gives rise to positive bias, we developed a parametric method that minimizes this bias and assessed its performance by the use of Monte Carlo simulation. Results: The overall C is closely related to the area under the ROC curve and the produced di(poly)chotomized variable's predictive performance is comparable to the original continuous variable. The simulation shows that the parametric method is essentially unbiased for both the estimates of performance and the cutoff points. Application of our method to the predictor variables of a previous study on rhabdomyolysis shows that it can be used to make probability profile tables that are applicable to the diagnosis or prognosis of individual patient status. Conclusion: We propose a polychotomization (including dichotomization) method for independent continuous variables in regression models based on the overall discrimination index C and clarified its meaning mathematically. To avoid positive bias in

  10. Application of Robust Regression and Bootstrap in Productivity Analysis of GERD Variable in EU27

    Directory of Open Access Journals (Sweden)

    Dagmar Blatná

    2014-06-01

    Full Text Available The GERD is one of the Europe 2020 headline indicators being tracked within the Europe 2020 strategy. The headline indicator is the 3% target for the GERD to be reached within the EU by 2020. Eurostat defines “GERD” as total gross domestic expenditure on research and experimental development as a percentage of GDP. GERD depends on numerous factors of a general economic background, namely employment, innovation and research, and science and technology. The values of these indicators vary among the European countries, and consequently the occurrence of outliers can be anticipated in corresponding analyses. In such a case, a classical statistical approach – the least squares method – can be highly unreliable, whereas robust regression methods represent an acceptable and useful tool. The aim of the present paper is to demonstrate the advantages of robust regression and applicability of the bootstrap approach in regression based on both classical and robust methods.
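
    The contrast between least squares and a robust fit in the presence of an outlier, together with a bootstrap interval, can be sketched as follows. The abstract does not say which robust estimator the paper uses, so the Theil-Sen slope (median of pairwise slopes) is an assumed stand-in; the data are synthetic, and all function names and parameters are hypothetical.

```python
import random
import statistics


def ols_slope(x, y):
    """Classical least-squares slope."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


def theil_sen_slope(x, y):
    """Robust slope: median of slopes over all pairs with distinct x."""
    slopes = [
        (y[j] - y[i]) / (x[j] - x[i])
        for i in range(len(x))
        for j in range(i + 1, len(x))
        if x[j] != x[i]
    ]
    if not slopes:
        raise ValueError("need at least two distinct x values")
    return statistics.median(slopes)


def bootstrap_interval(x, y, estimator, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap interval obtained by resampling (x, y) cases."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    while len(estimates) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        if len(set(xs)) < 2:  # slope undefined for this resample; redraw
            continue
        ys = [y[i] for i in idx]
        estimates.append(estimator(xs, ys))
    estimates.sort()
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Synthetic data: y = 2x + 1 with a single gross outlier in the last case.
x = list(range(10))
y = [2 * xi + 1 for xi in x]
y[-1] = 100.0

print("OLS slope      :", ols_slope(x, y))
print("Theil-Sen slope:", theil_sen_slope(x, y))
print("Bootstrap 95% interval (Theil-Sen):",
      bootstrap_interval(x, y, theil_sen_slope))
```

    One outlier is enough to pull the least-squares slope far from 2, while the median-based slope is unaffected, which is the kind of behavior the abstract attributes to robust methods.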

  11. Modified Regression Correlation Coefficient for Poisson Regression Model

    Science.gov (United States)

    Kaengthong, Nattacha; Domthong, Uthumporn

    2017-09-01

    This study examines indicators of predictive power for the Generalized Linear Model (GLM), which are widely used but often have some restrictions. We are interested in the regression correlation coefficient for a Poisson regression model. This is a measure of predictive power, and is defined by the relationship between the dependent variable (Y) and the expected value of the dependent variable given the independent variables [E(Y|X)] for the Poisson regression model. The dependent variable is distributed as Poisson. The purpose of this research was to modify the regression correlation coefficient for the Poisson regression model. We also compare the proposed modified regression correlation coefficient with the traditional regression correlation coefficient in the case of two or more independent variables, and in the presence of multicollinearity among the independent variables. The result shows that the proposed regression correlation coefficient is better than the traditional regression correlation coefficient based on Bias and the Root Mean Square Error (RMSE).
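
    A minimal sketch of the traditional regression correlation coefficient follows, taken here to be the Pearson correlation between the observed counts Y and the fitted means E(Y|X); that reading of the definition is an assumption based on the abstract. The single-predictor Poisson model is fitted by Newton-Raphson on synthetic counts, and the paper's modified coefficient is not reproduced. All names are illustrative.

```python
import math


def fit_poisson(x, y, n_iter=50):
    """Newton-Raphson fit of log E(Y|x) = b0 + b1 * x."""
    b0 = math.log(sum(y) / len(y))  # start from the intercept-only fit
    b1 = 0.0
    for _ in range(n_iter):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector (gradient of the log-likelihood).
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        # Fisher information matrix entries.
        h00 = sum(mu)
        h01 = sum(xi * mi for xi, mi in zip(x, mu))
        h11 = sum(xi * xi * mi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    ma = sum(a) / len(a)
    mb = sum(b) / len(b)
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    va = sum((ai - ma) ** 2 for ai in a)
    vb = sum((bi - mb) ** 2 for bi in b)
    return cov / math.sqrt(va * vb)


# Synthetic toy counts (not data from the study).
x = [0, 1, 2, 3, 4, 5]
y = [1, 1, 2, 4, 5, 9]

b0, b1 = fit_poisson(x, y)
fitted = [math.exp(b0 + b1 * xi) for xi in x]
r = pearson(y, fitted)
print("b0, b1:", b0, b1)
print("regression correlation coefficient:", r)
```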

  12. Using the classical linear regression model in analysis of the dependences of conveyor belt life

    Directory of Open Access Journals (Sweden)

    Miriam Andrejiová

    2013-12-01

    Full Text Available The paper deals with the classical linear regression model of the dependence of conveyor belt life on some selected parameters: thickness of paint layer, width and length of the belt, conveyor speed and quantity of transported material. The first part of the article is about regression model design, point and interval estimation of parameters, verification of statistical significance of the model, and about the parameters of the proposed regression model. The second part of the article deals with identification of influential and extreme values that can have an impact on estimation of regression model parameters. The third part focuses on assumptions of the classical regression model, i.e. on verification of independence assumptions, normality and homoscedasticity of residuals.

  13. A Principal Component Regression Approach for Estimating Ventricular Repolarization Duration Variability

    Directory of Open Access Journals (Sweden)

    Pasi A. Karjalainen

    2007-01-01

    Full Text Available Ventricular repolarization duration (VRD) is affected by heart rate and autonomic control, and thus VRD varies in time in a similar way as heart rate. VRD variability is commonly assessed by determining the time differences between successive R- and T-waves, that is, RT intervals. Traditional methods for RT interval detection necessitate the detection of either T-wave apexes or offsets. In this paper, we propose a principal-component-regression (PCR)-based method for estimating RT variability. The main benefit of the method is that it does not necessitate T-wave detection. The proposed method is compared with traditional RT interval measures, and as a result, it is observed to estimate RT variability accurately and to be less sensitive to noise than the traditional methods. As a specific application, the method is applied to exercise electrocardiogram (ECG) recordings.

  14. Regression by L1 regularization of smart contrasts and sums (ROSCAS) beats PLS and elastic net in latent variable model

    NARCIS (Netherlands)

    Braak, ter C.J.F.

    2009-01-01

    This paper proposes a regression method, ROSCAS, which regularizes smart contrasts and sums of regression coefficients by an L1 penalty. The contrasts and sums are based on the sample correlation matrix of the predictors and are suggested by a latent variable regression model. The contrasts express

  15. Difference mapping method using least square support vector regression for variable-fidelity metamodelling

    Science.gov (United States)

    Zheng, Jun; Shao, Xinyu; Gao, Liang; Jiang, Ping; Qiu, Haobo

    2015-06-01

    Engineering design, especially for complex engineering systems, is usually a time-consuming process involving computation-intensive computer-based simulation and analysis methods. A difference mapping method using least square support vector regression is developed in this work, as a special metamodelling methodology that includes variable-fidelity data, to replace the computationally expensive computer codes. A general difference mapping framework is proposed where a surrogate base is first created, then the approximation is gained by mapping the difference between the base and the real high-fidelity response surface. The least square support vector regression is adopted to accomplish the mapping. Two different sampling strategies, nested and non-nested design of experiments, are conducted to explore their respective effects on modelling accuracy. Different sample sizes and three approximation performance measures of accuracy are considered.

  16. Variable selection methods in PLS regression - a comparison study on metabolomics data

    DEFF Research Database (Denmark)

    Karaman, İbrahim; Hedemann, Mette Skou; Knudsen, Knud Erik Bach

    Partial least squares regression (PLSR) has been applied to various fields such as psychometrics, consumer science, econometrics and process control. Recently it has been applied to metabolomics based data sets (GC/LC-MS, NMR) and proven to be very powerful in situations with many variables...... for the purpose of reducing over-fitting problems and providing useful interpretation tools. It has excellent possibilities for giving a graphical overview of sample and variation patterns. It can handle co-linearity in an efficient way and make it possible to use different highly correlated data sets in one...... Integrating Omics data. Statistical Applications in Genetics and Molecular Biology, 7:Article 35, 2008. 2. Martens H and Martens M. Modified Jack-knife estimation of parameter uncertainty in bilinear modelling by partial least squares regression (PLSR). Food Quality and Preference, 11:5-16, 2000....

  17. A genetic algorithm for variable selection in logistic regression analysis of radiotherapy treatment outcomes.

    Science.gov (United States)

    Gayou, Olivier; Das, Shiva K; Zhou, Su-Min; Marks, Lawrence B; Parda, David S; Miften, Moyed

    2008-12-01

    A given outcome of radiotherapy treatment can be modeled by analyzing its correlation with a combination of dosimetric, physiological, biological, and clinical factors, through a logistic regression fit of a large patient population. The quality of the fit is measured by the combination of the predictive power of this particular set of factors and the statistical significance of the individual factors in the model. We developed a genetic algorithm (GA), in which a small sample of all the possible combinations of variables is fitted to the patient data. New models are derived from the best models, through crossover and mutation operations, and are in turn fitted. The process is repeated until the sample converges to the combination of factors that best predicts the outcome. The GA was tested on a data set that investigated the incidence of lung injury in NSCLC patients treated with 3DCRT. The GA identified a model with two variables as the best predictor of radiation pneumonitis: the V30 (p=0.048) and the ongoing use of tobacco at the time of referral (p=0.074). This two-variable model was confirmed as the best model by analyzing all possible combinations of factors. In conclusion, genetic algorithms provide a reliable and fast way to select significant factors in logistic regression analysis of large clinical studies.

  18. Variable selection in multiple linear regression: The influence of individual cases

    Directory of Open Access Journals (Sweden)

    SJ Steel

    2007-12-01

    Full Text Available The influence of individual cases in a data set is studied when variable selection is applied in multiple linear regression. Two different influence measures, based on the C_p criterion and Akaike's information criterion, are introduced. The relative change in the selection criterion when an individual case is omitted is proposed as the selection influence of the specific omitted case. Four standard examples from the literature are considered and the selection influence of the cases is calculated. It is argued that the selection procedure may be improved by taking the selection influence of individual data cases into account.

  19. Selecting minimum dataset soil variables using PLSR as a regressive multivariate method

    Science.gov (United States)

    Stellacci, Anna Maria; Armenise, Elena; Castellini, Mirko; Rossi, Roberta; Vitti, Carolina; Leogrande, Rita; De Benedetto, Daniela; Ferrara, Rossana M.; Vivaldi, Gaetano A.

    2017-04-01

    Long-term field experiments and science-based tools that characterize soil status (namely the soil quality indices, SQIs) assume a strategic role in assessing the effect of agronomic techniques and thus in improving soil management, especially in marginal environments. Selecting key soil variables able to best represent soil status is a critical step for the calculation of SQIs. Current studies show the effectiveness of statistical methods for variable selection to extract relevant information deriving from multivariate datasets. Principal component analysis (PCA) has been mainly used; however, supervised multivariate methods and regressive techniques are progressively being evaluated (Armenise et al., 2013; de Paul Obade et al., 2016; Pulido Moncada et al., 2014). The present study explores the effectiveness of partial least square regression (PLSR) in selecting critical soil variables, using a dataset comparing conventional tillage and sod-seeding on durum wheat. The results were compared to those obtained using PCA and stepwise discriminant analysis (SDA). The soil data were derived from a long-term field experiment in Southern Italy. On samples collected in April 2015, the following set of variables was quantified: (i) chemical: total organic carbon and nitrogen (TOC and TN), alkali-extractable C (TEC and humic substances - HA-FA), water extractable N and organic C (WEN and WEOC), Olsen extractable P, exchangeable cations, pH and EC; (ii) physical: texture, dry bulk density (BD), macroporosity (Pmac), air capacity (AC), and relative field capacity (RFC); (iii) biological: carbon of the microbial biomass quantified with the fumigation-extraction method. PCA and SDA were previously applied to the multivariate dataset (Stellacci et al., 2016). PLSR was carried out on mean centered and variance scaled data of predictors (soil variables) and response (wheat yield) variables using the PLS procedure of SAS/STAT. In addition, variable importance for projection (VIP

  20. Methodological and Epistemological Issues on Linear Regression Applied to Psychometric Variables in Problem Solving: Rethinking Variance

    Science.gov (United States)

    Stamovlasis, Dimitrios

    2010-01-01

    The aim of the present paper is two-fold. First, it attempts to support previous findings on the role of some psychometric variables, such as M-capacity, the degree of field dependence-independence, logical thinking and the mobility-fixity dimension, on students' achievement in chemistry problem solving. Second, the paper aims to raise some…

  1. Cardinality-dependent Variability in Orthogonal Variability Models

    DEFF Research Database (Denmark)

    Mærsk-Møller, Hans Martin; Jørgensen, Bo Nørregaard

    2012-01-01

    During our work on developing and running a software product line for eco-sustainable greenhouse-production software tools, which currently have three products members we have identified a need for extending the notation of the Orthogonal Variability Model (OVM) to support what we refer to as car...

  2. Using latent variables in logistic regression to reduce multicollinearity, A case-control example: breast cancer risk factors

    Directory of Open Access Journals (Sweden)

    Mohamad Amin Pourhoseingholi

    2008-03-01

    Full Text Available

    Background: Logistic regression is one of the most widely used models to analyze the relation between one or more explanatory variables and a categorical response in the field of epidemiology, health and medicine. When there is strong correlation among explanatory variables, i.e. multicollinearity, the efficiency of the model is reduced considerably. The objective of this research was to employ latent variables to reduce the effect of multicollinearity in the analysis of a case-control study about breast cancer risk factors.

    Methods: The data belonged to a case-control study in which 300 women with breast cancer were compared to the same number of controls. To assess the effect of multicollinearity, five highly correlated quantitative variables were selected. Ordinary logistic regression with collinear data was compared to two models containing latent variables generated using either factor analysis or principal components analysis. Estimated standard errors of parameters were selected to compare the efficiency of models. We also conducted a simulation study in order to compare the efficiency of models with and without latent factors. All analyses were carried out using S-plus.

    Results: Logistic regression based on five primary variables showed unusual odds ratios for age at first pregnancy (OR=67960, 95%CI: 10184-453503) and for total length of breast feeding (OR=0). On the other hand, the parameters estimated for logistic regression on latent variables generated by both factor analysis and principal components analysis were statistically significant (P<0.003). Their standard errors were smaller than those of ordinary logistic regression on the original variables. The simulation showed that in the case of normal error and 58% reliability the logistic regression based on latent variables is more efficient than the model based on collinear variables.

    Conclusions: This research

  3. A Comparative Investigation of Confidence Intervals for Independent Variables in Linear Regression.

    Science.gov (United States)

    Dudgeon, Paul

    2016-01-01

    In linear regression, the most appropriate standardized effect size for individual independent variables having an arbitrary metric remains open to debate, despite researchers typically reporting a standardized regression coefficient. Alternative standardized measures include the semipartial correlation, the improvement in the squared multiple correlation, and the squared partial correlation. No arguments based on either theoretical or statistical grounds for preferring one of these standardized measures have been mounted in the literature. Using a Monte Carlo simulation, the performance of interval estimators for these effect-size measures was compared in a 5-way factorial design. Formal statistical design methods assessed both the accuracy and robustness of the four interval estimators. The coverage probability of a large-sample confidence interval for the semipartial correlation coefficient derived from Aloe and Becker was highly accurate and robust in 98% of instances. It was better in small samples than the Yuan-Chan large-sample confidence interval for a standardized regression coefficient. It was also consistently better than both a bootstrap confidence interval for the improvement in the squared multiple correlation and a noncentral interval for the squared partial correlation.

  4. NetRaVE: constructing dependency networks using sparse linear regression

    DEFF Research Database (Denmark)

    Phatak, A.; Kiiveri, H.; Clemmensen, Line Katrine Harder;

    2010-01-01

    NetRaVE is a small suite of R functions for generating dependency networks using sparse regression methods. Such networks provide an alternative to interpreting 'top n lists' of genes arising out of an analysis of microarray data, and they provide a means of organizing and visualizing the resulting...

  5. Quantile Regression Methods

    DEFF Research Database (Denmark)

    Fitzenberger, Bernd; Wilke, Ralf Andreas

    2015-01-01

    Quantile regression is emerging as a popular statistical approach, which complements the estimation of conditional mean models. While the latter only focuses on one aspect of the conditional distribution of the dependent variable, the mean, quantile regression provides more detailed insights by m...... treatment of the topic is based on the perspective of applied researchers using quantile regression in their empirical work....
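
    The difference between modelling the mean and modelling a quantile comes down to the loss function. As a minimal sketch (not taken from the chapter), the check loss below is minimized over a constant, which recovers the corresponding sample quantile; a full quantile regression would minimize the same loss over regression coefficients. The function names and the toy values are illustrative assumptions.

```python
import statistics


def check_loss(residual, tau):
    """Quantile (check) loss: tau * u for u >= 0, (tau - 1) * u for u < 0."""
    return residual * (tau - (1 if residual < 0 else 0))


def best_constant(values, tau):
    """Sample value that minimizes the total check loss for quantile tau."""
    def total_loss(c):
        return sum(check_loss(v - c, tau) for v in values)
    return min(sorted(set(values)), key=total_loss)


# Synthetic toy values.
data = [3, 1, 4, 1, 5, 9, 2, 6, 5]

print("tau = 0.5 minimizer:", best_constant(data, 0.5))
print("sample median      :", statistics.median(data))
print("tau = 0.9 minimizer:", best_constant(data, 0.9))
```

    With tau = 0.5 the minimizer coincides with the sample median, and larger values of tau move the minimizer toward the upper tail of the data.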

  6. Spatial Variability of Plant Available Water, Soil Organic Carbon, and Microbial Biomass under Divergent Land Uses: A Comparison among Regression-Kriging, Cokriging, and Regression-Cokriging

    Science.gov (United States)

    Kiani, M.; Hernandez Ramirez, G.; Quideau, S.

    2016-12-01

    Improved knowledge about the spatial variability of plant available water (PAW), soil organic carbon (SOC), and microbial biomass carbon (MBC) as affected by land-use systems can underpin the identification and inventory of beneficial ecosystem goods and services in both agricultural and wild lands. Little research has been done that addresses the spatial patterns of PAW, SOC, and MBC under different land use types at a field scale. Therefore, we collected 56 soil samples (5-10 cm depth increment), using a nested cyclic sampling design within both a native grassland (NG) site and an irrigated cultivated (IC) site located near Brooks, Alberta. Using classical statistical and geostatistical methods, we characterized the spatial heterogeneities of PAW, SOC, and MBC under NG and IC using several geostatistical methods such as ordinary kriging (OK), regression-kriging (RK), cokriging (COK), and regression-cokriging (RCOK). Converting the native grassland to irrigated cultivated land altered soil pore distribution by reducing macroporosity, which led to lower saturated water content and half the hydraulic conductivity in IC compared to NG. This conversion also decreased the relative abundance of gram-negative bacteria, while increasing both the proportion of gram-positive bacteria and MBC concentration. At both studied sites, the best fitted spatial model was Gaussian based on lower RSS and higher R2 as criteria. The IC had a stronger degree of spatial dependence and a longer range of spatial auto-correlation, revealing a homogenization of the spatial variability of soil properties as a result of intensive, recurrent agricultural activities. Comparison of OK, RK, COK, and RCOK approaches indicated that the cokriging method had the best performance, demonstrating a profound improvement in the accuracy of spatial estimations of PAW, SOC, and MBC. It seems that the combination of terrain covariates such as elevation and depth-to-water with kriging techniques offers more capability for

  7. Logistic regression.

    Science.gov (United States)

    Nick, Todd G; Campbell, Kathleen M

    2007-01-01

    The Medical Subject Headings (MeSH) thesaurus used by the National Library of Medicine defines logistic regression models as "statistical models which describe the relationship between a qualitative dependent variable (that is, one which can take only certain discrete values, such as the presence or absence of a disease) and an independent variable." Logistic regression models are used to study effects of predictor variables on categorical outcomes and normally the outcome is binary, such as presence or absence of disease (e.g., non-Hodgkin's lymphoma), in which case the model is called a binary logistic model. When there are multiple predictors (e.g., risk factors and treatments) the model is referred to as a multiple or multivariable logistic regression model and is one of the most frequently used statistical models in medical journals. In this chapter, we examine both simple and multiple binary logistic regression models and present related issues, including interaction, categorical predictor variables, continuous predictor variables, and goodness of fit.
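    As an illustration of the multiple binary logistic model described above, the following minimal sketch fits one by Newton-Raphson (iteratively reweighted least squares) using NumPy only. It is not taken from the chapter: the simulated data, the predictor names `exposure` and `age`, the coefficient values, and the small ridge term are all assumptions made for the example.

```python
import numpy as np

def fit_logistic(X, y, n_iter=25, ridge=1e-8):
    """Fit a binary logistic regression by Newton-Raphson (IRLS).

    Returns the coefficient vector with the intercept first.
    The tiny ridge term only guards against a singular Hessian.
    """
    X = np.column_stack([np.ones(len(X)), X])      # prepend intercept column
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))        # predicted probabilities
        w = p * (1.0 - p)                          # IRLS weights
        grad = X.T @ (y - p)                       # score vector
        hess = X.T @ (X * w[:, None]) + ridge * np.eye(X.shape[1])
        beta = beta + np.linalg.solve(hess, grad)  # Newton step
    return beta

# Hypothetical simulated data: one binary and one continuous predictor.
rng = np.random.default_rng(0)
n = 2000
exposure = rng.integers(0, 2, size=n)              # assumed binary risk factor
age = rng.normal(0.0, 1.0, size=n)                 # assumed standardized covariate
true_beta = np.array([-1.0, 0.8, 0.5])             # assumed intercept and slopes
logit = true_beta[0] + true_beta[1] * exposure + true_beta[2] * age
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(float)

beta_hat = fit_logistic(np.column_stack([exposure, age]), y)
odds_ratios = np.exp(beta_hat[1:])                 # exponentiated slopes
```

    Exponentiating a slope gives the odds ratio for a one-unit change in that predictor with the other held fixed, which is the quantity usually reported in medical journals.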

  8. Modeling Source Water TOC Using Hydroclimate Variables and Local Polynomial Regression.

    Science.gov (United States)

    Samson, Carleigh C; Rajagopalan, Balaji; Summers, R Scott

    2016-04-19

    To control disinfection byproduct (DBP) formation in drinking water, an understanding of the source water total organic carbon (TOC) concentration variability can be critical. Previously, TOC concentrations in water treatment plant source waters have been modeled using streamflow data. However, the lack of streamflow data or unimpaired flow scenarios makes it difficult to model TOC. In addition, TOC variability under climate change further exacerbates the problem. Here we propose a modeling approach based on local polynomial regression that uses climate variables (e.g., temperature) and land surface variables (e.g., soil moisture) as predictors of TOC concentration, obviating the need for streamflow. The local polynomial approach has the ability to capture non-Gaussian and nonlinear features that might be present in the relationships. The utility of the methodology is demonstrated using source water quality and climate data in three case study locations with surface source waters including river and reservoir sources. The models show good predictive skill in general at these locations, with lower skills at locations with the most anthropogenic influences in their streams. Source water TOC predictive models can provide water treatment utilities important information for making treatment decisions for DBP regulation compliance under future climate scenarios.
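    The idea of local polynomial regression can be sketched in a few lines. The example below is a generic degree-1 (local linear) estimator with a Gaussian kernel, not the authors' model: the single predictor, the bandwidth values, and the synthetic response are assumptions chosen only to show how a locally weighted fit follows a nonlinear relationship.

```python
import numpy as np

def local_linear(x_train, y_train, x_query, bandwidth):
    """Local linear (degree-1 local polynomial) estimate at each query point."""
    x_train = np.asarray(x_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    estimates = []
    for x0 in np.atleast_1d(x_query):
        u = (x_train - x0) / bandwidth
        w = np.exp(-0.5 * u**2)                    # Gaussian kernel weights
        # Design matrix centred at x0, so the intercept is the fit at x0.
        A = np.column_stack([np.ones_like(x_train), x_train - x0])
        sw = np.sqrt(w)
        coef, *_ = np.linalg.lstsq(A * sw[:, None], y_train * sw, rcond=None)
        estimates.append(coef[0])
    return np.array(estimates)

# Hypothetical predictor scaled to [0, 1] (e.g., a climate index).
x = np.linspace(0.0, 1.0, 200)

# A linear relationship is reproduced exactly by a local linear fit.
linear_fit = local_linear(x, 2.0 + 3.0 * x, [0.25, 0.5], bandwidth=0.2)

# A nonlinear relationship is tracked locally with a small bandwidth.
curve_fit = local_linear(x, np.sin(2.0 * np.pi * x), [0.25], bandwidth=0.05)
```

    The bandwidth controls the bias-variance trade-off: a larger value smooths more, a smaller value follows local structure more closely. In practice it would be chosen by a criterion such as cross-validation.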

  9. Variable selection for large p small n regression models with incomplete data: Mapping QTL with epistases

    Directory of Open Access Journals (Sweden)

    Wells Martin T

    2008-05-01

    Background: Identifying quantitative trait loci (QTL) for both additive and epistatic effects raises the statistical issue of selecting variables from a large number of candidates using a small number of observations. Missing trait and/or marker values prevent one from directly applying the classical model selection criteria such as Akaike's information criterion (AIC) and Bayesian information criterion (BIC). Results: We propose a two-step Bayesian variable selection method which deals with the sparse parameter space and the small sample size issues. The regression coefficient priors are flexible enough to incorporate the characteristic of "large p small n" data. Specifically, sparseness and possible asymmetry of the significant coefficients are dealt with by developing a Gibbs sampling algorithm to stochastically search through low-dimensional subspaces for significant variables. The superior performance of the approach is demonstrated via simulation study. We also applied it to real QTL mapping datasets. Conclusion: The two-step procedure coupled with Bayesian classification offers flexibility in modeling "large p small n" data, especially for the sparse and asymmetric parameter space. This approach can be extended to other settings characterized by high dimension and low sample size.

  10. Maximal Inequalities for Dependent Random Variables

    DEFF Research Database (Denmark)

    Hoffmann-Jorgensen, Jorgen

    2016-01-01

    Maximal inequalities play a crucial role in many probabilistic limit theorems; for instance, the law of large numbers, the law of the iterated logarithm, the martingale limit theorem and the central limit theorem. Let X_1, X_2, ... be random variables with partial sums S_k = X_1 + ... + X_k. Then a...

  11. Variability dependencies in product family engineering

    NARCIS (Netherlands)

    Jaring, M; Bosch, J; VanDerLinden, F

    2004-01-01

    In a product family context, software architects anticipate product diversification and design architectures that support variants in both space (multiple contexts) and time (changing contexts). Product diversification is based on the concept of variability: a single architecture and a set of compon

  12. Multiple linear regression models of urban runoff pollutant load and event mean concentration considering rainfall variables.

    Science.gov (United States)

    Maniquiz, Marla C; Lee, Soyoung; Kim, Lee-Hyung

    2010-01-01

    Rainfall is an important factor in estimating the event mean concentration (EMC), which is used to quantify the washed-off pollutant concentrations from non-point sources (NPSs). Pollutant loads can also be calculated using rainfall, catchment area and runoff coefficient. In this study, runoff quantity and quality data gathered from 28 months of monitoring at road and parking lot sites in Korea were evaluated using multiple linear regression (MLR) to develop equations for estimating pollutant loads and EMCs as a function of rainfall variables. The results revealed that total event rainfall and average rainfall intensity are possible predictors of pollutant loads. Overall, the models reflect the high uncertainties of NPSs; accurate estimates of EMCs and loads may require water quality sampling, or long-term monitoring to gather more data for the development of estimation models.

  13. Predicting agility performance with other performance variables in pubescent boys: a multiple-regression approach.

    Science.gov (United States)

    Sekulic, Damir; Spasic, Miodrag; Esco, Michael R

    2014-04-01

    The goal was to investigate the influence of balance, jumping power, reactive-strength, speed, and morphological variables on five different agility performances in early pubescent boys (N = 71). The predictors included body height and mass, countermovement and broad jumps, overall stability index, 5 m sprint, and bilateral side jumps test of reactive strength. Forward stepwise regressions calculated on 36 randomly selected participants explained 47% of the variance in performance of the forward-backward running test, 50% of the 180 degrees turn test, 55% of the 20 yd. shuttle test, 62% of the T-shaped course test, and 44% of the zig-zag test, with the bilateral side jumps as the single best predictor. Regression models were cross-validated using the second half of the sample (n = 35). Correlation between predicted and achieved scores did not provide statistically significant validation statistics for the continuous-movement zig-zag test. Further study is needed to assess other predictors of agility in early pubescent boys.
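    The workflow in this abstract, forward stepwise selection on one random half of the sample followed by validation on the other half, can be sketched as follows. This is an illustrative sketch on simulated data, not the study's analysis: the entry rule (a minimum gain in R² of 0.01), the four unnamed predictors, and the coefficients are assumptions; only the sample sizes (71 split into 36 and 35) follow the abstract.

```python
import numpy as np

def ols_fit(X, y):
    """Ordinary least squares with an intercept; returns coefficients."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict(X, coef):
    return np.column_stack([np.ones(len(X)), X]) @ coef

def r_squared(X, y, coef):
    resid = y - predict(X, coef)
    return 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)

def forward_stepwise(X, y, min_gain=0.01):
    """Add the predictor giving the largest R^2 gain until the gain is small."""
    selected, best_r2 = [], 0.0
    remaining = list(range(X.shape[1]))
    while remaining:
        scores = []
        for j in remaining:
            cols = selected + [j]
            scores.append((r_squared(X[:, cols], y, ols_fit(X[:, cols], y)), j))
        r2, j = max(scores)
        if r2 - best_r2 < min_gain:   # assumed stopping rule
            break
        selected.append(j)
        remaining.remove(j)
        best_r2 = r2
    return selected

# Hypothetical data: 71 participants, 4 predictors, column 2 dominant.
rng = np.random.default_rng(1)
X = rng.normal(size=(71, 4))
y = 1.5 * X[:, 2] + 0.5 * X[:, 0] + rng.normal(scale=0.5, size=71)

idx = rng.permutation(71)
train, test = idx[:36], idx[36:]      # 36 for model building, 35 for validation

selected = forward_stepwise(X[train], y[train])
coef = ols_fit(X[train][:, selected], y[train])
validation_r = np.corrcoef(predict(X[test][:, selected], coef), y[test])[0, 1]
```

    The correlation between predicted and observed scores in the held-out half plays the role of the cross-validation statistic mentioned in the abstract.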

  14. Problems of correlations between explanatory variables in multiple regression analyses in the dental literature.

    Science.gov (United States)

    Tu, Y-K; Kellett, M; Clerehugh, V; Gilthorpe, M S

    2005-10-01

    Multivariable analysis is a widely used statistical methodology for investigating associations amongst clinical variables. However, the problems of collinearity and multicollinearity, which can give rise to spurious results, have in the past frequently been disregarded in dental research. This article illustrates and explains the problems which may be encountered, in the hope of increasing awareness and understanding of these issues, thereby improving the quality of the statistical analyses undertaken in dental research. Three examples from different clinical dental specialties are used to demonstrate how to diagnose the problem of collinearity/multicollinearity in multiple regression analyses and to illustrate how collinearity/multicollinearity can seriously distort the model development process. Lack of awareness of these problems can give rise to misleading results and erroneous interpretations. Multivariable analysis is a useful tool for dental research, though only if its users thoroughly understand the assumptions and limitations of these methods. It would benefit evidence-based dentistry enormously if researchers were more aware of both the complexities involved in multiple regression when using these methods and of the need for expert statistical consultation in developing study design and selecting appropriate statistical methodologies.
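    One common way to diagnose the collinearity problem discussed here is the variance inflation factor (VIF), obtained by regressing each explanatory variable on all the others. The sketch below is a generic illustration and is not taken from the article: the three simulated variables and the rule of thumb that a VIF above 10 signals trouble are assumptions.

```python
import numpy as np

def variance_inflation_factors(X):
    """VIF_j = 1 / (1 - R_j^2), with R_j^2 from regressing column j on the rest."""
    X = np.asarray(X, dtype=float)
    vifs = []
    for j in range(X.shape[1]):
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(len(X)), others])
        coef, *_ = np.linalg.lstsq(A, X[:, j], rcond=None)
        resid = X[:, j] - A @ coef
        r2 = 1.0 - resid @ resid / np.sum((X[:, j] - X[:, j].mean()) ** 2)
        vifs.append(1.0 / (1.0 - r2))
    return np.array(vifs)

# Hypothetical clinical variables: the second is nearly a copy of the first.
rng = np.random.default_rng(2)
n = 200
measure_a = rng.normal(size=n)
measure_b = measure_a + rng.normal(scale=0.1, size=n)   # almost collinear with a
measure_c = rng.normal(size=n)                          # unrelated variable

vifs = variance_inflation_factors(np.column_stack([measure_a, measure_b, measure_c]))
```

    Here the first two VIFs come out very large while the third stays near 1, which is the pattern that would prompt dropping or combining the redundant variables before interpreting regression coefficients.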

  15. ORPHA, ORPHIC FUNCTIONS, AND THE ORPHIC ANALYST: WINNICOTT'S "REGRESSION TO DEPENDENCE" IN THE LANGUAGE OF FERENCZI.

    Science.gov (United States)

    Gurevich, Hayuta

    2016-12-01

    Early developmental trauma is imprinted in the psyche by survival fragmentation and dissociation. Traumatized patients need the analyst to be actively involved and allow for regression to dependence in order to strengthen, create and construct their psychic functioning and structure so that environmental failures will be contained and not rupture continuity of being. I suggest that Ferenczi's and Winnicott's ideas about regression to dependence in analysis are fundamental contributions to these quests, and that Ferenczi set the foundation, which Winnicott further explored and developed. I would like to focus on their clinical theory of treating early developmental trauma of the psyche, describing it in the less known language of Ferenczi, reviving his concept of Orpha and its functions. The complementarities of the two approaches can enrich and broaden our understanding of the clinical complications that arise in the analysis of such states.

  16. Regression tree modeling of forest NPP using site conditions and climate variables across eastern USA

    Science.gov (United States)

    Kwon, Y.

    2013-12-01

    As evidence of global warming continues to increase, being able to predict forest response to climate changes, such as the expected rise of temperature and precipitation, will be vital for maintaining the sustainability and productivity of forests. Mapping forest species redistribution under climate change scenarios has been successful; however, most species redistribution maps lack the mechanistic understanding to explain why trees grow under the novel conditions of a changing climate. A distributional map is only capable of predicting under the equilibrium assumption that the communities would exist following a prolonged period under the new climate. In this context, forest NPP as a surrogate for growth rate, the most important facet that determines stand dynamics, can lead to valid prediction of the transition stage to a new vegetation-climate equilibrium, as it represents changes in the structure of forest reflecting site conditions and climate factors. The objective of this study is to develop a forest growth map using regression tree analysis by extracting large-scale non-linear structures from both field-based FIA and remotely sensed MODIS data sets. The major issue addressed in this approach is non-linear spatial patterns of forest attributes. Forest inventory data showed complex spatial patterns that reflect environmental states and processes that originate at different spatial scales. At broad scales, non-linear spatial trends in forest attributes and a mixture of continuous and discrete types of environmental variables make traditional statistical (multivariate regression) and geostatistical (kriging) models inefficient. This calls into question some traditional underlying assumptions of spatial trends that are uncritically accepted in forest data. To solve the controversy surrounding the suitability of forest data, regression tree analyses are performed using the software See5 and Cubist. Four publicly available data sets were obtained: First, field-based Forest Inventory and Analysis (USDA

  17. Identifying of risks in pricing using a regression model of demand on price dependence

    Directory of Open Access Journals (Sweden)

    O.I. Yashkina

    2016-09-01

    The aim of the article. The main purpose of the article is to describe scientific and methodological approaches to determining the price elasticity of demand as a regression model based on price, and to assessing the risk of price variations using the resulting model. The results of the analysis. The study is based on the assumption that the index of price elasticity of demand for high-tech innovation is not constant, as it is commonly understood in the classical sense. At the stage of market release and subsequent sales growth, the index of price elasticity of demand may vary within certain limits. The index value, and therefore the market response, is closely related to the current price. Achieving the stated purpose of the article is possible given factual information about prices and corresponding sales volumes of new high-tech products over a short period of time, on the basis of which the types of relationship between demand and price are modeled. Risk assessment of pricing and profit optimization by the regression of demand on price consists of three stages: (a) obtaining a regression model of demand on price; (b) obtaining the function of price elasticity of demand and assessing the pricing risk depending on the behavior of that function; (c) determining the price at which the company receives maximum operating profit, based on the specific model of the demand-price function. To obtain the regression model of the dependence of demand on price, it is recommended to use specific reference models. The article includes linear, hyperbolic and parabolic models. The regression dependence of price elasticity of demand on price for each of the reference models of demand is obtained on the basis of the concept of function elasticity in mathematical analysis. The concept of «function of price elasticity of demand» expresses this dependence. For the resulting functions of price elasticity of demand, the article provides intervals with the highest and lowest

  18. BOOTSTRAP WAVELET IN THE NONPARAMETRIC REGRESSION MODEL WITH WEAKLY DEPENDENT PROCESSES

    Institute of Scientific and Technical Information of China (English)

    林路; 张润楚

    2004-01-01

    This paper introduces a method of bootstrap wavelet estimation in a nonparametric regression model with weakly dependent processes for both fixed and random designs. The asymptotic bounds for the bias and variance of the bootstrap wavelet estimators are given in the fixed design model. The conditional normality for a modified version of the bootstrap wavelet estimators is obtained in the fixed model. The consistency for the bootstrap wavelet estimator is also proved in the random design model. These results show that the bootstrap wavelet method is valid for the model with weakly dependent processes.

  19. Bayesian Network Models for Local Dependence among Observable Outcome Variables

    Science.gov (United States)

    Almond, Russell G.; Mulder, Joris; Hemat, Lisa A.; Yan, Duanli

    2009-01-01

    Bayesian network models offer a large degree of flexibility for modeling dependence among observables (item outcome variables) from the same task, which may be dependent. This article explores four design patterns for modeling locally dependent observations: (a) no context--ignores dependence among observables; (b) compensatory context--introduces…

  20. Validity of a Residualized Dependent Variable after Pretest Covariance Adjustments: Still the Same Variable?

    Science.gov (United States)

    Nimon, Kim; Henson, Robin K.

    2015-01-01

    The authors empirically examined whether the validity of a residualized dependent variable after covariance adjustment is comparable to that of the original variable of interest. When variance of a dependent variable is removed as a result of one or more covariates, the residual variance may not reflect the same meaning. Using the pretest-posttest…

  1. Spatiotemporally restricted arenavirus replication induces immune surveillance and type I interferon-dependent tumour regression

    Science.gov (United States)

    Kalkavan, Halime; Sharma, Piyush; Kasper, Stefan; Helfrich, Iris; Pandyra, Aleksandra A.; Gassa, Asmae; Virchow, Isabel; Flatz, Lukas; Brandenburg, Tim; Namineni, Sukumar; Heikenwalder, Mathias; Höchst, Bastian; Knolle, Percy A.; Wollmann, Guido; von Laer, Dorothee; Drexler, Ingo; Rathbun, Jessica; Cannon, Paula M.; Scheu, Stefanie; Bauer, Jens; Chauhan, Jagat; Häussinger, Dieter; Willimsky, Gerald; Löhning, Max; Schadendorf, Dirk; Brandau, Sven; Schuler, Martin; Lang, Philipp A.; Lang, Karl S.

    2017-01-01

    Immune-mediated effector molecules can limit cancer growth, but lack of sustained immune activation in the tumour microenvironment restricts antitumour immunity. New therapeutic approaches that induce a strong and prolonged immune activation would represent a major immunotherapeutic advance. Here we show that the arenaviruses lymphocytic choriomeningitis virus (LCMV) and the clinically used Junin virus vaccine (Candid#1) preferentially replicate in tumour cells in a variety of murine and human cancer models. Viral replication leads to prolonged local immune activation, rapid regression of localized and metastatic cancers, and long-term disease control. Mechanistically, LCMV induces antitumour immunity, which depends on the recruitment of interferon-producing Ly6C+ monocytes and additionally enhances tumour-specific CD8+ T cells. In comparison with other clinically evaluated oncolytic viruses and to PD-1 blockade, LCMV treatment shows promising antitumoural benefits. In conclusion, therapeutically administered arenavirus replicates in cancer cells and induces tumour regression by enhancing local immune responses. PMID:28248314

  2. Asymptotics for partly linear regression with dependent samples and ARCH errors: consistency with rates

    Institute of Scientific and Technical Information of China (English)

    2001-01-01

    The partly linear regression model is useful in practice, but little has been done in the literature to adapt it to real data that are dependent and conditionally heteroscedastic. In this paper, the estimators of the regression components are constructed via local polynomial fitting and the large sample properties are explored. Under certain mild regularities, conditions are obtained to ensure that the estimators of the nonparametric component and its derivatives are consistent up to the convergence rates which are optimal in the i.i.d. case, and the estimator of the parametric component is root-n consistent with the same rate as for the parametric model. The technique adopted in the proof differs from that used in the reference by Hamilton and Truong under i.i.d. samples, and corrects the errors there.

  3. Variable Selection for Functional Logistic Regression in fMRI Data Analysis

    Directory of Open Access Journals (Sweden)

    Nedret BILLOR

    2015-03-01

    This study was motivated by a classification problem in Functional Magnetic Resonance Imaging (fMRI), a noninvasive imaging technique which allows an experimenter to take images of a subject's brain over time. As fMRI studies usually have a small number of subjects and we assume that there is a smooth, underlying curve describing the observations in fMRI data, this results in incredibly high-dimensional datasets that are functional in nature. High dimensionality is one of the biggest problems in statistical analysis of fMRI data. There is also a need for the development of better classification methods. One of the best things about the fMRI technique is its noninvasiveness. If statistical classification methods are improved, it could aid the advancement of noninvasive diagnostic techniques for mental illness or even degenerative diseases such as Alzheimer's. In this paper, we develop a variable selection technique, which tackles high dimensionality and correlation problems in fMRI data, based on L1 regularization-group lasso for the functional logistic regression model where the response is binary and represents two separate classes; the predictors are functional. We assess our method with a simulation study and an application to a real fMRI dataset.

  4. Peak flow regression equations For small, ungaged streams in Maine: Comparing map-based to field-based variables

    Science.gov (United States)

    Lombard, Pamela J.; Hodgkins, Glenn A.

    2015-01-01

    Regression equations to estimate peak streamflows with 1- to 500-year recurrence intervals (annual exceedance probabilities from 99 to 0.2 percent, respectively) were developed for small, ungaged streams in Maine. Equations presented here are the best available equations for estimating peak flows at ungaged basins in Maine with drainage areas from 0.3 to 12 square miles (mi²). Previously developed equations continue to be the best available equations for estimating peak flows for basin areas greater than 12 mi². New equations presented here are based on streamflow records at 40 U.S. Geological Survey streamgages with a minimum of 10 years of recorded peak flows between 1963 and 2012. Ordinary least-squares regression techniques were used to determine the best explanatory variables for the regression equations. Traditional map-based explanatory variables were compared to variables requiring field measurements. Two field-based variables, culvert rust lines and bankfull channel widths, either were not commonly found or did not explain enough of the variability in the peak flows to warrant inclusion in the equations. The best explanatory variables were drainage area and percent basin wetlands; values for these variables were determined with a geographic information system. Generalized least-squares regression was used with these two variables to determine the equation coefficients and estimates of accuracy for the final equations.

  5. Future-dependent Flow Policies with Prophetic Variables

    DEFF Research Database (Denmark)

    Li, Ximeng; Nielson, Flemming; Nielson, Hanne Riis

    2016-01-01

    Content-dependency often plays an important role in the information flow security of real world IT systems. Content dependency gives rise to informative policies and permissive static enforcement, and sometimes avoids the need for downgrading. We develop a static type system to soundly enforce future-dependent flow policies: policies that can depend on not only the current values of variables, but also their final values. The final values are referred to using what we call prophetic variables, just as the initial values can be referenced using logical variables in Hoare logic. We develop and enforce a notion of future-dependent security for open systems, in the spirit of "non-deducibility on strategies". We also illustrate our approach in scenarios where future-dependency has advantages over present-dependency and avoids mixtures of upgradings and downgradings.

  6. Multi-modal multi-task learning for joint prediction of multiple regression and classification variables in Alzheimer's disease.

    Science.gov (United States)

    Zhang, Daoqiang; Shen, Dinggang

    2012-01-16

    Many machine learning and pattern classification methods have been applied to the diagnosis of Alzheimer's disease (AD) and its prodromal stage, i.e., mild cognitive impairment (MCI). Recently, rather than predicting categorical variables as in classification, several pattern regression methods have also been used to estimate continuous clinical variables from brain images. However, most existing regression methods focus on estimating multiple clinical variables separately and thus cannot utilize the intrinsic useful correlation information among different clinical variables. On the other hand, in those regression methods, only a single modality of data (usually only the structural MRI) is often used, without considering the complementary information that can be provided by different modalities. In this paper, we propose a general methodology, namely multi-modal multi-task (M3T) learning, to jointly predict multiple variables from multi-modal data. Here, the variables include not only the clinical variables used for regression but also the categorical variable used for classification, with different tasks corresponding to prediction of different variables. Specifically, our method contains two key components, i.e., (1) a multi-task feature selection which selects the common subset of relevant features for multiple variables from each modality, and (2) a multi-modal support vector machine which fuses the above-selected features from all modalities to predict multiple (regression and classification) variables. To validate our method, we perform two sets of experiments on ADNI baseline MRI, FDG-PET, and cerebrospinal fluid (CSF) data from 45 AD patients, 91 MCI patients, and 50 healthy controls (HC). In the first set of experiments, we estimate two clinical variables such as Mini Mental State Examination (MMSE) and Alzheimer's Disease Assessment Scale-Cognitive Subscale (ADAS-Cog), as well as one categorical variable (with value of 'AD', 'MCI' or 'HC'), from the

  7. Variables that influence HIV-1 cerebrospinal fluid viral load in cryptococcal meningitis: a linear regression analysis

    Directory of Open Access Journals (Sweden)

    Cecchini Diego M

    2009-11-01

    Background: The central nervous system is considered a sanctuary site for HIV-1 replication. Variables associated with HIV cerebrospinal fluid (CSF) viral load in the context of opportunistic CNS infections are poorly understood. Our objective was to evaluate the relation between: (1) CSF HIV-1 viral load and CSF cytological and biochemical characteristics (leukocyte count, protein concentration, cryptococcal antigen titer); (2) CSF HIV-1 viral load and HIV-1 plasma viral load; and (3) CSF leukocyte count and the peripheral blood CD4+ T lymphocyte count. Methods: Our approach was to use a prospective collection and analysis of pre-treatment, paired CSF and plasma samples from antiretroviral-naive HIV-positive patients with cryptococcal meningitis who were attended at the Francisco J Muñiz Hospital, Buenos Aires, Argentina (period: 2004 to 2006). We measured HIV CSF and plasma levels by polymerase chain reaction using the Cobas Amplicor HIV-1 Monitor Test version 1.5 (Roche). Data were processed with Statistix 7.0 software (linear regression analysis). Results: Samples from 34 patients were analyzed. CSF leukocyte count showed statistically significant correlation with CSF HIV-1 viral load (r = 0.4, 95% CI = 0.13-0.63, p = 0.01). No correlation was found with the plasma viral load, CSF protein concentration and cryptococcal antigen titer. A positive correlation was found between peripheral blood CD4+ T lymphocyte count and the CSF leukocyte count (r = 0.44, 95% CI = 0.125-0.674, p = 0.0123). Conclusion: Our study suggests that CSF leukocyte count influences CSF HIV-1 viral load in patients with meningitis caused by Cryptococcus neoformans.

  8. Structural Response Analysis under Dependent Variables Based on Probability Boxes

    National Research Council Canada - National Science Library

    Xiao, Z; Yang, G

    2015-01-01

      This paper considers structural response analysis when structural uncertainty parameters distribution cannot be specified precisely due to lack of information and there are complex dependencies in the variables...

  9. Fixed transaction costs and modelling limited dependent variables

    NARCIS (Netherlands)

    Hempenius, A.L.

    1994-01-01

    As an alternative to the Tobit model, for vectors of limited dependent variables, I suggest a model, which follows from explicitly using fixed costs, if appropriate of course, in the utility function of the decision-maker.

  10. Moment-based estimation of smooth transition regression models with endogenous variables

    NARCIS (Netherlands)

    W.D. Areosa (Waldyr Dutra); M.J. McAleer (Michael); M.C. Medeiros (Marcelo)

    2008-01-01

    Nonlinear regression models have been widely used in practice for a variety of time series and cross-section datasets. For purposes of analyzing univariate and multivariate time series data, in particular, Smooth Transition Regression (STR) models have been shown to be very useful for re

  11. High dimensional linear regression models under long memory dependence and measurement error

    Science.gov (United States)

    Kaul, Abhishek

    This dissertation consists of three chapters. The first chapter introduces the models under consideration and motivates problems of interest. A brief literature review is also provided in this chapter. The second chapter investigates the properties of Lasso under long range dependent model errors. Lasso is a computationally efficient approach to model selection and estimation, and its properties are well studied when the regression errors are independent and identically distributed. We study the case where the regression errors form a long memory moving average process. We establish a finite sample oracle inequality for the Lasso solution. We then show the asymptotic sign consistency in this setup. These results are established in the high dimensional setup (p > n) where p can be increasing exponentially with n. Finally, we show the consistency and n^(1/2-d)-consistency of Lasso, along with the oracle property of adaptive Lasso, in the case where p is fixed. Here d is the memory parameter of the stationary error sequence. The performance of Lasso is also analysed in the present setup with a simulation study. The third chapter proposes and investigates the properties of a penalized quantile based estimator for measurement error models. Standard formulations of prediction problems in high dimension regression models assume the availability of fully observed covariates and sub-Gaussian and homogeneous model errors. This makes these methods inapplicable to measurement error models where covariates are unobservable and observations are possibly non sub-Gaussian and heterogeneous. We propose weighted penalized corrected quantile estimators for the regression parameter vector in linear regression models with additive measurement errors, where unobservable covariates are nonrandom. The proposed estimators forgo the need for the above mentioned model assumptions. We study these estimators in both the fixed dimension and high dimensional sparse setups, in the latter setup, the

  12. On Direction of Dependence in Latent Variable Contexts

    Science.gov (United States)

    von Eye, Alexander; Wiedermann, Wolfgang

    2014-01-01

    Approaches to determining direction of dependence in nonexperimental data are based on the relation between higher-than second-order moments on one side and correlation and regression models on the other. These approaches have experienced rapid development and are being applied in contexts such as research on partner violence, attention deficit…

  13. Discovery of Fourier-dependent time lags in cataclysmic variables

    NARCIS (Netherlands)

    Scaringi, S.; Körding, E.; Groot, P.J.; Uttley, P.; Marsh, T.; Knigge, C.; Maccarone, T.; Dhillon, V.S.

    2013-01-01

    We report the first study of Fourier-frequency-dependent coherence and phase/time lags at optical wavelengths of cataclysmic variables (MV Lyr and LU Cam) displaying typical flickering variability in white light. Observations were performed on the William Herschel Telescope using ULTRACAM. Light

  14. Performance of PLS regression coefficients in selecting variables for each response of a multivariate PLS for omics-type data

    Directory of Open Access Journals (Sweden)

    Giuseppe Palermo

    2009-05-01

    Full Text Available Giuseppe Palermo (1), Paolo Piraino (2), Hans-Dieter Zucht (3). (1) Digilab BioVision GmbH, Hannover, Germany; (2) Dr Paolo Piraino Statistical Consulting, Rende (CS), Italy; (3) Proteome Sciences R&D GmbH and C. KG, Frankfurt am Main, Germany. Abstract: Multivariate partial least squares (PLS) regression allows the modeling of complex biological events by considering different factors at the same time. It is unaffected by data collinearity, representing a valuable method for modeling high-dimensional biological data (as derived from genomics, proteomics and peptidomics). In the presence of multiple responses, it is of particular interest how to appropriately “dissect” the model to reveal the importance of single attributes with regard to individual responses (for example, variable selection). In this paper, the performance of multivariate PLS regression coefficients in selecting relevant predictors for different responses in omics-type data was investigated by means of a receiver operating characteristic (ROC) analysis. For this purpose, simulated data, mimicking the covariance structures of microarray and liquid chromatography mass spectrometric data, were used to generate matrices of predictors and responses. The relevant predictors were set a priori. The influences of noise, the source of data with different covariance structure and the size of relevant predictors were investigated. Results demonstrate the applicability of PLS regression coefficients in selecting variables for each response of a multivariate PLS in omics-type data. Comparisons with other feature selection methods, such as variable importance in the projection scores, principal component regression, and least absolute shrinkage and selection operator regression, were also provided. Keywords: partial least square regression, regression coefficients, variable selection, biomarker discovery, omics-data

  15. Analysis of extreme drinking in patients with alcohol dependence using Pareto regression.

    Science.gov (United States)

    Das, Sourish; Harel, Ofer; Dey, Dipak K; Covault, Jonathan; Kranzler, Henry R

    2010-05-20

    We developed a novel Pareto regression model with an unknown shape parameter to analyze extreme drinking in patients with Alcohol Dependence (AD). We used the generalized linear model (GLM) framework and the log-link to include the covariate information through the scale parameter of the generalized Pareto distribution. We proposed a Bayesian method based on Ridge prior and Zellner's g-prior for the regression coefficients. A simulation study indicated that the proposed Bayesian method performs better than the existing likelihood-based inference for the Pareto regression. We examined two issues of importance in the study of AD: first, whether a single nucleotide polymorphism within the GABRA2 gene, which encodes a subunit of the GABA(A) receptor and has been associated with AD, influences 'extreme' alcohol intake; and second, the efficacy of three psychotherapies for alcoholism in treating extreme drinking behavior. We found an association between extreme drinking behavior and GABRA2. We also found that, at baseline, men with a high-risk GABRA2 allele had a significantly higher probability of extreme drinking than men with no high-risk allele. However, men with a high-risk allele responded to the therapy better than those with two copies of the low-risk allele. Women with high-risk alleles also responded to the therapy better than those with two copies of the low-risk allele, while women who received the cognitive behavioral therapy had better outcomes than those receiving either of the other two therapies. Among men, motivational enhancement therapy was the best for the treatment of the extreme drinking behavior.

  16. Improving Regression Testing through Modified Ant Colony Algorithm on a Dependency Injected Test Pattern

    Directory of Open Access Journals (Sweden)

    G.Keerthi Lakshmi

    2012-03-01

    Full Text Available Performing regression testing in a pre-production environment is often viewed by software practitioners as a daunting task, since the test execution often overruns the stipulated downtime or the test coverage is non-linear. Choosing the exact test cases to match this type of complexity requires not only prior knowledge of the system but also the right use of calculations to set the goals correctly. On systems that are just entering the production environment after being promoted from the staging phase, trade-offs between time and test coverage are often needed to ensure that the maximum number of test cases is covered within the stipulated time. There is therefore a need to refine the test cases to achieve the maximum test coverage within the stipulated period, because the most important test cases often do not qualify for the sanity test suite, and any bugs that have crept into them go undetected until they are found first-hand by the actual user. Hence, this paper attempts to lay out a testing framework that improves the regression suite by adopting a modified version of the Ant Colony Algorithm and dynamically injecting dependencies along the best route found by the ant colony.

  17. Auxiliary variables in multiple imputation in regression with missing X: a warning against including too many in small sample research

    Science.gov (United States)

    2012-01-01

    Background: Multiple imputation is becoming increasingly popular. Theoretical considerations as well as simulation studies have shown that the inclusion of auxiliary variables is generally of benefit. Methods: A simulation study of a linear regression with a response Y and two predictors X1 and X2 was performed on data with n = 50, 100 and 200 using complete cases or multiple imputation with 0, 10, 20, 40 and 80 auxiliary variables. Mechanisms of missingness were either 100% MCAR or 50% MAR + 50% MCAR. Auxiliary variables had low (r = .10) vs. moderate correlations (r = .50) with X’s and Y. Results: The inclusion of auxiliary variables can improve a multiple imputation model. However, inclusion of too many variables leads to downward bias of regression coefficients and decreases precision. When the correlations are low, inclusion of auxiliary variables is not useful. Conclusion: More research on auxiliary variables in multiple imputation should be performed. A preliminary rule of thumb could be that the ratio of variables to cases with complete data should not go below 1 : 3. PMID:23216665

  18. Strong Convergence of Partitioning Estimation for Nonparametric Regression Function under Dependence Samples

    Institute of Scientific and Technical Information of China (English)

    LINGNeng-xiang; DUXue-qiao

    2005-01-01

    In this paper, we study the strong consistency of the partitioning estimate of a regression function when the samples are φ-mixing sequences with identical distribution. Key words: nonparametric regression function; partitioning estimation; strong convergence; φ-mixing sequences.
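
    As a rough illustration of what a partitioning estimate is, the sketch below splits the real line into cells of equal width and estimates the regression function at a point by averaging the responses whose covariates fall in the same cell. The function name `partition_estimate`, the equal-width cells, the convention of returning 0 for an empty cell and the toy data are all assumptions made for this example; they are not taken from the paper, which concerns the convergence theory under φ-mixing rather than any particular implementation.

```python
import math
from collections import defaultdict


def partition_estimate(xs, ys, width):
    """Return a function m(x) that averages the y-values in the cell containing x.

    Cells are the intervals [j * width, (j + 1) * width) for integer j
    (an illustrative choice of partition).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for x, y in zip(xs, ys):
        cell = math.floor(x / width)
        sums[cell] += y
        counts[cell] += 1

    def m(x):
        cell = math.floor(x / width)
        if counts.get(cell, 0) == 0:
            return 0.0  # convention used here for an empty cell
        return sums[cell] / counts[cell]

    return m


# Hypothetical toy sample: two observations in each of the first two cells.
estimate = partition_estimate([0.1, 0.2, 0.6, 0.7], [1.0, 3.0, 10.0, 20.0], width=0.5)
print(estimate(0.3), estimate(0.65), estimate(1.2))
```

    The estimator has the same form whether the sample is independent or dependent; the dependence structure only affects how fast, and in what sense, it converges.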

  19. FIRE: an SPSS program for variable selection in multiple linear regression analysis via the relative importance of predictors.

    Science.gov (United States)

    Lorenzo-Seva, Urbano; Ferrando, Pere J

    2011-03-01

    We provide an SPSS program that implements currently recommended techniques and recent developments for selecting variables in multiple linear regression analysis via the relative importance of predictors. The approach consists of: (1) optimally splitting the data for cross-validation, (2) selecting the final set of predictors to be retained in the regression equation, and (3) assessing the behavior of the chosen model using standard indices and procedures. The SPSS syntax, a short manual, and data files related to this article are available as supplemental materials from brm.psychonomic-journals.org/content/supplemental.

  20. Feasibility demonstration of a variable frequency driver-microwave transient regression rate measurement system. [for solid propellant combustion response

    Science.gov (United States)

    Strand, L. D.; Mcnamara, R. P.

    1976-01-01

    The feasibility of a system capable of rapidly and directly measuring the low-frequency (motor characteristics length bulk mode) combustion response characteristics of solid propellants has been investigated. The system consists of a variable frequency oscillatory driver device coupled with an improved version of the JPL microwave propellant regression rate measurement system. The ratio of the normalized regression rate and pressure amplitudes and their relative phase are measured as a function of varying pressure level and frequency. Test results with a well-characterized PBAN-AP propellant formulation were found to compare favorably with the results of more conventional stability measurement techniques.

  1. Weighted linear regression using D2H and D2 as the independent variables

    Science.gov (United States)

    Hans T. Schreuder; Michael S. Williams

    1998-01-01

    Several error structures for weighted regression equations used for predicting volume were examined for 2 large data sets of felled and standing loblolly pine trees (Pinus taeda L.). The generally accepted model with variance of error proportional to the value of the covariate squared ( D2H = diameter squared times height or D...

  2. Comparison of Sparse and Jack-knife partial least squares regression methods for variable selection

    DEFF Research Database (Denmark)

    Karaman, Ibrahim; Qannari, El Mostafa; Martens, Harald

    2013-01-01

    The objective of this study was to compare two different techniques of variable selection, Sparse PLSR and Jack-knife PLSR, with respect to their predictive ability and their ability to identify relevant variables. Sparse PLSR is a method that is frequently used in genomics, whereas Jack-knife PL...

  3. An Extensive Study on the Disturbances Generated by Collinearity in a Linear Regression Model with Three Explanatory Variables

    Directory of Open Access Journals (Sweden)

    FLORIN MARIUS PAVELESCU

    2010-12-01

    Full Text Available In econometric models, linear regressions with three explanatory variables are widely used. Examples include the Cobb-Douglas production function with three inputs (capital, labour and disembodied technical change), the Kmenta function used for approximation of CES production function parameters, error-correction models, etc. In multiple linear regressions, estimated parameter values and some statistical tests are influenced by collinearity between explanatory variables. In fact, collinearity acts as a noise which distorts the signal (the proper parameter values). This influence is emphasized by the values of the coefficients of alignment to collinearity hazard. These coefficients have some similarities with the signal-to-noise ratio. Consequently, they may be used when the type of collinearity is determined. For these reasons, the main purpose of this paper is to identify all the modeling factors and quantify their impact on the above-mentioned indicator values in the context of linear regression with three explanatory variables. JEL classification: C13, C20, C51, C52. Keywords: types of collinearity, coefficient of mediated correlation, rank of explanatory variable, order of attractor of collinearity, mediated collinearity, anticollinearity.

  4. Statistical Dependence of Pipe Breaks on Explanatory Variables

    Directory of Open Access Journals (Sweden)

    Patricia Gómez-Martínez

    2017-02-01

    Full Text Available Aging infrastructure is the main challenge currently faced by water suppliers. Estimation of asset lifetime requires reliable criteria to plan asset repair and renewal strategies. To do so, pipe break prediction is one of the most important inputs. This paper analyzes the statistical dependence of pipe breaks on explanatory variables, determining their optimal combination and quantifying their influence on failure prediction accuracy. A large set of registered data from the Madrid water supply network, managed by Canal de Isabel II, has been filtered, classified and studied. Several statistical Bayesian models have been built and validated from the available information with a technique that combines reference periods of time as well as geographical location. Statistical models of increasing complexity are built from zero up to five explanatory variables following two approaches: a set of independent variables or a combination of two joint variables plus an additional number of independent variables. With the aim of finding the variable combination that provides the most accurate prediction, models are compared following an objective validation procedure based on the model skill to predict the number of pipe breaks in a large set of geographical locations. As expected, model performance improves as the number of explanatory variables increases. However, the rate of improvement is not constant. Performance metrics improve significantly up to three variables, but the trend flattens for higher order models, especially in trunk mains, where performance is reduced. Slight differences are found between trunk mains and distribution lines when selecting the most influential variables and models.

  5. Sampling designs dependent on sample parameters of auxiliary variables

    CERN Document Server

    Wywiał, Janusz L

    2015-01-01

    The book offers a valuable resource for students and statisticians whose work involves survey sampling. Estimation of population parameters in finite and fixed populations assisted by auxiliary variables is considered. New sampling designs dependent on moments or quantiles of auxiliary variables are presented against the background of the classical methods. Accuracies of the estimators based on the original sampling designs are compared with classical estimation procedures. Specific conditional sampling designs are applied to problems of small area estimation as well as to estimation of quantiles of variables under study.

  6. Family system dynamics and type 1 diabetic glycemic variability: a vector-auto-regressive model.

    Science.gov (United States)

    Günther, Moritz Philipp; Winker, Peter; Böttcher, Claudia; Brosig, Burkhard

    2013-06-01

    Statistical approaches rooted in econometric methodology, so far foreign to the psychiatric and psychological realms have provided exciting and substantial new insights into complex mind-body interactions over time and individuals. Over 120 days, this structured diary study explored the mutual interactions of emotions within a classic 3-person family system with its Type 1 diabetic adolescent's daily blood glucose variability. Glycemic variability was measured through daily standard deviations of blood glucose determinations (at least 3 per day). Emotions were captured individually utilizing the self-assessment manikin on affective valence (negative-positive), activation (calm-excited), and control (dominated-dominant). Auto- and cross-correlating the stationary absolute (level) values of the mutually interacting parallel time series data sets through vector autoregression (VAR, grounded in econometric theory) allowed for the formulation of 2 concordant models. Applying Cholesky Impulse Response Analysis at a 95% confidence interval, we provided evidence for an adolescent being happy, calm, and in control to exhibit less glycemic variability and hence diabetic derailment. A nondominating mother and a happy father seemed to also reduce glycemic variability. Random shocks increasing glycemic variability affected only the adolescent and her father: In 1 model, the male parent felt in charge; in the other, he calmed down while his daughter turned sad. All reactions to external shocks lasted for less than 4 full days. Extant literature on affect and glycemic variability in Type 1 diabetic adolescents as well as challenges arising from introducing econometric theory to the field were discussed.

  7. Testing Dependent Correlations with Nonoverlapping Variables: A Monte Carlo Simulation

    Science.gov (United States)

    Silver, N. Clayton; Hittner, James B.; May, Kim

    2004-01-01

    The authors conducted a Monte Carlo simulation of 4 test statistics for comparing dependent correlations with no variables in common. Empirical Type 1 error rates and power estimates were determined for K. Pearson and L. N. G. Filon's (1898) z, O. J. Dunn and V. A. Clark's (1969) z, J. H. Steiger's (1980) original modification of Dunn and Clark's…

  8. A Simulation Study on the Variance Estimation of Regression Parameter for Censored Dependent Variable Data under Complex Survey Design

    Institute of Scientific and Technical Information of China (English)

    王晓荣; 王彤

    2011-01-01

    Objective: To investigate variance estimation for the regression coefficients when a regression model is fitted to censored dependent variable data collected under a complex survey design. Methods: Data censored from the left and from the right were simulated under complex sampling. Parametric and semiparametric regression models were fitted with and without accounting for the sampling characteristics, and the standard errors of the regression coefficients obtained in the two cases were compared. Results: With a fixed sample size, the regression coefficients of the censored (tobit) regression model estimated with the complex sampling characteristics taken into account agreed with those obtained under the assumption of simple random sampling, but their standard errors differed. When units within a cluster were highly heterogeneous and the intracluster correlation coefficient was very small, the standard errors under the complex design were lower than those obtained when the sampling characteristics were ignored. Conclusion: For censored data from a complex survey with a complete sampling frame, the sampling characteristics should be taken into account as far as possible during data processing; variance estimates appropriate to the complex design are closer to the actual situation and yield more reliable statistical inference.

  9. TSS concentration in sewers estimated from turbidity measurements by means of linear regression accounting for uncertainties in both variables.

    Science.gov (United States)

    Bertrand-Krajewski, J L

    2004-01-01

    In order to replace traditional sampling and analysis techniques, turbidimeters can be used to estimate TSS concentration in sewers, by means of sensor- and site-specific empirical equations established by linear regression of on-site turbidity values T with TSS concentrations C measured in corresponding samples. As the ordinary least-squares method is not able to account for measurement uncertainties in both T and C variables, an appropriate regression method is used to solve this difficulty and to evaluate correctly the uncertainty in TSS concentrations estimated from measured turbidity. The regression method is described, including detailed calculations of variances and covariance in the regression parameters. An example of application is given for a calibrated turbidimeter used in a combined sewer system, with data collected during three dry weather days. In order to show how the established regression could be used, an independent 24-hour dry weather turbidity data series recorded at 2 min time intervals is used, transformed into estimated TSS concentrations, and compared to TSS concentrations measured in samples. The comparison appears satisfactory and suggests that turbidity measurements could replace traditional samples. Further developments, including wet weather periods and other types of sensors, are suggested.
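
    To make concrete the idea of a straight-line fit that allows for measurement error in both variables, the sketch below uses Deming regression, a standard errors-in-variables technique. It is a stand-in chosen for illustration and is not claimed to be the regression method used in the paper; it also omits the variances and covariance of the fitted parameters that the abstract mentions. The function name `deming_fit`, the error-variance ratio `delta` and the turbidity and TSS values are hypothetical.

```python
import math


def deming_fit(x, y, delta=1.0):
    """Fit y = intercept + slope * x when both x and y are measured with error.

    delta is the assumed ratio var(error in y) / var(error in x);
    delta = 1 gives orthogonal regression. Requires a non-zero
    sample covariance between x and y.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x) / (n - 1)
    syy = sum((yi - mean_y) ** 2 for yi in y) / (n - 1)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (n - 1)

    diff = syy - delta * sxx
    slope = (diff + math.sqrt(diff ** 2 + 4.0 * delta * sxy ** 2)) / (2.0 * sxy)
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical calibration data: turbidity T (x) against TSS concentration C (y).
turbidity = [10.0, 20.0, 30.0, 40.0]
tss = [21.0, 41.0, 61.0, 81.0]
intercept, slope = deming_fit(turbidity, tss)
print(intercept, slope)
```

    In practice `delta` would have to come from knowledge of the sensor and laboratory uncertainties; with noisy data the Deming slope generally differs from the ordinary least-squares slope, which treats T as error-free.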

  10. Multivariate linear regression of high-dimensional fMRI data with multiple target variables.

    Science.gov (United States)

    Valente, Giancarlo; Castellanos, Agustin Lage; Vanacore, Gianluca; Formisano, Elia

    2014-05-01

    Multivariate regression is increasingly used to study the relation between fMRI spatial activation patterns and experimental stimuli or behavioral ratings. With linear models, informative brain locations are identified by mapping the model coefficients. This is a central aspect in neuroimaging, as it provides the sought-after link between the activity of neuronal populations and subject's perception, cognition or behavior. Here, we show that mapping of informative brain locations using multivariate linear regression (MLR) may lead to incorrect conclusions and interpretations. MLR algorithms for high dimensional data are designed to deal with targets (stimuli or behavioral ratings, in fMRI) separately, and the predictive map of a model integrates information deriving from both neural activity patterns and experimental design. Not accounting explicitly for the presence of other targets whose associated activity spatially overlaps with the one of interest may lead to predictive maps of troublesome interpretation. We propose a new model that can correctly identify the spatial patterns associated with a target while achieving good generalization. For each target, the training is based on an augmented dataset, which includes all remaining targets. The estimation on such datasets produces both maps and interaction coefficients, which are then used to generalize. The proposed formulation is independent of the regression algorithm employed. We validate this model on simulated fMRI data and on a publicly available dataset. Results indicate that our method achieves high spatial sensitivity and good generalization and that it helps disentangle specific neural effects from interaction with predictive maps associated with other targets.

  11. Optimal Inference for Instrumental Variables Regression with non-Gaussian Errors

    DEFF Research Database (Denmark)

    Cattaneo, Matias D.; Crump, Richard K.; Jansson, Michael

    This paper is concerned with inference on the coefficient on the endogenous regressor in a linear instrumental variables model with a single endogenous regressor, nonrandom exogenous regressors and instruments, and i.i.d. errors whose distribution is unknown. It is shown that under mild smoothness...

  13. Improving autocorrelation regression for the Hurst parameter estimation of long-range dependent time series based on golden section search

    Science.gov (United States)

    Li, Ming; Zhang, Peidong; Leng, Jianxing

    2016-03-01

    This article presents an improved autocorrelation function (ACF) regression method of estimating the Hurst parameter of a time series with long-range dependence (LRD) by using golden section search (GSS). We shall show that the present method is substantially more efficient than the conventional ACF regression method of H estimation. Our research uses fractional Gaussian noise as a data case but the method introduced is applicable to time series with LRD in general.
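
    One plausible way to combine ACF fitting with golden section search is sketched below: the Hurst parameter H is chosen in (0.5, 1) to minimise the squared distance between a given ACF and the theoretical ACF of fractional Gaussian noise, r(k) = 0.5 * (|k+1|^(2H) - 2|k|^(2H) + |k-1|^(2H)). This is an assumption-laden illustration, not the authors' algorithm: the cost function, the lag range, the tolerance and the names `fgn_acf`, `golden_section_min` and `estimate_hurst` are all choices made for this example.

```python
import math


def fgn_acf(k, hurst):
    """Theoretical autocorrelation of fractional Gaussian noise at integer lag k >= 1."""
    p = 2.0 * hurst
    return 0.5 * ((k + 1) ** p - 2.0 * k ** p + (k - 1) ** p)


def golden_section_min(f, lo, hi, tol=1e-7):
    """Minimise a unimodal function f on [lo, hi] by golden section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:
            # Minimum lies in [a, d]; the old c becomes the new d.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:
            # Minimum lies in [c, b]; the old d becomes the new c.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return (a + b) / 2.0


def estimate_hurst(acf_values):
    """Estimate H from ACF values, where acf_values[i] is the ACF at lag i + 1."""
    def cost(h):
        return sum((r - fgn_acf(k, h)) ** 2
                   for k, r in enumerate(acf_values, start=1))
    return golden_section_min(cost, 0.5, 1.0)


# Sanity check on noise-free input: the ACF of fGn with H = 0.75 at lags 1..20.
target = [fgn_acf(k, 0.75) for k in range(1, 21)]
print(estimate_hurst(target))
```

    With real data, `acf_values` would be the sample ACF of the series, and the estimate would inherit its sampling noise. Golden section search needs only function evaluations and a bracketing interval, which suits a one-dimensional parameter such as H.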

  14. [Understanding logistic regression].

    Science.gov (United States)

    El Sanharawi, M; Naudet, F

    2013-10-01

    Logistic regression is one of the most common multivariate analysis models utilized in epidemiology. It allows the measurement of the association between the occurrence of an event (qualitative dependent variable) and factors likely to influence it (explanatory variables). The choice of explanatory variables that should be included in the logistic regression model is based on prior knowledge of the disease physiopathology and the statistical association between the variable and the event, as measured by the odds ratio. The main steps of the procedure, the conditions of application, and the essential tools for its interpretation are discussed concisely. We also discuss the importance of the choice of variables that must be included and retained in the regression model in order to avoid the omission of important confounding factors. Finally, by way of illustration, we provide an example from the literature, which should help the reader test his or her knowledge.
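
    The link between logistic regression and the odds ratio can be checked on a small example. The sketch below fits a logistic model with one binary explanatory variable by Newton-Raphson and compares exp(coefficient) with the odds ratio computed directly from the 2 x 2 table. The counts and the function names `odds_ratio` and `fit_logistic_binary` are hypothetical, and the hand-written solver is a minimal illustration rather than a substitute for a statistics package.

```python
import math


def odds_ratio(exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases):
    """Odds ratio from the four cells of a 2 x 2 table."""
    return (exposed_cases * unexposed_noncases) / (exposed_noncases * unexposed_cases)


def fit_logistic_binary(groups, iterations=25):
    """Fit logit(p) = b0 + b1 * x by Newton-Raphson on grouped data.

    groups is a list of (x, n, cases) tuples: covariate value,
    group size, and number of events in the group.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, n, cases in groups:
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            resid = cases - n * p          # score contribution
            w = n * p * (1.0 - p)          # information weight
            g0 += resid
            g1 += x * resid
            h00 += w
            h01 += x * w
            h11 += x * x * w
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det  # solve the 2 x 2 Newton system
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Hypothetical study: 30 events among 100 exposed, 10 events among 100 unexposed.
b0, b1 = fit_logistic_binary([(1, 100, 30), (0, 100, 10)])
print(math.exp(b1), odds_ratio(30, 70, 10, 90))
```

    With several explanatory variables, exp(coefficient) is interpreted as an odds ratio adjusted for the other variables in the model, which is why the choice of confounders to include matters.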

  15. The Comparison of Methods Artificial Neural Network with Linear Regression Using Specific Variables for Prediction Stock Price in Tehran Stock Exchange

    CERN Document Server

    Ahangar, Reza Gharoie; Pournaghshband, Hassan

    2010-01-01

    In this paper, the researchers estimated the stock prices of companies active on the Tehran (Iran) Stock Exchange. Linear regression and artificial neural network methods were used and compared. For the artificial neural network, the General Regression Neural Network (GRNN) architecture was used. The researchers first considered 10 macroeconomic variables and 30 financial variables, and then obtained seven final variables, comprising 3 macroeconomic variables and 4 financial variables, for estimating the stock price using Independent Component Analysis (ICA). An equation was presented for each of the two methods and their results were compared, which showed that the artificial neural network method is more efficient than the linear regression method.

  16. Mixed effect regression analysis for a cluster-based two-stage outcome-auxiliary-dependent sampling design with a continuous outcome.

    Science.gov (United States)

    Xu, Wangli; Zhou, Haibo

    2012-09-01

    Two-stage design is a well-known cost-effective way for conducting biomedical studies when the exposure variable is expensive or difficult to measure. Recent research development further allowed one or both stages of the two-stage design to be outcome dependent on a continuous outcome variable. This outcome-dependent sampling feature enables further efficiency gain in parameter estimation and overall cost reduction of the study (e.g. Wang, X. and Zhou, H., 2010. Design and inference for cancer biomarker study with an outcome and auxiliary-dependent subsampling. Biometrics 66, 502-511; Zhou, H., Song, R., Wu, Y. and Qin, J., 2011. Statistical inference for a two-stage outcome-dependent sampling design with a continuous outcome. Biometrics 67, 194-202). In this paper, we develop a semiparametric mixed effect regression model for data from a two-stage design where the second-stage data are sampled with an outcome-auxiliary-dependent sample (OADS) scheme. Our method allows the cluster- or center-effects of the study subjects to be accounted for. We propose an estimated likelihood function to estimate the regression parameters. Simulation study indicates that greater study efficiency gains can be achieved under the proposed two-stage OADS design with center-effects when compared with other alternative sampling schemes. We illustrate the proposed method by analyzing a dataset from the Collaborative Perinatal Project.

  17. The discovery of timescale-dependent color variability of quasars

    Energy Technology Data Exchange (ETDEWEB)

    Sun, Yu-Han; Wang, Jun-Xian; Chen, Xiao-Yang [CAS Key Laboratory for Research in Galaxies and Cosmology, Department of Astronomy, University of Science and Technology of China, Hefei, Anhui 230026 (China); Zheng, Zhen-Ya, E-mail: sunyh92@mail.ustc.edu.cn, E-mail: jxw@ustc.edu.cn [School of Earth and Space Exploration, Arizona State University, Tempe, AZ 85287 (United States)

    2014-09-01

    Quasars are variable on timescales from days to years in UV/optical and generally appear bluer while they brighten. The physics behind the variations in fluxes and colors remains unclear. Using Sloan Digital Sky Survey g- and r-band photometric monitoring data for quasars in Stripe 82, we find that although the flux variation amplitude increases with timescale, the color variability exhibits the opposite behavior. The color variability of quasars is prominent at timescales as short as ∼10 days, but gradually reduces toward timescales up to years. In other words, the variable emission at shorter timescales is bluer than that at longer timescales. This timescale dependence is clearly and consistently detected at all redshifts from z = 0 to 3.5; thus, it cannot be due to contamination to broadband photometry from emission lines that do not respond to fast continuum variations. The discovery directly rules out the possibility that simply attributes the color variability to contamination from a non-variable redder component such as the host galaxy. It cannot be interpreted as changes in global accretion rate either. The thermal accretion disk fluctuation model is favored in the sense that fluctuations in the inner, hotter region of the disk are responsible for short-term variations, while longer-term and stronger variations are expected from the larger and cooler disk region. An interesting implication is that one can use quasar variations at different timescales to probe disk emission at different radii.

  18. The Discovery of Timescale-Dependent Color Variability of Quasars

    CERN Document Server

    Sun, Yu-Han; Chen, Xiao-Yang; Zheng, Zhen-Ya

    2014-01-01

    Quasars are variable on timescales from days to years in UV/optical, and generally appear bluer while they brighten. The physics behind the variations in fluxes and colors remains unclear. Using SDSS g and r band photometric monitoring data of quasars in Stripe 82, we find that although the flux variation amplitude increases with timescale, the color variability exhibits the opposite behavior. The color variability of quasars is prominent at timescales as short as ~10 days, but gradually reduces toward timescales up to years. In other words, the variable emission at shorter timescales is bluer than that at longer timescales. This timescale dependence is clearly and consistently detected at all redshifts from z = 0 to 3.5, and thus cannot be due to contamination of broadband photometry by emission lines which do not respond to fast continuum variations. The discovery directly rules out the possibility that simply attributes the color variability to contamination from a non-variable redder component, such as the h...

  19. Adapting Predictive Models for Cepheid Variable Star Classification Using Linear Regression and Maximum Likelihood

    Science.gov (United States)

    Gupta, Kinjal Dhar; Vilalta, Ricardo; Asadourian, Vicken; Macri, Lucas

    2014-05-01

    We describe an approach to automate the classification of Cepheid variable stars into two subtypes according to their pulsation mode. Automating such classification is relevant to obtain a precise determination of distances to nearby galaxies, which in addition helps reduce the uncertainty in the current expansion of the universe. One main difficulty lies in the compatibility of models trained using different galaxy datasets; a model trained using a training dataset may be ineffectual on a testing set. A solution to such difficulty is to adapt predictive models across domains; this is necessary when the training and testing sets do not follow the same distribution. The gist of our methodology is to train a predictive model on a nearby galaxy (e.g., Large Magellanic Cloud), followed by a model-adaptation step to make the model operable on other nearby galaxies. We follow a parametric approach to density estimation by modeling the training data (anchor galaxy) using a mixture of linear models. We then use maximum likelihood to compute the right amount of variable displacement, until the testing data closely overlaps the training data. At that point, the model can be directly used in the testing data (target galaxy).

  20. THE CONSTRUCTION OF AN ECONOMIC INDICATOR VARIABLES AND FINANCIAL CORPORATE INVESTMENT GRADE: STATISTICAL TREATMENT OF CORRELATIONS AND REGRESSIONS

    Directory of Open Access Journals (Sweden)

    SERGIO CAVAGNOLI GUTH

    2015-09-01

    In a competitive and globalized economic environment, organizations need to evolve to keep up with the changes that the environment imposes on them, seeking sustainability and perpetuity. As the pace of change increases, the durability of business strategies decreases, creating the need for continuous transformation and permanent restructuring. The objective of this study is to analyze the correlations and regression models derived from economic and financial ratios of profitability, liquidity and debt, based on the corporations that held investment grade certification in 2008, issued by the international rating agencies Standard & Poor's, Moody's and Fitch Ratings. The methodology of this study is quantitative, based on statistical analysis of correlation and regression. The study found that the variables examined could be the basis for the construction of an economic and financial indicator of investment grade. Keywords: Investment Grade. Indicator. Corporations.

  1. Heart rate variability regression and risk of sudden unexpected death in epilepsy.

    Science.gov (United States)

    Galli, Alessio; Lombardi, Federico

    2017-02-01

    The exact mechanisms of sudden unexpected death in epilepsy (SUDEP) remain elusive, although there is consensus that SUDEP is associated with severe derangements in the autonomic control of vital functions such as breathing and heart rate regulation. Heart rate variability (HRV) has been advocated as a biomarker of autonomic control of the heart. Cardiac dysautonomia has been found in diseases where other branches of the autonomic nervous system are damaged, such as Parkinson disease and multiple system atrophy. From this perspective, an impaired HRV is not only a risk factor for sudden cardiac death mediated by arrhythmias, but also a potential biomarker for monitoring a progressive decline of the autonomic nervous system. This decline may lead to an acute imbalance of the regulatory pathways of vital functions after a seizure and then to SUDEP.

  2. Understanding data in clinical research: a simple graphical display for plotting data (up to four independent variables) after binary logistic regression analysis.

    Science.gov (United States)

    Mesa, José Luis

    2004-01-01

    In clinical research, suitable visualization techniques of data after statistical analysis are crucial for the researchers' and physicians' understanding. Common statistical techniques to analyze data in clinical research are logistic regression models. Among these, the application of binary logistic regression analysis (LRA) has greatly increased during past years, due to its diagnostic accuracy and because scientists often want to analyze in a dichotomous way whether some event will occur or not. Such an analysis lacks a suitable, understandable, and widely used graphical display, instead providing an understandable logit function based on a linear model for the natural logarithm of the odds in favor of the occurrence of the dependent variable, Y. By simple exponential transformation, such a logit equation can be transformed into a logistic function, resulting in predicted probabilities for the presence of the dependent variable, P(Y=1|X). This model can be used to generate a simple graphical display for binary LRA. For the case of a single predictor or explanatory (independent) variable, X, a plot can be generated with X represented by the abscissa (i.e., horizontal axis) and P(Y=1|X) represented by the ordinate (i.e., vertical axis). For the case of multiple predictor models, I propose here a relief 3D surface graphic in order to plot up to four independent variables (two continuous and two discrete). By using this technique, any researcher or physician would be able to transform a less understandable logit function into a figure easier to grasp, thus leading to a better knowledge and interpretation of data in clinical research. For this, a sophisticated statistical package is not necessary, because the graphical display may be generated by using any 2D or 3D surface plotter.
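
    The logit-to-probability transformation described in this abstract can be sketched as follows for a single predictor. This is only an illustration: the coefficients B0 and B1 and the function name are hypothetical and are not taken from the paper.

```python
import math

def predicted_probability(x, intercept, slope):
    """Convert the linear logit b0 + b1*x into P(Y=1|X=x)."""
    logit = intercept + slope * x          # natural log of the odds
    return 1.0 / (1.0 + math.exp(-logit))  # logistic function

# Hypothetical coefficients, chosen only for illustration.
B0, B1 = -3.0, 0.5

# Points for a 2D plot: X on the abscissa, P(Y=1|X) on the ordinate.
curve = [(x, predicted_probability(x, B0, B1)) for x in range(0, 13, 2)]
for x, p in curve:
    print(f"X = {x:2d}  P(Y=1|X) = {p:.3f}")
```

    The resulting (X, P) pairs can be passed to any 2D plotter; a second continuous predictor would turn the curve into a surface.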

  3. Displacement of estimates of chemical equilibrium constants at breaking of determinancy of independent variables of equilibrium system regression models

    Energy Technology Data Exchange (ETDEWEB)

    Nikolaeva, L.S.; Prikhod'ko, N.V.; Evseev, A.M.; Rozen, A.M.; Kolychev, A.E.; Gontar, B.G. (Moskovskij Gosudarstvennyj Univ. (USSR). Khimicheskij Fakul'tet)

    1982-07-01

    Using as an example regression models of the extraction systems HNO3-TBP-H2O and UO2(NO3)2-TBP-H2O, it has been shown that disregarding errors in the controlled (independent) variables (a 3% measurement error in the UO2(NO3)2 equilibrium concentration and a 3% error in the determination of the HNO3 activity coefficient) displaces the estimates of certain equilibrium constants and leads to incorrect conclusions about the mechanism of the chemical equilibria.

  4. Quadratic time dependent Hamiltonians and separation of variables

    Science.gov (United States)

    Anzaldo-Meneses, A.

    2017-06-01

    Time dependent quantum problems defined by quadratic Hamiltonians are solved using canonical transformations. The Green's function is obtained and a comparison with the classical Hamilton-Jacobi method leads to important geometrical insights like exterior differential systems, Monge cones and time dependent Gaussian metrics. The Wei-Norman approach is applied using unitary transformations defined in terms of generators of the associated Lie groups, here the semi-direct product of the Heisenberg group and the symplectic group. A new explicit relation for the unitary transformations is given in terms of a finite product of elementary transformations. The sequential application of adequate sets of unitary transformations leads naturally to a new separation of variables method for time dependent Hamiltonians, which is shown to be related to the Inönü-Wigner contraction of Lie groups. The new method allows also a better understanding of interacting particles or coupled modes and opens an alternative way to analyze topological phases in driven systems.

  5. Variability in the heritability of body mass index: a systematic review and meta-regression

    Directory of Open Access Journals (Sweden)

    Cathy E Elks

    2012-02-01

    Evidence for a major role of genetic factors in the determination of body mass index (BMI) comes from studies of related individuals. However, heritability estimates for BMI vary widely between studies and the reasons for this remain unclear. While some variation is natural due to differences between populations and settings, study design factors may also explain some of the heterogeneity. We performed a systematic review that identified eighty-eight independent estimates of BMI heritability from twin studies (total 140,525 twins) and twenty-seven estimates from family studies (42,968 family members). BMI heritability estimates from twin studies ranged from 0.47 to 0.90 (5th/50th/95th centiles: 0.58/0.75/0.87) and were generally higher than those from family studies (range: 0.24-0.81; 5th/50th/95th centiles: 0.25/0.46/0.68). Meta-regression of the results from twin studies showed that BMI heritability estimates were 0.07 (P=0.001) higher in children than in adults; estimates increased with mean age among childhood studies (+0.012 per year, P=0.002), but decreased with mean age in adult studies (-0.002 per year, P=0.002). Heritability estimates derived from AE twin models (which assume no contribution of shared environment) were 0.12 higher than those from ACE models (P<0.001), whilst lower estimates were associated with self-reported versus DNA-based determination of zygosity (-0.04, P=0.02), and with self-reported versus measured BMI (-0.05, P=0.03). Together, the above factors explained 47% of the heterogeneity in estimates of BMI heritability from twin studies. In summary, while some variation in BMI heritability is expected due to population-level differences, study design factors explained nearly half the heterogeneity reported in twin studies. The genetic contribution to BMI appears to vary with age and may have a greater influence during childhood than adult life.

  6. Some Simple Computational Formulas for Multiple Regression

    Science.gov (United States)

    Aiken, Lewis R., Jr.

    1974-01-01

    Short-cut formulas are presented for direct computation of the beta weights, the standard errors of the beta weights, and the multiple correlation coefficient for multiple regression problems involving three independent variables and one dependent variable. (Author)
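
    For orientation, the quantities named in this abstract can be computed from correlations alone. The sketch below is not the author's short-cut formulas; it assumes the textbook normal equations R_xx * beta = r_xy for standardized variables and solves them by Cramer's rule. The function names and the example correlations are hypothetical.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def beta_weights(r12, r13, r23, r1y, r2y, r3y):
    """Return (standardized beta weights, multiple correlation R)."""
    rxx = [[1.0, r12, r13], [r12, 1.0, r23], [r13, r23, 1.0]]
    rxy = [r1y, r2y, r3y]
    d = det3(rxx)
    betas = []
    for j in range(3):
        # Replace column j of R_xx with the predictor-criterion correlations.
        mj = [[rxy[row] if col == j else rxx[row][col] for col in range(3)]
              for row in range(3)]
        betas.append(det3(mj) / d)
    # For standardized variables, R^2 is the sum of beta_j * r_jy.
    r_squared = sum(b * r for b, r in zip(betas, rxy))
    return betas, r_squared ** 0.5

# Hypothetical correlations among three predictors and one criterion.
betas, multiple_r = beta_weights(0.2, 0.1, 0.3, 0.5, 0.4, 0.3)
print([round(b, 3) for b in betas], round(multiple_r, 3))
```

    Standard errors of the beta weights additionally require the sample size and are left out of this sketch.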

  7. Epoch-dependent absorption line profile variability in lambda Cep

    CERN Document Server

    Uuh-Sonda, J M; Eenens, P; Mahy, L; Palate, M; Gosset, E; Flores, C A

    2014-01-01

    We present the analysis of a multi-epoch spectroscopic monitoring campaign of the O6Ief star lambda Cep. Previous observations reported the existence of two modes of non-radial pulsations in this star. Our data reveal a much more complex situation. The frequency content of the power spectrum considerably changes from one epoch to the other. We find no stable frequency that can unambiguously be attributed to pulsations. The epoch-dependence of the frequencies and variability patterns are similar to what is seen in the wind emission lines of this and other Oef stars, suggesting that both phenomena likely have the same, currently still unknown, origin.

  8. Multiple linear regression analysis

    Science.gov (United States)

    Edwards, T. R.

    1980-01-01

    Program rapidly selects best-suited set of coefficients. User supplies only vectors of independent and dependent data and specifies confidence level required. Program uses stepwise statistical procedure for relating minimal set of variables to set of observations; final regression contains only most statistically significant coefficients. Program is written in FORTRAN IV for batch execution and has been implemented on NOVA 1200.

  9. Asymptotics for partly linear regression with dependent samples and ARCH errors: consistency with rates

    Institute of Scientific and Technical Information of China (English)

    LU, Zudi

    2001-01-01


  10. Predicting punching acceleration from selected strength and power variables in elite karate athletes: a multiple regression analysis.

    Science.gov (United States)

    Loturco, Irineu; Artioli, Guilherme Giannini; Kobal, Ronaldo; Gil, Saulo; Franchini, Emerson

    2014-07-01

    This study investigated the relationship between punching acceleration and selected strength and power variables in 19 professional karate athletes from the Brazilian National Team (9 men and 10 women; age, 23 ± 3 years; height, 1.71 ± 0.09 m; and body mass [BM], 67.34 ± 13.44 kg). Punching acceleration was assessed under 4 different conditions in a randomized order: (a) fixed distance aiming to attain maximum speed (FS), (b) fixed distance aiming to attain maximum impact (FI), (c) self-selected distance aiming to attain maximum speed, and (d) self-selected distance aiming to attain maximum impact. The selected strength and power variables were as follows: maximal dynamic strength in bench press and squat-machine, squat and countermovement jump height, mean propulsive power in bench throw and jump squat, and mean propulsive velocity in jump squat with 40% of BM. Upper- and lower-body power and maximal dynamic strength variables were positively correlated to punch acceleration in all conditions. Multiple regression analysis also revealed predictive variables: relative mean propulsive power in squat jump (W·kg⁻¹), and maximal dynamic strength 1 repetition maximum in both bench press and squat-machine exercises. An impact-oriented instruction and a self-selected distance to start the movement seem to be crucial to reach the highest acceleration during punching execution. This investigation, while demonstrating strong correlations between punching acceleration and strength-power variables, also provides important information for coaches, especially for designing better training strategies to improve punching speed.

  11. Land use regression modeling of intra-urban residential variability in multiple traffic-related air pollutants

    Directory of Open Access Journals (Sweden)

    Baxter Lisa K

    2008-05-01

    Background: There is a growing body of literature linking GIS-based measures of traffic density to asthma and other respiratory outcomes. However, no consensus exists on which traffic indicators best capture variability in different pollutants or within different settings. As part of a study on childhood asthma etiology, we examined variability in outdoor concentrations of multiple traffic-related air pollutants within urban communities, using a range of GIS-based predictors and land use regression techniques. Methods: We measured fine particulate matter (PM2.5), nitrogen dioxide (NO2), and elemental carbon (EC) outside 44 homes representing a range of traffic densities and neighborhoods across Boston, Massachusetts and nearby communities. Multiple three to four-day average samples were collected at each home during winters and summers from 2003 to 2005. Traffic indicators were derived using Massachusetts Highway Department data and direct traffic counts. Multivariate regression analyses were performed separately for each pollutant, using traffic indicators, land use, meteorology, site characteristics, and central site concentrations. Results: PM2.5 was strongly associated with the central site monitor (R2 = 0.68). Additional variability was explained by total roadway length within 100 m of the home, smoking or grilling near the monitor, and block-group population density (R2 = 0.76). EC showed greater spatial variability, especially during winter months, and was predicted by roadway length within 200 m of the home. The influence of traffic was greater under low wind speed conditions, and concentrations were lower during summer (R2 = 0.52). NO2 showed significant spatial variability, predicted by population density and roadway length within 50 m of the home, modified by site characteristics (obstruction), and with higher concentrations during summer (R2 = 0.56). Conclusion: Each pollutant examined displayed somewhat different spatial patterns

  12. Example-Dependent Cost-Sensitive Logistic Regression for Credit Scoring

    OpenAIRE

    Correa Bahnsen, Alejandro; Aouada, Djamila; Ottersten, Björn

    2014-01-01

    Several real-world classification problems are example-dependent cost-sensitive in nature, where the costs due to misclassification vary between examples. Credit scoring is a typical example of cost-sensitive classification. However, it is usually treated using methods that do not take into account the real financial costs associated with the lending business. In this paper, we propose a new example-dependent cost matrix for credit scoring. Furthermore, we propose an algorithm that introdu...

  13. Effect size and power in assessing moderating effects of categorical variables using multiple regression: a 30-year review.

    Science.gov (United States)

    Aguinis, Herman; Beaty, James C; Boik, Robert J; Pierce, Charles A

    2005-01-01

    The authors conducted a 30-year review (1969-1998) of the size of moderating effects of categorical variables as assessed using multiple regression. The median observed effect size (f²) is only .002, but 72% of the moderator tests reviewed had power of .80 or greater to detect a targeted effect conventionally defined as small. Results suggest the need to minimize the influence of artifacts that produce a downward bias in the observed effect size and put into question the use of conventional definitions of moderating effect sizes. As long as an effect has a meaningful impact, the authors advise researchers to conduct a power analysis and plan future research designs on the basis of smaller and more realistic targeted effect sizes.
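
    To give a sense of scale for an f² of .002, the sketch below assumes the conventional definition of f² for a moderating effect: the gain in R² from adding the product term, divided by the variance left unexplained by the full model. The review itself may use a refined variant. The function name and the R² values are hypothetical.

```python
def f_squared(r2_without_interaction, r2_with_interaction):
    """Effect size f^2 for the increment in R^2 due to the product term."""
    if not 0.0 <= r2_without_interaction <= r2_with_interaction < 1.0:
        raise ValueError("need 0 <= R2 (reduced) <= R2 (full) < 1")
    gain = r2_with_interaction - r2_without_interaction
    return gain / (1.0 - r2_with_interaction)

# Hypothetical example: the moderator term raises R^2 from .3000 to .3014.
print(round(f_squared(0.3000, 0.3014), 4))
```

    Under this definition, an increment in R² of roughly a tenth of a percentage point is enough to produce an f² near .002.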

  14. Effect of methamphetamine dependence on heart rate variability.

    Science.gov (United States)

    Henry, Brook L; Minassian, Arpi; Perry, William

    2012-05-01

    Methamphetamine (METH) is an increasingly popular and highly addictive stimulant associated with autonomic nervous system (ANS) dysfunction, cardiovascular pathology and neurotoxicity. Heart rate variability (HRV) has been used to assess autonomic function and predict mortality in cardiac disorders and drug intoxication, but has not been characterized in METH use. We recorded HRV in a sample of currently abstinent individuals with a history of METH dependence compared to age- and gender-matched drug-free comparison subjects. HRV was assessed using time domain, frequency domain, and non-linear entropic analyses in 17 previously METH-dependent and 21 drug-free comparison individuals during a 5 minute rest period. The METH-dependent group demonstrated significant reduction in HRV, reduced parasympathetic activity, and diminished heartbeat complexity relative to comparison participants. More recent METH use was associated with increased sympathetic tone. Chronic METH exposure may be associated with decreased HRV, impaired vagal function, and reduction in heart rate complexity as assessed by multiple methods of analysis. We discuss and review evidence that impaired HRV may be related to the cardiotoxic or neurotoxic effects of prolonged METH use.

  15. Linking mutagenic activity to micropollutant concentrations in wastewater samples by partial least square regression and subsequent identification of variables.

    Science.gov (United States)

    Hug, Christine; Sievers, Moritz; Ottermanns, Richard; Hollert, Henner; Brack, Werner; Krauss, Martin

    2015-11-01

    We deployed multivariate regression to identify compounds co-varying with the mutagenic activity of complex environmental samples. Wastewater treatment plant (WWTP) effluents with a large share of industrial input of different sampling dates were evaluated for mutagenic activity by the Ames Fluctuation Test and chemically characterized by a screening for suspected pro-mutagens and non-targeted software-based peak detection in full scan data. Areas of automatically detected peaks were used as predictor matrix for partial least squares projections to latent structures (PLS) in combination with measured mutagenic activity. Detected peaks were successively reduced by the exclusion of all peaks with lowest variable importance until the best model (high R² and Q²) was reached. Peaks in the best model co-varying with the observed mutagenicity showed increased chlorine, bromine, sulfur, and nitrogen abundance compared to the original peak set, indicating a preferential selection of anthropogenic compounds. The PLS regression revealed four tentatively identified compounds, newly identified 4-(dimethylamino)-pyridine, and three known micropollutants present in domestic wastewater as co-varying with the mutagenic activity. Co-variance between compounds stemming from industrial wastewater and mutagenic activity supported the application of "virtual" EDA as a statistical tool to separate toxicologically relevant from less relevant compounds.

  16. Exercise training improves heart rate variability after methamphetamine dependency.

    Science.gov (United States)

    Dolezal, Brett Andrew; Chudzynski, Joy; Dickerson, Daniel; Mooney, Larissa; Rawson, Richard A; Garfinkel, Alan; Cooper, Christopher B

    2014-06-01

    Heart rate variability (HRV) reflects a healthy autonomic nervous system and is increased with physical training. Methamphetamine dependence (MD) causes autonomic dysfunction and diminished HRV. We compared recently abstinent methamphetamine-dependent participants with age-matched, drug-free controls (DF) and also investigated whether HRV can be improved with exercise training in the methamphetamine-dependent participants. In 50 participants (MD = 28; DF = 22), resting heart rate (HR; R-R intervals) was recorded over 5 min while seated using a monitor affixed to a chest strap. Previously reported time domain (SDNN, RMSSD, pNN50) and frequency domain (LFnu, HFnu, LF/HF) parameters of HRV were calculated with customized software. MD were randomized to thrice-weekly exercise training (ME = 14) or equal attention without training (MC = 14) over 8 wk. Groups were compared using paired and unpaired t-tests. Statistical significance was set at P ≤ 0.05. Participant characteristics were matched between groups (mean ± SD): age = 33 ± 6 yr; body mass = 82.7 ± 12 kg, body mass index = 26.8 ± 4.1 kg·m⁻². Compared with DF, the MD group had significantly higher resting HR (P increased SDNN (+14.7 ± 2.0 ms, +34%), RMSSD (+19.6 ± 4.2 ms, +63%), pNN50 (+22.6% ± 2.7%, +173%), HFnu (+14.2 ± 1.9, +60%), and decreased HR (-5.2 ± 1.1 bpm, -7%), LFnu (-9.6 ± 1.5, -16%), and LF/HF (-0.7 ± 0.3, -19%). These measures did not change from baseline in the MC group. HRV, based on several conventional indices, was diminished in recently abstinent, methamphetamine-dependent individuals. Moreover, physical training yielded a marked increase in HRV, representing increased vagal modulation or improved autonomic balance.
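
    The time domain indices named in this abstract have standard definitions that can be computed directly from a list of R-R intervals. The sketch below uses those common definitions and is not the authors' customized software; the function name and the sample intervals are hypothetical.

```python
import math
import statistics

def time_domain_hrv(rr_ms):
    """Return (SDNN, RMSSD, pNN50) for R-R intervals given in milliseconds."""
    # Differences between successive intervals.
    diffs = [later - earlier for earlier, later in zip(rr_ms, rr_ms[1:])]
    sdnn = statistics.stdev(rr_ms)                              # SD of all intervals
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))   # RMS of successive differences
    pnn50 = 100.0 * sum(abs(d) > 50 for d in diffs) / len(diffs)  # % of differences > 50 ms
    return sdnn, rmssd, pnn50

# Hypothetical short recording, for illustration only.
sdnn, rmssd, pnn50 = time_domain_hrv([812, 790, 845, 860, 798, 805, 870])
print(f"SDNN = {sdnn:.1f} ms, RMSSD = {rmssd:.1f} ms, pNN50 = {pnn50:.1f}%")
```

    A real 5 min recording would contain a few hundred intervals and would normally be cleaned of ectopic beats and artifacts before these indices are computed.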

  17. Efficient Estimation of Mutual Information for Strongly Dependent Variables

    CERN Document Server

    Gao, Shuyang; Galstyan, Aram

    2014-01-01

    We demonstrate that a popular class of nonparametric mutual information (MI) estimators based on k-nearest-neighbor graphs requires a number of samples that scales exponentially with the true MI. Consequently, accurate estimation of MI between two strongly dependent variables is possible only for prohibitively large sample sizes. This important yet overlooked shortcoming of the existing estimators is due to their implicit reliance on local uniformity of the underlying joint distribution. We introduce a new estimator that is robust to local non-uniformity, works well with limited data, and is able to capture relationship strengths over many orders of magnitude. We demonstrate the superior performance of the proposed estimator on both synthetic and real-world data.

  18. Growth variability in a tissue governed by stress dependent growth

    Science.gov (United States)

    Alim, Karen; Boudaoud, Arezki

    2012-02-01

    Cell wall mechanics lie at the heart of plant cell growth and tissue morphogenesis. Conversely, mechanical forces generated at the tissue level can feed back on cellular dynamics. Differential growth of neighboring cells is one prominent origin of mechanical forces and stresses in tissues where cells adhere to each other. How can stresses arising from differential growth orchestrate large-scale tissue growth? We show that cell growth coupled to the cell's main stress can reduce or increase tissue growth variability. Employing a cell-based two-dimensional tissue model, we investigate the dynamics of a tissue with stress-dependent growth dynamics. We find that the exact cell division rule strongly affects not only the tissue geometry and topology but also its growth dynamics. Our results should make it possible to infer the underlying growth dynamics from live tissue statistics.

  19. Characterizing heart rate variability by scale-dependent Lyapunov exponent

    Science.gov (United States)

    Hu, Jing; Gao, Jianbo; Tung, Wen-wen

    2009-06-01

    Previous studies on heart rate variability (HRV) using chaos theory, fractal scaling analysis, and many other methods, while fruitful in many aspects, have produced much confusion in the literature. Especially the issue of whether normal HRV is chaotic or stochastic remains highly controversial. Here, we employ a new multiscale complexity measure, the scale-dependent Lyapunov exponent (SDLE), to characterize HRV. SDLE has been shown to readily characterize major models of complex time series including deterministic chaos, noisy chaos, stochastic oscillations, random 1/f processes, random Levy processes, and complex time series with multiple scaling behaviors. Here we use SDLE to characterize the relative importance of nonlinear, chaotic, and stochastic dynamics in HRV of healthy, congestive heart failure, and atrial fibrillation subjects. We show that while HRV data of all these three types are mostly stochastic, the stochasticity is different among the three groups.

  20. To resuscitate or not to resuscitate: a logistic regression analysis of physician-related variables influencing the decision.

    Science.gov (United States)

    Einav, Sharon; Alon, Gady; Kaufman, Nechama; Braunstein, Rony; Carmel, Sara; Varon, Joseph; Hersch, Moshe

    2012-09-01

    To determine whether variables in physicians' backgrounds influenced their decision to forego resuscitating a patient they did not previously know. Questionnaire survey of a convenience sample of 204 physicians working in the departments of internal medicine, anaesthesiology and cardiology in 11 hospitals in Israel. Twenty per cent of the participants had elected to forego resuscitating a patient they did not previously know without additional consultation. Physicians who had more frequently elected to forego resuscitation had practised medicine for more than 5 years (p=0.013), estimated the number of resuscitations they had performed as being higher (p=0.009), and perceived their experience in resuscitation as sufficient (p=0.001). The variable that predicted the outcome of always performing resuscitation in the logistic regression model was less than 5 years of experience in medicine (OR 0.227, 95% CI 0.065 to 0.793; p=0.02). Physicians' level of experience may affect the probability of a patient's receiving resuscitation, whereas the physicians' personal beliefs and values did not seem to affect this outcome.
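
    The odds ratio and confidence interval reported in this abstract can be mapped back to the logistic regression coefficient scale. The sketch below assumes the interval is a Wald interval, exp(beta ± 1.96·SE), which the abstract does not state; the function name is hypothetical.

```python
import math

def wald_from_odds_ratio(odds_ratio, ci_low, ci_high, z=1.96):
    """Recover the coefficient and its standard error from OR and 95% CI."""
    beta = math.log(odds_ratio)
    # On the log scale a Wald interval is symmetric with half-width z * SE.
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z)
    return beta, se

# Values reported in the abstract: OR 0.227, 95% CI 0.065 to 0.793.
beta, se = wald_from_odds_ratio(0.227, 0.065, 0.793)
low, high = math.exp(beta - 1.96 * se), math.exp(beta + 1.96 * se)
print(f"beta = {beta:.3f}, SE = {se:.3f}, rebuilt CI = {low:.3f} to {high:.3f}")
```

    If the rebuilt interval closely matches the reported one, the Wald assumption is at least consistent with the published numbers.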

  1. A comparison on parameter-estimation methods in multiple regression analysis with existence of multicollinearity among independent variables

    Directory of Open Access Journals (Sweden)

    Hukharnsusatrue, A.

    2005-11-01

    The objective of this research is to compare methods for estimating multiple regression coefficients in the presence of multicollinearity among independent variables. The estimation methods are the Ordinary Least Squares method (OLS), Restricted Least Squares method (RLS), Restricted Ridge Regression method (RRR) and Restricted Liu method (RL), both when the restrictions are true and when they are not true. The study used the Monte Carlo simulation method. The experiment was repeated 1,000 times under each situation. The results are as follows. CASE 1: The restrictions are true. In all cases, the RRR and RL methods have a smaller Average Mean Square Error (AMSE) than the OLS and RLS methods, respectively. The RRR method provides the smallest AMSE when the level of correlation is high, and also provides the smallest AMSE for all levels of correlation and all sample sizes when the standard deviation is equal to 5. However, the RL method provides the smallest AMSE when the level of correlation is low or medium, except that for a standard deviation equal to 3 and small sample sizes the RRR method provides the smallest AMSE. The AMSE varies directly with, from most to least, the level of correlation, the standard deviation and the number of independent variables, but inversely with the sample size. CASE 2: The restrictions are not true. In all cases, the RRR method provides the smallest AMSE, except that for a standard deviation equal to 1 and an error of restrictions equal to 5%, the OLS method provides the smallest AMSE when the level of correlation is low or medium and the sample size is large, whereas for small sample sizes the RL method provides the smallest AMSE. In addition, when the error of restrictions is increased, the OLS method provides the smallest AMSE for all levels of correlation and all sample sizes, except when the level of correlation is high and the sample size is small. Moreover, in the cases where the OLS method provides the smallest AMSE, the RLS method mostly has a smaller AMSE than
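
    A much reduced version of this kind of Monte Carlo comparison can be sketched in a few lines. The sketch below compares only OLS with ordinary (unrestricted) ridge regression for two correlated predictors and no intercept; it does not implement the restricted estimators (RLS, RRR, RL) studied in the paper. The true coefficients, ridge constant and design settings are all hypothetical.

```python
import random

def solve_2x2(a11, a12, a22, b1, b2):
    """Solve the symmetric system [[a11, a12], [a12, a22]] @ beta = [b1, b2]."""
    det = a11 * a22 - a12 * a12
    return (a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det

def average_mse(rho, n, sigma, ridge_k, reps=1000, beta=(1.0, 1.0), seed=0):
    """Average squared error of the coefficient estimates over `reps` samples.

    ridge_k = 0 gives OLS; ridge_k > 0 adds k to the diagonal of X'X.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        s11 = s12 = s22 = t1 = t2 = 0.0
        for _ in range(n):
            x1 = rng.gauss(0.0, 1.0)
            # x2 has correlation rho with x1.
            x2 = rho * x1 + (1.0 - rho * rho) ** 0.5 * rng.gauss(0.0, 1.0)
            y = beta[0] * x1 + beta[1] * x2 + rng.gauss(0.0, sigma)
            s11 += x1 * x1
            s12 += x1 * x2
            s22 += x2 * x2
            t1 += x1 * y
            t2 += x2 * y
        b1, b2 = solve_2x2(s11 + ridge_k, s12, s22 + ridge_k, t1, t2)
        total += (b1 - beta[0]) ** 2 + (b2 - beta[1]) ** 2
    return total / reps

# Same seed, so both estimators see identical simulated samples.
ols_amse = average_mse(rho=0.99, n=20, sigma=5.0, ridge_k=0.0)
ridge_amse = average_mse(rho=0.99, n=20, sigma=5.0, ridge_k=1.0)
print(f"OLS AMSE = {ols_amse:.2f}, ridge AMSE = {ridge_amse:.2f}")
```

    With highly correlated predictors and a large error standard deviation, the small bias introduced by the ridge constant is outweighed by the reduction in variance, which is the general mechanism behind the ridge-type results reported above.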

  2. Dependence of NAO variability on coupling with sea ice

    Science.gov (United States)

    Strong, Courtenay; Magnusdottir, Gudrun

    2011-05-01

    The variance of the North Atlantic Oscillation index (denoted N) is shown to depend on its coupling with area-averaged sea ice concentration anomalies in and around the Barents Sea (index denoted B). The observed form of this coupling is a negative feedback whereby positive N tends to produce negative B, which in turn forces negative N. The effects of this feedback in the system are examined by modifying the feedback in two modeling frameworks: a statistical vector autoregressive model (F_VAR) and an atmospheric global climate model (F_CAM) customized so that sea ice anomalies on the lower boundary are stochastic with adjustable sensitivity to the model's evolving N. Experiments show that the variance of N decreases nearly linearly with the sensitivity of B to N, where the sensitivity is a measure of the negative feedback strength. Given that the sea ice concentration field has anomalies, the variance of N goes down as these anomalies become more sensitive to N. If the sea ice concentration anomalies are entirely absent, the variance of N is even smaller than in the experiment with the most sensitive anomalies. Quantifying how the variance of N depends on the presence and sensitivity of sea ice anomalies to N has implications for the simulation of N in global climate models. In the physical system, projected changes in sea ice thickness or extent could alter the sensitivity of B to N, impacting the within-season variability and hence predictability of N.

  3. Regression Basics

    CERN Document Server

    Kahane, Leo H

    2007-01-01

    Using a friendly, nontechnical approach, the Second Edition of Regression Basics introduces readers to the fundamentals of regression. Accessible to anyone with an introductory statistics background, this book builds from a simple two-variable model to a model of greater complexity. Author Leo H. Kahane weaves four engaging examples throughout the text to illustrate not only the techniques of regression but also how this empirical tool can be applied in creative ways to consider a broad array of topics. New to the Second Edition Offers greater coverage of simple panel-data estimation:

  4. Classifying geometric variability by dominant eigenmodes of deformation in regressing tumours during active breath-hold lung cancer radiotherapy

    Science.gov (United States)

    Badawi, Ahmed M.; Weiss, Elisabeth; Sleeman, William C., IV; Hugo, Geoffrey D.

    2012-01-01

    The purpose of this study is to develop and evaluate a lung tumour interfraction geometric variability classification scheme as a means to guide adaptive radiotherapy and improve measurement of treatment response. Principal component analysis (PCA) was used to generate statistical shape models of the gross tumour volume (GTV) for 12 patients with weekly breath hold CT scans. Each eigenmode of the PCA model was classified as ‘trending’ or ‘non-trending’ depending on whether its contribution to the overall GTV variability included a time trend over the treatment course. Trending eigenmodes were used to reconstruct the original semi-automatically delineated GTVs into a reduced model containing only time trends. Reduced models were compared to the original GTVs by analyzing the reconstruction error in the GTV and position. Both retrospective (all weekly images) and prospective (only the first four weekly images) models were evaluated. The average volume difference from the original GTV was 4.3% ± 2.4% for the trending model. The positional variability of the GTV over the treatment course, as measured by the standard deviation of the GTV centroid, was 1.9 ± 1.4 mm for the original GTVs, which was reduced to 1.2 ± 0.6 mm for the trending-only model. In 3/13 cases, the dominant eigenmode changed class between the prospective and retrospective models. The trending-only model preserved GTV and shape relative to the original GTVs, while reducing spurious positional variability. The classification scheme appears feasible for separating types of geometric variability by time trend.

  5. Evaluation of heat transfer mathematical models and multiple linear regression to predict the inside variables in semi-solar greenhouse

    Directory of Open Access Journals (Sweden)

    M Taki

    2017-05-01

    Introduction: Controlling the greenhouse microclimate not only influences the growth of plants but is also critical to the spread of diseases inside the greenhouse. The microclimate parameters were inside air, greenhouse roof and soil temperature, relative humidity and solar radiation intensity. Predicting the microclimate conditions inside a greenhouse and enabling the use of automatic control systems are the two main objectives of a greenhouse climate model. The microclimate inside a greenhouse can be predicted by conducting experiments or by simulation. Static and dynamic models are used for this purpose as a function of the meteorological conditions and the parameters of the greenhouse components. Several studies up to 2015 simulated and predicted the inside variables in different greenhouse structures. Simulation usually has difficulty predicting the inside climate of a greenhouse, and the simulation errors reported in the literature are relatively high. The main objective of this paper is to compare heat transfer and regression models for predicting the inside air and roof temperature in a semi-solar greenhouse at Tabriz University. Materials and Methods: In this study, a semi-solar greenhouse was designed and constructed in the north-west of Iran, in Azerbaijan Province (geographical location 38°10′ N, 46°18′ E, with an elevation of 1364 m above sea level). The shape and orientation of the greenhouse were selected from among common greenhouse shapes so as to receive maximum solar radiation throughout the year. An internal thermal screen and a cement north wall were also used to store heat and prevent heat loss during the cold period of the year; for this reason the structure is called a ‘semi-solar’ greenhouse. It was covered with glass (4 mm thickness). It occupies a surface of approximately 15.36 m² and a volume of 26.4 m³. The orientation of this greenhouse was East–West and perpendicular to the direction of the wind prevailing

  6. Modelos de regresión para variables expresadas como una proporción continua Regression models for variables expressed as a continuous proportion

    Directory of Open Access Journals (Sweden)

    Aarón Salinas-Rodríguez

    2006-10-01

    the Public Health field. MATERIAL AND METHODS: From the National Reproductive Health Survey performed in 2003, the proportion of individual coverage in the family planning program, proposed in one study carried out at the National Institute of Public Health in Cuernavaca, Morelos, Mexico (2005), was modeled using the Normal, Gamma, Beta and quasi-likelihood regression models. The Akaike Information Criterion (AIC) proposed by McQuarrie and Tsai was used to define the best model. Then, using a simulation (Monte Carlo/Markov Chains) approach, a variable with a Beta distribution was generated to evaluate the behavior of the 4 models while varying the sample size from 100 to 18 000 observations. RESULTS: Results showed that the best statistical option for the analysis of continuous proportions was the Beta regression model, since its assumptions are easily met and because it had the lowest AIC value. The simulation showed that as the sample size increases, the Gamma model, and even more so the quasi-likelihood model, come significantly closer to the Beta regression model. CONCLUSIONS: The use of parametric Beta regression is highly recommended to model continuous proportions and the normal model should be avoided. If the sample size is large enough, the use of a quasi-likelihood model represents a good alternative.
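
    For orientation, a Beta regression for a proportion y in (0, 1) is often written in a mean-precision form with a logit link, and models are compared by AIC. The parametrization below is an assumption (the record does not say which one the authors used), and the AIC shown is the standard definition, which may differ from the variant attributed to McQuarrie and Tsai. Here \mu_i is the mean, \phi a precision parameter, k the number of estimated parameters and \ell the maximized log-likelihood.

```latex
y_i \sim \mathrm{Beta}\bigl(\mu_i\phi,\ (1-\mu_i)\phi\bigr), \qquad
\operatorname{logit}(\mu_i) = x_i^{\top}\beta, \qquad
\operatorname{Var}(y_i) = \frac{\mu_i(1-\mu_i)}{1+\phi}, \qquad
\mathrm{AIC} = 2k - 2\,\ell(\hat\beta, \hat\phi)
```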

  7. Human phoneme recognition depending on speech-intrinsic variability.

    Science.gov (United States)

    Meyer, Bernd T; Jürgens, Tim; Wesker, Thorsten; Brand, Thomas; Kollmeier, Birger

    2010-11-01

    The influence of different sources of speech-intrinsic variation (speaking rate, effort, style and dialect or accent) on human speech perception was investigated. In listening experiments with 16 listeners, confusions of consonant-vowel-consonant (CVC) and vowel-consonant-vowel (VCV) sounds in speech-weighted noise were analyzed. Experiments were based on the OLLO logatome speech database, which was designed for a man-machine comparison. It contains utterances spoken by 50 speakers from five dialect/accent regions and covers several intrinsic variations. By comparing results depending on intrinsic and extrinsic variations (i.e., different levels of masking noise), the degradation induced by variabilities can be expressed in terms of the SNR. The spectral level distance between the respective speech segment and the long-term spectrum of the masking noise was found to be a good predictor for recognition rates, while phoneme confusions were influenced by the distance to spectrally close phonemes. An analysis based on transmitted information of articulatory features showed that voicing and manner of articulation are comparatively robust cues in the presence of intrinsic variations, whereas the coding of place is more degraded. The database and detailed results have been made available for comparisons between human speech recognition (HSR) and automatic speech recognizers (ASR).

  8. Exercise Training Improves Heart Rate Variability after Methamphetamine Dependency

    Science.gov (United States)

    Dolezal, Brett A.; Chudzynski, Joy; Dickerson, Daniel; Mooney, Larissa; Rawson, Richard A.; Garfinkel, Alan; Cooper, Christopher B.

    2014-01-01

    Purpose: Heart rate variability (HRV) reflects a healthy autonomic nervous system and is increased with physical training. Methamphetamine dependence (MD) causes autonomic dysfunction and diminished HRV. We compared recently abstinent MD participants with age-matched, drug free controls (DF) and also investigated whether HRV can be improved with exercise training in the MD participants. Methods: In 50 participants (MD=28; DF=22) resting heart rate (R-R intervals) was recorded over 5 min while seated using a monitor affixed to a chest strap. Previously reported time-domain (SDNN, RMSSD, pNN50) and frequency-domain (LFnu, HFnu, LF/HF) parameters of HRV were calculated with customized software. MD were randomized to thrice weekly exercise training (ME=14) or equal attention without training (MC=14) over 8 weeks. Groups were compared using paired and unpaired t-tests. Statistical significance was set at P≤0.05. Results: Participant characteristics were matched between groups: age 33±6 years; body mass 82.7±12 kg, BMI 26.8±4.1 kg•m−2, mean±SD. Compared with DF, the MD group had significantly higher resting heart rate (P<0.05), LFnu, and LF/HF (P<0.001) as well as lower SDNN, RMSSD, pNN50 and HFnu (all P<0.001). At randomization, HRV indices were similar between ME and MC groups. However, after training, the ME group significantly (all P<0.001) increased SDNN (+14.7±2.0 ms, +34%), RMSSD (+19.6±4.2 ms, +63%), pNN50 (+22.6±2.7%, +173%), HFnu (+14.2±1.9, +60%) and decreased HR (−5.2±1.1 beats·min−1, −7%), LFnu (−9.6±1.5, −16%) and LF/HF (−0.7±0.3, −19%). These measures did not change from baseline in the MC group. Conclusion: HRV, based on several conventional indices, was diminished in recently abstinent, methamphetamine dependent individuals. Moreover, physical training yielded a marked increase of HRV representing increased vagal modulation or improved autonomic balance. PMID:24162556
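
    The time-domain indices named in this record have standard definitions that can be computed directly from a series of R-R intervals. The sketch below uses those textbook definitions, not the study's customized software; the function name and the interval values are made up for illustration.

```python
import math
import statistics


def hrv_time_domain(rr_ms):
    """Return (SDNN, RMSSD, pNN50) for a list of R-R intervals in milliseconds."""
    # Differences between successive intervals.
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    # SDNN: sample standard deviation of all intervals.
    sdnn = statistics.stdev(rr_ms)
    # RMSSD: root mean square of the successive differences.
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    # pNN50: percentage of successive differences larger than 50 ms.
    pnn50 = 100.0 * sum(abs(d) > 50 for d in diffs) / len(diffs)
    return sdnn, rmssd, pnn50


rr = [800, 810, 870, 860, 800]  # illustrative intervals, not study data
sdnn, rmssd, pnn50 = hrv_time_domain(rr)
print(f"SDNN={sdnn:.1f} ms, RMSSD={rmssd:.1f} ms, pNN50={pnn50:.0f}%")
```

    A 5-minute recording as used in the study would supply a few hundred intervals rather than five; the calculation is the same.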

  9. Multi-Modal Multi-Task Learning for Joint Prediction of Multiple Regression and Classification Variables in Alzheimer’s Disease

    Science.gov (United States)

    Zhang, Daoqiang; Shen, Dinggang

    2011-01-01

    Many machine learning and pattern classification methods have been applied to the diagnosis of Alzheimer’s disease (AD) and its prodromal stage, i.e., mild cognitive impairment (MCI). Recently, rather than predicting categorical variables as in classification, several pattern regression methods have also been used to estimate continuous clinical variables from brain images. However, most existing regression methods focus on estimating multiple clinical variables separately and thus cannot utilize the intrinsic useful correlation information among different clinical variables. On the other hand, in those regression methods, only a single modality of data (usually only the structural MRI) is often used, without considering the complementary information that can be provided by different modalities. In this paper, we propose a general methodology, namely Multi-Modal Multi-Task (M3T) learning, to jointly predict multiple variables from multi-modal data. Here, the variables include not only the clinical variables used for regression but also the categorical variable used for classification, with different tasks corresponding to prediction of different variables. Specifically, our method contains two key components, i.e., (1) a multi-task feature selection which selects the common subset of relevant features for multiple variables from each modality, and (2) a multi-modal support vector machine which fuses the above-selected features from all modalities to predict multiple (regression and classification) variables. To validate our method, we perform two sets of experiments on ADNI baseline MRI, FDG-PET, and cerebrospinal fluid (CSF) data from 45 AD patients, 91 MCI patients, and 50 healthy controls (HC). In the first set of experiments, we estimate two clinical variables such as Mini Mental State Examination (MMSE) and Alzheimer’s Disease Assessment Scale - Cognitive Subscale (ADAS-Cog), as well as one categorical variable (with value of ‘AD’, ‘MCI’ or

  10. On the Variability and Correlation of Surface Ozone and Carbon Monoxide Observed in Hong Kong Using Trajectory and Regression Analyses

    Institute of Scientific and Technical Information of China (English)

    WANG Tijian(王体健); K. S. LAM; C. W. TSANG; S. C. KOT

    2004-01-01

    This paper investigates the variability and correlation of surface ozone (O3) and carbon monoxide (CO) observed at Cape D'Aguilar in Hong Kong from 1 January 1994 to 31 December 1995. Statistical analysis shows that the average O3 and CO mixing ratios during the two years are 32±17 ppbv and 305±191 ppbv, respectively. The O3/CO ratio ranges from 0.05 to 0.6 ppbv/ppbv with its frequency peaking at 0.15. The raw dataset is divided into six groups using backward trajectory and cluster analyses. For data assigned to the same trajectory type, three groups are further sorted out based on CO and NOx mixing ratios. The correlation coefficients and slopes of O3/CO for the 18 groups are calculated using linear regression analysis. Finally, five kinds of air masses with different chemical features are identified: continental background (CB), marine background (MB), regional polluted continental (RPC), perturbed marine (P*M), and local polluted (LP) air masses. Further studies indicate that O3 and CO in the continental and marine background air masses (CB and MB) are positively correlated for the reason that they are well mixed over the long range transport before arriving at the site. The negative correlation between O3 and CO in air mass LP is believed to be associated with heavy anthropogenic influence, which results from the enhancement by local sources as indicated by high CO and NOx and depletion of O3 when mixed with fresh emissions. The positive correlation in the perturbed marine air mass P*M favors the low photochemical production of O3. The negative correlation found in the regional polluted continental air mass RPC is different from the observations at Oki Island in Japan due to the more complex O3 chemistry at Cape D'Aguilar.
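
    The per-group step described above (a regression slope and a correlation coefficient of O3 against CO for each group) could look like the following sketch. The group names and mixing ratios are invented for illustration and are not the Cape D'Aguilar data; the helper name is hypothetical.

```python
import numpy as np


def slope_and_correlation(co_ppbv, o3_ppbv):
    """Least-squares slope and intercept of O3 on CO, plus Pearson correlation."""
    co = np.asarray(co_ppbv, dtype=float)
    o3 = np.asarray(o3_ppbv, dtype=float)
    slope, intercept = np.polyfit(co, o3, 1)  # degree-1 fit: o3 = slope*co + intercept
    r = np.corrcoef(co, o3)[0, 1]
    return slope, intercept, r


# Hypothetical groups: one positively and one negatively correlated.
groups = {
    "group_a": ([150, 250, 350, 450], [27.5, 42.5, 57.5, 72.5]),
    "group_b": ([300, 500, 700, 900], [40, 34, 28, 22]),
}
results = {name: slope_and_correlation(co, o3) for name, (co, o3) in groups.items()}
for name, (slope, intercept, r) in results.items():
    print(f"{name}: slope={slope:.3f} ppbv/ppbv, r={r:.2f}")
```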

  11. Variability of interconnected wind plants: correlation length and its dependence on variability time scale

    Science.gov (United States)

    St. Martin, Clara M.; Lundquist, Julie K.; Handschy, Mark A.

    2015-04-01

    The variability in wind-generated electricity complicates the integration of this electricity into the electrical grid. This challenge steepens as the percentage of renewably-generated electricity on the grid grows, but variability can be reduced by exploiting geographic diversity: correlations between wind farms decrease as the separation between wind farms increases. But how far is far enough to reduce variability? Grid management requires balancing production on various timescales, and so consideration of correlations reflective of those timescales can guide the appropriate spatial scales of geographic diversity grid integration. To answer ‘how far is far enough,’ we investigate the universal behavior of geographic diversity by exploring wind-speed correlations using three extensive datasets spanning continents, durations and time resolution. First, one year of five-minute wind power generation data from 29 wind farms span 1270 km across Southeastern Australia (Australian Energy Market Operator). Second, 45 years of hourly 10 m wind-speeds from 117 stations span 5000 km across Canada (National Climate Data Archive of Environment Canada). Finally, four years of five-minute wind-speeds from 14 meteorological towers span 350 km of the Northwestern US (Bonneville Power Administration). After removing diurnal cycles and seasonal trends from all datasets, we investigate dependence of correlation length on time scale by digitally high-pass filtering the data on 0.25-2000 h timescales and calculating correlations between sites for each high-pass filter cut-off. Correlations fall to zero with increasing station separation distance, but the characteristic correlation length varies with the high-pass filter applied: the higher the cut-off frequency, the smaller the station separation required to achieve de-correlation. Remarkable similarities between these three datasets reveal behavior that, if universal, could be particularly useful for grid management. For high
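
    The effect of high-pass filtering on inter-site correlation can be illustrated with synthetic data. This sketch is an assumption-laden stand-in for the study's method: it uses subtraction of a centred moving average as the high-pass filter, and two made-up series that share a slow signal but have independent fast noise.

```python
import numpy as np


def high_pass(x, window):
    """Remove slow variations by subtracting a centred moving average (odd window)."""
    kernel = np.ones(window) / window
    smooth = np.convolve(x, kernel, mode="valid")  # length len(x) - window + 1
    half = window // 2
    return x[half:len(x) - half] - smooth          # align series with its average


rng = np.random.default_rng(0)
t = np.arange(5000)
slow = 3.0 * np.sin(2 * np.pi * t / 500)           # slow signal shared by both sites
site_a = slow + rng.standard_normal(t.size)        # independent fast noise per site
site_b = slow + rng.standard_normal(t.size)

raw_corr = np.corrcoef(site_a, site_b)[0, 1]
hp_corr = np.corrcoef(high_pass(site_a, 25), high_pass(site_b, 25))[0, 1]
print(f"raw correlation: {raw_corr:.2f}, high-passed correlation: {hp_corr:.2f}")
```

    Because only the slow component is common to both series, the correlation is high before filtering and close to zero once the slow component is removed, which mirrors the dependence of correlation on time scale described in the record.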

  12. Relationship of push-ups and sit-ups tests to selected anthropometric variables and performance results: a multiple regression study.

    Science.gov (United States)

    Esco, Michael R; Olson, Michele S; Williford, Henry

    2008-11-01

    The purpose of this study was to explore whether selected anthropometric measures such as specific skinfold sites, along with weight, height, body mass index (BMI), waist and hip circumferences, and waist/hip ratio (WHR) were associated with sit-ups (SU) and push-ups (PU) performance, and to build a regression model for SU and PU tests. One hundred apparently healthy adults (40 men and 60 women) served as the subjects for test validation. The subjects performed 60-second SU and PU tests. The variables analyzed via multiple regression included weight, height, BMI, hip and waist circumferences, WHR, skinfolds at the abdomen (SFAB), thigh (SFTH), and subscapularis (SFSS), and sex. An additional cohort of 40 subjects (17 men and 23 women) was used to cross-validate the regression models. Validity was confirmed by correlation and paired t-tests. The regression analysis yielded a four-variable (PU, height, SFAB, and SFTH) multiple regression equation for estimating SU (R2 = 0.64, SEE = 7.5 repetitions). For PU, only SU was loaded into the regression equation (R2 = 0.43, SEE = 9.4 repetitions). Thus, the variables in the regression models accounted for 64% and 43% of the variation in SU and PU, respectively. The cross-validation sample elicited a high correlation for SU (r = 0.87) and PU (r = 0.79) scores. Moreover, paired-samples t-tests revealed that there were no significant differences between actual and predicted SU and PU scores. Therefore, this study shows that there are a number of selected, health-related anthropometric variables that account significantly for, and are predictive of, SU and PU tests.

  13. Multiple Regression and Mediator Variables can be used to Avoid Double Counting when Economic Values are Derived using Stochastic Herd Simulation

    DEFF Research Database (Denmark)

    Østergaard, Søren; Ettema, Jehan Frans; Hjortø, Line

    Multiple regression and model building with mediator variables was addressed to avoid double counting when economic values are estimated from data simulated with herd simulation modeling (using the SimHerd model). The simulated incidence of metritis was analyzed statistically as the independent...... variable, while using the traits representing the direct effects of metritis on yield, fertility and occurrence of other diseases as mediator variables. The economic value of metritis was estimated to be €78 per 100 cow-years for each 1% increase of metritis in the period of 1-100 days in milk...... in multiparous cows. The merit of using this approach was demonstrated since the economic value of metritis was estimated to be 81% higher when no mediator variables were included in the multiple regression analysis...

  14. Linear regression

    CERN Document Server

    Olive, David J

    2017-01-01

    This text covers both multiple linear regression and some experimental design models. The text uses the response plot to visualize the model and to detect outliers, does not assume that the error distribution has a known parametric distribution, develops prediction intervals that work when the error distribution is unknown, suggests bootstrap hypothesis tests that may be useful for inference after variable selection, and develops prediction regions and large sample theory for the multivariate linear regression model that has m response variables. A relationship between multivariate prediction regions and confidence regions provides a simple way to bootstrap confidence regions. These confidence regions often provide a practical method for testing hypotheses. There is also a chapter on generalized linear models and generalized additive models. There are many R functions to produce response and residual plots, to simulate prediction intervals and hypothesis tests, to detect outliers, and to choose response trans...

  15. Kendall-Theil Robust Line (KTRLine--version 1.0)-A Visual Basic Program for Calculating and Graphing Robust Nonparametric Estimates of Linear-Regression Coefficients Between Two Continuous Variables

    Science.gov (United States)

    Granato, Gregory E.

    2006-01-01

    data in subsequent rows. The user may choose the columns that contain the independent (X) and dependent (Y) variable. A third column, if present, may contain metadata such as the sample-collection location and date. The program screens the input files and plots the data. The KTRLine software is a graphical tool that facilitates development of regression models by use of graphs of the regression line with data, the regression residuals (with X or Y), and percentile plots of the cumulative frequency of the X variable, Y variable, and the regression residuals. The user may individually transform the independent and dependent variables to reduce heteroscedasticity and to linearize data. The program plots the data and the regression line. The program also prints model specifications and regression statistics to the screen. The user may save and print the regression results. The program can accept data sets that contain up to about 15,000 XY data points, but because the program must sort the array of all pairwise slopes, the program may be perceptibly slow with data sets that contain more than about 1,000 points.
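
    The robust line this record refers to is built from the median of all pairwise slopes, which is why the program has to sort a large array. A minimal sketch follows; the intercept rule used here (median of Y minus slope times median of X) is a common convention assumed for the example, not something the excerpt confirms, and the function name and data are hypothetical.

```python
from itertools import combinations
from statistics import median


def kendall_theil_line(x, y):
    """Return (slope, intercept) of the median-of-pairwise-slopes line."""
    # Slope between every pair of points, skipping pairs with equal x.
    slopes = [
        (y2 - y1) / (x2 - x1)
        for (x1, y1), (x2, y2) in combinations(zip(x, y), 2)
        if x2 != x1
    ]
    slope = median(slopes)
    # Assumed convention: line passes through (median x, median y).
    intercept = median(y) - slope * median(x)
    return slope, intercept


x = [0, 1, 2, 3, 4]
y = [1, 3, 5, 7, 100]  # last point is a deliberate outlier
slope, intercept = kendall_theil_line(x, y)
print(slope, intercept)
```

    The single outlier does not move the fitted line, since most pairwise slopes are unaffected by it. The number of pairs is n(n-1)/2, so about 1,000 points already give roughly 500,000 slopes and 15,000 points give over 100 million, consistent with the slowdown noted in the record.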

  16. Selecting variables in non-parametric regression models for binary response. An application to the computerized detection of breast cancer.

    Science.gov (United States)

    Roca-Pardiñas, Javier; Cadarso-Suárez, Carmen; Tahoces, Pablo G; Lado, María J

    2009-01-30

    In many biomedical applications, interest lies in being able to distinguish between two possible states of a given response variable, depending on the values of certain continuous predictors. If the number of predictors, p, is high, or if there is redundancy among them, it then becomes important to decide on the selection of the best subset of predictors that will be able to obtain the models with greatest discrimination capacity. With this aim in mind, logistic generalized additive models were considered and receiver operating characteristic (ROC) curves were applied in order to determine and compare the discriminatory capacity of such models. This study sought to develop bootstrap-based tests that allow for the following to be ascertained: (a) the optimal number q ≤ p of predictors; and (b) the model or models including q predictors, which display the largest AUC (area under the ROC curve). A simulation study was conducted to verify the behaviour of these tests. Finally, the proposed method was applied to a computer-aided diagnostic system dedicated to early detection of breast cancer. Copyright (c) 2008 John Wiley & Sons, Ltd.

  17. Factorial kriging and stepwise regression approach to identify environmental factors influencing spatial multi-scale variability of heavy metals in soils.

    Science.gov (United States)

    Lv, Jianshu; Liu, Yang; Zhang, Zulu; Dai, Jierui

    2013-10-15

    Knowledge of the spatial variations of heavy metals in soils and their relationships with environmental factors is important for human impact assessment and soil management. Surface soils from Rizhao city, Eastern China, an area of rapid urbanization and industrialization, were analyzed for six key heavy metals and characterized by parent material and land use using GIS-based data. Factorial kriging analysis and stepwise multiple regression were applied to examine the scale-dependent relationships among heavy metals and to identify environmental factors affecting spatial variability at each spatial scale. Linear model of coregionalization fitting showed that the spatial multi-scale variation of heavy metals in soils consisted of a nugget effect, an exponential structure with a range of 12 km (short-range scale), and a spherical structure with a range of 36 km (long-range scale). The short-range variation of Cd, Pb and Zn was controlled by land use, with higher values in urban areas as well as cultivated land in mountain areas, and was related to human influence, while parent material dominated the long-range variations of these elements. Spatial variations of Cr and Ni were associated with natural geochemical sources at short- and long-range scales. At both scales, Hg was dominated by land use, corresponded well to the spatial distribution of urban areas, and was attributed to anthropic emissions and atmospheric deposition.

  18. Age Dependent Variability in Gene Expression in Fischer 344 ...

    Science.gov (United States)

    Recent evidence suggests older adults may be a sensitive population with regard to environmental exposure to toxic compounds. One source of this sensitivity could be an enhanced variability in response. Studies on phenotypic differences have suggested that variation in response does increase with age. However, few reports address the question of variation in gene expression as an underlying cause for increased variability of phenotypic response in the aged. In this study, we utilized global analysis to compare variation in constitutive gene expression in the retinae of young (4 mos), middle-aged (11 mos) and aged (23 mos) Fischer 344 rats. Three hundred and forty transcripts were identified in which variance in expression increased from 4 to 23 mos of age, while only twelve transcripts were found for which it decreased. Functional roles for identified genes were clustered in basic biological categories including cell communication, function, metabolism and response to stimuli. Our data suggest that population stochastically-induced variability should be considered in assessing sensitivity due to old age.

  19. High Resolution Mapping of Soil Properties Using Remote Sensing Variables in South-Western Burkina Faso: A Comparison of Machine Learning and Multiple Linear Regression Models

    Science.gov (United States)

    Welp, Gerhard; Thiel, Michael

    2017-01-01

    Accurate and detailed spatial soil information is essential for environmental modelling, risk assessment and decision making. The use of Remote Sensing data as secondary sources of information in digital soil mapping has been found to be cost effective and less time consuming compared to traditional soil mapping approaches. But the potentials of Remote Sensing data in improving knowledge of local scale soil information in West Africa have not been fully explored. This study investigated the use of high spatial resolution satellite data (RapidEye and Landsat), terrain/climatic data and laboratory analysed soil samples to map the spatial distribution of six soil properties–sand, silt, clay, cation exchange capacity (CEC), soil organic carbon (SOC) and nitrogen–in a 580 km2 agricultural watershed in south-western Burkina Faso. Four statistical prediction models–multiple linear regression (MLR), random forest regression (RFR), support vector machine (SVM), stochastic gradient boosting (SGB)–were tested and compared. Internal validation was conducted by cross validation while the predictions were validated against an independent set of soil samples considering the modelling area and an extrapolation area. Model performance statistics revealed that the machine learning techniques performed marginally better than the MLR, with the RFR providing in most cases the highest accuracy. The inability of MLR to handle non-linear relationships between dependent and independent variables was found to be a limitation in accurately predicting soil properties at unsampled locations. Satellite data acquired during ploughing or early crop development stages (e.g. May, June) were found to be the most important spectral predictors while elevation, temperature and precipitation came up as prominent terrain/climatic variables in predicting soil properties. The results further showed that shortwave infrared and near infrared channels of Landsat8 as well as soil specific indices of

  20. High Resolution Mapping of Soil Properties Using Remote Sensing Variables in South-Western Burkina Faso: A Comparison of Machine Learning and Multiple Linear Regression Models.

    Science.gov (United States)

    Forkuor, Gerald; Hounkpatin, Ozias K L; Welp, Gerhard; Thiel, Michael

    2017-01-01

    Accurate and detailed spatial soil information is essential for environmental modelling, risk assessment and decision making. The use of Remote Sensing data as secondary sources of information in digital soil mapping has been found to be cost effective and less time consuming compared to traditional soil mapping approaches. But the potentials of Remote Sensing data in improving knowledge of local scale soil information in West Africa have not been fully explored. This study investigated the use of high spatial resolution satellite data (RapidEye and Landsat), terrain/climatic data and laboratory analysed soil samples to map the spatial distribution of six soil properties (sand, silt, clay, cation exchange capacity (CEC), soil organic carbon (SOC) and nitrogen) in a 580 km² agricultural watershed in south-western Burkina Faso. Four statistical prediction models, namely multiple linear regression (MLR), random forest regression (RFR), support vector machine (SVM) and stochastic gradient boosting (SGB), were tested and compared. Internal validation was conducted by cross validation while the predictions were validated against an independent set of soil samples considering the modelling area and an extrapolation area. Model performance statistics revealed that the machine learning techniques performed marginally better than the MLR, with the RFR providing in most cases the highest accuracy. The inability of MLR to handle non-linear relationships between dependent and independent variables was found to be a limitation in accurately predicting soil properties at unsampled locations. Satellite data acquired during ploughing or early crop development stages (e.g. May, June) were found to be the most important spectral predictors while elevation, temperature and precipitation came up as prominent terrain/climatic variables in predicting soil properties. The results further showed that shortwave infrared and near infrared channels of Landsat8 as well as soil specific indices of redness

  1. Interval ridge regression (iRR) as a fast and robust method for quantitative prediction and variable selection applied to edible oil adulteration.

    Science.gov (United States)

    Jović, Ozren; Smrečki, Neven; Popović, Zora

    2016-04-01

    A novel quantitative prediction and variable selection method called interval ridge regression (iRR) is studied in this work. The method is performed on six data sets of FTIR, two data sets of UV-vis and one data set of DSC. The obtained results show that models built with ridge regression on optimal variables selected with iRR significantly outperform models built with ridge regression on all variables in both calibration (6 out of 9 cases) and validation (2 out of 9 cases). In this study, iRR is also compared with interval partial least squares regression (iPLS). iRR outperformed iPLS in validation (insignificantly in 6 out of 9 cases and significantly in one out of 9 cases for poil, a well known health beneficial nutrient, is studied in this work by mixing it with cheap and widely used oils such as soybean (So) oil, rapeseed (R) oil and sunflower (Su) oil. Binary mixture sets of hempseed oil with these three oils (HSo, HR and HSu) and a ternary mixture set of H oil, R oil and Su oil (HRSu) were considered. The obtained accuracy indicates that using iRR on FTIR and UV-vis data, each particular oil can be very successfully quantified (in all 8 cases RMSEPoil (R²>0.99).

  2. The functional central limit theorem for strong near-epoch dependent random variables

    Institute of Scientific and Technical Information of China (English)

    QIU Jin; LIN Zhengyan

    2004-01-01

    The functional central limit theorem for strong near-epoch dependent sequences of random variables is proved. The conditions given improve on previous results in the literature concerning dependence and heterogeneity.

  3. On an asymptotic distribution of dependent random variables on a 3-dimensional lattice.

    Science.gov (United States)

    Harvey, Danielle J; Weng, Qian; Beckett, Laurel A

    2010-06-15

    We define conditions under which sums of dependent spatial data will be approximately normally distributed. A theorem on the asymptotic distribution of a sum of dependent random variables defined on a 3-dimensional lattice is presented. Examples are also presented.

  4. Minimax lower bound for kink location estimators in a nonparametric regression model with long-range dependence

    CERN Document Server

    Wishart, Justin Rory

    2011-01-01

    In this paper, a lower bound is determined in the minimax sense for change point estimators of the first derivative of a regression function in the fractional white noise model. Similar minimax results presented previously in the area focus on change points in the derivatives of a regression function in the white noise model or consider estimation of the regression function in the presence of correlated errors.

  5. LARGE DEVIATIONS AND MODERATE DEVIATIONS FOR SUMS OF NEGATIVELY DEPENDENT RANDOM VARIABLES

    Institute of Scientific and Technical Information of China (English)

    Liu Li; Wan Chenggao; Feng Yanqin

    2011-01-01

    In this article, we obtain the large deviations and moderate deviations for negatively dependent (ND) and non-identically distributed random variables defined on (-∞, +∞). The results show that for some non-identical random variables, precise large deviations and moderate deviations remain insensitive to negative dependence structure.

  6. Concomitant variables in finite mixture models

    NARCIS (Netherlands)

    Wedel, M

    The standard mixture model, the concomitant variable mixture model, the mixture regression model and the concomitant variable mixture regression model all enable simultaneous identification and description of groups of observations. This study reviews the different ways in which dependencies among

  7. Geodesic least squares regression on information manifolds

    Energy Technology Data Exchange (ETDEWEB)

    Verdoolaege, Geert, E-mail: geert.verdoolaege@ugent.be [Department of Applied Physics, Ghent University, Ghent, Belgium and Laboratory for Plasma Physics, Royal Military Academy, Brussels (Belgium)]

    2014-12-05

    We present a novel regression method targeted at situations with significant uncertainty on both the dependent and independent variables or with non-Gaussian distribution models. Unlike the classic regression model, the conditional distribution of the response variable suggested by the data need not be the same as the modeled distribution. Instead they are matched by minimizing the Rao geodesic distance between them. This yields a more flexible regression method that is less constrained by the assumptions imposed through the regression model. As an example, we demonstrate the improved resistance of our method against some flawed model assumptions and we apply this to scaling laws in magnetic confinement fusion.

  8. Depicting the Dependency of Isoprene in Ambient Air and from Plants on Temperature and Solar Radiation by Using Regression Analysis

    Science.gov (United States)

    Saxena, Pallavi; Ghosh, Chirashree

    2016-07-01

    Among all sources of volatile organic compounds, isoprene emission from plants is an important part of the atmospheric hydrocarbon budget. In the present study, isoprene emission capacity was measured at the bottom of the canopies of two plant species, Dalbergia sissoo and Nerium oleander, and in ambient air at different sites selected on the basis of land use pattern, viz. near to traffic intersection with dense vegetation, away from traffic intersection with dense vegetation under floodplain area (Site I) and away from traffic intersection with dense vegetation under hilly ridge area (Site II), during three different seasons (monsoon, winter and summer) in Delhi. In order to find out the dependence of isoprene emission rate on temperature and solar radiation, regression analysis has been performed. For isoprene in ambient air, concentrations were found to be higher during the summer season than during the winter and monsoon seasons. Thus, a positive linear relationship gives the best fit between temperature, solar radiation and isoprene during the summer season as compared to the winter and monsoon seasons. On the other hand, for isoprene emission from the selected plant species, it has been found that high temperature and solar radiation promote high isoprene emission rates during the summer season as compared to the winter and monsoon seasons in D. sissoo. Thus, a positive linear relationship gives the best fit between temperature, solar radiation and isoprene emission rate during the summer season as compared to the winter and monsoon seasons. In contrast, for Nerium oleander, no such relationship was obtained. The study concludes that in ambient air, isoprene concentration was found to be high during the summer season as compared to other seasons and gives the best fit between temperature, solar radiation and isoprene. In case of plants, Dalbergia sissoo comes under high isoprene emission category

  9. Introduction to the use of regression models in epidemiology.

    Science.gov (United States)

    Bender, Ralf

    2009-01-01

    Regression modeling is one of the most important statistical techniques used in analytical epidemiology. By means of regression models the effect of one or several explanatory variables (e.g., exposures, subject characteristics, risk factors) on a response variable such as mortality or cancer can be investigated. From multiple regression models, adjusted effect estimates can be obtained that take the effect of potential confounders into account. Regression methods can be applied in all epidemiologic study designs so that they represent a universal tool for data analysis in epidemiology. Different kinds of regression models have been developed in dependence on the measurement scale of the response variable and the study design. The most important methods are linear regression for continuous outcomes, logistic regression for binary outcomes, Cox regression for time-to-event data, and Poisson regression for frequencies and rates. This chapter provides a nontechnical introduction to these regression models with illustrating examples from cancer research.

  10. How to regress and predict in a Bland-Altman plot? Review and contribution based on tolerance intervals and correlated-errors-in-variables models.

    Science.gov (United States)

    Francq, Bernard G; Govaerts, Bernadette

    2016-06-30

    Two main methodologies for assessing equivalence in method-comparison studies are presented separately in the literature. The first one is the well-known and widely applied Bland-Altman approach with its agreement intervals, where two methods are considered interchangeable if their differences are not clinically significant. The second approach is based on errors-in-variables regression in a classical (X,Y) plot and focuses on confidence intervals, whereby two methods are considered equivalent when providing similar measures notwithstanding the random measurement errors. This paper reconciles these two methodologies and shows their similarities and differences using both real data and simulations. A new consistent correlated-errors-in-variables regression is introduced as the errors are shown to be correlated in the Bland-Altman plot. Indeed, the coverage probabilities collapse and the biases soar when this correlation is ignored. Novel tolerance intervals are compared with agreement intervals with or without replicated data, and novel predictive intervals are introduced to predict a single measure in an (X,Y) plot or in a Bland-Altman plot with excellent coverage probabilities. We conclude that the (correlated)-errors-in-variables regressions should not be avoided in method comparison studies, although the Bland-Altman approach is usually applied to avert their complexity. We argue that tolerance or predictive intervals are better alternatives than agreement intervals, and we provide guidelines for practitioners regarding method comparison studies. Copyright © 2016 John Wiley & Sons, Ltd.
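
    As a point of reference for the agreement intervals mentioned in this abstract, the following is a minimal sketch of the classical Bland-Altman limits of agreement (mean difference plus or minus 1.96 standard deviations of the differences). It does not implement the tolerance intervals or the correlated-errors-in-variables regression that the paper proposes; the function name `limits_of_agreement` and the sample readings are hypothetical.

```python
from statistics import mean, stdev


def limits_of_agreement(x, y, z=1.96):
    """Return (bias, lower, upper) for paired measurements x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long samples with at least 2 pairs")
    diffs = [a - b for a, b in zip(x, y)]
    bias = mean(diffs)   # average difference between the two methods
    sd = stdev(diffs)    # sample standard deviation (n - 1 denominator)
    return bias, bias - z * sd, bias + z * sd


# Hypothetical paired readings from two measurement methods.
method_a = [10.0, 12.0, 14.0, 16.0]
method_b = [9.0, 12.0, 13.0, 16.0]
bias, lower, upper = limits_of_agreement(method_a, method_b)
print(f"bias={bias:.3f}, limits=({lower:.3f}, {upper:.3f})")
```

    The multiplier 1.96 assumes approximately normal differences; the abstract argues that tolerance or predictive intervals are preferable to this simple interval.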

  11. Diplotype Trend Regression Analysis of the ADH Gene Cluster and the ALDH2 Gene: Multiple Significant Associations with Alcohol Dependence

    Science.gov (United States)

    Luo, Xingguang; Kranzler, Henry R.; Zuo, Lingjun; Wang, Shuang; Schork, Nicholas J.; Gelernter, Joel

    2006-01-01

    The set of alcohol-metabolizing enzymes has considerable genetic and functional complexity. The relationships between some alcohol dehydrogenase (ADH) and aldehyde dehydrogenase (ALDH) genes and alcohol dependence (AD) have long been studied in many populations, but not comprehensively. In the present study, we genotyped 16 markers within the ADH gene cluster (including the ADH1A, ADH1B, ADH1C, ADH5, ADH6, and ADH7 genes), 4 markers within the ALDH2 gene, and 38 unlinked ancestry-informative markers in a case-control sample of 801 individuals. Associations between markers and disease were analyzed by a Hardy-Weinberg equilibrium (HWE) test, a conventional case-control comparison, a structured association analysis, and a novel diplotype trend regression (DTR) analysis. Finally, the disease alleles were fine mapped by a Hardy-Weinberg disequilibrium (HWD) measure (J). All markers were found to be in HWE in controls, but some markers showed HWD in cases. Genotypes of many markers were associated with AD. DTR analysis showed that ADH5 genotypes and diplotypes of ADH1A, ADH1B, ADH7, and ALDH2 were associated with AD in European Americans and/or African Americans. The risk-influencing alleles were fine mapped from among the markers studied and were found to coincide with some well-known functional variants. We demonstrated that DTR was more powerful than many other conventional association methods. We also found that several ADH genes and the ALDH2 gene were susceptibility loci for AD, and the associations were best explained by several independent risk genes. PMID:16685648

  12. Energy decay of a variable-coefficient wave equation with nonlinear time-dependent localized damping

    Directory of Open Access Journals (Sweden)

    Jieqiong Wu

    2015-09-01

    Full Text Available We study the energy decay for the Cauchy problem of the wave equation with nonlinear time-dependent and space-dependent damping. The damping is localized in a bounded domain and near infinity, and the principal part of the wave equation has a variable-coefficient. We apply the multiplier method for variable-coefficient equations, and obtain an energy decay that depends on the property of the coefficient of the damping term.

  13. An Investigation of the Variables Predicting Faculty of Education Students' Speaking Anxiety through Ordinal Logistic Regression Analysis

    Science.gov (United States)

    Bozpolat, Ebru

    2017-01-01

    The purpose of this study is to determine whether Cumhuriyet University Faculty of Education students' levels of speaking anxiety are predicted by the variables of gender, department, grade, such sub-dimensions of "Speaking Self-Efficacy Scale for Pre-Service Teachers" as "public speaking," "effective speaking,"…

  14. Spatial Interpolation of Soil Texture Using Compositional Kriging and Regression Kriging with Consideration of the Characteristics of Compositional Data and Environment Variables

    Institute of Scientific and Technical Information of China (English)

    ZHANG Shi-wen; SHEN Chong-yang; CHEN Xiao-yang; YE Hui-chun; HUANG Yuan-fang; LAI Shuang

    2013-01-01

    The spatial interpolation for soil texture does not necessarily satisfy the constant sum and nonnegativity constraints. Meanwhile, although numeric and categorical variables have been used as auxiliary variables to improve prediction accuracy of soil attributes such as soil organic matter, they (especially the categorical variables) are rarely used in spatial prediction of soil texture. The objective of our study was to compare the performance of methods for spatial prediction of soil texture with consideration of the characteristics of compositional data and auxiliary variables. These methods include ordinary kriging with the symmetry logratio transform, regression kriging with the symmetry logratio transform, and compositional kriging (CK) approaches. The root mean squared error (RMSE), the relative improvement value of RMSE and Aitchison's distance (DA) were all utilized to assess the accuracy of prediction, and the mean squared deviation ratio was used to evaluate the goodness of fit of the theoretical estimate of error. The results showed that the prediction methods utilized in this paper could enable interpolation results of soil texture to satisfy the constant sum and nonnegativity constraints. Prediction accuracy and model fitting effect of the CK approach were better, suggesting that the CK method was more appropriate for predicting soil texture. The CK method interpolates directly on soil texture, which ensures that it is an optimal unbiased estimator. If the environment variables are appropriately selected as auxiliary variables, the spatial variability of soil texture can be predicted reasonably and the predicted results will accordingly be satisfactory.
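
    The constant sum and nonnegativity constraints can be illustrated with a log-ratio transform. The sketch below assumes that the "symmetry logratio transform" named in the abstract corresponds to the centred log-ratio form (each part divided by the geometric mean, then logged); that correspondence, the function names and the sample texture values are assumptions, and no kriging step is included.

```python
import math


def clr(parts):
    """Centred log-ratio transform of a strictly positive composition."""
    logs = [math.log(p) for p in parts]
    centre = sum(logs) / len(logs)  # log of the geometric mean
    return [v - centre for v in logs]


def clr_inverse(coords, total=100.0):
    """Map log-ratio coordinates back to a composition summing to `total`."""
    expd = [math.exp(c) for c in coords]
    s = sum(expd)
    return [total * e / s for e in expd]


# Hypothetical sand, silt and clay fractions in percent.
texture = [40.0, 35.0, 25.0]
coords = clr(texture)

# Any real-valued prediction made in log-ratio space (here a made-up
# perturbation) maps back to positive parts that sum to 100.
perturbed = [c + d for c, d in zip(coords, [0.3, -0.1, -0.2])]
back = clr_inverse(perturbed)
print(back, sum(back))
```

    Because the back-transformation exponentiates and renormalises, interpolated values cannot become negative or violate the constant sum, which is the property the abstract reports for the compared methods.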

  15. [From clinical judgment to linear regression model].

    Science.gov (United States)

    Palacios-Cruz, Lino; Pérez, Marcela; Rivas-Ruiz, Rodolfo; Talavera, Juan O

    2013-01-01

    When we think about mathematical models, such as the linear regression model, we think that these terms are only used by those engaged in research, a notion that is far from the truth. Legendre described the first mathematical model in 1805, and Galton introduced the formal term in 1886. Linear regression is one of the most commonly used regression models in clinical practice. It is useful to predict or show the relationship between two or more variables as long as the dependent variable is quantitative and has a normal distribution. Stated in another way, regression is used to predict a measure based on the knowledge of at least one other variable. The first objective of linear regression is to determine the slope or inclination of the regression line Y = a + bX, where "a" is the intercept or regression constant, equivalent to the value of "Y" when "X" equals 0, and "b" (also called the slope) indicates the increase or decrease in "Y" that occurs when "X" increases or decreases by one unit. In the regression line, "b" is called the regression coefficient. The coefficient of determination (R²) indicates the importance of the independent variables in the outcome.
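
    The quantities named in this abstract (intercept a, slope b and the coefficient of determination) can be computed with ordinary least squares. The following is a minimal sketch for one predictor; the function name `fit_line` and the data are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = a + b*x; returns (a, b, r_squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx             # slope: change in y per one-unit change in x
    a = mean_y - b * mean_x   # intercept: fitted y when x equals 0
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot
    return a, b, r_squared


# Hypothetical measurements.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
a, b, r_squared = fit_line(x, y)
print(f"Y = {a:.2f} + {b:.2f} X, R^2 = {r_squared:.4f}")
```

    The sketch assumes that the x values are not all identical and that y is not constant, since either case would cause a division by zero.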

  16. Complete Moment Convergence and Mean Convergence for Arrays of Rowwise Extended Negatively Dependent Random Variables

    Directory of Open Access Journals (Sweden)

    Yongfeng Wu

    2014-01-01

    Full Text Available The authors first present a Rosenthal inequality for sequence of extended negatively dependent (END) random variables. By means of the Rosenthal inequality, the authors obtain some complete moment convergence and mean convergence results for arrays of rowwise END random variables. The results in this paper extend and improve the corresponding theorems by Hu and Taylor (1997).

  17. About regression-kriging: from equations to case studies

    NARCIS (Netherlands)

    Hengl, T.; Heuvelink, G.B.M.; Rossiter, D.G.

    2007-01-01

    This paper discusses the characteristics of regression-kriging (RK), its strengths and limitations, and illustrates these with a simple example and three case studies. RK is a spatial interpolation technique that combines a regression of the dependent variable on auxiliary variables (such as land su

  18. Assessing the impact of local meteorological variables on surface ozone in Hong Kong during 2000-2015 using quantile and multiple line regression models

    Science.gov (United States)

    Zhao, Wei; Fan, Shaojia; Guo, Hai; Gao, Bo; Sun, Jiaren; Chen, Laiguo

    2016-11-01

    The quantile regression (QR) method has been increasingly introduced to atmospheric environmental studies to explore the non-linear relationship between local meteorological conditions and ozone mixing ratios. In this study, we applied QR for the first time, together with multiple linear regression (MLR), to analyze the dominant meteorological parameters influencing the mean, 10th percentile, 90th percentile and 99th percentile of maximum daily 8-h average (MDA8) ozone concentrations in 2000-2015 in Hong Kong. The dominance analysis (DA) was used to assess the relative importance of meteorological variables in the regression models. Results showed that the MLR models worked better at suburban and rural sites than at urban sites, and worked better in winter than in summer. QR models performed better in summer for the 99th and 90th percentiles and performed better in autumn and winter for the 10th percentile. QR models also performed better in suburban and rural areas for the 10th percentile. The top 3 dominant variables associated with MDA8 ozone concentrations, changing with seasons and regions, were frequently associated with the six meteorological parameters: boundary layer height, humidity, wind direction, surface solar radiation, total cloud cover and sea level pressure. Temperature rarely became a significant variable in any season, which could partly explain the peak of monthly average ozone concentrations in October in Hong Kong. We also found that the effect of solar radiation would be enhanced during extreme ozone pollution episodes (i.e., the 99th percentile). Finally, meteorological effects on MDA8 ozone had no significant changes before and after the 2010 Asian Games.
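
    Quantile regression differs from MLR in the loss it minimises: instead of squared error it uses the asymmetric check (pinball) loss, so that different percentiles of the response can be targeted. The sketch below shows only the intercept-only case, where the minimiser is the sample quantile. It is not the model fitted in the study; the function names and the ozone values are hypothetical.

```python
def pinball_loss(y, q_hat, tau):
    """Average check (pinball) loss of predicting the constant q_hat for y."""
    total = 0.0
    for yi in y:
        err = yi - q_hat
        # Under-predictions are weighted by tau, over-predictions by 1 - tau.
        total += tau * err if err >= 0 else (tau - 1) * err
    return total / len(y)


def best_constant(y, tau):
    """Return the observed value that minimises the pinball loss at level tau."""
    return min(sorted(set(y)), key=lambda c: pinball_loss(y, c, tau))


# Hypothetical MDA8 ozone values with one extreme episode.
ozone = [20, 25, 30, 35, 40, 45, 50, 55, 60, 80, 120]
median_fit = best_constant(ozone, 0.5)
upper_fit = best_constant(ozone, 0.9)
print(median_fit, upper_fit)
```

    A full quantile regression replaces the constant with a linear function of the meteorological predictors and minimises the same loss, typically by linear programming.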

  19. A Matlab program for stepwise regression

    Directory of Open Access Journals (Sweden)

    Yanhong Qi

    2016-03-01

    Full Text Available Stepwise linear regression is a multi-variable regression method for identifying statistically significant variables in the linear regression equation. In the present study, we present a Matlab program for stepwise regression.
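
    The Matlab program itself is not reproduced in this record. As a rough illustration of the idea, the following Python sketch performs forward selection only and uses a simplified entry rule (a minimum gain in R²) in place of the F-tests or p-values that classical stepwise regression uses. All function names, the threshold `min_gain` and the data are assumptions made for illustration.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def rss(columns, y):
    """Residual sum of squares of an OLS fit of y on an intercept plus columns."""
    design = [[1.0] * len(y)] + columns
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in design] for ci in design]
    xty = [sum(u * v for u, v in zip(ci, y)) for ci in design]
    beta = solve(xtx, xty)  # normal equations
    fitted = [sum(bj * col[i] for bj, col in zip(beta, design))
              for i in range(len(y))]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))


def forward_stepwise(predictors, y, min_gain=0.05):
    """Greedily add the predictor that most reduces RSS; stop when the gain
    in R^2 falls below min_gain (a simplified entry rule, not an F-test)."""
    total = rss([], y)
    if total == 0:
        return []
    selected, current = [], total
    remaining = list(predictors)
    while remaining:
        scores = {name: rss([predictors[k] for k in selected + [name]], y)
                  for name in remaining}
        best = min(remaining, key=lambda name: scores[name])
        if (current - scores[best]) / total < min_gain:
            break
        selected.append(best)
        remaining.remove(best)
        current = scores[best]
    return selected


# Hypothetical data: y depends on x1 and x2 but not on x3.
data = {
    "x1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "x2": [5.0, 3.0, 8.0, 1.0, 9.0, 2.0, 7.0, 4.0],
    "x3": [1.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0],
}
noise = [0.1, -0.1, 0.05, -0.05, 0.1, -0.1, 0.05, -0.05]
y = [3 * a - 2 * b + e for a, b, e in zip(data["x1"], data["x2"], noise)]
chosen = forward_stepwise(data, y)
print(chosen)
```

    Solving the normal equations directly is adequate for a few well-conditioned predictors; a production implementation would normally use a QR decomposition and would also consider removing variables that lose significance.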

  20. Synaptic Variability Introduces State-Dependent Modulation of Excitatory Spinal Cord Synapses

    Directory of Open Access Journals (Sweden)

    David Parker

    2015-01-01

    Full Text Available The relevance of neuronal and synaptic variability remains unclear. Cellular and synaptic plasticity and neuromodulation are also variable. This could reflect state-dependent effects caused by the variable initial cellular or synaptic properties or direct variability in plasticity-inducing mechanisms. This study has examined state-dependent influences on synaptic plasticity at connections between excitatory interneurons (EIN) and motor neurons in the lamprey spinal cord. State-dependent effects were examined by correlating initial synaptic properties with the substance P-mediated plasticity of low frequency-evoked EPSPs and the reduction of the EPSP depression over spike trains (metaplasticity). The low frequency EPSP potentiation reflected an interaction between the potentiation of NMDA responses and the release probability. The release probability introduced a variable state-dependent subtractive influence on the postsynaptic NMDA-dependent potentiation. The metaplasticity was also state-dependent: it was greater at connections with smaller available vesicle pools and high initial release probabilities. This was supported by the significant reduction in the number of connections showing metaplasticity when the release probability was reduced by high Mg2+ Ringer. Initial synaptic properties thus introduce state-dependent influences that affect the potential for plasticity. Understanding these conditions will be as important as understanding the subsequent changes.

  1. Synaptic Variability Introduces State-Dependent Modulation of Excitatory Spinal Cord Synapses.

    Science.gov (United States)

    Parker, David

    2015-01-01

    The relevance of neuronal and synaptic variability remains unclear. Cellular and synaptic plasticity and neuromodulation are also variable. This could reflect state-dependent effects caused by the variable initial cellular or synaptic properties or direct variability in plasticity-inducing mechanisms. This study has examined state-dependent influences on synaptic plasticity at connections between excitatory interneurons (EIN) and motor neurons in the lamprey spinal cord. State-dependent effects were examined by correlating initial synaptic properties with the substance P-mediated plasticity of low frequency-evoked EPSPs and the reduction of the EPSP depression over spike trains (metaplasticity). The low frequency EPSP potentiation reflected an interaction between the potentiation of NMDA responses and the release probability. The release probability introduced a variable state-dependent subtractive influence on the postsynaptic NMDA-dependent potentiation. The metaplasticity was also state-dependent: it was greater at connections with smaller available vesicle pools and high initial release probabilities. This was supported by the significant reduction in the number of connections showing metaplasticity when the release probability was reduced by high Mg(2+) Ringer. Initial synaptic properties thus introduce state-dependent influences that affect the potential for plasticity. Understanding these conditions will be as important as understanding the subsequent changes.

  2. Investigating the Performance of Alternate Regression Weights by Studying All Possible Criteria in Regression Models with a Fixed Set of Predictors

    Science.gov (United States)

    Waller, Niels; Jones, Jeff

    2011-01-01

    We describe methods for assessing all possible criteria (i.e., dependent variables) and subsets of criteria for regression models with a fixed set of predictors, x (where x is an n x 1 vector of independent variables). Our methods build upon the geometry of regression coefficients (hereafter called regression weights) in n-dimensional space. For a…

  3. Statistical Portfolio Estimation under the Utility Function Depending on Exogenous Variables

    Directory of Open Access Journals (Sweden)

    Kenta Hamada

    2012-01-01

    Full Text Available In the estimation of portfolios, it is natural to assume that the utility function depends on an exogenous variable. From this point of view, in this paper, we develop estimation under a utility function that depends on an exogenous variable. To estimate the optimal portfolio, we introduce a function of the moments of the return process and the cumulant between the return process and the exogenous variable, where the function is a generalized version of the portfolio weight function. First, assuming that the exogenous variable is a random process, we derive the asymptotic distribution of the sample version of the portfolio weight function. Then, the influence of the exogenous variable on the return process is illuminated for the case in which the exogenous variable has a shot noise in the frequency domain. Second, assuming that the exogenous variable is nonstochastic, we derive the asymptotic distribution of the sample version of the portfolio weight function. Then, the influence of the exogenous variable on the return process is illuminated for the case in which the exogenous variable has a harmonic trend. We also evaluate the influence of the exogenous variable on the return process numerically.

  4. Estimating Intelligence in Spanish: Regression Equations With the Word Accentuation Test and Demographic Variables in Latin America.

    Science.gov (United States)

    Sierra Sanjurjo, Natalia; Montañes, Patricia; Sierra Matamoros, Fabio Alexander; Burin, Debora

    2015-01-01

    Spanish is the fourth most spoken language in the world, and the majority of Spanish speakers have a Latin American origin. Reading aloud infrequently accentuated words has been established as a National Adult Reading Test-like method to assess premorbid intelligence in Spanish. However, several versions have been proposed and validated with small and selected samples, in particular geographical conditions, and they seldom derive a formula for IQ estimation with the Wechsler Adult Intelligence Scale (WAIS) Full-Scale IQ (FSIQ). The objective of this study was to develop equations to estimate WAIS-Third Edition (WAIS-III) FSIQ from the Word Accentuation Test-Revised (WAT-R), demographic variables, and their combination within diverse Latin American samples. Two hundred and forty participants from Argentina and Colombia, selected according to age and years of education strata, were assessed with the WAT-R, the WAIS-III, and a structured questionnaire about demographic and medical information. A combined approach including place of birth, years of education, and WAT-R provided the best equation, explaining 76% of IQ variance. These equations could be useful for estimating premorbid IQ in patients with Latin American Spanish as their birth language.

  5. Systematic Selection of Key Logistic Regression Variables for Risk Prediction Analyses: A Five-Factor Maximum Model.

    Science.gov (United States)

    Hewett, Timothy E; Webster, Kate E; Hurd, Wendy J

    2017-08-16

    The evolution of clinical practice and medical technology has yielded an increasing number of clinical measures and tests to assess a patient's progression and return to sport readiness after injury. The plethora of available tests may be burdensome to clinicians in the absence of evidence that demonstrates the utility of a given measurement. Thus, there is a critical need to identify a discrete number of metrics to capture during clinical assessment to effectively and concisely guide patient care. The data sources included Pubmed and PMC Pubmed Central articles on the topic. Therefore, we present a systematic approach to injury risk analyses and how this concept may be used in algorithms for risk analyses for primary anterior cruciate ligament (ACL) injury in healthy athletes and patients after ACL reconstruction. In this article, we present the five-factor maximum model, which states that in any predictive model, a maximum of 5 variables will contribute in a meaningful manner to any risk factor analysis. We demonstrate how this model already exists for prevention of primary ACL injury, how this model may guide development of the second ACL injury risk analysis, and how the five-factor maximum model may be applied across the injury spectrum for development of the injury risk analysis.

  6. Multiple Regression Analysis of the Variable Component in the Near-Infrared Region for Type 1 AGN MCG+08-11-011

    CERN Document Server

    Tomita, H; Kobayashi, Y; Minezaki, T; Enya, K; Suganuma, M; Aoki, T; Koshida, S; Yamauchi, M; Tomita, Hiroyuki; Yoshii, Yuzuru; Kobayashi, Yukiyasu; Minezaki, Takeo; Enya, Keigo; Suganuma, Masahiro; Aoki, Tsutomu; Koshida, Shintaro; Yamauchi, Masahiro

    2006-01-01

    We propose a new method of analysing a variable component for type 1 active galactic nuclei (AGNs) in the near-infrared wavelength region. This analysis uses a multiple regression technique and divides the variable component into two components originating in the accretion disk at the center of AGNs and from the dust torus that far surrounds the disk. Applying this analysis to the long-term $VHK$ monitoring data of MCG+08-11-011 that were obtained by the MAGNUM project, we found that the $(H-K)$-color temperature of the dust component is $T = 1635 \pm 20$ K, which agrees with the sublimation temperature of dust grains, and that the time delay of $K$ to $H$ variations is $\Delta t \approx 6$ days, which indicates the existence of a radial temperature gradient in the dust torus. As for the disk component, we found that the power-law spectrum of $f_\

  7. On an asymptotic distribution of dependent random variables on a 3-dimensional lattice

    Science.gov (United States)

    Harvey, Danielle J.; Weng, Qian; Beckett, Laurel A.

    2010-01-01

    We define conditions under which sums of dependent spatial data will be approximately normally distributed. A theorem on the asymptotic distribution of a sum of dependent random variables defined on a 3-dimensional lattice is presented. Examples are also presented. PMID:20436940

  8. Weak laws of large numbers for arrays of rowwise negatively dependent random variables

    Directory of Open Access Journals (Sweden)

    R. L. Taylor

    2001-01-01

    Full Text Available Weak laws of large numbers for arrays of rowwise negatively dependent random variables are obtained in this paper. The more general hypothesis of negative dependence relaxes the usual assumption of independence. The moment conditions are similar to previous results, and the stochastic bounded condition also provides a generalization of the usual distributional assumptions.

  9. A CENTRAL LIMIT THEOREM FOR STRONG NEAR-EPOCH DEPENDENT RANDOM VARIABLES

    Institute of Scientific and Technical Information of China (English)

    LIN ZHENGYAN; QIU JIN

    2004-01-01

    In this paper, a central limit theorem for strong near-epoch dependent sequences of random variables introduced in [9] is shown. Under the same moment condition, the authors essentially weaken the "size" requirement mentioned in other papers about near-epoch dependence.

  10. Regression analysis by example

    CERN Document Server

    Chatterjee, Samprit

    2012-01-01

    Praise for the Fourth Edition: "This book is . . . an excellent source of examples for regression analysis. It has been and still is readily readable and understandable." -Journal of the American Statistical Association Regression analysis is a conceptually simple method for investigating relationships among variables. Carrying out a successful application of regression analysis, however, requires a balance of theoretical results, empirical rules, and subjective judgment. Regression Analysis by Example, Fifth Edition has been expanded

  11. Functional linear regression via canonical analysis

    CERN Document Server

    He, Guozhong; Wang, Jane-Ling; Yang, Wenjing; 10.3150/09-BEJ228

    2011-01-01

    We study regression models for the situation where both dependent and independent variables are square-integrable stochastic processes. Questions concerning the definition and existence of the corresponding functional linear regression models and some basic properties are explored for this situation. We derive a representation of the regression parameter function in terms of the canonical components of the processes involved. This representation establishes a connection between functional regression and functional canonical analysis and suggests alternative approaches for the implementation of functional linear regression analysis. A specific procedure for the estimation of the regression parameter function using canonical expansions is proposed and compared with an established functional principal component regression approach. As an example of an application, we present an analysis of mortality data for cohorts of medflies, obtained in experimental studies of aging and longevity.

  12. Boosting Variable Selection Algorithm for Linear Regression Models

    Institute of Scientific and Technical Information of China (English)

    李毓; 张春霞; 王冠伟

    2015-01-01

    With respect to variable selection for linear regression models, this paper proposes a novel Boosting learning method based on a genetic algorithm. In the novel algorithm, all training examples are first assigned equal weights and a traditional genetic algorithm is adopted as the base learning algorithm of Boosting. Then, the training set associated with a weight distribution is taken as the input of the genetic algorithm to perform variable selection. Subsequently, the weight distribution is updated according to the quality of the previous variable selection results. After repeating the above steps multiple times, the results are fused via a weighted combination rule. The performance of the proposed Boosting method is investigated on simulated and real-world data. The experimental results show that the method can significantly improve the variable selection performance of the traditional genetic algorithm and accurately identify the covariates related to the response variable. Thus, the novel Boosting method can be regarded as an effective technique for handling variable selection problems in linear regression models.

  13. Spatial variability in levels of benzene, formaldehyde, and total benzene, toluene, ethylbenzene and xylenes in New York City: a land-use regression study

    Directory of Open Access Journals (Sweden)

    Kheirbek Iyad

    2012-07-01

    Background: Hazardous air pollutant exposures are common in urban areas, contributing to increased risk of cancer and other adverse health outcomes. While recent analyses indicate that New York City residents experience significantly higher cancer risks attributable to hazardous air pollutant exposures than the United States as a whole, limited data exist to assess intra-urban variability in air toxics exposures. Methods: To assess intra-urban spatial variability in exposures to common hazardous air pollutants, street-level air sampling for volatile organic compounds and aldehydes was conducted at 70 sites throughout New York City during the spring of 2011. Land-use regression models were developed using a subset of 59 sites and validated against the remaining 11 sites to describe the relationship between concentrations of benzene, total BTEX (benzene, toluene, ethylbenzene, xylenes) and formaldehyde to indicators of local sources, adjusting for temporal variation. Results: Total BTEX levels exhibited the most spatial variability, followed by benzene and formaldehyde (coefficient of variation of temporally adjusted measurements of 0.57, 0.35, 0.22, respectively). Total roadway length within 100 m, traffic signal density within 400 m of monitoring sites, and an indicator of temporal variation explained 65% of the total variability in benzene, while 70% of the total variability in BTEX was accounted for by traffic signal density within 450 m, density of permitted solvent-use industries within 500 m, and an indicator of temporal variation. Measures of temporal variation, traffic signal density within 400 m, road length within 100 m, and interior building area within 100 m (indicator of heating fuel combustion) predicted 83% of the total variability of formaldehyde. The models built with the modeling subset were found to predict concentrations well, predicting 62% to 68% of monitored values at validation sites. Conclusions: Traffic and

  14. Parameters Estimation of Geographically Weighted Ordinal Logistic Regression (GWOLR) Model

    Science.gov (United States)

    Zuhdi, Shaifudin; Retno Sari Saputro, Dewi; Widyaningsih, Purnami

    2017-06-01

    A regression model represents the relationship between independent variables and a dependent variable. When the dependent variable is categorical, a logistic regression model is used to calculate the odds. When the categories of the dependent variable are ordered levels, the logistic regression model is ordinal. The GWOLR model is an ordinal logistic regression model that is influenced by the geographical location of the observation site. Parameter estimation is needed to infer population values from a sample. The purpose of this research is to estimate the parameters of the GWOLR model using R software. Parameter estimation uses data on the number of dengue fever patients in Semarang City. The observation units are 144 villages in Semarang City. The research yields a local GWOLR model for each village and the probabilities of the categories of the number of dengue fever patients.

  15. Composition-dependence of stacking fault energy in austenitic stainless steels through linear regression with random intercepts

    Science.gov (United States)

    Meric de Bellefon, G.; van Duysen, J. C.; Sridharan, K.

    2017-08-01

    The stacking fault energy (SFE) plays an important role in deformation behavior and radiation damage of FCC metals and alloys such as austenitic stainless steels. In the present communication, existing expressions to calculate SFE in those steels from chemical composition are reviewed and an improved multivariate linear regression with random intercepts is used to analyze a new database of 144 SFE measurements collected from 30 literature references. It is shown that the use of random intercepts can account for experimental biases in these literature references. A new expression to predict SFE from austenitic stainless steel compositions is proposed.

  16. Quantile regression applied to spectral distance decay

    Science.gov (United States)

    Rocchini, D.; Cade, B.S.

    2008-01-01

    Remotely sensed imagery has long been recognized as a powerful support for characterizing and estimating biodiversity. Spectral distance among sites has proven to be a powerful approach for detecting species composition variability. Regression analysis of species similarity versus spectral distance allows us to quantitatively estimate the amount of turnover in species composition with respect to spectral and ecological variability. In classical regression analysis, the residual sum of squares is minimized for the mean of the dependent variable distribution. However, many ecological data sets are characterized by a high number of zeroes that add noise to the regression model. Quantile regressions can be used to evaluate trend in the upper quantiles rather than a mean trend across the whole distribution of the dependent variable. In this letter, we used ordinary least squares (OLS) and quantile regressions to estimate the decay of species similarity versus spectral distance. The achieved decay rates were statistically nonzero (p species similarity when habitats are more similar. In this letter, we demonstrated the power of using quantile regressions applied to spectral distance decay to reveal species diversity patterns otherwise lost or underestimated by OLS regression. © 2008 IEEE.
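
    The contrast between a mean fit and an upper-quantile fit can be illustrated with the check (pinball) loss that quantile regression minimizes. The following is a minimal sketch for the simplest case of a constant fit, not the authors' procedure; the function names and the zero-inflated `similarity` values are hypothetical and chosen only for illustration.

```python
def pinball_loss(residual, tau):
    """Check (pinball) loss of a residual y - estimate at quantile level tau."""
    return tau * residual if residual >= 0 else (tau - 1.0) * residual


def total_loss(values, candidate, tau):
    """Summed pinball loss when `candidate` is used as the tau-quantile estimate."""
    return sum(pinball_loss(v - candidate, tau) for v in values)


def best_constant(values, tau):
    """Sample value that minimizes the summed pinball loss (a constant-only fit)."""
    return min(sorted(set(values)), key=lambda c: total_loss(values, c, tau))


# Hypothetical zero-inflated similarity values (illustration only).
similarity = [0.0, 0.0, 0.0, 0.0, 0.0, 0.1, 0.2, 0.4, 0.6, 0.8, 0.9]

mean_fit = sum(similarity) / len(similarity)   # what a least-squares constant gives
median_fit = best_constant(similarity, 0.5)    # tau = 0.5, pulled down by the zeroes
upper_fit = best_constant(similarity, 0.9)     # tau = 0.9, tracks the upper envelope
```

    With many zeroes in the sample, the median and the mean sit near the bottom of the range, while the 0.9-quantile fit follows the upper values. A full quantile regression replaces the constant with a linear function of spectral distance.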

  17. Interaction between continuous variables in logistic regression model

    Institute of Scientific and Technical Information of China (English)

    邱宏; 余德新; 谢立亚; 王晓蓉; 付振明

    2010-01-01

    Rothman argued that interaction estimated as departure from additivity better reflects biological interaction. In a logistic regression model, the product term reflects interaction as departure from multiplicativity. So far, the literature on estimating interaction on an additive scale using logistic regression has focused mainly on two dichotomous factors. The objective of the present report is to provide a method to examine interaction as departure from additivity between two continuous variables or between one continuous variable and one categorical variable. We used data from a lung cancer case-control study among males in Hong Kong as an example to illustrate the bootstrap re-sampling method for calculating the corresponding confidence intervals. Free software R (Version 2.8.1) was used to estimate interaction on the additive scale.
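
    One common measure of departure from additivity is the relative excess risk due to interaction (RERI), computed from the fitted logistic coefficients with odds ratios standing in for relative risks. The sketch below assumes that measure; the abstract does not say which additive-scale measure the authors use. The coefficient values are hypothetical, and `percentile_interval` is an illustrative helper that takes bootstrap replicates as given rather than refitting the model.

```python
import math


def reri(b1, b2, b3):
    """RERI from logistic coefficients, per one-unit increase in each exposure.

    b1 and b2 are the main-effect coefficients, b3 the product-term coefficient.
    RERI = OR11 - OR10 - OR01 + 1, and RERI = 0 means exact additivity.
    """
    return math.exp(b1 + b2 + b3) - math.exp(b1) - math.exp(b2) + 1.0


def percentile_interval(replicates, level=0.95):
    """Percentile bootstrap interval from a list of bootstrap RERI estimates."""
    ordered = sorted(replicates)
    alpha = (1.0 - level) / 2.0
    lo_idx = int(math.floor(alpha * (len(ordered) - 1)))
    hi_idx = int(math.ceil((1.0 - alpha) * (len(ordered) - 1)))
    return ordered[lo_idx], ordered[hi_idx]


# Hypothetical coefficients: odds ratios of 2 and 3, product term equal to zero.
no_effect = reri(0.0, 0.0, 0.0)
additive_excess = reri(math.log(2.0), math.log(3.0), 0.0)
```

    In the second call the product term is zero, so there is no interaction on the multiplicative scale, yet the RERI is positive. This is the distinction between the two scales that the abstract draws.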

  18. Tail dependence of random variables from ARCH and heavy tailed bilinear models

    Institute of Scientific and Technical Information of China (English)

    潘家柱

    2002-01-01

    Discussed in this paper is the dependent structure in the tails of distributions of random variables from some heavy-tailed stationary nonlinear time series. One class of models discussed is the first-order autoregressive conditional heteroscedastic (ARCH) process introduced by Engle (1982). The other class is the simple first-order bilinear models driven by heavy-tailed innovations. We give some explicit formulas for the asymptotic values of conditional probabilities used for measuring the tail dependence between two random variables from these models. Our results have significant meanings in finance.

  19. On the dependence of QCD splitting functions on the choice of the evolution variable

    CERN Document Server

    Jadach, S; Placzek, W; Skrzypek, M

    2016-01-01

    We show that already at the NLO level the DGLAP evolution kernel Pqq starts to depend on the choice of the evolution variable. We give an explicit example of such a variable, namely the maximum of transverse momenta of emitted partons and we identify a class of evolution variables that leave the NLO Pqq kernel unchanged with respect to the known standard MS-bar results. The kernels are calculated using a modified Curci-Furmanski-Petronzio method which is based on a direct Feynman-graphs calculation.

  20. On the dependence of QCD splitting functions on the choice of the evolution variable

    Science.gov (United States)

    Jadach, S.; Kusina, A.; Placzek, W.; Skrzypek, M.

    2016-08-01

    We show that already at the NLO level the DGLAP evolution kernel Pqq starts to depend on the choice of the evolution variable. We give an explicit example of such a variable, namely the maximum of transverse momenta of emitted partons and we identify a class of evolution variables that leave the NLO Pqq kernel unchanged with respect to the known standard MS-bar results. The kernels are calculated using a modified Curci-Furmanski-Petronzio method which is based on a direct Feynman-graphs calculation.

  1. ERLS Algorithm for Linear Regression Model with Missing Response Variable

    Institute of Scientific and Technical Information of China (English)

    刘力军

    2012-01-01

    A novel Expectation Recursive Least Square (ERLS) algorithm is proposed for the linear regression model. When the response is partly missing, ERLS uses the expected value of the response in place of the missing value. Based on this expected value and the data of the independent variables, ERLS adaptively and recursively estimates the regression coefficients, which avoids the difficulty of inverting the correlation matrix of high-dimensional data. ERLS makes full use of all available data and solves the regression problem in an online manner. Numerical experiments show that, by introducing a nonlinear suppression function, ERLS is superior to the LS solution when the observed data contain outliers.
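
    The recursive, inversion-free character of such an estimator can be sketched with a textbook recursive least squares (RLS) update. This is not the ERLS algorithm itself: the choice to impute a missing response with the current prediction is an assumption made here for illustration, the nonlinear suppression function is omitted, and `rls_fit` and `delta` are hypothetical names.

```python
def rls_fit(rows, dim, delta=1000.0):
    """Recursive least squares over (x, y) pairs; y is None when missing."""
    theta = [0.0] * dim
    # P starts as a large multiple of the identity (a weak prior on theta).
    P = [[delta if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    for x, y in rows:
        predicted = sum(t * xi for t, xi in zip(theta, x))
        if y is None:
            # Assumption: replace the missing response by its current expectation.
            y = predicted
        Px = [sum(P[i][j] * x[j] for j in range(dim)) for i in range(dim)]
        denom = 1.0 + sum(x[i] * Px[i] for i in range(dim))
        gain = [v / denom for v in Px]
        error = y - predicted
        theta = [t + g * error for t, g in zip(theta, gain)]
        # Rank-one update of P (P is symmetric); no matrix inversion is needed.
        P = [[P[i][j] - gain[i] * Px[j] for j in range(dim)] for i in range(dim)]
    return theta


# Noise-free toy data y = 2 + 3x with two missing responses (illustration only).
rows = [((1.0, float(x)), 2.0 + 3.0 * x) for x in range(10)]
rows[3] = (rows[3][0], None)
rows[6] = (rows[6][0], None)
theta = rls_fit(rows, dim=2)
```

    Each observation updates the coefficients in a single pass, which is what makes the estimate usable online. Under the imputation assumed here, a missing response leaves the coefficients unchanged and only updates `P`.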

  2. Spatial variability and its scale dependency of observed and modeled soil moisture under different climate conditions

    Directory of Open Access Journals (Sweden)

    B. Li

    2012-09-01

    Past studies on soil moisture spatial variability have been mainly conducted at catchment scales where soil moisture is often sampled over a short time period. Because of limited climate and weather conditions, the observed soil moisture often exhibited smaller dynamic ranges, which prevented the complete revelation of soil moisture spatial variability as a function of mean soil moisture. In this study, spatial statistics (mean, spatial variability and skewness) of in situ soil moisture measurements (from a continuously monitored network across the US), modeled and satellite retrieved soil moisture obtained in a warm season (198 days) were examined at large extent scales (>100 km) over three different climate regions. The investigation on in situ measurements revealed that their spatial moments strongly depend on climates, with distinct mean, spatial variability and skewness observed in each climate zone. In addition, an upward convex shape, which was revealed in several smaller scale studies, was observed for the relationship between spatial variability of in situ soil moisture and its spatial mean across dry, intermediate, and wet climates. These climate specific features were vaguely or partially observable in modeled and satellite retrieved soil moisture estimates, which is attributed to the fact that these two data sets do not have climate specific and seasonal sensitive mean soil moisture values, in addition to lack of dynamic ranges. From the point measurements to satellite retrievals, soil moisture spatial variability decreased in each climate region. The three data sources all followed the power law in the scale dependency of spatial variability, with coarser resolution data showing stronger scale dependency than finer ones. The main findings from this study are: (1) the statistical distribution of soil moisture depends on spatial mean soil moisture values and thus needs to be derived locally within any given area; (2) the boundedness of soil

  3. Quantile regression

    CERN Document Server

    Hao, Lingxin

    2007-01-01

    Quantile Regression, the first book of Hao and Naiman's two-book series, establishes the seldom recognized link between inequality studies and quantile regression models. Though separate methodological literature exists for each subject, the authors seek to explore the natural connections between this increasingly sought-after tool and research topics in the social sciences. Quantile regression as a method does not rely on assumptions as restrictive as those for the classical linear regression; though more traditional models such as least squares linear regression are more widely utilized, Hao

  4. Altered blood oxygen level-dependent signal variability in chronic post-traumatic stress disorder during symptom provocation

    Directory of Open Access Journals (Sweden)

    Ke J

    2015-07-01

    Jun Ke,1,* Li Zhang,2,* Rongfeng Qi,1,* Qiang Xu,1 Weihui Li,2 Cailan Hou,3 Yuan Zhong,1 Zhiqiang Zhang,1 Zhong He,4 Lingjiang Li,2,5 Guangming Lu1. Affiliations: (1) Department of Medical Imaging, Jinling Hospital, Medical School of Nanjing University, Nanjing; (2) Mental Health Institute, the Second Xiangya Hospital, National Technology Institute of Psychiatry, Key Laboratory of Psychiatry and Mental Health of Hunan Province, Central South University, Changsha; (3) Guangdong Academy of Medical Science, Guangdong General Hospital, Guangdong Mental Health Center, Guangzhou; (4) Department of Radiology of the Second Xiangya Hospital, Central South University, Changsha; (5) Shenzhen Kangning Hospital of Guangdong Province, Shenzhen, People's Republic of China. *These authors contributed equally to this work. Background: Recent research suggests that variability in brain signal provides important information about brain function in health and disease. However, it is unknown whether blood oxygen level-dependent (BOLD) signal variability is altered in post-traumatic stress disorder (PTSD). We aimed to identify the BOLD signal variability changes of PTSD patients during symptom provocation and compare the brain patterns of BOLD signal variability with those of brain activation. Methods: Twelve PTSD patients and 14 age-matched controls, who all experienced a mining accident, underwent clinical assessment as well as fMRI scanning while viewing trauma-related and neutral pictures. BOLD signal variability and brain activation were respectively examined with standard deviation (SD) and general linear model analysis, and compared between the PTSD and control groups. Multiple regression analyses were conducted to explore the association between PTSD symptom severity and these two brain measures across all subjects as well as in the PTSD group. Results: PTSD patients showed increased activation in the middle occipital gyrus compared with controls, and an inverse correlation was found between PTSD

  5. Motivation as an independent and a dependent variable in medical education: a review of the literature.

    Science.gov (United States)

    Kusurkar, R A; Ten Cate, Th J; van Asperen, M; Croiset, G

    2011-01-01

    Motivation in learning behaviour and education is well-researched in general education, but less so in medical education. This review addresses two research questions in the light of the Self-determination Theory (SDT) of motivation: 'How has the literature studied motivation as either an independent or dependent variable? How is motivation useful in predicting and understanding processes and outcomes in medical education?' A literature search performed using the PubMed, PsycINFO and ERIC databases resulted in 460 articles. The inclusion criteria were empirical research, specific measurement of motivation and qualitative research studies which had well-designed methodology. Only studies related to medical students/school were included. Findings of 56 articles were included in the review. Motivation as an independent variable appears to affect learning and study behaviour, academic performance, choice of medicine and specialty within medicine and intention to continue medical study. Motivation as a dependent variable appears to be affected by age, gender, ethnicity, socioeconomic status, personality, year of medical curriculum and teacher and peer support, all of which cannot be manipulated by medical educators. Motivation is also affected by factors that can be influenced, among which are autonomy, competence and relatedness, which have been described as the basic psychological needs important for intrinsic motivation according to SDT. Motivation is an independent variable in medical education influencing important outcomes and is also a dependent variable influenced by autonomy, competence and relatedness. This review finds some evidence in support of the validity of SDT in medical education.

  6. Investigation of Dependence between Time-zero and Time-dependent Variability in High-k NMOS Transistors

    CERN Document Server

    Hassan, Mohammad Khaled

    2016-01-01

    Bias Temperature Instability (BTI) is a major reliability concern in CMOS technology, especially with high dielectric constant (high-κ/HK) metal gate (MG) transistors. In addition, the time independent process induced variation has also increased because of the aggressive scaling down of devices. As a result, the faster devices at the lower threshold voltage distribution tail experience higher stress, leading to additional skewness in the BTI degradation. Since time dependent dielectric breakdown (TDDB) and stress-induced leakage current (SILC) in NMOS devices are correlated to BTI, it is necessary to investigate the effect of time zero variability on all these effects simultaneously. To that effect, we propose a simulation framework to model and analyze the impact of time-zero variability (in particular, random dopant fluctuations) on different aging effects. For small area devices (~1000 nm²) in 30 nm technology, we observe significant effect of Random Dopant Fluctuation (RDF) on BTI induced variabili...

  7. Rethinking the dependent variable in voting behavior: On the measurement and analysis of electoral utilities

    NARCIS (Netherlands)

    Eijk, van der Cees; Brug, van der Wouter; Kroh, Martin; Franklin, Mark

    2006-01-01

    As a dependent variable, party choice did not lend itself to analysis by means of powerful multivariate methods until the coming of discrete-choice models, most notably conditional logit and multinomial logit. These methods involve estimating effects on party preferences (utilities) that are post ho

  8. Panel data models extended to spatial error autocorrelation or a spatially lagged dependent variable

    NARCIS (Netherlands)

    Elhorst, J. Paul

    2001-01-01

    This paper surveys panel data models extended to spatial error autocorrelation or a spatially lagged dependent variable. In particular, it focuses on the specification and estimation of four panel data models commonly used in applied research: the fixed effects model, the random effects model, the

  9. Comparing apples and oranges: the dependent variable problem in comparing and evaluating climate change adaptation policies

    NARCIS (Netherlands)

    Dupuis, J.; Biesbroek, G.R.

    2013-01-01

    An increasing number of studies have compared climate change adaptation policies within and between different countries. In this paper we show that these comparative studies suffer from what is known as the 'dependent variable problem', the indistinctness of the phenomenon that is being measured,

  11. Parametric and Semiparametric Estimation in Models with Misclassified Categorical Dependent Variables

    NARCIS (Netherlands)

    Dustmann, C.; van Soest, A.H.O.

    1999-01-01

    We consider both a parametric and a semiparametric method to account for classification errors on the dependent variable in an ordered response model. The methods are applied to the analysis of self-reported speaking fluency of male immigrants in Germany. We find some substantial differences in para

  13. Field Dependency, n Power and Locus of Control Variables in Alcohol Aversion.

    Science.gov (United States)

    Query, William T.

    1983-01-01

    Compared individual differences and treatment effectiveness in male volunteer alcoholics (N=47) in a 10-day electroconditioning aversion program. Follow-up showed combination therapy was more successful. Internals and hard liquor drinkers tended to be abstinent as predicted. Field dependency was a more unstable variable for outcome. (Author/JAC)

  14. Necessary and sufficient conditions for moderate deviations of dependent random variables with heavy tails

    Institute of Scientific and Technical Information of China (English)

    2010-01-01

    This paper studies the moderate deviations of real-valued extended negatively dependent (END) random variables with consistently varying tails. The moderate deviations of partial sums are first given. The results are then used to establish the necessary and sufficient conditions for the moderate deviations of random sums under certain circumstances.

  15. Dose dependency and individual variability of the lipopolysaccharide-induced bovine acute phase protein response

    DEFF Research Database (Denmark)

    Jacobsen, S.; Andersen, P.H.; Tølbøll, T.

    2004-01-01

    In order to investigate the dose dependency and the individual variability of the lipopolysaccharide (LPS)-induced acute phase protein response in cattle, 8 nonlactating, nonpregnant Danish Holstein cows were challenged 3 times each by intravenous injection of increasing doses (10, 100, and 1000 ng...... for several days after each LPS injection, and their increase or decrease was significantly related to LPS dose. In addition to dose dependency, the response was also dependent on the individual, as APP concentrations differed significantly among cows. To compare APP production in 2 consecutive challenges...

  16. Principal component regression for crop yield estimation

    CERN Document Server

    Suryanarayana, T M V

    2016-01-01

    This book highlights the estimation of crop yield in Central Gujarat, especially with regard to the development of Multiple Regression Models and Principal Component Regression (PCR) models using climatological parameters as independent variables and crop yield as a dependent variable. It subsequently compares the multiple linear regression (MLR) and PCR results, and discusses the significance of PCR for crop yield estimation. In this context, the book also covers Principal Component Analysis (PCA), a statistical procedure used to reduce a number of correlated variables into a smaller number of uncorrelated variables called principal components (PC). This book will be helpful to the students and researchers, starting their works on climate and agriculture, mainly focussing on estimation models. The flow of chapters takes the readers in a smooth path, in understanding climate and weather and impact of climate change, and gradually proceeds towards downscaling techniques and then finally towards development of ...

  17. Development of variable pathlength UV-vis spectroscopy combined with partial-least-squares regression for wastewater chemical oxygen demand (COD) monitoring.

    Science.gov (United States)

    Chen, Baisheng; Wu, Huanan; Li, Sam Fong Yau

    2014-03-01

    To overcome the challenge of selecting an appropriate pathlength for accurate wastewater chemical oxygen demand (COD) monitoring by UV-vis spectroscopy in the wastewater treatment process, a variable pathlength approach combined with partial-least squares regression (PLSR) was developed in this study. Two new strategies were proposed to extract relevant information of UV-vis spectral data from variable pathlength measurements. The first strategy was data fusion at two levels: low-level data fusion (LLDF) and mid-level data fusion (MLDF). Predictive accuracy was found to improve, as indicated by the lower root-mean-square errors of prediction (RMSEP) compared with those obtained for single pathlength measurements. Both fusion levels were found to deliver very robust PLSR models with residual predictive deviations (RPD) greater than 3 (i.e. 3.22 and 3.29, respectively). The second strategy involved calculating the slopes of absorbance against pathlength at each wavelength to generate slope-derived spectra. Without the requirement to select the optimal pathlength, the predictive accuracy (RMSEP) was improved by 20-43% as compared to single pathlength spectroscopy. Compared with the nine-factor models from the fusion strategy, the PLSR model from slope-derived spectroscopy was found to be more parsimonious with only five factors and more robust with a residual predictive deviation (RPD) of 3.72. It also offered excellent correlation of predicted and measured COD values with R² of 0.936. In sum, variable pathlength spectroscopy with the two proposed data analysis strategies proved to be successful in enhancing prediction performance of COD in wastewater and showed high potential to be applied in on-line water quality monitoring.
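
    The second strategy, the slope of absorbance against pathlength at each wavelength, can be sketched as one ordinary least-squares slope per wavelength. This is a minimal illustration under assumed inputs: the pathlengths, the per-wavelength slopes and the constant offset below are hypothetical, and the abstract does not state how the authors estimated the slopes.

```python
def ols_slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def slope_spectrum(pathlengths, spectra):
    """spectra[i][j] is the absorbance at pathlength i and wavelength j."""
    n_wavelengths = len(spectra[0])
    return [
        ols_slope(pathlengths, [row[j] for row in spectra])
        for j in range(n_wavelengths)
    ]


# Hypothetical data: four pathlengths, three wavelengths, constant offset 0.01.
pathlengths = [1.0, 2.0, 5.0, 10.0]
true_slopes = [0.10, 0.05, 0.02]
spectra = [[0.01 + s * p for s in true_slopes] for p in pathlengths]
derived = slope_spectrum(pathlengths, spectra)
```

    The slope-derived spectrum has one value per wavelength, so it can be passed to a PLSR model in place of a spectrum measured at a single chosen pathlength. A pathlength-independent offset, as in the toy data, does not affect the slopes.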

  18. Astronomical Methods for Nonparametric Regression

    Science.gov (United States)

    Steinhardt, Charles L.; Jermyn, Adam

    2017-01-01

    I will discuss commonly used techniques for nonparametric regression in astronomy. We find that several of them, particularly running averages and running medians, are generically biased, asymmetric between dependent and independent variables, and perform poorly in recovering the underlying function, even when errors are present only in one variable. We then examine less-commonly used techniques such as Multivariate Adaptive Regressive Splines and Boosted Trees and find them superior in bias, asymmetry, and variance both theoretically and in practice under a wide range of numerical benchmarks. In this context the chief advantage of the common techniques is runtime, which even for large datasets is now measured in microseconds compared with milliseconds for the more statistically robust techniques. This points to a tradeoff between bias, variance, and computational resources which in recent years has shifted heavily in favor of the more advanced methods, primarily driven by Moore's Law. Along these lines, we also propose a new algorithm which has better overall statistical properties than all techniques examined thus far, at the cost of significantly worse runtime, in addition to providing guidance on choosing the nonparametric regression technique most suitable to any specific problem. We then examine the more general problem of errors in both variables and provide a new algorithm which performs well in most cases and lacks the clear asymmetry of existing non-parametric methods, which fail to account for errors in both variables.
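
    A running median, one of the common techniques named above, can be sketched in a few lines. The example also shows one well-known source of bias, the truncated windows at the ends of the data; whether this is the bias the authors analyze is not stated in the abstract. The function name and the toy data are hypothetical.

```python
from statistics import median


def running_median(xs, ys, half_width):
    """Median of ys in a centered window of up to 2*half_width + 1 points.

    Points are ordered by x first; windows are truncated at the ends.
    """
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    sorted_x = [xs[i] for i in order]
    sorted_y = [ys[i] for i in order]
    smoothed = []
    for k in range(len(sorted_y)):
        lo = max(0, k - half_width)
        hi = min(len(sorted_y), k + half_width + 1)
        smoothed.append(median(sorted_y[lo:hi]))
    return sorted_x, smoothed


# Noise-free increasing function y = x (illustration only).
xs = [float(i) for i in range(11)]
ys = list(xs)
grid, smoothed = running_median(xs, ys, half_width=2)
```

    In the interior the smoother recovers the function, but at both ends the one-sided window pulls the estimate toward the middle of the data even though the input contains no noise.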

  19. Performance and robustness of probabilistic river forecasts computed with quantile regression based on multiple independent variables in the North Central USA

    Directory of Open Access Journals (Sweden)

    F. Hoss

    2014-10-01

    et al., 2011; López López et al., 2014. This study adds the rise rate of the river stage in the last 24 and 48 h and the forecast error 24 and 48 h ago to the QR model. Including those four variables significantly improved the forecasts, as measured by the Brier Skill Score (BSS). Mainly, the resolution increases, as the original QR implementation already delivered high reliability. Combining the forecast with the other four variables results in much less favorable BSSs. Lastly, the forecast performance does not depend on the size of the training dataset, but on the year, the river gage, lead time and event threshold that are being forecast. We find that each event threshold requires a separate model configuration or at least calibration.
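
    The Brier Skill Score compares the mean squared error of probability forecasts for a binary event with that of a reference forecast. The sketch below uses the sample base rate (climatology) as the reference, which is an assumption, since the abstract does not name the reference. The forecasts and outcomes are hypothetical.

```python
def brier_score(probabilities, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(probabilities, outcomes)) / len(outcomes)


def brier_skill_score(probabilities, outcomes):
    """BSS = 1 - BS / BS_ref, with the sample base rate as reference forecast."""
    base_rate = sum(outcomes) / len(outcomes)
    reference = brier_score([base_rate] * len(outcomes), outcomes)
    return 1.0 - brier_score(probabilities, outcomes) / reference


# Hypothetical exceedance events (1 = threshold exceeded) and forecasts.
outcomes = [1, 0, 0, 1, 0]
forecasts = [0.9, 0.1, 0.2, 0.7, 0.3]
bss = brier_skill_score(forecasts, outcomes)
```

    A BSS of 1 corresponds to a perfect forecast, 0 to no improvement over the reference, and negative values to a forecast that is worse than the reference. Because the outcome depends on the event threshold, the score is computed separately for each threshold.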

  20. Asymptotic Results for Tail Probabilities of Sums of Dependent and Heavy-Tailed Random Variables

    Institute of Scientific and Technical Information of China (English)

    Kam Chuen YUEN; Chuancun YIN

    2012-01-01

    Let $X_1, X_2, \ldots$ be a sequence of dependent and heavy-tailed random variables with distributions $F_1, F_2, \ldots$ on $(-\infty, \infty)$, and let $\tau$ be a nonnegative integer-valued random variable independent of the sequence $\{X_k, k \geq 1\}$. In this framework, the asymptotic behavior of the tail probabilities of the quantities $S_n = \sum_{k=1}^{n} X_k$ and $S_{(n)} = \max_{1 \leq k \leq n} S_k$ for $n > 1$, and their randomized versions $S_\tau$ and $S_{(\tau)}$, are studied. Some applications to risk theory are presented.

  1. Land-use regression with long-term satellite-based greenness index and culture-specific sources to model PM2.5 spatial-temporal variability.

    Science.gov (United States)

    Wu, Chih-Da; Chen, Yu-Cheng; Pan, Wen-Chi; Zeng, Yu-Ting; Chen, Mu-Jean; Guo, Yue Leon; Lung, Shih-Chun Candice

    2017-05-01

    This study utilized a long-term satellite-based vegetation index, and considered culture-specific emission sources (temples and Chinese restaurants) with Land-use Regression (LUR) modelling to estimate the spatial-temporal variability of PM2.5 using data from Taipei metropolis, which exhibits typical Asian city characteristics. Annual average PM2.5 concentrations from 2006 to 2012 of 17 air quality monitoring stations established by Environmental Protection Administration of Taiwan were used for model development. PM2.5 measurements from 2013 were used for external data verification. Monthly Normalized Difference Vegetation Index (NDVI) images coupled with buffer analysis were used to assess the spatial-temporal variations of greenness surrounding the monitoring sites. The distribution of temples and Chinese restaurants were included to represent the emission contributions from incense and joss money burning, and gas cooking, respectively. Spearman correlation coefficient and stepwise regression were used for LUR model development, and 10-fold cross-validation and external data verification were applied to verify the model reliability. The results showed a strongly negative correlation (r: -0.71 to -0.77) between NDVI and PM2.5 while temples (r: 0.52 to 0.66) and Chinese restaurants (r: 0.31 to 0.44) were positively correlated to PM2.5 concentrations. With the adjusted model R(2) of 0.89, a cross-validated adj-R(2) of 0.90, and external validated R(2) of 0.83, the high explanatory power of the resultant model was confirmed. Moreover, the averaged NDVI within a 1750 m circular buffer (p < 0.01), the number of Chinese restaurants within a 1750 m buffer (p < 0.01), and the number of temples within a 750 m buffer (p = 0.06) were selected as important predictors during the stepwise selection procedures. According to the partial R(2), NDVI explained 66% of PM2.5 variation and was the dominant variable in the developed model. We suggest future studies consider

  2. Long-term (2004-2015) tendencies and variabilities of tropical UTLS water vapor mixing ratio and temperature observed by AURA/MLS using multivariate regression analysis

    Science.gov (United States)

    Sridharan, S.; Sandhya, M.

    2016-09-01

    Long-term variabilities and tendencies in the tropical (30°N-30°S) monthly averaged zonal mean water vapor mixing ratio (WVMR) and temperature in the upper troposphere and lower stratosphere (UTLS), obtained from the Microwave Limb Sounder (MLS) instrument onboard the Earth Observing System (EOS) satellite for the period October 2004-September 2015, are studied using multivariate regression analysis. It is found that the WVMR shows a decreasing trend of 0.02-0.1 ppmv/year below 100 hPa while the trend is positive (0.02-0.035 ppmv/year) above 100 hPa. There is no significant trend at 121 hPa. The WVMR response to solar cycle (SC) is negative below 21 hPa. However, the magnitude decreases with height from 0.13 ppmv/100 sfu (solar flux units) at 178 hPa to 0.07 ppmv/100 sfu at 26 hPa. The response of WVMR to multivariate El Niño index (MEI), which is a proxy for El Niño Southern Oscillation (ENSO), is positive at and below 100 hPa and negative above 100 hPa. It is negative at 56-46 hPa with maximum value of 0.1 ppmv/MEI at 56 hPa. Large positive (negative) quasi-biennial oscillation (QBO) in WVMR at 56-68 hPa reconstructed from the regression analysis coincide with eastward (westward) to westward (eastward) transition of QBO winds at that level. The trend in zonal mean tropical temperature is negative above 56 hPa with magnitude increasing with height. The maximum negative trend of 0.05 K/year is observed at 21-17 hPa and the trend is insignificant around the tropopause. The response of temperature to SC is negative in the UTLS region and to ENSO is positive below 100 hPa and mostly negative above 100 hPa. The negative response of WVMR to MEI in the stratosphere is suggested to be due to the extended cold trap of tropopause temperature during El Niño years that might have controlled the water vapor entry into the stratosphere. The WVMR response to residual vertical velocity at 70 hPa is positive in the stratosphere, whereas the temperature response is positive in the

  3. On a Camassa-Holm type equation with two dependent variables

    Energy Technology Data Exchange (ETDEWEB)

    Falqui, Gregorio [SISSA, Via Beirut 2/4, I-34014 Trieste (Italy)]

    2006-01-13

    We consider a generalization of the Camassa-Holm (CH) equation with two dependent variables, called CH2, introduced in a paper by Liu and Zhang (Liu S-Q and Zhang Y 2005 J. Geom. Phys. 54 427-53). We briefly provide an alternative derivation of it based on the theory of Hamiltonian structures on (the dual of) a Lie algebra. The Lie algebra involved here is the same algebra as underlies the NLS hierarchy. We study the structural properties of the hierarchy defined by the CH2 equation within the bi-Hamiltonian theory of integrable PDEs, and provide its Lax representation. Then we explicitly discuss how to construct classes of solutions, both of peakon and of algebro-geometrical type. Finally we sketch the construction of a class of singular solutions, defined by setting to zero one of the two dependent variables.

  4. Stochasticity and Determinism: How Density-Independent and Density-Dependent Processes Affect Population Variability

    OpenAIRE

    Jan Ohlberger; Rogers, Lauren A.; Nils Chr. Stenseth

    2014-01-01

    A persistent debate in population ecology concerns the relative importance of environmental stochasticity and density dependence in determining variability in adult year-class strength, which contributes to future reproduction as well as potential yield in exploited populations. Apart from the strength of the processes, the timing of density regulation may affect how stochastic variation, for instance through climate, translates into changes in adult abundance. In this study, we develop a lif...

  5. An Edgeworth expansion for a sum of m-dependent random variables

    Directory of Open Access Journals (Sweden)

    Wan Soo Rhee

    1985-01-01

    Given a sequence $X_1, X_2, \ldots, X_n$ of $m$-dependent random variables with moments of order $3+\alpha$ $(0 < \alpha \le 1)$, we give an Edgeworth expansion of the distribution of $S\sigma^{-1}$ $(S = X_1 + X_2 + \cdots + X_n,\ \sigma^2 = ES^2)$ under the assumption that $E[\exp(itS\sigma^{-1})]$ is small away from the origin. The result is of the best possible order.

  6. Study of Mechanical Properties of Wool Type Fabrics using ANCOVA Regression Model

    Science.gov (United States)

    Hristian, L.; Ostafe, M. M.; Manea, L. R.; Apostol, L. L.

    2017-06-01

    This work studies the variation of tensile strength for four groups of wool-type fabric, depending on the fiber composition, the tensile strength of the warp yarns, and the weft yarns' technological density, using an ANCOVA regression model. ANCOVA checks the correlation between a dependent variable and the covariate independent variables and removes the variability from the dependent variable that can be accounted for by the covariates. Analysis of covariance models combine analysis of variance with regression analysis techniques. Regarding design, ANCOVA models explain the dependent variable by combining categorical (qualitative) independent variables with continuous (quantitative) variables. There are special extensions to ANCOVA calculations to estimate parameters for both categorical and continuous variables. However, ANCOVA models can also be calculated by multiple regression analysis using a design matrix with a mix of dummy-coded qualitative and quantitative variables.
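
    The last point, that an ANCOVA can be fitted as an ordinary multiple regression on a dummy-coded design matrix, can be sketched as follows. This is a minimal illustration in plain Python, not the authors' procedure: the group labels, the covariate values, and the helper names (`solve`, `ols`, `design_row`) are all hypothetical, and the toy response is noise-free so the fit recovers the generating coefficients exactly.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(X, y):
    """Least-squares coefficients from the normal equations X'X b = X'y."""
    p = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    return solve(xtx, xty)


def design_row(group, covariate, levels):
    """Intercept, one dummy per non-reference level, then the covariate."""
    dummies = [1.0 if group == lev else 0.0 for lev in levels[1:]]
    return [1.0] + dummies + [covariate]


# Hypothetical toy data: (fabric group, yarn strength as the covariate).
levels = ["A", "B", "C"]          # "A" is the reference level
data = [("A", 1.0), ("A", 2.0), ("B", 1.0), ("B", 3.0), ("C", 2.0), ("C", 4.0)]
group_offset = {"A": 10.0, "B": 12.0, "C": 15.0}
y = [group_offset[g] + 2.0 * x for g, x in data]   # common slope of 2

X = [design_row(g, x, levels) for g, x in data]
coef = ols(X, y)   # [intercept (group A), B - A, C - A, covariate slope]
print(coef)
```

    Under this coding the dummy coefficients are the covariate-adjusted differences between each group and the reference group, which is the quantity an ANCOVA reports.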

  7. Aplicación de los modelos de regresión tobit en la modelización de variables epidemiológicas censuradas [Application of tobit regression models in modelling censored epidemiological variables]

    Directory of Open Access Journals (Sweden)

    M. J. Bleda Hernández

    2002-04-01

    Many variables in epidemiological studies are continuous measures obtained with measurement equipment that has detection limits, which generates censored distributions. Censoring, unlike truncation, arises from a defect in the sample data. The distribution of a censored variable is a mixture of a continuous and a discrete distribution. In this case, linear regression models estimated by ordinary least squares provide biased estimates. With a single censoring point the censored regression model (tobit model) must be used, while with several censoring points the generalization of this model should be used. These models are illustrated through the analysis of the levels of mercury measured in urine in the study of the health effects of a municipal solid-waste incinerator in the county of Mataró (Spain).
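
    To make the mixture of a continuous and a discrete part concrete, the sketch below writes out the log-likelihood of a tobit model with a single lower detection limit. It is an illustration under stated assumptions, not the paper's code: a normal error term, one predictor, and the names `tobit_loglik`, `limit`, `beta0`, `beta1` are all hypothetical. Censored readings contribute the probability mass below the limit, while observed readings contribute the usual normal density.

```python
import math


def norm_logpdf(z):
    """Log density of the standard normal distribution at z."""
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def tobit_loglik(beta0, beta1, sigma, xs, ys, limit):
    """Log-likelihood of a tobit model left-censored at `limit`.

    A reading y <= limit is treated as censored (below the detection limit).
    """
    total = 0.0
    for x, y in zip(xs, ys):
        mu = beta0 + beta1 * x
        if y <= limit:
            # Discrete part: P(latent value <= limit).
            # Note: this underflows to log(0) if mu is far above the limit.
            total += math.log(norm_cdf((limit - mu) / sigma))
        else:
            # Continuous part: normal density of the observed value.
            total += norm_logpdf((y - mu) / sigma) - math.log(sigma)
    return total


# Hypothetical toy data: readings of 1.0 sit at the detection limit.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 1.0, 2.4, 3.1]
print(tobit_loglik(0.0, 1.0, 0.5, xs, ys, limit=1.0))
```

    Maximizing this function over the coefficients and sigma gives the tobit estimates; ordinary least squares would instead treat the censored readings as if they were exact observations.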

  8. Bayesian Techniques for Comparing Time-dependent GRMHD Simulations to Variable Event Horizon Telescope Observations

    Science.gov (United States)

    Kim, Junhan; Marrone, Daniel P.; Chan, Chi-Kwan; Medeiros, Lia; Özel, Feryal; Psaltis, Dimitrios

    2016-12-01

    The Event Horizon Telescope (EHT) is a millimeter-wavelength, very-long-baseline interferometry (VLBI) experiment that is capable of observing black holes with horizon-scale resolution. Early observations have revealed variable horizon-scale emission in the Galactic Center black hole, Sagittarius A* (Sgr A*). Comparing such observations to time-dependent general relativistic magnetohydrodynamic (GRMHD) simulations requires statistical tools that explicitly consider the variability in both the data and the models. We develop here a Bayesian method to compare time-resolved simulation images to variable VLBI data, in order to infer model parameters and perform model comparisons. We use mock EHT data based on GRMHD simulations to explore the robustness of this Bayesian method and contrast it to approaches that do not consider the effects of variability. We find that time-independent models lead to offset values of the inferred parameters with artificially reduced uncertainties. Moreover, neglecting the variability in the data and the models often leads to erroneous model selections. We finally apply our method to the early EHT data on Sgr A*.

  9. Modelling the statistical dependence of rainfall event variables by a trivariate copula function

    Directory of Open Access Journals (Sweden)

    M. Balistrocchi

    2011-01-01

    In many hydrological models, such as those derived by analytical probabilistic methods, the precipitation stochastic process is represented by means of individual storm random variables which are supposed to be independent of each other. However, several proposals were advanced to develop joint probability distributions able to account for the observed statistical dependence. The traditional technique of the multivariate statistics is nevertheless affected by several drawbacks, whose most evident issue is the unavoidable subordination of the dependence structure assessment to the marginal distribution fitting. Conversely, the copula approach can overcome this limitation, by splitting the problem in two distinct items. Furthermore, goodness-of-fit tests were recently made available and a significant improvement in the function selection reliability has been achieved. Herein a trivariate probability distribution of the rainfall event volume, the wet weather duration and the interevent time is proposed and verified by test statistics with regard to three long time series recorded in different Italian climates. The function was developed by applying a mixing technique to bivariate copulas, which were formerly obtained by analyzing the random variables in pairs. A unique probabilistic model seems to be suitable for representing the dependence structure, despite the sensitivity shown by the dependence parameters towards the threshold utilized in the procedure for extracting the independent events. The joint probability function was finally developed by adopting a Weibull model for the marginal distributions.

  10. Bayesian Network Models for Local Dependence among Observable Outcome Variables. Research Report. ETS RR-06-36

    Science.gov (United States)

    Almond, Russell G.; Mulder, Joris; Hemat, Lisa A.; Yan, Duanli

    2006-01-01

    Bayesian network models offer a large degree of flexibility for modeling dependence among observables (item outcome variables) from the same task that may be dependent. This paper explores four design patterns for modeling locally dependent observations from the same task: (1) No context--Ignore dependence among observables; (2) Compensatory…

  11. Distance and Azimuthal Dependence of Ground‐Motion Variability for Unilateral Strike‐Slip Ruptures

    KAUST Repository

    Vyas, Jagdish Chandra

    2016-06-21

    We investigate near‐field ground‐motion variability by computing the seismic wavefield for five kinematic unilateral‐rupture models of the 1992 Mw 7.3 Landers earthquake, eight simplified unilateral‐rupture models based on the Landers event, and a large Mw 7.8 ShakeOut scenario. We include the geometrical fault complexity and consider different 1D velocity–density profiles for the Landers simulations and a 3D heterogeneous Earth structure for the ShakeOut scenario. For the Landers earthquake, the computed waveforms are validated using strong‐motion recordings. We analyze the simulated ground‐motion data set in terms of distance and azimuth dependence of peak ground velocity (PGV). Our simulations reveal that intraevent ground‐motion variability is higher at close distances to the fault (<20 km) and decreases with increasing distance following a power law. This finding is in stark contrast to constant sigma‐values used in empirical ground‐motion prediction equations. The physical explanation of a large near‐field variability is the presence of strong directivity and rupture complexity. High values of the variability occur in the rupture‐propagation direction, but small values occur in the direction perpendicular to it. We observe that the power‐law decay of the variability is primarily controlled by slip heterogeneity. In addition, the variability, as a function of azimuth, is sensitive to variations in both rupture speed and slip heterogeneity. The azimuth dependence of the ground‐motion mean μln(PGV) is well described by a Cauchy–Lorentz function that provides a novel empirical quantification to model the spatial dependency of ground motion. Online Material: Figures of slip distributions, residuals to ground‐motion prediction equations (GMPEs), distance and azimuthal dependence, and directivity predictor of ground‐motion variability for different source models.

  12. Variation among Species in the Temperature Dependence of the Reappearance of Variable Fluorescence following Illumination.

    Science.gov (United States)

    Burke, J J

    1990-06-01

    The relationship between the thermal dependence of the reappearance of chlorophyll variable fluorescence following illumination and temperature dependence of the apparent Michaelis constant (K(m)) of NADH hydroxypyruvate reductase for NADH was investigated in cool and warm season plant species. Brancker SF-20 and SF-30 fluorometers were used to evaluate induced fluorescence transients from detached leaves of wheat (Triticum aestivum L. cv TAM-101), cotton (Gossypium hirsutum L. cv Paymaster 145), tomato (Lycopersicon esculentum cv Del Oro), bell pepper (Capsicum annuum L. cv California Wonder), and petunia (Petunia hybrida cv. Red Sail). Following an illumination period at 25 degrees C, the reappearance of variable fluorescence during a dark incubation was determined at 5 degrees C intervals from 15 degrees C to 45 degrees C. Variable fluorescence recovery was normally distributed with the maximum recovery observed at 20 degrees C in wheat, 30 degrees C in cotton, 20 degrees C to 25 degrees C in tomato, 30 to 35 degrees C in bell pepper and 25 degrees C in petunia. Comparison of the thermal response of fluorescence recovery with the temperature sensitivity of the apparent K(m) of hydroxypyruvate reductase for NADH showed that the range of temperatures providing fluorescence recovery corresponded with those temperatures providing the minimum apparent K(m) values (viz. the thermal kinetic window).

  13. An Objective Screening Method for Major Depressive Disorder Using Logistic Regression Analysis of Heart Rate Variability Data Obtained in a Mental Task Paradigm

    Directory of Open Access Journals (Sweden)

    Guanghao Sun

    2016-11-01

    Background and Objectives: Heart rate variability (HRV) has been intensively studied as a promising biological marker of major depressive disorder (MDD). Our previous study confirmed that autonomic activity and reactivity in depression revealed by HRV during rest and mental task (MT) conditions can be used as diagnostic measures and in clinical evaluation. In this study, logistic regression analysis (LRA) was utilized for the classification and prediction of MDD based on HRV data obtained in an MT paradigm. Methods: Power spectral analysis of HRV on R-R intervals before, during, and after an MT (random number generation) was performed in 44 drug-naïve patients with MDD and 47 healthy control subjects at the Department of Psychiatry in Shizuoka Saiseikai General Hospital. Logit scores of LRA determined by HRV indices and heart rates discriminated patients with MDD from healthy subjects. The high frequency (HF) component of HRV and the ratio of the low frequency (LF) component to the HF component (LF/HF) correspond to parasympathetic and sympathovagal balance, respectively. Results: The LRA achieved a sensitivity and specificity of 80.0% and 79.0%, respectively, at an optimum cutoff logit score (0.28). Misclassifications occurred only when the logit score was close to the cutoff score. Logit scores also correlated significantly with subjective self-rating depression scale scores (p < 0.05). Conclusion: HRV indices recorded during a mental task may be an objective tool for screening patients with MDD in psychiatric practice. The proposed method appears promising for not only objective and rapid MDD screening, but also evaluation of its severity.
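
    The screening step described above, a logit score compared against a cutoff and summarized by sensitivity and specificity, can be sketched as follows. Only the cutoff value 0.28 comes from the abstract; the weights, the intercept, the feature values, and the function names are hypothetical placeholders, not the fitted model from the study.

```python
def logit_score(features, weights, intercept):
    """Linear predictor of a logistic regression (the 'logit score')."""
    return intercept + sum(w * f for w, f in zip(weights, features))


def sensitivity_specificity(scores, labels, cutoff):
    """Classify score >= cutoff as positive; labels are 1 (patient) or 0 (control)."""
    tp = sum(1 for s, l in zip(scores, labels) if l == 1 and s >= cutoff)
    fn = sum(1 for s, l in zip(scores, labels) if l == 1 and s < cutoff)
    tn = sum(1 for s, l in zip(scores, labels) if l == 0 and s < cutoff)
    fp = sum(1 for s, l in zip(scores, labels) if l == 0 and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical model and subjects: two HRV-style indices per subject.
weights = [0.5, -0.25]
intercept = 0.1
subjects = [([1.0, 2.0], 0), ([3.0, 1.0], 1), ([0.5, 3.0], 0), ([2.0, 0.0], 1)]

scores = [logit_score(x, weights, intercept) for x, _ in subjects]
labels = [label for _, label in subjects]
sens, spec = sensitivity_specificity(scores, labels, cutoff=0.28)
print(scores, sens, spec)
```

    Scores that fall near the cutoff are the ones most likely to flip class under small measurement changes, which matches the abstract's remark about where misclassifications occurred.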

  14. On Weighted Support Vector Regression

    DEFF Research Database (Denmark)

    Han, Xixuan; Clemmensen, Line Katrine Harder

    2014-01-01

    We propose a new type of weighted support vector regression (SVR), motivated by modeling local dependencies in time and space in prediction of house prices. The classic weights of the weighted SVR are added to the slack variables in the objective function (OF‐weights). This procedure directly...... the differences and similarities of the two types of weights by demonstrating the connection between the Least Absolute Shrinkage and Selection Operator (LASSO) and the SVR. We show that an SVR problem can be transformed to a LASSO problem plus a linear constraint and a box constraint. We demonstrate...

  15. Degree of multicollinearity and variables involved in linear dependence in additive-dominant models

    Directory of Open Access Journals (Sweden)

    Juliana Petrini

    2012-12-01

    The objective of this work was to assess the degree of multicollinearity and to identify the variables involved in linear dependence relations in additive-dominant models. Data of birth weight (n=141,567), yearling weight (n=58,124), and scrotal circumference (n=20,371) of Montana Tropical composite cattle were used. Diagnosis of multicollinearity was based on the variance inflation factor (VIF) and on the evaluation of the condition indexes and eigenvalues from the correlation matrix among explanatory variables. The first model studied (RM) included the fixed effect of dam age class at calving and the covariates associated to the direct and maternal additive and non-additive effects. The second model (R) included all the effects of the RM model except the maternal additive effects. Multicollinearity was detected in both models for all traits considered, with VIF values of 1.03 - 70.20 for RM and 1.03 - 60.70 for R. Collinearity increased with the increase of variables in the model and the decrease in the number of observations, and it was classified as weak, with condition index values between 10.00 and 26.77. In general, the variables associated with additive and non-additive effects were involved in multicollinearity, partially due to the natural connection between these covariables as fractions of the biological types in breed composition.
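
    As a rough illustration of the diagnostic used here, the variance inflation factors of a set of explanatory variables can be read off the diagonal of the inverse of their correlation matrix. The sketch below is a generic plain-Python version with hypothetical helper names and made-up toy columns; it is not the software used in the study.

```python
import math


def correlation_matrix(columns):
    """Pearson correlation matrix of a list of equally long columns."""
    n = len(columns[0])
    means = [sum(c) / n for c in columns]
    centered = [[v - m for v in c] for c, m in zip(columns, means)]
    norms = [math.sqrt(sum(v * v for v in c)) for c in centered]
    return [[sum(a * b for a, b in zip(ci, cj)) / (ni * nj)
             for cj, nj in zip(centered, norms)]
            for ci, ni in zip(centered, norms)]


def invert(matrix):
    """Matrix inverse by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif(columns):
    """VIF_j = 1 / (1 - R_j^2), taken as the j-th diagonal entry of the inverse correlation matrix."""
    inv = invert(correlation_matrix(columns))
    return [inv[j][j] for j in range(len(columns))]


# Hypothetical toy predictors with correlation 0.6.
x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [2.0, 1.0, 4.0, 3.0]
print(vif([x1, x2]))
```

    A VIF of 1 means a predictor is uncorrelated with the others; the inverse fails (division by a zero pivot) when the predictors are exactly linearly dependent, which is the limiting case of the problem the abstract describes.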

  16. Stochasticity and determinism: how density-independent and density-dependent processes affect population variability.

    Science.gov (United States)

    Ohlberger, Jan; Rogers, Lauren A; Stenseth, Nils Chr

    2014-01-01

    A persistent debate in population ecology concerns the relative importance of environmental stochasticity and density dependence in determining variability in adult year-class strength, which contributes to future reproduction as well as potential yield in exploited populations. Apart from the strength of the processes, the timing of density regulation may affect how stochastic variation, for instance through climate, translates into changes in adult abundance. In this study, we develop a life-cycle model for the population dynamics of a large marine fish population, Northeast Arctic cod, to disentangle the effects of density-independent and density-dependent processes on early life-stages, and to quantify the strength of compensatory density dependence in the population. The model incorporates information from scientific surveys and commercial harvest, and dynamically links multiple effects of intrinsic and extrinsic factors on all life-stages, from eggs to spawners. Using a state-space approach we account for observation error and stochasticity in the population dynamics. Our findings highlight the importance of density-dependent survival in juveniles, indicating that this period of the life cycle largely determines the compensatory capacity of the population. Density regulation at the juvenile life-stage dampens the impact of stochastic processes operating earlier in life such as environmental impacts on the production of eggs and climate-dependent survival of larvae. The timing of stochastic versus regulatory processes thus plays a crucial role in determining variability in adult abundance. Quantifying the contribution of environmental stochasticity and compensatory mechanisms in determining population abundance is essential for assessing population responses to climate change and exploitation by humans.

  17. The dependence of J/ψ-nucleon inelastic cross section on the Feynman variable

    Institute of Scientific and Technical Information of China (English)

    DUAN Chun-Gui; LIU Na; MIAO Wen-Dan

    2011-01-01

    By means of two typical sets of nuclear parton distribution functions, while taking account of the energy loss of the beam proton and the nuclear absorption of the charmonium states traversing the nuclear matter in the uniform framework of the Glauber model, a leading order phenomenological analysis is given in the color evaporation model of the E866 experimental data on J/ψ production differential cross section ratios $R_{\mathrm{Fe/Be}}(x_F)$. It is shown that the energy loss effect of the beam proton on $R_{\mathrm{Fe/Be}}(x_F)$ is more important than the nuclear effects on parton distribution functions in the high Feynman variable $x_F$ region. It is found that the J/ψ-nucleon inelastic cross section depends on the Feynman variable $x_F$ and increases linearly with $x_F$ in the region $x_F > 0.2$.

  18. Autistic Regression

    Science.gov (United States)

    Matson, Johnny L.; Kozlowski, Alison M.

    2010-01-01

    Autistic regression is one of the many mysteries in the developmental course of autism and pervasive developmental disorders not otherwise specified (PDD-NOS). Various definitions of this phenomenon have been used, further clouding the study of the topic. Despite this problem, some efforts at establishing prevalence have been made. The purpose of…

  19. Using a latent variable approach to inform gender and racial/ethnic differences in cocaine dependence: a National Drug Abuse Treatment Clinical Trials Network study.

    Science.gov (United States)

    Wu, Li-Tzy; Pan, Jeng-Jong; Blazer, Dan G; Tai, Betty; Stitzer, Maxine L; Woody, George E

    2010-06-01

    This study applies a latent variable approach to examine gender and racial/ethnic differences in cocaine dependence, to determine the presence of differential item functioning (DIF) or item-response bias to diagnostic questions of cocaine dependence, and to explore the effects of DIF on the predictor analysis of cocaine dependence. The analysis sample included 682 cocaine users enrolled in two national multisite studies of the National Drug Abuse Treatment Clinical Trials Network (CTN). Participants were recruited from 14 community-based substance abuse treatment programs associated with the CTN, including 6 methadone and 8 outpatient nonmethadone programs. Factor and multiple indicators-multiple causes (MIMIC) procedures evaluated the latent continuum of cocaine dependence and its correlates. MIMIC analysis showed that men exhibited lower odds of cocaine dependence than women (regression coefficient, beta = -0.34), controlling for the effects of DIF, years of cocaine use, addiction treatment history, comorbid drug dependence diagnoses, and treatment setting. There were no racial/ethnic differences in cocaine dependence; however, DIF by race/ethnicity was noted. Within the context of multiple community-based addiction treatment settings, women were more likely than men to exhibit cocaine dependence. Addiction treatment research needs to further evaluate gender-related differences in drug dependence in treatment entry and to investigate how these differences may affect study participation, retention, and treatment response to better serve this population.

  20. Water-quality variability and constituent transport and processes in streams of Johnson County, Kansas, using continuous monitoring and regression models, 2003-11

    Science.gov (United States)

    Rasmussen, Teresa; Gatotho, Jackline

    2014-01-01

    The population of Johnson County, Kansas, increased by about 24 percent between 2000 and 2012, making it one of the most rapidly developing areas of Kansas. The U.S. Geological Survey, in cooperation with the Johnson County Stormwater Management Program, began a comprehensive study of Johnson County streams in 2002 to evaluate and monitor changes in stream quality. The purpose of this report is to describe water-quality variability and constituent transport for streams representing the five largest watersheds in Johnson County, Kansas, during 2003 through 2011. The watersheds ranged in urban development from 98.3 percent urban (Indian Creek) to 16.7 percent urban (Kill Creek). Water-quality conditions are quantified among the watersheds of similar size (50.1 square miles to 65.7 square miles) using continuous, in-stream measurements, and using regression models developed from continuous and discrete data. These data are used to quantify variability in concentrations and loads during changing streamflow and seasonal conditions, describe differences among sites, and assess water quality relative to water-quality standards and stream management goals. Water quality varied relative to streamflow conditions, urbanization in the upstream watershed, and contributions from wastewater treatment facilities and storm runoff. Generally, as percent impervious surface (a measure of urbanization) increased, streamflow yield increased. Water temperature of Indian Creek, the most urban site which is also downstream from wastewater facility discharges, was higher than the other sites about 50 percent of the time, particularly during winter months. Dissolved oxygen concentrations were less than the Kansas Department of Health and Environment minimum criterion of 5 milligrams per liter about 15 percent of the time at the Indian Creek site. Dissolved oxygen concentrations were less than the criterion about 10 percent of the time at the rural Blue River and Kill Creek sites, and less than

  1. Post-processing through linear regression

    Directory of Open Access Journals (Sweden)

    B. Van Schaeybroeck

    2011-03-01

    Various post-processing techniques are compared for both deterministic and ensemble forecasts, all based on linear regression between forecast data and observations. In order to evaluate the quality of the regression methods, three criteria are proposed, related to the effective correction of forecast error, the optimal variability of the corrected forecast and multicollinearity. The regression schemes under consideration include the ordinary least-square (OLS) method, a new time-dependent Tikhonov regularization (TDTR) method, the total least-square method, a new geometric-mean regression (GM), a recently introduced error-in-variables (EVMOS) method and, finally, a "best member" OLS method. The advantages and drawbacks of each method are clarified.

    These techniques are applied in the context of the 63 Lorenz system, whose model version is affected by both initial condition and model errors. For short forecast lead times, the number and choice of predictors play an important role. Contrary to the other techniques, GM degrades when the number of predictors increases. At intermediate lead times, linear regression is unable to provide corrections to the forecast and can sometimes degrade the performance (GM and the best member OLS with noise). At long lead times the regression schemes (EVMOS, TDTR) which yield the correct variability and the largest correlation between ensemble error and spread, should be preferred.

  2. Post-processing through linear regression

    Science.gov (United States)

    van Schaeybroeck, B.; Vannitsem, S.

    2011-03-01

    Various post-processing techniques are compared for both deterministic and ensemble forecasts, all based on linear regression between forecast data and observations. In order to evaluate the quality of the regression methods, three criteria are proposed, related to the effective correction of forecast error, the optimal variability of the corrected forecast and multicollinearity. The regression schemes under consideration include the ordinary least-square (OLS) method, a new time-dependent Tikhonov regularization (TDTR) method, the total least-square method, a new geometric-mean regression (GM), a recently introduced error-in-variables (EVMOS) method and, finally, a "best member" OLS method. The advantages and drawbacks of each method are clarified. These techniques are applied in the context of the 63 Lorenz system, whose model version is affected by both initial condition and model errors. For short forecast lead times, the number and choice of predictors play an important role. Contrary to the other techniques, GM degrades when the number of predictors increases. At intermediate lead times, linear regression is unable to provide corrections to the forecast and can sometimes degrade the performance (GM and the best member OLS with noise). At long lead times the regression schemes (EVMOS, TDTR) which yield the correct variability and the largest correlation between ensemble error and spread, should be preferred.
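
    The "optimal variability" criterion can be illustrated with a single predictor. The sketch below contrasts the OLS slope with the textbook geometric-mean (reduced major axis) slope, which is assumed here as a stand-in and is not necessarily the paper's new multi-predictor GM scheme. The forecast and observation values and the function names are hypothetical.

```python
import math


def _moments(forecast, obs):
    """Means, variances and covariance of forecast and observation series."""
    n = len(forecast)
    mf = sum(forecast) / n
    mo = sum(obs) / n
    var_f = sum((f - mf) ** 2 for f in forecast) / n
    var_o = sum((o - mo) ** 2 for o in obs) / n
    cov = sum((f - mf) * (o - mo) for f, o in zip(forecast, obs)) / n
    return mf, mo, var_f, var_o, cov


def fit_ols(forecast, obs):
    """OLS of observations on forecasts: slope = cov / var(forecast)."""
    mf, mo, var_f, _, cov = _moments(forecast, obs)
    slope = cov / var_f
    return slope, mo - slope * mf


def fit_gm(forecast, obs):
    """Geometric-mean regression: slope = sign(cov) * std(obs) / std(forecast)."""
    mf, mo, var_f, var_o, cov = _moments(forecast, obs)
    slope = math.copysign(math.sqrt(var_o / var_f), cov)
    return slope, mo - slope * mf


def variance(values):
    """Population variance of a list of numbers."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)


# Hypothetical forecast/observation pairs with correlation 0.6.
forecast = [1.0, 2.0, 3.0, 4.0]
obs = [2.0, 1.0, 4.0, 3.0]

ols_slope, ols_icpt = fit_ols(forecast, obs)
gm_slope, gm_icpt = fit_gm(forecast, obs)
ols_corrected = [ols_icpt + ols_slope * f for f in forecast]
gm_corrected = [gm_icpt + gm_slope * f for f in forecast]
print(variance(obs), variance(ols_corrected), variance(gm_corrected))
```

    In this single-predictor setting the OLS-corrected forecast has its variance shrunk by the squared correlation, whereas the geometric-mean correction reproduces the observed variance by construction.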

  3. Boosted beta regression.

    Directory of Open Access Journals (Sweden)

    Matthias Schmid

    Regression analysis with a bounded outcome is a common problem in applied statistics. Typical examples include regression models for percentage outcomes and the analysis of ratings that are measured on a bounded scale. In this paper, we consider beta regression, which is a generalization of logit models to situations where the response is continuous on the interval (0,1). Consequently, beta regression is a convenient tool for analyzing percentage responses. The classical approach to fit a beta regression model is to use maximum likelihood estimation with subsequent AIC-based variable selection. As an alternative to this established, yet unstable, approach, we propose a new estimation technique called boosted beta regression. With boosted beta regression, estimation and variable selection can be carried out simultaneously in a highly efficient way. Additionally, both the mean and the variance of a percentage response can be modeled using flexible nonlinear covariate effects. As a consequence, the new method accounts for common problems such as overdispersion and non-binomial variance structures.
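
    For orientation, the quantity that the classical maximum likelihood approach works with can be sketched in a few lines. The block below assumes the common mean-precision parameterization of the beta distribution with a logit link for the mean and a single predictor; the function and parameter names are hypothetical, and the boosting algorithm proposed in the paper is not reproduced here.

```python
import math


def inv_logit(eta):
    """Inverse logit link: maps a linear predictor to a mean in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))


def beta_logpdf(y, mu, phi):
    """Log density of Beta(mu * phi, (1 - mu) * phi) at y in (0, 1).

    mu is the mean and phi the precision; the variance is mu * (1 - mu) / (1 + phi).
    """
    a = mu * phi
    b = (1.0 - mu) * phi
    return (math.lgamma(phi) - math.lgamma(a) - math.lgamma(b)
            + (a - 1.0) * math.log(y) + (b - 1.0) * math.log(1.0 - y))


def beta_regression_loglik(beta0, beta1, phi, xs, ys):
    """Log-likelihood with logit(mu_i) = beta0 + beta1 * x_i and constant precision."""
    return sum(beta_logpdf(y, inv_logit(beta0 + beta1 * x), phi)
               for x, y in zip(xs, ys))


# Hypothetical percentage responses rescaled to the open interval (0, 1).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.20, 0.35, 0.55, 0.80]
print(beta_regression_loglik(-1.0, 0.8, 10.0, xs, ys))
```

    Letting phi depend on covariates as well, instead of holding it constant as above, is what allows the variance of the response to be modeled alongside its mean.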

  4. An improved multiple linear regression and data analysis computer program package

    Science.gov (United States)

    Sidik, S. M.

    1972-01-01

    NEWRAP, an improved version of a previous multiple linear regression program called RAPIER, CREDUC, and CRSPLT, allows for a complete regression analysis including cross plots of the independent and dependent variables, correlation coefficients, regression coefficients, analysis of variance tables, t-statistics and their probability levels, rejection of independent variables, plots of residuals against the independent and dependent variables, and a canonical reduction of quadratic response functions useful in optimum seeking experimentation. A major improvement over RAPIER is that all regression calculations are done in double precision arithmetic.

  5. Biplots in Reduced-Rank Regression

    NARCIS (Netherlands)

    Braak, ter C.J.F.; Looman, C.W.N.

    1994-01-01

    Regression problems with a number of related response variables are typically analyzed by separate multiple regressions. This paper shows how these regressions can be visualized jointly in a biplot based on reduced-rank regression. Reduced-rank regression combines multiple regression and principal c

  6. Independent Sampling vs Interitem Dependencies in Whole Report Processing: Contributions of Processing Architecture and Variable Attention.

    Science.gov (United States)

    Busey, Thomas A.; Townsend, James T.

    2001-04-01

    All current models of visual whole report processing assume perceptual independence among the displayed items in which the perceptual processing of individual items is not affected by other items in the display. However, models proposed by Townsend (1981, Acta Psychologica 47, 149-173), Shibuya and Bundesen (1988, Journal of Experimental Psychology: Human Perception and Performance 14, 591-600), and Bundesen (1990, Psychological Review 97, 523-547) contain postperceptual buffers that must predict negative dependencies. The perceptual-independence assumption forms what we term the modal model class. A recent example of a model that assumes perceptual independence is the Independent Sampling Model of Loftus, Busey, and Senders (1993, Perception and Psychophysics 54, 535-554). The fundamental independence assumption has only been directly tested once before, where tests revealed no dependencies except those produced by guessing. The present study tests the independence assumption using several different statistics and, contrary to most extant models of whole report, finds significant positive dependence. Poisson models do predict a positive dependence and we develop a succinctly parameterized version, the Weighted Path Poisson Model, which allows the finishing order to be a weighted probabilistic mechanism. However, it does not predict the data quite as well as a new model, the Variable Attention Model, which allows independence within trials (unlike the Poisson models). This model assumes that attention (or, potentially, other aspects such as signal quality) varies widely across trials, thus predicting an overall positive dependence. Intuitions for and against the competing models are discussed. In addition, we show, through mimicking formulae, that models which contain the proper qualitative type of dependence structure can be cast in either serial or parallel form. Copyright 2001 Academic Press.

  7. Timing and Variability of Galactose Metabolic Gene Activation Depend on the Rate of Environmental Change.

    Directory of Open Access Journals (Sweden)

    Truong D Nguyen-Huu

    2015-07-01

    Modulation of gene network activity allows cells to respond to changes in environmental conditions. For example, the galactose utilization network in Saccharomyces cerevisiae is activated by the presence of galactose but repressed by glucose. If both sugars are present, the yeast will first metabolize glucose, depleting it from the extracellular environment. Upon depletion of glucose, the genes encoding galactose metabolic proteins will activate. Here, we show that the rate at which glucose levels are depleted determines the timing and variability of galactose gene activation. Paradoxically, we find that Gal1p, an enzyme needed for galactose metabolism, accumulates more quickly if glucose is depleted slowly rather than taken away quickly. Furthermore, the variability of induction times in individual cells depends non-monotonically on the rate of glucose depletion and exhibits a minimum at intermediate depletion rates. Our mathematical modeling suggests that the dynamics of the metabolic transition from glucose to galactose are responsible for the variability in galactose gene activation. These findings demonstrate that environmental dynamics can determine the phenotypic outcome at both the single-cell and population levels.

  8. Precipitation variability on global pasturelands may affect food security in livestock-dependent regions

    Science.gov (United States)

    Sloat, L.; Gerber, J. S.; Samberg, L. H.; Smith, W. K.; West, P. C.; Herrero, M.; Brendan, P.; Cecile, G.; Katharina, W.; Smith, W. K.

    2016-12-01

    The need to feed an increasing number of people while maintaining biodiversity and ecosystem services is one of the key challenges currently facing humanity. Livestock systems are likely to be a crucial piece of this puzzle, as urbanization and changing diets in much of the world lead to increases in global meat consumption. This predicted increase in global demand for livestock products will challenge the ability of pastures and rangelands to maintain or increase their productivity. The majority of people that depend on animal production for food security do so through grazing and herding on natural rangelands, and these systems make a significant contribution to global production of meat and milk. The vegetation dynamics of natural forage are highly dependent on climate, and subject to disruption with changes in climate and climate variability. Precipitation heterogeneity has been linked to the ecosystem dynamics of grazing lands through impacts on livestock carrying capacity and grassland degradation potential. Additionally, changes in precipitation variability are linked to the increased incidence of extreme events (e.g. droughts, floods) that negatively impact food production and food security. Here, we use the inter-annual coefficient of variation (CV) of precipitation as a metric to assess climate risk on global pastures. Comparisons of global satellite measures of vegetation greenness to climate reveal that the CV of precipitation is negatively related to mean annual NDVI, such that areas with low year-to-year precipitation variability have the highest measures of vegetation greenness, and vice versa. Furthermore, areas with high CV of precipitation support lower livestock densities and produce less meat. A sliding window analysis of changes in CV of precipitation over the last century shows that, overall, precipitation variability is increasing in global pasture areas, although global maps reveal a patchwork of both positive and negative changes. We use

  9. A note on the maximum likelihood estimator in the gamma regression model

    Directory of Open Access Journals (Sweden)

    Jerzy P. Rydlewski

    2009-01-01

    This paper considers a nonlinear regression model, in which the dependent variable has the gamma distribution. A model is considered in which the shape parameter of the random variable is the sum of continuous and algebraically independent functions. The paper proves that there is exactly one maximum likelihood estimator for the gamma regression model.

  10. Detecting ecological breakpoints: a new tool for piecewise regression

    Directory of Open Access Journals (Sweden)

    Alessandro Ferrarini

    2011-06-01

    Simple linear regression tries to determine a linear relationship between a given variable X (predictor) and a dependent variable Y. Since most environmental problems involve complex relationships, the X-Y relationship is often better modeled through a regression where, instead of fitting a single straight line to the data, the algorithm allows the fit to bend. Piecewise regressions do just that, since they emphasize local, instead of global, rules connecting predictor and dependent variables. In this work, a tool called RolReg is proposed as an implementation of Krummel's method to detect breakpoints in regression models. RolReg, which is freely available upon request from the author, could be useful to detect proper breakpoints in ecological laws.
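
    To make the idea of a breakpoint concrete, here is a minimal sketch of a generic two-segment search: every admissible split of the x-sorted data is tried, a separate least-squares line is fitted on each side, and the split with the smallest total squared error is kept. This is a plain grid search written for illustration; it is not RolReg and makes no claim to reproduce Krummel's method. All function names are hypothetical.

```python
def ols_line(xs, ys):
    """Least-squares (slope, intercept) for one segment."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def sse(xs, ys, slope, intercept):
    """Sum of squared residuals of a fitted line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

def best_breakpoint(xs, ys, min_pts=3):
    """Return (total SSE, first x of right segment, left fit, right fit)."""
    pairs = sorted(zip(xs, ys))
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]
    best = None
    for k in range(min_pts, len(xs) - min_pts + 1):
        left = ols_line(xs[:k], ys[:k])
        right = ols_line(xs[k:], ys[k:])
        total = sse(xs[:k], ys[:k], *left) + sse(xs[k:], ys[k:], *right)
        if best is None or total < best[0]:
            best = (total, xs[k], left, right)
    return best

# Synthetic data (made up for this sketch): slope +1 below x = 5, slope -2 from x = 5 on.
x_demo = list(range(10))
y_demo = [float(x) if x < 5 else 20.0 - 2.0 * x for x in x_demo]
result = best_breakpoint(x_demo, y_demo)
```

    The sketch assumes distinct x values within each segment and a single breakpoint; real ecological data would also need a test of whether the two-segment fit is a genuine improvement over one line.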

  11. The Timescale-dependent Color Variability of Quasars Viewed with GALEX

    Science.gov (United States)

    Zhu, Fei-Fan; Wang, Jun-Xian; Cai, Zhen-Yi; Sun, Yu-Han

    2016-11-01

    In a recent work by Sun et al., the color variation of quasars, namely the bluer-when-brighter trend, was found to be timescale dependent using the SDSS g/r band light curves in Stripe 82. Such timescale dependence, i.e., bluer variation at shorter timescales, supports the thermal fluctuation origin of the UV/optical variation in quasars, and can be modeled well with the inhomogeneous accretion disk model. In this paper, we extend the study to much shorter wavelengths in the rest frame (down to extreme UV) using GALaxy Evolution eXplorer (GALEX) photometric data of quasars collected in two ultraviolet bands (near-UV and far-UV). We develop Monte Carlo simulations to correct for possible biases due to the considerably larger photometric uncertainties in the GALEX light curves (particularly in the far-UV, compared with the SDSS g/r bands), which otherwise could produce artificial results. We securely confirm the previously discovered timescale dependence of the color variability with independent data sets and at shorter wavelengths. We further find that the slope of the correlation between the amplitude of the color variation and timescale appears even steeper than predicted by the inhomogeneous disk model, which assumes that disk fluctuations follow a damped random walk (DRW) process. The much flatter structure function observed in the far-UV compared with that at longer wavelengths implies deviation from the DRW process in the inner disk, where rest-frame extreme UV radiation is produced.

  12. The Timescale-Dependent Color Variability of Quasars Viewed with GALEX

    CERN Document Server

    Zhu, Fei-Fan; Cai, Zhen-Yi; Sun, Yu-Han

    2016-01-01

    In recent work by Sun et al., the color variation of quasars, namely the bluer-when-brighter trend, was found to be timescale-dependent using SDSS $g/r$ band light curves in Stripe 82. Such timescale dependence, i.e., bluer variation at shorter timescales, supports the thermal fluctuation origin of the UV/optical variation in quasars, and can be well modeled with the inhomogeneous accretion disk model. In this paper, we extend the study to much shorter wavelengths in the rest frame (down to extreme UV), using GALaxy Evolution eXplorer (GALEX) photometric data of quasars collected in two ultraviolet bands (near-UV and far-UV). We develop Monte Carlo simulations to correct for possible biases due to the considerably larger photometric uncertainties in GALEX light curves (particularly in the far-UV, compared with the SDSS $g/r$ bands), which otherwise could produce artificial results. We securely confirm the previously discovered timescale dependence of the color variability with independent datasets and at short...

  13. Multicollinearity and correlation among local regression coefficients in geographically weighted regression

    Science.gov (United States)

    Wheeler, David; Tiefelsdorf, Michael

    2005-06-01

    Present methodological research on geographically weighted regression (GWR) focuses primarily on extensions of the basic GWR model, while ignoring well-established diagnostic tests commonly used in standard global regression analysis. This paper investigates multicollinearity issues surrounding the local GWR coefficients at a single location and the overall correlation between GWR coefficients associated with two different exogenous variables. Results indicate that the local regression coefficients are potentially collinear even if the underlying exogenous variables in the data generating process are uncorrelated. Based on these findings, applied GWR research should exercise caution in substantively interpreting the spatial patterns of local GWR coefficients. An empirical disease-mapping example is used to motivate the GWR multicollinearity problem. Controlled experiments are performed to systematically explore coefficient dependency issues in GWR. These experiments specify global models that use eigenvectors from a spatial link matrix as exogenous variables.

  14. Adaptive metric kernel regression

    DEFF Research Database (Denmark)

    Goutte, Cyril; Larsen, Jan

    2000-01-01

    regression by minimising a cross-validation estimate of the generalisation error. This allows the importance of different dimensions to be adjusted automatically. The improvement in terms of modelling performance is illustrated on a variable selection task where the adaptive metric kernel clearly outperforms...

  15. Plotting partial correlation and regression in ecological studies

    Directory of Open Access Journals (Sweden)

    J. Moya-Laraño

    2008-06-01

    Multiple regression, the General linear model (GLM) and the Generalized linear model (GLZ) are widely used in ecology. The widespread use of graphs that include fitted regression lines to document patterns in simple linear regression can be easily extended to these multivariate techniques in plots that show the partial relationship of the dependent variable with each independent variable. However, the latter procedure is not nearly as widely used in ecological studies. In fact, a brief review of the recent ecological literature showed that in ca. 20% of the papers the results of multiple regression are displayed by plotting the dependent variable against the raw values of the independent variable. This latter procedure may be misleading because the value of the partial slope may change in magnitude and even in sign relative to the slope obtained in simple least-squares regression. Plots of partial relationships should be used in these situations. Using numerical simulations and real data we show how displaying plots of partial relationships may also be useful for: 1) visualizing the true scatter of points around the partial regression line, and 2) identifying influential observations and non-linear patterns more efficiently than using plots of residuals vs. fitted values. With the aim to help in the assessment of data quality, we show how partial residual plots (residuals from overall model + predicted values from the explanatory variable vs. the explanatory variable) should only be used in restricted situations, and how partial regression plots (residuals of Y on the remaining explanatory variables vs. residuals of the target explanatory variable on the remaining explanatory variables) should be the ones displayed in publications because they accurately reflect the scatter of partial correlations. Similarly, these partial plots can be applied to visualize the effect of continuous variables in GLM and GLZ for normal distributions and identity link
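
    The partial regression plot described in this abstract can be illustrated with a small sketch: regress Y on the remaining explanatory variable, regress the target explanatory variable on the remaining one, and compare the slope between the two sets of residuals with the slope of the raw plot. The data below are synthetic and invented for this illustration (two correlated predictors, with Y = 2*x1 - 3*x2 + 1 exactly), and the helper names are hypothetical; the sketch handles only one remaining variable.

```python
def ols_fit(x, y):
    """Simple least-squares (slope, intercept) of y on x."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx

def residuals(x, y):
    """Residuals of y after regressing it on x (with intercept)."""
    slope, intercept = ols_fit(x, y)
    return [b - (intercept + slope * a) for a, b in zip(x, y)]

# Synthetic, correlated predictors (assumed values for illustration only).
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [1.5, 1.8, 3.4, 3.9, 5.2, 5.9]
y = [2.0 * a - 3.0 * b + 1.0 for a, b in zip(x1, x2)]

# Slope seen when Y is plotted against the raw values of x1.
raw_slope, _ = ols_fit(x1, y)

# Axes of the partial regression plot for x1.
e_y = residuals(x2, y)     # Y with the effect of x2 removed
e_x1 = residuals(x2, x1)   # x1 with the effect of x2 removed

# Slope of the partial regression line (residuals have mean zero).
partial_slope = (sum(a * b for a, b in zip(e_x1, e_y))
                 / sum(a * a for a in e_x1))
```

    In this constructed example the raw slope is negative while the partial slope recovers the coefficient +2, which is the kind of sign change the abstract warns about when raw plots are used to display multiple regression results.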

  16. Nonlinear Forecasting With Many Predictors Using Kernel Ridge Regression

    DEFF Research Database (Denmark)

    Exterkate, Peter; Groenen, Patrick J.F.; Heij, Christiaan

    This paper puts forward kernel ridge regression as an approach for forecasting with many predictors that are related nonlinearly to the target variable. In kernel ridge regression, the observed predictor variables are mapped nonlinearly into a high-dimensional space, where estimation of the predictive regression model is based on a shrinkage estimator to avoid overfitting. We extend the kernel ridge regression methodology to enable its use for economic time-series forecasting, by including lags of the dependent variable or other individual variables as predictors, as typically desired in macroeconomic and financial applications. Monte Carlo simulations as well as an empirical application to various key measures of real economic activity confirm that kernel ridge regression can produce more accurate forecasts than traditional linear and nonlinear methods for dealing with many predictors based...
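
    A minimal sketch of the two ingredients named in this abstract, kernel ridge regression and lagged values of the dependent variable as predictors, is given below. The Gaussian (RBF) kernel, the parameter names lam and gamma, and the helper functions are assumptions made for illustration; the abstract does not say which kernel or tuning procedure the authors use.

```python
import math

def rbf_kernel(a, b, gamma=1.0):
    """Gaussian (RBF) kernel between two feature vectors."""
    return math.exp(-gamma * sum((p - q) ** 2 for p, q in zip(a, b)))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def lagged_design(series, n_lags):
    """Rows of the previous n_lags values, paired with the current value."""
    X = [series[t - n_lags:t] for t in range(n_lags, len(series))]
    y = series[n_lags:]
    return X, y

def krr_fit(X, y, lam=1e-3, gamma=1.0):
    """Dual coefficients alpha = (K + lam * I)^(-1) y (ridge shrinkage)."""
    n = len(X)
    K = [[rbf_kernel(X[i], X[j], gamma) + (lam if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    return solve(K, y)

def krr_predict(X_train, alpha, x_new, gamma=1.0):
    """Prediction as a kernel-weighted sum over the training rows."""
    return sum(a * rbf_kernel(xi, x_new, gamma)
               for a, xi in zip(alpha, X_train))

# Toy series (made up): predict each value from its two predecessors.
X_demo, y_demo = lagged_design([1.0, 2.0, 3.0, 4.0, 5.0], n_lags=2)
alpha_demo = krr_fit(X_demo, y_demo, lam=1e-9, gamma=0.5)
```

    The penalty lam plays the role of the shrinkage estimator in the abstract: larger values pull the fit toward zero and guard against overfitting, while a value near zero (as in the toy call) nearly interpolates the training data. In practice both lam and the kernel parameter would be chosen by validation.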

  17. Time-dependent sleep stage transition model based on heart rate variability.

    Science.gov (United States)

    Takeda, Toki; Mizuno, Osamu; Tanaka, Tomohiro

    2015-01-01

    A new model is proposed to automatically classify sleep stages using heart rate variability (HRV). The generative model, based on the characteristics that the distribution and the transition probabilities of sleep stages depend on the elapsed time from the beginning of sleep, infers the sleep stage with a Gibbs sampler. Experiments were conducted using a public data set consisting of 45 healthy subjects and the model's classification accuracy was evaluated for three sleep stages: wake state, rapid eye movement (REM) sleep, and non-REM sleep. Experimental results demonstrated that the model provides more accurate sleep stage classification than conventional (naive Bayes and Support Vector Machine) models that do not take the above characteristics into account. Our study contributes to improving the quality of sleep monitoring in daily life using easy-to-wear HRV sensors.

  18. Cast microstructure of Inconel 713C and its dependence on solidification variables

    Energy Technology Data Exchange (ETDEWEB)

    Bhambri, A.K.; Kattamis, T.Z.; Morral, J.E.

    1975-03-01

    The dependence of cast microstructure of Inconel 713C on solidification variables was investigated over a wide range of local cooling rates, epsilon, and thermal gradients in the liquid at the solid-liquid interface, G. The shape of MC carbide particles was found to depend greatly on: 1) the G/R ratio at the solid-liquid interface, where R is growth rate, through the effect of this ratio on the solid phase, γ_g, growth morphology. Under planar front growth conditions the carbide particles were octahedral, under cellular growth conditions they were plate-like, elongated along the cellular growth direction, and under dendritic growth conditions they were irregularly shaped; 2) the local cooling rate, epsilon, when γ was dendritic, with a transition from octahedral to dendritic with increasing epsilon. The size of MC carbide particles was found to be controlled by coarsening and to become finer with increasing epsilon. In this alloy the composition of the MC carbide was established as (Nb0.63Ti0.31Mo0.06)C and was practically independent of local cooling rate. Other observations were that the precipitation of γ and the formation of nonequilibrium eutectics, such as MC-γ, γ-γ' or MC-γ-γ', were suppressed at splat-cooling rates. Also, microsegregation of all alloying elements with the exception of aluminum was normal, with concentration increasing from the dendrite center-line to the dendrite arm boundary. Aluminum behaved in the opposite manner. Within the cooling rate range used herein, this variable had only a slight effect on microsegregation.

  19. A Bayesian Alternative to Mutual Information for the Hierarchical Clustering of Dependent Random Variables.

    Directory of Open Access Journals (Sweden)

    Guillaume Marrelec

    The use of mutual information as a similarity measure in agglomerative hierarchical clustering (AHC) raises an important issue: some correction needs to be applied for the dimensionality of variables. In this work, we formulate the decision of merging dependent multivariate normal variables in an AHC procedure as a Bayesian model comparison. We found that the Bayesian formulation naturally shrinks the empirical covariance matrix towards a matrix set a priori (e.g., the identity), provides an automated stopping rule, and corrects for dimensionality using a term that scales up the measure as a function of the dimensionality of the variables. Also, the resulting log Bayes factor is asymptotically proportional to the plug-in estimate of mutual information, with an additive correction for dimensionality in agreement with the Bayesian information criterion. We investigated the behavior of these Bayesian alternatives (in exact and asymptotic forms) to mutual information on simulated and real data. An encouraging result was first derived on simulations: the hierarchical clustering based on the log Bayes factor outperformed off-the-shelf clustering techniques as well as raw and normalized mutual information in terms of classification accuracy. On a toy example, we found that the Bayesian approaches led to results that were similar to those of mutual information clustering techniques, with the advantage of an automated thresholding. On real functional magnetic resonance imaging (fMRI) datasets measuring brain activity, it identified clusters consistent with the established outcome of standard procedures. On this application, normalized mutual information had a highly atypical behavior, in the sense that it systematically favored very large clusters. These initial experiments suggest that the proposed Bayesian alternatives to mutual information are a useful new tool for hierarchical clustering.

  20. A Bayesian Alternative to Mutual Information for the Hierarchical Clustering of Dependent Random Variables.

    Science.gov (United States)

    Marrelec, Guillaume; Messé, Arnaud; Bellec, Pierre

    2015-01-01

    The use of mutual information as a similarity measure in agglomerative hierarchical clustering (AHC) raises an important issue: some correction needs to be applied for the dimensionality of variables. In this work, we formulate the decision of merging dependent multivariate normal variables in an AHC procedure as a Bayesian model comparison. We found that the Bayesian formulation naturally shrinks the empirical covariance matrix towards a matrix set a priori (e.g., the identity), provides an automated stopping rule, and corrects for dimensionality using a term that scales up the measure as a function of the dimensionality of the variables. Also, the resulting log Bayes factor is asymptotically proportional to the plug-in estimate of mutual information, with an additive correction for dimensionality in agreement with the Bayesian information criterion. We investigated the behavior of these Bayesian alternatives (in exact and asymptotic forms) to mutual information on simulated and real data. An encouraging result was first derived on simulations: the hierarchical clustering based on the log Bayes factor outperformed off-the-shelf clustering techniques as well as raw and normalized mutual information in terms of classification accuracy. On a toy example, we found that the Bayesian approaches led to results that were similar to those of mutual information clustering techniques, with the advantage of an automated thresholding. On real functional magnetic resonance imaging (fMRI) datasets measuring brain activity, it identified clusters consistent with the established outcome of standard procedures. On this application, normalized mutual information had a highly atypical behavior, in the sense that it systematically favored very large clusters. These initial experiments suggest that the proposed Bayesian alternatives to mutual information are a useful new tool for hierarchical clustering.

  1. A duality approach to the worst case value at risk for a sum of dependent random variables with known covariances

    OpenAIRE

    Brice Franke; Michael Stolz

    2009-01-01

    We propose an approach to the aggregation of risks which is based on estimation of simple quantities (such as covariances) associated to a vector of dependent random variables, and which avoids the use of parametric families of copulae. Our main result demonstrates that the method leads to bounds on the worst case Value at Risk for a sum of dependent random variables. Its proof applies duality theory for infinite dimensional linear programs.

  2. Time-dependent reliability of corrosion-affected RC beams. Part 3: Effect of corrosion initiation time and its variability on time-dependent failure probability

    Energy Technology Data Exchange (ETDEWEB)

    Bhargava, Kapilesh, E-mail: kapil_66@barc.gov.i [Architecture and Civil Engineering Division, Bhabha Atomic Research Center, Trombay, Mumbai 400 085 (India); Mori, Yasuhiro [Graduate School of Environmental Studies, Nagoya University, Nagoya 464-8603 (Japan); Ghosh, A.K. [Reactor Safety Division, Bhabha Atomic Research Center, Trombay, Mumbai 400 085 (India)

    2011-05-15

    This paper forms the third part of a study which addresses time-dependent reliability analyses of reinforced concrete (RC) beams affected by reinforcement corrosion. Parts 1 and 2 of the reliability study are presented in companion papers. Part 1 of the reliability study presents evaluation of probabilistic descriptions for time-dependent strengths of a typical simply supported corrosion-affected RC beam. These probabilistic descriptions, i.e., mean and coefficient of variation (c.o.v.) for the time-dependent strengths are presented for two limit states: (a) flexural failure; and (b) shear failure. Part 2 of the reliability study presents evaluation of time-dependent failure probability for the considered RC beam by utilizing the information on probabilistic descriptions for time-dependent strengths available in Part 1. Evaluation of time-dependent failure probability considering the variability in time-dependent strengths and/or time-dependent degradation functions is also presented. This paper investigates the effects of time to corrosion initiation and its variability on failure probability of the same RC beam presented in companion papers. By considering variability in the identified variables that could affect the expected time of first corrosion, simple estimations are presented for mean time to corrosion initiation and variability associated with time to corrosion initiation. Evaluation of time-dependent failure probability for the beam is presented by considering estimated probabilistic descriptions, i.e., mean and c.o.v. for time to corrosion initiation. Parametric analyses show that failure probability for the beam is sensitive to the mode of strength degradation and time to corrosion initiation.

  3. From Rasch scores to regression

    DEFF Research Database (Denmark)

    Christensen, Karl Bang

    2006-01-01

    Rasch models provide a framework for measurement and modelling latent variables. Having measured a latent variable in a population a comparison of groups will often be of interest. For this purpose the use of observed raw scores will often be inadequate because these lack interval scale properties... This paper compares two approaches to group comparison: linear regression models using estimated person locations as outcome variables and latent regression models based on the distribution of the score...

  4. Consequences of kriging and land use regression for PM2.5 predictions in epidemiologic analyses: insights into spatial variability using high-resolution satellite data.

    Science.gov (United States)

    Alexeeff, Stacey E; Schwartz, Joel; Kloog, Itai; Chudnovsky, Alexandra; Koutrakis, Petros; Coull, Brent A

    2015-01-01

    Many epidemiological studies use predicted air pollution exposures as surrogates for true air pollution levels. These predicted exposures contain exposure measurement error, yet simulation studies have typically found negligible bias in resulting health effect estimates. However, previous studies typically assumed a statistical spatial model for air pollution exposure, which may be oversimplified. We address this shortcoming by assuming a realistic, complex exposure surface derived from fine-scale (1 km × 1 km) remote-sensing satellite data. Using simulation, we evaluate the accuracy of epidemiological health effect estimates in linear and logistic regression when using spatial air pollution predictions from kriging and land use regression models. We examined chronic (long-term) and acute (short-term) exposure to air pollution. Results varied substantially across different scenarios. Exposure models with low out-of-sample R² yielded severe biases in the health effect estimates of some models, ranging from 60% upward bias to 70% downward bias. One land use regression exposure model with >0.9 out-of-sample R² yielded upward biases up to 13% for acute health effect estimates. Almost all models drastically underestimated the SEs. Land use regression models performed better in chronic effect simulations. These results can help researchers when interpreting health effect estimates in these types of studies.

  5. An Investigation of the Relationship of Intellective and Personality Variables to Success in an Independent Study Science Course Through the Use of a Modified Multiple Regression Model.

    Science.gov (United States)

    Szabo, Michael; Feldhusen, John F.

    This is an empirical study of selected learner characteristics and their relation to academic success, as indicated by course grades, in a structured independent study learning program. This program, called the Audio-Tutorial System, was utilized in an undergraduate college course in the biological sciences. By use of multiple regression analysis,…

  6. Ionic strength-dependent changes in tentacular ion exchangers with variable ligand density. II. Functional properties.

    Science.gov (United States)

    Bhambure, Rahul; Angelo, James M; Gillespie, Christopher M; Phillips, Michael; Graalfs, Heiner; Lenhoff, Abraham M

    2017-07-14

    The effect of ligand density was studied on protein adsorption and transport behavior in tentacular cation-exchange sorbents at different ionic strengths. Results were obtained for lysozyme, lactoferrin and a monoclonal antibody (mAb) in order to examine the effects of protein size and charge. The combination of ligand density and ionic strength results in extensive variability of the static and dynamic binding capacities, transport rate and binding affinity of the proteins. Uptake and elution experiments were performed to quantify the transport behavior of selected proteins, specifically to estimate intraparticle protein diffusivities. The observed trend of decreasing uptake diffusivities with an increase in ligand density was correlated to structural properties of the ligand-density variants, particularly the accessible porosity. Increasing the ionic strength of the equilibration buffer led to enhanced mass transfer during uptake, independent of the transport model used, and specifically for larger proteins like lactoferrin and mAb, the most significant effects were evident in the sorbent of the highest ligand density. For lysozyme, higher ligand density leads to higher static and dynamic binding capacities whereas for lactoferrin and the mAb, the binding capacity is a complex function of accessible porosity due to ionic strength-dependent changes. Ligand density has a less pronounced effect on the elution rate, presumably due to ionic strength-dependent changes in the pore architecture of the sorbents. Copyright © 2017 Elsevier B.V. All rights reserved.

  7. Regression modeling of ground-water flow

    Science.gov (United States)

    Cooley, R.L.; Naff, R.L.

    1985-01-01

    Nonlinear multiple regression methods are developed to model and analyze groundwater flow systems. Complete descriptions of regression methodology as applied to groundwater flow models allow scientists and engineers engaged in flow modeling to apply the methods to a wide range of problems. Organization of the text proceeds from an introduction that discusses the general topic of groundwater flow modeling, to a review of basic statistics necessary to properly apply regression techniques, and then to the main topic: exposition and use of linear and nonlinear regression to model groundwater flow. Statistical procedures are given to analyze and use the regression models. A number of exercises and answers are included to exercise the student on nearly all the methods that are presented for modeling and statistical analysis. Three computer programs implement the more complex methods. These three are a general two-dimensional, steady-state regression model for flow in an anisotropic, heterogeneous porous medium, a program to calculate a measure of model nonlinearity with respect to the regression parameters, and a program to analyze model errors in computed dependent variables such as hydraulic head. (USGS)

  8. The comparison of robust partial least squares regression with robust principal component regression on a real

    Science.gov (United States)

    Polat, Esra; Gunay, Suleyman

    2013-10-01

    One of the problems encountered in Multiple Linear Regression (MLR) is multicollinearity, which inflates the variance of the estimated regression parameters and can cause them to be overestimated. Hence, when multicollinearity is present, biased estimation procedures such as classical Principal Component Regression (CPCR) and Partial Least Squares Regression (PLSR) are used instead. SIMPLS is the leading PLSR algorithm because of its speed and efficiency and because its results are easier to interpret. However, both CPCR and SIMPLS yield very unreliable results when the data set contains outlying observations. Therefore, Hubert and Vanden Branden (2003) presented a robust PCR (RPCR) method and a robust PLSR (RPLSR) method called RSIMPLS. In RPCR, a robust Principal Component Analysis (PCA) method for high-dimensional data is first applied to the independent variables; the dependent variables are then regressed on the scores using a robust regression method. RSIMPLS is constructed from a robust covariance matrix for high-dimensional data and robust linear regression. The purpose of this study is to demonstrate the use of RPCR and RSIMPLS on an econometric data set by comparing the two methods on an inflation model of Turkey. The methods are compared in terms of predictive ability and goodness of fit using a robust Root Mean Squared Error of Cross-validation (R-RMSECV), a robust R2 value and the Robust Component Selection (RCS) statistic.
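    As a rough illustration of the two-step idea behind principal component regression (components first, regression on the scores second), the sketch below implements the classical, non-robust variant only. It is not the RPCR or RSIMPLS procedure of the record above; the function name `pcr_fit` and the synthetic data are assumptions made for this example.

```python
import numpy as np

def pcr_fit(X, y, n_components):
    """Classical principal component regression (NOT the robust RPCR variant)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    # Principal directions are the right singular vectors of the centered X.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:n_components].T                  # loadings, shape (p, k)
    scores = Xc @ V                          # component scores, shape (n, k)
    # Regress the centered response on the retained scores.
    gamma, *_ = np.linalg.lstsq(scores, yc, rcond=None)
    beta = V @ gamma                         # coefficients on the original predictors
    intercept = y_mean - x_mean @ beta
    return intercept, beta

# Synthetic data (assumed): two nearly collinear predictors plus one independent one.
rng = np.random.default_rng(0)
n = 200
z = rng.normal(size=n)
X = np.column_stack([z, z + 0.01 * rng.normal(size=n), rng.normal(size=n)])
y = 1.0 + X @ np.array([1.0, 1.0, 0.5]) + 0.1 * rng.normal(size=n)

intercept_full, beta_full = pcr_fit(X, y, n_components=3)  # identical to OLS
intercept_pcr, beta_pcr = pcr_fit(X, y, n_components=2)    # drops the unstable direction
print("all components:", beta_full)
print("two components:", beta_pcr)
```

    Keeping every component reproduces ordinary least squares; dropping the lowest-variance component removes the direction along which the two collinear predictors cannot be told apart, which is why their coefficients come out nearly equal.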

  9. A Regression Analysis Model of Ordinal Variable to Psychological Data

    Institute of Scientific and Technical Information of China (English)

    徐芃; 祁禄; 熊健; 叶浩生

    2015-01-01

    Ordinal variables are ubiquitous in psychological phenomena and psychological data, and a comprehensive ordinal-variable regression model can give a reasonable explanation and prediction of psychological phenomena described by the "Mirror mode" and the "Hopper model". First, nonparametric tests are used for a preliminary reduction of the set of influencing factors. Second, ordinal Probit regression is used to assess the contribution of the remaining factors, further screening for indicators that reach statistical significance. Finally, a Logistic regression model is used to explain and predict, with sufficient information, whether a specific psychological phenomenon occurs. Fitting the model to a prediction of university graduates' satisfaction with their quality of work life confirms its value for the analysis of psychological phenomena and psychological data. Ordinal variables are a common form of categorical variable in random phenomena. Ordinal data, obtained by measuring the levels of an ordinal variable on a rank-ordered scale, are widely used in psychological research. Psychological data arise from random latent variables that are noticeable but cannot be observed directly, such as degree of satisfaction, degree of preference, degree of cognition, emotional sensitivity and behavioral level. Such mental impressions are hard to quantify. The observable ordinal data reflect a judgment standard, or decision threshold criterion, of an individual's psychological activity within the implicit psychological data. When a psychological feeling falls between two adjacent thresholds, the individual reports a numerical value on a scale, and this projection of the psychological decision threshold criteria is the "Mirror mode". Meanwhile, people naturally want to know which factors or conditions determine how high or low the threshold values of these ordinal variables are. The "Hopper model", used to study the factors that affect the psychological decision threshold criteria, is a typical regression model. The paper

  10. Multivariate regression analytical method based on heuristic constructed variable under condition of incomplete data

    Institute of Scientific and Technical Information of China (English)

    张希翔; 李陶深

    2012-01-01

    Regression analysis is often used to fill in and predict incomplete data, but the conventional approach has a flaw when constructing the regression equation: the form of the independent variables is fixed and single. To address this problem, the paper proposes an improved multivariate regression analysis method based on heuristically constructed variables. First, optimized combinations of the existing variables are found by means of a greedy algorithm; then several newly constructed variables are chosen for the multivariate regression analysis to obtain a better goodness of fit. Simulation and evaluation on incomplete data for the mechanical strength of wheat stalks show that the proposed method is feasible and effective and that it achieves a better goodness of fit when predicting incomplete data.

  11. Türkiye'nin Turizm Gelirini Etkileyen Değişkenler İçin En Uygun Regresyon Denkleminin Belirlenmesi = Obtaining the Optimum Regression Equation for Variables Which Effects Incoming of Tourism in Turkey

    Directory of Open Access Journals (Sweden)

    Cengiz AKTAŞ

    2005-06-01

    In this study, we investigate the importance of tourism for the Turkish economy and identify the optimum variables that affect tourism revenues. In this type of econometric study, which requires multiple regression models, one of the problems in parameter estimation is stationarity of the time series. Therefore, whether the series can be used to establish a long-run relationship is analyzed. Finally, autocorrelation, multicollinearity and heteroscedasticity are investigated.

  12. Group Lasso for high dimensional sparse quantile regression models

    CERN Document Server

    Kato, Kengo

    2011-01-01

    This paper studies the statistical properties of the group Lasso estimator for high dimensional sparse quantile regression models where the number of explanatory variables (or the number of groups of explanatory variables) is possibly much larger than the sample size while the number of variables in "active" groups is sufficiently small. We establish a non-asymptotic bound on the $\ell_{2}$-estimation error of the estimator. This bound explains situations under which the group Lasso estimator is potentially superior/inferior to the $\ell_{1}$-penalized quantile regression estimator in terms of the estimation error. We also propose a data-dependent choice of the tuning parameter to make the method more practical, by extending the original proposal of Belloni and Chernozhukov (2011) for the $\ell_{1}$-penalized quantile regression estimator. As an application, we analyze high dimensional additive quantile regression models. We show that under a set of primitive regularity conditions, the group Lasso estimator c...

  13. Nanostructures study of CNT nanofluids transport with temperature-dependent variable viscosity in a muscular tube

    Science.gov (United States)

    Akbar, Noreen Sher; Abid, Syed Ali; Tripathi, Dharmendra; Mir, Nazir Ahmed

    2017-03-01

    The transport of single-wall carbon nanotube (CNT) nanofluids with temperature-dependent variable viscosity is analyzed by peristaltically driven flow. The main flow problem has been modeled using cylindrical coordinates and flow equations are simplified to ordinary differential equations using long wavelength and low Reynolds' number approximation. Analytical solutions have been obtained for axial velocity, pressure gradient and temperature. Results acquired are discussed graphically for better understanding. It is observed that with an increment in the Grashof number the velocity of the governing fluids starts to decrease significantly and the pressure gradient is higher for pure water as compared to single-walled carbon nanotubes due to low density. As the specific heat is very high for pure water as compared to the multi-wall carbon nanotubes, it raises temperature of the muscles, in the case of pure water, as compared to the multi-walled carbon nanotubes. Furthermore, it is noticed that the trapped bolus starts decreasing in size as the buoyancy forces are dominant as compared to viscous forces. This model may be applicable in biomedical engineering and nanotechnology to design the biomedical devices.

  14. Metallurgical coke quality depending on the variability of properties of coking coal mixes components

    Energy Technology Data Exchange (ETDEWEB)

    M. Kaloc; S. Bartusek; S. Czudek [VSB-TU Ostrava (Czech Republic)]

    2005-07-01

    This report draws mainly on experience gained from long-standing practice in preparing coking coal mixes and tuning them to the variable qualitative properties of the coals mined in the coalfields of the OKD Company, Ostrava. A systematic database, built by summarizing the measured index values, has become a very useful instrument for composing coal mixes with regard to two points of view that are very important today: the current presence and long-term availability of a given coal type from any local source, and the price basis, which strongly influences the economics of coke production. The method for prognostic estimation of the dependence of metallurgical coke quality on the composition of coking mixes, developed some time ago by the authors of the present paper, was published in Cokemaking International Vol. 13, 2/2001 (Czudek S. et al.: Simulation of Carbonization Process under Laboratory Conditions). The original procedure has now been supplemented with a special method for the multi-criteria evaluation of the individual coal components. The new method is based on special processing of the technologically significant qualitative properties of the mined coal brands, enabling an in-depth estimate of the impact of their use in metallurgical coke production. The importance of this evaluation system considerably exceeds that of the well-known method incorporated in the international coal classification. The main advantage of the new method is that it fully respects the specific features of geographically different coalfields. (Abstract only)

  15. Satellite derived precipitation and freshwater flux variability and its dependence on the North Atlantic Oscillation

    Science.gov (United States)

    Andersson, Axel; Bakan, Stephan; Graßl, Hartmut

    2010-08-01

    The variability of satellite retrieved precipitation and freshwater flux from the 'Hamburg Ocean Atmosphere Parameters and Fluxes from Satellite Data' (HOAPS) is assessed with special emphasis on the 'North Atlantic Oscillation' (NAO). To also cover land areas, a novel combination of the satellite derived precipitation climatology with the rain gauge based 'Full Data Reanalysis Product Version 4' of the 'Global Precipitation Climatology Centre' (GPCC) is used. This yields unique high-resolution, quasi-global precipitation fields compiled from two independent data sources. Over the ocean, the response of the freshwater balance and the related parameters to the NAO is investigated for the first time by using a purely satellite based data set. A strong dependence of precipitation patterns on the state of the NAO is found. On the synoptic scale this is in accordance with earlier findings by other satellite based and reanalysis products. Furthermore, the consistency of the combined HOAPS-3/GPCC data set also allows detailed regional analyses of precipitation patterns. The response of HOAPS-3 freshwater flux to the NAO is dominated by precipitation at mid and high latitudes, while for the subtropical regions the feedback of the evaporation is stronger.

  16. A classification of substance-dependent men on temperament and severity variables.

    Science.gov (United States)

    Henderson, Melinda J; Galen, Luke W

    2003-06-01

    This study examined the validity of classifying substance abusers based on temperament and dependence severity, and expanded the scope of typology differences to proximal determinants of use (e.g., expectancies, motives). Patients were interviewed about substance use, depression, and family history of alcohol and drug abuse. Self-report instruments measuring temperament, expectancies, and motives were completed. Participants were 147 male veterans admitted to inpatient substance abuse treatment at a U.S. Department of Veterans Affairs medical center. Cluster analysis identified four types of users with two high substance problem severity and two low substance problem severity groups. Two, high problem severity, early onset groups differed only on the cluster variable of negative affectivity (NA), but showed differences on antisocial personality characteristics, hypochondriasis, and coping motives for alcohol. The two low problem severity groups were distinguished by age of onset and positive affectivity (PA). The late onset, low PA group had a higher incidence of depression, a greater tendency to use substances in solitary contexts, and lower enhancement motives for alcohol compared to the early onset, high PA cluster. The four-cluster solution yielded more distinctions on external criteria than the two-cluster solution. Such temperament variation within both high and low severity substance abusers may be important for treatment planning.

  17. Seismic hazard from induced seismicity: effect of time-dependent hazard variables

    Science.gov (United States)

    Convertito, V.; Sharma, N.; Maercklin, N.; Emolo, A.; Zollo, A.

    2012-12-01

    of the peak-ground motion parameters (e.g., magnitude, geometrical spreading and anelastic attenuation). Moreover, we consider both the inter-event and intra-event components of the standard deviation. For comparison, we use the same dataset analyzed by Convertito et al. (2012), and for successive time windows we perform the regression analysis to infer the time-dependent coefficients of the GMPE. After having tested the statistical significance of the new coefficients and having verified a reduction in the total standard deviation, we introduce the new model in the hazard integral. Hazard maps and site-specific analyses in terms of a uniform hazard spectrum are used to compare the new results with those obtained in our previous study to investigate which coefficients and which components of the total standard deviation do really matter for refining seismic hazard estimates for induced seismicity. Convertito et al. (2012). From Induced Seismicity to Direct Time-Dependent Seismic Hazard, BSSA 102(6), doi:10.1785/0120120036.

  18. Aid and growth regressions

    DEFF Research Database (Denmark)

    Hansen, Henrik; Tarp, Finn

    2001-01-01

    . There are, however, decreasing returns to aid, and the estimated effectiveness of aid is highly sensitive to the choice of estimator and the set of control variables. When investment and human capital are controlled for, no positive effect of aid is found. Yet, aid continues to impact on growth via...... investment. We conclude by stressing the need for more theoretical work before this kind of cross-country regressions are used for policy purposes....

  19. Radial dependence of line profile variability in seven O9--B0.5 stars

    CERN Document Server

    Martins, F; Hillier, D J; Donati, J -F; Bouret, J -C

    2014-01-01

    Massive stars show a variety of spectral variability: presence of discrete absorption components in UV P-Cygni profiles, optical line profile variability, X-ray variability, radial velocity modulations. Our goal is to study the spectral variability of single OB stars to better understand the relation between photospheric and wind variability. For that, we rely on high spectral resolution, high signal-to-noise ratio optical spectra collected with the spectrograph NARVAL on the Telescope Bernard Lyot at Pic du Midi. We investigate the variability of twelve spectral lines by means of the Temporal Variance Spectrum (TVS). The selected lines probe the radial structure of the atmosphere, from the photosphere to the outer wind. We also perform a spectroscopic analysis with atmosphere models to derive the stellar and wind properties, and to constrain the formation region of the selected lines. We show that variability is observed in the wind lines of all bright giants and supergiants, on a daily timescale. Lines form...

  20. Alaskan soil carbon stocks: spatial variability and dependence on environmental factors

    Directory of Open Access Journals (Sweden)

    U. Mishra

    2012-09-01

    The direction and magnitude of soil organic carbon (SOC) changes in response to climate change depend on the spatial and vertical distributions of SOC. We estimated spatially resolved SOC stocks from surface to C horizon, distinguishing active-layer and permafrost-layer stocks, based on geospatial analysis of 472 soil profiles and spatially referenced environmental variables for Alaska. Total Alaska state-wide SOC stock was estimated to be 77 Pg, with 61% in the active-layer, 27% in permafrost, and 12% in non-permafrost soils. Prediction accuracy was highest for the active-layer as demonstrated by the highest ratio of performance to deviation (1.5). Large spatial variability was predicted, with whole-profile, active-layer, and permafrost-layer stocks ranging from 1–296 kg C m−2, 2–166 kg m−2, and 0–232 kg m−2, respectively. Temperature and soil wetness were found to be primary controllers of whole-profile, active-layer, and permafrost-layer SOC stocks. Secondary controllers, in order of importance, were found to be land cover type, topographic attributes, and bedrock geology. The observed importance of soil wetness rather than precipitation on SOC stocks implies that the poor representation of high-latitude soil wetness in Earth system models may lead to large uncertainty in predicted SOC stocks under future climate change scenarios. Under strict caveats described in the text and assuming temperature changes from the A1B Intergovernmental Panel on Climate Change emissions scenario, our geospatial model indicates that the equilibrium average 2100 Alaska active-layer depth could deepen by 11 cm, resulting in a thawing of 13 Pg C currently in permafrost. The equilibrium SOC loss associated with this warming would be highest under continuous permafrost (31%), followed by discontinuous (28%), isolated (24.3%), and sporadic (23.6%) permafrost areas. Our high-resolution mapping of soil carbon stock reveals the

  1. Category learning in Alzheimer's disease and normal cognitive aging depends on initial experience of feature variability.

    Science.gov (United States)

    Phillips, Jeffrey S; McMillan, Corey T; Smith, Edward E; Grossman, Murray

    2017-04-01

    Semantic category learning is dependent upon several factors, including the nature of the learning task, as well as individual differences in the quality and heterogeneity of exemplars that an individual encounters during learning. We trained healthy older adults (n=39) and individuals with a diagnosis of Alzheimer's disease or Mild Cognitive Impairment (n=44) to recognize instances of a fictitious animal, a "crutter". Each stimulus item contained 10 visual features (e.g., color, tail shape) which took one of two values for each feature (e.g., yellow/red, curly/straight tails). Participants were presented with a series of items (learning phase) and were either told the items belonged to a semantic category (explicit condition) or were told to think about the appearance of the items (implicit condition). Half of participants saw learning items with higher similarity to an unseen prototype (high typicality learning set), and thus lower between-item variability in their constituent features; the other half learned from items with lower typicality (low typicality learning set) and higher between-item feature variability. After the learning phase, participants were presented with test items one at a time that varied in the number of typical features from 0 (antitype) to 10 (prototype). We examined between-subjects factors of learning set (lower or higher typicality), instruction type (explicit or implicit), and group (patients vs. elderly control). Learning in controls was aided by higher learning set typicality: while controls in both learning set groups demonstrated significant learning, those exposed to a high-typicality learning set appeared to develop a prototype that helped guide their category membership judgments. Overall, patients demonstrated more difficulty with category learning than elderly controls. Patients exposed to the higher-typicality learning set were sensitive to the typical features of the category and discriminated between the most and least

  2. Alaskan soil carbon stocks: spatial variability and dependence on environmental factors

    Directory of Open Access Journals (Sweden)

    U. Mishra

    2012-05-01

    The direction and magnitude of soil organic carbon (SOC) changes in response to climate change depend on the spatial and vertical distributions of SOC. We estimated spatially-resolved SOC stocks from surface to C horizon, distinguishing active-layer and permafrost-layer stocks, based on geospatial analysis of 472 soil profiles and spatially referenced environmental variables for Alaska. Total Alaska state-wide SOC stock was estimated to be 77 Pg, with 61% in the active-layer, 27% in permafrost, and 12% in non-permafrost soils. Prediction accuracy was highest for the active-layer as demonstrated by the highest ratio of performance to deviation (1.5). Large spatial variability was predicted, with whole-profile, active-layer, and permafrost-layer stocks ranging from 1–296 kg C m−2, 2–166 kg m−2, and 0–232 kg m−2, respectively. Temperature and soil wetness were found to be primary controllers of whole-profile, active-layer, and permafrost-layer SOC stocks. Secondary controllers, in order of importance, were: land cover type, topographic attributes, and bedrock geology. The observed importance of soil wetness rather than precipitation on SOC stocks implies that the poor representation of high-latitude soil wetness in Earth System Models may lead to large uncertainty in predicted SOC stocks under future climate change scenarios. Under strict caveats described in the text and assuming temperature changes from the A1B Intergovernmental Panel on Climate Change emissions scenario, our geospatial model indicates that the equilibrium average 2100 Alaska active-layer depth could deepen by 11 cm, resulting in a thawing of 13 Pg C currently in permafrost. The equilibrium SOC loss associated with this warming would be highest under continuous permafrost (31%), followed by discontinuous (28%), isolated (24.3%), and sporadic (23.6%) permafrost areas. Our high resolution mapping of soil carbon stock reveals the potential

  3. Nonparametric instrumental regression with non-convex constraints

    Science.gov (United States)

    Grasmair, M.; Scherzer, O.; Vanhems, A.

    2013-03-01

    This paper considers the nonparametric regression model with an additive error that is dependent on the explanatory variables. As is common in empirical studies in epidemiology and economics, it also supposes that valid instrumental variables are observed. A classical example in microeconomics considers the consumer demand function as a function of the price of goods and the income, both variables often considered as endogenous. In this framework, the economic theory also imposes shape restrictions on the demand function, such as integrability conditions. Motivated by this illustration in microeconomics, we study an estimator of a nonparametric constrained regression function using instrumental variables by means of Tikhonov regularization. We derive rates of convergence for the regularized model both in a deterministic and stochastic setting under the assumption that the true regression function satisfies a projected source condition including, because of the non-convexity of the imposed constraints, an additional smallness condition.

  4. Logistic Regression: Concept and Application

    Science.gov (United States)

    Cokluk, Omay

    2010-01-01

    The main focus of logistic regression analysis is classification of individuals in different groups. The aim of the present study is to explain basic concepts and processes of binary logistic regression analysis intended to determine the combination of independent variables which best explain the membership in certain groups called dichotomous…

  5. Comparison of Artificial Neural Network with Logistic Regression as Classification Models for Variable Selection for Prediction of Breast Cancer Patient Outcomes

    Directory of Open Access Journals (Sweden)

    Valérie Bourdès

    2010-01-01

    The aim of this study was to compare multilayer perceptron neural networks (NNs) with standard logistic regression (LR) to identify key covariates impacting on mortality from cancer causes, disease-free survival (DFS), and disease recurrence using Area Under Receiver-Operating Characteristics (AUROC) in breast cancer patients. From 1996 to 2004, 2,535 patients diagnosed with primary breast cancer entered into the study at a single French centre, where they received standard treatment. For specific mortality as well as DFS analysis, the areas under the ROC curves were greater with the NN models than with the LR model, with better sensitivity and specificity. Four predictive factors were retained by both approaches for mortality: clinical size stage, Scarff Bloom Richardson grade, number of invaded nodes, and progesterone receptor. The results enhanced the relevance of the use of NN models in predictive analysis in oncology, which appeared to be more accurate in prediction in this French breast cancer cohort.

  6. Method of frequency dependent correlations: investigating the variability of total solar irradiance

    Science.gov (United States)

    Pelt, J.; Käpylä, M. J.; Olspert, N.

    2017-03-01

    Context. This paper contributes to the field of modeling and hindcasting of the total solar irradiance (TSI) based on different proxy data that extend further back in time than the TSI that is measured from satellites. Aims: We introduce a simple method to analyze persistent frequency-dependent correlations (FDCs) between the time series and use these correlations to hindcast missing historical TSI values. We try to avoid arbitrary choices of the free parameters of the model by computing them using an optimization procedure. The method can be regarded as a general tool for pairs of data sets, where correlating and anticorrelating components can be separated into non-overlapping regions in frequency domain. Methods: Our method is based on low-pass and band-pass filtering with a Gaussian transfer function combined with de-trending and computation of envelope curves. Results: We find a major controversy between the historical proxies and satellite-measured targets: a large variance is detected between the low-frequency parts of targets, while the low-frequency proxy behavior of different measurement series is consistent with high precision. We also show that even though the rotational signal is not strongly manifested in the targets and proxies, it becomes clearly visible in FDC spectrum. A significant part of the variability can be explained by a very simple model consisting of two components: the original proxy describing blanketing by sunspots, and the low-pass-filtered curve describing the overall activity level. The models with the full library of the different building blocks can be applied to hindcasting with a high level of confidence, Rc ≈ 0.90. The usefulness of these models is limited by the major target controversy. Conclusions: The application of the new method to solar data allows us to obtain important insights into the different TSI modeling procedures and their capabilities for hindcasting based on the directly observed time intervals.

  7. Determination of riverbank erosion probability using Locally Weighted Logistic Regression

    Science.gov (United States)

    Ioannidou, Elena; Flori, Aikaterini; Varouchakis, Emmanouil A.; Giannakis, Georgios; Vozinaki, Anthi Eirini K.; Karatzas, George P.; Nikolaidis, Nikolaos

    2015-04-01

    Riverbank erosion is a natural geomorphologic process that affects the fluvial environment. The most important issue concerning riverbank erosion is the identification of the vulnerable locations. An alternative to the usual hydrodynamic models to predict vulnerable locations is to quantify the probability of erosion occurrence. This can be achieved by identifying the underlying relations between riverbank erosion and the geomorphological or hydrological variables that prevent or stimulate erosion. Thus, riverbank erosion can be determined by a regression model using independent variables that are considered to affect the erosion process. The impact of such variables may vary spatially, therefore, a non-stationary regression model is preferred instead of a stationary equivalent. Locally Weighted Regression (LWR) is proposed as a suitable choice. This method can be extended to predict the binary presence or absence of erosion based on a series of independent local variables by using the logistic regression model. It is referred to as Locally Weighted Logistic Regression (LWLR). Logistic regression is a type of regression analysis used for predicting the outcome of a categorical dependent variable (e.g. binary response) based on one or more predictor variables. The method can be combined with LWR to assign weights to local independent variables of the dependent one. LWR allows model parameters to vary over space in order to reflect spatial heterogeneity. The probabilities of the possible outcomes are modelled as a function of the independent variables using a logistic function. Logistic regression measures the relationship between a categorical dependent variable and, usually, one or several continuous independent variables by converting the dependent variable to probability scores. Then, a logistic regression is formed, which predicts success or failure of a given binary variable (e.g. erosion presence or absence) for any value of the independent variables. The
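    The combination described in this record, a logistic model whose coefficients are re-estimated around each location with distance-based weights, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the Gaussian kernel, the bandwidth, the one-dimensional river coordinate `s`, the single synthetic predictor and all function names are hypothetical choices for the example.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def fit_weighted_logistic(X, y, w, n_iter=25, ridge=1e-6):
    """Weighted logistic regression fitted by Newton's method (IRLS)."""
    Xd = np.column_stack([np.ones(len(y)), X])      # add an intercept column
    beta = np.zeros(Xd.shape[1])
    for _ in range(n_iter):
        p = sigmoid(Xd @ beta)
        grad = Xd.T @ (w * (y - p)) - ridge * beta
        hess = (Xd * (w * p * (1.0 - p))[:, None]).T @ Xd + ridge * np.eye(len(beta))
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def lwlr_probability(s, X, y, s0, x0, bandwidth):
    """Erosion probability at location s0 for predictor values x0."""
    # Gaussian kernel (assumed): observations near s0 get the largest weights.
    w = np.exp(-0.5 * ((s - s0) / bandwidth) ** 2)
    beta = fit_weighted_logistic(X, y, w)
    return sigmoid(beta[0] + x0 @ beta[1:])

# Synthetic data (assumed): the effect of the predictor changes sign mid-river.
rng = np.random.default_rng(0)
n = 600
s = rng.uniform(0.0, 10.0, n)                # position along the river
x = rng.normal(size=(n, 1))                  # one standardized predictor
slope = np.where(s < 5.0, -2.0, 2.0)         # spatially varying coefficient
y = (rng.uniform(size=n) < sigmoid(slope * x[:, 0])).astype(float)

x0 = np.array([1.5])
p_upstream = lwlr_probability(s, x, y, 1.0, x0, bandwidth=1.0)
p_downstream = lwlr_probability(s, x, y, 9.0, x0, bandwidth=1.0)
print(p_upstream, p_downstream)
```

    A single global logistic fit would average the two opposite effects toward zero; the local fits recover a low probability upstream and a high probability downstream for the same predictor value, which is the spatial heterogeneity the method is meant to capture.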

  8. Linear regression in astronomy. I

    Science.gov (United States)

    Isobe, Takashi; Feigelson, Eric D.; Akritas, Michael G.; Babu, Gutti Jogesh

    1990-01-01

    Five methods for obtaining linear regression fits to bivariate data with unknown or insignificant measurement errors are discussed: ordinary least-squares (OLS) regression of Y on X, OLS regression of X on Y, the bisector of the two OLS lines, orthogonal regression, and 'reduced major-axis' regression. These methods have been used by various researchers in observational astronomy, most importantly in cosmic distance scale applications. Formulas for calculating the slope and intercept coefficients and their uncertainties are given for all the methods, including a new general form of the OLS variance estimates. The accuracy of the formulas was confirmed using numerical simulations. The applicability of the procedures is discussed with respect to their mathematical properties, the nature of the astronomical data under consideration, and the scientific purpose of the regression. It is found that, for problems needing symmetrical treatment of the variables, the OLS bisector performs significantly better than orthogonal or reduced major-axis regression.
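    The five slope estimators named in this record can all be written in terms of the centered sums of squares and cross-products. The sketch below uses the standard closed forms for the slopes only; the uncertainty formulas of the paper are not reproduced, and the function name and synthetic data are assumptions for illustration.

```python
import numpy as np

def five_slopes(x, y):
    """Slopes (dY/dX) of five symmetric and asymmetric linear fits."""
    dx, dy = x - x.mean(), y - y.mean()
    sxx, syy, sxy = dx @ dx, dy @ dy, dx @ dy
    b1 = sxy / sxx                       # OLS regression of Y on X
    b2 = syy / sxy                       # OLS regression of X on Y, as dY/dX
    # Line bisecting the angle between the two OLS lines.
    bisector = (b1 * b2 - 1.0 + np.sqrt((1.0 + b1**2) * (1.0 + b2**2))) / (b1 + b2)
    # Orthogonal regression: minimizes perpendicular distances to the line.
    t = b2 - 1.0 / b1
    orthogonal = 0.5 * (t + np.sign(sxy) * np.sqrt(4.0 + t**2))
    # Reduced major axis: geometric mean of the two OLS slopes.
    rma = np.sign(sxy) * np.sqrt(b1 * b2)
    return {"ols_yx": b1, "ols_xy": b2, "bisector": bisector,
            "orthogonal": orthogonal, "rma": rma}

# Synthetic bivariate data with intrinsic scatter (assumed for the example).
rng = np.random.default_rng(0)
x = rng.normal(size=500)
y = 0.8 * x + rng.normal(scale=0.6, size=500)

slopes = five_slopes(x, y)
for name, value in slopes.items():
    print(f"{name:10s} {value:.3f}")
```

    Each intercept follows from the corresponding slope as the mean of Y minus the slope times the mean of X, because all five lines pass through the centroid of the data.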

  9. Multicollinearity is a red herring in the search for moderator variables: A guide to interpreting moderated multiple regression models and a critique of Iacobucci, Schneider, Popovich, and Bakamitsos (2016).

    Science.gov (United States)

    McClelland, Gary H; Irwin, Julie R; Disatnik, David; Sivan, Liron

    2017-02-01

    Multicollinearity is irrelevant to the search for moderator variables, contrary to the implications of Iacobucci, Schneider, Popovich, and Bakamitsos (Behavior Research Methods, 2016, this issue). Multicollinearity is like the red herring in a mystery novel that distracts the statistical detective from the pursuit of a true moderator relationship. We show multicollinearity is completely irrelevant for tests of moderator variables. Furthermore, readers of Iacobucci et al. might be confused by a number of their errors. We note those errors, but more positively, we describe a variety of methods researchers might use to test and interpret their moderated multiple regression models, including two-stage testing, mean-centering, spotlighting, orthogonalizing, and floodlighting without regard to putative issues of multicollinearity. We cite a number of recent studies in the psychological literature in which the researchers used these methods appropriately to test, to interpret, and to report their moderated multiple regression models. We conclude with a set of recommendations for the analysis and reporting of moderated multiple regression that should help researchers better understand their models and facilitate generalizations across studies.
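    The central claim of this record, that mean-centering changes the correlation between a predictor and the product term but not the test of the moderator effect, can be checked numerically. The sketch below is an illustration on synthetic data, not the authors' analysis; the variable names and the data-generating values are assumptions.

```python
import numpy as np

def ols(X, y):
    """OLS coefficients and their standard errors."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))
    return beta, se

def design(x, z):
    # Columns: intercept, x, z, and the product (moderator) term x*z.
    return np.column_stack([np.ones_like(x), x, z, x * z])

# Synthetic data (assumed) with non-zero means, so x and x*z are highly correlated.
rng = np.random.default_rng(1)
n = 500
x = rng.normal(5.0, 1.0, n)
z = rng.normal(10.0, 2.0, n)
y = 1.0 + 0.5 * x + 0.3 * z + 0.2 * x * z + rng.normal(size=n)

xc, zc = x - x.mean(), z - z.mean()
raw_beta, raw_se = ols(design(x, z), y)
cen_beta, cen_se = ols(design(xc, zc), y)

corr_raw = np.corrcoef(x, x * z)[0, 1]
corr_cen = np.corrcoef(xc, xc * zc)[0, 1]
print("corr(x, xz): raw %.2f, centered %.2f" % (corr_raw, corr_cen))
print("interaction: raw %.4f (se %.4f), centered %.4f (se %.4f)"
      % (raw_beta[3], raw_se[3], cen_beta[3], cen_se[3]))
```

    Centering is a linear reparameterization of the same model, so the interaction coefficient and its standard error are identical in both fits; only the lower-order coefficients change meaning, since they become effects evaluated at the mean of the other variable.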

  10. General Trimmed Estimation : Robust Approach to Nonlinear and Limited Dependent Variable Models

    NARCIS (Netherlands)

    Cizek, P.

    2004-01-01

    High breakdown-point regression estimators protect against large errors and data contamination. Motivated by some (the least trimmed squares and maximum trimmed likelihood estimators), we propose a general trimmed estimator, which unifies and extends many existing robust procedures. We derive

  11. The derivative-dependent functional variable separation for the evolution equations

    Institute of Scientific and Technical Information of China (English)

    Zhang Shun-Li; Lou Sen-Yue; Qu Chang-Zheng

    2006-01-01

    This paper studies variable separation of the evolution equations via the generalized conditional symmetry. To illustrate, we classify the extended nonlinear wave equation $u_{tt} = A(u, u_x)u_{xx} + B(u, u_x, u_t)$ which admits the derivative-dependent functional separable solutions (DDFSSs). We also extend the concept of the DDFSS to cover other variable separation approaches.

  12. Enhance-Synergism and Suppression Effects in Multiple Regression

    Science.gov (United States)

    Lipovetsky, Stan; Conklin, W. Michael

    2004-01-01

    Relations between pairwise correlations and the coefficient of multiple determination in regression analysis are considered. The conditions for the occurrence of enhance-synergism and suppression effects when multiple determination becomes bigger than the total of squared correlations of the dependent variable with the regressors are discussed. It…
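
    The effect described here can be checked with the standard two-predictor identity $R^2 = (r_{y1}^2 + r_{y2}^2 - 2 r_{y1} r_{y2} r_{12}) / (1 - r_{12}^2)$. The sketch below is only an illustration with made-up correlations, not an example from the article; the function name is hypothetical.

```python
def r_squared_two_predictors(r_y1, r_y2, r_12):
    """Coefficient of multiple determination for two predictors, from pairwise correlations."""
    return (r_y1 ** 2 + r_y2 ** 2 - 2.0 * r_y1 * r_y2 * r_12) / (1.0 - r_12 ** 2)

# Hypothetical correlations: x2 is uncorrelated with y but correlated with x1.
r_y1, r_y2, r_12 = 0.5, 0.0, 0.5
r2 = r_squared_two_predictors(r_y1, r_y2, r_12)
sum_sq = r_y1 ** 2 + r_y2 ** 2

print("R^2:", r2)
print("sum of squared correlations:", sum_sq)
print("R^2 exceeds the sum:", r2 > sum_sq)
```

    With uncorrelated predictors ($r_{12} = 0$) the identity reduces to the plain sum of squared correlations, so any excess comes from the predictor intercorrelation.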

  13. Controlling the Type I Error Rate in Stepwise Regression Analysis.

    Science.gov (United States)

    Pohlmann, John T.

    Three procedures used to control Type I error rate in stepwise regression analysis are forward selection, backward elimination, and true stepwise. In the forward selection method, a model of the dependent variable is formed by choosing the single best predictor; then the second predictor which makes the strongest contribution to the prediction of…

  14. MULTIPLE REGRESSION ANALYSIS OF MAIN ECONOMIC INDICATORS IN TOURISM

    Directory of Open Access Journals (Sweden)

    Erika KULCSÁR

    2009-12-01

    This paper analyzes the relationship between the dependent variable, GDP in the hotels and restaurants sector, and the following independent variables: overnight stays in tourist reception establishments, arrivals in tourist reception establishments, and investments in the hotels and restaurants sector, over the analysis period 1995-2007. Using multiple regression analysis, I found that investments and tourist arrivals are significant predictors of the GDP dependent variable. Based on these results, I identified those components of the marketing mix which, in my opinion, require investment and which could contribute to the positive development of tourist arrivals in tourist reception establishments.

  16. Business applications of multiple regression

    CERN Document Server

    Richardson, Ronny

    2015-01-01

    This second edition of Business Applications of Multiple Regression describes the use of the statistical procedure called multiple regression in business situations, including forecasting and understanding the relationships between variables. The book assumes a basic understanding of statistics but reviews correlation analysis and simple regression to prepare the reader to understand and use multiple regression. The techniques described in the book are illustrated using both Microsoft Excel and a professional statistical program. Along the way, several real-world data sets are analyzed in deta

  17. Robust Nonstationary Regression

    OpenAIRE

    1993-01-01

    This paper provides a robust statistical approach to nonstationary time series regression and inference. Fully modified extensions of traditional robust statistical procedures are developed which allow for endogeneities in the nonstationary regressors and serial dependence in the shocks that drive the regressors and the errors that appear in the equation being estimated. The suggested estimators involve semiparametric corrections to accommodate these possibilities and they belong to the same ...

  18. Time-dependent reliability of corrosion-affected RC beams-Part 1: Estimation of time-dependent strengths and associated variability

    Energy Technology Data Exchange (ETDEWEB)

    Bhargava, Kapilesh, E-mail: kapilesh_66@yahoo.co.u [Architecture and Civil Engineering Division, Bhabha Atomic Research Center, Trombay, Mumbai 400 085 (India); Mori, Yasuhiro [Graduate School of Environmental Studies, Nagoya University, Nagoya 464-8603 (Japan); Ghosh, A.K. [Reactor Safety Division, Bhabha Atomic Research Center, Trombay, Mumbai 400 085 (India)

    2011-05-15

    Research highlights: Predictive models for corrosion-induced damages in RC structures. Formulations for time-dependent flexural and shear strengths of corroded RC beams. Methodology for mean and c.o.v. for time-dependent strengths of corroded RC beams. Simple estimation of mean and c.o.v. for flexural strength with loss of bond. Abstract: The structural deterioration of reinforced concrete (RC) structures due to reinforcement corrosion is a major worldwide problem. Damages to RC structures due to reinforcement corrosion manifest in the form of expansion, cracking and eventual spalling of the cover concrete; thereby resulting in serviceability and durability degradation of such structures. In addition to loss of cover, RC structure may suffer structural damages due to loss of reinforcement cross-sectional area, and loss of bond between corroded reinforcement and surrounding cracked concrete, sometimes to the extent that the structural failure becomes inevitable. This paper forms the first part of a study which addresses time-dependent reliability analyses of RC beams affected by reinforcement corrosion. In this paper initially the predictive models are presented for the quantitative assessment of time-dependent damages in RC beams, recognized as loss of mass and cross-sectional area of reinforcing bar, loss of concrete section owing to the peeling of cover concrete, and loss of bond between corroded reinforcement and surrounding cracked concrete. Then these models have been used to present analytical formulations for evaluating time-dependent flexural and shear strengths of corroded RC beams, based on the standard composite mechanics expressions for RC sections. Further by considering variability in the identified basic variables that could affect the time-dependent strengths of corrosion-affected RC beams, the estimation of statistical descriptions for the time-dependent strengths is presented for a typical simply supported RC beam. The statistical descriptions

  19. Heart rate variability biofeedback in patients with alcohol dependence: a randomized controlled study

    Directory of Open Access Journals (Sweden)

    Penzlin AI

    2015-10-01

    Ana Isabel Penzlin,1 Timo Siepmann,2 Ben Min-Woo Illigens,3 Kerstin Weidner,4 Martin Siepmann4 1Institute of Clinical Pharmacology, 2Department of Neurology, University Hospital Carl Gustav Carus, Technische Universität Dresden, Dresden, Saxony, Germany; 3Department of Neurology, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, USA; 4Department of Psychotherapy and Psychosomatic Medicine, University Hospital Carl Gustav Carus, Technische Universität Dresden, Dresden, Saxony, Germany Background and objective: In patients with alcohol dependence, ethyl-toxic damage of vasomotor and cardiac autonomic nerve fibers leads to autonomic imbalance with neurovascular and cardiac dysfunction, the latter resulting in reduced heart rate variability (HRV). Autonomic imbalance is linked to increased craving and cardiovascular mortality. In this study, we sought to assess the effects of HRV biofeedback training on HRV, vasomotor function, craving, and anxiety. Methods: We conducted a randomized controlled study in 48 patients (14 females, ages 25–59 years) undergoing inpatient rehabilitation treatment. In the treatment group, patients (n=24) attended six sessions of HRV biofeedback over 2 weeks in addition to standard rehabilitative care, whereas, in the control group, subjects received standard care only. Psychometric testing for craving (Obsessive Compulsive Drinking Scale), anxiety (Symptom Checklist-90-Revised), HRV assessment using coefficient of variation of R-R intervals (CVNN) analysis, and vasomotor function assessment using laser Doppler flowmetry were performed at baseline, immediately after completion of treatment or control period, and 3 and 6 weeks afterward (follow-ups 1 and 2). Results: Psychometric testing showed decreased craving in the biofeedback group immediately postintervention (OCDS scores: 8.6±7.9 post-biofeedback versus 13.7±11.0 baseline [mean ± standard deviation], P<0.05), whereas craving was unchanged at

  20. On modified skew logistic regression model and its applications

    Directory of Open Access Journals (Sweden)

    C. Satheesh Kumar

    2015-12-01

    Here we consider a modified form of the logistic regression model useful for situations where the dependent variable is dichotomous in nature and the explanatory variables exhibit asymmetric and multimodal behaviour. The proposed model is fitted to real-life data by the method of maximum likelihood estimation, and its usefulness is illustrated in certain medical applications.

  1. APPLYING LOGISTIC REGRESSION MODEL TO THE EXAMINATION RESULTS DATA

    Directory of Open Access Journals (Sweden)

    Goutam Saha

    2011-01-01

    The binary logistic regression model is used to analyze the school examination results (scores) of 1002 students. The analysis is performed on the basis of the independent variables, viz. gender, medium of instruction, type of schools, category of schools, board of examinations and location of schools, where scores or marks are assumed to be dependent variables. The odds ratio analysis compares the scores obtained in two examinations, viz. matriculation and higher secondary.

  2. Relationships of Measurement Error and Prediction Error in Observed-Score Regression

    Science.gov (United States)

    Moses, Tim

    2012-01-01

    The focus of this paper is assessing the impact of measurement errors on the prediction error of an observed-score regression. Measures are presented and described for decomposing the linear regression's prediction error variance into parts attributable to the true score variance and the error variances of the dependent variable and the predictor…

  3. Common pitfalls in statistical analysis: Logistic regression.

    Science.gov (United States)

    Ranganathan, Priya; Pramesh, C S; Aggarwal, Rakesh

    2017-01-01

    Logistic regression analysis is a statistical technique to evaluate the relationship between various predictor variables (either categorical or continuous) and an outcome which is binary (dichotomous). In this article, we discuss logistic regression analysis and the limitations of this technique.
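
    To make the setting concrete, here is a minimal sketch of binary logistic regression with a single continuous predictor, fitted by Newton-Raphson on the log-likelihood. It is an illustration only: the data are invented, the function name `fit_logistic` is hypothetical, and the fit assumes the two outcome classes overlap (with perfectly separated data the estimates would diverge).

```python
import math

def fit_logistic(x, y, iterations=25):
    """Fit P(y=1) = 1 / (1 + exp(-(b0 + b1*x))) by Newton-Raphson."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p              # score for the intercept
            g1 += (yi - p) * xi       # score for the slope
            h00 += w                  # entries of the information matrix
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (information matrix) * step = score.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Hypothetical data: a continuous predictor and a binary (dichotomous) outcome.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [0, 0, 1, 0, 1, 0, 1, 1]

b0, b1 = fit_logistic(x, y)
print("intercept:", b0)
print("slope:", b1)
print("odds ratio per unit increase in x:", math.exp(b1))
```

    The exponentiated slope is the odds ratio for a one-unit increase in the predictor, which is the quantity usually reported from such a model.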

  4. Linear regression in astronomy. II

    Science.gov (United States)

    Feigelson, Eric D.; Babu, Gutti J.

    1992-01-01

    A wide variety of least-squares linear regression procedures used in observational astronomy, particularly investigations of the cosmic distance scale, are presented and discussed. The classes of linear models considered are (1) unweighted regression lines, with bootstrap and jackknife resampling; (2) regression solutions when measurement error, in one or both variables, dominates the scatter; (3) methods to apply a calibration line to new data; (4) truncated regression models, which apply to flux-limited data sets; and (5) censored regression models, which apply when nondetections are present. For the calibration problem we develop two new procedures: a formula for the intercept offset between two parallel data sets, which propagates slope errors from one regression to the other; and a generalization of the Working-Hotelling confidence bands to nonstandard least-squares lines. They can provide improved error analysis for Faber-Jackson, Tully-Fisher, and similar cosmic distance scale relations.
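
    Class (1) above pairs an unweighted regression line with resampling-based error estimates. The sketch below shows one plain version of the idea, a pairs bootstrap of the OLS slope; it is an assumption-laden illustration rather than the authors' procedure, and the function names, replicate count, and data are hypothetical.

```python
import random

def ols_slope(x, y):
    """Slope of the unweighted OLS regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx

def bootstrap_slope(x, y, n_boot=2000, seed=0):
    """Return (mean, standard deviation) of the slope over bootstrap resamples of (x, y) pairs."""
    rng = random.Random(seed)
    n = len(x)
    slopes = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        ys = [y[i] for i in idx]
        if len(set(xs)) < 2:
            continue  # skip degenerate resamples with no spread in x
        slopes.append(ols_slope(xs, ys))
    mean = sum(slopes) / len(slopes)
    var = sum((s - mean) ** 2 for s in slopes) / (len(slopes) - 1)
    return mean, var ** 0.5

# Hypothetical data with scatter about a straight line.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [2.1, 3.9, 6.2, 8.1, 9.7, 12.3, 13.8, 16.1, 18.2, 19.9]

print("OLS slope:", ols_slope(x, y))
print("bootstrap mean and standard error:", bootstrap_slope(x, y))
```

    A jackknife estimate follows the same pattern, with leave-one-out samples in place of random resamples.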

  5. The relative dependence of Spanish landscape pattern on environmental and geographical variables over time

    NARCIS (Netherlands)

    Ortega, M.; Bunce, R.G.H.; Barrio, del J.M.G.; Elena-Rossello, R.

    2008-01-01

    The analysis of the dependence of landscape patterns on environment was carried out in order to investigate the landscape structure evolution of Spain. The underlying concept was that the dependence between landscape spatial structure and environmental factors could be gradually decreasing over

  6. A Bayesian approach to linear regression in astronomy

    CERN Document Server

    Sereno, Mauro

    2015-01-01

    Linear regression is common in astronomical analyses. I discuss a Bayesian hierarchical modeling of data with heteroscedastic and possibly correlated measurement errors and intrinsic scatter. The method fully accounts for time evolution. The slope, the normalization, and the intrinsic scatter of the relation can evolve with the redshift. The intrinsic distribution of the independent variable is approximated using a mixture of Gaussian distributions whose means and standard deviations depend on time. The method can address scatter in the measured independent variable (a kind of Eddington bias), selection effects in the response variable (Malmquist bias), and departure from linearity in form of a knee. I tested the method with toy models and simulations and quantified the effect of biases and inefficient modeling. The R-package LIRA (LInear Regression in Astronomy) is made available to perform the regression.

  7. Patterns of variability in early-life traits of fishes depend on spatial scale of analysis.

    Science.gov (United States)

    Di Franco, Antonio; Guidetti, Paolo

    2011-06-23

    Estimates of early-life traits of fishes (e.g. pelagic larval duration (PLD) and spawning date) are essential for investigating and assessing patterns of population connectivity. Such estimates are available for a large number of both tropical and temperate fish species, but few studies have assessed their variability in space, especially across multiple scales. The present study, where a Mediterranean fish (i.e. the white seabream Diplodus sargus sargus) was used as a model, shows that spawning date and PLD are spatially more variable at a scale of kilometres than at a scale of tens to hundreds of kilometres. This study indicates the importance of considering spatial variability of early-life traits of fishes in order to properly delineate connectivity patterns at larval stages (e.g. by means of Lagrangian simulations), thus providing strategically useful information on connectivity and relevant management goals (e.g. the creation of networks of marine reserves).

  8. Adaptive regression for modeling nonlinear relationships

    CERN Document Server

    Knafl, George J

    2016-01-01

    This book presents methods for investigating whether relationships are linear or nonlinear and for adaptively fitting appropriate models when they are nonlinear. Data analysts will learn how to incorporate nonlinearity in one or more predictor variables into regression models for different types of outcome variables. Such nonlinear dependence is often not considered in applied research, yet nonlinear relationships are common and so need to be addressed. A standard linear analysis can produce misleading conclusions, while a nonlinear analysis can provide novel insights into data, not otherwise possible. A variety of examples of the benefits of modeling nonlinear relationships are presented throughout the book. Methods are covered using what are called fractional polynomials based on real-valued power transformations of primary predictor variables combined with model selection based on likelihood cross-validation. The book covers how to formulate and conduct such adaptive fractional polynomial modeling in the s...

  9. Entrepreneurial intention modeling using hierarchical multiple regression

    Directory of Open Access Journals (Sweden)

    Marina Jeger

    2014-12-01

    The goal of this study is to identify the contribution of effectuation dimensions to the predictive power of the entrepreneurial intention model over and above that which can be accounted for by other predictors selected and confirmed in previous studies. As is often the case in social and behavioral studies, some variables are likely to be highly correlated with each other. Therefore, the relative amount of variance in the criterion variable explained by each of the predictors depends on several factors such as the order of variable entry and sample specifics. The results show the modest predictive power of two dimensions of effectuation prior to the introduction of the theory of planned behavior elements. The article highlights the main advantages of applying hierarchical regression in social sciences as well as in the specific context of entrepreneurial intention formation, and addresses some of the potential pitfalls that this type of analysis entails.
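
    The dependence on entry order can be illustrated with a small simulation: with two correlated predictors, the increment in $R^2$ credited to each one depends on which is entered first. The sketch below is hypothetical throughout (simulated data, invented names `ols_fit`, `r_squared`, `incremental_r2`) and does not use the study's variables.

```python
import random

def ols_fit(columns, y):
    """Least-squares coefficients [intercept, b1, ...] via the normal equations."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(p)]
    for k in range(p):  # Gauss-Jordan elimination with partial pivoting
        pivot = max(range(k, p), key=lambda r: abs(A[r][k]))
        A[k], A[pivot] = A[pivot], A[k]
        for r in range(p):
            if r != k:
                f = A[r][k] / A[k][k]
                A[r] = [a - f * b for a, b in zip(A[r], A[k])]
    return [A[r][p] / A[r][r] for r in range(p)]

def r_squared(columns, y):
    """Proportion of variance in y explained by the given predictor columns."""
    coef = ols_fit(columns, y)
    n = len(y)
    fitted = [coef[0] + sum(c * col[i] for c, col in zip(coef[1:], columns)) for i in range(n)]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

def incremental_r2(first_block, second_block, y):
    """Return (R^2 after step 1, R^2 after step 2, increment due to the second block)."""
    r2_first = r_squared(first_block, y)
    r2_full = r_squared(first_block + second_block, y)
    return r2_first, r2_full, r2_full - r2_first

rng = random.Random(1)
n = 80
a = [rng.gauss(0, 1) for _ in range(n)]             # hypothetical established predictor
b = [0.6 * ai + 0.8 * rng.gauss(0, 1) for ai in a]  # hypothetical new predictor, correlated with a
y = [0.5 * ai + 0.4 * bi + rng.gauss(0, 1) for ai, bi in zip(a, b)]

print("a entered first, then b:", incremental_r2([a], [b], y))
print("b entered first, then a:", incremental_r2([b], [a], y))
```

    Both orderings reach the same final $R^2$; only the share attributed to each block changes.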

  10. Combined Prediction Model of Death Toll for Road Traffic Accidents Based on Independent and Dependent Variables

    Science.gov (United States)

    Zhong-xiang, Feng; Shi-sheng, Lu; Wei-hua, Zhang; Nan-nan, Zhang

    2014-01-01

    In order to build a combined model which can meet the variation rule of death toll data for road traffic accidents and can reflect the influence of multiple factors on traffic accidents and improve prediction accuracy for accidents, the Verhulst model was built based on the number of death tolls for road traffic accidents in China from 2002 to 2011; and car ownership, population, GDP, highway freight volume, highway passenger transportation volume, and highway mileage were chosen as the factors to build the death toll multivariate linear regression model. Then the two models were combined to be a combined prediction model which has weight coefficient. Shapley value method was applied to calculate the weight coefficient by assessing contributions. Finally, the combined model was used to recalculate the number of death tolls from 2002 to 2011, and the combined model was compared with the Verhulst and multivariate linear regression models. The results showed that the new model could not only characterize the death toll data characteristics but also quantify the degree of influence to the death toll by each influencing factor and had high accuracy as well as strong practicability. PMID:25610454

  11. Combined Prediction Model of Death Toll for Road Traffic Accidents Based on Independent and Dependent Variables

    Directory of Open Access Journals (Sweden)

    Feng Zhong-xiang

    2014-01-01

    In order to build a combined model which can meet the variation rule of death toll data for road traffic accidents and can reflect the influence of multiple factors on traffic accidents and improve prediction accuracy for accidents, the Verhulst model was built based on the number of death tolls for road traffic accidents in China from 2002 to 2011; and car ownership, population, GDP, highway freight volume, highway passenger transportation volume, and highway mileage were chosen as the factors to build the death toll multivariate linear regression model. Then the two models were combined to be a combined prediction model which has weight coefficient. Shapley value method was applied to calculate the weight coefficient by assessing contributions. Finally, the combined model was used to recalculate the number of death tolls from 2002 to 2011, and the combined model was compared with the Verhulst and multivariate linear regression models. The results showed that the new model could not only characterize the death toll data characteristics but also quantify the degree of influence to the death toll by each influencing factor and had high accuracy as well as strong practicability.

  12. Time-dependent excitation and ionization modelling of absorption-line variability due to GRB080310

    DEFF Research Database (Denmark)

    Vreeswijk, P.M.; De Cia, A.; Jakobsson, P.

    2013-01-01

    We model the time-variable absorption of Fe II, Fe III, Si II, C II and Cr II detected in Ultraviolet and Visual Echelle Spectrograph (UVES) spectra of gamma-ray burst (GRB) 080310, with the afterglow radiation exciting and ionizing the interstellar medium in the host galaxy at a redshift of z = 2.427...

  13. Architecting product diversification - Formalizing variability dependencies in software product family engineering

    NARCIS (Netherlands)

    Jaring, M; Bosch, J; Ehrich, HD; Schewe, KD

    2004-01-01

    In a software product family context, software architects design architectures that support product diversification in both space (multiple contexts) and time (changing contexts). Product diversification is based on the concept of variability: a single architecture and a set of components support a

  14. A characterization of marginal distributions of (possibly dependent) lifetime variables which right censor each other

    NARCIS (Netherlands)

    Bedford, T.; Meilijson, I.

    1997-01-01

    It is well known that the joint distribution of a pair of lifetime variables $X_1$ and $X_2$ which right censor each other cannot be specified in terms of the subsurvival functions $$P(X_2 > X_1 > x), \quad P(X_1 > X_2 > x) \quad \text{and} \quad P(X_1 = X_2 > x)$$ without additional assumptions s

  15. Age Dependent Variability in Gene Expression in Fischer 344 Rat Retina.

    Science.gov (United States)

    Recent evidence suggests older adults may be a sensitive population with regard to environmental exposure to toxic compounds. One source of this sensitivity could be an enhanced variability in response. Studies on phenotypic differences have suggested that variation in response d...

  16. Saddlepoint expansions for sums of Markov dependent variables on a continuous state space

    DEFF Research Database (Denmark)

    Jensen, J.L.

    1991-01-01

    here very similar to the classical results for i.i.d. variables. In particular we establish also conditions under which the expansions hold uniformly over the range of the saddlepoint. Expansions are also derived for sums of the form $f(X_1, X_0) + f(X_2, X_1) + \dots + f(X_n, X_{n-1})$ although the uniformity result...

  17. Weighted sums of subexponential random variables and asymptotic dependence between returns on reinsurance equities

    NARCIS (Netherlands)

    J.L. Geluk (Jaap); C.G. de Vries (Casper)

    2004-01-01

    Asymptotic tail probabilities for bivariate linear combinations of subexponential random variables are given. These results are applied to explain the joint movements of the stocks of reinsurers. Portfolio investment and retrocession practices in the reinsurance industry, for reasons of

  18. Logistic regression applied to natural hazards: rare event logistic regression with replications

    Directory of Open Access Journals (Sweden)

    M. Guns

    2012-06-01

    Statistical analysis of natural hazards needs particular attention, as most of these phenomena are rare events. This study shows that the ordinary rare event logistic regression, as it is now commonly used in geomorphologic studies, does not always lead to a robust detection of controlling factors, as the results can be strongly sample-dependent. In this paper, we introduce some concepts of Monte Carlo simulations in rare event logistic regression. This technique, so-called rare event logistic regression with replications, combines the strength of probabilistic and statistical methods, and allows overcoming some of the limitations of previous developments through robust variable selection. This technique was here developed for the analyses of landslide controlling factors, but the concept is widely applicable for statistical analyses of natural hazards.

  19. Regression: A Bibliography.

    Science.gov (United States)

    Pedrini, D. T.; Pedrini, Bonnie C.

    Regression, another mechanism studied by Sigmund Freud, has had much research, e.g., hypnotic regression, frustration regression, schizophrenic regression, and infra-human-animal regression (often directly related to fixation). Many investigators worked with hypnotic age regression, which has a long history, going back to Russian reflexologists.…

  1. Accurate approximate solution to nonlinear oscillators in which the restoring force is inversely proportional to the dependent variable

    Energy Technology Data Exchange (ETDEWEB)

    Belendez, A; Gimeno, E; Mendez, D I; Alvarez, M L [Departamento de Física, Ingeniería de Sistemas y Teoría de la Señal, Universidad de Alicante, Apartado 99, E-03080 Alicante (Spain); Fernandez, E [Departamento de Óptica, Farmacología y Anatomía, Universidad de Alicante, Apartado 99, E-03080 Alicante (Spain)], E-mail: a.belendez@ua.es

    2008-06-15

    A modified generalized, rational harmonic balance method is used to construct approximate frequency-amplitude relations for a conservative nonlinear singular oscillator in which the restoring force is inversely proportional to the dependent variable. The procedure is used to solve the nonlinear differential equation approximately. The approximate frequency obtained using this procedure is more accurate than those obtained using other approximate methods and the discrepancy between the approximate frequency and the exact one is lower than 0.40%.
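
    For orientation, the oscillator in question can be written as $x'' + 1/x = 0$ with $x(0) = A$, $x'(0) = 0$ (this normalization is an assumption; the abstract does not state it). Energy conservation gives the quarter period as $\sqrt{2}\,A\int_0^\infty e^{-v^2}\,dv$, so the exact frequency is $\sqrt{\pi/2}/A$. The sketch below checks that value by quadrature and compares it with the simplest first-order harmonic balance estimate $\sqrt{2}/A$, obtained by inserting $x = A\cos\omega t$ into $x\,x'' + 1 = 0$. This is not the modified rational method of the paper, which reports a much smaller discrepancy; the function names are hypothetical.

```python
import math

def exact_frequency(amplitude, steps=20000, v_max=8.0):
    """Angular frequency of x'' + 1/x = 0 with x(0)=A, x'(0)=0, by trapezoidal quadrature."""
    # Quarter period: T/4 = sqrt(2) * A * integral_0^inf exp(-v^2) dv,
    # after substituting ln(A/x) = v^2 in the energy integral.
    h = v_max / steps
    total = 0.5 * (1.0 + math.exp(-v_max ** 2))
    total += sum(math.exp(-(i * h) ** 2) for i in range(1, steps))
    integral = total * h
    period = 4.0 * math.sqrt(2.0) * amplitude * integral
    return 2.0 * math.pi / period

def first_order_harmonic_balance(amplitude):
    """Frequency from the single-harmonic ansatz x = A cos(wt) in x*x'' + 1 = 0."""
    return math.sqrt(2.0) / amplitude

A = 1.0
w_exact = exact_frequency(A)
w_hb = first_order_harmonic_balance(A)
print("exact frequency (quadrature):", w_exact)
print("closed form sqrt(pi/2)/A:    ", math.sqrt(math.pi / 2.0) / A)
print("first-order harmonic balance:", w_hb)
print("relative error of first-order estimate:", (w_hb - w_exact) / w_exact)
```

    The gap between the first-order estimate and the exact value is what higher-order or rational harmonic balance schemes aim to close.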

  2. Contact-dependent performance variability of monolayer MoS$_2$ field-effect transistors

    Energy Technology Data Exchange (ETDEWEB)

    Han, Gyuchull; Yoon, Youngki, E-mail: youngki.yoon@uwaterloo.ca [Department of Electrical and Computer Engineering and Waterloo Institute for Nanotechnology, University of Waterloo, Waterloo, Ontario N2L 3G1 (Canada)

    2014-11-24

    Using self-consistent quantum transport simulations, we investigate the performance variability of monolayer molybdenum disulfide (MoS$_2$) field-effect transistors (FETs) with various contact properties. Varying the Schottky barrier in MoS$_2$ FETs affects the output characteristics more significantly than the transfer characteristics. If doped contacts are realized, the performance variation due to non-ideal contacts becomes negligible; otherwise, channel doping can effectively suppress the performance variability in metal-contact devices. Our scaling study also reveals that for sub-10-nm channels, doped-contact devices can be more robust in terms of switching, while metal-contact MoS$_2$ FETs can undergo the smaller penalty in output conductance.

  3. Near-infrared thermal emission from near-Earth asteroids: Aspect-dependent variability

    CERN Document Server

    Moskovitz, Nicholas A; DeMeo, Francesca E; Binzel, Richard P; Endicott, Thomas; Yang, Bin; Howell, Ellen S; Vervack, Ronald J; Fernandez, Yanga R

    2016-01-01

    Here we explore a technique for constraining physical properties of near-Earth asteroids (NEAs) based on variability in thermal emission as a function of viewing aspect. We present case studies of the low albedo, near-Earth asteroids (285263) 1998 QE2 and (175706) 1996 FG3. The Near-Earth Asteroid Thermal Model (NEATM) is used to fit signatures of thermal emission in near-infrared (0.8 - 2.5 micron) spectral data. This analysis represents a systematic study of thermal variability in the near-IR as a function of phase angle. The observations of QE2 imply that carefully timed observations from multiple viewing geometries can be used to constrain physical properties like retrograde versus prograde pole orientation and thermal inertia. The FG3 results are more ambiguous with detected thermal variability possibly due to systematic issues with NEATM, an unexpected prograde rotation state, or a surface that is spectrally and thermally heterogenous. This study highlights the potential diagnostic importance of high ph...

  4. Predicting clinical variables in Alzheimer's disease based on multimodal relevance vector regression

    Institute of Scientific and Technical Information of China (English)

    程波; 张道强

    2012-01-01

    Both the clinical variable values and the multimodal features of Alzheimer's disease (AD) are external reflections of its underlying pathology. This paper proposes a multimodal relevance vector regression machine that predicts clinical variable values by learning from multimodal features. First, a kernel method is used to fuse the multimodal data into a mixed kernel matrix; then relevance vector regression is used to build regression models for the clinical variables Mini-Mental State Examination (MMSE) and Alzheimer's Disease Assessment Scale-Cognitive subscale (ADAS-Cog); finally, the correlation coefficient and root mean square error are used to assess the performance of the algorithm. Experimental results on the standard ADNI data set show that the proposed multimodal method predicts better than single-modality methods. Recently, effective and accurate diagnosis of the disease stage of Alzheimer's disease (AD) or mild cognitive impairment (MCI) has attracted more and more attention. Numerous studies have demonstrated that clinical variables and multimodal features of AD are external reflections of the intrinsic disease pathology. This paper proposes a multimodal regression method for estimating disease stage and predicting clinical progression from three modalities of biomarkers, i.e., magnetic resonance imaging (MRI), fluoro-deoxy-glucose positron emission tomography (FDG-PET), and cerebrospinal fluid (CSF) biomarkers. Specifically, our multimodal regression framework includes three key steps: firstly, we apply a dedicated application tool to the original MRI and FDG-PET image data from the 202 Alzheimer's Disease Neuroimaging Initiative (ADNI) subjects. For each preprocessed original MR or FDG-PET image, 93 regions of interest (ROIs) are labeled by an atlas warping algorithm. And then, for each MR or FDG-PET image, 93 volumetric features are extracted from the 93 ROIs. Therefore, for each subject, the final features comprise 93 features from the MRI image, another 93 features from the PET image, and 3 features from the CSF biomarkers which

  5. Wrong Signs in Regression Coefficients

    Science.gov (United States)

    McGee, Holly

    1999-01-01

    When using parametric cost estimation, it is important to note the possibility of the regression coefficients having the wrong sign. A wrong sign is defined as a sign on the regression coefficient opposite to the researcher's intuition and experience. Some possible causes for the wrong sign discussed in this paper are a small range of x's, leverage points, missing variables, multicollinearity, and computational error. Additionally, techniques for determining the cause of the wrong sign are given.

  6. Variable Viscosity Effects on Time Dependent Magnetic Nanofluid Flow past a Stretchable Rotating Plate

    Directory of Open Access Journals (Sweden)

    Ram Paras

    2016-01-01

    An attempt has been made to describe the effects of geothermal viscosity with viscous dissipation on the three-dimensional time-dependent boundary layer flow of magnetic nanofluids due to a stretchable rotating plate in the presence of a porous medium. The modelled time-dependent governing equations are transformed from a boundary value problem to an initial value problem, and thereafter solved by a fourth-order Runge-Kutta method in MATLAB with a shooting technique for the initial guess. The influences of mixed temperature, depth-dependent viscosity, and the rotation strength parameter on the flow field and temperature field generated on the plate surface are investigated. The derived results show direct impact in the problems of heat transfer in high-speed computer disks (Herrero et al. [1]) and turbine rotor systems (Owen and Rogers [2]).
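    The solution strategy named in this record (recast the boundary value problem as an initial value problem, integrate with fourth-order Runge-Kutta, and adjust the unknown initial condition by shooting) can be sketched as follows. This is a minimal illustration only: the toy equation y'' = -y with y(0) = 0 and y(pi/2) = 1, the bisection update, and all function names are assumptions chosen for the sketch, not the nanofluid equations or the MATLAB code of the paper.

```python
import math

def rk4_step(f, t, y, h):
    """One classical fourth-order Runge-Kutta step for y' = f(t, y), y a list."""
    k1 = f(t, y)
    k2 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
    k3 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
    k4 = f(t + h, [yi + h * ki for yi, ki in zip(y, k3)])
    return [yi + h / 6 * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]

def integrate(f, y0, t0, t1, n=200):
    """Integrate from t0 to t1 in n equal RK4 steps and return the final state."""
    h = (t1 - t0) / n
    t, y = t0, list(y0)
    for _ in range(n):
        y = rk4_step(f, t, y, h)
        t += h
    return y

def shoot(f, left_value, right_value, t0, t1, s_lo, s_hi, tol=1e-10):
    """Bisect on the unknown initial slope s until y(t1) matches right_value."""
    def miss(s):
        return integrate(f, [left_value, s], t0, t1)[0] - right_value

    f_lo, f_hi = miss(s_lo), miss(s_hi)
    if f_lo * f_hi > 0:
        raise ValueError("initial slopes do not bracket a solution")
    for _ in range(200):
        mid = 0.5 * (s_lo + s_hi)
        f_mid = miss(mid)
        if abs(f_mid) < tol or (s_hi - s_lo) < tol:
            return mid
        if f_lo * f_mid <= 0:
            s_hi, f_hi = mid, f_mid
        else:
            s_lo, f_lo = mid, f_mid
    return 0.5 * (s_lo + s_hi)

def toy_rhs(t, y):
    """Toy problem (assumption): y'' = -y written as a first-order system."""
    return [y[1], -y[0]]

# Exact solution is y = sin(t), so the missing initial slope is 1.
slope = shoot(toy_rhs, 0.0, 1.0, 0.0, math.pi / 2, 0.0, 2.0)
print(slope)
```

    Bisection is used here because it is easy to verify; a secant or Newton update on the missed boundary value is a common alternative when a good bracket is not available.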

  7. Anatomy of a population cycle: the role of density dependence and demographic variability on numerical instability and periodicity.

    Science.gov (United States)

    Row, Jeffrey R; Wilson, Paul J; Murray, Dennis L

    2014-07-01

    Determining the causes of cyclic fluctuations in population size is a central tenet in population ecology and provides insights into population regulatory mechanisms. We have a firm understanding of how direct and delayed density dependence affects population stability and cyclic dynamics, but there remains considerable uncertainty in the specific processes contributing to demographic variability and consequent change in cyclic propensity. Spatiotemporal variability in cyclic propensity, including recent attenuation or loss of cyclicity among several temperate populations and the implications of habitat fragmentation and climate change on this pattern, highlights the heightened need to understand processes underlying cyclic variation. Because these stressors can differentially impact survival and productivity and thereby impose variable time delays in density dependence, there is a specific need to elucidate how demographic vital rates interact with the type and action of density dependence to contribute to population stability and cyclic variation. Here, we address this knowledge gap by comparing the stability of time series derived from general and species-specific (Canada lynx: Lynx canadensis; small rodents: Microtus, Lemmus and Clethrionomys spp.) matrix population models, which vary in their demographic rates and the direct action of density dependence. Our results reveal that density dependence acting exclusively on survival as opposed to productivity is destabilizing, suggesting that a shift in the action of population regulation toward reproductive output may decrease cyclic propensity and cycle amplitude. This result was the same whether delayed density dependence was pulsatile and acted on a single time period (e.g. t-1, t-2 or t-3) vs. more constant by affecting a successive range of years (e.g. t-1,…, t-3). Consistent with our general models, reductions in reproductive potential in both the lynx and small rodent systems led to notably large drops in

  8. The Relationships between Cognitive Style of Field Dependence and Learner Variables in E-Learning Instruction

    Science.gov (United States)

    Sozcu, Omer Faruk

    2014-01-01

    This study examines the relationships between cognitive styles of field dependent learners with their attitudes towards e-learning (distance education) and instructional behavior in e-learning instruction. The Group Embedded Figures Test (GEFT) and the attitude survey (for students' preferences) towards e-learning instruction as distance education…

  9. Logistic regression: a brief primer.

    Science.gov (United States)

    Stoltzfus, Jill C

    2011-10-01

    Regression techniques are versatile in their application to medical research because they can measure associations, predict outcomes, and control for confounding variable effects. As one such technique, logistic regression is an efficient and powerful way to analyze the effect of a group of independent variables on a binary outcome by quantifying each independent variable's unique contribution. Using components of linear regression reflected in the logit scale, logistic regression iteratively identifies the strongest linear combination of variables with the greatest probability of detecting the observed outcome. Important considerations when conducting logistic regression include selecting independent variables, ensuring that relevant assumptions are met, and choosing an appropriate model building strategy. For independent variable selection, one should be guided by such factors as accepted theory, previous empirical investigations, clinical considerations, and univariate statistical analyses, with acknowledgement of potential confounding variables that should be accounted for. Basic assumptions that must be met for logistic regression include independence of errors, linearity in the logit for continuous variables, absence of multicollinearity, and lack of strongly influential outliers. Additionally, there should be an adequate number of events per independent variable to avoid an overfit model, with commonly recommended minimum "rules of thumb" ranging from 10 to 20 events per covariate. Regarding model building strategies, the three general types are direct/standard, sequential/hierarchical, and stepwise/statistical, with each having a different emphasis and purpose. Before reaching definitive conclusions from the results of any of these methods, one should formally quantify the model's internal validity (i.e., replicability within the same data set) and external validity (i.e., generalizability beyond the current sample). 
    The resulting logistic regression model ...
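    The events-per-covariate rule of thumb quoted in this record (10 to 20 events per independent variable) amounts to simple arithmetic. The sketch below is an illustration under stated assumptions: the function name is hypothetical, the sample counts are invented, and taking the less frequent outcome as the limiting "event" count is a common convention rather than something the abstract specifies.

```python
def max_covariates(n_events, n_non_events, events_per_variable=10):
    """Largest number of covariates allowed by an events-per-variable rule of thumb.

    Assumption: the limiting sample size is the less frequent of the two outcomes.
    """
    if events_per_variable <= 0:
        raise ValueError("events_per_variable must be positive")
    limiting = min(n_events, n_non_events)
    return limiting // events_per_variable

# Hypothetical sample: 80 events among 1000 subjects.
print(max_covariates(80, 920, events_per_variable=10))
print(max_covariates(80, 920, events_per_variable=20))
```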

  10. Near-infrared thermal emission from near-Earth asteroids: Aspect-dependent variability

    Science.gov (United States)

    Moskovitz, Nicholas A.; Polishook, David; DeMeo, Francesca E.; Binzel, Richard P.; Endicott, Thomas; Yang, Bin; Howell, Ellen S.; Vervack, Ronald J.; Fernández, Yanga R.

    2017-03-01

    Here we explore a technique for constraining physical properties of near-Earth asteroids (NEAs) based on variability in thermal emission as a function of viewing aspect. We present case studies of the low albedo, near-Earth asteroids (285263) 1998 QE2 and (175706) 1996 FG3. The Near-Earth Asteroid Thermal Model (NEATM) is used to fit signatures of thermal emission in near-infrared (0.8 - 2.5 μm) spectral data. This analysis represents a systematic study of thermal variability in the near-IR as a function of phase angle. The observations of QE2 imply that carefully timed observations from multiple viewing geometries can be used to constrain physical properties like retrograde versus prograde pole orientation and thermal inertia. The FG3 results are more ambiguous, with detected thermal variability possibly due to systematic issues with NEATM, an unexpected prograde rotation state, or a surface that is spectrally and thermally heterogeneous. This study highlights the potential diagnostic importance of high phase angle thermal measurements on both sides of opposition. We find that the NEATM thermal beaming parameters derived from our near-IR data tend to be of order 10s of percent higher than parameters from ensemble analyses of longer wavelength data sets. However, a systematic comparison of NEATM applied to data in different wavelength regimes is needed to understand whether this offset is simply a reflection of small number statistics or an intrinsic limitation of NEATM when applied to near-IR data. With the small sample presented here, it remains unclear whether NEATM modeling at near-IR wavelengths can robustly determine physical properties like pole orientation and thermal inertia.

  11. THE DIFFERENCES IN MORAL, GROUP IDENTITY AND THE PERSON’S VARIABILITY DEPENDING ON THE EDUCATION

    Directory of Open Access Journals (Sweden)

    Irina Aleksandrobna Kolinichenko

    2017-06-01

    Results. The study revealed that the group of test subjects with incomplete higher education scored higher on all assessed parameters: they showed a higher level of moral development in all dilemmas, namely the opposition of life values (compassion) and following the law, self-interest and the interests of the city (law), business (benefit) and law, and personal interests (career) and the freedom of another person, with the exception of the dilemma opposing the interests of a majority and a single person. Differences between the two groups of test subjects were also found in group identity, group variability, and the desirability of the common categories of identity.

  12. Modeling the Effects of a Normal-Stress-Dependent State Variable, Within the Rate- and State-Dependent Friction Framework, at Stepovers and Dip-Slip Faults

    Science.gov (United States)

    Ryan, Kenny J.; Oglesby, David D.

    2017-03-01

    The development of the rate- and state-dependent friction framework (Dieterich Appl Geophys 116:790-806, 1978; J Geophys Res 84, 2161-2168, 1979; Ruina Friction laws and instabilities: a quasistatic analysis of some dry friction behavior, Ph.D. Thesis, Brown Univ., Providence, R.I., 1980; J Geophys Res 88:10359-10370, 1983) includes the dependence of friction coefficient on normal stress (Linker and Dieterich J Geophys Res 97:4923-4940, 1992); however, a direct dependence of the friction law on time-varying normal stress in dynamic stepover and dip-slip fault models has not yet been extensively explored. Using rate- and state-dependent friction laws and a 2-D dynamic finite element code (Barall J Int 178, 845-859, 2009), we investigate the effect of the Linker-Dieterich dependence of state variable on normal stress at stepovers and dip-slip faults, where normal stress should not be constant with time (e.g., Harris and Day J Geophys Res 98:4461-4472, 1993; Nielsen Geophys Res Lett 25:125-128, 1998). Specifically, we use the relation $d\psi/dt = -(\alpha/\sigma)(d\sigma/dt)$ from Linker and Dieterich (J Geophys Res 97:4923-4940, 1992), in which a change in normal stress leads to a change in state variable of the opposite sign. We investigate a range of values for alpha, which scales the impact of the normal stress change on state, from 0 to 0.5 (laboratory values range from 0.2 to 0.56). For stepovers, we find that adding normal-stress dependence to the state variable delays or stops re-nucleation on the secondary fault segment when compared to normal-stress-independent state evolution. This inhibition of jumping rupture is due to the fact that re-nucleation along the secondary segment occurs in areas of decreased normal stress in both compressional and dilational stepovers. However, the magnitude of such an effect differs between dilational and compressional systems. Additionally, it is well known that the asymmetric geometry of reverse and normal faults can lead to greater
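    Taken in isolation, the Linker-Dieterich relation quoted in this record integrates to a change in state of -alpha * ln(sigma_end / sigma_start), which makes the "opposite sign" behaviour easy to check. The sketch below is an illustration only: it ignores every other term of the state evolution law, and the stress history (a linear drop from 100 to 80, in arbitrary but consistent units), the value alpha = 0.3, and the function names are assumptions, not values or code from the study.

```python
import math

def state_change_closed_form(sigma_start, sigma_end, alpha):
    """Integrated normal-stress term: delta_psi = -alpha * ln(sigma_end / sigma_start)."""
    if sigma_start <= 0 or sigma_end <= 0:
        raise ValueError("normal stress must be positive")
    return -alpha * math.log(sigma_end / sigma_start)

def state_change_numeric(sigma_history, alpha):
    """Sum d(psi) = -(alpha / sigma) d(sigma) over a sampled stress history (midpoint rule)."""
    total = 0.0
    for s0, s1 in zip(sigma_history, sigma_history[1:]):
        total += -(alpha / (0.5 * (s0 + s1))) * (s1 - s0)
    return total

# Hypothetical history: normal stress drops linearly from 100 to 80.
history = [100.0 - 20.0 * i / 1000 for i in range(1001)]
closed = state_change_closed_form(100.0, 80.0, alpha=0.3)
numeric = state_change_numeric(history, alpha=0.3)
print(closed, numeric)  # both positive: a stress decrease raises the state variable
```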

  13. Assay-dependent variability of serum insulin concentrations: a comparison of eight assays.

    Science.gov (United States)

    Tohidi, Maryam; Arbab, Parvaneh; Ghasemi, Asghar

    2017-04-01

    Although insulin measurement is essential for both clinical and research purposes, there is currently no reference method for insulin assays. The aim of this study was to compare results of serum insulin determined by a number of commercially available assays. We compared eight insulin assays by analyzing 165 serum samples. Assays included two chemiluminescence (Roche and DiaSorin), four ELISA (Tosoh, Mercodia, Monobind, and Diametra), and two IRMA (Izotop and BioSource) methods. Each assay was compared with the mean of all assay methods and Bland-Altman difference plots were used to measure agreement between each assay and overall mean. Least squared perpendicular distance regression analysis (Deming's method) was used to calculate slope and intercept for bias and also for each assay vs. mean of eight assays. Findings showed that the lowest and highest median insulin concentrations varied by a factor of 1.8. Maximum and minimum correlations with mean of assays were observed for Roche (0.992) and BioSource (0.844), respectively. Significant bias was observed in six assays. In pairwise comparisons of different assays, the highest and least mean differences were 7.78 μU/mL and -0.14 μU/mL, respectively. In conclusion, serum insulin measurement with different assays showed a maximum of 1.8-fold difference, a point that should be taken into consideration in the interpretation of circulating insulin levels in both clinical and research fields.
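    The two agreement tools named in this record, Bland-Altman differences and Deming regression, reduce to short formulas. The sketch below is an illustration under stated assumptions: the error-variance ratio `delta` defaults to 1 (the perpendicular-distance case the abstract describes), the function names are hypothetical, and the numbers are invented rather than taken from the 165 serum samples.

```python
import math
import statistics

def bland_altman(a, b):
    """Return (bias, lower limit, upper limit) of agreement for paired measurements."""
    diffs = [x - y for x, y in zip(a, b)]
    bias = statistics.mean(diffs)
    spread = 1.96 * statistics.stdev(diffs)
    return bias, bias - spread, bias + spread

def deming(x, y, delta=1.0):
    """Deming regression of y on x; delta is the ratio of y-error to x-error variance."""
    n = len(x)
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x) / (n - 1)
    syy = sum((yi - mean_y) ** 2 for yi in y) / (n - 1)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (n - 1)
    if sxy == 0:
        raise ValueError("x and y are uncorrelated; slope is undefined")
    slope = (syy - delta * sxx
             + math.sqrt((syy - delta * sxx) ** 2 + 4 * delta * sxy ** 2)) / (2 * sxy)
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Invented paired readings from two hypothetical assays.
assay_a = [10.0, 12.0, 14.0]
assay_b = [9.0, 12.0, 13.0]
print(bland_altman(assay_a, assay_b))
print(deming([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0]))
```

    In the study each assay was compared with the mean of all eight assays, so in practice the second argument would be that per-sample mean rather than a single competing assay.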

  14. A functional limit theorem for partial sums of dependent random variables with infinite variance

    CERN Document Server

    Basrak, Bojan; Segers, Johan

    2010-01-01

    Under an appropriate regular variation condition, the affinely normalized partial sums of a sequence of independent and identically distributed random variables converge weakly to a non-Gaussian stable random variable. A functional version of this is known to be true as well, the limit process being a stable Lévy process. The main result in the paper is that for a stationary, regularly varying sequence for which clusters of high-threshold excesses can be broken down into asymptotically independent blocks, the properly centered partial sum process still converges to a stable Lévy process. Due to clustering, the Lévy triple of the limit process can be different from the one in the independent case. The convergence takes place in the space of càdlàg functions endowed with Skorohod's $M_1$ topology, the more usual $J_1$ topology being inappropriate as the partial sum processes may exhibit rapid successions of jumps within temporal clusters of large values, collapsing in the limit to a single jump. The ...

  15. The Greek infinitive in variable deliberative, principally dependent questions: an interpretation in terms of naturalness theory

    Directory of Open Access Journals (Sweden)

    Jerneja Kavčič

    2004-12-01

    In the present paper I investigate the use of the infinitive in dependent deliberative clauses in Greek, a phenomenon occurring in several (modern) languages, cf. Slovene Nisem vedel, kaj storiti. 'I didn't know what to do.', English I didn't know what to do., German Was tun? 'What to do?'. In the first part I present the development of deliberative infinitive clauses in Post-Classical Greek with a special emphasis on the use of this form in two Early Byzantine prose writings (Pratum Spirituale and Vita Theodori Syceotae, both belonging to the 6th/7th century AD), where some peculiarities are observed. In the second part an attempt is made to interpret the basic characteristics of the Greek infinitive in dependent deliberative clauses from the perspective of Naturalness Theory.

  16. Censored Hurdle Negative Binomial Regression (Case Study: Neonatorum Tetanus Case in Indonesia)

    Science.gov (United States)

    Yuli Rusdiana, Riza; Zain, Ismaini; Wulan Purnami, Santi

    2017-06-01

    Hurdle negative binomial regression is a method that can be used for a discrete dependent variable with excess zeros and with under- or overdispersion. It uses a two-part approach. The first part, the zero hurdle model, estimates the zero elements of the dependent variable; the second part, the truncated negative binomial model, estimates the nonzero elements (positive integers). The discrete dependent variable in such cases is censored for some values. The type of censoring studied in this research is right censoring. This study aims to obtain the parameter estimators of hurdle negative binomial regression for a right-censored dependent variable. Parameters are estimated by the maximum likelihood estimator (MLE). Hurdle negative binomial regression for a right-censored dependent variable is applied to the number of neonatorum tetanus cases in Indonesia. The data are count data that contain zeros in some observations and a variety of other values. This study also aims to obtain the parameter estimators and test statistics of the censored hurdle negative binomial model. Based on the regression results, the factors that influence neonatorum tetanus cases in Indonesia are the percentage of baby health care coverage and neonatal visits.
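    The two-part structure described in this record can be written down as a log-likelihood in a few lines. The sketch below is a simplified illustration, not the authors' estimator: it has no covariates (a single zero probability `pi_zero`, mean `mu`, and dispersion `r`, all hypothetical names), it uses the mean-dispersion form of the negative binomial, and the sample is invented. A right-censored observation recorded at value c contributes P(Y >= c) instead of P(Y = c).

```python
import math

def nb_pmf(y, mu, r):
    """Negative binomial pmf with mean mu and dispersion (size) r."""
    log_p = (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
             + r * math.log(r / (r + mu)) + y * math.log(mu / (r + mu)))
    return math.exp(log_p)

def hurdle_nb_pmf(y, pi_zero, mu, r):
    """Hurdle model: zeros with probability pi_zero, positives from a zero-truncated NB."""
    if y == 0:
        return pi_zero
    return (1.0 - pi_zero) * nb_pmf(y, mu, r) / (1.0 - nb_pmf(0, mu, r))

def hurdle_nb_survival(c, pi_zero, mu, r):
    """P(Y >= c), the contribution of an observation right-censored at c."""
    below = sum(hurdle_nb_pmf(k, pi_zero, mu, r) for k in range(c))
    return 1.0 - below

def censored_loglik(observations, pi_zero, mu, r):
    """Log-likelihood for (value, is_censored) pairs under the censored hurdle NB."""
    total = 0.0
    for value, is_censored in observations:
        if is_censored:
            total += math.log(hurdle_nb_survival(value, pi_zero, mu, r))
        else:
            total += math.log(hurdle_nb_pmf(value, pi_zero, mu, r))
    return total

# Invented sample: the last count is only known to be 5 or more.
sample = [(0, False), (0, False), (1, False), (3, False), (5, True)]
loglik = censored_loglik(sample, pi_zero=0.4, mu=2.0, r=1.5)
print(loglik)
```

    Maximum likelihood estimation would maximise `censored_loglik` over the parameters; in a regression setting `pi_zero` and `mu` would each be linked to covariates, which this sketch leaves out.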

  17. Variability in projected elevation dependent warming in boreal midlatitude winter in CMIP5 climate models and its potential drivers

    Science.gov (United States)

    Rangwala, Imtiaz; Sinsky, Eric; Miller, James R.

    2016-04-01

    The future rate of climate change in mountains has many potential human impacts, including those related to water resources, ecosystem services, and recreation. Analysis of the ensemble mean response of CMIP5 global climate models (GCMs) shows amplified warming in high elevation regions during the cold season in boreal midlatitudes. We examine how the twenty-first century elevation-dependent response in the daily minimum surface air temperature [d(ΔTmin)/dz] varies among 27 different GCMs during winter for the RCP 8.5 emissions scenario. The focus is on regions within the northern hemisphere mid-latitude band between 27.5°N and 40°N, which includes both the Rocky Mountains and the Tibetan Plateau/Himalayas. We find significant variability in d(ΔTmin)/dz among the individual models ranging from 0.16 °C/km (10th percentile) to 0.97 °C/km (90th percentile), although nearly all of the GCMs (24 out of 27) show a significant positive value for d(ΔTmin)/dz. To identify some of the important drivers associated with the variability in d(ΔTmin)/dz during winter, we evaluate the co-variance between d(ΔTmin)/dz and the differential response of elevation-based anomalies in different climate variables as well as the GCMs' spatial resolution, their global climate sensitivity, and their elevation-dependent free air temperature response. We find that d(ΔTmin)/dz has the strongest correlation with elevation-dependent increases in surface water vapor, followed by elevation-dependent decreases in surface albedo, and a weak positive correlation with the GCMs' free air temperature response.

  18. Re-examining the ontogeny of the context preexposure facilitation effect in the rat through multiple dependent variables.

    Science.gov (United States)

    Pisano, M V; Ferreras, S; Krapacher, F A; Paglini, G; Arias, C

    2012-07-15

    The capability to acquire context conditioning does not emerge until weaning, at least when the defining features of the context lack explicit and salient olfactory cues. Contextual learning deficits in preweanling rats have been associated with functional immaturity of the dorsal hippocampus. According to recent studies, the so-called context preexposure facilitation effect (CPFE) - a hippocampus-dependent effect - is not observed until postnatal day 23 (PD23). In these studies the footshock intensity employed was higher (1.5 mA) than in adult studies, and context conditioning was inferred from a single behavioral measure (percentage of freezing). The present study examined the CPFE on PD17 and PD23 by analyzing multiple dependent variables, including fecal boli and an ethogram covering the complete behavioral repertoire of the rat. A non-shocked control group was included in the design and two footshock intensities were employed (0.5 and 1.5 mA). Results showed clear evidence of contextual fear conditioning in preweanling and weanling rats, as well as evidence of conditioned fear in non-preexposed rats from both age groups. In some cases, some dependent variables, such as grooming or vertical exploration, were more sensitive than freezing for detecting evidence of memory. Strong fear responses were detected in weanling (but not preweanling) rats, when rats were evaluated in a different context from the one employed at conditioning. These results indicate that preweanling rats are capable of acquiring contextual conditioning, even in a context lacking explicit odor cues, and highlight the importance of multiple dependent variables for analyzing the ontogeny of memory.

  19. Three-dimensional carotid ultrasound segmentation variability dependence on signal difference and boundary orientation.

    Science.gov (United States)

    Chiu, Bernard; Krasinski, Adam; Spence, J David; Parraga, Grace; Fenster, Aaron

    2010-01-01

    Quantitative measurements of the progression (or regression) of carotid plaque burden are important in monitoring patients and evaluating new treatment options. We previously developed a quantitative metric to analyze changes in carotid plaque morphology from 3-D ultrasound (US) on a point-by-point basis. This method requires multiple segmentations of the arterial wall and lumen boundaries to obtain the local standard deviation (SD) of vessel-wall-plus-plaque thickness (VWT) so that t-tests could be used to determine whether a change in VWT is statistically significant. However, the requirement for multiple segmentations makes clinical trials laborious and time-consuming. Therefore, this study was designed to establish the relationship between local segmentation SD and local signal difference on the arterial wall and lumen boundaries. We propose metrics to quantify segmentation SD and signal difference on a point-by-point basis, and studied whether the signal difference at arterial wall or lumen boundaries could be used to predict local segmentation SD. The ability to predict the local segmentation SD could eliminate the need for repeated segmentations of a 2-D transverse image to obtain the local segmentation standard deviation, thereby making clinical trials less laborious and saving time. Six subjects involved in this study were associated with different degrees of atherosclerosis: three carotid stenosis subjects with mean plaque area >3 cm² and >60% carotid stenosis were involved in a clinical study evaluating the effect of atorvastatin, a cholesterol-lowering and plaque-stabilizing drug; and three subjects with carotid plaque area >0.5 cm² were subjects with moderate atherosclerosis. Our results suggest that when local signal difference is higher than 8 greyscale values (GSV), the local segmentation SD stabilizes at 0.05 mm and is thus predictable. This information provides a target value of local signal difference on the arterial boundaries that should be

  20. Regression Analysis by Example. 5th Edition

    Science.gov (United States)

    Chatterjee, Samprit; Hadi, Ali S.

    2012-01-01

    Regression analysis is a conceptually simple method for investigating relationships among variables. Carrying out a successful application of regression analysis, however, requires a balance of theoretical results, empirical rules, and subjective judgment. "Regression Analysis by Example, Fifth Edition" has been expanded and thoroughly…

  1. Interpretation of Standardized Regression Coefficients in Multiple Regression.

    Science.gov (United States)

    Thayer, Jerome D.

    The extent to which standardized regression coefficients (beta values) can be used to determine the importance of a variable in an equation was explored. The beta value and the part correlation coefficient--also called the semi-partial correlation coefficient and reported in squared form as the incremental "r squared"--were compared for…

  2. Regression analysis by example

    National Research Council Canada - National Science Library

    Chatterjee, Samprit; Hadi, Ali S

    2012-01-01

    .... The emphasis continues to be on exploratory data analysis rather than statistical theory. The coverage offers in-depth treatment of regression diagnostics, transformation, multicollinearity, logistic regression, and robust regression...

  3. Time-dependence in Relativistic Collisionless Shocks: Theory of the Variable "Wisps" in the Crab Nebula

    CERN Document Server

    Spitkovsky, Anatoly; Arons, Jonathan

    2004-01-01

    We describe results from time-dependent numerical modeling of the collisionless reverse shock terminating the pulsar wind in the Crab Nebula. We treat the upstream relativistic wind as composed of ions and electron-positron plasma embedded in a toroidal magnetic field, flowing radially outward from the pulsar in a sector around the rotational equator. The relativistic cyclotron instability of the ion gyrational orbit downstream of the leading shock in the electron-positron pairs launches outward propagating magnetosonic waves. Because of the fresh supply of ions crossing the shock, this time-dependent process achieves a limit-cycle, in which the waves are launched with periodicity on the order of the ion Larmor time. Compressions in the magnetic field and pair density associated with these waves, as well as their propagation speed, semi-quantitatively reproduce the behavior of the wisp and ring features described in recent observations obtained using the Hubble Space Telescope and the Chandra X-Ray Observator...

  4. A Simulation Investigation of Principal Component Regression.

    Science.gov (United States)

    Allen, David E.

    Regression analysis is one of the more common analytic tools used by researchers. However, multicollinearity between the predictor variables can cause problems in using the results of regression analyses. Problems associated with multicollinearity include entanglement of relative influences of variables due to reduced precision of estimation,…

  5. Dependency of the Cusp Density Anomaly on the Variability of Forcing Inside and Outside the Cusp

    Science.gov (United States)

    Brinkman, D. G.; Walterscheid, R. L.; Clemmons, J. H.

    2014-12-01

    The Earth's magnetospheric cusp provides direct access of energetic particles to the thermosphere. These particles produce ionization and kinetic (particle) heating of the atmosphere. The increased ionization coupled with enhanced electric fields in the cusp produces increased Joule heating and ion drag forcing. These energy inputs largely determine the neutral density structure in the cusp region. Measurements by the CHAMP satellite (460-390 km altitude) have shown a region of strong enhanced density attributed to the combination of cusp particle and Joule heating. The Streak mission (325-123 km), on the other hand, observed a relative depletion in density in the cusp. While particle precipitation in the cusp is comparatively well constrained, the characteristics of the steady and fluctuating components of the electric field in the cusp are poorly constrained. Also, the significance of harder particle precipitation in areas adjacent to the cusp, in particular at lower altitudes, has not been addressed as it relates to the cusp density anomaly. We address the response of the cusp region to a range of electrodynamical forcing with our high resolution two-dimensional time-dependent nonhydrostatic nonlinear dynamical model. We take advantage of our model's high resolution and focus on a more typical cusp width of 2 degrees in latitude. Earlier simulations have also shown a significant contribution from soft particle precipitation. We simulate the atmospheric response to a range of realizable magnitudes of the fluctuating and steady components of the electric field to examine the dependence of the magnitude of the cusp density anomaly on a large range of observed characteristics of the electrodynamical forcing and examine, in particular, the importance of particle heating relative to Joule heating. In addition we investigate the role of harder particle precipitation in areas adjacent to the cusp in determining the lower altitude cusp density and wind structure. We compare

  6. Functional limit theorem for moving average processes generated by dependent random variables

    Institute of Scientific and Technical Information of China (English)

    2006-01-01

    Let $\{X_t, t \ge 1\}$ be a moving average process defined by $X_t = \sum_{j=0}^{\infty} b_j \xi_{t-j}$, where $\{b_j, j \ge 0\}$ is a sequence of real numbers and $\{\xi_t, -\infty < t < \infty\}$ is a doubly infinite sequence of strictly stationary $\varphi$-mixing random variables. Under conditions on $\{b_j, j \ge 0\}$ which entail that $\{X_t, t \ge 1\}$ is either a long memory process or a linear process, we study asymptotics of $S_n(s) = \sum_{t=1}^{[ns]} X_t$ (properly normalized). When $\{X_t, t \ge 1\}$ is a long memory process, we establish a functional limit theorem. When $\{X_t, t \ge 1\}$ is a linear process, we not only obtain the multi-dimensional weak convergence for $\{X_t, t \ge 1\}$, but also weaken the moment condition on $\{\xi_t, -\infty < t < \infty\}$ and the restriction on $\{b_j, j \ge 0\}$. Finally, we give some applications of our results.

  7. Magnetohydrodynamic dissipative flow across the slendering stretching sheet with temperature dependent variable viscosity

    Science.gov (United States)

    Jayachandra Babu, M.; Sandeep, N.; Ali, M. E.; Nuhait, Abdullah O.

    The boundary layer flow across a slendering stretching sheet has received considerable attention owing to its many practical applications in nuclear reactor technology, acoustical components, and chemical and manufacturing procedures such as polymer extrusion and machine design. With this in view, we analyzed the two-dimensional MHD flow across a slendering stretching sheet in the presence of variable viscosity and viscous dissipation. The sheet is assumed to be convectively heated. Convective boundary conditions for heat and mass are employed. Similarity transformations are used to convert the governing nonlinear partial differential equations into a set of nonlinear ordinary differential equations. A Runge-Kutta based shooting technique is used to solve the converted equations. Numerical values of the friction factor and the local Nusselt and Sherwood numbers are calculated for the physical parameters involved in the problem. The viscosity variation parameter and the chemical reaction parameter have opposite effects on the concentration profile. The heat and mass transfer Biot numbers help to enhance the temperature and concentration, respectively.

  8. Time-dependent excitation and ionization modelling of absorption-line variability due to GRB 080310

    CERN Document Server

    Vreeswijk, P M; Raassen, A J J; Smette, A; De Cia, A; Woźniak, P R; Fox, A J; Vestrand, W T; Jakobsson, P

    2012-01-01

    We model the time-variable absorption of FeII, FeIII, SiII, CII and CrII detected in UVES spectra of GRB 080310, with the afterglow radiation exciting and ionizing the interstellar medium in the host galaxy at a redshift of z=2.42743. To estimate the rest-frame afterglow brightness as a function of time, we use a combination of the optical VRI photometry obtained by the RAPTOR-T telescope array -- which are presented in this paper -- and Swift's X-Ray Telescope observations. Excitation alone, which has been successfully applied for a handful of other GRBs, fails to describe the observed column-density evolution in the case of GRB 080310. Inclusion of ionization is required to explain the column-density decrease of all observed FeII levels (including the ground state 6D9/2) and increase of the FeIII 7S3 level. The large population of ions in this latter level (up to 10% of all FeIII) can only be explained through ionization of FeII, whereby a large fraction of the ionized FeII ions -- we calculate 31% using th...

  9. A tale of two methods: comparing regression and instrumental variables estimates of the effects of preschool child care type on the subsequent externalizing behavior of children in low-income families.

    Science.gov (United States)

    Crosby, Danielle A; Dowsett, Chantelle J; Gennetian, Lisa A; Huston, Aletha C

    2010-09-01

    We apply instrumental variables (IV) techniques to a pooled data set of employment-focused experiments to examine the relation between type of preschool childcare and subsequent externalizing problem behavior for a large sample of low-income children. To assess the potential usefulness of this approach for addressing biases that can confound causal inferences in child care research, we compare instrumental variables results with those obtained using ordinary least squares (OLS) regression. We find that our OLS estimates concur with prior studies showing small positive associations between center-based care and later externalizing behavior. By contrast, our IV estimates indicate that preschool-aged children with center care experience are rated by mothers and teachers as having fewer externalizing problems on entering elementary school than their peers who were not in child care as preschoolers. Findings are discussed in relation to the literature on associations between different types of community-based child care and children's social behavior, particularly within low-income populations. Moreover, we use this study to highlight the relative strengths and weaknesses of each analytic method for addressing causal questions in developmental research.
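    The contrast this record draws between OLS and instrumental variables estimates can be reproduced on simulated data, where an unobserved confounder pushes the OLS slope to the wrong sign while the IV ratio recovers the true effect. Everything below is an assumption made for illustration: the data-generating process, the coefficients, the randomly assigned binary instrument, and the variable names. None of it is the pooled experimental data or the model used in the study.

```python
import random

def covariance(a, b):
    """Sample covariance of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)

def simulate(n, true_effect, seed=0):
    """Simulate instrument z, exposure x, and outcome y with an unobserved confounder u."""
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        zi = 1.0 if rng.random() < 0.5 else 0.0   # randomly assigned instrument
        ui = rng.gauss(0.0, 1.0)                  # unobserved confounder
        xi = 1.0 * zi + 0.8 * ui + rng.gauss(0.0, 1.0)
        yi = true_effect * xi + 1.0 * ui + rng.gauss(0.0, 1.0)
        z.append(zi)
        x.append(xi)
        y.append(yi)
    return z, x, y

TRUE_EFFECT = -0.3  # assumed causal effect of x on y
z, x, y = simulate(20000, TRUE_EFFECT)

ols_slope = covariance(x, y) / covariance(x, x)   # biased by the confounder
iv_slope = covariance(z, y) / covariance(z, x)    # Wald / two-stage least squares ratio
print(ols_slope, iv_slope)
```

    The IV ratio is only valid if the instrument affects the outcome solely through the exposure, which holds by construction in this simulation but has to be argued in a real study.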

  10. Spatio-temporal dependencies between hospital beds, physicians and health expenditure using visual variables and data classification in statistical table

    Directory of Open Access Journals (Sweden)

    Medyńska-Gulij Beata

    2016-06-01

    This paper analyses the use of table visual variables of statistical data of hospital beds as an important tool for revealing spatio-temporal dependencies. It is argued that some of the conclusions drawn from data on public health and public expenditure on health have a spatio-temporal reference. Unlike previous studies, this article combines cartographic pragmatics and spatial visualization with previous conclusions made in the public health literature. While the significant conclusions about health care and economic factors have been highlighted in research papers, this article is the first to apply visual analysis to a statistical table together with maps, which is called previsualisation.

  11. Studies of Hot Photoluminescence in Plasmonically Coupled Silicon via Variable Energy Excitation and Temperature-Dependent Spectroscopy

    Science.gov (United States)

    2015-01-01

    By integrating silicon nanowires (∼150 nm diameter, 20 μm length) with an Ω-shaped plasmonic nanocavity, we are able to generate broadband visible luminescence, which is induced by high order hybrid nanocavity-surface plasmon modes. The nature of this super bandgap emission is explored via photoluminescence spectroscopy studies performed with variable laser excitation energies (1.959 to 2.708 eV) and finite difference time domain simulations. Furthermore, temperature-dependent photoluminescence spectroscopy shows that the observed emission corresponds to radiative recombination of unthermalized (hot) carriers as opposed to a resonant Raman process. PMID:25120156

  12. Spatio-temporal dependencies between hospital beds, physicians and health expenditure using visual variables and data classification in statistical table

    Science.gov (United States)

    Medyńska-Gulij, Beata; Cybulski, Paweł

    2016-06-01

    This paper analyses the use of table visual variables of statistical data of hospital beds as an important tool for revealing spatio-temporal dependencies. It is argued that some of the conclusions drawn from data on public health and public expenditure on health have a spatio-temporal reference. Unlike previous studies, this article combines cartographic pragmatics and spatial visualization with previous conclusions made in the public health literature. While the significant conclusions about health care and economic factors have been highlighted in research papers, this article is the first to apply visual analysis to a statistical table together with maps, which is called previsualisation.

  13. Directional Dependence in Developmental Research

    Science.gov (United States)

    von Eye, Alexander; DeShon, Richard P.

    2012-01-01

    In this article, we discuss and propose methods that may be of use to determine direction of dependence in non-normally distributed variables. First, it is shown that standard regression analysis is unable to distinguish between explanatory and response variables. Then, skewness and kurtosis are discussed as tools to assess deviation from…

  14. Quantile Regression Analysis of the Distributional Effects of Air Pollution on Blood Pressure, Heart Rate Variability, Blood Lipids, and Biomarkers of Inflammation in Elderly American Men: The Normative Aging Study.

    Science.gov (United States)

    Bind, Marie-Abele; Peters, Annette; Koutrakis, Petros; Coull, Brent; Vokonas, Pantel; Schwartz, Joel

    2016-08-01

    Previous studies have observed associations between air pollution and heart disease. Susceptibility to air pollution effects has been examined mostly with a test of effect modification, but little evidence is available whether air pollution distorts cardiovascular risk factor distribution. This paper aims to examine distributional and heterogeneous effects of air pollution on known cardiovascular biomarkers. A total of 1,112 men from the Normative Aging Study and residents of the greater Boston, Massachusetts, area with mean age of 69 years at baseline were included in this study during the period 1995-2013. We used quantile regression and random slope models to investigate distributional effects and heterogeneity in the traffic-related responses on blood pressure, heart rate variability, repolarization, lipids, and inflammation. We considered 28-day averaged exposure to particle number, PM2.5 black carbon, and PM2.5 mass concentrations (measured at a single monitor near the site of the study visits). We observed some evidence suggesting distributional effects of traffic-related pollutants on systolic blood pressure, heart rate variability, corrected QT interval, low density lipoprotein (LDL) cholesterol, triglyceride, and intercellular adhesion molecule-1 (ICAM-1). For example, among participants with LDL cholesterol below 80 mg/dL, an interquartile range increase in PM2.5 black carbon exposure was associated with a 7-mg/dL (95% CI: 5, 10) increase in LDL cholesterol, while among subjects with LDL cholesterol levels close to 160 mg/dL, the same exposure was related to a 16-mg/dL (95% CI: 13, 20) increase in LDL cholesterol. We observed similar heterogeneous associations across low versus high percentiles of the LDL distribution for PM2.5 mass and particle number. These results suggest that air pollution distorts the distribution of cardiovascular risk factors, and that, for several outcomes, effects may be greatest among individuals who are already at high risk

  15. Inferential Models for Linear Regression

    Directory of Open Access Journals (Sweden)

    Zuoyi Zhang

    2011-09-01

    Linear regression is arguably one of the most widely used statistical methods in applications. However, important problems, especially variable selection, remain a challenge for classical modes of inference. This paper develops a recently proposed framework of inferential models (IMs) in the linear regression context. In general, an IM is able to produce meaningful probabilistic summaries of the statistical evidence for and against assertions about the unknown parameter of interest and, moreover, these summaries are shown to be properly calibrated in a frequentist sense. Here we demonstrate, using simple examples, that the IM framework is promising for linear regression analysis --- including model checking, variable selection, and prediction --- and for uncertain inference in general.

  16. Testing in a Random Effects Panel Data Model with Spatially Correlated Error Components and Spatially Lagged Dependent Variables

    Directory of Open Access Journals (Sweden)

    Ming He

    2015-11-01

    We propose a random effects panel data model with both spatially correlated error components and spatially lagged dependent variables. We focus on diagnostic testing procedures and derive Lagrange multiplier (LM) test statistics for a variety of hypotheses within this model. We first construct the joint LM test for both the individual random effects and the two spatial effects (spatial error correlation and spatial lag dependence). We then provide LM tests for the individual random effects and for the two spatial effects separately. In addition, in order to guard against local model misspecification, we derive locally adjusted (robust) LM tests based on the Bera and Yoon principle (Bera and Yoon, 1993). We conduct a small Monte Carlo simulation to show the good finite sample performances of these LM test statistics and revisit the cigarette demand example in Baltagi and Levin (1992) to illustrate our testing procedures.

  17. A new method for obtaining sharp compound Poisson approximation error estimates for sums of locally dependent random variables

    CERN Document Server

    Boutsikas, Michael V; 10.3150/09-BEJ201

    2010-01-01

    Let $X_1,X_2,...,X_n$ be a sequence of independent or locally dependent random variables taking values in $\mathbb{Z}_+$. In this paper, we derive sharp bounds, via a new probabilistic method, for the total variation distance between the distribution of the sum $\sum_{i=1}^nX_i$ and an appropriate Poisson or compound Poisson distribution. These bounds include a factor which depends on the smoothness of the approximating Poisson or compound Poisson distribution. This "smoothness factor" is of order $\mathrm{O}(\sigma ^{-2})$, according to a heuristic argument, where $\sigma ^2$ denotes the variance of the approximating distribution. In this way, we offer sharp error estimates for a large range of values of the parameters. Finally, specific examples concerning appearances of rare runs in sequences of Bernoulli trials are presented by way of illustration.
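
    To make the quantity being bounded concrete, the sketch below computes the exact total variation distance between a sum of independent Bernoulli variables and the Poisson distribution with the same mean, and compares it with the classical Le Cam bound $\sum_i p_i^2$. It does not implement the paper's new method, and it covers only the independent, simple Poisson case; the function names and the choice of 50 trials with success probability 0.05 are assumptions for illustration.

```python
import math
import numpy as np

def bernoulli_sum_pmf(p):
    """Exact pmf of a sum of independent Bernoulli(p_i), by repeated convolution."""
    pmf = np.array([1.0])
    for pi in p:
        pmf = np.convolve(pmf, [1.0 - pi, pi])
    return pmf

def poisson_pmf(lam, kmax):
    """Poisson(lam) probabilities for k = 0..kmax."""
    k = np.arange(kmax + 1)
    log_fact = np.array([math.lgamma(i + 1.0) for i in k])
    return np.exp(-lam + k * math.log(lam) - log_fact)

def tv_to_poisson(p):
    """Total variation distance between the Bernoulli sum and Poisson(sum p_i)."""
    s = bernoulli_sum_pmf(p)
    q = poisson_pmf(float(np.sum(p)), len(p))
    tail = max(0.0, 1.0 - float(q.sum()))  # Poisson mass beyond n, where the sum has none
    return 0.5 * (float(np.abs(s - q).sum()) + tail)

p = np.full(50, 0.05)                 # hypothetical success probabilities
tv = tv_to_poisson(p)
le_cam_bound = float(np.sum(p ** 2))  # classical Le Cam bound on the TV distance
print(f"exact TV distance: {tv:.4f}, Le Cam bound: {le_cam_bound:.4f}")
```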

  18. Principal component regression analysis with SPSS.

    Science.gov (United States)

    Liu, R X; Kuang, J; Gong, Q; Hou, X L

    2003-06-01

    The paper introduces the indices used to diagnose multicollinearity, the basic principle of principal component regression, and a method for determining the 'best' equation. Using an example, it describes how to carry out principal component regression analysis with SPSS 10.0, including all calculation steps of the principal component regression and all operations of the linear regression, factor analysis, descriptives, compute variable and bivariate correlations procedures in SPSS 10.0. Principal component regression analysis can be used to overcome the disturbance caused by multicollinearity. Carrying out the analysis with SPSS makes it simpler and faster while keeping the statistical results accurate.
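
    The basic principle of principal component regression can be sketched outside SPSS. The code below is a minimal NumPy illustration, not a reproduction of the SPSS 10.0 procedures the paper walks through; the helper `pcr_fit`, the simulated predictors, and the choice of two retained components are assumptions. Predictors are standardized, the leading principal components are kept, the response is regressed on their scores, and the result is mapped back to coefficients on the standardized predictors.

```python
import numpy as np

def pcr_fit(X, y, k):
    """Principal component regression keeping the first k components.

    Returns coefficients on the standardized predictors.
    """
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)  # standardize columns
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)  # Z = U diag(s) Vt
    scores = U[:, :k] * s[:k]                         # component scores Z @ V_k
    gamma = scores.T @ (y - y.mean()) / s[:k] ** 2    # OLS on orthogonal scores
    return Vt[:k].T @ gamma                           # back to predictor space

# Hypothetical data: x2 is almost a copy of x1 (severe multicollinearity).
rng = np.random.default_rng(3)
n = 2_000
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.01, size=n)
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])
y = x1 + x2 + 0.5 * x3 + rng.normal(scale=0.5, size=n)

beta_pcr = pcr_fit(X, y, k=2)  # drops the near-degenerate x1 - x2 direction
print(np.round(beta_pcr, 3))
```

    Dropping the smallest component forces the two collinear predictors to share their joint effect almost equally, which stabilizes the estimates at the cost of some bias.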

  19. Modeling Time-Dependent Behavior of Concrete Affected by Alkali Silica Reaction in Variable Environmental Conditions

    Science.gov (United States)

    Alnaggar, Mohammed; Di Luzio, Giovanni; Cusatis, Gianluca

    2017-01-01

    Alkali Silica Reaction (ASR) is known to be a serious problem for concrete worldwide, especially in high humidity and high temperature regions. ASR is a slow process that develops over years to decades and it is influenced by changes in environmental and loading conditions of the structure. The problem becomes even more complicated if one recognizes that other phenomena like creep and shrinkage are coupled with ASR. This results in synergistic mechanisms that cannot be easily understood without a comprehensive computational model. In this paper, coupling between creep, shrinkage and ASR is modeled within the Lattice Discrete Particle Model (LDPM) framework. In order to achieve this, a multi-physics formulation is used to compute the evolution of temperature, humidity, cement hydration, and ASR in both space and time, which is then used within physics-based formulations of cracking, creep and shrinkage. The overall model is calibrated and validated on the basis of experimental data available in the literature. Results show that even during free expansions (zero macroscopic stress), a significant degree of coupling exists because ASR induced expansions are relaxed by meso-scale creep driven by self-equilibrated stresses at the meso-scale. This explains and highlights the importance of considering ASR and other time dependent aging and deterioration phenomena at an appropriate length scale in coupled modeling approaches. PMID:28772829

  20. Dark focus of accommodation as dependent and independent variables in visual display technology

    Science.gov (United States)

    Jones, Sherrie; Kennedy, Robert; Harm, Deborah

    1992-01-01

    When independent stimuli are available for accommodation, as in the dark or under low contrast conditions, the lens seeks its resting position. Individual differences in resting positions are reliable, under autonomic control, and can change with visual task demands. We hypothesized that motion sickness in a flight simulator might result in dark focus changes. Method: Subjects received training flights in three different Navy flight simulators. Two were helicopter simulators that entailed CRT presentation using infinity optics; one involved a dome presentation of a computer graphic visual projection system. Results: In all three experiments there were significant differences between dark focus activity before and after simulator exposure when comparisons were made between sick and not-sick pilot subjects. In two of these experiments, the average shift in dark focus for the sick subjects was toward increased myopia when each subject was compared to his own baseline. In the third experiment, the group showed a small average outward shift, and the subjects who were sick showed significantly less outward movement than those who were symptom free. Conclusions: Although the relationship is not a simple one, dark focus changes in simulator sickness imply parasympathetic activity. Because changes can occur in relation to endogenous and exogenous events, such measurement may have useful applications as dependent measures in studies of visually coupled systems, virtual reality systems, and space adaptation syndrome.

  1. Assumptions of Multiple Regression: Correcting Two Misconceptions

    Directory of Open Access Journals (Sweden)

    Matt N. Williams

    2013-09-01

    In 2002, an article entitled "Four assumptions of multiple regression that researchers should always test" by Osborne and Waters was published in PARE. This article has gone on to be viewed more than 275,000 times (as of August 2013), and it is one of the first results displayed in a Google search for "regression assumptions". While Osborne and Waters' efforts in raising awareness of the need to check assumptions when using regression are laudable, we note that the original article contained at least two fairly important misconceptions about the assumptions of multiple regression: Firstly, that multiple regression requires the assumption of normally distributed variables; and secondly, that measurement errors necessarily cause underestimation of simple regression coefficients. In this article, we clarify that multiple regression models estimated using ordinary least squares require the assumption of normally distributed errors in order for trustworthy inferences, at least in small samples, but not the assumption of normally distributed response or predictor variables. Secondly, we point out that regression coefficients in simple regression models will be biased (toward zero) estimates of the relationships between variables of interest when measurement error is uncorrelated across those variables, but that when correlated measurement error is present, regression coefficients may be either upwardly or downwardly biased. We conclude with a brief corrected summary of the assumptions of multiple regression when using ordinary least squares.
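
    The second point, on measurement error, lends itself to a quick numerical check. The following sketch is a simulation under assumed values (true slope 0.5, unit-variance errors, hypothetical variable names), not material from the article. With measurement errors that are uncorrelated across the two variables the fitted slope shrinks toward zero; with a shared (perfectly correlated) error component it is biased upward instead.

```python
import numpy as np

def slope(x, y):
    """Slope from a simple OLS regression of y on x."""
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))

rng = np.random.default_rng(1)
n = 50_000
true_slope = 0.5
x = rng.normal(size=n)                               # error-free predictor
y = true_slope * x + rng.normal(scale=0.5, size=n)   # error-free response

# Case 1: independent measurement errors on x and y (attenuation).
e_x = rng.normal(size=n)
e_y = rng.normal(size=n)
b_uncorrelated = slope(x + e_x, y + e_y)   # population value: 0.5 * 1/(1+1) = 0.25

# Case 2: the same error contaminates both measurements (correlated error).
shared = rng.normal(size=n)
b_correlated = slope(x + shared, y + shared)  # population value: (0.5+1)/(1+1) = 0.75

print(f"true: {true_slope}, uncorrelated error: {b_uncorrelated:.3f}, "
      f"correlated error: {b_correlated:.3f}")
```

    A negatively correlated error component would push the bias the other way, which is why the direction of bias cannot be assumed without knowing the error structure.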

  2. Reduced Rank Regression

    DEFF Research Database (Denmark)

    Johansen, Søren

    2008-01-01

    The reduced rank regression model is a multivariate regression model with a coefficient matrix with reduced rank. The reduced rank regression algorithm is an estimation procedure, which estimates the reduced rank regression model. It is related to canonical correlations and involves calculating e...

  3. High interindividual variability in dose-dependent reduction in speed of movement after exposing C. elegans to shock waves.

    Science.gov (United States)

    Angstman, Nicholas B; Kiessling, Maren C; Frank, Hans-Georg; Schmitz, Christoph

    2015-01-01

    In blast-related mild traumatic brain injury (br-mTBI) little is known about the connections between initial trauma and expression of individual clinical symptoms. Partly due to limitations of current in vitro and in vivo models of br-mTBI, reliable prediction of individual short- and long-term symptoms based on known blast input has not yet been possible. Here we demonstrate a dose-dependent effect of shock wave exposure on C. elegans using shock waves that share physical characteristics with those hypothesized to induce br-mTBI in humans. Increased exposure to shock waves resulted in decreased mean speed of movement while increasing the proportion of worms rendered paralyzed. Recovery of these two behavioral symptoms was observed during increasing post-traumatic waiting periods. Although effects were observed on a population-wide basis, large interindividual variability was present between organisms exposed to the same highly controlled conditions. Reduction of cavitation by exposing worms to shock waves in polyvinyl alcohol resulted in reduced effect, implicating primary blast effects as damaging components in shock wave induced trauma. Growing worms on NGM agar plates led to the same general results in initial shock wave effect in a standard medium, namely dose-dependence and high interindividual variability, as raising worms in liquid cultures. Taken together, these data indicate that reliable prediction of individual clinical symptoms based on known blast input as well as drawing conclusions on blast input from individual clinical symptoms is not feasible in br-mTBI.

  4. Chemical variability and biological activities of Brassica rapa var. rapifera parts essential oils depending on geographic variation and extraction technique.

    Science.gov (United States)

    Saka, Boualem; Djouahri, Abderrahmane; Djerrad, Zineb; Souhila, Terfi; Aberrane, Sihem; Sabaou, Nasserdine; Baaliouamer, Aoumeur; Boudarene, Lynda

    2017-02-01

    In the present work, the Brassica rapa var. rapifera parts essential oils and their antioxidant and antimicrobial activities were investigated for the first time depending on geographic origin and extraction technique. GC and GC-MS analyses showed several constituents, including alcohols, aldehydes, esters, ketones, norisoprenoids, terpenic, nitrogen and sulphur compounds, totaling 38 and 41 compounds in leaves and root essential oils, respectively. Nitrogen compounds were the main volatiles in leaves essential oils and sulphur compounds were the main volatiles in root essential oils. Qualitative and quantitative differences were found among B. rapa var. rapifera parts essential oils collected from different locations and extracted by hydrodistillation (HD) and microwave-assisted hydrodistillation (MAHD) techniques. Furthermore, our findings showed a high variability for both antioxidant and antimicrobial activities. The highlighted variability reflects the high impact of plant part, geographic variation and extraction technique on chemical composition and biological activities, which leads us to conclude that essential oils to be investigated should be selected carefully depending on these factors, in order to isolate the bioactive components or to have the best quality of essential oil in terms of biological activities and preventive effects in food.

  5. Time dependent simulations of multiwavelength variability of the blazar Mrk 421 with a Monte Carlo multi-zone code

    CERN Document Server

    Chen, Xuhui; Liang, Edison; Boettcher, Markus

    2011-01-01

    (abridged) We present a new time-dependent multi-zone radiative transfer code and its application to study the SSC emission of Mrk 421. The code couples Fokker-Planck and Monte Carlo methods, in a 2D geometry. For the first time all the light travel time effects (LCTE) are fully considered, along with a proper treatment of Compton cooling, which depends on them. We study a set of simple scenarios where the variability is produced by injection of relativistic electrons as a `shock front' crosses the emission region. We consider emission from two components, with the second one either being pre-existing and co-spatial and participating in the evolution of the active region, or spatially separated and independent, only diluting the observed variability. Temporal and spectral results of the simulation are compared to the multiwavelength observations of Mrk 421 in March 2001. We find parameters that can adequately fit the observed SEDs and multiwavelength light curves and correlations. There remain however a few o...

  6. Temperature and field-dependent transport measurements in continuously tunable tantalum oxide memristors expose the dominant state variable

    Science.gov (United States)

    Graves, Catherine E.; Dávila, Noraica; Merced-Grafals, Emmanuelle J.; Lam, Si-Ty; Strachan, John Paul; Williams, R. Stanley

    2017-03-01

    Applications of memristor devices are quickly moving beyond computer memory to areas of analog and neuromorphic computation. These applications require the design of devices with different characteristics from binary memory, such as a large tunable range of conductance. A complete understanding of the conduction mechanisms and their corresponding state variable(s) is crucial for optimizing performance and designs in these applications. Here we present measurements of low bias I-V characteristics of 6 states in a Ta/ tantalum-oxide (TaOx)/Pt memristor spanning over 2 orders of magnitude in conductance and temperatures from 100 K to 500 K. Our measurements show that the 300 K device conduction is dominated by a temperature-insensitive current that varies with non-volatile memristor state, with an additional leakage contribution from a thermally-activated current channel that is nearly independent of the memristor state. We interpret these results with a parallel conduction model of Mott hopping and Schottky emission channels, fitting the voltage and temperature dependent experimental data for all memristor states with only two free parameters. The memristor conductance is linearly correlated with N, the density of electrons near EF participating in the Mott hopping conduction, revealing N to be the dominant state variable for low bias conduction in this system. Finally, we show that the Mott hopping sites can be ascribed to oxygen vacancies, where the local oxygen vacancy density responsible for critical hopping pathways controls the memristor conductance.

  7. Insulin-dependent glucose metabolism in dairy cows with variable fat mobilization around calving.

    Science.gov (United States)

    Weber, C; Schäff, C T; Kautzsch, U; Börner, S; Erdmann, S; Görs, S; Röntgen, M; Sauerwein, H; Bruckmaier, R M; Metges, C C; Kuhla, B; Hammon, H M

    2016-08-01

    clamps, pp nonesterified fatty acid concentrations did not reach the ap levels. The study demonstrated a minor influence of different degrees of body fat mobilization on insulin metabolism in cows during the transition period. The distinct decrease in the glucose-dependent release of insulin pp is the most striking finding that explains the impaired insulin action after calving, but does not explain differences in body fat mobilization between HLFC and LLFC cows.

  8. Robust analysis of trends in noisy tokamak confinement data using geodesic least squares regression

    Science.gov (United States)

    Verdoolaege, G.; Shabbir, A.; Hornung, G.

    2016-11-01

    Regression analysis is a very common activity in fusion science for unveiling trends and parametric dependencies, but it can be a difficult matter. We have recently developed the method of geodesic least squares (GLS) regression that is able to handle errors in all variables, is robust against data outliers and uncertainty in the regression model, and can be used with arbitrary distribution models and regression functions. We here report on first results of application of GLS to estimation of the multi-machine scaling law for the energy confinement time in tokamaks, demonstrating improved consistency of the GLS results compared to standard least squares.

  9. Hierarchical linear regression models for conditional quantiles

    Institute of Scientific and Technical Information of China (English)

    TIAN Maozai; CHEN Gemai

    2006-01-01

    The quantile regression has several useful features and therefore is gradually developing into a comprehensive approach to the statistical analysis of linear and nonlinear response models, but it cannot deal effectively with the data with a hierarchical structure. In practice, the existence of such data hierarchies is neither accidental nor ignorable, it is a common phenomenon. To ignore this hierarchical data structure risks overlooking the importance of group effects, and may also render many of the traditional statistical analysis techniques used for studying data relationships invalid. On the other hand, the hierarchical models take a hierarchical data structure into account and have also many applications in statistics, ranging from overdispersion to constructing min-max estimators. However, the hierarchical models are virtually the mean regression, therefore, they cannot be used to characterize the entire conditional distribution of a dependent variable given high-dimensional covariates. Furthermore, the estimated coefficient vector (marginal effects) is sensitive to an outlier observation on the dependent variable. In this article, a new approach, which is based on the Gauss-Seidel iteration and taking a full advantage of the quantile regression and hierarchical models, is developed. On the theoretical front, we also consider the asymptotic properties of the new method, obtaining the simple conditions for an $n^{1/2}$-convergence and an asymptotic normality. We also illustrate the use of the technique with the real educational data which is hierarchical and how the results can be explained.

  10. Dependence of P-wave dispersion on mean arterial pressure as an independent hemodynamic variable in school children

    Directory of Open Access Journals (Sweden)

    Elibet Chávez González

    2013-09-01

    Introduction: The relationship between diastolic dysfunction and P-wave dispersion (PWD) in the electrocardiogram has been studied for some time. In this regard, echocardiography is emerging as a diagnostic tool to improve risk stratification for mild hypertension. Objective: To determine the dependence of PWD on the electrocardiogram and on echocardiographic variables in a pediatric population. Methods: Five hundred and fifteen children from three elementary schools were studied from a total of 565 children. Those whose parents did not want them to take part in the study, as well as those with known congenital diseases, were excluded. Tests including 12-lead surface ECGs and 4 blood pressure (BP) measurements were performed. Maximum and minimum P-values were measured, and the PWD on the electrocardiogram was calculated. Echocardiography for structural measurements and the pulsed Doppler of mitral flow were also performed. Results: A significant correlation in statistical variables was found between PWD and mean BP for pre-hypertensive and hypertensive children, i.e., r= 0.32, p <0.01 and r= 0.33, p <0.01, respectively. There was a significant correlation found between PWD and the left atrial area (r= 0.45 and p <0.01). Conclusions: We highlight the dependency between PWD, the electrocardiogram and mean blood pressure. We also draw attention to the dependence of PWD on the left atrial area. This result provides an explanation for earlier changes in atrial electrophysiological and hemodynamic characteristics in pediatric patients.

  11. Direction of Effects in Multiple Linear Regression Models.

    Science.gov (United States)

    Wiedermann, Wolfgang; von Eye, Alexander

    2015-01-01

    Previous studies analyzed asymmetric properties of the Pearson correlation coefficient using higher than second order moments. These asymmetric properties can be used to determine the direction of dependence in a linear regression setting (i.e., establish which of two variables is more likely to be on the outcome side) within the framework of cross-sectional observational data. Extant approaches are restricted to the bivariate regression case. The present contribution extends the direction of dependence methodology to a multiple linear regression setting by analyzing distributional properties of residuals of competing multiple regression models. It is shown that, under certain conditions, the third central moments of estimated regression residuals can be used to decide upon direction of effects. In addition, three different approaches for statistical inference are discussed: a combined D'Agostino normality test, a skewness difference test, and a bootstrap difference test. Type I error and power of the procedures are assessed using Monte Carlo simulations, and an empirical example is provided for illustrative purposes. In the discussion, issues concerning the quality of psychological data, possible extensions of the proposed methods to the fourth central moment of regression residuals, and potential applications are addressed.
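
    The idea of using the third central moment of residuals to choose a direction can be illustrated in the simplest, bivariate case. The sketch below is only that: it assumes a skewed (exponential) true predictor and normal errors, uses hypothetical names, and does not implement the multiple-regression extension or the three inference procedures named in the record. Under these assumptions the residuals of the correctly specified model are close to symmetric, while those of the reversed model inherit skewness from the true predictor.

```python
import numpy as np

def residuals(pred, resp):
    """Residuals from a simple OLS regression of resp on pred."""
    pc = pred - pred.mean()
    b = pc @ (resp - resp.mean()) / (pc @ pc)
    return (resp - resp.mean()) - b * pc

def skewness(v):
    """Standardized third central moment."""
    d = v - v.mean()
    return float(np.mean(d ** 3) / np.mean(d ** 2) ** 1.5)

# Hypothetical data: x causes y; x is skewed, the error term is normal.
rng = np.random.default_rng(2)
n = 5_000
x = rng.exponential(size=n)
y = 0.7 * x + rng.normal(size=n)

skew_true = skewness(residuals(x, y))      # model y ~ x (simulated direction)
skew_reversed = skewness(residuals(y, x))  # model x ~ y (reversed direction)
print(f"residual skewness, y~x: {skew_true:.3f}; x~y: {skew_reversed:.3f}")
```

    The comparison is uninformative when the true predictor is symmetric or the error term is itself skewed, which is why the approach holds only "under certain conditions".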

  12. Demographic models reveal the shape of density dependence for a specialist insect herbivore on variable host plants.

    Science.gov (United States)

    Miller, Tom E X

    2007-07-01

    1. It is widely accepted that density-dependent processes play an important role in most natural populations. However, persistent challenges in our understanding of density-dependent population dynamics include evaluating the shape of the relationship between density and demographic rates (linear, concave, convex), and identifying extrinsic factors that can mediate this relationship. 2. I studied the population dynamics of the cactus bug Narnia pallidicornis on host plants (Opuntia imbricata) that varied naturally in relative reproductive effort (RRE, the proportion of meristems allocated to reproduction), an important plant quality trait. I manipulated per-plant cactus bug densities, quantified subsequent dynamics, and fit stage-structured models to the experimental data to ask if and how density influences demographic parameters. 3. In the field experiment, I found that populations with variable starting densities quickly converged upon similar growth trajectories. In the model-fitting analyses, the data strongly supported a model that defined the juvenile cactus bug retention parameter (joint probability of surviving and not dispersing) as a nonlinear decreasing function of density. The estimated shape of this relationship shifted from concave to convex with increasing host-plant RRE. 4. The results demonstrate that host-plant traits are critical sources of variation in the strength and shape of density dependence in insects, and highlight the utility of integrated experimental-theoretical approaches for identifying processes underlying patterns of change in natural populations.

  13. Using rain-on-snow events to evaluate the quality of bias correction to represent complex inter-variable dependencies

    Science.gov (United States)

    Rössler, Ole; Bosshard, Thomas; Weingartner, Rolf

    2016-04-01

    A key issue for adaptation planning is the information that projections provide about changes in extremes. Climate projections of meteorological extremes and their downscaling are a challenge on their own. Yet, at least in hydrology, meteorological extremes are not necessarily hydrological extremes. These can also result from a sequence of days with only moderate meteorological conditions. Such sequences are called "storylines". In climate change impact assessment studies it is relevant to know whether these meteorological storylines are represented in regional climate models, and how well bias correction can preserve or improve the representation. One storyline leading to hydrological extremes is rain-on-snow events, and more specifically rain-on-snowfall events. These events challenge the regional climate model and the bias correction in terms of representing absolute values and inter-variable dependences. This study makes use of the rain-on-snow storylines to evaluate the performance of regional climate models and a bias correction method in reproducing complex inter-variable dependencies. First, we applied a hydrological model to a mesoscale catchment in Switzerland that is known to be affected by rain-on-snow events. Second, the ERA-Interim driven regional climate model RCA4.5, developed at SMHI, with a spatial resolution of 0.11 * 0.11 degree, was used to drive the hydrological model. Third, bias correction of the RCM was done by applying the distribution based scaling (DBS) bias-correction method (Yang et al., 2010) developed at the SMHI. The bias-corrected data then also served as driving input data to the hydrological model. Based on the simulated runoff, as well as simulated precipitation, temperature, and snow pack data, an algorithm to detect rain-on-snow events was applied. Finally, the presence or absence of rain-on-snow events for the three different climate input data, ERA.RCA4.5, DBS corrected ERA.RC4 and observed climate, are evaluated within

  14. Low-Frequency Variability in the Northern Hemisphere Winter: Geographical Distribution, Structure and Time-Scale Dependence.

    Science.gov (United States)

    Kushnir, Yochanan; Wallace, John M.

    1989-10-01

    Low-frequency variability in wintertime 500 mb height is examined, with emphasis on its structure, geographical distribution, and frequency dependence. A 39-year record of 500 mb geopotential height fields from the NMC analyses is time filtered to partition the fluctuations into frequency bands corresponding to periods of 10-60 days, 60-180 days and > 180 days. Winter is defined as the six month period November through April. Variance, teleconnectivity, and anisotropy fields, and selected loading vectors derived from orthogonal and oblique rotations of the eigenvectors of the temporal correlation matrix for each band are shown and discussed. The variability in all frequency bands exhibits substantial anisotropy, with meridionally elongated features arranged as zonally oriented wave trains prevailing over the continents and zonally elongated features organized in the form of north-south oriented dipole patterns prevailing over the oceanic sectors of the hemisphere. The wave trains are most pronounced in the 10-60 day variability, while the dipoles are most pronounced at lower frequencies. Eastward energy dispersion is apparent in the wave trains, but there is no evidence of phase propagation. Most of the 'teleconnection patterns' identified in previous studies appear among the more prominent loading vectors. However, in most cases the loading vectors occur in pairs, in which the two patterns are in spatial quadrature with one another and account for comparable fractions of the hemispherically integrated variance. It is argued that such patterns should be interpreted as basis functions that can be linearly combined to form a continuum of anisotropic structures. Evidence of the existence of discrete 'modal structures' is found only in the interannual (> 180-day period) variability, where two patterns stand out clearly above the background continuum: the Pacific-North American (PNA) pattern and the North Atlantic Oscillation (NAO). These patterns leave clear imprints upon

  15. Spatial and temporal variability of Arctic summer sea-ice albedo and its dependence on meltwater hydraulics

    Science.gov (United States)

    Eicken, H.; Perovich, D. K.; Grenfell, T. C.; Richter-Menge, J. A.; Frey, K.

    2001-12-01

    Next to ice extent and thickness, the area-averaged albedo of the summer sea-ice cover is a key parameter in determining the large-scale heat exchange over the Arctic Ocean. Various remote sensing applications have yielded a substantial database for the former two parameters, not least due to the efforts of the National Snow and Ice Data Center (NSIDC) over the past 25 years. In contrast, the spatial and temporal variability of Arctic summer sea-ice albedo is much less well described. Despite its importance (incl. for ice-albedo feedback processes), few if any large-scale sea-ice and global circulation models actually predict summer ice albedo based on the underlying physical processes. Most models employ simple parameterization schemes instead. Remote sensing of surface ice albedo also faces substantial challenges, some of which still need to be addressed in more detail. Here, we report on albedo measurements completed over first- and multi-year sea ice in the summers of 1998, 2000 and 2001 in the North American Arctic at the SHEBA drifting ice camp and in fast ice near Barrow, Alaska. As has been established in a number of studies, spatial and temporal variability in summer sea-ice albedo is mostly determined by the areal extent of meltwater ponding at the ice surface. Given the importance of this process, a comprehensive ice hydrological program (meltwater distribution, surface topography, meltwater flow and discharge, ice permeability) has been carried out in conjunction with the optical measurements. Measurements demonstrate that Arctic summer sea-ice albedo is critically dependent on the hydrology of surface melt ponds, as controlled by meltwater production rate, ice permeability and topography. Both the remarkable short-term variability (a reduction of albedo by 43% within two days) and the seasonal evolution of the pond fraction and hence area-averaged albedo are forced by changes in pond water level on the order of a few centimeters. While some of these forcing

  16. Implementation of a multi-variable regression analysis in the assessment of the generation rate and composition of hospital solid waste for the design of a sustainable management system in developing countries.

    Science.gov (United States)

    Al-Khatib, Issam A; Abu Fkhidah, Ismail; Khatib, Jumana I; Kontogianni, Stamatia

    2016-03-01

    Forecasting of hospital solid waste generation is a critical challenge for future planning. The proposed methodology was applied to the composition and generation rate of hospital solid waste in hospital units in order to validate the results and secure the outcomes of the management plan in national hospitals. A set of three multiple-variable regression models has been derived for estimating the daily total hospital waste, general hospital waste, and total hazardous waste as a function of number of inpatients, number of total patients, and number of beds. The application of several key indicators and validation procedures indicates the high significance and reliability of the developed models in predicting the hospital solid waste of any hospital. Methodology data were drawn from the existing scientific literature. Useful raw data were also retrieved from international organisations and the investigated hospitals' personnel. The primary generation outcomes are compared with other local hospitals and also with hospitals from other countries. The main outcome, the results of the developed models, is presented and analysed thoroughly. The goal is for this model to act as leverage in the discussions among governmental authorities on the implementation of a national plan for safe hospital waste management in Palestine.

  17. A Dirty Model for Multiple Sparse Regression

    CERN Document Server

    Jalali, Ali; Sanghavi, Sujay

    2011-01-01

    Sparse linear regression -- finding an unknown vector from linear measurements -- is now known to be possible with fewer samples than variables, via methods like the LASSO. We consider the multiple sparse linear regression problem, where several related vectors -- with partially shared support sets -- have to be recovered. A natural question in this setting is whether one can use the sharing to further decrease the overall number of samples required. A line of recent research has studied the use of \ell_1/\ell_q norm block-regularizations with q>1 for such problems; however these could actually perform worse in sample complexity -- vis-à-vis solving each problem separately ignoring sharing -- depending on the level of sharing. We present a new method for multiple sparse linear regression that can leverage support and parameter overlap when it exists, but not pay a penalty when it does not. The idea is very simple: we decompose the parameters into two components and regularize these differently. We show both theore...
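    The decomposition idea in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say which norms are used, so the sketch assumes a row-wise \ell_1/\ell_\infty block penalty on the shared component and a plain \ell_1 penalty on the task-specific component. The names `dirty_penalty`, `dirty_objective`, `lam_b` and `lam_s` are hypothetical.

```python
import numpy as np

def dirty_penalty(B, S, lam_b, lam_s):
    """Penalty for a parameter matrix Theta = B + S (rows: features, columns: tasks).

    Assumption: B is the component shared across tasks, S the task-specific one.
    """
    # Block term: per feature (row), largest magnitude across tasks, summed over rows.
    block = np.abs(B).max(axis=1).sum()
    # Elementwise term: plain l1 norm of the task-specific component.
    sparse = np.abs(S).sum()
    return lam_b * block + lam_s * sparse

def dirty_objective(X_list, y_list, B, S, lam_b, lam_s):
    """Sum of per-task least-squares losses plus the two-part penalty."""
    theta = B + S
    loss = 0.0
    for k, (X, y) in enumerate(zip(X_list, y_list)):
        resid = y - X @ theta[:, k]
        loss += resid @ resid / (2 * len(y))
    return loss + dirty_penalty(B, S, lam_b, lam_s)
```

    A feature whose coefficients are similar in every task is cheap to place in B (it is charged once per row), while a coefficient that appears in a single task is cheaper to place in S.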

  18. VARIABLE STEP-SIZE IMPLICIT-EXPLICIT LINEAR MULTISTEP METHODS FOR TIME-DEPENDENT PARTIAL DIFFERENTIAL EQUATIONS

    Institute of Scientific and Technical Information of China (English)

    Dong Wang; Steven J. Ruuth

    2008-01-01

    Implicit-explicit (IMEX) linear multistep methods are popular techniques for solving partial differential equations (PDEs) with terms of different types. While fixed time-step versions of such schemes have been developed and studied, implicit-explicit schemes also naturally arise in general situations where the temporal smoothness of the solution changes. In this paper we consider easily implementable variable step-size implicit-explicit (VSIMEX) linear multistep methods for time-dependent PDEs. Families of order-p, p-step VSIMEX schemes are constructed and analyzed, where p ranges from 1 to 4. The corresponding schemes are simple to implement and have the property that they reduce to the classical IMEX schemes whenever constant time step-sizes are imposed. The methods are validated on the Burgers' equation. These results demonstrate that by varying the time step-size, VSIMEX methods can outperform their fixed time step counterparts while still maintaining good numerical behavior.

  19. A numerical study of comparison of two one-state-variable, rate- and state-dependent friction evolution laws

    Institute of Scientific and Technical Information of China (English)

    Jeen-Hwa Wang

    2009-01-01

    The two one-state-variable, rate- and state-dependent friction laws, i.e., the slip and slowness laws, are compared on the basis of the dynamical behavior of a one-degree-of-freedom spring-slider model through numerical simulations. Results show that two (normalized) model parameters, i.e., △ (the normalized characteristic slip distance) and β-α (the difference in two normalized parameters of friction laws), control the solutions. From given values of △, β, and α, for the slowness law, the solution exists and the unique non-zero fixed point is stable when △>(β-α), yet not when △<(β-α). For the slip law, the solution exists for large ranges of model parameters and the number and stability of the non-zero fixed points change from one case to another. Results suggest that the slip law is more appropriate for controlling earthquake dynamics than the slowness law.

  20. Short-time Variability of Blazars via Non-linear, Time-dependent Synchrotron-Self Compton Radiative Losses

    CERN Document Server

    Röken, Christian; Schöneberg, Sebastian; Schuppan, Florian

    2016-01-01

    A leptonic one-zone model accounting for the radiation emission of blazars is presented. This model describes multiple successive injections of mono-energetic, ultra-relativistic, interacting electron populations, which are subjected to synchrotron and synchrotron-self Compton radiative losses. The electron number density is computed analytically by solving a time-dependent, relativistic transport equation. Moreover, the synchrotron and synchrotron-self Compton intensities as well as the corresponding total fluences are explicitly calculated. The lightcurves and fluences are plotted for realistic parameter values, showing that the model can simultaneously explain both the specific short-time variability in the flaring of blazars and the characteristic broad-band fluence behavior.

  1. High interindividual variability in dose-dependent reduction in speed of movement after exposing C. elegans to shock waves

    Directory of Open Access Journals (Sweden)

    Nicholas Baker Angstman

    2015-02-01

    In blast-related mild traumatic brain injury (br-mTBI), little is known about the connections between initial trauma and expression of individual clinical symptoms. Partly due to limitations of current in vitro and in vivo models of br-mTBI, reliable prediction of individual short- and long-term symptoms based on known blast input has not yet been possible. Here we demonstrate a dose-dependent effect of shock wave exposure on C. elegans using shock waves that share physical characteristics with those hypothesized to induce br-mTBI in humans. Increased exposure to shock waves resulted in decreased mean speed of movement while increasing the proportion of worms rendered paralyzed. Recovery of these two behavioral symptoms was observed during increasing post-traumatic waiting periods. Although effects were observed on a population-wide basis, large interindividual variability was present between organisms exposed to the same highly controlled conditions. Reduction of cavitation by exposing worms to shock waves in polyvinyl alcohol resulted in reduced effect, implicating primary blast effects as damaging components in shock wave induced trauma. Growing worms on NGM agar plates led to the same general results in initial shock wave effect in a standard medium, namely dose-dependence and high interindividual variability, as raising worms in liquid cultures. Taken together, these data indicate that reliable prediction of individual clinical symptoms based on known blast input as well as drawing conclusions on blast input from individual clinical symptoms is not feasible in br-mTBI.

  2. Effects of Shear Dependent Viscosity and Variable Thermal Conductivity on the Flow and Heat Transfer in a Slurry

    Directory of Open Access Journals (Sweden)

    Ling Miao

    2015-10-01

    In this paper we study the effects of variable viscosity and thermal conductivity on the heat transfer in the pressure-driven fully developed flow of a slurry (suspension) between two horizontal flat plates. The fluid is assumed to be described by a constitutive relation for a generalized second grade fluid where the shear viscosity is a function of the shear rate, temperature and concentration. The heat flux vector for the slurry is assumed to follow a generalized form of Fourier’s equation where the thermal conductivity k depends on the temperature as well as the shear rate. We numerically solve the governing equations of motion in non-dimensional form and perform a parametric study to see the effects of various dimensionless numbers on the velocity, volume fraction and temperature profiles. The different cases of shear thinning and thickening, and the effect of the exponent in the Reynolds viscosity model, for the temperature variation in viscosity, are also considered. The results indicate that the variable thermal conductivity can play an important role in controlling the temperature variation in the flow.

  3. Flexible survival regression modelling

    DEFF Research Database (Denmark)

    Cortese, Giuliana; Scheike, Thomas H; Martinussen, Torben

    2009-01-01

    Regression analysis of survival data, and more generally event history data, is typically based on Cox's regression model. We here review some recent methodology, focusing on the limitations of Cox's regression model. The key limitation is that the model is not well suited to represent time-varyi...

  4. A comparison of regression and regression-kriging for soil characterization using remote sensing imagery

    Science.gov (United States)

    In precision agriculture, regression has been used widely to quantify the relationship between soil attributes and other environmental variables. However, spatial correlation existing in soil samples usually makes the regression model suboptimal. In this study, a regression-kriging method was attemp...

  5. State-dependent variability of dynamic functional connectivity between frontoparietal and default networks relates to cognitive flexibility.

    Science.gov (United States)

    Douw, Linda; Wakeman, Daniel G; Tanaka, Naoaki; Liu, Hesheng; Stufflebeam, Steven M

    2016-12-17

    The brain is a dynamic, flexible network that continuously reconfigures. However, the neural underpinnings of how state-dependent variability of dynamic functional connectivity (vdFC) relates to cognitive flexibility are unclear. We therefore investigated flexible functional connectivity during resting-state and task-state functional magnetic resonance imaging (rs-fMRI and t-fMRI, resp.) and performed separate, out-of-scanner neuropsychological testing. We hypothesize that state-dependent vdFC between the frontoparietal network (FPN) and the default mode network (DMN) relates to cognitive flexibility. Seventeen healthy subjects performed the Stroop color word test and underwent t-fMRI (Stroop computerized version) and rs-fMRI. Time series were extracted from a cortical atlas, and a sliding window approach was used to obtain a number of correlation matrices per subject. vdFC was defined as the standard deviation of connectivity strengths over these windows. Higher task-state FPN-DMN vdFC was associated with greater out-of-scanner cognitive flexibility, while the opposite relationship was present for resting-state FPN-DMN vdFC. Moreover, greater contrast between task-state and resting-state vdFC related to better cognitive performance. In conclusion, our results suggest that not only the dynamics of connectivity between these networks is seminal for optimal functioning, but also that the contrast between dynamics across states reflects cognitive performance.
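    The vdFC definition above (standard deviation of connectivity strengths over sliding windows) can be sketched for a single pair of regional time series. This is a minimal illustration, not the authors' pipeline: the function name `vdfc`, the use of Pearson correlation as the connectivity measure, and the `window` and `step` parameters are assumptions.

```python
import numpy as np

def vdfc(x, y, window, step=1):
    """Std. dev. of sliding-window Pearson correlations between two time series."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    corrs = []
    # Slide a fixed-length window along both series and correlate each pair of segments.
    for start in range(0, len(x) - window + 1, step):
        xs = x[start:start + window]
        ys = y[start:start + window]
        corrs.append(np.corrcoef(xs, ys)[0, 1])
    # Variability of connectivity = spread of the windowed correlations.
    return float(np.std(corrs))
```

    Two perfectly coupled series give a vdFC near zero, since every window has the same correlation; series whose coupling waxes and wanes give a larger value. Windows in which either series is constant would yield an undefined correlation and need separate handling.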

  6. Time-dependent simulations of emission from FSRQ PKS1510-089: multiwavelength variability of external Compton and SSC models

    CERN Document Server

    Chen, Xuhui; Liang, Edison; Boettcher, Markus

    2012-01-01

    [abridged] We present results of modeling the SED and multiwavelength variability of the bright FSRQ PKS1510-089 with our time-dependent multizone Monte Carlo/Fokker-Planck code (Chen et al. 2001). As primary source of seed photons for inverse Compton scattering, we consider radiation from the broad line region (BLR), from the molecular torus, and the local synchrotron radiation (SSC). Different scenarios are assessed by comparing simulated light curves and SEDs with one of the best flares by PKS1510-089, in March 2009. The time-dependence of our code and its correct handling of light travel time effects allow us to fully take into account the effect of the finite size of the active region, and in turn to fully exploit the information carried by time resolved observed SEDs, increasingly available since the launch of Fermi. We confirm that the spectrum adopted for the external radiation has an important impact on the modeling of the SED, in particular for the lower energy end of the Compton component, observed...

  7. Logistic regression in estimates of femoral neck fracture by fall

    Directory of Open Access Journals (Sweden)

    Jaroslava Wendlová

    2010-04-01

    Jaroslava Wendlová, Derer’s University Hospital and Policlinic, Osteological Unit, Bratislava, Slovakia. Abstract: The latest methods for estimating the probability (absolute risk) of osteoporotic fractures include several logistic regression models, based on qualitative risk factors plus bone mineral density (BMD), and the probability estimate of fracture in the future. The Slovak logistic regression model, in contrast to other models, is created from quantitative variables of the proximal femur (in International System of Units) and estimates the probability of fracture by fall. Objectives: The first objective of this study was to order selected independent variables according to the intensity of their influence (statistical significance) upon the occurrence of values of the dependent variable: femur strength index (FSI). The second objective was to determine, using logistic regression, whether the odds of FSI acquiring a pathological value (femoral neck fracture by fall) increased or declined if the values of the variables (T-score total hip, BMI, alpha angle, theta angle and HAL) were raised by one unit. Patients and methods: Bone densitometer measurements using dual energy X-ray absorptiometry (DXA) (Prodigy, Primo, GE, USA) of the left proximal femur were obtained from 3 216 East Slovak women with primary or secondary osteoporosis or osteopenia, aged 20–89 years (mean age 58.9; 95% CI: 58.42; 59.38). The following variables were measured: FSI, T-score total hip BMD, body mass index (BMI), as were the geometrical variables of proximal femur alpha angle (α angle), theta angle (θ angle), and hip axis length (HAL). Statistical analysis: Logistic regression was used to measure the influence of the independent variables (T-score total hip, alpha angle, theta angle, HAL, BMI) upon the dependent variable (FSI). Results: The order of independent variables according to the intensity of their influence (greatest to least) upon the occurrence of values of the
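    The second objective, asking how the odds change when a predictor is raised by one unit, rests on a general property of logistic regression: a one-unit increase multiplies the odds by exp(coefficient). The sketch below checks this numerically. The coefficient values are hypothetical and are not taken from the study.

```python
import math

def logistic(z):
    """Probability implied by a log-odds value z."""
    return 1.0 / (1.0 + math.exp(-z))

def odds(p):
    """Odds corresponding to a probability p."""
    return p / (1.0 - p)

def odds_ratio_per_unit(beta):
    """Multiplicative change in odds for a one-unit increase in the predictor."""
    return math.exp(beta)

# Hypothetical intercept and slope, for illustration only.
b0, b1 = -1.0, -0.7
p_at_2 = logistic(b0 + b1 * 2.0)
p_at_3 = logistic(b0 + b1 * 3.0)
ratio = odds(p_at_3) / odds(p_at_2)  # equals exp(b1) regardless of the starting value
```

    A negative coefficient gives an odds ratio below 1 (odds decline as the predictor rises); a positive one gives a ratio above 1.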

  8. Regression for economics

    CERN Document Server

    Naghshpour, Shahdad

    2012-01-01

    Regression analysis is the most commonly used statistical method in the world. Although few would characterize this technique as simple, regression is in fact both simple and elegant. The complexity that many attribute to regression analysis is often a reflection of their lack of familiarity with the language of mathematics. But regression analysis can be understood even without a mastery of sophisticated mathematical concepts. This book provides the foundation and will help demystify regression analysis using examples from economics and with real data to show the applications of the method. T

  9. Clustered regression with unknown clusters

    CERN Document Server

    Barman, Kishor

    2011-01-01

    We consider a collection of prediction experiments, which are clustered in the sense that groups of experiments exhibit similar relationship between the predictor and response variables. The experiment clusters as well as the regression relationships are unknown. The regression relationships define the experiment clusters, and in general, the predictor and response variables may not exhibit any clustering. We call this prediction problem clustered regression with unknown clusters (CRUC) and in this paper we focus on linear regression. We study and compare several methods for CRUC, demonstrate their applicability to the Yahoo Learning-to-rank Challenge (YLRC) dataset, and investigate an associated mathematical model. CRUC is at the crossroads of many prior works and we study several prediction algorithms with diverse origins: an adaptation of the expectation-maximization algorithm, an approach inspired by K-means clustering, the singular value thresholding approach to matrix rank minimization u...

  10. Gender effects in gaming research: a case for regression residuals?

    Science.gov (United States)

    Pfister, Roland

    2011-10-01

    Numerous recent studies have examined the impact of video gaming on various dependent variables, including the players' affective reactions, positive as well as detrimental cognitive effects, and real-world aggression. These target variables are typically analyzed as a function of game characteristics and player attributes, especially gender. However, findings on the uneven distribution of gaming experience between males and females, on the one hand, and the effect of gaming experience on several target variables, on the other hand, point to a possible confound when gaming experiments are analyzed with a standard analysis of variance. This study uses simulated data to exemplify analysis of regression residuals as a potentially beneficial data analysis strategy for such datasets. As the actual impact of gaming experience on each of the various dependent variables differs, the ultimate benefits of analysis of regression residuals entirely depend on the research question, but it offers a powerful statistical approach to video game research whenever gaming experience is a confounding factor.
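    The confound described in this abstract can be illustrated with simulated data, in the spirit of the study but not reproducing it. In the sketch below all numbers are invented: gaming experience differs between two groups, and the dependent variable depends on experience only. Regressing the dependent variable on experience and comparing the residuals removes the spurious group gap. The helper name `residualize` is hypothetical.

```python
import numpy as np

def residualize(dv, covariate):
    """Residuals of a simple linear regression of dv on covariate."""
    slope, intercept = np.polyfit(covariate, dv, 1)
    return dv - (intercept + slope * covariate)

rng = np.random.default_rng(0)
n = 2000
# Invented scenario: one group has more gaming experience on average.
experience = np.concatenate([rng.normal(10, 2, n), rng.normal(5, 2, n)])
is_male = np.concatenate([np.ones(n, dtype=bool), np.zeros(n, dtype=bool)])
# The outcome depends on experience only, not on gender.
dv = 2.0 * experience + rng.normal(0, 1, 2 * n)

raw_gap = dv[is_male].mean() - dv[~is_male].mean()
resid = residualize(dv, experience)
resid_gap = resid[is_male].mean() - resid[~is_male].mean()
```

    Here `raw_gap` is large even though gender has no effect, while `resid_gap` is close to zero. As the abstract notes, whether residual analysis is appropriate depends on the research question.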

  11. Coefficient shifts in geographical ecology: an empirical evaluation of spatial and non-spatial regression

    OpenAIRE

    Bini, L. Mauricio; Diniz-Filho, J. Alexandre F.; Rangel, Thiago F. L. V. B.; Akre, Thomas S. B.; Albaladejo, Rafael G.; Albuquerque, Fabio S.; Aparicio, Abelardo; Araújo, Miguel B.; Baselga, Andrés; Beck, Jan; Bellocq, M. Isabel; Böhning-Gaese, Katrin; Paulo A V Borges; Castro-Parga, Isabel; Chey, Vun Khen

    2009-01-01

    A major focus of geographical ecology and macroecology is to understand the causes of spatially structured ecological patterns. However, achieving this understanding can be complicated when using multiple regression, because the relative importance of explanatory variables, as measured by regression coefficients, can shift depending on whether spatially explicit or non-spatial modeling is used. However, the extent to which coefficients may shift and why shifts occur are unclear. H...

  12. GIS-Based Analytical Tools for Transport Planning: Spatial Regression Models for Transportation Demand Forecast

    Directory of Open Access Journals (Sweden)

    Simone Becker Lopes

    2014-04-01

    Considering the importance of spatial issues in transport planning, the main objective of this study was to analyze the results obtained from different approaches to spatial regression modeling. In the case of spatial autocorrelation, spatial dependence patterns should be incorporated in the models, since that dependence may affect the predictive power of these models. The results obtained with the spatial regression models were also compared with the results of a multiple linear regression model that is typically used in trip generation estimations. The findings support the hypothesis that the inclusion of spatial effects in regression models is important, since the best results were obtained with alternative models (spatial regression models or the ones with spatial variables included). This was observed in a case study carried out in the city of Porto Alegre, in the state of Rio Grande do Sul, Brazil, in the stages of specification and calibration of the models, with two distinct datasets.

  13. Spatial variability of a Planosol and soybean yield on a land-leveled paddy soil: geostatistical and regression analysis (Variabilidade espacial de Planossolo e produtividade de soja em várzea sistematizada: análise geoestatística e análise de regressão)

    Directory of Open Access Journals (Sweden)

    José Miguel Reichert

    2008-08-01

    Spatially defined soil and plant attributes contribute to the planning of commercial fields and the siting of experiments. The objective of this work was to study the spatial variability of some physical and chemical soil attributes and their relationship with soybean yield on a land-leveled lowland soil. The experiment was conducted in the 2000 agricultural year in the experimental area of the Soils Department of the Universidade Federal de Santa Maria, Santa Maria, RS, on a typical dystrophic Hydromorphic Planosol. An area of 160 x 88 m was sampled on an 8 x 8 m grid, totaling 240 points. Soybean yield and plant height, and chemical and physical attributes of the surface (0 to 0.15 m) and subsurface (0.15 to 0.30 m) soil were evaluated. For the soil attributes of the 0-0.15 m layer that were correlated with plant attributes, geostatistical techniques were used, with the spatial dependence of the attributes assessed by scaled semivariograms. With the exception of particle density and degree of flocculation, all other variables showed moderate (0.64 to 0.75) to strong (>0.75) spatial dependence. The spatial variability of the physical and chemical soil attributes affected soybean yield. The soil attributes analyzed were divided, according to range and semivariogram model, into two groups: one with an exponential model and range less than 40 m, and the other with a Gaussian model and range less than 67 m. The spatial variability of soybean yield was described by the Gaussian model with range less than 45 m.
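    The exponential and Gaussian semivariogram models named in this abstract can be sketched as below. The abstract gives no formulas, so the parameterization is an assumption: both functions use the common "practical range" convention, in which the model reaches about 95% of the partial sill at the stated range. The function and parameter names are hypothetical.

```python
import numpy as np

def exponential_semivariogram(h, nugget, partial_sill, practical_range):
    """Exponential model: rises steeply near the origin, approaches the sill asymptotically."""
    h = np.asarray(h, dtype=float)
    return nugget + partial_sill * (1.0 - np.exp(-3.0 * h / practical_range))

def gaussian_semivariogram(h, nugget, partial_sill, practical_range):
    """Gaussian model: parabolic (very smooth) behaviour near the origin."""
    h = np.asarray(h, dtype=float)
    return nugget + partial_sill * (1.0 - np.exp(-3.0 * (h / practical_range) ** 2))

# Example lags in metres; the 40 m range echoes the abstract, other values are invented.
lags = np.array([0.0, 8.0, 16.0, 40.0])
gamma_exp = exponential_semivariogram(lags, nugget=0.1, partial_sill=0.9, practical_range=40.0)
```

    The value returned at lag 0 is the nugget, i.e. the limit of the model as the lag approaches zero.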

  14. GAUSSIAN COPULA MARGINAL REGRESSION FOR MODELING EXTREME DATA WITH APPLICATION

    Directory of Open Access Journals (Sweden)

    Sutikno

    2014-01-01

    Regression is commonly used to determine the relationship between the response variable and the predictor variable, where the parameters are estimated by Ordinary Least Squares (OLS). This method can be used under the assumption that the residuals are normally distributed, N(0, σ²). However, the normality assumption is often violated due to extreme observations, which are often found in climate data. Modeling rice harvested area with rainfall predictor variables involves such extreme observations. Therefore, another approach is needed to handle the presence of extreme observations. The method used to solve this problem is Gaussian Copula Marginal Regression (GCMR), a copula-based regression. As a case study, the method is applied to model the rice harvested area of rice production centers in East Java, Indonesia, covering the districts of Banyuwangi, Lamongan, Bojonegoro, Ngawi and Jember. The copula is chosen because this method is not strict about distributional assumptions, especially normality. Moreover, this method can describe dependency at extreme points clearly. The GCMR performance is compared with OLS and Generalized Linear Models (GLM). The identification of the dependency structure between the Rice Harvest per period (RH) and monthly rainfall showed a dependency in all study areas. The tests show that the copula type mostly follows the Gumbel distribution. The comparison of model goodness of fit for rice harvested area showed that GCMR was the appropriate method for RH1 in the five districts and for RH2 in Jember district, since it gave the lowest AICc. Looking at the distribution pattern of the response variables, it can be concluded that GCMR is good for modeling response variables that are not normally distributed and tend to have a large skew.

  15. Modelling the interannual variability of extreme wave climate combining a time-dependent GEV model and Self-Organizing Maps

    Science.gov (United States)

    Izaguirre, Cristina; Mendez, Fernando J.; Camus, Paula; Minguez, Roberto; Menendez, Melisa; Losada, Iñigo J.

    2010-05-01

    It is well known that the seasonal-to-interannual variability of extreme wave climate is linked to the anomalies of the atmospheric circulation. In this work, we analyze the relationships between extreme significant wave height at a particular site and the synoptic-scale weather type. We combine a time-dependent Generalized Extreme Value (GEV) model for monthly maxima and self-organizing maps (SOM) applied to monthly mean sea level pressure field (SLP) anomalies. These time-varying SLP anomalies are encoded using principal component analysis, obtaining the corresponding spatial patterns (Empirical Orthogonal Functions, EOFs) and the temporal modes (principal components, PCs). The location, scale and shape parameters of the GEV distribution are parameterized in terms of harmonic functions (seasonality) and linear covariates for the PCs (interannual variability) and the model is fitted using standard likelihood theory and an automatic parameter selection procedure, which avoids overparameterization. The resulting anomalies of the location and scale parameters with respect to the seasonality are then projected onto the SOM lattice, obtaining the influence of every weather type on the extreme wave height probability distribution (and subsequently, return-level quantiles). The use of self-organizing maps allows an easy visualization of the results. The application of the method to different areas in the North Atlantic Ocean helps us to quantify the importance of the North Atlantic Oscillation and the East Atlantic pattern in the location and scale parameters of the GEV probability distribution. Additionally, this work opens new forecasting possibilities for the probabilities of extreme events based on synoptic-scale patterns.
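    The parameterization described above, with harmonic terms for seasonality plus linear covariates for the PCs, can be sketched for one GEV parameter together with the standard GEV return-level formula. This is an illustration under assumptions: a single annual harmonic is used, the shape parameter follows the sign convention in which positive values give a heavy upper tail, and all function names and coefficients are hypothetical.

```python
import numpy as np

def harmonic_plus_covariates(t_years, beta0, beta_cos, beta_sin, pcs, beta_pc):
    """Time-dependent GEV parameter: mean + annual harmonic + linear PC covariates.

    t_years: times in years, shape (n,); pcs: PC time series, shape (n, K).
    """
    t = np.asarray(t_years, dtype=float)
    seasonal = beta_cos * np.cos(2 * np.pi * t) + beta_sin * np.sin(2 * np.pi * t)
    interannual = np.asarray(pcs, dtype=float) @ np.asarray(beta_pc, dtype=float)
    return beta0 + seasonal + interannual

def gev_return_level(q, mu, sigma, xi):
    """Quantile with non-exceedance probability q of a GEV(mu, sigma, xi); xi is a scalar."""
    y = -np.log(q)
    if abs(xi) < 1e-12:
        # Gumbel limit of the GEV as the shape parameter tends to zero.
        return mu - sigma * np.log(y)
    return mu - (sigma / xi) * (1.0 - y ** (-xi))
```

    Evaluating `harmonic_plus_covariates` month by month and passing the result to `gev_return_level` gives quantiles that vary with both season and the PC values.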

  16. A Cautionary Tale on the Inclusion of Variable Posttranslational Modifications in Database-Dependent Searches of Mass Spectrometry Data.

    Science.gov (United States)

    Svozil, J; Baerenfaller, K

    2017-01-01

    Mass spectrometry-based proteomics allows in principle the identification of unknown target proteins of posttranslational modifications and the sites of attachment. Including a variety of posttranslational modifications in database-dependent searches of high-throughput mass spectrometry data holds the promise to gain spectrum assignments to modified peptides, thereby increasing the number of assigned spectra, and to identify potentially interesting modification events. However, these potential benefits come at the price of an increased search space, which can lead to reduced scores, increased score thresholds, and erroneous peptide spectrum matches. We have assessed here the advantages and disadvantages of including the variable posttranslational modifications methionine oxidation, protein N-terminal acetylation, cysteine carbamidomethylation, transformation of N-terminal glutamine to pyroglutamic acid (Gln→pyro-Glu), and deamidation of asparagine and glutamine. Based on calculations of local false discovery rates and comparisons to known features of the respective modifications, we recommend that searches of samples that were not enriched for specific posttranslational modifications include only methionine oxidation, protein N-terminal acetylation, and peptide N-terminal Gln→pyro-Glu as variable modifications. The principle of the validation strategy adopted here can also be applied for assessing the inclusion of posttranslational modifications for differently prepared samples, or for additional modifications. In addition, we have reassessed the special properties of the ubiquitin footprint, which is the remainder of ubiquitin moieties attached to lysines after tryptic digest. We show here that the ubiquitin footprint often breaks off as neutral loss and that it can be distinguished from dicarbamidomethylation events.

  17. Mapping geogenic radon potential by regression kriging.

    Science.gov (United States)

    Pásztor, László; Szabó, Katalin Zsuzsanna; Szatmári, Gábor; Laborczi, Annamária; Horváth, Ákos

    2016-02-15

    Radon (²²²Rn) gas is produced in the radioactive decay chain of uranium (²³⁸U), an element that is naturally present in soils. Radon is transported mainly by diffusion and convection through the soil, depending mainly on the physical and meteorological parameters of the soil, and can enter and accumulate in buildings. Health risks originating from indoor radon concentration can be attributed to natural factors and are characterized by geogenic radon potential (GRP). Identification of areas with high health risks requires spatial modeling, that is, mapping of radon risk. In addition to geology and meteorology, physical soil properties play a significant role in the determination of GRP. In order to compile a reliable GRP map for a model area in Central Hungary, spatial auxiliary information representing GRP-forming environmental factors was taken into account to support the spatial inference of the locally measured GRP values. Since the number of measured sites was limited, efficient spatial prediction methodologies were sought to construct a reliable map for a larger area. Regression kriging (RK) was applied for the interpolation using spatially exhaustive auxiliary data on soil, geology, topography, land use and climate. RK divides the spatial inference into two parts. First, the deterministic component of the target variable is determined by a regression model. The residuals of the multiple linear regression analysis represent the spatially varying but dependent stochastic component, which is interpolated by kriging. The final map is the sum of the two component predictions. Overall accuracy of the map was tested by Leave-One-Out Cross-Validation. Furthermore, the spatial reliability of the resultant map is also estimated by calculating the 90% prediction interval of the local prediction values. The applicability of the applied method as well as that of the map is discussed briefly.
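
    The two-part split described in this abstract (a regression trend plus kriged residuals) can be sketched as follows. This is only an illustration, not the authors' workflow: it assumes simple kriging of the zero-mean residuals with an exponential covariance whose sill and range are fixed by hand rather than fitted to a variogram, and the names regression_kriging and exp_cov are hypothetical.

```python
import numpy as np

def exp_cov(dist, sill=1.0, corr_range=1.0):
    # Assumed exponential covariance model; a real study would fit a variogram.
    return sill * np.exp(-dist / corr_range)

def regression_kriging(coords, X, y, coords_new, X_new,
                       sill=1.0, corr_range=1.0, nugget=1e-8):
    # Part 1: deterministic component from a multiple linear regression.
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta

    # Part 2: simple kriging of the (zero-mean) regression residuals.
    d_obs = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    C = exp_cov(d_obs, sill, corr_range) + nugget * np.eye(len(y))
    d_new = np.linalg.norm(coords_new[:, None, :] - coords[None, :, :], axis=-1)
    c0 = exp_cov(d_new, sill, corr_range)
    weights = np.linalg.solve(C, c0.T)      # one column of weights per new site
    resid_pred = weights.T @ resid

    # Final prediction: sum of the two component predictions.
    A_new = np.column_stack([np.ones(len(X_new)), X_new])
    return A_new @ beta + resid_pred
```

    With a negligible nugget the sketch interpolates exactly, so predicting at the measured sites returns the measured values; accuracy at unmeasured sites would have to be checked separately, for example by leave-one-out cross-validation as in the abstract.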

  18. Earthquake cycles on rate-state faults: how does recurrence interval and its variability depend on fault length?

    Science.gov (United States)

    Cattania, C.; Segall, P.

    2016-12-01

    The concept of earthquake cycles is often invoked when discussing seismic risk. However, large faults exhibit more complex behavior than periodic stick-slip cycles. Some events, such as the 2004 Parkfield earthquake, are delayed relative to the mean recurrence interval; in other cases, ruptures are larger or smaller than expected. In contrast, small earthquakes can be very predictable: locked patches surrounded by aseismic creep can rupture periodically in events with similar waveforms. We use numerical tools and ideas from fracture mechanics to study the factors determining recurrence interval (T), rupture size and their variability at different scales. T has been estimated by assuming a constant stress drop and stressing rate inversely proportional to fault length (D). However, Werner & Rubin (2013) found that an energy criterion better explains the scaling of T vs. D in numerical models: on faults loaded from below, full ruptures occur when the elastic energy release rate at the top of the fault reaches the fracture energy. We run simulations of seismic cycles on rate state faults including dynamic weakening from thermal pressurization. A fault composed of a velocity weakening part over a velocity strengthening one is loaded from below at constant slip rate. We find that T increases with thermal pressurization, and verify that the energy argument, modified to account for the fracture energy from thermal pressurization, provides a good estimate of T and its scaling with D. We suggest that the recurrence interval is determined by two timescales: the time required to accumulate sufficient elastic energy for full rupture (tf), and the nucleation time, controlled by the propagation of a creep front into the velocity weakening region (tn). Both timescales depend on fault length: tf increases with D, and tn decreases. The latter is due to faster afterslip in the velocity strengthening region on larger faults. If tn < tf, partial ruptures occur; for large faults, tn

  19. Autistic epileptiform regression.

    Science.gov (United States)

    Canitano, Roberto; Zappella, Michele

    2006-01-01

    Autistic regression is a well-known condition that occurs in one third of children with pervasive developmental disorders, who, after normal development in the first year of life, undergo a global regression during the second year that encompasses language, social skills and play. In a portion of these subjects, epileptiform abnormalities are present with or without seizures, resembling, in some respects, other epileptiform regressions of language and behaviour such as Landau-Kleffner syndrome. In these cases, for a more accurate definition of the clinical entity, the term autistic epileptiform regression has been suggested. As in other epileptic syndromes with regression, the relationships between EEG abnormalities, language and behaviour in autism are still unclear. We describe two cases of autistic epileptiform regression selected from a larger group of children with autistic spectrum disorders, with the aim of discussing the clinical features of the condition, the therapeutic approach and the outcome.

  20. Scaled Sparse Linear Regression

    CERN Document Server

    Sun, Tingni

    2011-01-01

    Scaled sparse linear regression jointly estimates the regression coefficients and noise level in a linear model. It chooses an equilibrium with a sparse regression method by iteratively estimating the noise level via the mean residual squares and scaling the penalty in proportion to the estimated noise level. The iterative algorithm costs nearly nothing beyond the computation of a path of the sparse regression estimator for penalty levels above a threshold. For the scaled Lasso, the algorithm is a gradient descent in a convex minimization of a penalized joint loss function for the regression coefficients and noise level. Under mild regularity conditions, we prove that the method yields simultaneously an estimator for the noise level and an estimated coefficient vector in the Lasso path satisfying certain oracle inequalities for the estimation of the noise level, prediction, and the estimation of regression coefficients. These oracle inequalities provide sufficient conditions for the consistency and asymptotic...
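
    The alternation described in this abstract (estimate the noise level from the mean squared residual, then rescale the penalty in proportion to it) can be sketched as follows. This is a minimal illustration rather than the author's implementation: the coordinate-descent Lasso solver, the default base penalty sqrt(2 log(p) / n), and the names soft_threshold, lasso_cd and scaled_lasso are assumptions introduced here.

```python
import numpy as np

def soft_threshold(z, t):
    return np.sign(z) * max(abs(z) - t, 0.0)

def lasso_cd(X, y, lam, beta, n_sweeps=100):
    # Coordinate descent for ||y - X b||^2 / (2n) + lam * ||b||_1 (warm-started).
    n, p = X.shape
    col_sq = (X ** 2).sum(axis=0) / n
    resid = y - X @ beta
    for _ in range(n_sweeps):
        for j in range(p):
            resid += X[:, j] * beta[j]          # drop coordinate j from the fit
            rho = X[:, j] @ resid / n
            beta[j] = soft_threshold(rho, lam) / col_sq[j]
            resid -= X[:, j] * beta[j]          # put the updated coordinate back
    return beta

def scaled_lasso(X, y, lam0=None, n_iter=20, tol=1e-6):
    n, p = X.shape
    if lam0 is None:
        lam0 = np.sqrt(2.0 * np.log(p) / n)     # assumed base penalty level
    beta = np.zeros(p)
    sigma = np.std(y)                           # crude starting noise level
    for _ in range(n_iter):
        beta = lasso_cd(X, y, sigma * lam0, beta)
        sigma_new = np.linalg.norm(y - X @ beta) / np.sqrt(n)
        converged = abs(sigma_new - sigma) < tol
        sigma = sigma_new
        if converged:
            break
    return beta, sigma
```

    Each outer pass reuses the previous coefficients as a warm start, which is one way to keep the extra cost over a single Lasso fit small.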

  1. Rolling Regressions with Stata

    OpenAIRE

    Kit Baum

    2004-01-01

    This talk will describe some work underway to add a "rolling regression" capability to Stata's suite of time series features. Although commands such as "statsby" permit analysis of non-overlapping subsamples in the time domain, they are not suited to the analysis of overlapping (e.g. "moving window") samples. Both moving-window and widening-window techniques are often used to judge the stability of time series regression relationships. We will present an implementation of a rolling regression...
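
    The moving-window idea mentioned in this abstract can be sketched outside Stata as follows. This is a hypothetical Python illustration, not the implementation presented in the talk; the function name rolling_ols and the choice of a single regressor plus intercept are assumptions.

```python
import numpy as np

def rolling_ols(x, y, window):
    # Fit y = a + b*x on each overlapping window of fixed length.
    # Row t holds (a, b) for the window ending at observation t;
    # rows before the first full window stay NaN.
    n = len(y)
    coefs = np.full((n, 2), np.nan)
    for end in range(window, n + 1):
        xs = x[end - window:end]
        ys = y[end - window:end]
        A = np.column_stack([np.ones(window), xs])
        coefs[end - 1], *_ = np.linalg.lstsq(A, ys, rcond=None)
    return coefs
```

    A widening-window variant would fix the start of each slice at 0 instead of end - window; drifting coefficients across windows are what signal an unstable relationship.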

  2. Unbiased Quasi-regression

    Institute of Scientific and Technical Information of China (English)

    Guijun YANG; Lu LIN; Runchu ZHANG

    2007-01-01

    Quasi-regression, motivated by problems arising in computer experiments, focuses mainly on speeding up evaluation. However, its theoretical properties have not been explored systematically. This paper shows that quasi-regression is unbiased, strongly convergent and asymptotically normal for parameter estimation, but that it is biased for curve fitting. Furthermore, a new method called unbiased quasi-regression is proposed. In addition to retaining the above asymptotic behavior for parameter estimation, unbiased quasi-regression is unbiased for curve fitting.

  3. Introduction to regression graphics

    CERN Document Server

    Cook, R Dennis

    2009-01-01

    Covers the use of dynamic and interactive computer graphics in linear regression analysis, focusing on analytical graphics. Features new techniques like plot rotation. The authors have composed their own regression code, using Xlisp-Stat language called R-code, which is a nearly complete system for linear regression analysis and can be utilized as the main computer program in a linear regression course. The accompanying disks, for both Macintosh and Windows computers, contain the R-code and Xlisp-Stat. An Instructor's Manual presenting detailed solutions to all the problems in the book is ava

  4. Applied linear regression

    CERN Document Server

    Weisberg, Sanford

    2005-01-01

    Master linear regression techniques with a new edition of a classic text. Reviews of the Second Edition: "I found it enjoyable reading and so full of interesting material that even the well-informed reader will probably find something new . . . a necessity for all of those who do linear regression." -Technometrics, February 1987. "Overall, I feel that the book is a valuable addition to the now considerable list of texts on applied linear regression. It should be a strong contender as the leading text for a first serious course in regression analysis." -American Scientist, May-June 1987

  5. Three Ingredients for Improved Global Aftershock Forecasts: Tectonic Region, Time-Dependent Catalog Incompleteness, and Inter-Sequence Variability

    Science.gov (United States)

    Page, M. T.; Hardebeck, J.; Felzer, K. R.; Michael, A. J.; van der Elst, N.

    2015-12-01

    Following a large earthquake, seismic hazard can be orders of magnitude higher than the long-term average as a result of aftershock triggering. Due to this heightened hazard, there is a demand from emergency managers and the public for rapid, authoritative, and reliable aftershock forecasts. In the past, USGS aftershock forecasts following large, global earthquakes have been released on an ad-hoc basis with inconsistent methods, and in some cases, aftershock parameters adapted from California. To remedy this, we are currently developing an automated aftershock product that will generate more accurate forecasts based on the Reasenberg and Jones (Science, 1989) method. To better capture spatial variations in aftershock productivity and decay, we estimate regional aftershock parameters for sequences within the Garcia et al. (BSSA, 2012) tectonic regions. We find that regional variations for mean aftershock productivity exceed a factor of 10. The Reasenberg and Jones method combines modified-Omori aftershock decay, Utsu productivity scaling, and the Gutenberg-Richter magnitude distribution. We additionally account for a time-dependent magnitude of completeness following large events in the catalog. We generalize the Helmstetter et al. (2005) equation for short-term aftershock incompleteness and solve for incompleteness levels in the global NEIC catalog following large mainshocks. In addition to estimating average sequence parameters within regions, we quantify the inter-sequence parameter variability. This allows for a more complete quantification of the forecast uncertainties and Bayesian updating of the forecast as sequence-specific information becomes available.
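
    The Reasenberg and Jones rate combines the three ingredients named in this abstract into a rate of aftershocks of magnitude at least M at time t after a mainshock of magnitude Mm: rate(t, M) = 10^(a + b(Mm - M)) * (t + c)^(-p). The sketch below integrates that rate over a time window and converts the expected count into a Poisson probability. The function names are hypothetical, and the parameter values in the usage note are illustrative placeholders, not the regional estimates from this study.

```python
import math

def rj_rate(t, mag, mainshock_mag, a, b, c, p):
    # Daily rate of aftershocks with magnitude >= mag at t days after the mainshock.
    return 10 ** (a + b * (mainshock_mag - mag)) * (t + c) ** (-p)

def rj_expected_count(t1, t2, mag, mainshock_mag, a, b, c, p):
    # Closed-form integral of rj_rate over the window [t1, t2] (in days).
    productivity = 10 ** (a + b * (mainshock_mag - mag))
    if abs(p - 1.0) < 1e-12:
        time_integral = math.log((t2 + c) / (t1 + c))
    else:
        time_integral = ((t2 + c) ** (1 - p) - (t1 + c) ** (1 - p)) / (1 - p)
    return productivity * time_integral

def rj_probability(t1, t2, mag, mainshock_mag, a, b, c, p):
    # Probability of at least one such aftershock, assuming Poisson occurrence.
    n_expected = rj_expected_count(t1, t2, mag, mainshock_mag, a, b, c, p)
    return 1.0 - math.exp(-n_expected)
```

    For example, rj_probability(0.0, 7.0, 5.0, 7.0, a=-1.67, b=0.91, c=0.05, p=1.08) gives the one-week probability of an M >= 5 aftershock after an M 7 mainshock under those placeholder parameters; swapping in region-specific values is where the regional estimates described above would enter.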

  6. A numerical model for density-and-viscosity-dependent flows in two-dimensional variably saturated porous media

    Science.gov (United States)

    Boufadel, Michel C.; Suidan, Makram T.; Venosa, Albert D.

    1999-04-01

    We present a formulation for water flow and solute transport in two-dimensional variably saturated media that accounts for the effects of the solute on water density and viscosity. The governing equations are cast in a dimensionless form that depends on six dimensionless groups of parameters. These equations are discretized in space using the Galerkin finite element formulation and integrated in time using the backward Euler scheme with mass lumping. The modified Picard method is used to linearize the water flow equation. The resulting numerical model, the MARUN model, is verified by comparison to published numerical results. It is then used to investigate beach hydraulics at seawater concentration (about 30 g l⁻¹) in the context of nutrient delivery for bioremediation of oil spills on beaches. Numerical simulations that we conducted in a rectangular section of a hypothetical beach revealed that buoyancy in the unsaturated zone is significant in soils that are fine textured, with low anisotropy ratio, and/or exhibiting low physical dispersion. In such situations, application of dissolved nutrients to a contaminated beach in a freshwater solution is superior to their application in a seawater solution. Concentration-engendered viscosity effects were negligible with respect to concentration-engendered density effects for the cases that we considered.

  7. Using XMM-Newton to study the energy dependent variability of H 1743-322 during its 2014 outburst

    CERN Document Server

    Stiele, H

    2016-01-01

    Black hole transients during bright outbursts show distinct changes of their spectral and variability properties as they evolve during an outburst, that are interpreted as evidence for changes in the accretion flow and X-ray emitting regions. We obtained an anticipated XMM-Newton ToO observation of H 1743-322 during its outburst in September 2014. Based on data of eight outbursts observed in the last 10 years we expected to catch the start of the hard-to-soft state transition. The fact that neither the general shape of the observed power density spectrum nor the characteristic frequency show an energy dependence implies that the source still stays in the low-hard state at the time of our observation near outburst peak. The spectral properties agree with the source being in the low-hard state and a Swift/XRT monitoring of the outburst reveals that H 1743-322 stays in the low-hard state during the entire outburst (a.k.a. 'failed outburst'). We derive the averaged QPO waveform and obtain phase-resolved spectra...

  8. A Different Approach to Dependence Analysis.

    Science.gov (United States)

    Ferrari, Pier Alda; Raffinetti, Emanuela

    2015-01-01

    This article focuses on a statistical tool for dependence analysis in scientific research. Starting from a recent index of concordance for a multiple linear regression model, a coefficient suitable for capturing any monotonic dependence relationship between a dependent variable and an independent variable is derived and discussed. Given its interpretation in terms of monotonic dependence, it is called the monotonic dependence coefficient (MDC). It is appropriate in all contexts where the dependent variable is quantitative (continuous or discrete) and the independent variable is at least ordinal; tied data are also allowed. MDC's adequacy is validated through Monte Carlo simulations covering different dependence scenarios. Finally, an application to real data is provided to stress MDC's capability of detecting dependence relationships between two variables, even if some information about the original data is lost.

  9. Logistic regression model and its application

    Institute of Scientific and Technical Information of China (English)

    常振海; 刘薇

    2012-01-01

    To improve the forecasting accuracy for a multi-category qualitative dependent variable using the logistic model, a three-category logistic model is established for actual statistical data on the basis of the binary logistic regression model. The significance of the independent variables is tested with the likelihood ratio test, and non-significant variables are removed. A linear regression function is determined for each category of the dependent variable, and the models are tested. The results show that the logistic regression model has good predictive accuracy and practical value for regression analysis with a qualitative dependent variable.

  10. Morse–Smale Regression

    Energy Technology Data Exchange (ETDEWEB)

    Gerber, Samuel [Univ. of Utah, Salt Lake City, UT (United States); Rubel, Oliver [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Bremer, Peer -Timo [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Pascucci, Valerio [Univ. of Utah, Salt Lake City, UT (United States); Whitaker, Ross T. [Univ. of Utah, Salt Lake City, UT (United States)

    2012-01-19

    This paper introduces a novel partition-based regression approach that incorporates topological information. Partition-based regression typically introduces a quality-of-fit-driven decomposition of the domain. The emphasis in this work is on a topologically meaningful segmentation. Thus, the proposed regression approach is based on a segmentation induced by a discrete approximation of the Morse–Smale complex. This yields a segmentation with partitions corresponding to regions of the function with a single minimum and maximum that are often well approximated by a linear model. This approach yields regression models that are amenable to interpretation and have good predictive capacity. Typically, regression estimates are quantified by their geometrical accuracy. For the proposed regression, an important aspect is the quality of the segmentation itself. Thus, this article introduces a new criterion that measures the topological accuracy of the estimate. The topological accuracy provides a complementary measure to the classical geometrical error measures and is very sensitive to overfitting. The Morse–Smale regression is compared to state-of-the-art approaches in terms of geometry and topology and yields comparable or improved fits in many cases. Finally, a detailed study on climate-simulation data demonstrates the application of the Morse–Smale regression. Supplementary Materials are available online and contain an implementation of the proposed approach in the R package msr, an analysis and simulations on the stability of the Morse–Smale complex approximation, and additional tables for the climate-simulation study.

  11. Applied Regression Modeling A Business Approach

    CERN Document Server

    Pardoe, Iain

    2012-01-01

    An applied and concise treatment of statistical regression techniques for business students and professionals who have little or no background in calculus. Regression analysis is an invaluable statistical methodology in business settings and is vital to model the relationship between a response variable and one or more predictor variables, as well as the prediction of a response value given values of the predictors. In view of the inherent uncertainty of business processes, such as the volatility of consumer spending and the presence of market uncertainty, business professionals use regression a

  12. Regression to Causality

    DEFF Research Database (Denmark)

    Bordacconi, Mats Joe; Larsen, Martin Vinæs

    2014-01-01

    Humans are fundamentally primed for making causal attributions based on correlations. This implies that researchers must be careful to present their results in a manner that inhibits unwarranted causal attribution. In this paper, we present the results of an experiment that suggests regression...... models – one of the primary vehicles for analyzing statistical results in political science – encourage causal interpretation. Specifically, we demonstrate that presenting observational results in a regression model, rather than as a simple comparison of means, makes causal interpretation of the results...... of equivalent results presented as either regression models or as a test of two sample means. Our experiment shows that the subjects who were presented with results as estimates from a regression model were more inclined to interpret these results causally. Our experiment implies that scholars using regression...

  13. Heart Rate Variability for Early Detection of Cardiac Iron Deposition in Patients with Transfusion-Dependent Thalassemia

    Science.gov (United States)

    Silvilairat, Suchaya; Charoenkwan, Pimlak; Saekho, Suwit; Tantiworawit, Adisak; Phrommintikul, Arintaya; Srichairatanakool, Somdet; Chattipakorn, Nipon

    2016-01-01

    Background: Iron overload cardiomyopathy remains the major cause of death in patients with transfusion-dependent thalassemia. Cardiac T2* magnetic resonance imaging is costly yet effective in detecting cardiac iron accumulation in the heart. Heart rate variability (HRV) has been used to evaluate cardiac autonomic function and is depressed in cases of thalassemia. We evaluated whether HRV could be used as an indicator for early identification of cardiac iron deposition. Methods: One hundred and one patients with transfusion-dependent thalassemia were enrolled in this study. The correlations between recorded HRV and hemoglobin, non-transferrin bound iron (NTBI), serum ferritin and cardiac T2* were evaluated. Results: The median age was 18 years (range 8–59 years). The patient group with a 5-year mean serum ferritin >5,000 ng/mL included significantly more homozygous β-thalassemia and splenectomized patients, had lower hemoglobin levels, and had more cardiac iron deposition than all other groups. Anemia strongly influenced all domains of HRV. After adjusting for anemia, neither serum ferritin nor NTBI impacted the HRV. However, cardiac T2* was an independent predictor of HRV, even after adjusting for anemia. In receiver operating characteristic (ROC) curve analysis of cardiac T2* ≤20 ms, only mean ferritin in the last 12 months and the average of the standard deviation of all R-R intervals for all five-minute segments in the 24-hour recording were predictors for cardiac T2* ≤20 ms, with area under the ROC curve of 0.961 (p<0.0001) and 0.701 (p = 0.05), respectively. Conclusions: Hemoglobin and cardiac T2* as significant predictors for HRV indicate that anemia and cardiac iron deposition result in cardiac autonomic imbalance. The mean ferritin in the last 12 months could be useful as the best indicator for further evaluation of cardiac risk. The ability of serum ferritin to predict cardiac risk is stronger than observed in other thalassemia cohorts. HRV might be a

  14. Predicting Social Trust with Binary Logistic Regression

    Science.gov (United States)

    Adwere-Boamah, Joseph; Hufstedler, Shirley

    2015-01-01

    This study used binary logistic regression to predict social trust with five demographic variables from a national sample of adult individuals who participated in The General Social Survey (GSS) in 2012. The five predictor variables were respondents' highest degree earned, race, sex, general happiness and the importance of personally assisting…

  15. Selecting a Regression Saturated by Indicators

    DEFF Research Database (Denmark)

    Hendry, David F.; Johansen, Søren; Santos, Carlos

    We consider selecting a regression model, using a variant of Gets, when there are more variables than observations, in the special case that the variables are impulse dummies (indicators) for every observation. We show that the setting is unproblematic if tackled appropriately, and obtain...

  17. The M Word: Multicollinearity in Multiple Regression.

    Science.gov (United States)

    Morrow-Howell, Nancy

    1994-01-01

    Notes that existence of substantial correlation between two or more independent variables creates problems of multicollinearity in multiple regression. Discusses multicollinearity problem in social work research in which independent variables are usually intercorrelated. Clarifies problems created by multicollinearity, explains detection of…
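
    One common way to detect the problem this abstract describes is the variance inflation factor (VIF): regress each independent variable on all the others and compute 1 / (1 - R²). The sketch below is an illustration of that general diagnostic, not a method taken from the article; the function name and the rule of thumb in the usage note are assumptions.

```python
import numpy as np

def variance_inflation_factors(X):
    # VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing
    # column j on an intercept plus all the other columns.
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    vifs = np.empty(p)
    for j in range(p):
        target = X[:, j]
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ coef
        r_squared = 1.0 - (resid @ resid) / np.sum((target - target.mean()) ** 2)
        vifs[j] = 1.0 / (1.0 - r_squared)
    return vifs
```

    Values near 1 indicate a predictor that is nearly uncorrelated with the rest; values above roughly 5 to 10 are often read as a warning sign, although that threshold is a convention rather than a rule.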

  18. Polynomial Regressions and Nonsense Inference

    Directory of Open Access Journals (Sweden)

    Daniel Ventosa-Santaulària

    2013-11-01

    Polynomial specifications are widely used, not only in applied economics, but also in epidemiology, physics, political analysis and psychology, just to mention a few examples. In many cases, the data employed to estimate such specifications are time series that may exhibit stochastic nonstationary behavior. We extend Phillips’ results (Phillips, P. Understanding spurious regressions in econometrics. J. Econom. 1986, 33, 311–340) by proving that inference drawn from polynomial specifications, under stochastic nonstationarity, is misleading unless the variables cointegrate. We use a generalized polynomial specification as a vehicle to study its asymptotic and finite-sample properties. Our results, therefore, lead to a call to be cautious whenever practitioners estimate polynomial regressions.

  19. Influence of hypoxia and hypercapnia on sleep state-dependent heart rate variability behavior in newborn lambs.

    Science.gov (United States)

    Beuchée, Alain; Hernández, Alfredo I; Duvareille, Charles; Daniel, David; Samson, Nathalie; Pladys, Patrick; Praud, Jean-Paul

    2012-11-01

    Although hypercapnia and/or hypoxia are frequently present during chronic lung disease of infancy and have also been implicated in sudden infant death syndrome (SIDS), their effect on cardiac autonomic regulation remains unclear. The authors' goal is to test whether hypercapnia and hypoxia alter sleep-wake cycle-dependent heart rate variability (HRV) in the neonatal period. Experimental study measuring HRV during sleep states in lambs randomly exposed to hypercapnia, hypoxia, or air. University center for perinatal research in ovines (Sherbrooke, Canada). INSERM-university research unit for signal processing (Rennes, France). Six nonsedated, full-term lambs. Each lamb underwent polysomnographic recordings while in a chamber flushed with either air or 21% O₂ + 5% CO₂ (hypercapnia) or 10% O₂ + 0% CO₂ (hypoxia) on days 3, 4, and 5 of postnatal age. Hypercapnia increased the time spent in wakefulness and hypoxia the time spent in quiet sleep (QS). The state of alertness was the major determinant of HRV characterized with linear or nonlinear methods. Compared with QS, active sleep (AS) was associated with an overall increase in HRV magnitude and short-term self-similarity and a decrease in entropy of cardiac cycle length in air. This AS-related HRV pattern persisted in hypercapnia and was even more pronounced in hypoxia. Enhancement of AS-related sympathovagal coactivation in hypoxia, together with increased heart rate regularity, may be evidence that AS + hypoxia represents a particularly vulnerable state in early life. This should be kept in mind when deciding the optimal arterial oxygenation target in newborns and when investigating the potential involvement of hypoxia in SIDS pathogenesis.

  20. Applied logistic regression

    CERN Document Server

    Hosmer, David W; Sturdivant, Rodney X

    2013-01-01

    A new edition of the definitive guide to logistic regression modeling for health science and other applications. This thoroughly expanded Third Edition provides an easily accessible introduction to the logistic regression (LR) model and highlights the power of this model by examining the relationship between a dichotomous outcome and a set of covariables. Applied Logistic Regression, Third Edition emphasizes applications in the health sciences and handpicks topics that best suit the use of modern statistical software. The book provides readers with state-of-

  1. Applied linear regression

    CERN Document Server

    Weisberg, Sanford

    2013-01-01

    Praise for the Third Edition: "...this is an excellent book which could easily be used as a course text..." -International Statistical Institute. The Fourth Edition of Applied Linear Regression provides a thorough update of the basic theory and methodology of linear regression modeling. Demonstrating the practical applications of linear regression analysis techniques, the Fourth Edition uses interesting, real-world exercises and examples. Stressing central concepts such as model building, understanding parameters, assessing fit and reliability, and drawing conclusions, the new edition illus

  2. Spatial Quantile Regression In Analysis Of Healthy Life Years In The European Union Countries

    Directory of Open Access Journals (Sweden)

    Trzpiot Grażyna

    2016-12-01

    The paper investigates the impact of the selected factors on the healthy life years of men and women in the EU countries. The multiple quantile spatial autoregression models are used in order to account for substantial differences in the healthy life years and life quality across the EU members. Quantile regression allows studying dependencies between variables in different quantiles of the response distribution. Moreover, this statistical tool is robust against violations of the classical regression assumption about the distribution of the error term. Parameters of the models were estimated using the instrumental variable method (Kim, Muller 2004), whereas the confidence intervals and p-values were bootstrapped.
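
    Quantile regression, as used in this abstract, rests on the asymmetric "pinball" (check) loss. The sketch below shows only that building block for an intercept-only model, where the minimizer is a sample quantile; it is not the spatial instrumental-variable estimator of the paper, and the function names are hypothetical.

```python
import numpy as np

def pinball_loss(y, prediction, tau):
    # Check loss: under-prediction costs tau, over-prediction costs (1 - tau).
    resid = np.asarray(y, dtype=float) - prediction
    return np.mean(np.maximum(tau * resid, (tau - 1.0) * resid))

def constant_quantile_fit(y, tau):
    # Intercept-only quantile regression: the loss is piecewise linear,
    # so it is enough to search over the observed values.
    candidates = np.sort(np.asarray(y, dtype=float))
    losses = [pinball_loss(y, q, tau) for q in candidates]
    return candidates[int(np.argmin(losses))]
```

    Because the loss depends on the sign and size of each residual but makes no distributional assumption about the errors, the same objective can be minimized at several values of tau to compare effects across the response distribution.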

  3. PREDICTION OF EUCALYPTUS WOOD BY COKRIGING, KRIGING AND REGRESSION

    Directory of Open Access Journals (Sweden)

    Wellington Jorge Cavalcanti Lundgren

    2015-06-01

    In the Gypsum Pole of Araripe, in the semiarid zone of Pernambuco, which produces 97% of the plaster consumed in Brazil, a forest experiment with 1875 eucalyptus trees was felled and all the trees were rigorously cubed by the Smalian method. The location of each tree was marked on a Cartesian plane, and a sample of 200 trees was drawn by an entirely random process. For the 200 sample trees, three methods of estimating timber volume were used: regression analysis, kriging and cokriging. For cokriging, the secondary variable was the DBH (diameter at breast height); the regression used the Spurr (combined variable) model, which has two explanatory variables: total tree height (H) and DBH. The variables volume and DBH showed spatial dependency. To compare the methods, the coefficient of determination (R²) and the distribution of the residuals (real vs. estimated data) were used. The best results were achieved with the Spurr equation, with R² = 0.82 and an estimated total volume of 166.25 m³. Cokriging gave R² = 0.72 with an estimated total volume of 164.14 m³, and kriging gave R² = 0.32 with an estimated total volume of 163.21 m³. The real volume of the experiment was 166.14 m³. Key words: Forest inventory, Volume of timber, Geostatistics.

  4. Exploring nonlinear relations: models of clinical decision making by regression with optimal scaling.

    Science.gov (United States)

    Hartmann, Armin; Van Der Kooij, Anita J; Zeeck, Almut

    2009-07-01

    In explorative regression studies, linear models are often applied without questioning the linearity of the relations between the predictor variables and the dependent variable, or linear relations are taken as an approximation. In this study, the method of regression with optimal scaling transformations is demonstrated. This method does not require predefined nonlinear functions and results in easy-to-interpret transformations that will show the form of the relations. The method is illustrated using data from a German multicenter project on the indication criteria for inpatient or day clinic psychotherapy treatment. The indication criteria to include in the regression model were selected with the Lasso, which is a tool for predictor selection that overcomes the disadvantages of stepwise regression methods. The resulting prediction model indicates that treatment status is (approximately) linearly related to some criteria and nonlinearly related to others.

  5. Transductive Ordinal Regression

    CERN Document Server

    Seah, Chun-Wei; Ong, Yew-Soon

    2011-01-01

    Ordinal regression is commonly formulated as a multi-class problem with ordinal constraints. The challenge of designing accurate classifiers for ordinal regression generally increases with the number of classes involved, due to the large number of labeled patterns that are needed. Ordinal class labels, however, are often costly to calibrate or difficult to obtain. Unlabeled patterns, on the other hand, often exist in much greater abundance and are freely available. To benefit from the abundance of unlabeled patterns, we present a novel transductive learning paradigm for ordinal regression in this paper, namely Transductive Ordinal Regression (TOR). The key challenge of the present study lies in the precise estimation of both the ordinal class label of the unlabeled data and the decision functions of the ordinal classes, simultaneously. The core elements of the proposed TOR include an objective function that caters to several commonly used loss functions cast in a transductive setting...

  6. The Role of Data Range in Linear Regression

    Science.gov (United States)

    da Silva, M. A. Salgueiro; Seixas, T. M.

    2017-09-01

    Measuring one physical quantity as a function of another often requires making some choices prior to the measurement process. Two of these choices are: the data range where measurements should focus and the number (n) of data points to acquire in the chosen data range. Here, we consider data range as the interval of variation of the independent variable (x) that is associated with a given interval of variation of the dependent variable (y). We analyzed the role of the width and lower endpoint of measurement data range on parameter estimation by linear regression. We show that, when feasible, increasing data range width is more effective than increasing the number of data points on the same data range in reducing the uncertainty in the slope of a regression line. Moreover, the uncertainty in the intercept of a regression line depends not only on the number of data points but also on the ratio between the lower endpoint and the width of the measurement data range, reaching its minimum when the dataset is centered at the ordinate axis. Since successful measurement methodologies require a good understanding of factors ruling data analysis, it is pedagogically justified and highly recommended to teach these two subjects alongside each other.
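
    The two claims in this abstract can be checked with the textbook standard-error formulas for a straight-line fit. The sketch below is not taken from the paper: it assumes equally spaced x values, a known and constant noise level sigma, and hypothetical helper names (ols_uncertainties, grid).

```python
import math

def ols_uncertainties(xs, sigma):
    """Standard errors of slope and intercept for y = a + b*x with noise level sigma."""
    n = len(xs)
    x_mean = sum(xs) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)  # spread of the x values
    se_slope = sigma / math.sqrt(sxx)
    se_intercept = sigma * math.sqrt(1.0 / n + x_mean ** 2 / sxx)
    return se_slope, se_intercept

def grid(lower, width, n):
    """n equally spaced points on the data range [lower, lower + width]."""
    step = width / (n - 1)
    return [lower + i * step for i in range(n)]

sigma = 1.0  # assumed noise level, arbitrary units
base = ols_uncertainties(grid(1.0, 2.0, 10), sigma)       # reference design
wider = ols_uncertainties(grid(1.0, 4.0, 10), sigma)      # same n, double the width
denser = ols_uncertainties(grid(1.0, 2.0, 20), sigma)     # same range, double the points
centered = ols_uncertainties(grid(-1.0, 2.0, 10), sigma)  # range centered on x = 0
```

    Under these assumptions, doubling the width halves the slope uncertainty, doubling the number of points reduces it by a smaller factor, and the intercept uncertainty falls to sigma / sqrt(n) when the range is centered on the ordinate axis.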

  7. Nonparametric Predictive Regression

    OpenAIRE

    Ioannis Kasparis; Elena Andreou; Phillips, Peter C.B.

    2012-01-01

    A unifying framework for inference is developed in predictive regressions where the predictor has unknown integration properties and may be stationary or nonstationary. Two easily implemented nonparametric F-tests are proposed. The test statistics are related to those of Kasparis and Phillips (2012) and are obtained by kernel regression. The limit distribution of these predictive tests holds for a wide range of predictors including stationary as well as non-stationary fractional and near unit...

  8. Mapping geogenic radon potential by regression kriging

    Energy Technology Data Exchange (ETDEWEB)

    Pásztor, László [Institute for Soil Sciences and Agricultural Chemistry, Centre for Agricultural Research, Hungarian Academy of Sciences, Department of Environmental Informatics, Herman Ottó út 15, 1022 Budapest (Hungary); Szabó, Katalin Zsuzsanna, E-mail: sz_k_zs@yahoo.de [Department of Chemistry, Institute of Environmental Science, Szent István University, Páter Károly u. 1, Gödöllő 2100 (Hungary); Szatmári, Gábor; Laborczi, Annamária [Institute for Soil Sciences and Agricultural Chemistry, Centre for Agricultural Research, Hungarian Academy of Sciences, Department of Environmental Informatics, Herman Ottó út 15, 1022 Budapest (Hungary); Horváth, Ákos [Department of Atomic Physics, Eötvös University, Pázmány Péter sétány 1/A, 1117 Budapest (Hungary)

    2016-02-15

    Radon (²²²Rn) gas is produced in the radioactive decay chain of uranium (²³⁸U), which is an element that is naturally present in soils. Radon is transported mainly by diffusion and convection mechanisms through the soil, depending mainly on the physical and meteorological parameters of the soil, and can enter and accumulate in buildings. Health risks originating from indoor radon concentration can be attributed to natural factors and are characterized by geogenic radon potential (GRP). Identification of areas with high health risks requires spatial modeling, that is, mapping of radon risk. In addition to geology and meteorology, physical soil properties play a significant role in the determination of GRP. In order to compile a reliable GRP map for a model area in Central-Hungary, spatial auxiliary information representing GRP forming environmental factors was taken into account to support the spatial inference of the locally measured GRP values. Since the number of measured sites was limited, efficient spatial prediction methodologies were searched for to construct a reliable map for a larger area. Regression kriging (RK) was applied for the interpolation using spatially exhaustive auxiliary data on soil, geology, topography, land use and climate. RK divides the spatial inference into two parts. Firstly, the deterministic component of the target variable is determined by a regression model. The residuals of the multiple linear regression analysis represent the spatially varying but dependent stochastic component, which is interpolated by kriging. The final map is the sum of the two component predictions. Overall accuracy of the map was tested by Leave-One-Out Cross-Validation. Furthermore, the spatial reliability of the resultant map is also estimated by the calculation of the 90% prediction interval of the local prediction values. The applicability of the applied method as well as that of the map is discussed briefly. - Highlights: • A new method

  9. A consistent framework for Horton regression statistics that leads to a modified Hack's law

    Science.gov (United States)

    Furey, Peter R.; Troutman, Brent M.

    2008-12-01

    A statistical framework is introduced that resolves important problems with the interpretation and use of traditional Horton regression statistics. The framework is based on a univariate regression model that leads to an alternative expression for Horton ratio, connects Horton regression statistics to distributional simple scaling, and improves the accuracy in estimating Horton plot parameters. The model is used to examine data for drainage area A and mainstream length L from two groups of basins located in different physiographic settings. Results show that confidence intervals for the Horton plot regression statistics are quite wide. Nonetheless, an analysis of covariance shows that regression intercepts, but not regression slopes, can be used to distinguish between basin groups. The univariate model is generalized to include n > 1 dependent variables. For the case where the dependent variables represent ln A and ln L, the generalized model performs somewhat better at distinguishing between basin groups than two separate univariate models. The generalized model leads to a modification of Hack's law where L depends on both A and Strahler order ω. Data show that ω plays a statistically significant role in the modified Hack's law expression.

  10. A consistent framework for Horton regression statistics that leads to a modified Hack's law

    Science.gov (United States)

    Furey, P.R.; Troutman, B.M.

    2008-01-01

    A statistical framework is introduced that resolves important problems with the interpretation and use of traditional Horton regression statistics. The framework is based on a univariate regression model that leads to an alternative expression for Horton ratio, connects Horton regression statistics to distributional simple scaling, and improves the accuracy in estimating Horton plot parameters. The model is used to examine data for drainage area A and mainstream length L from two groups of basins located in different physiographic settings. Results show that confidence intervals for the Horton plot regression statistics are quite wide. Nonetheless, an analysis of covariance shows that regression intercepts, but not regression slopes, can be used to distinguish between basin groups. The univariate model is generalized to include n > 1 dependent variables. For the case where the dependent variables represent ln A and ln L, the generalized model performs somewhat better at distinguishing between basin groups than two separate univariate models. The generalized model leads to a modification of Hack's law where L depends on both A and Strahler order ω. Data show that ω plays a statistically significant role in the modified Hack's law expression. © 2008 Elsevier B.V.

  11. General Trimmed Estimation : Robust Approach to Nonlinear and Limited Dependent Variable Models (Replaced by DP 2007-65)

    NARCIS (Netherlands)

    Cizek, P.

    2007-01-01

    High breakdown-point regression estimators protect against large errors and data contamination. Motivated by some (the least trimmed squares and maximum trimmed likelihood estimators), we propose a general trimmed estimator, which unifies and extends many existing robust procedures. We derive

  12. General Trimmed Estimation : Robust Approach to Nonlinear and Limited Dependent Variable Models (Replaces DP 2007-1)

    NARCIS (Netherlands)

    Cizek, P.

    2007-01-01

    High breakdown-point regression estimators protect against large errors and data contamination. We generalize the concept of trimming used by many of these robust estimators, such as the least trimmed squares and maximum trimmed likelihood, and propose a general trimmed estimator, which renders

  13. General Trimmed Estimation : Robust Approach to Nonlinear and Limited Dependent Variable Models (Replaces DP 2007-1)

    NARCIS (Netherlands)

    Cizek, P.

    2007-01-01

    High breakdown-point regression estimators protect against large errors and data contamination. We generalize the concept of trimming used by many of these robust estimators, such as the least trimmed squares and maximum trimmed likelihood, and propose a general trimmed estimator, which renders

  14. The Use of Nonparametric Kernel Regression Methods in Econometric Production Analysis

    DEFF Research Database (Denmark)

    Czekaj, Tomasz Gerard

    This PhD thesis addresses one of the fundamental problems in applied econometric analysis, namely the econometric estimation of regression functions. The conventional approach to regression analysis is the parametric approach, which requires the researcher to specify the form of the regression...... function. However, the a priori specification of a functional form involves the risk of choosing one that is not similar to the “true” but unknown relationship between the regressors and the dependent variable. This problem, known as parametric misspecification, can result in biased parameter estimates...... and nonparametric estimations of production functions in order to evaluate the optimal firm size. The second paper discusses the use of parametric and nonparametric regression methods to estimate panel data regression models. The third paper analyses production risk, price uncertainty, and farmers' risk preferences...

  15. Regression modeling of streamflow, baseflow, and runoff using geographic information systems.

    Science.gov (United States)

    Zhu, Yuanhong; Day, Rick L

    2009-02-01

    Regression models for predicting total streamflow (TSF), baseflow (TBF), and storm runoff (TRO) are needed for water resource planning and management. This study used 54 streams with >20 years of streamflow gaging station records during the period October 1971 to September 2001 in Pennsylvania and partitioned TSF into TBF and TRO. TBF was considered a surrogate of groundwater recharge for basins. Regression models for predicting basin-wide TSF, TBF, and TRO were developed under three scenarios that varied in regression variables used for model development. Regression variables representing basin geomorphological, geological, soil, and climatic characteristics were estimated using geographic information systems. All regression models for TSF, TBF, and TRO had R² values >0.94 and reasonable prediction errors. The two best TSF models developed under scenarios 1 and 2 had similar absolute prediction errors. The same was true for the two best TBF models. Therefore, any one of the two best TSF and TBF models could be used for respective flow prediction depending on variable availability. The TRO model developed under scenario 1 had smaller absolute prediction errors than that developed under scenario 2. Simplified Area-alone models developed under scenario 3 might be used when variables for using best models are not available, but had lower R² values and higher or more variable prediction errors than the best models.

  16. Analysis of the Influence of Quantile Regression Model on Mainland Tourists’ Service Satisfaction Performance

    Directory of Open Access Journals (Sweden)

    Wen-Cheng Wang

    2014-01-01

    It is estimated that mainland Chinese tourists travelling to Taiwan can bring annual revenues of 400 billion NTD to the Taiwan economy. Thus, how the Taiwanese Government formulates relevant measures to satisfy both sides is the focus of most concern. Taiwan must improve the facilities and service quality of its tourism industry so as to attract more mainland tourists. This paper conducted a questionnaire survey of mainland tourists and used grey relational analysis in grey mathematics to analyze the satisfaction performance of all satisfaction question items. The first eight satisfaction items were used as independent variables, and the overall satisfaction performance was used as a dependent variable for quantile regression model analysis to discuss the relationship between the dependent variable under different quantiles and independent variables. Finally, this study further discussed the predictive accuracy of the least mean regression model and each quantile regression model, as a reference for research personnel. The analysis results showed that other variables could also affect the overall satisfaction performance of mainland tourists, in addition to occupation and age. The overall predictive accuracy of quantile regression model Q0.25 was higher than that of the other three models.

  17. Analysis of the Influence of Quantile Regression Model on Mainland Tourists' Service Satisfaction Performance

    Science.gov (United States)

    Wang, Wen-Cheng; Cho, Wen-Chien; Chen, Yin-Jen

    2014-01-01

    It is estimated that mainland Chinese tourists travelling to Taiwan can bring annual revenues of 400 billion NTD to the Taiwan economy. Thus, how the Taiwanese Government formulates relevant measures to satisfy both sides is the focus of most concern. Taiwan must improve the facilities and service quality of its tourism industry so as to attract more mainland tourists. This paper conducted a questionnaire survey of mainland tourists and used grey relational analysis in grey mathematics to analyze the satisfaction performance of all satisfaction question items. The first eight satisfaction items were used as independent variables, and the overall satisfaction performance was used as a dependent variable for quantile regression model analysis to discuss the relationship between the dependent variable under different quantiles and independent variables. Finally, this study further discussed the predictive accuracy of the least mean regression model and each quantile regression model, as a reference for research personnel. The analysis results showed that other variables could also affect the overall satisfaction performance of mainland tourists, in addition to occupation and age. The overall predictive accuracy of quantile regression model Q0.25 was higher than that of the other three models. PMID:24574916

  18. An Analysis of Some Variables Affecting the Internet Dependency Level of Turkish Adolescents by Using Decision Tree Methods

    Science.gov (United States)

    Kayri, Murat; Gunuc, Selim

    2010-01-01

    Internet dependency is spreading widely into social life and has come to be accepted as a pathological and psychological disease. Understanding the basic effects of internet dependency is essential for using internet technology in a healthy way. In this study, internet dependency levels of 754 students were examined with the Internet…

  19. Can weighting compensate for nonresponse bias in a dependent variable? An evaluation of weighting methods to correct for substantive bias in a mail survey among Dutch municipalities

    NARCIS (Netherlands)

    van Goor, H; Stuiver, B

    1998-01-01

    Due to a lack of pertinent data, little is known about nonresponse in substantive, generally "dependent" variables and its consequences. However, in a study on policy performance of Dutch municipalities, we were fortunately able to gather performance data for respondents and nonrespondents from

  20. Convergence Analysis of Semi-Implicit Euler Methods for Solving Stochastic Age-Dependent Capital System with Variable Delays and Random Jump Magnitudes

    Directory of Open Access Journals (Sweden)

    Qinghui Du

    2014-01-01

    We consider semi-implicit Euler methods for a stochastic age-dependent capital system with variable delays and random jump magnitudes, and investigate the convergence of the numerical approximation. It is proved that the numerical approximate solutions converge to the analytical solutions in the mean-square sense under given conditions.

  1. A Simultaneous Confidence Corridor for Varying Coefficient Regression with Sparse Functional Data

    OpenAIRE

    Gu, Lijie; Li WANG; Härdle, Wolfgang Karl; Yang, Lijian

    2014-01-01

    We consider a varying coefficient regression model for sparse functional data, with time varying response variable depending linearly on some time independent covariates with coefficients as functions of time dependent covariates. Based on spline smoothing, we propose data driven simultaneous confidence corridors for the coefficient functions with asymptotically correct confidence level. Such confidence corridors are useful benchmarks for statistical inference on the global shapes of coeffici...

  2. Overcoming multicollinearity in multiple regression using correlation coefficient

    Science.gov (United States)

    Zainodin, H. J.; Yap, S. J.

    2013-09-01

    Multicollinearity happens when there are high correlations among independent variables. In this case, it is difficult to separate the contributions of these independent variables to the dependent variable, as they may compete to explain much of the same variance. Besides, multicollinearity also violates an assumption of multiple regression: that there is no collinearity among the possible independent variables. Thus, an alternative approach is introduced to overcome the multicollinearity problem and eventually achieve a well-represented model. This approach removes the multicollinearity source variables on the basis of the correlation coefficient values in the full correlation matrix. Using the full correlation matrix can facilitate the implementation of an Excel function for removing the multicollinearity source variables. It is found that this procedure is easier and time-saving, especially when dealing with a greater number of independent variables in a model and a large number of all possible models. Hence, in this paper detailed insight into the procedure is shown, compared and implemented.
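
    A correlation-based removal step of the kind this abstract describes might look like the sketch below. The abstract does not give the exact rule, so everything here is an assumption made for illustration: the 0.95 threshold, the tie-break (drop the member of the most correlated pair that is less correlated with the response), the hypothetical function names, and the made-up data. Python is used in place of the Excel function mentioned in the abstract.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

def drop_collinear(predictors, response, threshold=0.95):
    """Drop one variable at a time until no predictor pair has |r| >= threshold.

    Assumed rule (not from the paper): from the most correlated pair, remove
    the variable that is less correlated with the response.
    """
    kept = dict(predictors)
    while len(kept) > 1:
        names = list(kept)
        pairs = [(abs(pearson(kept[a], kept[b])), a, b)
                 for i, a in enumerate(names) for b in names[i + 1:]]
        r, a, b = max(pairs)
        if r < threshold:
            break
        weaker = min((a, b), key=lambda k: abs(pearson(kept[k], response)))
        del kept[weaker]
    return list(kept)

# Made-up data: x2 is nearly 2 * x1, x3 is unrelated to both.
data = {
    "x1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "x2": [2.1, 3.9, 6.2, 8.0, 9.9, 12.1],
    "x3": [5.0, 1.0, 4.0, 2.0, 6.0, 3.0],
}
y = [3.0, 5.0, 7.0, 9.0, 11.0, 13.0]
selected = drop_collinear(data, y)
```

    With this made-up data the near-duplicate x2 is removed and x1 and x3 are kept.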

  3. Should metacognition be measured by logistic regression?

    Science.gov (United States)

    Rausch, Manuel; Zehetleitner, Michael

    2017-03-01

    Are logistic regression slopes suitable to quantify metacognitive sensitivity, i.e. the efficiency with which subjective reports differentiate between correct and incorrect task responses? We analytically show that logistic regression slopes are independent from rating criteria in one specific model of metacognition, which assumes (i) that rating decisions are based on sensory evidence generated independently of the sensory evidence used for primary task responses and (ii) that the distributions of evidence are logistic. Given a hierarchical model of metacognition, logistic regression slopes depend on rating criteria. According to all considered models, regression slopes depend on the primary task criterion. A reanalysis of previous data revealed that massive numbers of trials are required to distinguish between hierarchical and independent models with tolerable accuracy. It is argued that researchers who wish to use logistic regression as measure of metacognitive sensitivity need to control the primary task criterion and rating criteria. Copyright © 2017 Elsevier Inc. All rights reserved.

  4. Constrained Sparse Galerkin Regression

    CERN Document Server

    Loiseau, Jean-Christophe

    2016-01-01

    In this work, we demonstrate the use of sparse regression techniques from machine learning to identify nonlinear low-order models of a fluid system purely from measurement data. In particular, we extend the sparse identification of nonlinear dynamics (SINDy) algorithm to enforce physical constraints in the regression, leading to energy conservation. The resulting models are closely related to Galerkin projection models, but the present method does not require the use of a full-order or high-fidelity Navier-Stokes solver to project onto basis modes. Instead, the most parsimonious nonlinear model is determined that is consistent with observed measurement data and satisfies necessary constraints. The constrained Galerkin regression algorithm is implemented on the fluid flow past a circular cylinder, demonstrating the ability to accurately construct models from data.

  5. Practical Session: Logistic Regression

    Science.gov (United States)

    Clausel, M.; Grégoire, G.

    2014-12-01

    An exercise is proposed to illustrate logistic regression. It investigates the different risk factors for the onset of coronary heart disease. It has been proposed in Chapter 5 of the book of D.G. Kleinbaum and M. Klein, "Logistic Regression", Statistics for Biology and Health, Springer Science Business Media, LLC (2010) and also by D. Chessel and A.B. Dufour in Lyon 1 (see Sect. 6 of http://pbil.univ-lyon1.fr/R/pdf/tdr341.pdf). This example is based on data given in the file evans.txt coming from http://www.sph.emory.edu/dkleinb/logreg3.htm#data.

  6. Minimax Regression Quantiles

    DEFF Research Database (Denmark)

    Bache, Stefan Holst

    A new and alternative quantile regression estimator is developed and it is shown that the estimator is root n-consistent and asymptotically normal. The estimator is based on a minimax ‘deviance function’ and has asymptotically equivalent properties to the usual quantile regression estimator. It is......, however, a different and therefore new estimator. It allows for both linear- and nonlinear model specifications. A simple algorithm for computing the estimates is proposed. It seems to work quite well in practice but whether it has theoretical justification is still an open question....

  7. ON THE EFFECTS OF THE PRESENCE AND METHODS OF THE ELIMINATION HETEROSCEDASTICITY AND AUTOCORRELATION IN THE REGRESSION MODEL

    Directory of Open Access Journals (Sweden)

    Nina L. Timofeeva

    2014-01-01

    The article presents the methodological and technical bases for the creation of regression models that adequately reflect reality. The focus is on methods of removing residual autocorrelation in models. Algorithms for eliminating heteroscedasticity and autocorrelation of the regression model residuals are given: the reweighted least squares method and the Cochrane-Orcutt method. A model of "pure" regression is built, as well as a standardized form of the regression equation, which allows the effects of different explanatory variables on the dependent variable to be compared when the explanatory variables are expressed in different units. A scheme of techniques for reducing heteroskedasticity and autocorrelation in regression models specific to the social and cultural sphere is developed.

  8. Flows of Carreau fluid with pressure dependent viscosity in a variable porous medium: Application of polymer melt

    Directory of Open Access Journals (Sweden)

    M.Y. Malik

    2014-06-01

    The present work concerns the pressure dependent viscosity in Carreau fluid through porous medium. Four different combinations of pressure dependent viscosity and pressure dependent porous medium parameters are considered for two types of flow situations, namely (i) Poiseuille flow and (ii) Couette flow. The solutions of non-linear equations have been evaluated numerically by Shooting method along with Runge-Kutta-Fehlberg method. The physical features of pertinent parameters have been discussed through graphs.

  9. Fuzzy multiple linear regression: A computational approach

    Science.gov (United States)

    Juang, C. H.; Huang, X. H.; Fleming, J. W.

    1992-01-01

    This paper presents a new computational approach for performing fuzzy regression. In contrast to Bardossy's approach, the new approach, while dealing with fuzzy variables, closely follows the conventional regression technique. In this approach, treatment of fuzzy input is more 'computational' than 'symbolic.' The following sections first outline the formulation of the new approach, then deal with the implementation and computational scheme, and this is followed by examples to illustrate the new procedure.

  10. Quantile regression modeling for Malaysian automobile insurance premium data

    Science.gov (United States)

    Fuzi, Mohd Fadzli Mohd; Ismail, Noriszura; Jemain, Abd Aziz

    2015-09-01

    Quantile regression is more robust to outliers than mean regression models. Traditional mean regression models like the Generalized Linear Model (GLM) are not able to capture the entire distribution of premium data. In this paper we demonstrate how a quantile regression approach can be used to model net premium data to study the effects of change in the estimates of regression parameters (rating classes) on the magnitude of response variable (pure premium). We then compare the results of quantile regression model with Gamma regression model. The results from quantile regression show that some rating classes increase as quantile increases and some decrease with decreasing quantile. Further, we found that the confidence interval of median regression (τ = 0.5) is always narrower than that of Gamma regression for all risk factors.
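
    The robustness to outliers comes from the loss function that quantile regression minimizes. The sketch below reduces it to the simplest case, an intercept-only model, and is not the model fitted in the paper: the premium values are made up and the function names are hypothetical.

```python
def pinball_loss(residual, tau):
    """Check (pinball) loss: weight tau on positive residuals, 1 - tau on negative ones."""
    return tau * residual if residual >= 0 else (tau - 1.0) * residual

def best_constant(ys, tau):
    """Sample value minimizing the total pinball loss, i.e. a tau-quantile of ys."""
    return min(ys, key=lambda c: sum(pinball_loss(y - c, tau) for y in ys))

premiums = [100.0, 110.0, 120.0, 130.0, 5000.0]  # made-up data with one outlier

mean_fit = sum(premiums) / len(premiums)   # least-squares fit of a constant
median_fit = best_constant(premiums, 0.5)  # tau = 0.5 gives the median
upper_fit = best_constant(premiums, 0.9)   # an upper quantile
```

    Here the mean (1092.0) is pulled far from the bulk of the data by the single outlier, while the median fit stays at 120.0; the 0.9 quantile fit lands on the outlier itself (5000.0), which shows how different values of τ describe different parts of the distribution.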

  11. Nonlinear Regression with R

    CERN Document Server

    Ritz, Christian; Parmigiani, Giovanni

    2009-01-01

    R is a rapidly evolving lingua franca of graphical display and statistical analysis of experiments from the applied sciences. This book provides a coherent treatment of nonlinear regression with R by means of examples from a diversity of applied sciences such as biology, chemistry, engineering, medicine and toxicology.

  12. Software Regression Verification

    Science.gov (United States)

    2013-12-11

    of recursive procedures. Acta Informatica, 45(6):403 – 439, 2008. [GS11] Benny Godlin and Ofer Strichman. Regression verification. Technical Report...functions. Therefore, we need to redefine m-term. – Mutual termination. If either function f or function f′ (or both) is non-deterministic, then their

  13. Linear Regression Analysis

    CERN Document Server

    Seber, George A F

    2012-01-01

    Concise, mathematically clear, and comprehensive treatment of the subject. Expanded coverage of diagnostics and methods of model fitting. Requires no specialized knowledge beyond a good grasp of matrix algebra and some acquaintance with straight-line regression and simple analysis of variance models. More than 200 problems throughout the book plus outline solutions for the exercises. This revision has been extensively class-tested.

  14. Significance tests to determine the direction of effects in linear regression models.

    Science.gov (United States)

    Wiedermann, Wolfgang; Hagmann, Michael; von Eye, Alexander

    2015-02-01

    Previous studies have discussed asymmetric interpretations of the Pearson correlation coefficient and have shown that higher moments can be used to decide on the direction of dependence in the bivariate linear regression setting. The current study extends this approach by illustrating that the third moment of regression residuals may also be used to derive conclusions concerning the direction of effects. Assuming non-normally distributed variables, it is shown that the distribution of residuals of the correctly specified regression model (e.g., Y is regressed on X) is more symmetric than the distribution of residuals of the competing model (i.e., X is regressed on Y). Based on this result, 4 one-sample tests are discussed which can be used to decide which variable is more likely to be the response and which one is more likely to be the explanatory variable. A fifth significance test is proposed based on the differences of skewness estimates, which leads to a more direct test of a hypothesis that is compatible with direction of dependence. A Monte Carlo simulation study was performed to examine the behaviour of the procedures under various degrees of associations, sample sizes, and distributional properties of the underlying population. An empirical example is given which illustrates the application of the tests in practice.
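
    The symmetry property stated in this abstract can be illustrated with a small simulation. The sketch below only compares sample skewness of the two residual sets and does not implement any of the paper's significance tests. The simulated model (an exponential predictor, normal errors, slope 0.8), the sample size and the function names are all assumptions made for illustration.

```python
import random

def skewness(values):
    """Sample skewness: third central moment over the 1.5 power of the second."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

def residuals(response, predictor):
    """Residuals of the simple least-squares regression of response on predictor."""
    n = len(response)
    mx = sum(predictor) / n
    my = sum(response) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(predictor, response))
    sxx = sum((x - mx) ** 2 for x in predictor)
    slope = sxy / sxx
    return [(y - my) - slope * (x - mx) for x, y in zip(predictor, response)]

rng = random.Random(0)                             # fixed seed for repeatability
x = [rng.expovariate(1.0) for _ in range(20000)]   # non-normal (skewed) predictor
y = [0.8 * xi + rng.gauss(0.0, 1.0) for xi in x]   # true model: Y depends on X

skew_correct = skewness(residuals(y, x))    # Y regressed on X (correct direction)
skew_reversed = skewness(residuals(x, y))   # X regressed on Y (reversed direction)
```

    In this simulated setting the residuals of the correctly specified model are close to symmetric, while the residuals of the reversed model inherit part of the skewness of X.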

  15. Modeling the Philippines' real gross domestic product: A normal estimation equation for multiple linear regression

    Science.gov (United States)

    Urrutia, Jackie D.; Tampis, Razzcelle L.; Mercado, Joseph; Baygan, Aaron Vito M.; Baccay, Edcon B.

    2016-02-01

    The objective of this research is to formulate a mathematical model for the Philippines' Real Gross Domestic Product (Real GDP). The following factors are considered as the independent variables that can influence the Real GDP of the Philippines (y): Consumers' Spending (x1), Government's Spending (x2), Capital Formation (x3) and Imports (x4). The researchers used a Normal Estimation Equation using matrices to create the model for Real GDP, with α = 0.01. The researchers analyzed quarterly data from 1990 to 2013. The data were acquired from the National Statistical Coordination Board (NSCB), resulting in a total of 96 observations for each variable. The data underwent a logarithmic transformation, particularly the dependent variable (y), to satisfy all the assumptions of multiple linear regression analysis. The mathematical model for Real GDP was formulated using matrices in MATLAB. Based on the results, only three of the independent variables are significant predictors of the dependent variable, namely Consumers' Spending (x1), Capital Formation (x3) and Imports (x4), and hence can predict Real GDP (y). The regression analysis shows a coefficient of determination of 98.7%, meaning that the independent variables explain 98.7% of the variation in the dependent variable. With a result of 97.6% in the paired t-test, the predicted values obtained from the model showed no significant difference from the actual values of Real GDP. This research will be essential in appraising forthcoming changes to aid the Government in implementing policies for the development of the economy.
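
    Estimating a multiple linear regression "using matrices" usually means solving the normal equations (XᵀX)β = Xᵀy. The sketch below shows that computation in plain Python rather than the MATLAB used by the authors. The two-predictor data set is made up, the helper names are hypothetical, and the logarithmic transformation described in the abstract is omitted.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x

def fit_normal_equations(rows, y):
    """Return [intercept, b1, b2, ...] solving (X^T X) beta = X^T y."""
    X = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    Xt = transpose(X)
    XtX = matmul(Xt, X)
    Xty = [sum(xi * yi for xi, yi in zip(col, y)) for col in Xt]
    return solve(XtX, Xty)

# Made-up data generated exactly from y = 2 + 3*x1 - 1*x2
rows = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0), (5.0, 6.0)]
y = [2.0 + 3.0 * x1 - 1.0 * x2 for x1, x2 in rows]
beta = fit_normal_equations(rows, y)
```

    Because the made-up response contains no noise, the fitted coefficients recover the generating values (2, 3, -1) up to rounding. For real data with nearly collinear predictors, a QR- or SVD-based least-squares routine is generally preferred over forming XᵀX explicitly.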

  16. Texting Dependence, iPod Dependence, and Delay Discounting.

    Science.gov (United States)

    Ferraro, F Richard; Weatherly, Jeffrey N

    2016-01-01

    We gave 127 undergraduates questionnaires about their iPod and texting dependence and 2 hypothetical delay discounting scenarios related to free downloaded songs and free texting for life. Using regression analyses we found that when iPod dependence was the dependent variable, Text2-excessive use, Text4-psychological and behavioral symptoms, iPod2-excessive use, and iPod3-relationship disruption were significant predictors of discounting. When texting dependence was the dependent variable, Text4-psychological and behavioral symptoms and iPod3-relationship disruption were significant predictors of discounting. These are the first data to show that delay discounting relates to certain aspects of social media, namely iPod and texting dependence. These data also show that across these 2 dependencies, both psychological and behavioral symptoms and relationship disruptions are affected.

  17. Approximation of conditional densities by smooth mixtures of regressions

    CERN Document Server

    Norets, Andriy

    2010-01-01

    This paper shows that large nonparametric classes of conditional multivariate densities can be approximated in the Kullback--Leibler distance by different specifications of finite mixtures of normal regressions in which normal means and variances and mixing probabilities can depend on variables in the conditioning set (covariates). These models are a special case of models known as "mixtures of experts" in statistics and computer science literature. Flexible specifications include models in which only mixing probabilities, modeled by multinomial logit, depend on the covariates and, in the univariate case, models in which only means of the mixed normals depend flexibly on the covariates. Modeling the variance of the mixed normals by flexible functions of the covariates can weaken restrictions on the class of the approximable densities. Obtained results can be generalized to mixtures of general location scale densities. Rates of convergence and easy to interpret bounds are also obtained for different model spec...

  18. Regression Verification Using Impact Summaries

    Science.gov (United States)

    Backes, John; Person, Suzette J.; Rungta, Neha; Thachuk, Oksana

    2013-01-01

    versions [19]. These techniques compare two programs with a large degree of syntactic similarity to prove that portions of one program version are equivalent to the other. Regression verification can be used for guaranteeing backward compatibility, and for showing behavioral equivalence in programs with syntactic differences, e.g., when a program is refactored to improve its performance, maintainability, or readability. Existing regression verification techniques leverage similarities between program versions by using abstraction and decomposition techniques to improve scalability of the analysis [10, 12, 19]. The abstractions and decomposition in these techniques, e.g., summaries of unchanged code [12] or semantically equivalent methods [19], compute an over-approximation of the program behaviors. The equivalence checking results of these techniques are sound but not complete: they may characterize programs as not functionally equivalent when, in fact, they are equivalent. In this work we describe a novel approach that leverages the impact of the differences between two programs for scaling regression verification. We partition program behaviors of each version into (a) behaviors impacted by the changes and (b) behaviors not impacted (unimpacted) by the changes. Only the impacted program behaviors are used during equivalence checking. We then prove that checking equivalence of the impacted program behaviors is equivalent to checking equivalence of all program behaviors for a given depth bound. In this work we use symbolic execution to generate the program behaviors and leverage control- and data-dependence information to facilitate the partitioning of program behaviors. The impacted program behaviors are termed impact summaries. The dependence analyses that facilitate the generation of the impact summaries, we believe, could be used in conjunction with other abstraction- and decomposition-based approaches [10, 12] as a complementary reduction technique.
An

  19. Low rank Multivariate regression

    CERN Document Server

    Giraud, Christophe

    2010-01-01

    We consider in this paper the multivariate regression problem, when the target regression matrix $A$ is close to a low rank matrix. Our primary interest is in the practical case where the variance of the noise is unknown. Our main contribution is to propose in this setting a criterion to select among a family of low rank estimators and prove a non-asymptotic oracle inequality for the resulting estimator. We also investigate the easier case where the variance of the noise is known and outline that the penalties appearing in our criteria are minimal (in some sense). These penalties involve the expected value of the Ky-Fan quasi-norm of some random matrices. These quantities can be evaluated easily in practice and upper-bounds can be derived from recent results in random matrix theory.

  20. Subset selection in regression

    CERN Document Server

    Miller, Alan

    2002-01-01

    Originally published in 1990, the first edition of Subset Selection in Regression filled a significant gap in the literature, and its critical and popular success has continued for more than a decade. Thoroughly revised to reflect progress in theory, methods, and computing power, the second edition promises to continue that tradition. The author has thoroughly updated each chapter, incorporated new material on recent developments, and included more examples and references. New in the Second Edition: a separate chapter on Bayesian methods; complete revision of the chapter on estimation; a major example from the field of near infrared spectroscopy; more emphasis on cross-validation; greater focus on bootstrapping; stochastic algorithms for finding good subsets from large numbers of predictors when an exhaustive search is not feasible; software available on the Internet for implementing many of the algorithms presented; more examples. Subset Selection in Regression, Second Edition remains dedicated to the techniques for fitting...

  1. Classification and regression trees

    CERN Document Server

    Breiman, Leo; Olshen, Richard A; Stone, Charles J

    1984-01-01

    The methodology used to construct tree structured rules is the focus of this monograph. Unlike many other statistical procedures, which moved from pencil and paper to calculators, this text's use of trees was unthinkable before computers. Both the practical and theoretical sides have been developed in the authors' study of tree methods. Classification and Regression Trees reflects these two sides, covering the use of trees as a data analysis method, and in a more mathematical framework, proving some of their fundamental properties.

  2. TWO REGRESSION CREDIBILITY MODELS

    Directory of Open Access Journals (Sweden)

    Constanţa-Nicoleta BODEA

    2010-03-01

    Full Text Available In this communication we discuss two regression credibility models from Non-Life Insurance Mathematics that can be solved by means of matrix theory. In the first regression credibility model, starting from a well-known representation formula of the inverse for a special class of matrices, a risk premium is calculated for a contract with risk parameter θ. In the next regression credibility model, we obtain a credibility solution in the form of a linear combination of the individual estimate (based on the data of a particular state) and the collective estimate (based on aggregate USA data). To illustrate the solution with the properties mentioned above, we need the well-known representation theorem for a special class of matrices, the properties of the trace of a square matrix, the scalar product of two vectors, the norm with respect to a positive definite matrix given in advance, and the complicated mathematical properties of conditional expectations and conditional covariances.

  3. On concurvity in nonlinear and nonparametric regression models

    Directory of Open Access Journals (Sweden)

    Sonia Amodio

    2014-12-01

    Full Text Available When data are affected by multicollinearity in the linear regression framework, concurvity will be present in fitting a generalized additive model (GAM). The term concurvity describes nonlinear dependencies among the predictor variables. Just as collinearity results in inflated variance of the estimated regression coefficients in the linear regression model, the presence of concurvity leads to instability of the estimated coefficients in GAMs. Even if the backfitting algorithm always converges to a solution, in case of concurvity the final solution of the backfitting procedure in fitting a GAM is influenced by the starting functions. While exact concurvity is highly unlikely, approximate concurvity, the analogue of multicollinearity, is of practical concern as it can lead to upwardly biased estimates of the parameters and to underestimation of their standard errors, increasing the risk of committing a type I error. We compare the existing approaches to detect concurvity, pointing out their advantages and drawbacks, using simulated and real data sets. As a result, this paper provides a general criterion to detect concurvity in nonlinear and nonparametric regression models.

  4. Sparse Regression by Projection and Sparse Discriminant Analysis

    KAUST Repository

    Qi, Xin

    2015-04-03

    Recent years have seen active developments of various penalized regression methods, such as LASSO and elastic net, to analyze high-dimensional data. In these approaches, the direction and length of the regression coefficients are determined simultaneously. Due to the introduction of penalties, the length of the estimates can be far from being optimal for accurate predictions. We introduce a new framework, regression by projection, and its sparse version to analyze high-dimensional data. The unique nature of this framework is that the directions of the regression coefficients are inferred first, and the lengths and the tuning parameters are determined by a cross-validation procedure to achieve the largest prediction accuracy. We provide a theoretical result for simultaneous model selection consistency and parameter estimation consistency of our method in high dimension. This new framework is then generalized such that it can be applied to principal components analysis, partial least squares, and canonical correlation analysis. We also adapt this framework for discriminant analysis. Compared with the existing methods, where there is relatively little control of the dependency among the sparse components, our method can control the relationships among the components. We present efficient algorithms and related theory for solving the sparse regression by projection problem. Based on extensive simulations and real data analysis, we demonstrate that our method achieves good predictive performance and variable selection in the regression setting, and the ability to control relationships between the sparse components leads to more accurate classification. In supplementary materials available online, the details of the algorithms and theoretical proofs, and R codes for all simulation studies are provided.

  5. Deriving statistical significance maps for support vector regression using medical imaging data.

    Science.gov (United States)

    Gaonkar, Bilwaj; Sotiras, Aristeidis; Davatzikos, Christos

    2013-01-01

    Regression analysis involves predicting a continuous variable using imaging data. The Support Vector Regression (SVR) algorithm has previously been used in addressing regression analysis in neuroimaging. However, identifying the regions of the image that the SVR uses to model the dependence of a target variable remains an open problem. It is an important issue when one wants to biologically interpret the meaning of a pattern that predicts the variable(s) of interest, and therefore to understand normal or pathological process. One possible approach to the identification of these regions is the use of permutation testing. Permutation testing involves 1) generation of a large set of 'null SVR models' using randomly permuted sets of target variables, and 2) comparison of the SVR model trained using the original labels to the set of null models. These permutation tests often require prohibitively long computational time. Recent work in support vector classification shows that it is possible to analytically approximate the results of permutation testing in medical image analysis. We propose an analogous approach to approximate permutation testing based analysis for support vector regression with medical imaging data. In this paper we present 1) the theory behind our approximation, and 2) experimental results using two real datasets.
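
    The two permutation-testing steps described above (fit models on randomly permuted targets, then compare the model fitted on the original targets against that null set) can be sketched generically. In this sketch a simple least-squares slope stands in for the SVR model, which is a deliberate simplification. The function names, the number of permutations, and the toy data are assumptions, and the authors' analytic approximation is not shown.

```python
import random


def slope(x, y):
    """Least-squares slope of y on x; a stand-in for a fitted model weight."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def permutation_p_value(x, y, n_permutations=999, seed=0):
    """Two-sided permutation p-value for the slope statistic."""
    rng = random.Random(seed)
    observed = abs(slope(x, y))
    shuffled = list(y)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # step 1: a null model from permuted targets
        if abs(slope(x, shuffled)) >= observed:  # step 2: compare with original
            at_least_as_extreme += 1
    # Count the observed statistic itself so the p-value is never exactly 0.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


# Toy data (hypothetical): a strong linear relationship.
x = [float(i) for i in range(20)]
y = [2.0 * xi + 1.0 for xi in x]
p_value = permutation_p_value(x, y)
print(p_value)
```

    The cost of this loop grows with the number of permutations times the cost of one model fit, which is why an analytic approximation is attractive when each fit is an SVR on imaging data.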

  6. Planting strategies of maize farmers in Kenya: a simultaneous equations analysis in the presence of discrete dependent variables

    CSIR Research Space (South Africa)

    Hassan, RM

    1996-11-01

    Full Text Available - rectly through its influence on cropping intensity (Yl). Female farmers may have preferences for certain varietal traits, that may be different from those of male farmers. Again, experience and knowledge (age, education... of family size and available farm land. 4. The sex, education, and age of the farmer. Whereas the age variable is measured on a continuous scale, dichotomous indices are used to code sex (male, female) and education (none...

  7. An Effect Size for Regression Predictors in Meta-Analysis

    Science.gov (United States)

    Aloe, Ariel M.; Becker, Betsy Jane

    2012-01-01

    A new effect size representing the predictive power of an independent variable from a multiple regression model is presented. The index, denoted as $r_{sp}$, is the semipartial correlation of the predictor with the outcome of interest. This effect size can be computed when multiple predictor variables are included in the regression model…
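
    For the special case of two predictors, the semipartial correlation can be computed directly from the three pairwise correlations. The sketch below uses that textbook formula. It is not the computation proposed in the article, and the function and argument names are illustrative assumptions.

```python
import math


def semipartial_correlation(r_y1, r_y2, r_12):
    """Semipartial correlation of predictor 1 with outcome y.

    Predictor 2 is partialled out of predictor 1 only (not out of y).
    r_y1, r_y2: correlations of y with predictors 1 and 2.
    r_12: correlation between the two predictors.
    """
    return (r_y1 - r_y2 * r_12) / math.sqrt(1.0 - r_12 ** 2)


# Hypothetical correlations, for illustration only.
r_sp = semipartial_correlation(0.5, 0.4, 0.6)
print(round(r_sp, 3))
```

    The square of this quantity equals the increase in R² obtained when predictor 1 is added to a model that already contains predictor 2.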

  8. Prediction accuracy and stability of regression with optimal scaling transformations

    NARCIS (Netherlands)

    Kooij, van der Anita J.

    2007-01-01

    The central topic of this thesis is the CATREG approach to nonlinear regression. This approach finds optimal quantifications for categorical variables and/or nonlinear transformations for numerical variables in regression analysis. (CATREG is implemented in SPSS Categories by the author of the thesi

  9. Computational procedures for probing interactions in OLS and logistic regression: SPSS and SAS implementations.

    Science.gov (United States)

    Hayes, Andrew F; Matthes, Jörg

    2009-08-01

    Researchers often hypothesize moderated effects, in which the effect of an independent variable on an outcome variable depends on the value of a moderator variable. Such an effect reveals itself statistically as an interaction between the independent and moderator variables in a model of the outcome variable. When an interaction is found, it is important to probe the interaction, for theories and hypotheses often predict not just interaction but a specific pattern of effects of the focal independent variable as a function of the moderator. This article describes the familiar pick-a-point approach and the much less familiar Johnson-Neyman technique for probing interactions in linear models and introduces macros for SPSS and SAS to simplify the computations and facilitate the probing of interactions in ordinary least squares and logistic regression. A script version of the SPSS macro is also available for users who prefer a point-and-click user interface rather than command syntax.
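
    The pick-a-point approach mentioned above can be sketched for an OLS model of the form y = b0 + b_x·X + b_m·M + b_xm·X·M: the conditional effect of X at a chosen moderator value m is b_x + b_xm·m, and its standard error follows from the coefficient variances and covariance. This is a minimal sketch, not the SPSS or SAS macros. The function name, argument names, and example numbers are assumptions.

```python
import math


def conditional_effect(b_x, b_xm, var_b_x, var_b_xm, cov_b_x_b_xm, m):
    """Effect of X on y at moderator value m, with standard error and t ratio.

    b_x: coefficient of X; b_xm: coefficient of the X*M product term.
    var_b_x, var_b_xm, cov_b_x_b_xm: entries of the coefficient covariance matrix.
    """
    effect = b_x + b_xm * m
    variance = var_b_x + 2.0 * m * cov_b_x_b_xm + m ** 2 * var_b_xm
    standard_error = math.sqrt(variance)
    return effect, standard_error, effect / standard_error


# Hypothetical estimates, for illustration only.
effect, se, t_ratio = conditional_effect(
    b_x=0.5, b_xm=0.2, var_b_x=0.04, var_b_xm=0.01, cov_b_x_b_xm=-0.01, m=2.0
)
print(effect, se, t_ratio)
```

    The t ratio would be compared with a t distribution on the residual degrees of freedom. The Johnson-Neyman technique instead solves for the moderator values at which this ratio equals the critical value.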

  10. Logistic regression to estimate the welfare of broiler breeders in relation to environmental and behavioral variables

    Directory of Open Access Journals (Sweden)

    Danilo F Pereira

    2011-02-01

    Full Text Available The increasing demand of consumer markets for the welfare of birds in poultry houses has motivated much scientific research to monitor and classify welfare according to the production environment. Given the complexity of the interaction between the birds and the environment of the aviary, the correct interpretation of their behavior becomes an important way to estimate the welfare of these birds. This study obtained multiple logistic regression models capable of estimating the welfare of broiler breeders in relation to the environment of the aviaries and the behaviors expressed by the birds. In the experiment, several behaviors were observed in breeders housed in a climatic chamber under controlled temperatures and three different ammonia concentrations in the air, monitored daily. From the analysis of the data, two logistic regression models were obtained: the first model uses a value of ammonia concentration measured by unit, and the second model uses a binary value to classify the ammonia concentration, assigned by a person through olfactory perception. The analysis showed that both models classified the broiler breeders' welfare successfully.

  11. Meteorological variables affect fertility rate after intrauterine artificial insemination in sheep in a seasonal-dependent manner: a 7-year study

    Science.gov (United States)

    Palacios, C.; Abecia, J. A.

    2015-05-01

    A total number of 48,088 artificial inseminations (AIs) have been controlled during seven consecutive years in 79 dairy sheep Spanish farms (41° N). Mean, maximum and minimum ambient temperatures ( Ts), temperature amplitude (TA), mean relative humidity (RH), mean solar radiation (SR) and total rainfall of each insemination day and 15 days later were recorded. Temperature-humidity index (THI) and effective temperature (ET) have been calculated. A binary logistic regression model to estimate the risk of not getting pregnant compared to getting pregnant, through the odds ratio (OR), was performed. Successful winter inseminations were carried out under higher SR ( P 1 (maximum T, ET and rainfall on AI day, and ET and rainfall on day 15), and two variables presented OR reverse their effects in the hot or cold seasons. A forecast of the meteorological conditions could be a useful tool when AI dates are being scheduled.

  12. Multivariate parametric random effect regression models for fecundability studies.

    Science.gov (United States)

    Ecochard, R; Clayton, D G

    2000-12-01

    Delay until conception is generally described by a mixture of geometric distributions. Weinberg and Gladen (1986, Biometrics 42, 547-560) proposed a regression generalization of the beta-geometric mixture model where covariates effects were expressed in terms of contrasts of marginal hazards. Scheike and Jensen (1997, Biometrics 53, 318-329) developed a frailty model for discrete event times data based on discrete-time analogues of Hougaard's results (1984, Biometrika 71, 75-83). This paper is on a generalization to a three-parameter family distribution and an extension to multivariate cases. The model allows the introduction of explanatory variables, including time-dependent variables at the subject-specific level, together with a choice from a flexible family of random effect distributions. This makes it possible, in the context of medically assisted conception, to include data sources with multiple pregnancies (or attempts at pregnancy) per couple.

  13. [Dependence of effects of weak combined low-frequency variable and constant magnetic fields on the intensity of asexual reproduction of planarians Dugesia tigrina on the magnitude of the variable field].

    Science.gov (United States)

    Novikov, V V; Sheĭman, I M; Lisitsyn, A S; Kliubin, A V; Fesenko, E E

    2002-01-01

    It was shown that the stimulating effect of weak combined magnetic fields (constant component 42 microT, frequency of the variable component 3.7 Hz) on the division of planarians depends on the amplitude of the variable component of the field. The effect is particularly pronounced at 40 (the main maximum), 120, 160, and 640 nT. Narrow ranges of effective amplitudes alternate in some cases with equally narrow ranges in which the system does not respond to the treatment. In the range of super weak amplitudes of the variable field (0.1 and 1 nT), the stimulating effect is poorly pronounced. The data obtained indicate the presence of narrow amplitude windows in the response of the biological systems to weak and super weak magnetic fields. In a special series of experiments, it was shown that the effect of fields on planarians is partially mediated via aqueous medium preliminarily treated with weak magnetic fields. It is noteworthy that in experiments with water treated with weak magnetic fields, there were no pronounced maxima and minima in the magnitude of the effect in the range of amplitude of the variable magnetic field from 40 to 320 nT.

  14. Including long-range dependence in integrate-and-fire models of the high interspike-interval variability of cortical neurons.

    Science.gov (United States)

    Jackson, B Scott

    2004-10-01

    Many different types of integrate-and-fire models have been designed in order to explain how it is possible for a cortical neuron to integrate over many independent inputs while still producing highly variable spike trains. Within this context, the variability of spike trains has been almost exclusively measured using the coefficient of variation of interspike intervals. However, another important statistical property that has been found in cortical spike trains and is closely associated with their high firing variability is long-range dependence. We investigate the conditions, if any, under which such models produce output spike trains with both interspike-interval variability and long-range dependence similar to those that have previously been measured from actual cortical neurons. We first show analytically that a large class of high-variability integrate-and-fire models is incapable of producing such outputs based on the fact that their output spike trains are always mathematically equivalent to renewal processes. This class of models subsumes a majority of previously published models, including those that use excitation-inhibition balance, correlated inputs, partial reset, or nonlinear leakage to produce outputs with high variability. Next, we study integrate-and-fire models that have (nonPoissonian) renewal point process inputs instead of the Poisson point process inputs used in the preceding class of models. The confluence of our analytical and simulation results implies that the renewal-input model is capable of producing high variability and long-range dependence comparable to that seen in spike trains recorded from cortical neurons, but only if the interspike intervals of the inputs have infinite variance, a physiologically unrealistic condition. Finally, we suggest a new integrate-and-fire model that does not suffer any of the previously mentioned shortcomings. 
By analyzing simulation results for this model, we show that it is capable of producing output

  15. Variability of photovoltaic panels efficiency depending on the value of the angle of their inclination relative to the horizon

    Directory of Open Access Journals (Sweden)

    Majdak Marek

    2017-01-01

    Full Text Available The objective of this paper was to determine the relationship between the efficiency of photovoltaic panels and the angle of their inclination relative to the horizon. For the experimental research, tests were carried out on photovoltaic modules made of monocrystalline, polycrystalline and amorphous silicon. The experiment consisted of measuring the voltage and current generated by the photovoltaic panels at a known value of solar radiation, a specified resistance set with a variable resistor, and a known angle of inclination relative to the horizon.

  16. Estimation of Logistic Regression Models in Small Samples. A Simulation Study Using a Weakly Informative Default Prior Distribution

    Science.gov (United States)

    Gordovil-Merino, Amalia; Guardia-Olmos, Joan; Pero-Cebollero, Maribel

    2012-01-01

    In this paper, we used simulations to compare the performance of classical and Bayesian estimations in logistic regression models using small samples. In the performed simulations, conditions were varied, including the type of relationship between independent and dependent variable values (i.e., unrelated and related values), the type of variable…

  17. Multiple Regression with Varying Levels of Correlation among Predictors: Monte Carlo Sampling from Normal and Non-Normal Populations.

    Science.gov (United States)

    Vasu, Ellen Storey

    1978-01-01

    The effects of the violation of the assumption of normality in the conditional distributions of the dependent variable, coupled with the condition of multicollinearity upon the outcome of testing the hypothesis that the regression coefficient equals zero, are investigated via a Monte Carlo study. (Author/JKS)

  18. Regression-based air temperature spatial prediction models: an example from Poland

    Directory of Open Access Journals (Sweden)

    Mariusz Szymanowski

    2013-10-01

    Full Text Available A Geographically Weighted Regression-Kriging (GWRK) algorithm, based on the local Geographically Weighted Regression (GWR), is applied for spatial prediction of air temperature in Poland. Hengl's decision tree for selecting a suitable prediction model is extended for varying spatial relationships between the air temperature and environmental predictors with an assumption of existing environmental dependence of analyzed temperature variables. The procedure includes the potential choice of a local GWR instead of the global Multiple Linear Regression (MLR) method for modeling the deterministic part of spatial variation, which is usual in the standard regression (residual) kriging model (MLRK). The analysis encompassed: testing for environmental correlation, selecting an appropriate regression model, testing for spatial autocorrelation of the residual component, and validating the prediction accuracy. The proposed approach was performed for 69 air temperature cases, with time aggregation ranging from daily to annual average air temperatures. The results show that, irrespective of the level of data aggregation, the spatial distribution of temperature is better fitted by local models, which is the reason for choosing GWR instead of MLR for all variables analyzed. Additionally, in most cases (78%) there is spatial autocorrelation in the residuals of the deterministic part, which suggests that the GWR model should be extended by ordinary kriging of residuals to the GWRK form. The decision tree used in this paper can be considered universal, as it encompasses either spatially varying relationships of modeled and explanatory variables or a random process that can be modeled by a stochastic extension of the regression model (residual kriging). Moreover, for all cases analyzed, the selection of a method based on the local regression model (GWRK or GWR) does not depend on the data aggregation level, showing the potential versatility of the technique.

  19. Multinomial logistic regression modelling of obesity and overweight among primary school students in a rural area of Negeri Sembilan

    Energy Technology Data Exchange (ETDEWEB)

    Ghazali, Amirul Syafiq Mohd; Ali, Zalila; Noor, Norlida Mohd; Baharum, Adam [Pusat Pengajian Sains Matematik, Universiti Sains Malaysia, 11800 USM, Pulau Pinang, Malaysia amirul@unisel.edu.my, zalila@cs.usm.my, norlida@usm.my, adam@usm.my (Malaysia)

    2015-10-22

    Multinomial logistic regression is widely used to model the outcomes of a polytomous response variable, a categorical dependent variable with more than two categories. The model assumes that the conditional mean of the dependent categorical variables is the logistic function of an affine combination of predictor variables. Its procedure gives a number of logistic regression models that make specific comparisons of the response categories. When there are q categories of the response variable, the model consists of q-1 logit equations which are fitted simultaneously. The model is validated by variable selection procedures, tests of regression coefficients, a significance test of the overall model, goodness-of-fit measures, and validation of predicted probabilities using odds ratios. This study used the multinomial logistic regression model to investigate obesity and overweight among primary school students in a rural area on the basis of their demographic profiles, lifestyles, and diet and food intake. The results indicated that obesity and overweight of students are related to gender, religion, sleep duration, time spent on electronic games, breakfast intake in a week, with whom meals are taken, protein intake, and also the interaction between breakfast intake in a week and sleep duration, and the interaction between gender and protein intake.
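
    The statement that q response categories give q-1 logit equations can be made concrete: each non-reference category has its own linear predictor, and the reference category is fixed at zero. The sketch below turns such coefficients into category probabilities. The function name, the coefficient layout, and the example values are assumptions for illustration and are not estimates from the study.

```python
import math


def category_probabilities(coefficients, predictors):
    """Probabilities for q categories from q-1 baseline-category logit equations.

    coefficients: q-1 lists of the form [intercept, b1, b2, ...], one per
        non-reference category.
    predictors: list of predictor values [x1, x2, ...].
    Returns a list whose first entry is the reference-category probability.
    """
    linear_predictors = [
        coef[0] + sum(b * x for b, x in zip(coef[1:], predictors))
        for coef in coefficients
    ]
    denominator = 1.0 + sum(math.exp(eta) for eta in linear_predictors)
    reference = 1.0 / denominator
    others = [math.exp(eta) / denominator for eta in linear_predictors]
    return [reference] + others


# Hypothetical example with q = 3 categories and two predictors.
probabilities = category_probabilities(
    coefficients=[[-1.0, 0.8, 0.1], [-2.0, 1.2, -0.3]],
    predictors=[1.0, 2.0],
)
print(probabilities)
```

    Exponentiating a slope coefficient gives the odds ratio for that category relative to the reference category per unit change in the predictor.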

  20. Multinomial logistic regression modelling of obesity and overweight among primary school students in a rural area of Negeri Sembilan

    Science.gov (United States)

    Ghazali, Amirul Syafiq Mohd; Ali, Zalila; Noor, Norlida Mohd; Baharum, Adam

    2015-10-01

    Multinomial logistic regression is widely used to model the outcomes of a polytomous response variable, a categorical dependent variable with more than two categories. The model assumes that the conditional mean of the dependent categorical variables is the logistic function of an affine combination of predictor variables. Its procedure gives a number of logistic regression models that make specific comparisons of the response categories. When there are q categories of the response variable, the model consists of q-1 logit equations which are fitted simultaneously. The model is validated by variable selection procedures, tests of regression coefficients, a significance test of the overall model, goodness-of-fit measures, and validation of predicted probabilities using odds ratios. This study used the multinomial logistic regression model to investigate obesity and overweight among primary school students in a rural area on the basis of their demographic profiles, lifestyles, and diet and food intake. The results indicated that obesity and overweight of students are related to gender, religion, sleep duration, time spent on electronic games, breakfast intake in a week, with whom meals are taken, protein intake, and also the interaction between breakfast intake in a week and sleep duration, and the interaction between gender and protein intake.