WorldWideScience

Sample records for hierarchical multivariate regression

  1. Multivariate and semiparametric kernel regression

    OpenAIRE

    Härdle, Wolfgang; Müller, Marlene

    1997-01-01

    The paper gives an introduction to theory and application of multivariate and semiparametric kernel smoothing. Multivariate nonparametric density estimation is an often used pilot tool for examining the structure of data. Regression smoothing helps in investigating the association between covariates and responses. We concentrate on kernel smoothing using local polynomial fitting which includes the Nadaraya-Watson estimator. Some theory on the asymptotic behavior and bandwidth selection is pro...

  2. Multivariate Regression Analysis and Slaughter Livestock,

    Science.gov (United States)

    AGRICULTURE, *ECONOMICS), (*MEAT, PRODUCTION), MULTIVARIATE ANALYSIS, REGRESSION ANALYSIS , ANIMALS, WEIGHT, COSTS, PREDICTIONS, STABILITY, MATHEMATICAL MODELS, STORAGE, BEEF, PORK, FOOD, STATISTICAL DATA, ACCURACY

  3. Hierarchical regression analysis in structural Equation Modeling

    NARCIS (Netherlands)

    de Jong, P.F.

    1999-01-01

    In a hierarchical or fixed-order regression analysis, the independent variables are entered into the regression equation in a prespecified order. Such an analysis is often performed when the extra amount of variance accounted for in a dependent variable by a specific independent variable is the main

  4. Regression Models For Multivariate Count Data.

    Science.gov (United States)

    Zhang, Yiwen; Zhou, Hua; Zhou, Jin; Sun, Wei

    2017-01-01

    Data with multivariate count responses frequently occur in modern applications. The commonly used multinomial-logit model is limiting due to its restrictive mean-variance structure. For instance, analyzing count data from the recent RNA-seq technology by the multinomial-logit model leads to serious errors in hypothesis testing. The ubiquity of over-dispersion and complicated correlation structures among multivariate counts calls for more flexible regression models. In this article, we study some generalized linear models that incorporate various correlation structures among the counts. Current literature lacks a treatment of these models, partly due to the fact that they do not belong to the natural exponential family. We study the estimation, testing, and variable selection for these models in a unifying framework. The regression models are compared on both synthetic and real RNA-seq data.

  5. Bayesian Inference of a Multivariate Regression Model

    Directory of Open Access Journals (Sweden)

    Marick S. Sinay

    2014-01-01

    Full Text Available We explore Bayesian inference of a multivariate linear regression model with use of a flexible prior for the covariance structure. The commonly adopted Bayesian setup involves the conjugate prior, multivariate normal distribution for the regression coefficients and inverse Wishart specification for the covariance matrix. Here we depart from this approach and propose a novel Bayesian estimator for the covariance. A multivariate normal prior for the unique elements of the matrix logarithm of the covariance matrix is considered. Such structure allows for a richer class of prior distributions for the covariance, with respect to strength of beliefs in prior location hyperparameters, as well as the added ability, to model potential correlation amongst the covariance structure. The posterior moments of all relevant parameters of interest are calculated based upon numerical results via a Markov chain Monte Carlo procedure. The Metropolis-Hastings-within-Gibbs algorithm is invoked to account for the construction of a proposal density that closely matches the shape of the target posterior distribution. As an application of the proposed technique, we investigate a multiple regression based upon the 1980 High School and Beyond Survey.

  6. Retro-regression--another important multivariate regression improvement.

    Science.gov (United States)

    Randić, M

    2001-01-01

    We review the serious problem associated with instabilities of the coefficients of regression equations, referred to as the MRA (multivariate regression analysis) "nightmare of the first kind". This is manifested when in a stepwise regression a descriptor is included or excluded from a regression. The consequence is an unpredictable change of the coefficients of the descriptors that remain in the regression equation. We follow with consideration of an even more serious problem, referred to as the MRA "nightmare of the second kind", arising when optimal descriptors are selected from a large pool of descriptors. This process typically causes at different steps of the stepwise regression a replacement of several previously used descriptors by new ones. We describe a procedure that resolves these difficulties. The approach is illustrated on boiling points of nonanes which are considered (1) by using an ordered connectivity basis; (2) by using an ordering resulting from application of greedy algorithm; and (3) by using an ordering derived from an exhaustive search for optimal descriptors. A novel variant of multiple regression analysis, called retro-regression (RR), is outlined showing how it resolves the ambiguities associated with both "nightmares" of the first and the second kind of MRA.

  7. Regularized multivariate regression models with skew-t error distributions

    KAUST Repository

    Chen, Lianfu

    2014-06-01

    We consider regularization of the parameters in multivariate linear regression models with the errors having a multivariate skew-t distribution. An iterative penalized likelihood procedure is proposed for constructing sparse estimators of both the regression coefficient and inverse scale matrices simultaneously. The sparsity is introduced through penalizing the negative log-likelihood by adding L1-penalties on the entries of the two matrices. Taking advantage of the hierarchical representation of skew-t distributions, and using the expectation conditional maximization (ECM) algorithm, we reduce the problem to penalized normal likelihood and develop a procedure to minimize the ensuing objective function. Using a simulation study the performance of the method is assessed, and the methodology is illustrated using a real data set with a 24-dimensional response vector. © 2014 Elsevier B.V.

  8. AN APPLICATION OF FUNCTIONAL MULTIVARIATE REGRESSION MODEL TO MULTICLASS CLASSIFICATION

    OpenAIRE

    Krzyśko, Mirosław; Smaga, Łukasz

    2017-01-01

    In this paper, the scale response functional multivariate regression model is considered. By using the basis functions representation of functional predictors and regression coefficients, this model is rewritten as a multivariate regression model. This representation of the functional multivariate regression model is used for multiclass classification for multivariate functional data. Computational experiments performed on real labelled data sets demonstrate the effectiveness of the proposed ...

  9. Multivariate sparse group lasso for the multivariate multiple linear regression with an arbitrary group structure.

    Science.gov (United States)

    Li, Yanming; Nan, Bin; Zhu, Ji

    2015-06-01

    We propose a multivariate sparse group lasso variable selection and estimation method for data with high-dimensional predictors as well as high-dimensional response variables. The method is carried out through a penalized multivariate multiple linear regression model with an arbitrary group structure for the regression coefficient matrix. It suits many biology studies well in detecting associations between multiple traits and multiple predictors, with each trait and each predictor embedded in some biological functional groups such as genes, pathways or brain regions. The method is able to effectively remove unimportant groups as well as unimportant individual coefficients within important groups, particularly for large p small n problems, and is flexible in handling various complex group structures such as overlapping or nested or multilevel hierarchical structures. The method is evaluated through extensive simulations with comparisons to the conventional lasso and group lasso methods, and is applied to an eQTL association study. © 2015, The International Biometric Society.

  10. Entrepreneurial intention modeling using hierarchical multiple regression

    Directory of Open Access Journals (Sweden)

    Marina Jeger

    2014-12-01

    Full Text Available The goal of this study is to identify the contribution of effectuation dimensions to the predictive power of the entrepreneurial intention model over and above that which can be accounted for by other predictors selected and confirmed in previous studies. As is often the case in social and behavioral studies, some variables are likely to be highly correlated with each other. Therefore, the relative amount of variance in the criterion variable explained by each of the predictors depends on several factors such as the order of variable entry and sample specifics. The results show the modest predictive power of two dimensions of effectuation prior to the introduction of the theory of planned behavior elements. The article highlights the main advantages of applying hierarchical regression in social sciences as well as in the specific context of entrepreneurial intention formation, and addresses some of the potential pitfalls that this type of analysis entails.

  11. Hierarchical multivariate covariance analysis of metabolic connectivity.

    Science.gov (United States)

    Carbonell, Felix; Charil, Arnaud; Zijdenbos, Alex P; Evans, Alan C; Bedell, Barry J

    2014-12-01

    Conventional brain connectivity analysis is typically based on the assessment of interregional correlations. Given that correlation coefficients are derived from both covariance and variance, group differences in covariance may be obscured by differences in the variance terms. To facilitate a comprehensive assessment of connectivity, we propose a unified statistical framework that interrogates the individual terms of the correlation coefficient. We have evaluated the utility of this method for metabolic connectivity analysis using [18F]2-fluoro-2-deoxyglucose (FDG) positron emission tomography (PET) data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) study. As an illustrative example of the utility of this approach, we examined metabolic connectivity in angular gyrus and precuneus seed regions of mild cognitive impairment (MCI) subjects with low and high β-amyloid burdens. This new multivariate method allowed us to identify alterations in the metabolic connectome, which would not have been detected using classic seed-based correlation analysis. Ultimately, this novel approach should be extensible to brain network analysis and broadly applicable to other imaging modalities, such as functional magnetic resonance imaging (MRI).

  12. Regularized multivariate regression models with skew-t error distributions

    KAUST Repository

    Chen, Lianfu; Pourahmadi, Mohsen; Maadooliat, Mehdi

    2014-01-01

    We consider regularization of the parameters in multivariate linear regression models with the errors having a multivariate skew-t distribution. An iterative penalized likelihood procedure is proposed for constructing sparse estimators of both

  13. Stepwise versus Hierarchical Regression: Pros and Cons

    Science.gov (United States)

    Lewis, Mitzi

    2007-01-01

    Multiple regression is commonly used in social and behavioral data analysis. In multiple regression contexts, researchers are very often interested in determining the "best" predictors in the analysis. This focus may stem from a need to identify those predictors that are supportive of theory. Alternatively, the researcher may simply be interested…

  14. Sunspot Cycle Prediction Using Multivariate Regression and Binary ...

    Indian Academy of Sciences (India)

    49

    Multivariate regression model has been derived based on the available cycles 1 .... The flare index correlates well with various parameters of the solar activity. ...... 32) Sabarinath A and Anilkumar A K 2011 A stochastic prediction model for the.

  15. A Scalable Local Algorithm for Distributed Multivariate Regression

    Data.gov (United States)

    National Aeronautics and Space Administration — This paper offers a local distributed algorithm for multivariate regression in large peer-to-peer environments. The algorithm can be used for distributed...

  16. Multivariate Regression of Liver on Intestine of Mice: A ...

    African Journals Online (AJOL)

    Multivariate Regression of Liver on Intestine of Mice: A Chemotherapeutic Evaluation of Plant ... Using an analysis of covariance model, the effects ... The findings revealed, with the aid of likelihood-ratio statistic, a marked improvement in

  17. An Efficient Local Algorithm for Distributed Multivariate Regression

    Data.gov (United States)

    National Aeronautics and Space Administration — This paper offers a local distributed algorithm for multivariate regression in large peer-to-peer environments. The algorithm is designed for distributed...

  18. Hierarchical Neural Regression Models for Customer Churn Prediction

    Directory of Open Access Journals (Sweden)

    Golshan Mohammadi

    2013-01-01

    Full Text Available As customers are the main assets of each industry, customer churn prediction is becoming a major task for companies to remain in competition with competitors. In the literature, the better applicability and efficiency of hierarchical data mining techniques has been reported. This paper considers three hierarchical models by combining four different data mining techniques for churn prediction, which are backpropagation artificial neural networks (ANN, self-organizing maps (SOM, alpha-cut fuzzy c-means (α-FCM, and Cox proportional hazards regression model. The hierarchical models are ANN + ANN + Cox, SOM + ANN + Cox, and α-FCM + ANN + Cox. In particular, the first component of the models aims to cluster data in two churner and nonchurner groups and also filter out unrepresentative data or outliers. Then, the clustered data as the outputs are used to assign customers to churner and nonchurner groups by the second technique. Finally, the correctly classified data are used to create Cox proportional hazards model. To evaluate the performance of the hierarchical models, an Iranian mobile dataset is considered. The experimental results show that the hierarchical models outperform the single Cox regression baseline model in terms of prediction accuracy, Types I and II errors, RMSE, and MAD metrics. In addition, the α-FCM + ANN + Cox model significantly performs better than the two other hierarchical models.

  19. A joint model for multivariate hierarchical semicontinuous data with replications.

    Science.gov (United States)

    Kassahun-Yimer, Wondwosen; Albert, Paul S; Lipsky, Leah M; Nansel, Tonja R; Liu, Aiyi

    2017-01-01

    Longitudinal data are often collected in biomedical applications in such a way that measurements on more than one response are taken from a given subject repeatedly overtime. For some problems, these multiple profiles need to be modeled jointly to get insight on the joint evolution and/or association of these responses over time. In practice, such longitudinal outcomes may have many zeros that need to be accounted for in the analysis. For example, in dietary intake studies, as we focus on in this paper, some food components are eaten daily by almost all subjects, while others are consumed episodically, where individuals have time periods where they do not eat these components followed by periods where they do. These episodically consumed foods need to be adequately modeled to account for the many zeros that are encountered. In this paper, we propose a joint model to analyze multivariate hierarchical semicontinuous data characterized by many zeros and more than one replicate observations at each measurement occasion. This approach allows for different probability mechanisms for describing the zero behavior as compared with the mean intake given that the individual consumes the food. To deal with the potentially large number of multivariate profiles, we use a pairwise model fitting approach that was developed in the context of multivariate Gaussian random effects models with large number of multivariate components. The novelty of the proposed approach is that it incorporates: (1) multivariate, possibly correlated, response variables; (2) within subject correlation resulting from repeated measurements taken from each subject; (3) many zero observations; (4) overdispersion; and (5) replicate measurements at each visit time.

  20. Asymptotics of Multivariate Regression with Consecutively Added Dependent Varibles

    NARCIS (Netherlands)

    Raats, V.M.; van der Genugten, B.B.; Moors, J.J.A.

    2004-01-01

    We consider multivariate regression where new dependent variables are consecutively added during the experiment (or in time).So, viewed at the end of the experiment, the number of observations decreases with each added variable. The explanatory variables are observed throughout.In a previous paper

  1. Multivariate Local Polynomial Regression with Application to Shenzhen Component Index

    Directory of Open Access Journals (Sweden)

    Liyun Su

    2011-01-01

    Full Text Available This study attempts to characterize and predict stock index series in Shenzhen stock market using the concepts of multivariate local polynomial regression. Based on nonlinearity and chaos of the stock index time series, multivariate local polynomial prediction methods and univariate local polynomial prediction method, all of which use the concept of phase space reconstruction according to Takens' Theorem, are considered. To fit the stock index series, the single series changes into bivariate series. To evaluate the results, the multivariate predictor for bivariate time series based on multivariate local polynomial model is compared with univariate predictor with the same Shenzhen stock index data. The numerical results obtained by Shenzhen component index show that the prediction mean squared error of the multivariate predictor is much smaller than the univariate one and is much better than the existed three methods. Even if the last half of the training data are used in the multivariate predictor, the prediction mean squared error is smaller than the univariate predictor. Multivariate local polynomial prediction model for nonsingle time series is a useful tool for stock market price prediction.

  2. Production optimisation in the petrochemical industry by hierarchical multivariate modelling

    Energy Technology Data Exchange (ETDEWEB)

    Andersson, Magnus; Furusjoe, Erik; Jansson, Aasa

    2004-06-01

    This project demonstrates the advantages of applying hierarchical multivariate modelling in the petrochemical industry in order to increase knowledge of the total process. The models indicate possible ways to optimise the process regarding the use of energy and raw material, which is directly linked to the environmental impact of the process. The refinery of Nynaes Refining AB (Goeteborg, Sweden) has acted as a demonstration site in this project. The models developed for the demonstration site resulted in: Detection of an unknown process disturbance and suggestions of possible causes; Indications on how to increase the yield in combination with energy savings; The possibility to predict product quality from on-line process measurements, making the results available at a higher frequency than customary laboratory analysis; Quantification of the gradually lowered efficiency of heat transfer in the furnace and increased fuel consumption as an effect of soot build-up on the furnace coils; Increased knowledge of the relation between production rate and the efficiency of the heat exchangers. This report is one of two reports from the project. It contains a technical discussion of the result with some degree of detail. A shorter and more easily accessible report is also available, see IVL report B1586-A.

  3. Preference learning with evolutionary Multivariate Adaptive Regression Spline model

    DEFF Research Database (Denmark)

    Abou-Zleikha, Mohamed; Shaker, Noor; Christensen, Mads Græsbøll

    2015-01-01

    This paper introduces a novel approach for pairwise preference learning through combining an evolutionary method with Multivariate Adaptive Regression Spline (MARS). Collecting users' feedback through pairwise preferences is recommended over other ranking approaches as this method is more appealing...... for function approximation as well as being relatively easy to interpret. MARS models are evolved based on their efficiency in learning pairwise data. The method is tested on two datasets that collectively provide pairwise preference data of five cognitive states expressed by users. The method is analysed...

  4. REGSTEP - stepwise multivariate polynomial regression with singular extensions

    International Nuclear Information System (INIS)

    Davierwalla, D.M.

    1977-09-01

    The program REGSTEP determines a polynomial approximation, in the least squares sense, to tabulated data. The polynomial may be univariate or multivariate. The computational method is that of stepwise regression. A variable is inserted into the regression basis if it is significant with respect to an appropriate F-test at a preselected risk level. In addition, should a variable already in the basis, become nonsignificant (again with respect to an appropriate F-test) after the entry of a new variable, it is expelled from the model. Thus only significant variables are retained in the model. Although written expressly to be incorporated into CORCOD, a code for predicting nuclear cross sections for given values of power, temperature, void fractions, Boron content etc. there is nothing to limit the use of REGSTEP to nuclear applications, as the examples demonstrate. A separate version has been incorporated into RSYST for the general user. (Auth.)

  5. Collision prediction models using multivariate Poisson-lognormal regression.

    Science.gov (United States)

    El-Basyouny, Karim; Sayed, Tarek

    2009-07-01

    This paper advocates the use of multivariate Poisson-lognormal (MVPLN) regression to develop models for collision count data. The MVPLN approach presents an opportunity to incorporate the correlations across collision severity levels and their influence on safety analyses. The paper introduces a new multivariate hazardous location identification technique, which generalizes the univariate posterior probability of excess that has been commonly proposed and applied in the literature. In addition, the paper presents an alternative approach for quantifying the effect of the multivariate structure on the precision of expected collision frequency. The MVPLN approach is compared with the independent (separate) univariate Poisson-lognormal (PLN) models with respect to model inference, goodness-of-fit, identification of hot spots and precision of expected collision frequency. The MVPLN is modeled using the WinBUGS platform which facilitates computation of posterior distributions as well as providing a goodness-of-fit measure for model comparisons. The results indicate that the estimates of the extra Poisson variation parameters were considerably smaller under MVPLN leading to higher precision. The improvement in precision is due mainly to the fact that MVPLN accounts for the correlation between the latent variables representing property damage only (PDO) and injuries plus fatalities (I+F). This correlation was estimated at 0.758, which is highly significant, suggesting that higher PDO rates are associated with higher I+F rates, as the collision likelihood for both types is likely to rise due to similar deficiencies in roadway design and/or other unobserved factors. In terms of goodness-of-fit, the MVPLN model provided a superior fit than the independent univariate models. The multivariate hazardous location identification results demonstrated that some hazardous locations could be overlooked if the analysis was restricted to the univariate models.

  6. Multivariate Frequency-Severity Regression Models in Insurance

    Directory of Open Access Journals (Sweden)

    Edward W. Frees

    2016-02-01

    Full Text Available In insurance and related industries including healthcare, it is common to have several outcome measures that the analyst wishes to understand using explanatory variables. For example, in automobile insurance, an accident may result in payments for damage to one’s own vehicle, damage to another party’s vehicle, or personal injury. It is also common to be interested in the frequency of accidents in addition to the severity of the claim amounts. This paper synthesizes and extends the literature on multivariate frequency-severity regression modeling with a focus on insurance industry applications. Regression models for understanding the distribution of each outcome continue to be developed yet there now exists a solid body of literature for the marginal outcomes. This paper contributes to this body of literature by focusing on the use of a copula for modeling the dependence among these outcomes; a major advantage of this tool is that it preserves the body of work established for marginal models. We illustrate this approach using data from the Wisconsin Local Government Property Insurance Fund. This fund offers insurance protection for (i property; (ii motor vehicle; and (iii contractors’ equipment claims. In addition to several claim types and frequency-severity components, outcomes can be further categorized by time and space, requiring complex dependency modeling. We find significant dependencies for these data; specifically, we find that dependencies among lines are stronger than the dependencies between the frequency and average severity within each line.

  7. A generalized multivariate regression model for modelling ocean wave heights

    Science.gov (United States)

    Wang, X. L.; Feng, Y.; Swail, V. R.

    2012-04-01

    In this study, a generalized multivariate linear regression model is developed to represent the relationship between 6-hourly ocean significant wave heights (Hs) and the corresponding 6-hourly mean sea level pressure (MSLP) fields. The model is calibrated using the ERA-Interim reanalysis of Hs and MSLP fields for 1981-2000, and is validated using the ERA-Interim reanalysis for 2001-2010 and ERA40 reanalysis of Hs and MSLP for 1958-2001. The performance of the fitted model is evaluated in terms of Pierce skill score, frequency bias index, and correlation skill score. Being not normally distributed, wave heights are subjected to a data adaptive Box-Cox transformation before being used in the model fitting. Also, since 6-hourly data are being modelled, lag-1 autocorrelation must be and is accounted for. The models with and without Box-Cox transformation, and with and without accounting for autocorrelation, are inter-compared in terms of their prediction skills. The fitted MSLP-Hs relationship is then used to reconstruct historical wave height climate from the 6-hourly MSLP fields taken from the Twentieth Century Reanalysis (20CR, Compo et al. 2011), and to project possible future wave height climates using CMIP5 model simulations of MSLP fields. The reconstructed and projected wave heights, both seasonal means and maxima, are subject to a trend analysis that allows for non-linear (polynomial) trends.

  8. Analyzing thresholds and efficiency with hierarchical Bayesian logistic regression.

    Science.gov (United States)

    Houpt, Joseph W; Bittner, Jennifer L

    2018-05-10

    Ideal observer analysis is a fundamental tool used widely in vision science for analyzing the efficiency with which a cognitive or perceptual system uses available information. The performance of an ideal observer provides a formal measure of the amount of information in a given experiment. The ratio of human to ideal performance is then used to compute efficiency, a construct that can be directly compared across experimental conditions while controlling for the differences due to the stimuli and/or task specific demands. In previous research using ideal observer analysis, the effects of varying experimental conditions on efficiency have been tested using ANOVAs and pairwise comparisons. In this work, we present a model that combines Bayesian estimates of psychometric functions with hierarchical logistic regression for inference about both unadjusted human performance metrics and efficiencies. Our approach improves upon the existing methods by constraining the statistical analysis using a standard model connecting stimulus intensity to human observer accuracy and by accounting for variability in the estimates of human and ideal observer performance scores. This allows for both individual and group level inferences. Copyright © 2018 Elsevier Ltd. All rights reserved.

  9. Multivariate Regression of Liver on Intestine of Mice: A ...

    African Journals Online (AJOL)

    FIRST LADY

    pairs recovered. Linear, semi-logarithmic and logarithmic-logarithmic (log- log) regressions were performed. He chose the log-log curves because its variance was more uniform. The statistical comparison of .... E(U1| U2 = u2) is the regression function of U1 on U2, and Var (U1|U2 = u2) is the conditional covariance matrix.

  10. Real estate value prediction using multivariate regression models

    Science.gov (United States)

    Manjula, R.; Jain, Shubham; Srivastava, Sharad; Rajiv Kher, Pranav

    2017-11-01

    The real estate market is one of the most competitive in terms of pricing and the same tends to vary significantly based on a lot of factors, hence it becomes one of the prime fields to apply the concepts of machine learning to optimize and predict the prices with high accuracy. Therefore in this paper, we present various important features to use while predicting housing prices with good accuracy. We have described regression models, using various features to have lower Residual Sum of Squares error. While using features in a regression model some feature engineering is required for better prediction. Often a set of features (multiple regressions) or polynomial regression (applying a various set of powers in the features) is used for making better model fit. For these models are expected to be susceptible towards over fitting ridge regression is used to reduce it. This paper thus directs to the best application of regression models in addition to other techniques to optimize the result.

  11. Hierarchical Decompositions for the Computation of High-Dimensional Multivariate Normal Probabilities

    KAUST Repository

    Genton, Marc G.

    2017-09-07

    We present a hierarchical decomposition scheme for computing the n-dimensional integral of multivariate normal probabilities that appear frequently in statistics. The scheme exploits the fact that the formally dense covariance matrix can be approximated by a matrix with a hierarchical low rank structure. It allows the reduction of the computational complexity per Monte Carlo sample from O(n2) to O(mn+knlog(n/m)), where k is the numerical rank of off-diagonal matrix blocks and m is the size of small diagonal blocks in the matrix that are not well-approximated by low rank factorizations and treated as dense submatrices. This hierarchical decomposition leads to substantial efficiencies in multivariate normal probability computations and allows integrations in thousands of dimensions to be practical on modern workstations.

  12. Hierarchical Decompositions for the Computation of High-Dimensional Multivariate Normal Probabilities

    KAUST Repository

    Genton, Marc G.; Keyes, David E.; Turkiyyah, George

    2017-01-01

    We present a hierarchical decomposition scheme for computing the n-dimensional integral of multivariate normal probabilities that appear frequently in statistics. The scheme exploits the fact that the formally dense covariance matrix can be approximated by a matrix with a hierarchical low rank structure. It allows the reduction of the computational complexity per Monte Carlo sample from O(n2) to O(mn+knlog(n/m)), where k is the numerical rank of off-diagonal matrix blocks and m is the size of small diagonal blocks in the matrix that are not well-approximated by low rank factorizations and treated as dense submatrices. This hierarchical decomposition leads to substantial efficiencies in multivariate normal probability computations and allows integrations in thousands of dimensions to be practical on modern workstations.

  13. Nonparametric Regression Estimation for Multivariate Null Recurrent Processes

    Directory of Open Access Journals (Sweden)

    Biqing Cai

    2015-04-01

    Full Text Available This paper discusses nonparametric kernel regression with the regressor being a \\(d\\-dimensional \\(\\beta\\-null recurrent process in presence of conditional heteroscedasticity. We show that the mean function estimator is consistent with convergence rate \\(\\sqrt{n(Th^{d}}\\, where \\(n(T\\ is the number of regenerations for a \\(\\beta\\-null recurrent process and the limiting distribution (with proper normalization is normal. Furthermore, we show that the two-step estimator for the volatility function is consistent. The finite sample performance of the estimate is quite reasonable when the leave-one-out cross validation method is used for bandwidth selection. We apply the proposed method to study the relationship of Federal funds rate with 3-month and 5-year T-bill rates and discover the existence of nonlinearity of the relationship. Furthermore, the in-sample and out-of-sample performance of the nonparametric model is far better than the linear model.

  14. Ultracentrifuge separative power modeling with multivariate regression using covariance matrix

    International Nuclear Information System (INIS)

    Migliavacca, Elder

    2004-01-01

    In this work, the least-squares methodology with covariance matrix is applied to determine a data curve fitting to obtain a performance function for the separative power δU of a ultracentrifuge as a function of variables that are experimentally controlled. The experimental data refer to 460 experiments on the ultracentrifugation process for uranium isotope separation. The experimental uncertainties related with these independent variables are considered in the calculation of the experimental separative power values, determining an experimental data input covariance matrix. The process variables, which significantly influence the δU values are chosen in order to give information on the ultracentrifuge behaviour when submitted to several levels of feed flow rate F, cut θ and product line pressure P p . After the model goodness-of-fit validation, a residual analysis is carried out to verify the assumed basis concerning its randomness and independence and mainly the existence of residual heteroscedasticity with any explained regression model variable. The surface curves are made relating the separative power with the control variables F, θ and P p to compare the fitted model with the experimental data and finally to calculate their optimized values. (author)

  15. Hierarchical Matching and Regression with Application to Photometric Redshift Estimation

    Science.gov (United States)

    Murtagh, Fionn

    2017-06-01

    This work emphasizes that heterogeneity, diversity, discontinuity, and discreteness in data is to be exploited in classification and regression problems. A global a priori model may not be desirable. For data analytics in cosmology, this is motivated by the variety of cosmological objects such as elliptical, spiral, active, and merging galaxies at a wide range of redshifts. Our aim is matching and similarity-based analytics that takes account of discrete relationships in the data. The information structure of the data is represented by a hierarchy or tree where the branch structure, rather than just the proximity, is important. The representation is related to p-adic number theory. The clustering or binning of the data values, related to the precision of the measurements, has a central role in this methodology. If used for regression, our approach is a method of cluster-wise regression, generalizing nearest neighbour regression. Both to exemplify this analytics approach, and to demonstrate computational benefits, we address the well-known photometric redshift or `photo-z' problem, seeking to match Sloan Digital Sky Survey (SDSS) spectroscopic and photometric redshifts.

  16. Creating Hierarchical Pores by Controlled Linker Thermolysis in Multivariate Metal-Organic Frameworks.

    Science.gov (United States)

    Feng, Liang; Yuan, Shuai; Zhang, Liang-Liang; Tan, Kui; Li, Jia-Luo; Kirchon, Angelo; Liu, Ling-Mei; Zhang, Peng; Han, Yu; Chabal, Yves J; Zhou, Hong-Cai

    2018-02-14

    Sufficient pore size, appropriate stability, and hierarchical porosity are three prerequisites for open frameworks designed for drug delivery, enzyme immobilization, and catalysis involving large molecules. Herein, we report a powerful and general strategy, linker thermolysis, to construct ultrastable hierarchically porous metal-organic frameworks (HP-MOFs) with tunable pore size distribution. Linker instability, usually an undesirable trait of MOFs, was exploited to create mesopores by generating crystal defects throughout a microporous MOF crystal via thermolysis. The crystallinity and stability of HP-MOFs remain after thermolabile linkers are selectively removed from multivariate metal-organic frameworks (MTV-MOFs) through a decarboxylation process. A domain-based linker spatial distribution was found to be critical for creating hierarchical pores inside MTV-MOFs. Furthermore, linker thermolysis promotes the formation of ultrasmall metal oxide nanoparticles immobilized in an open framework that exhibits high catalytic activity for Lewis acid-catalyzed reactions. Most importantly, this work provides fresh insights into the connection between linker apportionment and vacancy distribution, which may shed light on probing the disordered linker apportionment in multivariate systems, a long-standing challenge in the study of MTV-MOFs.

  17. Creating Hierarchical Pores by Controlled Linker Thermolysis in Multivariate Metal-Organic Frameworks

    KAUST Repository

    Feng, Liang

    2018-01-18

    Sufficient pore size, appropriate stability and hierarchical porosity are three prerequisites for open frameworks designed for drug delivery, enzyme immobilization and catalysis involving large molecules. Herein, we report a powerful and general strate-gy, linker thermolysis, to construct ultra-stable hierarchically porous metal−organic frameworks (HP-MOFs) with tunable pore size distribution. Linker instability, usually an undesirable trait of MOFs, was exploited to create mesopores by generating crystal defects throughout a microporous MOF crystal via thermolysis. The crystallinity and stability of HP-MOFs remain after thermolabile linkers are selectively removed from multivariate metal-organic frameworks (MTV-MOFs) through a decarboxyla-tion process. A domain-based linker spatial distribution was found to be critical for creating hierarchical pores inside MTV-MOFs. Furthermore, linker thermolysis promotes the formation of ultra-small metal oxide (MO) nanoparticles immobilized in an open framework that exhibits high catalytic activity for Lewis acid catalyzed reactions. Most importantly, this work pro-vides fresh insights into the connection between linker apportionment and vacancy distribution, which may shed light on prob-ing the disordered linker apportionment in multivariate systems, a long-standing challenge in the study of MTV-MOFs.

  18. Supremum Norm Posterior Contraction and Credible Sets for Nonparametric Multivariate Regression

    NARCIS (Netherlands)

    Yoo, W.W.; Ghosal, S

    2016-01-01

    In the setting of nonparametric multivariate regression with unknown error variance, we study asymptotic properties of a Bayesian method for estimating a regression function f and its mixed partial derivatives. We use a random series of tensor product of B-splines with normal basis coefficients as a

  19. Fourier transform infrared spectroscopic imaging and multivariate regression for prediction of proteoglycan content of articular cartilage.

    Directory of Open Access Journals (Sweden)

    Lassi Rieppo

    Full Text Available Fourier Transform Infrared (FT-IR spectroscopic imaging has been earlier applied for the spatial estimation of the collagen and the proteoglycan (PG contents of articular cartilage (AC. However, earlier studies have been limited to the use of univariate analysis techniques. Current analysis methods lack the needed specificity for collagen and PGs. The aim of the present study was to evaluate the suitability of partial least squares regression (PLSR and principal component regression (PCR methods for the analysis of the PG content of AC. Multivariate regression models were compared with earlier used univariate methods and tested with a sample material consisting of healthy and enzymatically degraded steer AC. Chondroitinase ABC enzyme was used to increase the variation in PG content levels as compared to intact AC. Digital densitometric measurements of Safranin O-stained sections provided the reference for PG content. The results showed that multivariate regression models predict PG content of AC significantly better than earlier used absorbance spectrum (i.e. the area of carbohydrate region with or without amide I normalization or second derivative spectrum univariate parameters. Increased molecular specificity favours the use of multivariate regression models, but they require more knowledge of chemometric analysis and extended laboratory resources for gathering reference data for establishing the models. When true molecular specificity is required, the multivariate models should be used.

  20. Prognostic factorsin inoperable adenocarcinoma of the lung: A multivariate regression analysis of 259 patiens

    DEFF Research Database (Denmark)

    Sørensen, Jens Benn; Badsberg, Jens Henrik; Olsen, Jens

    1989-01-01

    The prognostic factors for survival in advanced adenocarcinoma of the lung were investigated in a consecutive series of 259 patients treated with chemotherapy. Twenty-eight pretreatment variables were investigated by use of Cox's multivariate regression model, including histological subtypes and ...

  1. Depth-weighted robust multivariate regression with application to sparse data

    KAUST Repository

    Dutta, Subhajit; Genton, Marc G.

    2017-01-01

    A robust method for multivariate regression is developed based on robust estimators of the joint location and scatter matrix of the explanatory and response variables using the notion of data depth. The multivariate regression estimator possesses desirable affine equivariance properties, achieves the best breakdown point of any affine equivariant estimator, and has an influence function which is bounded in both the response as well as the predictor variable. To increase the efficiency of this estimator, a re-weighted estimator based on robust Mahalanobis distances of the residual vectors is proposed. In practice, the method is more stable than existing methods that are constructed using subsamples of the data. The resulting multivariate regression technique is computationally feasible, and turns out to perform better than several popular robust multivariate regression methods when applied to various simulated data as well as a real benchmark data set. When the data dimension is quite high compared to the sample size it is still possible to use meaningful notions of data depth along with the corresponding depth values to construct a robust estimator in a sparse setting.

  2. Correcting for multivariate measurement error by regression calibration in meta-analyses of epidemiological studies.

    NARCIS (Netherlands)

    Kromhout, D.

    2009-01-01

    Within-person variability in measured values of multiple risk factors can bias their associations with disease. The multivariate regression calibration (RC) approach can correct for such measurement error and has been applied to studies in which true values or independent repeat measurements of the

  3. Comparing treatment effects after adjustment with multivariable Cox proportional hazards regression and propensity score methods

    NARCIS (Netherlands)

    Martens, Edwin P; de Boer, Anthonius; Pestman, Wiebe R; Belitser, Svetlana V; Stricker, Bruno H Ch; Klungel, Olaf H

    PURPOSE: To compare adjusted effects of drug treatment for hypertension on the risk of stroke from propensity score (PS) methods with a multivariable Cox proportional hazards (Cox PH) regression in an observational study with censored data. METHODS: From two prospective population-based cohort

  4. Depth-weighted robust multivariate regression with application to sparse data

    KAUST Repository

    Dutta, Subhajit

    2017-04-05

    A robust method for multivariate regression is developed based on robust estimators of the joint location and scatter matrix of the explanatory and response variables using the notion of data depth. The multivariate regression estimator possesses desirable affine equivariance properties, achieves the best breakdown point of any affine equivariant estimator, and has an influence function which is bounded in both the response as well as the predictor variable. To increase the efficiency of this estimator, a re-weighted estimator based on robust Mahalanobis distances of the residual vectors is proposed. In practice, the method is more stable than existing methods that are constructed using subsamples of the data. The resulting multivariate regression technique is computationally feasible, and turns out to perform better than several popular robust multivariate regression methods when applied to various simulated data as well as a real benchmark data set. When the data dimension is quite high compared to the sample size it is still possible to use meaningful notions of data depth along with the corresponding depth values to construct a robust estimator in a sparse setting.

  5. Neighborhood social capital and crime victimization: comparison of spatial regression analysis and hierarchical regression analysis.

    Science.gov (United States)

    Takagi, Daisuke; Ikeda, Ken'ichi; Kawachi, Ichiro

    2012-11-01

    Crime is an important determinant of public health outcomes, including quality of life, mental well-being, and health behavior. A body of research has documented the association between community social capital and crime victimization. The association between social capital and crime victimization has been examined at multiple levels of spatial aggregation, ranging from entire countries, to states, metropolitan areas, counties, and neighborhoods. In multilevel analysis, the spatial boundaries at level 2 are most often drawn from administrative boundaries (e.g., Census tracts in the U.S.). One problem with adopting administrative definitions of neighborhoods is that it ignores spatial spillover. We conducted a study of social capital and crime victimization in one ward of Tokyo city, using a spatial Durbin model with an inverse-distance weighting matrix that assigned each respondent a unique level of "exposure" to social capital based on all other residents' perceptions. The study is based on a postal questionnaire sent to 20-69 years old residents of Arakawa Ward, Tokyo. The response rate was 43.7%. We examined the contextual influence of generalized trust, perceptions of reciprocity, two types of social network variables, as well as two principal components of social capital (constructed from the above four variables). Our outcome measure was self-reported crime victimization in the last five years. In the spatial Durbin model, we found that neighborhood generalized trust, reciprocity, supportive networks and two principal components of social capital were each inversely associated with crime victimization. By contrast, a multilevel regression performed with the same data (using administrative neighborhood boundaries) found generally null associations between neighborhood social capital and crime. Spatial regression methods may be more appropriate for investigating the contextual influence of social capital in homogeneous cultural settings such as Japan. Copyright

  6. Bias correction in the hierarchical likelihood approach to the analysis of multivariate survival data.

    Science.gov (United States)

    Jeon, Jihyoun; Hsu, Li; Gorfine, Malka

    2012-07-01

    Frailty models are useful for measuring unobserved heterogeneity in risk of failures across clusters, providing cluster-specific risk prediction. In a frailty model, the latent frailties shared by members within a cluster are assumed to act multiplicatively on the hazard function. In order to obtain parameter and frailty variate estimates, we consider the hierarchical likelihood (H-likelihood) approach (Ha, Lee and Song, 2001. Hierarchical-likelihood approach for frailty models. Biometrika 88, 233-243) in which the latent frailties are treated as "parameters" and estimated jointly with other parameters of interest. We find that the H-likelihood estimators perform well when the censoring rate is low, however, they are substantially biased when the censoring rate is moderate to high. In this paper, we propose a simple and easy-to-implement bias correction method for the H-likelihood estimators under a shared frailty model. We also extend the method to a multivariate frailty model, which incorporates complex dependence structure within clusters. We conduct an extensive simulation study and show that the proposed approach performs very well for censoring rates as high as 80%. We also illustrate the method with a breast cancer data set. Since the H-likelihood is the same as the penalized likelihood function, the proposed bias correction method is also applicable to the penalized likelihood estimators.

  7. Multivariate Linear Regression and CART Regression Analysis of TBM Performance at Abu Hamour Phase-I Tunnel

    Science.gov (United States)

    Jakubowski, J.; Stypulkowski, J. B.; Bernardeau, F. G.

    2017-12-01

    The first phase of the Abu Hamour drainage and storm tunnel was completed in early 2017. The 9.5 km long, 3.7 m diameter tunnel was excavated with two Earth Pressure Balance (EPB) Tunnel Boring Machines from Herrenknecht. TBM operation processes were monitored and recorded by Data Acquisition and Evaluation System. The authors coupled collected TBM drive data with available information on rock mass properties, cleansed, completed with secondary variables and aggregated by weeks and shifts. Correlations and descriptive statistics charts were examined. Multivariate Linear Regression and CART regression tree models linking TBM penetration rate (PR), penetration per revolution (PPR) and field penetration index (FPI) with TBM operational and geotechnical characteristics were performed for the conditions of the weak/soft rock of Doha. Both regression methods are interpretable and the data were screened with different computational approaches allowing enriched insight. The primary goal of the analysis was to investigate empirical relations between multiple explanatory and responding variables, to search for best subsets of explanatory variables and to evaluate the strength of linear and non-linear relations. For each of the penetration indices, a predictive model coupling both regression methods was built and validated. The resultant models appeared to be stronger than constituent ones and indicated an opportunity for more accurate and robust TBM performance predictions.

  8. High-throughput quantitative biochemical characterization of algal biomass by NIR spectroscopy; multiple linear regression and multivariate linear regression analysis.

    Science.gov (United States)

    Laurens, L M L; Wolfrum, E J

    2013-12-18

    One of the challenges associated with microalgal biomass characterization and the comparison of microalgal strains and conversion processes is the rapid determination of the composition of algae. We have developed and applied a high-throughput screening technology based on near-infrared (NIR) spectroscopy for the rapid and accurate determination of algal biomass composition. We show that NIR spectroscopy can accurately predict the full composition using multivariate linear regression analysis of varying lipid, protein, and carbohydrate content of algal biomass samples from three strains. We also demonstrate a high quality of predictions of an independent validation set. A high-throughput 96-well configuration for spectroscopy gives equally good prediction relative to a ring-cup configuration, and thus, spectra can be obtained from as little as 10-20 mg of material. We found that lipids exhibit a dominant, distinct, and unique fingerprint in the NIR spectrum that allows for the use of single and multiple linear regression of respective wavelengths for the prediction of the biomass lipid content. This is not the case for carbohydrate and protein content, and thus, the use of multivariate statistical modeling approaches remains necessary.

  9. Linear Multivariable Regression Models for Prediction of Eddy Dissipation Rate from Available Meteorological Data

    Science.gov (United States)

    MCKissick, Burnell T. (Technical Monitor); Plassman, Gerald E.; Mall, Gerald H.; Quagliano, John R.

    2005-01-01

    Linear multivariable regression models for predicting day and night Eddy Dissipation Rate (EDR) from available meteorological data sources are defined and validated. Model definition is based on a combination of 1997-2000 Dallas/Fort Worth (DFW) data sources, EDR from Aircraft Vortex Spacing System (AVOSS) deployment data, and regression variables primarily from corresponding Automated Surface Observation System (ASOS) data. Model validation is accomplished through EDR predictions on a similar combination of 1994-1995 Memphis (MEM) AVOSS and ASOS data. Model forms include an intercept plus a single term of fixed optimal power for each of these regression variables; 30-minute forward averaged mean and variance of near-surface wind speed and temperature, variance of wind direction, and a discrete cloud cover metric. Distinct day and night models, regressing on EDR and the natural log of EDR respectively, yield best performance and avoid model discontinuity over day/night data boundaries.

  10. Multivariate nonparametric regression and visualization with R and applications to finance

    CERN Document Server

    Klemelä, Jussi

    2014-01-01

    A modern approach to statistical learning and its applications through visualization methods With a unique and innovative presentation, Multivariate Nonparametric Regression and Visualization provides readers with the core statistical concepts to obtain complete and accurate predictions when given a set of data. Focusing on nonparametric methods to adapt to the multiple types of data generatingmechanisms, the book begins with an overview of classification and regression. The book then introduces and examines various tested and proven visualization techniques for learning samples and functio

  11. Regression Analysis for Multivariate Dependent Count Data Using Convolved Gaussian Processes

    OpenAIRE

    Sofro, A'yunin; Shi, Jian Qing; Cao, Chunzheng

    2017-01-01

    Research on Poisson regression analysis for dependent data has been developed rapidly in the last decade. One of difficult problems in a multivariate case is how to construct a cross-correlation structure and at the meantime make sure that the covariance matrix is positive definite. To address the issue, we propose to use convolved Gaussian process (CGP) in this paper. The approach provides a semi-parametric model and offers a natural framework for modeling common mean structure and covarianc...

  12. Higher-order Multivariable Polynomial Regression to Estimate Human Affective States

    Science.gov (United States)

    Wei, Jie; Chen, Tong; Liu, Guangyuan; Yang, Jiemin

    2016-03-01

    From direct observations, facial, vocal, gestural, physiological, and central nervous signals, estimating human affective states through computational models such as multivariate linear-regression analysis, support vector regression, and artificial neural network, have been proposed in the past decade. In these models, linear models are generally lack of precision because of ignoring intrinsic nonlinearities of complex psychophysiological processes; and nonlinear models commonly adopt complicated algorithms. To improve accuracy and simplify model, we introduce a new computational modeling method named as higher-order multivariable polynomial regression to estimate human affective states. The study employs standardized pictures in the International Affective Picture System to induce thirty subjects’ affective states, and obtains pure affective patterns of skin conductance as input variables to the higher-order multivariable polynomial model for predicting affective valence and arousal. Experimental results show that our method is able to obtain efficient correlation coefficients of 0.98 and 0.96 for estimation of affective valence and arousal, respectively. Moreover, the method may provide certain indirect evidences that valence and arousal have their brain’s motivational circuit origins. Thus, the proposed method can serve as a novel one for efficiently estimating human affective states.

  13. Evaluation of Logistic Regression and Multivariate Adaptive Regression Spline Models for Groundwater Potential Mapping Using R and GIS

    Directory of Open Access Journals (Sweden)

    Soyoung Park

    2017-07-01

    Full Text Available This study mapped and analyzed groundwater potential using two different models, logistic regression (LR and multivariate adaptive regression splines (MARS, and compared the results. A spatial database was constructed for groundwater well data and groundwater influence factors. Groundwater well data with a high potential yield of ≥70 m3/d were extracted, and 859 locations (70% were used for model training, whereas the other 365 locations (30% were used for model validation. We analyzed 16 groundwater influence factors including altitude, slope degree, slope aspect, plan curvature, profile curvature, topographic wetness index, stream power index, sediment transport index, distance from drainage, drainage density, lithology, distance from fault, fault density, distance from lineament, lineament density, and land cover. Groundwater potential maps (GPMs were constructed using LR and MARS models and tested using a receiver operating characteristics curve. Based on this analysis, the area under the curve (AUC for the success rate curve of GPMs created using the MARS and LR models was 0.867 and 0.838, and the AUC for the prediction rate curve was 0.836 and 0.801, respectively. This implies that the MARS model is useful and effective for groundwater potential analysis in the study area.

  14. Parameter estimation of multivariate multiple regression model using bayesian with non-informative Jeffreys’ prior distribution

    Science.gov (United States)

    Saputro, D. R. S.; Amalia, F.; Widyaningsih, P.; Affan, R. C.

    2018-05-01

    Bayesian method is a method that can be used to estimate the parameters of multivariate multiple regression model. Bayesian method has two distributions, there are prior and posterior distributions. Posterior distribution is influenced by the selection of prior distribution. Jeffreys’ prior distribution is a kind of Non-informative prior distribution. This prior is used when the information about parameter not available. Non-informative Jeffreys’ prior distribution is combined with the sample information resulting the posterior distribution. Posterior distribution is used to estimate the parameter. The purposes of this research is to estimate the parameters of multivariate regression model using Bayesian method with Non-informative Jeffreys’ prior distribution. Based on the results and discussion, parameter estimation of β and Σ which were obtained from expected value of random variable of marginal posterior distribution function. The marginal posterior distributions for β and Σ are multivariate normal and inverse Wishart. However, in calculation of the expected value involving integral of a function which difficult to determine the value. Therefore, approach is needed by generating of random samples according to the posterior distribution characteristics of each parameter using Markov chain Monte Carlo (MCMC) Gibbs sampling algorithm.

  15. Correcting for multivariate measurement error by regression calibration in meta-analyses of epidemiological studies

    DEFF Research Database (Denmark)

    Tybjærg-Hansen, Anne

    2009-01-01

    Within-person variability in measured values of multiple risk factors can bias their associations with disease. The multivariate regression calibration (RC) approach can correct for such measurement error and has been applied to studies in which true values or independent repeat measurements...... of the risk factors are observed on a subsample. We extend the multivariate RC techniques to a meta-analysis framework where multiple studies provide independent repeat measurements and information on disease outcome. We consider the cases where some or all studies have repeat measurements, and compare study......-specific, averaged and empirical Bayes estimates of RC parameters. Additionally, we allow for binary covariates (e.g. smoking status) and for uncertainty and time trends in the measurement error corrections. Our methods are illustrated using a subset of individual participant data from prospective long-term studies...

  16. PM10 modeling in the Oviedo urban area (Northern Spain) by using multivariate adaptive regression splines

    Science.gov (United States)

    Nieto, Paulino José García; Antón, Juan Carlos Álvarez; Vilán, José Antonio Vilán; García-Gonzalo, Esperanza

    2014-10-01

    The aim of this research work is to build a regression model of the particulate matter up to 10 micrometers in size (PM10) by using the multivariate adaptive regression splines (MARS) technique in the Oviedo urban area (Northern Spain) at local scale. This research work explores the use of a nonparametric regression algorithm known as multivariate adaptive regression splines (MARS) which has the ability to approximate the relationship between the inputs and outputs, and express the relationship mathematically. In this sense, hazardous air pollutants or toxic air contaminants refer to any substance that may cause or contribute to an increase in mortality or serious illness, or that may pose a present or potential hazard to human health. To accomplish the objective of this study, the experimental dataset of nitrogen oxides (NOx), carbon monoxide (CO), sulfur dioxide (SO2), ozone (O3) and dust (PM10) were collected over 3 years (2006-2008) and they are used to create a highly nonlinear model of the PM10 in the Oviedo urban nucleus (Northern Spain) based on the MARS technique. One main objective of this model is to obtain a preliminary estimate of the dependence between PM10 pollutant in the Oviedo urban area at local scale. A second aim is to determine the factors with the greatest bearing on air quality with a view to proposing health and lifestyle improvements. The United States National Ambient Air Quality Standards (NAAQS) establishes the limit values of the main pollutants in the atmosphere in order to ensure the health of healthy people. Firstly, this MARS regression model captures the main perception of statistical learning theory in order to obtain a good prediction of the dependence among the main pollutants in the Oviedo urban area. Secondly, the main advantages of MARS are its capacity to produce simple, easy-to-interpret models, its ability to estimate the contributions of the input variables, and its computational efficiency. Finally, on the basis of

  17. Latent Variable Regression 4-Level Hierarchical Model Using Multisite Multiple-Cohorts Longitudinal Data. CRESST Report 801

    Science.gov (United States)

    Choi, Kilchan

    2011-01-01

    This report explores a new latent variable regression 4-level hierarchical model for monitoring school performance over time using multisite multiple-cohorts longitudinal data. This kind of data set has a 4-level hierarchical structure: time-series observation nested within students who are nested within different cohorts of students. These…

  18. Inference for multivariate regression model based on multiply imputed synthetic data generated via posterior predictive sampling

    Science.gov (United States)

    Moura, Ricardo; Sinha, Bimal; Coelho, Carlos A.

    2017-06-01

    The recent popularity of the use of synthetic data as a Statistical Disclosure Control technique has enabled the development of several methods of generating and analyzing such data, but almost always relying in asymptotic distributions and in consequence being not adequate for small sample datasets. Thus, a likelihood-based exact inference procedure is derived for the matrix of regression coefficients of the multivariate regression model, for multiply imputed synthetic data generated via Posterior Predictive Sampling. Since it is based in exact distributions this procedure may even be used in small sample datasets. Simulation studies compare the results obtained from the proposed exact inferential procedure with the results obtained from an adaptation of Reiters combination rule to multiply imputed synthetic datasets and an application to the 2000 Current Population Survey is discussed.

  19. Structural brain connectivity and cognitive ability differences: A multivariate distance matrix regression analysis.

    Science.gov (United States)

    Ponsoda, Vicente; Martínez, Kenia; Pineda-Pardo, José A; Abad, Francisco J; Olea, Julio; Román, Francisco J; Barbey, Aron K; Colom, Roberto

    2017-02-01

    Neuroimaging research involves analyses of huge amounts of biological data that might or might not be related with cognition. This relationship is usually approached using univariate methods, and, therefore, correction methods are mandatory for reducing false positives. Nevertheless, the probability of false negatives is also increased. Multivariate frameworks have been proposed for helping to alleviate this balance. Here we apply multivariate distance matrix regression for the simultaneous analysis of biological and cognitive data, namely, structural connections among 82 brain regions and several latent factors estimating cognitive performance. We tested whether cognitive differences predict distances among individuals regarding their connectivity pattern. Beginning with 3,321 connections among regions, the 36 edges better predicted by the individuals' cognitive scores were selected. Cognitive scores were related to connectivity distances in both the full (3,321) and reduced (36) connectivity patterns. The selected edges connect regions distributed across the entire brain and the network defined by these edges supports high-order cognitive processes such as (a) (fluid) executive control, (b) (crystallized) recognition, learning, and language processing, and (c) visuospatial processing. This multivariate study suggests that one widespread, but limited number, of regions in the human brain, supports high-level cognitive ability differences. Hum Brain Mapp 38:803-816, 2017. © 2016 Wiley Periodicals, Inc. © 2016 Wiley Periodicals, Inc.

  20. Analysis of Multivariate Experimental Data Using A Simplified Regression Model Search Algorithm

    Science.gov (United States)

    Ulbrich, Norbert Manfred

    2013-01-01

    A new regression model search algorithm was developed in 2011 that may be used to analyze both general multivariate experimental data sets and wind tunnel strain-gage balance calibration data. The new algorithm is a simplified version of a more complex search algorithm that was originally developed at the NASA Ames Balance Calibration Laboratory. The new algorithm has the advantage that it needs only about one tenth of the original algorithm's CPU time for the completion of a search. In addition, extensive testing showed that the prediction accuracy of math models obtained from the simplified algorithm is similar to the prediction accuracy of math models obtained from the original algorithm. The simplified algorithm, however, cannot guarantee that search constraints related to a set of statistical quality requirements are always satisfied in the optimized regression models. Therefore, the simplified search algorithm is not intended to replace the original search algorithm. Instead, it may be used to generate an alternate optimized regression model of experimental data whenever the application of the original search algorithm either fails or requires too much CPU time. Data from a machine calibration of NASA's MK40 force balance is used to illustrate the application of the new regression model search algorithm.

  1. Multivariate linear regression of high-dimensional fMRI data with multiple target variables.

    Science.gov (United States)

    Valente, Giancarlo; Castellanos, Agustin Lage; Vanacore, Gianluca; Formisano, Elia

    2014-05-01

    Multivariate regression is increasingly used to study the relation between fMRI spatial activation patterns and experimental stimuli or behavioral ratings. With linear models, informative brain locations are identified by mapping the model coefficients. This is a central aspect in neuroimaging, as it provides the sought-after link between the activity of neuronal populations and subject's perception, cognition or behavior. Here, we show that mapping of informative brain locations using multivariate linear regression (MLR) may lead to incorrect conclusions and interpretations. MLR algorithms for high dimensional data are designed to deal with targets (stimuli or behavioral ratings, in fMRI) separately, and the predictive map of a model integrates information deriving from both neural activity patterns and experimental design. Not accounting explicitly for the presence of other targets whose associated activity spatially overlaps with the one of interest may lead to predictive maps of troublesome interpretation. We propose a new model that can correctly identify the spatial patterns associated with a target while achieving good generalization. For each target, the training is based on an augmented dataset, which includes all remaining targets. The estimation on such datasets produces both maps and interaction coefficients, which are then used to generalize. The proposed formulation is independent of the regression algorithm employed. We validate this model on simulated fMRI data and on a publicly available dataset. Results indicate that our method achieves high spatial sensitivity and good generalization and that it helps disentangle specific neural effects from interaction with predictive maps associated with other targets. Copyright © 2013 Wiley Periodicals, Inc.

  2. Principal Covariates Clusterwise Regression (PCCR): Accounting for Multicollinearity and Population Heterogeneity in Hierarchically Organized Data.

    Science.gov (United States)

    Wilderjans, Tom Frans; Vande Gaer, Eva; Kiers, Henk A L; Van Mechelen, Iven; Ceulemans, Eva

    2017-03-01

    In the behavioral sciences, many research questions pertain to a regression problem in that one wants to predict a criterion on the basis of a number of predictors. Although in many cases, ordinary least squares regression will suffice, sometimes the prediction problem is more challenging, for three reasons: first, multiple highly collinear predictors can be available, making it difficult to grasp their mutual relations as well as their relations to the criterion. In that case, it may be very useful to reduce the predictors to a few summary variables, on which one regresses the criterion and which at the same time yields insight into the predictor structure. Second, the population under study may consist of a few unknown subgroups that are characterized by different regression models. Third, the obtained data are often hierarchically structured, with for instance, observations being nested into persons or participants within groups or countries. Although some methods have been developed that partially meet these challenges (i.e., principal covariates regression (PCovR), clusterwise regression (CR), and structural equation models), none of these methods adequately deals with all of them simultaneously. To fill this gap, we propose the principal covariates clusterwise regression (PCCR) method, which combines the key idea's behind PCovR (de Jong & Kiers in Chemom Intell Lab Syst 14(1-3):155-164, 1992) and CR (Späth in Computing 22(4):367-373, 1979). The PCCR method is validated by means of a simulation study and by applying it to cross-cultural data regarding satisfaction with life.

  3. Multivariate Multiple Regression Models for a Big Data-Empowered SON Framework in Mobile Wireless Networks

    Directory of Open Access Journals (Sweden)

    Yoonsu Shin

    2016-01-01

    Full Text Available In the 5G era, the operational cost of mobile wireless networks will significantly increase. Further, massive network capacity and zero latency will be needed because everything will be connected to mobile networks. Thus, self-organizing networks (SON are needed, which expedite automatic operation of mobile wireless networks, but have challenges to satisfy the 5G requirements. Therefore, researchers have proposed a framework to empower SON using big data. The recent framework of a big data-empowered SON analyzes the relationship between key performance indicators (KPIs and related network parameters (NPs using machine-learning tools, and it develops regression models using a Gaussian process with those parameters. The problem, however, is that the methods of finding the NPs related to the KPIs differ individually. Moreover, the Gaussian process regression model cannot determine the relationship between a KPI and its various related NPs. In this paper, to solve these problems, we proposed multivariate multiple regression models to determine the relationship between various KPIs and NPs. If we assume one KPI and multiple NPs as one set, the proposed models help us process multiple sets at one time. Also, we can find out whether some KPIs are conflicting or not. We implement the proposed models using MapReduce.

  4. Non-proportional odds multivariate logistic regression of ordinal family data.

    Science.gov (United States)

    Zaloumis, Sophie G; Scurrah, Katrina J; Harrap, Stephen B; Ellis, Justine A; Gurrin, Lyle C

    2015-03-01

    Methods to examine whether genetic and/or environmental sources can account for the residual variation in ordinal family data usually assume proportional odds. However, standard software to fit the non-proportional odds model to ordinal family data is limited because the correlation structure of family data is more complex than for other types of clustered data. To perform these analyses we propose the non-proportional odds multivariate logistic regression model and take a simulation-based approach to model fitting using Markov chain Monte Carlo methods, such as partially collapsed Gibbs sampling and the Metropolis algorithm. We applied the proposed methodology to male pattern baldness data from the Victorian Family Heart Study. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  5. On the degrees of freedom of reduced-rank estimators in multivariate regression.

    Science.gov (United States)

    Mukherjee, A; Chen, K; Wang, N; Zhu, J

    We study the effective degrees of freedom of a general class of reduced-rank estimators for multivariate regression in the framework of Stein's unbiased risk estimation. A finite-sample exact unbiased estimator is derived that admits a closed-form expression in terms of the thresholded singular values of the least-squares solution and hence is readily computable. The results continue to hold in the high-dimensional setting where both the predictor and the response dimensions may be larger than the sample size. The derived analytical form facilitates the investigation of theoretical properties and provides new insights into the empirical behaviour of the degrees of freedom. In particular, we examine the differences and connections between the proposed estimator and a commonly-used naive estimator. The use of the proposed estimator leads to efficient and accurate prediction risk estimation and model selection, as demonstrated by simulation studies and a data example.

  6. Sparse multivariate factor analysis regression models and its applications to integrative genomics analysis.

    Science.gov (United States)

    Zhou, Yan; Wang, Pei; Wang, Xianlong; Zhu, Ji; Song, Peter X-K

    2017-01-01

    The multivariate regression model is a useful tool to explore complex associations between two kinds of molecular markers, which enables the understanding of the biological pathways underlying disease etiology. For a set of correlated response variables, accounting for such dependency can increase statistical power. Motivated by integrative genomic data analyses, we propose a new methodology-sparse multivariate factor analysis regression model (smFARM), in which correlations of response variables are assumed to follow a factor analysis model with latent factors. This proposed method not only allows us to address the challenge that the number of association parameters is larger than the sample size, but also to adjust for unobserved genetic and/or nongenetic factors that potentially conceal the underlying response-predictor associations. The proposed smFARM is implemented by the EM algorithm and the blockwise coordinate descent algorithm. The proposed methodology is evaluated and compared to the existing methods through extensive simulation studies. Our results show that accounting for latent factors through the proposed smFARM can improve sensitivity of signal detection and accuracy of sparse association map estimation. We illustrate smFARM by two integrative genomics analysis examples, a breast cancer dataset, and an ovarian cancer dataset, to assess the relationship between DNA copy numbers and gene expression arrays to understand genetic regulatory patterns relevant to the disease. We identify two trans-hub regions: one in cytoband 17q12 whose amplification influences the RNA expression levels of important breast cancer genes, and the other in cytoband 9q21.32-33, which is associated with chemoresistance in ovarian cancer. © 2016 WILEY PERIODICALS, INC.

  7. Creating Hierarchical Pores by Controlled Linker Thermolysis in Multivariate Metal-Organic Frameworks

    KAUST Repository

    Feng, Liang; Yuan, Shuai; Zhang, Liang-Liang; Tan, Kui; Li, Jia-Luo; Kirchon, Angelo; Liu, Ling-Mei; Zhang, Peng; Han, Yu; Chabal, Yves J.; Zhou, Hong-Cai

    2018-01-01

    strate-gy, linker thermolysis, to construct ultra-stable hierarchically porous metal−organic frameworks (HP-MOFs) with tunable pore size distribution. Linker instability, usually an undesirable trait of MOFs, was exploited to create mesopores

  8. Endpoint in plasma etch process using new modified w-multivariate charts and windowed regression

    Science.gov (United States)

    Zakour, Sihem Ben; Taleb, Hassen

    2017-09-01

    Endpoint detection is very important undertaking on the side of getting a good understanding and figuring out if a plasma etching process is done in the right way, especially if the etched area is very small (0.1%). It truly is a crucial part of supplying repeatable effects in every single wafer. When the film being etched has been completely cleared, the endpoint is reached. To ensure the desired device performance on the produced integrated circuit, the high optical emission spectroscopy (OES) sensor is employed. The huge number of gathered wavelengths (profiles) is then analyzed and pre-processed using a new proposed simple algorithm named Spectra peak selection (SPS) to select the important wavelengths, then we employ wavelet analysis (WA) to enhance the performance of detection by suppressing noise and redundant information. The selected and treated OES wavelengths are then used in modified multivariate control charts (MEWMA and Hotelling) for three statistics (mean, SD and CV) and windowed polynomial regression for mean. The employ of three aforementioned statistics is motivated by controlling mean shift, variance shift and their ratio (CV) if both mean and SD are not stable. The control charts show their performance in detecting endpoint especially W-mean Hotelling chart and the worst result is given by CV statistic. As the best detection of endpoint is given by the W-Hotelling mean statistic, this statistic will be used to construct a windowed wavelet Hotelling polynomial regression. This latter can only identify the window containing endpoint phenomenon.

  9. Risk factors for pedicled flap necrosis in hand soft tissue reconstruction: a multivariate logistic regression analysis.

    Science.gov (United States)

    Gong, Xu; Cui, Jianli; Jiang, Ziping; Lu, Laijin; Li, Xiucun

    2018-03-01

    Few clinical retrospective studies have reported the risk factors of pedicled flap necrosis in hand soft tissue reconstruction. The aim of this study was to identify non-technical risk factors associated with pedicled flap perioperative necrosis in hand soft tissue reconstruction via a multivariate logistic regression analysis. For patients with hand soft tissue reconstruction, we carefully reviewed hospital records and identified 163 patients who met the inclusion criteria. The characteristics of these patients, flap transfer procedures and postoperative complications were recorded. Eleven predictors were identified. The correlations between pedicled flap necrosis and risk factors were analysed using a logistic regression model. Of 163 skin flaps, 125 flaps survived completely without any complications. The pedicled flap necrosis rate in hands was 11.04%, which included partial flap necrosis (7.36%) and total flap necrosis (3.68%). Soft tissue defects in fingers were noted in 68.10% of all cases. The logistic regression analysis indicated that the soft tissue defect site (P = 0.046, odds ratio (OR) = 0.079, confidence interval (CI) (0.006, 0.959)), flap size (P = 0.020, OR = 1.024, CI (1.004, 1.045)) and postoperative wound infection (P < 0.001, OR = 17.407, CI (3.821, 79.303)) were statistically significant risk factors for pedicled flap necrosis of the hand. Soft tissue defect site, flap size and postoperative wound infection were risk factors associated with pedicled flap necrosis in hand soft tissue defect reconstruction. © 2017 Royal Australasian College of Surgeons.

  10. Multivariate regression analysis for determining short-term values of radon and its decay products from filter measurements

    International Nuclear Information System (INIS)

    Kraut, W.; Schwarz, W.; Wilhelm, A.

    1994-01-01

    A multivariate regression analysis is applied to decay measurements of α-resp. β-filter activcity. Activity concentrations for Po-218, Pb-214 and Bi-214, resp. for the Rn-222 equilibrium equivalent concentration are obtained explicitly. The regression analysis takes into account properly the variances of the measured count rates and their influence on the resulting activity concentrations. (orig.) [de

  11. Regional trends in short-duration precipitation extremes: a flexible multivariate monotone quantile regression approach

    Science.gov (United States)

    Cannon, Alex

    2017-04-01

    univariate technique, and cannot incorporate information from additional covariates, for example ENSO state or physiographic controls on extreme rainfall within a region. Here, the univariate MQR model is extended to allow the use of multiple covariates. Multivariate monotone quantile regression (MMQR) is based on a single hidden-layer feedforward network with the quantile regression error function and partial monotonicity constraints. The MMQR model is demonstrated via Monte Carlo simulations and the estimation and visualization of regional trends in moderate rainfall extremes based on homogenized sub-daily precipitation data at stations in Canada.

  12. Simultaneous chemometric determination of pyridoxine hydrochloride and isoniazid in tablets by multivariate regression methods.

    Science.gov (United States)

    Dinç, Erdal; Ustündağ, Ozgür; Baleanu, Dumitru

    2010-08-01

    The sole use of pyridoxine hydrochloride during treatment of tuberculosis gives rise to pyridoxine deficiency. Therefore, a combination of pyridoxine hydrochloride and isoniazid is used in pharmaceutical dosage form in tuberculosis treatment to reduce this side effect. In this study, two chemometric methods, partial least squares (PLS) and principal component regression (PCR), were applied to the simultaneous determination of pyridoxine (PYR) and isoniazid (ISO) in their tablets. A concentration training set comprising binary mixtures of PYR and ISO consisting of 20 different combinations were randomly prepared in 0.1 M HCl. Both multivariate calibration models were constructed using the relationships between the concentration data set (concentration data matrix) and absorbance data matrix in the spectral region 200-330 nm. The accuracy and the precision of the proposed chemometric methods were validated by analyzing synthetic mixtures containing the investigated drugs. The recovery results obtained by applying PCR and PLS calibrations to the artificial mixtures were found between 100.0 and 100.7%. Satisfactory results obtained by applying the PLS and PCR methods to both artificial and commercial samples were obtained. The results obtained in this manuscript strongly encourage us to use them for the quality control and the routine analysis of the marketing tablets containing PYR and ISO drugs. Copyright © 2010 John Wiley & Sons, Ltd.

  13. Selecting minimum dataset soil variables using PLSR as a regressive multivariate method

    Science.gov (United States)

    Stellacci, Anna Maria; Armenise, Elena; Castellini, Mirko; Rossi, Roberta; Vitti, Carolina; Leogrande, Rita; De Benedetto, Daniela; Ferrara, Rossana M.; Vivaldi, Gaetano A.

    2017-04-01

    Long-term field experiments and science-based tools that characterize soil status (namely the soil quality indices, SQIs) assume a strategic role in assessing the effect of agronomic techniques and thus in improving soil management especially in marginal environments. Selecting key soil variables able to best represent soil status is a critical step for the calculation of SQIs. Current studies show the effectiveness of statistical methods for variable selection to extract relevant information deriving from multivariate datasets. Principal component analysis (PCA) has been mainly used, however supervised multivariate methods and regressive techniques are progressively being evaluated (Armenise et al., 2013; de Paul Obade et al., 2016; Pulido Moncada et al., 2014). The present study explores the effectiveness of partial least square regression (PLSR) in selecting critical soil variables, using a dataset comparing conventional tillage and sod-seeding on durum wheat. The results were compared to those obtained using PCA and stepwise discriminant analysis (SDA). The soil data derived from a long-term field experiment in Southern Italy. On samples collected in April 2015, the following set of variables was quantified: (i) chemical: total organic carbon and nitrogen (TOC and TN), alkali-extractable C (TEC and humic substances - HA-FA), water extractable N and organic C (WEN and WEOC), Olsen extractable P, exchangeable cations, pH and EC; (ii) physical: texture, dry bulk density (BD), macroporosity (Pmac), air capacity (AC), and relative field capacity (RFC); (iii) biological: carbon of the microbial biomass quantified with the fumigation-extraction method. PCA and SDA were previously applied to the multivariate dataset (Stellacci et al., 2016). PLSR was carried out on mean centered and variance scaled data of predictors (soil variables) and response (wheat yield) variables using the PLS procedure of SAS/STAT. In addition, variable importance for projection (VIP

  14. Reporting quality of multivariable logistic regression in selected Indian medical journals.

    Science.gov (United States)

    Kumar, R; Indrayan, A; Chhabra, P

    2012-01-01

    Use of multivariable logistic regression (MLR) modeling has steeply increased in the medical literature over the past few years. Testing of model assumptions and adequate reporting of MLR allow the reader to interpret results more accurately. To review the fulfillment of assumptions and reporting quality of MLR in selected Indian medical journals using established criteria. Analysis of published literature. Medknow.com publishes 68 Indian medical journals with open access. Eight of these journals had at least five articles using MLR between the years 1994 to 2008. Articles from each of these journals were evaluated according to the previously established 10-point quality criteria for reporting and to test the MLR model assumptions. SPSS 17 software and non-parametric test (Kruskal-Wallis H, Mann Whitney U, Spearman Correlation). One hundred and nine articles were finally found using MLR for analyzing the data in the selected eight journals. The number of such articles gradually increased after year 2003, but quality score remained almost similar over time. P value, odds ratio, and 95% confidence interval for coefficients in MLR was reported in 75.2% and sufficient cases (>10) per covariate of limiting sample size were reported in the 58.7% of the articles. No article reported the test for conformity of linear gradient for continuous covariates. Total score was not significantly different across the journals. However, involvement of statistician or epidemiologist as a co-author improved the average quality score significantly (P=0.014). Reporting of MLR in many Indian journals is incomplete. Only one article managed to score 8 out of 10 among 109 articles under review. All others scored less. Appropriate guidelines in instructions to authors, and pre-publication review of articles using MLR by a qualified statistician may improve quality of reporting.

  15. Using Multivariate Adaptive Regression Spline and Artificial Neural Network to Simulate Urbanization in Mumbai, India

    Science.gov (United States)

    Ahmadlou, M.; Delavar, M. R.; Tayyebi, A.; Shafizadeh-Moghadam, H.

    2015-12-01

    Land use change (LUC) models used for modelling urban growth are different in structure and performance. Local models divide the data into separate subsets and fit distinct models on each of the subsets. Non-parametric models are data driven and usually do not have a fixed model structure or model structure is unknown before the modelling process. On the other hand, global models perform modelling using all the available data. In addition, parametric models have a fixed structure before the modelling process and they are model driven. Since few studies have compared local non-parametric models with global parametric models, this study compares a local non-parametric model called multivariate adaptive regression spline (MARS), and a global parametric model called artificial neural network (ANN) to simulate urbanization in Mumbai, India. Both models determine the relationship between a dependent variable and multiple independent variables. We used receiver operating characteristic (ROC) to compare the power of the both models for simulating urbanization. Landsat images of 1991 (TM) and 2010 (ETM+) were used for modelling the urbanization process. The drivers considered for urbanization in this area were distance to urban areas, urban density, distance to roads, distance to water, distance to forest, distance to railway, distance to central business district, number of agricultural cells in a 7 by 7 neighbourhoods, and slope in 1991. The results showed that the area under the ROC curve for MARS and ANN was 94.77% and 95.36%, respectively. Thus, ANN performed slightly better than MARS to simulate urban areas in Mumbai, India.

  16. Hierarchical cluster-based partial least squares regression (HC-PLSR) is an efficient tool for metamodelling of nonlinear dynamic models.

    Science.gov (United States)

    Tøndel, Kristin; Indahl, Ulf G; Gjuvsland, Arne B; Vik, Jon Olav; Hunter, Peter; Omholt, Stig W; Martens, Harald

    2011-06-01

    Deterministic dynamic models of complex biological systems contain a large number of parameters and state variables, related through nonlinear differential equations with various types of feedback. A metamodel of such a dynamic model is a statistical approximation model that maps variation in parameters and initial conditions (inputs) to variation in features of the trajectories of the state variables (outputs) throughout the entire biologically relevant input space. A sufficiently accurate mapping can be exploited both instrumentally and epistemically. Multivariate regression methodology is a commonly used approach for emulating dynamic models. However, when the input-output relations are highly nonlinear or non-monotone, a standard linear regression approach is prone to give suboptimal results. We therefore hypothesised that a more accurate mapping can be obtained by locally linear or locally polynomial regression. We present here a new method for local regression modelling, Hierarchical Cluster-based PLS regression (HC-PLSR), where fuzzy C-means clustering is used to separate the data set into parts according to the structure of the response surface. We compare the metamodelling performance of HC-PLSR with polynomial partial least squares regression (PLSR) and ordinary least squares (OLS) regression on various systems: six different gene regulatory network models with various types of feedback, a deterministic mathematical model of the mammalian circadian clock and a model of the mouse ventricular myocyte function. Our results indicate that multivariate regression is well suited for emulating dynamic models in systems biology. The hierarchical approach turned out to be superior to both polynomial PLSR and OLS regression in all three test cases. The advantage, in terms of explained variance and prediction accuracy, was largest in systems with highly nonlinear functional relationships and in systems with positive feedback loops. HC-PLSR is a promising approach for

  17. Hierarchical Cluster-based Partial Least Squares Regression (HC-PLSR is an efficient tool for metamodelling of nonlinear dynamic models

    Directory of Open Access Journals (Sweden)

    Omholt Stig W

    2011-06-01

    Full Text Available Abstract Background Deterministic dynamic models of complex biological systems contain a large number of parameters and state variables, related through nonlinear differential equations with various types of feedback. A metamodel of such a dynamic model is a statistical approximation model that maps variation in parameters and initial conditions (inputs to variation in features of the trajectories of the state variables (outputs throughout the entire biologically relevant input space. A sufficiently accurate mapping can be exploited both instrumentally and epistemically. Multivariate regression methodology is a commonly used approach for emulating dynamic models. However, when the input-output relations are highly nonlinear or non-monotone, a standard linear regression approach is prone to give suboptimal results. We therefore hypothesised that a more accurate mapping can be obtained by locally linear or locally polynomial regression. We present here a new method for local regression modelling, Hierarchical Cluster-based PLS regression (HC-PLSR, where fuzzy C-means clustering is used to separate the data set into parts according to the structure of the response surface. We compare the metamodelling performance of HC-PLSR with polynomial partial least squares regression (PLSR and ordinary least squares (OLS regression on various systems: six different gene regulatory network models with various types of feedback, a deterministic mathematical model of the mammalian circadian clock and a model of the mouse ventricular myocyte function. Results Our results indicate that multivariate regression is well suited for emulating dynamic models in systems biology. The hierarchical approach turned out to be superior to both polynomial PLSR and OLS regression in all three test cases. The advantage, in terms of explained variance and prediction accuracy, was largest in systems with highly nonlinear functional relationships and in systems with positive feedback

  18. TYPE Ia SUPERNOVA COLORS AND EJECTA VELOCITIES: HIERARCHICAL BAYESIAN REGRESSION WITH NON-GAUSSIAN DISTRIBUTIONS

    International Nuclear Information System (INIS)

    Mandel, Kaisey S.; Kirshner, Robert P.; Foley, Ryan J.

    2014-01-01

    We investigate the statistical dependence of the peak intrinsic colors of Type Ia supernovae (SNe Ia) on their expansion velocities at maximum light, measured from the Si II λ6355 spectral feature. We construct a new hierarchical Bayesian regression model, accounting for the random effects of intrinsic scatter, measurement error, and reddening by host galaxy dust, and implement a Gibbs sampler and deviance information criteria to estimate the correlation. The method is applied to the apparent colors from BVRI light curves and Si II velocity data for 79 nearby SNe Ia. The apparent color distributions of high-velocity (HV) and normal velocity (NV) supernovae exhibit significant discrepancies for B – V and B – R, but not other colors. Hence, they are likely due to intrinsic color differences originating in the B band, rather than dust reddening. The mean intrinsic B – V and B – R color differences between HV and NV groups are 0.06 ± 0.02 and 0.09 ± 0.02 mag, respectively. A linear model finds significant slopes of –0.021 ± 0.006 and –0.030 ± 0.009 mag (10 3 km s –1 ) –1 for intrinsic B – V and B – R colors versus velocity, respectively. Because the ejecta velocity distribution is skewed toward high velocities, these effects imply non-Gaussian intrinsic color distributions with skewness up to +0.3. Accounting for the intrinsic-color-velocity correlation results in corrections to A V extinction estimates as large as –0.12 mag for HV SNe Ia and +0.06 mag for NV events. Velocity measurements from SN Ia spectra have the potential to diminish systematic errors from the confounding of intrinsic colors and dust reddening affecting supernova distances

  19. A Hierarchical Multivariate Bayesian Approach to Ensemble Model output Statistics in Atmospheric Prediction

    Science.gov (United States)

    2017-09-01

    application of statistical inference. Even when human forecasters leverage their professional experience, which is often gained through long periods of... application throughout statistics and Bayesian data analysis. The multivariate form of 2( , )  (e.g., Figure 12) is similarly analytically...data (i.e., no systematic manipulations with analytical functions), it is common in the statistical literature to apply mathematical transformations

  20. Hierarchical Hidden Markov Models for Multivariate Integer-Valued Time-Series

    DEFF Research Database (Denmark)

    Catania, Leopoldo; Di Mari, Roberto

    2018-01-01

    We propose a new flexible dynamic model for multivariate nonnegative integer-valued time-series. Observations are assumed to depend on the realization of two additional unobserved integer-valued stochastic variables which control for the time-and cross-dependence of the data. An Expectation......-Maximization algorithm for maximum likelihood estimation of the model's parameters is derived. We provide conditional and unconditional (cross)-moments implied by the model, as well as the limiting distribution of the series. A Monte Carlo experiment investigates the finite sample properties of our estimation...

  1. The PIT-trap-A "model-free" bootstrap procedure for inference about regression models with discrete, multivariate responses.

    Science.gov (United States)

    Warton, David I; Thibaut, Loïc; Wang, Yi Alice

    2017-01-01

    Bootstrap methods are widely used in statistics, and bootstrapping of residuals can be especially useful in the regression context. However, difficulties are encountered extending residual resampling to regression settings where residuals are not identically distributed (thus not amenable to bootstrapping)-common examples including logistic or Poisson regression and generalizations to handle clustered or multivariate data, such as generalised estimating equations. We propose a bootstrap method based on probability integral transform (PIT-) residuals, which we call the PIT-trap, which assumes data come from some marginal distribution F of known parametric form. This method can be understood as a type of "model-free bootstrap", adapted to the problem of discrete and highly multivariate data. PIT-residuals have the key property that they are (asymptotically) pivotal. The PIT-trap thus inherits the key property, not afforded by any other residual resampling approach, that the marginal distribution of data can be preserved under PIT-trapping. This in turn enables the derivation of some standard bootstrap properties, including second-order correctness of pivotal PIT-trap test statistics. In multivariate data, bootstrapping rows of PIT-residuals affords the property that it preserves correlation in data without the need for it to be modelled, a key point of difference as compared to a parametric bootstrap. The proposed method is illustrated on an example involving multivariate abundance data in ecology, and demonstrated via simulation to have improved properties as compared to competing resampling methods.

  2. Extending multivariate distance matrix regression with an effect size measure and the asymptotic null distribution of the test statistic.

    Science.gov (United States)

    McArtor, Daniel B; Lubke, Gitta H; Bergeman, C S

    2017-12-01

    Person-centered methods are useful for studying individual differences in terms of (dis)similarities between response profiles on multivariate outcomes. Multivariate distance matrix regression (MDMR) tests the significance of associations of response profile (dis)similarities and a set of predictors using permutation tests. This paper extends MDMR by deriving and empirically validating the asymptotic null distribution of its test statistic, and by proposing an effect size for individual outcome variables, which is shown to recover true associations. These extensions alleviate the computational burden of permutation tests currently used in MDMR and render more informative results, thus making MDMR accessible to new research domains.

  3. Production optimisation in the petrochemical industry by hierarchical multivariate modelling. Phase 2: On-line implementation

    Energy Technology Data Exchange (ETDEWEB)

    Nilsson, Aasa; Persson, Fredrik; Andersson, Magnus

    2009-07-15

    IVL, together with Emerson Process Management, has developed a decision support system (DSS) based on multivariate statistical process models. The system was implemented at Nynas AB's refinery in order to provide real-time TBP curves and to enable the operator to optimise the process with regards to product quality and energy consumption. The project resulted in the following proven benefits at the industrial reference site, Nynas Refinery in Gothenburg: - Increased yield with up to 14 % (relative terms) for the most valuable product - Decreased energy consumption of 8 %. Validation of model predictions compared to the laboratory analysis showed that the prediction error lay within 1 deg C throughout the whole test period

  4. Assessment of Genetic Heterogeneity in Structured Plant Populations Using Multivariate Whole-Genome Regression Models.

    Science.gov (United States)

    Lehermeier, Christina; Schön, Chris-Carolin; de Los Campos, Gustavo

    2015-09-01

    Plant breeding populations exhibit varying levels of structure and admixture; these features are likely to induce heterogeneity of marker effects across subpopulations. Traditionally, structure has been dealt with as a potential confounder, and various methods exist to "correct" for population stratification. However, these methods induce a mean correction that does not account for heterogeneity of marker effects. The animal breeding literature offers a few recent studies that consider modeling genetic heterogeneity in multibreed data, using multivariate models. However, these methods have received little attention in plant breeding where population structure can have different forms. In this article we address the problem of analyzing data from heterogeneous plant breeding populations, using three approaches: (a) a model that ignores population structure [A-genome-based best linear unbiased prediction (A-GBLUP)], (b) a stratified (i.e., within-group) analysis (W-GBLUP), and (c) a multivariate approach that uses multigroup data and accounts for heterogeneity (MG-GBLUP). The performance of the three models was assessed on three different data sets: a diversity panel of rice (Oryza sativa), a maize (Zea mays L.) half-sib panel, and a wheat (Triticum aestivum L.) data set that originated from plant breeding programs. The estimated genomic correlations between subpopulations varied from null to moderate, depending on the genetic distance between subpopulations and traits. Our assessment of prediction accuracy features cases where ignoring population structure leads to a parsimonious more powerful model as well as others where the multivariate and stratified approaches have higher predictive power. In general, the multivariate approach appeared slightly more robust than either the A- or the W-GBLUP. Copyright © 2015 by the Genetics Society of America.

  5. Order Selection for General Expression of Nonlinear Autoregressive Model Based on Multivariate Stepwise Regression

    Science.gov (United States)

    Shi, Jinfei; Zhu, Songqing; Chen, Ruwen

    2017-12-01

    An order selection method based on multiple stepwise regressions is proposed for General Expression of Nonlinear Autoregressive model which converts the model order problem into the variable selection of multiple linear regression equation. The partial autocorrelation function is adopted to define the linear term in GNAR model. The result is set as the initial model, and then the nonlinear terms are introduced gradually. Statistics are chosen to study the improvements of both the new introduced and originally existed variables for the model characteristics, which are adopted to determine the model variables to retain or eliminate. So the optimal model is obtained through data fitting effect measurement or significance test. The simulation and classic time-series data experiment results show that the method proposed is simple, reliable and can be applied to practical engineering.

  6. Dental age assessment of young Iranian adults using third molars: A multivariate regression study.

    Science.gov (United States)

    Bagherpour, Ali; Anbiaee, Najmeh; Partovi, Parnia; Golestani, Shayan; Afzalinasab, Shakiba

    2012-10-01

    In recent years, a noticeable increase in forensic age estimations of living individuals has been observed. Radiologic assessment of the mineralisation stage of third molars is of particular importance, with regard to the relevant age group. To attain a referral database and regression equations for dental age estimation of unaccompanied minors in an Iranian population was the goal of this study. Moreover, determination was made concerning the probability of an individual being over the age of 18 in case of full third molar(s) development. Using the scoring system of Gleiser and Hunt, modified by Köhler, an investigation of a cross-sectional sample of 1274 orthopantomograms of 885 females and 389 males aged between 15 and 22 years was carried out. Using kappa statistics, intra-observer reliability was tested. With Spearman correlation coefficient, correlation between the scores of all four wisdom teeth, was evaluated. We also carried out the Wilcoxon signed-rank test on asymmetry and calculated the regression formulae. A strong intra-observer agreement was displayed by the kappa value. No significant difference (p-value for upper and lower jaws were 0.07 and 0.59, respectively) was discovered by Wilcoxon signed-rank test for left and right asymmetry. The developmental stage of upper right and upper left third molars yielded the greatest correlation coefficient. The probability of an individual being over the age of 18 is 95.6% for males and 100.0% for females in case four fully developed third molars are present. Taking into consideration gender, location and number of wisdom teeth, regression formulae were arrived at. Use of population-specific standards is recommended as a means of improving the accuracy of forensic age estimates based on third molars mineralisation. To obtain more exact regression formulae, wider age range studies are recommended. Copyright © 2012 Elsevier Ltd and Faculty of Forensic and Legal Medicine. All rights reserved.

  7. Application of Multivariate Adaptive Regression Splines to Sheet Metal Bending Process for Springback Compensation

    Directory of Open Access Journals (Sweden)

    Dilan Rasim Aşkın

    2016-01-01

    Full Text Available An intelligent regression technique is applied for sheet metal bending processes to improve bending performance. This study is a part of another extensive study, automated sheet bending assistance for press brakes. Data related to material properties of sheet metal is collected in an online manner and fed to an intelligent system for determining the most accurate punch displacement without any offline iteration or calibration. The overall system aims to reduce the production time while increasing the performance of press brakes.

  8. Economic viability in concrete dams by multivariable regression tool for implantation of small hydroelectric plants

    International Nuclear Information System (INIS)

    Lima, Reginaldo Agapito de; Ribeiro Junior, Leopoldo Uberto

    2010-01-01

    For implantation of a SHP, the barrage is the main structure where its sizing represents from 30% - 50% of general cost of civil works. Considering this it is very important to have a fast, didactic and accurate tool for elaborating a budget, also allowing a quantitative analysis of inherent cost for civil building of barrages concrete made for small hydropower plants. In face of this, the multi changing regression tool is very important as it allows a fast and correct establishing of preliminary costs, even approximate, for estimates of barrages in concrete cost, enabling to ease the budget, guiding feasibility decisions for selecting or neglecting new alternatives of fall. (author)

  9. A graphical method to evaluate spectral preprocessing in multivariate regression calibrations: example with Savitzky-Golay filters and partial least squares regression.

    Science.gov (United States)

    Delwiche, Stephen R; Reeves, James B

    2010-01-01

    In multivariate regression analysis of spectroscopy data, spectral preprocessing is often performed to reduce unwanted background information (offsets, sloped baselines) or accentuate absorption features in intrinsically overlapping bands. These procedures, also known as pretreatments, are commonly smoothing operations or derivatives. While such operations are often useful in reducing the number of latent variables of the actual decomposition and lowering residual error, they also run the risk of misleading the practitioner into accepting calibration equations that are poorly adapted to samples outside of the calibration. The current study developed a graphical method to examine this effect on partial least squares (PLS) regression calibrations of near-infrared (NIR) reflection spectra of ground wheat meal with two analytes, protein content and sodium dodecyl sulfate sedimentation (SDS) volume (an indicator of the quantity of the gluten proteins that contribute to strong doughs). These two properties were chosen because of their differing abilities to be modeled by NIR spectroscopy: excellent for protein content, fair for SDS sedimentation volume. To further demonstrate the potential pitfalls of preprocessing, an artificial component, a randomly generated value, was included in PLS regression trials. Savitzky-Golay (digital filter) smoothing, first-derivative, and second-derivative preprocess functions (5 to 25 centrally symmetric convolution points, derived from quadratic polynomials) were applied to PLS calibrations of 1 to 15 factors. The results demonstrated the danger of an over reliance on preprocessing when (1) the number of samples used in a multivariate calibration is low (<50), (2) the spectral response of the analyte is weak, and (3) the goodness of the calibration is based on the coefficient of determination (R(2)) rather than a term based on residual error. The graphical method has application to the evaluation of other preprocess functions and various

  10. Principal Covariates Clusterwise Regression (PCCR) : Accounting for multicollinearity and population heterogeneity in hierarchically organized data.

    NARCIS (Netherlands)

    Wilderjans, Tom F.; Van de Gaer, E.; Kiers, H.A.L.; Van Mechelen, Iven; Ceulemans, Eva

    In the behavioral sciences, many research questions pertain to a regression problem in that one wants to predict a criterion on the basis of a number of predictors. Although in many cases, ordinary least squares regression will suffice, sometimes the prediction problem is more challenging, for three

  11. Determination of boiling point of petrochemicals by gas chromatography-mass spectrometry and multivariate regression analysis of structural activity relationship.

    Science.gov (United States)

    Fakayode, Sayo O; Mitchell, Breanna S; Pollard, David A

    2014-08-01

    Accurate understanding of analyte boiling points (BP) is of critical importance in gas chromatographic (GC) separation and crude oil refinery operation in petrochemical industries. This study reported the first combined use of GC separation and partial-least-square (PLS1) multivariate regression analysis of petrochemical structural activity relationship (SAR) for accurate BP determination of two commercially available (D3710 and MA VHP) calibration gas mix samples. The results of the BP determination using PLS1 multivariate regression were further compared with the results of traditional simulated distillation method of BP determination. The developed PLS1 regression was able to correctly predict analytes BP in D3710 and MA VHP calibration gas mix samples, with a root-mean-square-%-relative-error (RMS%RE) of 6.4%, and 10.8% respectively. In contrast, the overall RMS%RE of 32.9% and 40.4%, respectively obtained for BP determination in D3710 and MA VHP using a traditional simulated distillation method were approximately four times larger than the corresponding RMS%RE of BP prediction using MRA, demonstrating the better predictive ability of MRA. The reported method is rapid, robust, and promising, and can be potentially used routinely for fast analysis, pattern recognition, and analyte BP determination in petrochemical industries. Copyright © 2014 Elsevier B.V. All rights reserved.

  12. Hierarchical probabilistic regionalization of volcanism for Sengan region in Japan using multivariate statistical techniques and geostatistical interpolation techniques

    International Nuclear Information System (INIS)

    Park, Jinyong; Balasingham, P.; McKenna, Sean Andrew; Kulatilake, Pinnaduwa H. S. W.

    2004-01-01

    Sandia National Laboratories, under contract to Nuclear Waste Management Organization of Japan (NUMO), is performing research on regional classification of given sites in Japan with respect to potential volcanic disruption using multivariate statistics and geo-statistical interpolation techniques. This report provides results obtained for hierarchical probabilistic regionalization of volcanism for the Sengan region in Japan by applying multivariate statistical techniques and geostatistical interpolation techniques on the geologic data provided by NUMO. A workshop report produced in September 2003 by Sandia National Laboratories (Arnold et al., 2003) on volcanism lists a set of most important geologic variables as well as some secondary information related to volcanism. Geologic data extracted for the Sengan region in Japan from the data provided by NUMO revealed that data are not available at the same locations for all the important geologic variables. In other words, the geologic variable vectors were found to be incomplete spatially. However, it is necessary to have complete geologic variable vectors to perform multivariate statistical analyses. As a first step towards constructing complete geologic variable vectors, the Universal Transverse Mercator (UTM) zone 54 projected coordinate system and a 1 km square regular grid system were selected. The data available for each geologic variable on a geographic coordinate system were transferred to the aforementioned grid system. Also the recorded data on volcanic activity for Sengan region were produced on the same grid system. Each geologic variable map was compared with the recorded volcanic activity map to determine the geologic variables that are most important for volcanism. In the regionalized classification procedure, this step is known as the variable selection step. The following variables were determined as most important for volcanism: geothermal gradient, groundwater temperature, heat discharge, groundwater

  13. Diagnostic accuracy of atypical p-ANCA in autoimmune hepatitis using ROC- and multivariate regression analysis.

    Science.gov (United States)

    Terjung, B; Bogsch, F; Klein, R; Söhne, J; Reichel, C; Wasmuth, J-C; Beuers, U; Sauerbruch, T; Spengler, U

    2004-09-29

    Antineutrophil cytoplasmic antibodies (atypical p-ANCA) are detected at high prevalence in sera from patients with autoimmune hepatitis (AIH), but their diagnostic relevance for AIH has not been systematically evaluated so far. Here, we studied sera from 357 patients with autoimmune (autoimmune hepatitis n=175, primary sclerosing cholangitis (PSC) n=35, primary biliary cirrhosis n=45), non-autoimmune chronic liver disease (alcoholic liver cirrhosis n=62; chronic hepatitis C virus infection (HCV) n=21) or healthy controls (n=19) for the presence of various non-organ specific autoantibodies. Atypical p-ANCA, antinuclear antibodies (ANA), antibodies against smooth muscles (SMA), antibodies against liver/kidney microsomes (anti-Lkm1) and antimitochondrial antibodies (AMA) were detected by indirect immunofluorescence microscopy, antibodies against the M2 antigen (anti-M2), antibodies against soluble liver antigen (anti-SLA/LP) and anti-Lkm1 by using enzyme linked immunosorbent assays. To define the diagnostic precision of the autoantibodies, results of autoantibody testing were analyzed by receiver operating characteristics (ROC) and forward conditional logistic regression analysis. Atypical p-ANCA were detected at high prevalence in sera from patients with AIH (81%) and PSC (94%). ROC- and logistic regression analysis revealed atypical p-ANCA and SMA, but not ANA as significant diagnostic seromarkers for AIH (atypical p-ANCA: AUC 0.754+/-0.026, odds ratio [OR] 3.4; SMA: 0.652+/-0.028, OR 4.1). Atypical p-ANCA also emerged as the only diagnostically relevant seromarker for PSC (AUC 0.690+/-0.04, OR 3.4). None of the tested antibodies yielded a significant diagnostic accuracy for patients with alcoholic liver cirrhosis, HCV or healthy controls. Atypical p-ANCA along with SMA represent a seromarker with high diagnostic accuracy for AIH and should be explicitly considered in a revised version of the diagnostic score for AIH.

  14. Boosted regression trees, multivariate adaptive regression splines and their two-step combinations with multiple linear regression or partial least squares to predict blood-brain barrier passage: a case study.

    Science.gov (United States)

    Deconinck, E; Zhang, M H; Petitet, F; Dubus, E; Ijjaali, I; Coomans, D; Vander Heyden, Y

    2008-02-18

    The use of some unconventional non-linear modeling techniques, i.e. classification and regression trees and multivariate adaptive regression splines-based methods, was explored to model the blood-brain barrier (BBB) passage of drugs and drug-like molecules. The data set contains BBB passage values for 299 structural and pharmacological diverse drugs, originating from a structured knowledge-based database. Models were built using boosted regression trees (BRT) and multivariate adaptive regression splines (MARS), as well as their respective combinations with stepwise multiple linear regression (MLR) and partial least squares (PLS) regression in two-step approaches. The best models were obtained using combinations of MARS with either stepwise MLR or PLS. It could be concluded that the use of combinations of a linear with a non-linear modeling technique results in some improved properties compared to the individual linear and non-linear models and that, when the use of such a combination is appropriate, combinations using MARS as non-linear technique should be preferred over those with BRT, due to some serious drawbacks of the BRT approaches.

  15. Multivariate regression applied to the performance optimization of a countercurrent ultracentrifuge - a preliminary study

    International Nuclear Information System (INIS)

    Migliavacca, Elder; Andrade, Delvonei Alves de

    2004-01-01

    In this work, the least-squares methodology with covariance matrix is applied to determine a data curve fitting in order to obtain a performance function for the separative power δU of a ultracentrifuge as a function of variables that are experimentally controlled. The experimental data refer to 173 experiments on the ultracentrifugation process for uranium isotope separation. The experimental uncertainties related with these independent variables are considered in the calculation of the experimental separative power values, determining an experimental data input covariance matrix. The process control variables, which significantly influence the δU values, are chosen in order to give information on the ultracentrifuge behaviour when submitted to several levels of feed flow F and cut θ . After the model goodness-of-fit validation, a residual analysis is carried out to verify the assumed basis concerning its randomness and independence and mainly the existence of residual heterocedasticity with any regression model variable. The response curves are made relating the separative power with the control variables F and θ, to compare the fitted model with the experimental data and finally to calculate their optimized values. (author)

  16. Prediction of diffuse solar irradiance using machine learning and multivariable regression

    International Nuclear Information System (INIS)

    Lou, Siwei; Li, Danny H.W.; Lam, Joseph C.; Chan, Wilco W.H.

    2016-01-01

    Highlights: • 54.9% of the annual global irradiance is composed by its diffuse part in HK. • Hourly diffuse irradiance was predicted by accessible variables. • The importance of variable in prediction was assessed by machine learning. • Simple prediction equations were developed with the knowledge of variable importance. - Abstract: The paper studies the horizontal global, direct-beam and sky-diffuse solar irradiance data measured in Hong Kong from 2008 to 2013. A machine learning algorithm was employed to predict the horizontal sky-diffuse irradiance and conduct sensitivity analysis for the meteorological variables. Apart from the clearness index (horizontal global/extra atmospheric solar irradiance), we found that predictors including solar altitude, air temperature, cloud cover and visibility are also important in predicting the diffuse component. The mean absolute error (MAE) of the logistic regression using the aforementioned predictors was less than 21.5 W/m"2 and 30 W/m"2 for Hong Kong and Denver, USA, respectively. With the systematic recording of the five variables for more than 35 years, the proposed model would be appropriate to estimate of long-term diffuse solar radiation, study climate change and develope typical meteorological year in Hong Kong and places with similar climates.

  17. Multivariate Regression Analysis and Statistical Modeling for Summer Extreme Precipitation over the Yangtze River Basin, China

    Directory of Open Access Journals (Sweden)

    Tao Gao

    2014-01-01

    Full Text Available Extreme precipitation is likely to be one of the most severe meteorological disasters in China; however, studies on the physical factors affecting precipitation extremes and corresponding prediction models are not accurately available. From a new point of view, the sensible heat flux (SHF and latent heat flux (LHF, which have significant impacts on summer extreme rainfall in Yangtze River basin (YRB, have been quantified and then selections of the impact factors are conducted. Firstly, a regional extreme precipitation index was applied to determine Regions of Significant Correlation (RSC by analyzing spatial distribution of correlation coefficients between this index and SHF, LHF, and sea surface temperature (SST on global ocean scale; then the time series of SHF, LHF, and SST in RSCs during 1967–2010 were selected. Furthermore, other factors that significantly affect variations in precipitation extremes over YRB were also selected. The methods of multiple stepwise regression and leave-one-out cross-validation (LOOCV were utilized to analyze and test influencing factors and statistical prediction model. The correlation coefficient between observed regional extreme index and model simulation result is 0.85, with significant level at 99%. This suggested that the forecast skill was acceptable although many aspects of the prediction model should be improved.

  18. Multivariate regression models for the simultaneous quantitative analysis of calcium and magnesium carbonates and magnesium oxide through drifts data

    Directory of Open Access Journals (Sweden)

    Marder Luciano

    2006-01-01

    Full Text Available In the present work multivariate regression models were developed for the quantitative analysis of ternary systems using Diffuse Reflectance Infrared Fourier Transform Spectroscopy (DRIFTS to determine the concentration in weight of calcium carbonate, magnesium carbonate and magnesium oxide. Nineteen spectra of standard samples previously defined in ternary diagram by mixture design were prepared and mid-infrared diffuse reflectance spectra were recorded. The partial least squares (PLS regression method was applied to the model. The spectra set was preprocessed by either mean-centered and variance-scaled (model 2 or mean-centered only (model 1. The results based on the prediction performance of the external validation set expressed by RMSEP (root mean square error of prediction demonstrated that it is possible to develop good models to simultaneously determine calcium carbonate, magnesium carbonate and magnesium oxide content in powdered samples that can be used in the study of the thermal decomposition of dolomite rocks.

  19. Perioperative factors predicting poor outcome in elderly patients following emergency general surgery: a multivariate regression analysis

    Science.gov (United States)

    Lees, Mackenzie C.; Merani, Shaheed; Tauh, Keerit; Khadaroo, Rachel G.

    2015-01-01

    Background Older adults (≥ 65 yr) are the fastest growing population and are presenting in increasing numbers for acute surgical care. Emergency surgery is frequently life threatening for older patients. Our objective was to identify predictors of mortality and poor outcome among elderly patients undergoing emergency general surgery. Methods We conducted a retrospective cohort study of patients aged 65–80 years undergoing emergency general surgery between 2009 and 2010 at a tertiary care centre. Demographics, comorbidities, in-hospital complications, mortality and disposition characteristics of patients were collected. Logistic regression analysis was used to identify covariate-adjusted predictors of in-hospital mortality and discharge of patients home. Results Our analysis included 257 patients with a mean age of 72 years; 52% were men. In-hospital mortality was 12%. Mortality was associated with patients who had higher American Society of Anesthesiologists (ASA) class (odds ratio [OR] 3.85, 95% confidence interval [CI] 1.43–10.33, p = 0.008) and in-hospital complications (OR 1.93, 95% CI 1.32–2.83, p = 0.001). Nearly two-thirds of patients discharged home were younger (OR 0.92, 95% CI 0.85–0.99, p = 0.036), had lower ASA class (OR 0.45, 95% CI 0.27–0.74, p = 0.002) and fewer in-hospital complications (OR 0.69, 95% CI 0.53–0.90, p = 0.007). Conclusion American Society of Anesthesiologists class and in-hospital complications are perioperative predictors of mortality and disposition in the older surgical population. Understanding the predictors of poor outcome and the importance of preventing in-hospital complications in older patients will have important clinical utility in terms of preoperative counselling, improving health care and discharging patients home. PMID:26204143

  20. A Logistic Regression Model with a Hierarchical Random Error Term for Analyzing the Utilization of Public Transport

    Directory of Open Access Journals (Sweden)

    Chong Wei

    2015-01-01

    Full Text Available Logistic regression models have been widely used in previous studies to analyze public transport utilization. These studies have shown travel time to be an indispensable variable for such analysis and usually consider it to be a deterministic variable. This formulation does not allow us to capture travelers’ perception error regarding travel time, and recent studies have indicated that this error can have a significant effect on modal choice behavior. In this study, we propose a logistic regression model with a hierarchical random error term. The proposed model adds a new random error term for the travel time variable. This term structure enables us to investigate travelers’ perception error regarding travel time from a given choice behavior dataset. We also propose an extended model that allows constraining the sign of this error in the model. We develop two Gibbs samplers to estimate the basic hierarchical model and the extended model. The performance of the proposed models is examined using a well-known dataset.

  1. Investigating the effects of climate variations on bacillary dysentery incidence in northeast China using ridge regression and hierarchical cluster analysis

    Directory of Open Access Journals (Sweden)

    Guo Junqiao

    2008-09-01

    Full Text Available Abstract Background The effects of climate variations on bacillary dysentery incidence have gained more recent concern. However, the multi-collinearity among meteorological factors affects the accuracy of correlation with bacillary dysentery incidence. Methods As a remedy, a modified method to combine ridge regression and hierarchical cluster analysis was proposed for investigating the effects of climate variations on bacillary dysentery incidence in northeast China. Results All weather indicators, temperatures, precipitation, evaporation and relative humidity have shown positive correlation with the monthly incidence of bacillary dysentery, while air pressure had a negative correlation with the incidence. Ridge regression and hierarchical cluster analysis showed that during 1987–1996, relative humidity, temperatures and air pressure affected the transmission of the bacillary dysentery. During this period, all meteorological factors were divided into three categories. Relative humidity and precipitation belonged to one class, temperature indexes and evaporation belonged to another class, and air pressure was the third class. Conclusion Meteorological factors have affected the transmission of bacillary dysentery in northeast China. Bacillary dysentery prevention and control would benefit from by giving more consideration to local climate variations.

  2. Forecasting the daily power output of a grid-connected photovoltaic system based on multivariate adaptive regression splines

    International Nuclear Information System (INIS)

    Li, Yanting; He, Yong; Su, Yan; Shu, Lianjie

    2016-01-01

    Highlights: • Suggests a nonparametric model based on MARS for output power prediction. • Compare the MARS model with a wide variety of prediction models. • Show that the MARS model is able to provide an overall good performance in both the training and testing stages. - Abstract: Both linear and nonlinear models have been proposed for forecasting the power output of photovoltaic systems. Linear models are simple to implement but less flexible. Due to the stochastic nature of the power output of PV systems, nonlinear models tend to provide better forecast than linear models. Motivated by this, this paper suggests a fairly simple nonlinear regression model known as multivariate adaptive regression splines (MARS), as an alternative to forecasting of solar power output. The MARS model is a data-driven modeling approach without any assumption about the relationship between the power output and predictors. It maintains simplicity of the classical multiple linear regression (MLR) model while possessing the capability of handling nonlinearity. It is simpler in format than other nonlinear models such as ANN, k-nearest neighbors (KNN), classification and regression tree (CART), and support vector machine (SVM). The MARS model was applied on the daily output of a grid-connected 2.1 kW PV system to provide the 1-day-ahead mean daily forecast of the power output. The comparisons with a wide variety of forecast models show that the MARS model is able to provide reliable forecast performance.

  3. [Multivariate ordinal logistic regression analysis on the association between consumption of fried food and both esophageal cancer and precancerous lesions].

    Science.gov (United States)

    Guo, L W; Liu, S Z; Zhang, M; Chen, Q; Zhang, S K; Sun, X B

    2017-12-10

    Objective: To investigate the effect of fried food intake on the pathogenesis of esophageal cancer and precancerous lesions. Methods: From 2005 to 2013, all the residents aged 40-69 years from 11 counties (cities) where cancer screening of upper gastrointestinal cancer had been conducted in rural areas of Henan province, were recruited as the subjects of study. Information on demography and lifestyle was collected. The residents under study were screened with iodine staining endoscopic examination and biopsy samples were diagnosed pathologically, under standardized criteria. Subjects with high risk were divided into the groups based on their different pathological degrees. Multivariate ordinal logistic regression analysis was used to analyze the relationship between the frequency of fried food intake and esophageal cancer and precancerous lesions. Results: A total number of 8 792 cases with normal esophagus, 3 680 with mild hyperplasia, 972 with moderate hyperplasia, 413 with severe hyperplasia carcinoma in situ, and 336 cases of esophageal cancer were recruited. Results from multivariate logistic regression analysis showed that, when compared with those who did not eat fried food, the intake of fried food (food appeared a risk factor for both esophageal cancer and precancerous lesions.

  4. Predicting multi-level drug response with gene expression profile in multiple myeloma using hierarchical ordinal regression.

    Science.gov (United States)

    Zhang, Xinyan; Li, Bingzong; Han, Huiying; Song, Sha; Xu, Hongxia; Hong, Yating; Yi, Nengjun; Zhuang, Wenzhuo

    2018-05-10

    Multiple myeloma (MM), like other cancers, is caused by the accumulation of genetic abnormalities. Heterogeneity exists in the patients' response to treatments, for example, bortezomib. This urges efforts to identify biomarkers from numerous molecular features and build predictive models for identifying patients that can benefit from a certain treatment scheme. However, previous studies treated the multi-level ordinal drug response as a binary response where only responsive and non-responsive groups are considered. It is desirable to directly analyze the multi-level drug response, rather than combining the response to two groups. In this study, we present a novel method to identify significantly associated biomarkers and then develop ordinal genomic classifier using the hierarchical ordinal logistic model. The proposed hierarchical ordinal logistic model employs the heavy-tailed Cauchy prior on the coefficients and is fitted by an efficient quasi-Newton algorithm. We apply our hierarchical ordinal regression approach to analyze two publicly available datasets for MM with five-level drug response and numerous gene expression measures. Our results show that our method is able to identify genes associated with the multi-level drug response and to generate powerful predictive models for predicting the multi-level response. The proposed method allows us to jointly fit numerous correlated predictors and thus build efficient models for predicting the multi-level drug response. The predictive model for the multi-level drug response can be more informative than the previous approaches. Thus, the proposed approach provides a powerful tool for predicting multi-level drug response and has important impact on cancer studies.

  5. Application of multivariate adaptive regression spine-assisted objective function on optimization of heat transfer rate around a cylinder

    Energy Technology Data Exchange (ETDEWEB)

    Dey, Prasenjit; Dad, Ajoy K. [Mechanical Engineering Department, National Institute of Technology, Agartala (India)

    2016-12-15

    The present study aims to predict the heat transfer characteristics around a square cylinder with different corner radii using multivariate adaptive regression splines (MARS). Further, the MARS-generated objective function is optimized by particle swarm optimization. The data for the prediction are taken from the recently published article by the present authors [P. Dey, A. Sarkar, A.K. Das, Development of GEP and ANN model to predict the unsteady forced convection over a cylinder, Neural Comput. Appl. (2015). Further, the MARS model is compared with artificial neural network and gene expression programming. It has been found that the MARS model is very efficient in predicting the heat transfer characteristics. It has also been found that MARS is more efficient than artificial neural network and gene expression programming in predicting the forced convection data, and also particle swarm optimization can efficiently optimize the heat transfer rate.

  6. Research on refugees and immigrants social integration in Yunnan Border Area: An empirical analysis on the multivariable linear regression model

    Directory of Open Access Journals (Sweden)

    Peng Nai

    2016-03-01

    Full Text Available A great number of immigration populations resident permanently in Yunnan Border Area of China. To some extent, these people belong to refugees or immigrants in accordance with International Rules, which significantly features the social diversity of this area. However, this kind of social diversity always impairs the social order. Therefore, there will be a positive influence to the local society governance by a research on local immigration integration. This essay hereby attempts to acquire the data of the living situation of these border area immigration and refugees. The analysis of the social integration of refugees and immigration in Yunnan border area in China will be deployed through the modeling of multivariable linear regression based on these data in order to propose some more achievable resolutions.

  7. Modelling lecturer performance index of private university in Tulungagung by using survival analysis with multivariate adaptive regression spline

    Science.gov (United States)

    Hasyim, M.; Prastyo, D. D.

    2018-03-01

    Survival analysis performs relationship between independent variables and survival time as dependent variable. In fact, not all survival data can be recorded completely by any reasons. In such situation, the data is called censored data. Moreover, several model for survival analysis requires assumptions. One of the approaches in survival analysis is nonparametric that gives more relax assumption. In this research, the nonparametric approach that is employed is Multivariate Regression Adaptive Spline (MARS). This study is aimed to measure the performance of private university’s lecturer. The survival time in this study is duration needed by lecturer to obtain their professional certificate. The results show that research activities is a significant factor along with developing courses material, good publication in international or national journal, and activities in research collaboration.

  8. Modeling the potential risk factors of bovine viral diarrhea prevalence in Egypt using univariable and multivariable logistic regression analyses

    Directory of Open Access Journals (Sweden)

    Abdelfattah M. Selim

    2018-03-01

    Full Text Available Aim: The present cross-sectional study was conducted to determine the seroprevalence and potential risk factors associated with Bovine viral diarrhea virus (BVDV disease in cattle and buffaloes in Egypt, to model the potential risk factors associated with the disease using logistic regression (LR models, and to fit the best predictive model for the current data. Materials and Methods: A total of 740 blood samples were collected within November 2012-March 2013 from animals aged between 6 months and 3 years. The potential risk factors studied were species, age, sex, and herd location. All serum samples were examined with indirect ELIZA test for antibody detection. Data were analyzed with different statistical approaches such as Chi-square test, odds ratios (OR, univariable, and multivariable LR models. Results: Results revealed a non-significant association between being seropositive with BVDV and all risk factors, except for species of animal. Seroprevalence percentages were 40% and 23% for cattle and buffaloes, respectively. OR for all categories were close to one with the highest OR for cattle relative to buffaloes, which was 2.237. Likelihood ratio tests showed a significant drop of the -2LL from univariable LR to multivariable LR models. Conclusion: There was an evidence of high seroprevalence of BVDV among cattle as compared with buffaloes with the possibility of infection in different age groups of animals. In addition, multivariable LR model was proved to provide more information for association and prediction purposes relative to univariable LR models and Chi-square tests if we have more than one predictor.

  9. Geoelectrical parameter-based multivariate regression borehole yield model for predicting aquifer yield in managing groundwater resource sustainability

    Directory of Open Access Journals (Sweden)

    Kehinde Anthony Mogaji

    2016-07-01

    Full Text Available This study developed a GIS-based multivariate regression (MVR yield rate prediction model of groundwater resource sustainability in the hard-rock geology terrain of southwestern Nigeria. This model can economically manage the aquifer yield rate potential predictions that are often overlooked in groundwater resources development. The proposed model relates the borehole yield rate inventory of the area to geoelectrically derived parameters. Three sets of borehole yield rate conditioning geoelectrically derived parameters—aquifer unit resistivity (ρ, aquifer unit thickness (D and coefficient of anisotropy (λ—were determined from the acquired and interpreted geophysical data. The extracted borehole yield rate values and the geoelectrically derived parameter values were regressed to develop the MVR relationship model by applying linear regression and GIS techniques. The sensitivity analysis results of the MVR model evaluated at P ⩽ 0.05 for the predictors ρ, D and λ provided values of 2.68 × 10−05, 2 × 10−02 and 2.09 × 10−06, respectively. The accuracy and predictive power tests conducted on the MVR model using the Theil inequality coefficient measurement approach, coupled with the sensitivity analysis results, confirmed the model yield rate estimation and prediction capability. The MVR borehole yield prediction model estimates were processed in a GIS environment to model an aquifer yield potential prediction map of the area. The information on the prediction map can serve as a scientific basis for predicting aquifer yield potential rates relevant in groundwater resources sustainability management. The developed MVR borehole yield rate prediction mode provides a good alternative to other methods used for this purpose.

  10. Assessment of susceptibility to earth-flow landslide using logistic regression and multivariate adaptive regression splines: A case of the Belice River basin (western Sicily, Italy)

    Science.gov (United States)

    Conoscenti, Christian; Ciaccio, Marilena; Caraballo-Arias, Nathalie Almaru; Gómez-Gutiérrez, Álvaro; Rotigliano, Edoardo; Agnesi, Valerio

    2015-08-01

    In this paper, terrain susceptibility to earth-flow occurrence was evaluated by using geographic information systems (GIS) and two statistical methods: Logistic regression (LR) and multivariate adaptive regression splines (MARS). LR has been already demonstrated to provide reliable predictions of earth-flow occurrence, whereas MARS, as far as we know, has never been used to generate earth-flow susceptibility models. The experiment was carried out in a basin of western Sicily (Italy), which extends for 51 km2 and is severely affected by earth-flows. In total, we mapped 1376 earth-flows, covering an area of 4.59 km2. To explore the effect of pre-failure topography on earth-flow spatial distribution, we performed a reconstruction of topography before the landslide occurrence. This was achieved by preparing a digital terrain model (DTM) where altitude of areas hosting landslides was interpolated from the adjacent undisturbed land surface by using the algorithm topo-to-raster. This DTM was exploited to extract 15 morphological and hydrological variables that, in addition to outcropping lithology, were employed as explanatory variables of earth-flow spatial distribution. The predictive skill of the earth-flow susceptibility models and the robustness of the procedure were tested by preparing five datasets, each including a different subset of landslides and stable areas. The accuracy of the predictive models was evaluated by drawing receiver operating characteristic (ROC) curves and by calculating the area under the ROC curve (AUC). The results demonstrate that the overall accuracy of LR and MARS earth-flow susceptibility models is from excellent to outstanding. However, AUC values of the validation datasets attest to a higher predictive power of MARS-models (AUC between 0.881 and 0.912) with respect to LR-models (AUC between 0.823 and 0.870). The adopted procedure proved to be resistant to overfitting and stable when changes of the learning and validation samples are

  11. Trochanteric entry femoral nails yield better femoral version and lower revision rates-A large cohort multivariate regression analysis.

    Science.gov (United States)

    Yoon, Richard S; Gage, Mark J; Galos, David K; Donegan, Derek J; Liporace, Frank A

    2017-06-01

    Intramedullary nailing (IMN) has become the standard of care for the treatment of most femoral shaft fractures. Different IMN options include trochanteric and piriformis entry as well as retrograde nails, which may result in varying degrees of femoral rotation. The objective of this study was to analyze postoperative femoral version between three types of nails and to delineate any significant differences in femoral version (DFV) and revision rates. Over a 10-year period, 417 patients underwent IMN of a diaphyseal femur fracture (AO/OTA 32A-C). Of these patients, 316 met inclusion criteria and obtained postoperative computed tomography (CT) scanograms to calculate femoral version and were thus included in the study. In this study, our main outcome measure was the difference in femoral version (DFV) between the uninjured limb and the injured limb. The effect of the following variables on DFV and revision rates were determined via univariate, multivariate, and ordinal regression analyses: gender, age, BMI, ethnicity, mechanism of injury, operative side, open fracture, and table type/position. Statistical significance was set at pregression analysis revealed that a lower BMI was significantly associated with a lower DFV (p=0.006). Controlling for possible covariables, multivariate analysis yielded a significantly lower DFV for trochanteric entry nails than piriformis or retrograde nails (7.9±6.10 vs. 9.5±7.4 vs. 9.4±7.8°, pregression analysis. However, this is not to state that the other nail types exhibited abnormal DFV. Translation to the clinical impact of a few degrees of DFV is also unknown. Future studies to more in-depth study the intricacies of femoral version may lead to improved technology in addition to potentially improved clinical outcomes. Copyright © 2017 Elsevier Ltd. All rights reserved.

  12. A retrospective study: Multivariate logistic regression analysis of the outcomes after pressure sores reconstruction with fasciocutaneous, myocutaneous, and perforator flaps.

    Science.gov (United States)

    Chiu, Yu-Jen; Liao, Wen-Chieh; Wang, Tien-Hsiang; Shih, Yu-Chung; Ma, Hsu; Lin, Chih-Hsun; Wu, Szu-Hsien; Perng, Cherng-Kang

    2017-08-01

    Despite significant advances in medical care and surgical techniques, pressure sore reconstruction is still prone to elevated rates of complication and recurrence. We conducted a retrospective study to investigate not only complication and recurrence rates following pressure sore reconstruction but also preoperative risk stratification. This study included 181 ulcers underwent flap operations between January 2002 and December 2013 were included in the study. We performed a multivariable logistic regression model, which offers a regression-based method accounting for the within-patient correlation of the success or failure of each flap. The overall complication and recurrence rates for all flaps were 46.4% and 16.0%, respectively, with a mean follow-up period of 55.4 ± 38.0 months. No statistically significant differences of complication and recurrence rates were observed among three different reconstruction methods. In subsequent analysis, albumin ≤3.0 g/dl and paraplegia were significantly associated with higher postoperative complication. The anatomic factor, ischial wound location, significantly trended toward the development of ulcer recurrence. In the fasciocutaneous group, paraplegia had significant correlation to higher complication and recurrence rates. In the musculocutaneous flap group, variables had no significant correlation to complication and recurrence rates. In the free-style perforator group, ischial wound location and malnourished status correlated with significantly higher complication rates; ischial wound location also correlated with significantly higher recurrence rate. Ultimately, our review of a noteworthy cohort with lengthy follow-up helped identify and confirm certain risk factors that can facilitate a more informed and thoughtful pre- and postoperative decision-making process for patients with pressure ulcers. Copyright © 2017 British Association of Plastic, Reconstructive and Aesthetic Surgeons. Published by Elsevier Ltd. All

  13. Using multiobjective tradeoff sets and Multivariate Regression Trees to identify critical and robust decisions for long term water utility planning

    Science.gov (United States)

    Smith, R.; Kasprzyk, J. R.; Balaji, R.

    2017-12-01

    In light of deeply uncertain factors like future climate change and population shifts, responsible resource management will require new types of information and strategies. For water utilities, this entails potential expansion and efficient management of water supply infrastructure systems for changes in overall supply; changes in frequency and severity of climate extremes such as droughts and floods; and variable demands, all while accounting for conflicting long and short term performance objectives. Multiobjective Evolutionary Algorithms (MOEAs) are emerging decision support tools that have been used by researchers and, more recently, water utilities to efficiently generate and evaluate thousands of planning portfolios. The tradeoffs between conflicting objectives are explored in an automated way to produce (often large) suites of portfolios that strike different balances of performance. Once generated, the sets of optimized portfolios are used to support relatively subjective assertions of priorities and human reasoning, leading to adoption of a plan. These large tradeoff sets contain information about complex relationships between decisions and between groups of decisions and performance that, until now, has not been quantitatively described. We present a novel use of Multivariate Regression Trees (MRTs) to analyze tradeoff sets to reveal these relationships and critical decisions. Additionally, when MRTs are applied to tradeoff sets developed for different realizations of an uncertain future, they can identify decisions that are robust across a wide range of conditions and produce fundamental insights about the system being optimized.

  14. A Model for Shovel Capital Cost Estimation, Using a Hybrid Model of Multivariate Regression and Neural Networks

    Directory of Open Access Journals (Sweden)

    Abdolreza Yazdani-Chamzini

    2017-12-01

    Full Text Available Cost estimation is an essential issue in feasibility studies in civil engineering. Many different methods can be applied to modelling costs. These methods can be divided into several main groups: (1 artificial intelligence, (2 statistical methods, and (3 analytical methods. In this paper, the multivariate regression (MVR method, which is one of the most popular linear models, and the artificial neural network (ANN method, which is widely applied to solving different prediction problems with a high degree of accuracy, have been combined to provide a cost estimate model for a shovel machine. This hybrid methodology is proposed, taking the advantages of MVR and ANN models in linear and nonlinear modelling, respectively. In the proposed model, the unique advantages of the MVR model in linear modelling are used first to recognize the existing linear structure in data, and, then, the ANN for determining nonlinear patterns in preprocessed data is applied. The results with three indices indicate that the proposed model is efficient and capable of increasing the prediction accuracy.

  15. Multivariate Adaptative Regression Splines (MARS, una alternativa para el análisis de series de tiempo

    Directory of Open Access Journals (Sweden)

    Jairo Vanegas

    2017-05-01

    Full Text Available Multivariate Adaptative Regression Splines (MARS es un método de modelación no paramétrico que extiende el modelo lineal incorporando no linealidades e interacciones de variables. Es una herramienta flexible que automatiza la construcción de modelos de predicción, seleccionando variables relevantes, transformando las variables predictoras, tratando valores perdidos y previniendo sobreajustes mediante un autotest. También permite predecir tomando en cuenta factores estructurales que pudieran tener influencia sobre la variable respuesta, generando modelos hipotéticos. El resultado final serviría para identificar puntos de corte relevantes en series de datos. En el área de la salud es poco utilizado, por lo que se propone como una herramienta más para la evaluación de indicadores relevantes en salud pública. Para efectos demostrativos se utilizaron series de datos de mortalidad de menores de 5 años de Costa Rica en el periodo 1978-2008.

  16. Modelling daily dissolved oxygen concentration using least square support vector machine, multivariate adaptive regression splines and M5 model tree

    Science.gov (United States)

    Heddam, Salim; Kisi, Ozgur

    2018-04-01

    In the present study, three types of artificial intelligence techniques, least square support vector machine (LSSVM), multivariate adaptive regression splines (MARS) and M5 model tree (M5T) are applied for modeling daily dissolved oxygen (DO) concentration using several water quality variables as inputs. The DO concentration and water quality variables data from three stations operated by the United States Geological Survey (USGS) were used for developing the three models. The water quality data selected consisted of daily measured of water temperature (TE, °C), pH (std. unit), specific conductance (SC, μS/cm) and discharge (DI cfs), are used as inputs to the LSSVM, MARS and M5T models. The three models were applied for each station separately and compared to each other. According to the results obtained, it was found that: (i) the DO concentration could be successfully estimated using the three models and (ii) the best model among all others differs from one station to another.

  17. Is ovarian hyperstimulation associated with higher blood pressure in 4-year-old IVF offspring? Part I: multivariable regression analysis.

    Science.gov (United States)

    Seggers, Jorien; Haadsma, Maaike L; La Bastide-Van Gemert, Sacha; Heineman, Maas Jan; Middelburg, Karin J; Roseboom, Tessa J; Schendelaar, Pamela; Van den Heuvel, Edwin R; Hadders-Algra, Mijna

    2014-03-01

    Does ovarian hyperstimulation, the in vitro procedure, or a combination of these two negatively influence blood pressure (BP) and anthropometrics of 4-year-old children born following IVF? Higher systolic blood pressure (SBP) percentiles were found in 4-year-old children born following conventional IVF with ovarian hyperstimulation compared with children born following IVF without ovarian hyperstimulation. Increasing evidence suggests that IVF, which has an increased incidence of preterm birth and low birthweight, is associated with higher BP and altered body fat distribution in offspring but the underlying mechanisms are largely unknown. We performed a prospective, assessor-blinded follow-up study in which 194 children were assessed. The attrition rate up until the 4-year-old assessment was 10%. We measured BP and anthropometrics of 4-year-old singletons born following conventional IVF with controlled ovarian hyperstimulation (COH-IVF, n = 63), or born following modified natural cycle IV (MNC-IVF, n = 52), or born to subfertile couples who conceived naturally (Sub-NC, n = 79). Both IVF and ICSI were performed. Primary outcome measures were the SBP percentiles and diastolic BP (DBP) percentiles. Anthropometric measures included triceps and subscapular skinfold thickness. Several multivariable regression analyses were applied in order to correct for subsets of confounders. The value 'B' is the unstandardized regression coefficient. SBP percentiles were significantly lower in the MNC-IVF group (mean 59, SD 24) than in the COH-IVF (mean 68, SD 22) and Sub-NC groups (mean 70, SD 16). The difference in SBP between COH-IVF and MNC-IVF remained significant after correction for current, early life and parental characteristics (B: 14.09; 95% confidence interval (CI): 5.39-22.79), whereas the difference between MNC-IVF and Sub-NC did not. DBP percentiles did not differ between groups. After correction for early life factors, subscapular skinfold thickness was thicker in the

  18. Predictive Ability of Pender's Health Promotion Model for Physical Activity and Exercise in People with Spinal Cord Injuries: A Hierarchical Regression Analysis

    Science.gov (United States)

    Keegan, John P.; Chan, Fong; Ditchman, Nicole; Chiu, Chung-Yi

    2012-01-01

    The main objective of this study was to validate Pender's Health Promotion Model (HPM) as a motivational model for exercise/physical activity self-management for people with spinal cord injuries (SCIs). Quantitative descriptive research design using hierarchical regression analysis (HRA) was used. A total of 126 individuals with SCI were recruited…

  19. Study of risk factors affecting both hypertension and obesity outcome by using multivariate multilevel logistic regression models

    Directory of Open Access Journals (Sweden)

    Sepedeh Gholizadeh

    2016-07-01

    Full Text Available Background:Obesity and hypertension are the most important non-communicable diseases thatin many studies, the prevalence and their risk factors have been performedin each geographic region univariately.Study of factors affecting both obesity and hypertension may have an important role which to be adrressed in this study. Materials &Methods:This cross-sectional study was conducted on 1000 men aged 20-70 living in Bushehr province. Blood pressure was measured three times and the average of them was considered as one of the response variables. Hypertension was defined as systolic blood pressure ≥140 (and-or diastolic blood pressure ≥90 and obesity was defined as body mass index ≥25. Data was analyzed by using multilevel, multivariate logistic regression model by MlwiNsoftware. Results:Intra class correlations in cluster level obtained 33% for high blood pressure and 37% for obesity, so two level model was fitted to data. The prevalence of obesity and hypertension obtained 43.6% (0.95%CI; 40.6-46.5, 29.4% (0.95%CI; 26.6-32.1 respectively. Age, gender, smoking, hyperlipidemia, diabetes, fruit and vegetable consumption and physical activity were the factors affecting blood pressure (p≤0.05. Age, gender, hyperlipidemia, diabetes, fruit and vegetable consumption, physical activity and place of residence are effective on obesity (p≤0.05. Conclusion: The multilevel models with considering levels distribution provide more precise estimates. As regards obesity and hypertension are the major risk factors for cardiovascular disease, by knowing the high-risk groups we can d careful planning to prevention of non-communicable diseases and promotion of society health.

  20. Remote-sensing data processing with the multivariate regression analysis method for iron mineral resource potential mapping: a case study in the Sarvian area, central Iran

    Science.gov (United States)

    Mansouri, Edris; Feizi, Faranak; Jafari Rad, Alireza; Arian, Mehran

    2018-03-01

    This paper uses multivariate regression to create a mathematical model for iron skarn exploration in the Sarvian area, central Iran, using multivariate regression for mineral prospectivity mapping (MPM). The main target of this paper is to apply multivariate regression analysis (as an MPM method) to map iron outcrops in the northeastern part of the study area in order to discover new iron deposits in other parts of the study area. Two types of multivariate regression models using two linear equations were employed to discover new mineral deposits. This method is one of the reliable methods for processing satellite images. ASTER satellite images (14 bands) were used as unique independent variables (UIVs), and iron outcrops were mapped as dependent variables for MPM. According to the results of the probability value (p value), coefficient of determination value (R2) and adjusted determination coefficient (Radj2), the second regression model (which consistent of multiple UIVs) fitted better than other models. The accuracy of the model was confirmed by iron outcrops map and geological observation. Based on field observation, iron mineralization occurs at the contact of limestone and intrusive rocks (skarn type).

  1. PARAMETRIC AND NON PARAMETRIC (MARS: MULTIVARIATE ADDITIVE REGRESSION SPLINES) LOGISTIC REGRESSIONS FOR PREDICTION OF A DICHOTOMOUS RESPONSE VARIABLE WITH AN EXAMPLE FOR PRESENCE/ABSENCE OF AMPHIBIANS

    Science.gov (United States)

    The purpose of this report is to provide a reference manual that could be used by investigators for making informed use of logistic regression using two methods (standard logistic regression and MARS). The details for analyses of relationships between a dependent binary response ...

  2. Effects of univariate and multivariate regression on the accuracy of hydrogen quantification with laser-induced breakdown spectroscopy

    Science.gov (United States)

    Ytsma, Cai R.; Dyar, M. Darby

    2018-01-01

    Hydrogen (H) is a critical element to measure on the surface of Mars because its presence in mineral structures is indicative of past hydrous conditions. The Curiosity rover uses the laser-induced breakdown spectrometer (LIBS) on the ChemCam instrument to analyze rocks for their H emission signal at 656.6 nm, from which H can be quantified. Previous LIBS calibrations for H used small data sets measured on standards and/or manufactured mixtures of hydrous minerals and rocks and applied univariate regression to spectra normalized in a variety of ways. However, matrix effects common to LIBS make these calibrations of limited usefulness when applied to the broad range of compositions on the Martian surface. In this study, 198 naturally-occurring hydrous geological samples covering a broad range of bulk compositions with directly-measured H content are used to create more robust prediction models for measuring H in LIBS data acquired under Mars conditions. Both univariate and multivariate prediction models, including partial least square (PLS) and the least absolute shrinkage and selection operator (Lasso), are compared using several different methods for normalization of H peak intensities. Data from the ChemLIBS Mars-analog spectrometer at Mount Holyoke College are compared against spectra from the same samples acquired using a ChemCam-like instrument at Los Alamos National Laboratory and the ChemCam instrument on Mars. Results show that all current normalization and data preprocessing variations for quantifying H result in models with statistically indistinguishable prediction errors (accuracies) ca. ± 1.5 weight percent (wt%) H2O, limiting the applications of LIBS in these implementations for geological studies. This error is too large to allow distinctions among the most common hydrous phases (basalts, amphiboles, micas) to be made, though some clays (e.g., chlorites with ≈ 12 wt% H2O, smectites with 15-20 wt% H2O) and hydrated phases (e.g., gypsum with ≈ 20

  3. Assessing the response of area burned to changing climate in western boreal North America using a Multivariate Adaptive Regression Splines (MARS) approach

    Science.gov (United States)

    Michael S. Balshi; A. David McGuire; Paul Duffy; Mike Flannigan; John Walsh; Jerry Melillo

    2009-01-01

    We developed temporally and spatially explicit relationships between air temperature and fuel moisture codes derived from the Canadian Fire Weather Index System to estimate annual area burned at 2.5o (latitude x longitude) resolution using a Multivariate Adaptive Regression Spline (MARS) approach across Alaska and Canada. Burned area was...

  4. The Covariance Adjustment Approaches for Combining Incomparable Cox Regressions Caused by Unbalanced Covariates Adjustment: A Multivariate Meta-Analysis Study

    Directory of Open Access Journals (Sweden)

    Tania Dehesh

    2015-01-01

    Full Text Available Background. Univariate meta-analysis (UM procedure, as a technique that provides a single overall result, has become increasingly popular. Neglecting the existence of other concomitant covariates in the models leads to loss of treatment efficiency. Our aim was proposing four new approximation approaches for the covariance matrix of the coefficients, which is not readily available for the multivariate generalized least square (MGLS method as a multivariate meta-analysis approach. Methods. We evaluated the efficiency of four new approaches including zero correlation (ZC, common correlation (CC, estimated correlation (EC, and multivariate multilevel correlation (MMC on the estimation bias, mean square error (MSE, and 95% probability coverage of the confidence interval (CI in the synthesis of Cox proportional hazard models coefficients in a simulation study. Result. Comparing the results of the simulation study on the MSE, bias, and CI of the estimated coefficients indicated that MMC approach was the most accurate procedure compared to EC, CC, and ZC procedures. The precision ranking of the four approaches according to all above settings was MMC ≥ EC ≥ CC ≥ ZC. Conclusion. This study highlights advantages of MGLS meta-analysis on UM approach. The results suggested the use of MMC procedure to overcome the lack of information for having a complete covariance matrix of the coefficients.

  5. The Covariance Adjustment Approaches for Combining Incomparable Cox Regressions Caused by Unbalanced Covariates Adjustment: A Multivariate Meta-Analysis Study.

    Science.gov (United States)

    Dehesh, Tania; Zare, Najaf; Ayatollahi, Seyyed Mohammad Taghi

    2015-01-01

    Univariate meta-analysis (UM) procedure, as a technique that provides a single overall result, has become increasingly popular. Neglecting the existence of other concomitant covariates in the models leads to loss of treatment efficiency. Our aim was proposing four new approximation approaches for the covariance matrix of the coefficients, which is not readily available for the multivariate generalized least square (MGLS) method as a multivariate meta-analysis approach. We evaluated the efficiency of four new approaches including zero correlation (ZC), common correlation (CC), estimated correlation (EC), and multivariate multilevel correlation (MMC) on the estimation bias, mean square error (MSE), and 95% probability coverage of the confidence interval (CI) in the synthesis of Cox proportional hazard models coefficients in a simulation study. Comparing the results of the simulation study on the MSE, bias, and CI of the estimated coefficients indicated that MMC approach was the most accurate procedure compared to EC, CC, and ZC procedures. The precision ranking of the four approaches according to all above settings was MMC ≥ EC ≥ CC ≥ ZC. This study highlights advantages of MGLS meta-analysis on UM approach. The results suggested the use of MMC procedure to overcome the lack of information for having a complete covariance matrix of the coefficients.

  6. Remote sensing and GIS-based landslide hazard analysis and cross-validation using multivariate logistic regression model on three test areas in Malaysia

    Science.gov (United States)

    Pradhan, Biswajeet

    2010-05-01

    This paper presents the results of the cross-validation of a multivariate logistic regression model using remote sensing data and GIS for landslide hazard analysis on the Penang, Cameron, and Selangor areas in Malaysia. Landslide locations in the study areas were identified by interpreting aerial photographs and satellite images, supported by field surveys. SPOT 5 and Landsat TM satellite imagery were used to map landcover and vegetation index, respectively. Maps of topography, soil type, lineaments and land cover were constructed from the spatial datasets. Ten factors which influence landslide occurrence, i.e., slope, aspect, curvature, distance from drainage, lithology, distance from lineaments, soil type, landcover, rainfall precipitation, and normalized difference vegetation index (ndvi), were extracted from the spatial database and the logistic regression coefficient of each factor was computed. Then the landslide hazard was analysed using the multivariate logistic regression coefficients derived not only from the data for the respective area but also using the logistic regression coefficients calculated from each of the other two areas (nine hazard maps in all) as a cross-validation of the model. For verification of the model, the results of the analyses were then compared with the field-verified landslide locations. Among the three cases of the application of logistic regression coefficient in the same study area, the case of Selangor based on the Selangor logistic regression coefficients showed the highest accuracy (94%), where as Penang based on the Penang coefficients showed the lowest accuracy (86%). Similarly, among the six cases from the cross application of logistic regression coefficient in other two areas, the case of Selangor based on logistic coefficient of Cameron showed highest (90%) prediction accuracy where as the case of Penang based on the Selangor logistic regression coefficients showed the lowest accuracy (79%). Qualitatively, the cross

  7. Multivariate research in areas of phosphorus cast-iron brake shoes manufacturing using the statistical analysis and the multiple regression equations

    Science.gov (United States)

    Kiss, I.; Cioată, V. G.; Alexa, V.; Raţiu, S. A.

    2017-05-01

    The braking system is one of the most important and complex subsystems of railway vehicles, especially when it comes for safety. Therefore, installing efficient safe brakes on the modern railway vehicles is essential. Nowadays is devoted attention to solving problems connected with using high performance brake materials and its impact on thermal and mechanical loading of railway wheels. The main factor that influences the selection of a friction material for railway applications is the performance criterion, due to the interaction between the brake block and the wheel produce complex thermos-mechanical phenomena. In this work, the investigated subjects are the cast-iron brake shoes, which are still widely used on freight wagons. Therefore, the cast-iron brake shoes - with lamellar graphite and with a high content of phosphorus (0.8-1.1%) - need a special investigation. In order to establish the optimal condition for the cast-iron brake shoes we proposed a mathematical modelling study by using the statistical analysis and multiple regression equations. Multivariate research is important in areas of cast-iron brake shoes manufacturing, because many variables interact with each other simultaneously. Multivariate visualization comes to the fore when researchers have difficulties in comprehending many dimensions at one time. Technological data (hardness and chemical composition) obtained from cast-iron brake shoes were used for this purpose. In order to settle the multiple correlation between the hardness of the cast-iron brake shoes, and the chemical compositions elements several model of regression equation types has been proposed. Because a three-dimensional surface with variables on three axes is a common way to illustrate multivariate data, in which the maximum and minimum values are easily highlighted, we plotted graphical representation of the regression equations in order to explain interaction of the variables and locate the optimal level of each variable for

  8. An Alternative Flight Software Trigger Paradigm: Applying Multivariate Logistic Regression to Sense Trigger Conditions Using Inaccurate or Scarce Information

    Science.gov (United States)

    Smith, Kelly M.; Gay, Robert S.; Stachowiak, Susan J.

    2013-01-01

    In late 2014, NASA will fly the Orion capsule on a Delta IV-Heavy rocket for the Exploration Flight Test-1 (EFT-1) mission. For EFT-1, the Orion capsule will be flying with a new GPS receiver and new navigation software. Given the experimental nature of the flight, the flight software must be robust to the loss of GPS measurements. Once the high-speed entry is complete, the drogue parachutes must be deployed within the proper conditions to stabilize the vehicle prior to deploying the main parachutes. When GPS is available in nominal operations, the vehicle will deploy the drogue parachutes based on an altitude trigger. However, when GPS is unavailable, the navigated altitude errors become excessively large, driving the need for a backup barometric altimeter to improve altitude knowledge. In order to increase overall robustness, the vehicle also has an alternate method of triggering the parachute deployment sequence based on planet-relative velocity if both the GPS and the barometric altimeter fail. However, this backup trigger results in large altitude errors relative to the targeted altitude. Motivated by this challenge, this paper demonstrates how logistic regression may be employed to semi-automatically generate robust triggers based on statistical analysis. Logistic regression is used as a ground processor pre-flight to develop a statistical classifier. The classifier would then be implemented in flight software and executed in real-time. This technique offers improved performance even in the face of highly inaccurate measurements. Although the logistic regression-based trigger approach will not be implemented within EFT-1 flight software, the methodology can be carried forward for future missions and vehicles.

  9. An Alternative Flight Software Paradigm: Applying Multivariate Logistic Regression to Sense Trigger Conditions using Inaccurate or Scarce Information

    Science.gov (United States)

    Smith, Kelly; Gay, Robert; Stachowiak, Susan

    2013-01-01

    In late 2014, NASA will fly the Orion capsule on a Delta IV-Heavy rocket for the Exploration Flight Test-1 (EFT-1) mission. For EFT-1, the Orion capsule will be flying with a new GPS receiver and new navigation software. Given the experimental nature of the flight, the flight software must be robust to the loss of GPS measurements. Once the high-speed entry is complete, the drogue parachutes must be deployed within the proper conditions to stabilize the vehicle prior to deploying the main parachutes. When GPS is available in nominal operations, the vehicle will deploy the drogue parachutes based on an altitude trigger. However, when GPS is unavailable, the navigated altitude errors become excessively large, driving the need for a backup barometric altimeter to improve altitude knowledge. In order to increase overall robustness, the vehicle also has an alternate method of triggering the parachute deployment sequence based on planet-relative velocity if both the GPS and the barometric altimeter fail. However, this backup trigger results in large altitude errors relative to the targeted altitude. Motivated by this challenge, this paper demonstrates how logistic regression may be employed to semi-automatically generate robust triggers based on statistical analysis. Logistic regression is used as a ground processor pre-flight to develop a statistical classifier. The classifier would then be implemented in flight software and executed in real-time. This technique offers improved performance even in the face of highly inaccurate measurements. Although the logistic regression-based trigger approach will not be implemented within EFT-1 flight software, the methodology can be carried forward for future missions and vehicles

  10. New strategy for determination of anthocyanins, polyphenols and antioxidant capacity of Brassica oleracea liquid extract using infrared spectroscopies and multivariate regression

    Science.gov (United States)

    de Oliveira, Isadora R. N.; Roque, Jussara V.; Maia, Mariza P.; Stringheta, Paulo C.; Teófilo, Reinaldo F.

    2018-04-01

    A new method was developed to determine the antioxidant properties of red cabbage extract (Brassica oleracea) by mid (MID) and near (NIR) infrared spectroscopies and partial least squares (PLS) regression. A 70% (v/v) ethanolic extract of red cabbage was concentrated to 9° Brix and further diluted (12 to 100%) in water. The dilutions were used as external standards for the building of PLS models. For the first time, this strategy was applied for building multivariate regression models. Reference analyses and spectral data were obtained from diluted extracts. The determinate properties were total and monomeric anthocyanins, total polyphenols and antioxidant capacity by ABTS (2,2-azino-bis(3-ethyl-benzothiazoline-6-sulfonate)) and DPPH (2,2-diphenyl-1-picrylhydrazyl) methods. Ordered predictors selection (OPS) and genetic algorithm (GA) were used for feature selection before PLS regression (PLS-1). In addition, a PLS-2 regression was applied to all properties simultaneously. PLS-1 models provided more predictive models than did PLS-2 regression. PLS-OPS and PLS-GA models presented excellent prediction results with a correlation coefficient higher than 0.98. However, the best models were obtained using PLS and variable selection with the OPS algorithm and the models based on NIR spectra were considered more predictive for all properties. Then, these models provided a simple, rapid and accurate method for determination of red cabbage extract antioxidant properties and its suitability for use in the food industry.

  11. Partitioning of Multivariate Phenotypes using Regression Trees Reveals Complex Patterns of Adaptation to Climate across the Range of Black Cottonwood (Populus trichocarpa

    Directory of Open Access Journals (Sweden)

    Regis Wendpouire Oubida

    2015-03-01

    Full Text Available Local adaptation to climate in temperate forest trees involves the integration of multiple physiological, morphological, and phenological traits. Latitudinal clines are frequently observed for these traits, but environmental constraints also track longitude and altitude. We combined extensive phenotyping of 12 candidate adaptive traits, multivariate regression trees, quantitative genetics, and a genome-wide panel of SNP markers to better understand the interplay among geography, climate, and adaptation to abiotic factors in Populus trichocarpa. Heritabilities were low to moderate (0.13 to 0.32 and population differentiation for many traits exceeded the 99th percentile of the genome-wide distribution of FST, suggesting local adaptation. When climate variables were taken as predictors and the 12 traits as response variables in a multivariate regression tree analysis, evapotranspiration (Eref explained the most variation, with subsequent splits related to mean temperature of the warmest month, frost-free period (FFP, and mean annual precipitation (MAP. These grouping matched relatively well the splits using geographic variables as predictors: the northernmost groups (short FFP and low Eref had the lowest growth, and lowest cold injury index; the southern British Columbia group (low Eref and intermediate temperatures had average growth and cold injury index; the group from the coast of California and Oregon (high Eref and FFP had the highest growth performance and the highest cold injury index; and the southernmost, high-altitude group (with high Eref and low FFP performed poorly, had high cold injury index, and lower water use efficiency. Taken together, these results suggest variation in both temperature and water availability across the range shape multivariate adaptive traits in poplar.

  12. Regressão múltipla stepwise e hierárquica em Psicologia Organizacional: aplicações, problemas e soluções Stepwise and hierarchical multiple regression in organizational psychology: Applications, problemas and solutions

    Directory of Open Access Journals (Sweden)

    Gardênia Abbad

    2002-01-01

    Full Text Available Este artigo discute algumas aplicações das técnicas de análise de regressão múltipla stepwise e hierárquica, as quais são muito utilizadas em pesquisas da área de Psicologia Organizacional. São discutidas algumas estratégias de identificação e de solução de problemas relativos à ocorrência de erros do Tipo I e II e aos fenômenos de supressão, complementaridade e redundância nas equações de regressão múltipla. São apresentados alguns exemplos de pesquisas nas quais esses padrões de associação entre variáveis estiveram presentes e descritas as estratégias utilizadas pelos pesquisadores para interpretá-los. São discutidas as aplicações dessas análises no estudo de interação entre variáveis e na realização de testes para avaliação da linearidade do relacionamento entre variáveis. Finalmente, são apresentadas sugestões para lidar com as limitações das análises de regressão múltipla (stepwise e hierárquica.This article discusses applications of stepwise and hierarchical multiple regression analyses to research in organizational psychology. Strategies for identifying type I and II errors, and solutions to potential problems that may arise from such errors are proposed. In addition, phenomena such as suppression, complementarity, and redundancy are reviewed. The article presents examples of research where these phenomena occurred, and the manner in which they were explained by researchers. Some applications of multiple regression analyses to studies involving between-variable interactions are presented, along with tests used to analyze the presence of linearity among variables. Finally, some suggestions are provided for dealing with limitations implicit in multiple regression analyses (stepwise and hierarchical.

  13. Simultaneous determination of estrogens (ethinylestradiol and norgestimate) concentrations in human and bovine serum albumin by use of fluorescence spectroscopy and multivariate regression analysis.

    Science.gov (United States)

    Hordge, LaQuana N; McDaniel, Kiara L; Jones, Derick D; Fakayode, Sayo O

    2016-05-15

    The endocrine disruption property of estrogens necessitates the immediate need for effective monitoring and development of analytical protocols for their analyses in biological and human specimens. This study explores the first combined utility of a steady-state fluorescence spectroscopy and multivariate partial-least-square (PLS) regression analysis for the simultaneous determination of two estrogens (17α-ethinylestradiol (EE) and norgestimate (NOR)) concentrations in bovine serum albumin (BSA) and human serum albumin (HSA) samples. The influence of EE and NOR concentrations and temperature on the emission spectra of EE-HSA EE-BSA, NOR-HSA, and NOR-BSA complexes was also investigated. The binding of EE with HSA and BSA resulted in increase in emission characteristics of HSA and BSA and a significant blue spectra shift. In contrast, the interaction of NOR with HSA and BSA quenched the emission characteristics of HSA and BSA. The observed emission spectral shifts preclude the effective use of traditional univariate regression analysis of fluorescent data for the determination of EE and NOR concentrations in HSA and BSA samples. Multivariate partial-least-squares (PLS) regression analysis was utilized to correlate the changes in emission spectra with EE and NOR concentrations in HSA and BSA samples. The figures-of-merit of the developed PLS regression models were excellent, with limits of detection as low as 1.6×10(-8) M for EE and 2.4×10(-7) M for NOR and good linearity (R(2)>0.994985). The PLS models correctly predicted EE and NOR concentrations in independent validation HSA and BSA samples with a root-mean-square-percent-relative-error (RMS%RE) of less than 6.0% at physiological condition. On the contrary, the use of univariate regression resulted in poor predictions of EE and NOR in HSA and BSA samples, with RMS%RE larger than 40% at physiological conditions. High accuracy, low sensitivity, simplicity, low-cost with no prior analyte extraction or separation

  14. A New Predictive Model Based on the ABC Optimized Multivariate Adaptive Regression Splines Approach for Predicting the Remaining Useful Life in Aircraft Engines

    Directory of Open Access Journals (Sweden)

    Paulino José García Nieto

    2016-05-01

    Full Text Available Remaining useful life (RUL estimation is considered as one of the most central points in the prognostics and health management (PHM. The present paper describes a nonlinear hybrid ABC–MARS-based model for the prediction of the remaining useful life of aircraft engines. Indeed, it is well-known that an accurate RUL estimation allows failure prevention in a more controllable way so that the effective maintenance can be carried out in appropriate time to correct impending faults. The proposed hybrid model combines multivariate adaptive regression splines (MARS, which have been successfully adopted for regression problems, with the artificial bee colony (ABC technique. This optimization technique involves parameter setting in the MARS training procedure, which significantly influences the regression accuracy. However, its use in reliability applications has not yet been widely explored. Bearing this in mind, remaining useful life values have been predicted here by using the hybrid ABC–MARS-based model from the remaining measured parameters (input variables for aircraft engines with success. A correlation coefficient equal to 0.92 was obtained when this hybrid ABC–MARS-based model was applied to experimental data. The agreement of this model with experimental data confirmed its good performance. The main advantage of this predictive model is that it does not require information about the previous operation states of the aircraft engine.

  15. Stock price forecasting for companies listed on Tehran stock exchange using multivariate adaptive regression splines model and semi-parametric splines technique

    Science.gov (United States)

    Rounaghi, Mohammad Mahdi; Abbaszadeh, Mohammad Reza; Arashi, Mohammad

    2015-11-01

    One of the most important topics of interest to investors is stock price changes. Investors whose goals are long term are sensitive to stock price and its changes and react to them. In this regard, we used multivariate adaptive regression splines (MARS) model and semi-parametric splines technique for predicting stock price in this study. The MARS model as a nonparametric method is an adaptive method for regression and it fits for problems with high dimensions and several variables. semi-parametric splines technique was used in this study. Smoothing splines is a nonparametric regression method. In this study, we used 40 variables (30 accounting variables and 10 economic variables) for predicting stock price using the MARS model and using semi-parametric splines technique. After investigating the models, we select 4 accounting variables (book value per share, predicted earnings per share, P/E ratio and risk) as influencing variables on predicting stock price using the MARS model. After fitting the semi-parametric splines technique, only 4 accounting variables (dividends, net EPS, EPS Forecast and P/E Ratio) were selected as variables effective in forecasting stock prices.

  16. Multivariate analysis of nystatin and metronidazole in a semi-solid matrix by means of diffuse reflectance NIR spectroscopy and PLS regression.

    Science.gov (United States)

    Baratieri, Sabrina C; Barbosa, Juliana M; Freitas, Matheus P; Martins, José A

    2006-01-23

    A multivariate method of analysis of nystatin and metronidazole in a semi-solid matrix, based on diffuse reflectance NIR measurements and partial least squares regression, is reported. The product, a vaginal cream used in the antifungal and antibacterial treatment, is usually, quantitatively analyzed through microbiological tests (nystatin) and HPLC technique (metronidazole), according to pharmacopeial procedures. However, near infrared spectroscopy has demonstrated to be a valuable tool for content determination, given the rapidity and scope of the method. In the present study, it was successfully applied in the prediction of nystatin (even in low concentrations, ca. 0.3-0.4%, w/w, which is around 100,000 IU/5g) and metronidazole contents, as demonstrated by some figures of merit, namely linearity, precision (mean and repeatability) and accuracy.

  17. Using Apparent Density of Paper from Hardwood Kraft Pulps to Predict Sheet Properties, based on Unsupervised Classification and Multivariable Regression Techniques

    Directory of Open Access Journals (Sweden)

    Ofélia Anjos

    2015-07-01

    Full Text Available Paper properties determine the product application potential and depend on the raw material, pulping conditions, and pulp refining. The aim of this study was to construct mathematical models that predict quantitative relations between the paper density and various mechanical and optical properties of the paper. A dataset of properties of paper handsheets produced with pulps of Acacia dealbata, Acacia melanoxylon, and Eucalyptus globulus beaten at 500, 2500, and 4500 revolutions was used. Unsupervised classification techniques were combined to assess the need to perform separated prediction models for each species, and multivariable regression techniques were used to establish such prediction models. It was possible to develop models with a high goodness of fit using paper density as the independent variable (or predictor for all variables except tear index and zero-span tensile strength, both dry and wet.

  18. A comparison between univariate probabilistic and multivariate (logistic regression) methods for landslide susceptibility analysis: the example of the Febbraro valley (Northern Alps, Italy)

    Science.gov (United States)

    Rossi, M.; Apuani, T.; Felletti, F.

    2009-04-01

    The aim of this paper is to compare the results of two statistical methods for landslide susceptibility analysis: 1) univariate probabilistic method based on landslide susceptibility index, 2) multivariate method (logistic regression). The study area is the Febbraro valley, located in the central Italian Alps, where different types of metamorphic rocks croup out. On the eastern part of the studied basin a quaternary cover represented by colluvial and secondarily, by glacial deposits, is dominant. In this study 110 earth flows, mainly located toward NE portion of the catchment, were analyzed. They involve only the colluvial deposits and their extension mainly ranges from 36 to 3173 m2. Both statistical methods require to establish a spatial database, in which each landslide is described by several parameters that can be assigned using a main scarp central point of landslide. The spatial database is constructed using a Geographical Information System (GIS). Each landslide is described by several parameters corresponding to the value of main scarp central point of the landslide. Based on bibliographic review a total of 15 predisposing factors were utilized. The width of the intervals, in which the maps of the predisposing factors have to be reclassified, has been defined assuming constant intervals to: elevation (100 m), slope (5 °), solar radiation (0.1 MJ/cm2/year), profile curvature (1.2 1/m), tangential curvature (2.2 1/m), drainage density (0.5), lineament density (0.00126). For the other parameters have been used the results of the probability-probability plots analysis and the statistical indexes of landslides site. In particular slope length (0 ÷ 2, 2 ÷ 5, 5 ÷ 10, 10 ÷ 20, 20 ÷ 35, 35 ÷ 260), accumulation flow (0 ÷ 1, 1 ÷ 2, 2 ÷ 5, 5 ÷ 12, 12 ÷ 60, 60 ÷27265), Topographic Wetness Index 0 ÷ 0.74, 0.74 ÷ 1.94, 1.94 ÷ 2.62, 2.62 ÷ 3.48, 3.48 ÷ 6,00, 6.00 ÷ 9.44), Stream Power Index (0 ÷ 0.64, 0.64 ÷ 1.28, 1.28 ÷ 1.81, 1.81 ÷ 4.20, 4.20 ÷ 9

  19. Multivariable Regression Analysis in Schistosoma mansoni-Infected Individuals in the Sudan Reveals Unique Immunoepidemiological Profiles in Uninfected, egg+ and Non-egg+ Infected Individuals.

    Science.gov (United States)

    Elfaki, Tayseer Elamin Mohamed; Arndts, Kathrin; Wiszniewsky, Anna; Ritter, Manuel; Goreish, Ibtisam A; Atti El Mekki, Misk El Yemen A; Arriens, Sandra; Pfarr, Kenneth; Fimmers, Rolf; Doenhoff, Mike; Hoerauf, Achim; Layland, Laura E

    2016-05-01

    In the Sudan, Schistosoma mansoni infections are a major cause of morbidity in school-aged children and infection rates are associated with available clean water sources. During infection, immune responses pass through a Th1 followed by Th2 and Treg phases and patterns can relate to different stages of infection or immunity. This retrospective study evaluated immunoepidemiological aspects in 234 individuals (range 4-85 years old) from Kassala and Khartoum states in 2011. Systemic immune profiles (cytokines and immunoglobulins) and epidemiological parameters were surveyed in n = 110 persons presenting patent S. mansoni infections (egg+), n = 63 individuals positive for S. mansoni via PCR in sera but egg negative (SmPCR+) and n = 61 people who were infection-free (Sm uninf). Immunoepidemiological findings were further investigated using two binary multivariable regression analysis. Nearly all egg+ individuals had no access to latrines and over 90% obtained water via the canal stemming from the Atbara River. With regards to age, infection and an egg+ status was linked to young and adolescent groups. In terms of immunology, S. mansoni infection per se was strongly associated with increased SEA-specific IgG4 but not IgE levels. IL-6, IL-13 and IL-10 were significantly elevated in patently-infected individuals and positively correlated with egg load. In contrast, IL-2 and IL-1β were significantly lower in SmPCR+ individuals when compared to Sm uninf and egg+ groups which was further confirmed during multivariate regression analysis. Schistosomiasis remains an important public health problem in the Sudan with a high number of patent individuals. In addition, SmPCR diagnostics revealed another cohort of infected individuals with a unique immunological profile and provides an avenue for future studies on non-patent infection states. Future studies should investigate the downstream signalling pathways/mechanisms of IL-2 and IL-1β as potential diagnostic markers in order to

  20. Using Multivariate Regression Model with Least Absolute Shrinkage and Selection Operator (LASSO) to Predict the Incidence of Xerostomia after Intensity-Modulated Radiotherapy for Head and Neck Cancer

    Science.gov (United States)

    Ting, Hui-Min; Chang, Liyun; Huang, Yu-Jie; Wu, Jia-Ming; Wang, Hung-Yu; Horng, Mong-Fong; Chang, Chun-Ming; Lan, Jen-Hong; Huang, Ya-Yu; Fang, Fu-Min; Leung, Stephen Wan

    2014-01-01

    Purpose The aim of this study was to develop a multivariate logistic regression model with least absolute shrinkage and selection operator (LASSO) to make valid predictions about the incidence of moderate-to-severe patient-rated xerostomia among head and neck cancer (HNC) patients treated with IMRT. Methods and Materials Quality of life questionnaire datasets from 206 patients with HNC were analyzed. The European Organization for Research and Treatment of Cancer QLQ-H&N35 and QLQ-C30 questionnaires were used as the endpoint evaluation. The primary endpoint (grade 3+ xerostomia) was defined as moderate-to-severe xerostomia at 3 (XER3m) and 12 months (XER12m) after the completion of IMRT. Normal tissue complication probability (NTCP) models were developed. The optimal and suboptimal numbers of prognostic factors for a multivariate logistic regression model were determined using the LASSO with bootstrapping technique. Statistical analysis was performed using the scaled Brier score, Nagelkerke R2, chi-squared test, Omnibus, Hosmer-Lemeshow test, and the AUC. Results Eight prognostic factors were selected by LASSO for the 3-month time point: Dmean-c, Dmean-i, age, financial status, T stage, AJCC stage, smoking, and education. Nine prognostic factors were selected for the 12-month time point: Dmean-i, education, Dmean-c, smoking, T stage, baseline xerostomia, alcohol abuse, family history, and node classification. In the selection of the suboptimal number of prognostic factors by LASSO, three suboptimal prognostic factors were fine-tuned by Hosmer-Lemeshow test and AUC, i.e., Dmean-c, Dmean-i, and age for the 3-month time point. Five suboptimal prognostic factors were also selected for the 12-month time point, i.e., Dmean-i, education, Dmean-c, smoking, and T stage. The overall performance for both time points of the NTCP model in terms of scaled Brier score, Omnibus, and Nagelkerke R2 was satisfactory and corresponded well with the expected values. Conclusions

  1. Multivariable Regression Analysis in Schistosoma mansoni-Infected Individuals in the Sudan Reveals Unique Immunoepidemiological Profiles in Uninfected, egg+ and Non-egg+ Infected Individuals.

    Directory of Open Access Journals (Sweden)

    Tayseer Elamin Mohamed Elfaki

    2016-05-01

    Full Text Available In the Sudan, Schistosoma mansoni infections are a major cause of morbidity in school-aged children and infection rates are associated with available clean water sources. During infection, immune responses pass through a Th1 followed by Th2 and Treg phases and patterns can relate to different stages of infection or immunity.This retrospective study evaluated immunoepidemiological aspects in 234 individuals (range 4-85 years old from Kassala and Khartoum states in 2011. Systemic immune profiles (cytokines and immunoglobulins and epidemiological parameters were surveyed in n = 110 persons presenting patent S. mansoni infections (egg+, n = 63 individuals positive for S. mansoni via PCR in sera but egg negative (SmPCR+ and n = 61 people who were infection-free (Sm uninf. Immunoepidemiological findings were further investigated using two binary multivariable regression analysis.Nearly all egg+ individuals had no access to latrines and over 90% obtained water via the canal stemming from the Atbara River. With regards to age, infection and an egg+ status was linked to young and adolescent groups. In terms of immunology, S. mansoni infection per se was strongly associated with increased SEA-specific IgG4 but not IgE levels. IL-6, IL-13 and IL-10 were significantly elevated in patently-infected individuals and positively correlated with egg load. In contrast, IL-2 and IL-1β were significantly lower in SmPCR+ individuals when compared to Sm uninf and egg+ groups which was further confirmed during multivariate regression analysis.Schistosomiasis remains an important public health problem in the Sudan with a high number of patent individuals. In addition, SmPCR diagnostics revealed another cohort of infected individuals with a unique immunological profile and provides an avenue for future studies on non-patent infection states. Future studies should investigate the downstream signalling pathways/mechanisms of IL-2 and IL-1β as potential diagnostic markers

  2. Study of cyanotoxins presence from experimental cyanobacteria concentrations using a new data mining methodology based on multivariate adaptive regression splines in Trasona reservoir (Northern Spain).

    Science.gov (United States)

    Garcia Nieto, P J; Sánchez Lasheras, F; de Cos Juez, F J; Alonso Fernández, J R

    2011-11-15

    There is an increasing need to describe cyanobacteria blooms since some cyanobacteria produce toxins, termed cyanotoxins. These latter can be toxic and dangerous to humans as well as other animals and life in general. It must be remarked that the cyanobacteria are reproduced explosively under certain conditions. This results in algae blooms, which can become harmful to other species if the cyanobacteria involved produce cyanotoxins. In this research work, the evolution of cyanotoxins in Trasona reservoir (Principality of Asturias, Northern Spain) was studied with success using the data mining methodology based on multivariate adaptive regression splines (MARS) technique. The results of the present study are two-fold. On one hand, the importance of the different kind of cyanobacteria over the presence of cyanotoxins in the reservoir is presented through the MARS model and on the other hand a predictive model able to forecast the possible presence of cyanotoxins in a short term was obtained. The agreement of the MARS model with experimental data confirmed the good performance of the same one. Finally, conclusions of this innovative research are exposed. Copyright © 2011 Elsevier B.V. All rights reserved.

  3. The effect of hospital mergers on long-term sickness absence among hospital employees: a fixed effects multivariate regression analysis using panel data.

    Science.gov (United States)

    Kjekshus, Lars Erik; Bernstrøm, Vilde Hoff; Dahl, Espen; Lorentzen, Thomas

    2014-02-03

    Hospitals are merging to become more cost-effective. Mergers are often complex and difficult processes with variable outcomes. The aim of this study was to analyze the effect of mergers on long-term sickness absence among hospital employees. Long-term sickness absence was analyzed among hospital employees (N = 107 209) in 57 hospitals involved in 23 mergers in Norway between 2000 and 2009. Variation in long-term sickness absence was explained through a fixed effects multivariate regression analysis using panel data with years-since-merger as the independent variable. We found a significant but modest effect of mergers on long-term sickness absence in the year of the merger, and in years 2, 3 and 4; analyzed by gender there was a significant effect for women, also for these years, but only in year 4 for men. However, men are less represented among the hospital workforce; this could explain the lack of significance. Mergers has a significant effect on employee health that should be taken into consideration when deciding to merge hospitals. This study illustrates the importance of analyzing the effects of mergers over several years and the need for more detailed analyses of merger processes and of the changes that may occur as a result of such mergers.

  4. A comparative study of artificial neural network and multivariate regression analysis to analyze optimum renal stone fragmentation by extracorporeal shock wave lithotripsy

    Directory of Open Access Journals (Sweden)

    Goyal Neeraj

    2010-01-01

    Full Text Available To compare the accuracy of artificial neural network (ANN analysis and multi-variate regression analysis (MVRA for renal stone fragmentation by extracorporeal shock wave lithotripsy (ESWL. A total of 276 patients with renal calculus were treated by ESWL during December 2001 to December 2006. Of them, the data of 196 patients were used for training the ANN. The predictability of trained ANN was tested on 80 subsequent patients. The input data include age of patient, stone size, stone burden, number of sittings and urinary pH. The output values (predicted values were number of shocks and shock power. Of these 80 patients, the input was analyzed and output was also calculated by MVRA. The output values (predicted values from both the methods were compared and the results were drawn. The predicted and observed values of shock power and number of shocks were compared using 1:1 slope line. The results were calculated as coefficient of correlation (COC (r2 . For prediction of power, the MVRA COC was 0.0195 and ANN COC was 0.8343. For prediction of number of shocks, the MVRA COC was 0.5726 and ANN COC was 0.9329. In conclusion, ANN gives better COC than MVRA, hence could be a better tool to analyze the optimum renal stone fragmentation by ESWL.

  5. Application of least square support vector machine and multivariate adaptive regression spline models in long term prediction of river water pollution

    Science.gov (United States)

    Kisi, Ozgur; Parmar, Kulwinder Singh

    2016-03-01

    This study investigates the accuracy of least square support vector machine (LSSVM), multivariate adaptive regression splines (MARS) and M5 model tree (M5Tree) in modeling river water pollution. Various combinations of water quality parameters, Free Ammonia (AMM), Total Kjeldahl Nitrogen (TKN), Water Temperature (WT), Total Coliform (TC), Fecal Coliform (FC) and Potential of Hydrogen (pH) monitored at Nizamuddin, Delhi Yamuna River in India were used as inputs to the applied models. Results indicated that the LSSVM and MARS models had almost same accuracy and they performed better than the M5Tree model in modeling monthly chemical oxygen demand (COD). The average root mean square error (RMSE) of the LSSVM and M5Tree models was decreased by 1.47% and 19.1% using MARS model, respectively. Adding TC input to the models did not increase their accuracy in modeling COD while adding FC and pH inputs to the models generally decreased the accuracy. The overall results indicated that the MARS and LSSVM models could be successfully used in estimating monthly river water pollution level by using AMM, TKN and WT parameters as inputs.

  6. A comparative study of artificial neural network and multivariate regression analysis to analyze optimum renal stone fragmentation by extracorporeal shock wave lithotripsy

    International Nuclear Information System (INIS)

    Neeraj K Goyal, Abhay Kumar; Sameer Trivedi

    2010-01-01

    To compare the accuracy of artificial neural network (ANN) analysis and multivariate regression analysis (MVRA) for renal stone fragmentation by extracorporeal shock wave lithotripsy (ESWL). A total of 276 patients with renal calculus were treated by ESWL during December 2001 to December 2006. Of them, the data of 196 patients were used for training the ANN. The predictability of trained ANN was tested on 80 subsequent patients. The input data include age of patient, stone size, stone burden, number of sittings and urinary pH. The output values (predicted values) were number of shocks and shock power. Of these 80 patients, the input was analyzed and output was also calculated by MVRA. The output values (predicted values) from both the methods were compared and the results were drawn. The predicted and observed values of shock power and number of shocks were compared using 1:1 slope line. The results were calculated as coefficient of correlation (COC) (r2 ). For prediction of power, the MVRA COC was 0.0195 and ANN COC was 0.8343. For prediction of number of shocks, the MVRA COC was 0.5726 and ANN COC was 0.9329. In conclusion, ANN gives better COC than MVRA, hence could be a better tool to analyze the optimum renal stone fragmentation by ESWL (Author).

  7. The importance of trait emotional intelligence and feelings in the prediction of perceived and biological stress in adolescents: hierarchical regressions and fsQCA models.

    Science.gov (United States)

    Villanueva, Lidón; Montoya-Castilla, Inmaculada; Prado-Gascó, Vicente

    2017-07-01

    The purpose of this study is to analyze the combined effects of trait emotional intelligence (EI) and feelings on healthy adolescents' stress. Identifying the extent to which adolescent stress varies with trait emotional differences and the feelings of adolescents is of considerable interest in the development of intervention programs for fostering youth well-being. To attain this goal, self-reported questionnaires (perceived stress, trait EI, and positive/negative feelings) and biological measures of stress (hair cortisol concentrations, HCC) were collected from 170 adolescents (12-14 years old). Two different methodologies were conducted, which included hierarchical regression models and a fuzzy-set qualitative comparative analysis (fsQCA). The results support trait EI as a protective factor against stress in healthy adolescents and suggest that feelings reinforce this relation. However, the debate continues regarding the possibility of optimal levels of trait EI for effective and adaptive emotional management, particularly in the emotional attention and clarity dimensions and for female adolescents.

  8. Prosthetic alignment after total knee replacement is not associated with dissatisfaction or change in Oxford Knee Score: A multivariable regression analysis.

    Science.gov (United States)

    Huijbregts, Henricus J T A M; Khan, Riaz J K; Fick, Daniel P; Jarrett, Olivia M; Haebich, Samantha

    2016-06-01

    Approximately 18% of the patients are dissatisfied with the result of total knee replacement. However, the relation between dissatisfaction and prosthetic alignment has not been investigated before. We retrospectively analysed prospectively gathered data of all patients who had a primary TKR, preoperative and one-year postoperative Oxford Knee Scores (OKS) and postoperative computed tomography (CT). The CT protocol measures hip-knee-ankle (HKA) angle, and coronal, sagittal and axial component alignment. Satisfaction was defined using a five-item Likert scale. We dichotomised dissatisfaction by combining '(very) dissatisfied' and 'neutral/not sure'. Associations with dissatisfaction and change in OKS were calculated using multivariable logistic and linear regression models. 230 TKRs were implanted in 105 men and 106 women. At one year, 12% were (very) dissatisfied and 10% neutral. Coronal alignment of the femoral component was 0.5 degrees more accurate in patients who were satisfied at one year. The other alignment measurements were not different between satisfied and dissatisfied patients. All radiographic measurements had a P-value>0.10 on univariate analyses. At one year, dissatisfaction was associated with the three-months OKS. Change in OKS was associated with three-months OKS, preoperative physical SF-12, preoperative pain and cruciate retaining design. Neither mechanical axis, nor component alignment, is associated with dissatisfaction at one year following TKR. Patients get the best outcome when pain reduction and function improvement are optimal during the first three months and when the indication to embark on surgery is based on physical limitations rather than on a high pain score. 2. Copyright © 2016 Elsevier B.V. All rights reserved.

  9. Seasonal variation of benzo(a)pyrene in the Spanish airborne PM10. Multivariate linear regression model applied to estimate BaP concentrations.

    Science.gov (United States)

    Callén, M S; López, J M; Mastral, A M

    2010-08-15

    The estimation of benzo(a)pyrene (BaP) concentrations in ambient air is very important from an environmental point of view especially with the introduction of the Directive 2004/107/EC and due to the carcinogenic character of this pollutant. A sampling campaign of particulate matter less or equal than 10 microns (PM10) carried out during 2008-2009 in four locations of Spain was collected to determine experimentally BaP concentrations by gas chromatography mass-spectrometry mass-spectrometry (GC-MS-MS). Multivariate linear regression models (MLRM) were used to predict BaP air concentrations in two sampling places, taking PM10 and meteorological variables as possible predictors. The model obtained with data from two sampling sites (all sites model) (R(2)=0.817, PRESS/SSY=0.183) included the significant variables like PM10, temperature, solar radiation and wind speed and was internally and externally validated. The first validation was performed by cross validation and the last one by BaP concentrations from previous campaigns carried out in Zaragoza from 2001-2004. The proposed model constitutes a first approximation to estimate BaP concentrations in urban atmospheres with very good internal prediction (Q(CV)(2)=0.813, PRESS/SSY=0.187) and with the maximal external prediction for the 2001-2002 campaign (Q(ext)(2)=0.679 and PRESS/SSY=0.321) versus the 2001-2004 campaign (Q(ext)(2)=0.551, PRESS/SSY=0.449). Copyright 2010 Elsevier B.V. All rights reserved.

  10. Seasonal variation of benzo(a)pyrene in the Spanish airborne PM10. Multivariate linear regression model applied to estimate BaP concentrations

    International Nuclear Information System (INIS)

    Callen, M.S.; Lopez, J.M.; Mastral, A.M.

    2010-01-01

    The estimation of benzo(a)pyrene (BaP) concentrations in ambient air is very important from an environmental point of view especially with the introduction of the Directive 2004/107/EC and due to the carcinogenic character of this pollutant. A sampling campaign of particulate matter less or equal than 10 microns (PM10) carried out during 2008-2009 in four locations of Spain was collected to determine experimentally BaP concentrations by gas chromatography mass-spectrometry mass-spectrometry (GC-MS-MS). Multivariate linear regression models (MLRM) were used to predict BaP air concentrations in two sampling places, taking PM10 and meteorological variables as possible predictors. The model obtained with data from two sampling sites (all sites model) (R 2 = 0.817, PRESS/SSY = 0.183) included the significant variables like PM10, temperature, solar radiation and wind speed and was internally and externally validated. The first validation was performed by cross validation and the last one by BaP concentrations from previous campaigns carried out in Zaragoza from 2001-2004. The proposed model constitutes a first approximation to estimate BaP concentrations in urban atmospheres with very good internal prediction (Q CV 2 =0.813, PRESS/SSY = 0.187) and with the maximal external prediction for the 2001-2002 campaign (Q ext 2 =0.679 and PRESS/SSY = 0.321) versus the 2001-2004 campaign (Q ext 2 =0.551, PRESS/SSY = 0.449).

  11. Mental and physical health correlates among family caregivers of patients with newly-diagnosed incurable cancer: a hierarchical linear regression analysis.

    Science.gov (United States)

    Shaffer, Kelly M; Jacobs, Jamie M; Nipp, Ryan D; Carr, Alaina; Jackson, Vicki A; Park, Elyse R; Pirl, William F; El-Jawahri, Areej; Gallagher, Emily R; Greer, Joseph A; Temel, Jennifer S

    2017-03-01

    Caregiver, relational, and patient factors have been associated with the health of family members and friends providing care to patients with early-stage cancer. Little research has examined whether findings extend to family caregivers of patients with incurable cancer, who experience unique and substantial caregiving burdens. We examined correlates of mental and physical health among caregivers of patients with newly-diagnosed incurable lung or non-colorectal gastrointestinal cancer. At baseline for a trial of early palliative care, caregivers of participating patients (N = 275) reported their mental and physical health (Medical Outcome Survey-Short Form-36); patients reported their quality of life (Functional Assessment of Cancer Therapy-General). Analyses used hierarchical linear regression with two-tailed significance tests. Caregivers' mental health was worse than the U.S. national population (M = 44.31, p caregiver, relational, and patient factors simultaneously revealed that younger (B = 0.31, p = .001), spousal caregivers (B = -8.70, p = .003), who cared for patients reporting low emotional well-being (B = 0.51, p = .01) reported worse mental health; older (B = -0.17, p = .01) caregivers with low educational attainment (B = 4.36, p family caregivers of patients with incurable cancer, caregiver demographics, relational factors, and patient-specific factors were all related to caregiver mental health, while caregiver demographics were primarily associated with caregiver physical health. These findings help identify characteristics of family caregivers at highest risk of poor mental and physical health who may benefit from greater supportive care.

  12. Personal, social, and game-related correlates of active and non-active gaming among dutch gaming adolescents: survey-based multivariable, multilevel logistic regression analyses.

    Science.gov (United States)

    Simons, Monique; de Vet, Emely; Chinapaw, Mai Jm; de Boer, Michiel; Seidell, Jacob C; Brug, Johannes

    2014-04-04

    Playing video games contributes substantially to sedentary behavior in youth. A new generation of video games-active games-seems to be a promising alternative to sedentary games to promote physical activity and reduce sedentary behavior. At this time, little is known about correlates of active and non-active gaming among adolescents. The objective of this study was to examine potential personal, social, and game-related correlates of both active and non-active gaming in adolescents. A survey assessing game behavior and potential personal, social, and game-related correlates was conducted among adolescents (12-16 years, N=353) recruited via schools. Multivariable, multilevel logistic regression analyses, adjusted for demographics (age, sex and educational level of adolescents), were conducted to examine personal, social, and game-related correlates of active gaming ≥1 hour per week (h/wk) and non-active gaming >7 h/wk. Active gaming ≥1 h/wk was significantly associated with a more positive attitude toward active gaming (OR 5.3, CI 2.4-11.8; Pgames (OR 0.30, CI 0.1-0.6; P=.002), a higher score on habit strength regarding gaming (OR 1.9, CI 1.2-3.2; P=.008) and having brothers/sisters (OR 6.7, CI 2.6-17.1; Pgame engagement (OR 0.95, CI 0.91-0.997; P=.04). Non-active gaming >7 h/wk was significantly associated with a more positive attitude toward non-active gaming (OR 2.6, CI 1.1-6.3; P=.035), a stronger habit regarding gaming (OR 3.0, CI 1.7-5.3; P7 h/wk. Active gaming is most strongly (negatively) associated with attitude with respect to non-active games, followed by observed active game behavior of brothers and sisters and attitude with respect to active gaming (positive associations). On the other hand, non-active gaming is most strongly associated with observed non-active game behavior of friends, habit strength regarding gaming and attitude toward non-active gaming (positive associations). Habit strength was a correlate of both active and non-active gaming

  13. Personal, Social, and Game-Related Correlates of Active and Non-Active Gaming Among Dutch Gaming Adolescents: Survey-Based Multivariable, Multilevel Logistic Regression Analyses

    Science.gov (United States)

    de Vet, Emely; Chinapaw, Mai JM; de Boer, Michiel; Seidell, Jacob C; Brug, Johannes

    2014-01-01

    Background Playing video games contributes substantially to sedentary behavior in youth. A new generation of video games—active games—seems to be a promising alternative to sedentary games to promote physical activity and reduce sedentary behavior. At this time, little is known about correlates of active and non-active gaming among adolescents. Objective The objective of this study was to examine potential personal, social, and game-related correlates of both active and non-active gaming in adolescents. Methods A survey assessing game behavior and potential personal, social, and game-related correlates was conducted among adolescents (12-16 years, N=353) recruited via schools. Multivariable, multilevel logistic regression analyses, adjusted for demographics (age, sex and educational level of adolescents), were conducted to examine personal, social, and game-related correlates of active gaming ≥1 hour per week (h/wk) and non-active gaming >7 h/wk. Results Active gaming ≥1 h/wk was significantly associated with a more positive attitude toward active gaming (OR 5.3, CI 2.4-11.8; Pgames (OR 0.30, CI 0.1-0.6; P=.002), a higher score on habit strength regarding gaming (OR 1.9, CI 1.2-3.2; P=.008) and having brothers/sisters (OR 6.7, CI 2.6-17.1; Pgame engagement (OR 0.95, CI 0.91-0.997; P=.04). Non-active gaming >7 h/wk was significantly associated with a more positive attitude toward non-active gaming (OR 2.6, CI 1.1-6.3; P=.035), a stronger habit regarding gaming (OR 3.0, CI 1.7-5.3; P7 h/wk. Active gaming is most strongly (negatively) associated with attitude with respect to non-active games, followed by observed active game behavior of brothers and sisters and attitude with respect to active gaming (positive associations). On the other hand, non-active gaming is most strongly associated with observed non-active game behavior of friends, habit strength regarding gaming and attitude toward non-active gaming (positive associations). Habit strength was a

  14. Modelo hierárquico multivariado da inatividade física em crianças de escolas públicas Multivariate hierarchical model for physical inactivity among public school children

    Directory of Open Access Journals (Sweden)

    Mario M. Bracco

    2006-08-01

    Full Text Available OBJETIVO: Identificar fatores biológicos e sociodemográficos atribuíveis à inatividade física em crianças de escolas públicas. MÉTODOS: Foram estudadas, através de questionário auto-relatado pelos pais, 2.519 crianças (49,3% meninas, de 7 a 10 anos (média = 7,6±0,9 anos, de oito escolas públicas da cidade de São Paulo. Aplicamos a análise de correspondência múltipla para identificar grupos de respostas relacionadas com padrões de atividade e inatividade física e a geração de uma escala ótima. A análise de agrupamento identificou os grupos de crianças ativas e inativas. A análise de curva ROC (receiver operator characteristic, para o estudo das propriedades diagnósticas de uma escala simplificada de inatividade física derivada da escala ótima, mostrou o ponto de corte = 3 como o de melhor sensibilidade e especificidade, sendo utilizado como a variável de resposta no modelo de regressão. Um modelo hierárquico multivariado foi construído, assumindo variáveis categóricas como distais e proximais, adotando-se p OBJECTIVE: To identify biological and sociodemographic factors associated with physical inactivity in public school children. METHODS: Parents of 2,519 children (49.3% of whom were girls, aged 7 to 10 years (mean = 7.6±0.9 years, from eight public schools in São Paulo, Brazil, completed a self-administered questionnaire. We used multiple correspondence analysis to identify groups of responses related to levels of physical activity and inactivity and to obtain an optimal scale. The cluster analysis identified groups of active and inactive children. The analysis of the receiver operator characteristic (ROC curve, for the study of diagnostic properties of a simplified scale for physical inactivity derived from the optimal scale, revealed that a cutoff point of 3 had the best sensitivity and specificity, being therefore used as outcome variable in the regression model. A multivariate hierarchical model was

  15. Price promotions on healthier compared with less healthy foods: a hierarchical regression analysis of the impact on sales and social patterning of responses to promotions in Great Britain.

    Science.gov (United States)

    Nakamura, Ryota; Suhrcke, Marc; Jebb, Susan A; Pechey, Rachel; Almiron-Roig, Eva; Marteau, Theresa M

    2015-04-01

    There is a growing concern, but limited evidence, that price promotions contribute to a poor diet and the social patterning of diet-related disease. We examined the following questions: 1) Are less-healthy foods more likely to be promoted than healthier foods? 2) Are consumers more responsive to promotions on less-healthy products? 3) Are there socioeconomic differences in food purchases in response to price promotions? With the use of hierarchical regression, we analyzed data on purchases of 11,323 products within 135 food and beverage categories from 26,986 households in Great Britain during 2010. Major supermarkets operated the same price promotions in all branches. The number of stores that offered price promotions on each product for each week was used to measure the frequency of price promotions. We assessed the healthiness of each product by using a nutrient profiling (NP) model. A total of 6788 products (60%) were in healthier categories and 4535 products (40%) were in less-healthy categories. There was no significant gap in the frequency of promotion by the healthiness of products neither within nor between categories. However, after we controlled for the reference price, price discount rate, and brand-specific effects, the sales uplift arising from price promotions was larger in less-healthy than in healthier categories; a 1-SD point increase in the category mean NP score, implying the category becomes less healthy, was associated with an additional 7.7-percentage point increase in sales (from 27.3% to 35.0%; P sales uplift from promotions was larger for higher-socioeconomic status (SES) groups than for lower ones (34.6% for the high-SES group, 28.1% for the middle-SES group, and 23.1% for the low-SES group). Finally, there was no significant SES gap in the absolute volume of purchases of less-healthy foods made on promotion. Attempts to limit promotions on less-healthy foods could improve the population diet but would be unlikely to reduce health

  16. The Multivariate Regression Statistics Strategy to Investigate Content-Effect Correlation of Multiple Components in Traditional Chinese Medicine Based on a Partial Least Squares Method.

    Science.gov (United States)

    Peng, Ying; Li, Su-Ning; Pei, Xuexue; Hao, Kun

    2018-03-01

    Amultivariate regression statisticstrategy was developed to clarify multi-components content-effect correlation ofpanaxginseng saponins extract and predict the pharmacological effect by components content. In example 1, firstly, we compared pharmacological effects between panax ginseng saponins extract and individual saponin combinations. Secondly, we examined the anti-platelet aggregation effect in seven different saponin combinations of ginsenoside Rb1, Rg1, Rh, Rd, Ra3 and notoginsenoside R1. Finally, the correlation between anti-platelet aggregation and the content of multiple components was analyzed by a partial least squares algorithm. In example 2, firstly, 18 common peaks were identified in ten different batches of panax ginseng saponins extracts from different origins. Then, we investigated the anti-myocardial ischemia reperfusion injury effects of the ten different panax ginseng saponins extracts. Finally, the correlation between the fingerprints and the cardioprotective effects was analyzed by a partial least squares algorithm. Both in example 1 and 2, the relationship between the components content and pharmacological effect was modeled well by the partial least squares regression equations. Importantly, the predicted effect curve was close to the observed data of dot marked on the partial least squares regression model. This study has given evidences that themulti-component content is a promising information for predicting the pharmacological effects of traditional Chinese medicine.

  17. The Multivariate Regression Statistics Strategy to Investigate Content-Effect Correlation of Multiple Components in Traditional Chinese Medicine Based on a Partial Least Squares Method

    Directory of Open Access Journals (Sweden)

    Ying Peng

    2018-03-01

    Full Text Available Amultivariate regression statisticstrategy was developed to clarify multi-components content-effect correlation ofpanaxginseng saponins extract and predict the pharmacological effect by components content. In example 1, firstly, we compared pharmacological effects between panax ginseng saponins extract and individual saponin combinations. Secondly, we examined the anti-platelet aggregation effect in seven different saponin combinations of ginsenoside Rb1, Rg1, Rh, Rd, Ra3 and notoginsenoside R1. Finally, the correlation between anti-platelet aggregation and the content of multiple components was analyzed by a partial least squares algorithm. In example 2, firstly, 18 common peaks were identified in ten different batches of panax ginseng saponins extracts from different origins. Then, we investigated the anti-myocardial ischemia reperfusion injury effects of the ten different panax ginseng saponins extracts. Finally, the correlation between the fingerprints and the cardioprotective effects was analyzed by a partial least squares algorithm. Both in example 1 and 2, the relationship between the components content and pharmacological effect was modeled well by the partial least squares regression equations. Importantly, the predicted effect curve was close to the observed data of dot marked on the partial least squares regression model. This study has given evidences that themulti-component content is a promising information for predicting the pharmacological effects of traditional Chinese medicine.

  18. Significance of volatile compounds produced by spoilage bacteria in vacuum-packed cold-smoked salmon ( Salmo salar ) analyzed by GC-MS and multivariate regression

    DEFF Research Database (Denmark)

    Jørgensen, Lasse Vigel; Huss, Hans Henrik; Dalgaard, Paw

    2001-01-01

    alcohols, which were produced by microbial activity. Partial least- squares regression of volatile compounds and sensory results allowed for a multiple compound quality index to be developed. This index was based on volatile bacterial metabolites, 1- propanol and 2-butanone, and 2-furan......, 1- penten-3-ol, and 1-propanol. The potency and importance of these compounds was confirmed by gas chromatography- olfactometry. The present study provides valuable information on the bacterial reactions responsible for spoilage off-flavors of cold-smoked salmon, which can be used to develop...

  19. Analyses of polycyclic aromatic hydrocarbon (PAH) and chiral-PAH analogues-methyl-β-cyclodextrin guest-host inclusion complexes by fluorescence spectrophotometry and multivariate regression analysis.

    Science.gov (United States)

    Greene, LaVana; Elzey, Brianda; Franklin, Mariah; Fakayode, Sayo O

    2017-03-05

    The negative health impact of polycyclic aromatic hydrocarbons (PAHs) and differences in pharmacological activity of enantiomers of chiral molecules in humans highlights the need for analysis of PAHs and their chiral analogue molecules in humans. Herein, the first use of cyclodextrin guest-host inclusion complexation, fluorescence spectrophotometry, and chemometric approach to PAH (anthracene) and chiral-PAH analogue derivatives (1-(9-anthryl)-2,2,2-triflouroethanol (TFE)) analyses are reported. The binding constants (K b ), stoichiometry (n), and thermodynamic properties (Gibbs free energy (ΔG), enthalpy (ΔH), and entropy (ΔS)) of anthracene and enantiomers of TFE-methyl-β-cyclodextrin (Me-β-CD) guest-host complexes were also determined. Chemometric partial-least-square (PLS) regression analysis of emission spectra data of Me-β-CD-guest-host inclusion complexes was used for the determination of anthracene and TFE enantiomer concentrations in Me-β-CD-guest-host inclusion complex samples. The values of calculated K b and negative ΔG suggest the thermodynamic favorability of anthracene-Me-β-CD and enantiomeric of TFE-Me-β-CD inclusion complexation reactions. However, anthracene-Me-β-CD and enantiomer TFE-Me-β-CD inclusion complexations showed notable differences in the binding affinity behaviors and thermodynamic properties. The PLS regression analysis resulted in square-correlation-coefficients of 0.997530 or better and a low LOD of 3.81×10 -7 M for anthracene and 3.48×10 -8 M for TFE enantiomers at physiological conditions. Most importantly, PLS regression accurately determined the anthracene and TFE enantiomer concentrations with an average low error of 2.31% for anthracene, 4.44% for R-TFE and 3.60% for S-TFE. The results of the study are highly significant because of its high sensitivity and accuracy for analysis of PAH and chiral PAH analogue derivatives without the need of an expensive chiral column, enantiomeric resolution, or use of a polarized

  20. Oil condition monitoring of gears onboard ships using a regression approach for multivariate T2 control charts

    DEFF Research Database (Denmark)

    Henneberg, Morten; Jørgensen, Bent; Eriksen, René Lynge

    2016-01-01

    In this paper, we present an oil condition and wear debris evaluation method for ship thruster gears using T2 statistics to form control charts from a multi-sensor platform. The proposed method takes into account the different ambient conditions by multiple linear regression on the mean value...... only quasi-stationary data are included in phase I of the T2 statistics. Data from two thruster gears onboard two different ships are presented and analyzed, and the selection of the phase I data size is discussed. A graphic overview for quick localization of T2 signaling is also demonstrated using...... spider plots. Finally, progression and trending of the T2 statistics are investigated using orthogonal polynomials for a fix-sized data window....

  1. Utilização de regressão multivariada para avaliação espectrofotométrica da demanda química de oxigênio em amostras de relevância ambiental Use of multivariate regression in spectrophotometric evaluation of chemical oxigen demand in samples of environmental relevance

    Directory of Open Access Journals (Sweden)

    Patricio Peralta-Zamora

    2005-10-01

    Full Text Available In this work, a partial least squares regression routine was used to develop a multivariate calibration model to predict the chemical oxygen demand (COD in substrates of environmental relevance (paper effluents and landfill leachates from UV-Vis spectral data. The calibration models permit the fast determination of the COD with typical relative errors lower by 10% with respect to the conventional methodology.

  2. Linear regression

    CERN Document Server

    Olive, David J

    2017-01-01

    This text covers both multiple linear regression and some experimental design models. The text uses the response plot to visualize the model and to detect outliers, does not assume that the error distribution has a known parametric distribution, develops prediction intervals that work when the error distribution is unknown, suggests bootstrap hypothesis tests that may be useful for inference after variable selection, and develops prediction regions and large sample theory for the multivariate linear regression model that has m response variables. A relationship between multivariate prediction regions and confidence regions provides a simple way to bootstrap confidence regions. These confidence regions often provide a practical method for testing hypotheses. There is also a chapter on generalized linear models and generalized additive models. There are many R functions to produce response and residual plots, to simulate prediction intervals and hypothesis tests, to detect outliers, and to choose response trans...

  3. Aortic and Hepatic Contrast Enhancement During Hepatic-Arterial and Portal Venous Phase Computed Tomography Scanning: Multivariate Linear Regression Analysis Using Age, Sex, Total Body Weight, Height, and Cardiac Output.

    Science.gov (United States)

    Masuda, Takanori; Nakaura, Takeshi; Funama, Yoshinori; Higaki, Toru; Kiguchi, Masao; Imada, Naoyuki; Sato, Tomoyasu; Awai, Kazuo

    We evaluated the effect of the age, sex, total body weight (TBW), height (HT) and cardiac output (CO) of patients on aortic and hepatic contrast enhancement during hepatic-arterial phase (HAP) and portal venous phase (PVP) computed tomography (CT) scanning. This prospective study received institutional review board approval; prior informed consent to participate was obtained from all 168 patients. All were examined using our routine protocol; the contrast material was 600 mg/kg iodine. Cardiac output was measured with a portable electrical velocimeter within 5 minutes of starting the CT scan. We calculated contrast enhancement (per gram of iodine: [INCREMENT]HU/gI) of the abdominal aorta during the HAP and of the liver parenchyma during the PVP. We performed univariate and multivariate linear regression analysis between all patient characteristics and the [INCREMENT]HU/gI of aortic- and liver parenchymal enhancement. Univariate linear regression analysis demonstrated statistically significant correlations between the [INCREMENT]HU/gI and the age, sex, TBW, HT, and CO (all P linear regression analysis showed that only the TBW and CO were of independent predictive value (P linear regression analysis only the TBW and CO were significantly correlated with aortic and liver parenchymal enhancement; the age, sex, and HT were not. The CO was the only independent factor affecting aortic and liver parenchymal enhancement at hepatic CT when the protocol was adjusted for the TBW.

  4. Multivariate analysis with LISREL

    CERN Document Server

    Jöreskog, Karl G; Y Wallentin, Fan

    2016-01-01

    This book traces the theory and methodology of multivariate statistical analysis and shows how it can be conducted in practice using the LISREL computer program. It presents not only the typical uses of LISREL, such as confirmatory factor analysis and structural equation models, but also several other multivariate analysis topics, including regression (univariate, multivariate, censored, logistic, and probit), generalized linear models, multilevel analysis, and principal component analysis. It provides numerous examples from several disciplines and discusses and interprets the results, illustrated with sections of output from the LISREL program, in the context of the example. The book is intended for masters and PhD students and researchers in the social, behavioral, economic and many other sciences who require a basic understanding of multivariate statistical theory and methods for their analysis of multivariate data. It can also be used as a textbook on various topics of multivariate statistical analysis.

  5. Robust multivariate analysis

    CERN Document Server

    J Olive, David

    2017-01-01

    This text presents methods that are robust to the assumption of a multivariate normal distribution or methods that are robust to certain types of outliers. Instead of using exact theory based on the multivariate normal distribution, the simpler and more applicable large sample theory is given.  The text develops among the first practical robust regression and robust multivariate location and dispersion estimators backed by theory.   The robust techniques  are illustrated for methods such as principal component analysis, canonical correlation analysis, and factor analysis.  A simple way to bootstrap confidence regions is also provided. Much of the research on robust multivariate analysis in this book is being published for the first time. The text is suitable for a first course in Multivariate Statistical Analysis or a first course in Robust Statistics. This graduate text is also useful for people who are familiar with the traditional multivariate topics, but want to know more about handling data sets with...

  6. Reduced Rank Regression

    DEFF Research Database (Denmark)

    Johansen, Søren

    2008-01-01

    The reduced rank regression model is a multivariate regression model with a coefficient matrix with reduced rank. The reduced rank regression algorithm is an estimation procedure, which estimates the reduced rank regression model. It is related to canonical correlations and involves calculating...

  7. Joint genome-wide prediction in several populations accounting for randomness of genotypes: A hierarchical Bayes approach. I: Multivariate Gaussian priors for marker effects and derivation of the joint probability mass function of genotypes.

    Science.gov (United States)

    Martínez, Carlos Alberto; Khare, Kshitij; Banerjee, Arunava; Elzo, Mauricio A

    2017-03-21

    It is important to consider heterogeneity of marker effects and allelic frequencies in across population genome-wide prediction studies. Moreover, all regression models used in genome-wide prediction overlook randomness of genotypes. In this study, a family of hierarchical Bayesian models to perform across population genome-wide prediction modeling genotypes as random variables and allowing population-specific effects for each marker was developed. Models shared a common structure and differed in the priors used and the assumption about residual variances (homogeneous or heterogeneous). Randomness of genotypes was accounted for by deriving the joint probability mass function of marker genotypes conditional on allelic frequencies and pedigree information. As a consequence, these models incorporated kinship and genotypic information that not only permitted to account for heterogeneity of allelic frequencies, but also to include individuals with missing genotypes at some or all loci without the need for previous imputation. This was possible because the non-observed fraction of the design matrix was treated as an unknown model parameter. For each model, a simpler version ignoring population structure, but still accounting for randomness of genotypes was proposed. Implementation of these models and computation of some criteria for model comparison were illustrated using two simulated datasets. Theoretical and computational issues along with possible applications, extensions and refinements were discussed. Some features of the models developed in this study make them promising for genome-wide prediction, the use of information contained in the probability distribution of genotypes is perhaps the most appealing. Further studies to assess the performance of the models proposed here and also to compare them with conventional models used in genome-wide prediction are needed. Copyright © 2017 Elsevier Ltd. All rights reserved.

  8. Evaluation of the efficiency of continuous wavelet transform as processing and preprocessing algorithm for resolution of overlapped signals in univariate and multivariate regression analyses; an application to ternary and quaternary mixtures

    Science.gov (United States)

    Hegazy, Maha A.; Lotfy, Hayam M.; Mowaka, Shereen; Mohamed, Ekram Hany

    2016-07-01

    Wavelets have been adapted for a vast number of signal-processing applications due to the amount of information that can be extracted from a signal. In this work, a comparative study on the efficiency of continuous wavelet transform (CWT) as a signal processing tool in univariate regression and a pre-processing tool in multivariate analysis using partial least square (CWT-PLS) was conducted. These were applied to complex spectral signals of ternary and quaternary mixtures. CWT-PLS method succeeded in the simultaneous determination of a quaternary mixture of drotaverine (DRO), caffeine (CAF), paracetamol (PAR) and p-aminophenol (PAP, the major impurity of paracetamol). While, the univariate CWT failed to simultaneously determine the quaternary mixture components and was able to determine only PAR and PAP, the ternary mixtures of DRO, CAF, and PAR and CAF, PAR, and PAP. During the calculations of CWT, different wavelet families were tested. The univariate CWT method was validated according to the ICH guidelines. While for the development of the CWT-PLS model a calibration set was prepared by means of an orthogonal experimental design and their absorption spectra were recorded and processed by CWT. The CWT-PLS model was constructed by regression between the wavelet coefficients and concentration matrices and validation was performed by both cross validation and external validation sets. Both methods were successfully applied for determination of the studied drugs in pharmaceutical formulations.

  9. Comparing near-infrared conventional diffuse reflectance spectroscopy and hyperspectral imaging for determination of the bulk properties of solid samples by multivariate regression: determination of Mooney viscosity and plasticity indices of natural rubber.

    Science.gov (United States)

    Juliano da Silva, Carlos; Pasquini, Celio

    2015-01-21

    Conventional reflectance spectroscopy (NIRS) and hyperspectral imaging (HI) in the near-infrared region (1000-2500 nm) are evaluated and compared, using, as the case study, the determination of relevant properties related to the quality of natural rubber. Mooney viscosity (MV) and plasticity indices (PI) (PI0 - original plasticity, PI30 - plasticity after accelerated aging, and PRI - the plasticity retention index after accelerated aging) of rubber were determined using multivariate regression models. Two hundred and eighty six samples of rubber were measured using conventional and hyperspectral near-infrared imaging reflectance instruments in the range of 1000-2500 nm. The sample set was split into regression (n = 191) and external validation (n = 95) sub-sets. Three instruments were employed for data acquisition: a line scanning hyperspectral camera and two conventional FT-NIR spectrometers. Sample heterogeneity was evaluated using hyperspectral images obtained with a resolution of 150 × 150 μm and principal component analysis. The probed sample area (5 cm(2); 24,000 pixels) to achieve representativeness was found to be equivalent to the average of 6 spectra for a 1 cm diameter probing circular window of one FT-NIR instrument. The other spectrophotometer can probe the whole sample in only one measurement. The results show that the rubber properties can be determined with very similar accuracy and precision by Partial Least Square (PLS) regression models regardless of whether HI-NIR or conventional FT-NIR produce the spectral datasets. The best Root Mean Square Errors of Prediction (RMSEPs) of external validation for MV, PI0, PI30, and PRI were 4.3, 1.8, 3.4, and 5.3%, respectively. Though the quantitative results provided by the three instruments can be considered equivalent, the hyperspectral imaging instrument presents a number of advantages, being about 6 times faster than conventional bulk spectrometers, producing robust spectral data by ensuring sample

  10. Bayesian nonparametric hierarchical modeling.

    Science.gov (United States)

    Dunson, David B

    2009-04-01

    In biomedical research, hierarchical models are very widely used to accommodate dependence in multivariate and longitudinal data and for borrowing of information across data from different sources. A primary concern in hierarchical modeling is sensitivity to parametric assumptions, such as linearity and normality of the random effects. Parametric assumptions on latent variable distributions can be challenging to check and are typically unwarranted, given available prior knowledge. This article reviews some recent developments in Bayesian nonparametric methods motivated by complex, multivariate and functional data collected in biomedical studies. The author provides a brief review of flexible parametric approaches relying on finite mixtures and latent class modeling. Dirichlet process mixture models are motivated by the need to generalize these approaches to avoid assuming a fixed finite number of classes. Focusing on an epidemiology application, the author illustrates the practical utility and potential of nonparametric Bayes methods.

  11. A Matlab program for stepwise regression

    Directory of Open Access Journals (Sweden)

    Yanhong Qi

    2016-03-01

    Full Text Available The stepwise linear regression is a multi-variable regression for identifying statistically significant variables in the linear regression equation. In present study, we presented the Matlab program of stepwise regression.

  12. Price promotions on healthier compared with less healthy foods: a hierarchical regression analysis of the impact on sales and social patterning of responses to promotions in Great Britain12345

    Science.gov (United States)

    Nakamura, Ryota; Suhrcke, Marc; Jebb, Susan A; Pechey, Rachel; Almiron-Roig, Eva; Marteau, Theresa M

    2015-01-01

    Background: There is a growing concern, but limited evidence, that price promotions contribute to a poor diet and the social patterning of diet-related disease. Objective: We examined the following questions: 1) Are less-healthy foods more likely to be promoted than healthier foods? 2) Are consumers more responsive to promotions on less-healthy products? 3) Are there socioeconomic differences in food purchases in response to price promotions? Design: With the use of hierarchical regression, we analyzed data on purchases of 11,323 products within 135 food and beverage categories from 26,986 households in Great Britain during 2010. Major supermarkets operated the same price promotions in all branches. The number of stores that offered price promotions on each product for each week was used to measure the frequency of price promotions. We assessed the healthiness of each product by using a nutrient profiling (NP) model. Results: A total of 6788 products (60%) were in healthier categories and 4535 products (40%) were in less-healthy categories. There was no significant gap in the frequency of promotion by the healthiness of products neither within nor between categories. However, after we controlled for the reference price, price discount rate, and brand-specific effects, the sales uplift arising from price promotions was larger in less-healthy than in healthier categories; a 1-SD point increase in the category mean NP score, implying the category becomes less healthy, was associated with an additional 7.7–percentage point increase in sales (from 27.3% to 35.0%; P sales uplift from promotions was larger for higher–socioeconomic status (SES) groups than for lower ones (34.6% for the high-SES group, 28.1% for the middle-SES group, and 23.1% for the low-SES group). Finally, there was no significant SES gap in the absolute volume of purchases of less-healthy foods made on promotion. Conclusion: Attempts to limit promotions on less-healthy foods could improve the

  13. Conceptual hierarchical modeling to describe wetland plant community organization

    Science.gov (United States)

    Little, A.M.; Guntenspergen, G.R.; Allen, T.F.H.

    2010-01-01

    Using multivariate analysis, we created a hierarchical modeling process that describes how differently-scaled environmental factors interact to affect wetland-scale plant community organization in a system of small, isolated wetlands on Mount Desert Island, Maine. We followed the procedure: 1) delineate wetland groups using cluster analysis, 2) identify differently scaled environmental gradients using non-metric multidimensional scaling, 3) order gradient hierarchical levels according to spatiotem-poral scale of fluctuation, and 4) assemble hierarchical model using group relationships with ordination axes and post-hoc tests of environmental differences. Using this process, we determined 1) large wetland size and poor surface water chemistry led to the development of shrub fen wetland vegetation, 2) Sphagnum and water chemistry differences affected fen vs. marsh / sedge meadows status within small wetlands, and 3) small-scale hydrologic differences explained transitions between forested vs. non-forested and marsh vs. sedge meadow vegetation. This hierarchical modeling process can help explain how upper level contextual processes constrain biotic community response to lower-level environmental changes. It creates models with more nuanced spatiotemporal complexity than classification and regression tree procedures. Using this process, wetland scientists will be able to generate more generalizable theories of plant community organization, and useful management models. ?? Society of Wetland Scientists 2009.

  14. LS-SVM: uma nova ferramenta quimiométrica para regressão multivariada. Comparação de modelos de regressão LS-SVM e PLS na quantificação de adulterantes em leite em pó empregando NIR LS-SVM: a new chemometric tool for multivariate regression. Comparison of LS-SVM and pls regression for determination of common adulterants in powdered milk by nir spectroscopy

    Directory of Open Access Journals (Sweden)

    Marco F. Ferrão

    2007-08-01

    Full Text Available Least-squares support vector machines (LS-SVM were used as an alternative multivariate calibration method for the simultaneous quantification of some common adulterants found in powdered milk samples, using near-infrared spectroscopy. Excellent models were built using LS-SVM for determining R², RMSECV and RMSEP values. LS-SVMs show superior performance for quantifying starch, whey and sucrose in powdered milk samples in relation to PLSR. This study shows that it is possible to determine precisely the amount of one and two common adulterants simultaneously in powdered milk samples using LS-SVM and NIR spectra.

  15. Dual Regression

    OpenAIRE

    Spady, Richard; Stouli, Sami

    2012-01-01

    We propose dual regression as an alternative to the quantile regression process for the global estimation of conditional distribution functions under minimal assumptions. Dual regression provides all the interpretational power of the quantile regression process while avoiding the need for repairing the intersecting conditional quantile surfaces that quantile regression often produces in practice. Our approach introduces a mathematical programming characterization of conditional distribution f...

  16. Hierarchical species distribution models

    Science.gov (United States)

    Hefley, Trevor J.; Hooten, Mevin B.

    2016-01-01

    Determining the distribution pattern of a species is important to increase scientific knowledge, inform management decisions, and conserve biodiversity. To infer spatial and temporal patterns, species distribution models have been developed for use with many sampling designs and types of data. Recently, it has been shown that count, presence-absence, and presence-only data can be conceptualized as arising from a point process distribution. Therefore, it is important to understand properties of the point process distribution. We examine how the hierarchical species distribution modeling framework has been used to incorporate a wide array of regression and theory-based components while accounting for the data collection process and making use of auxiliary information. The hierarchical modeling framework allows us to demonstrate how several commonly used species distribution models can be derived from the point process distribution, highlight areas of potential overlap between different models, and suggest areas where further research is needed.

  17. Control Multivariable por Desacoplo

    Directory of Open Access Journals (Sweden)

    Fernando Morilla

    2013-01-01

    results obtained by the authors after several years of research giving priority to the problem generalization and practical issues like easiness of implementation and utilization of PID controllers as elementary blocks. This combination of interests makes difficult to obtain perfect decoupling in all cases; although it is possible to achieve an important interaction reduction at the basic level of the control pyramid in such a way that other control systems at higher hierarchical levels benefit of this fact. This article summarizes the main aspects of decoupling control and presents its application to two illustrative examples: an experimental quadruple tank process and a 4×4 model of a heat, ventilation and air conditioning system. Palabras clave: Control de procesos, Control multivariable, Control por desacoplo, Control PID, Keywords: Process control, multivariable control, decoupling control, PID control

  18. Exploratory multivariate analysis by example using R

    CERN Document Server

    Husson, Francois; Pages, Jerome

    2010-01-01

    Full of real-world case studies and practical advice, Exploratory Multivariate Analysis by Example Using R focuses on four fundamental methods of multivariate exploratory data analysis that are most suitable for applications. It covers principal component analysis (PCA) when variables are quantitative, correspondence analysis (CA) and multiple correspondence analysis (MCA) when variables are categorical, and hierarchical cluster analysis.The authors take a geometric point of view that provides a unified vision for exploring multivariate data tables. Within this framework, they present the prin

  19. Economic viability in concrete dams by multivariable regression tool for implantation of small hydroelectric plants; Viabilidade economica em barragens de concreto pela ferramenta de regressao multivariavel para implantacao de pequenas centrais hidreletrica (PCH)

    Energy Technology Data Exchange (ETDEWEB)

    Lima, Reginaldo Agapito de [Centro Universitario de Itajuba, MG (Brazil)], email: reginaldo_agapito@yahoo.com.br; Ribeiro Junior, Leopoldo Uberto [Voltalia Energia do Brasil, Sao Paulo, SP (Brazil)], email: leopoldo_junior@yahoo.com.br

    2010-07-01

    For implantation of a SHP, the barrage is the main structure where its sizing represents from 30% - 50% of general cost of civil works. Considering this it is very important to have a fast, didactic and accurate tool for elaborating a budget, also allowing a quantitative analysis of inherent cost for civil building of barrages concrete made for small hydropower plants. In face of this, the multi changing regression tool is very important as it allows a fast and correct establishing of preliminary costs, even approximate, for estimates of barrages in concrete cost, enabling to ease the budget, guiding feasibility decisions for selecting or neglecting new alternatives of fall. (author)

  20. Multivariate Analysis and Prediction of Dioxin-Furan ...

    Science.gov (United States)

    Peer Review Draft of Regional Methods Initiative Final Report Dioxins, which are bioaccumulative and environmentally persistent, pose an ongoing risk to human and ecosystem health. Fish constitute a significant source of dioxin exposure for humans and fish-eating wildlife. Current dioxin analytical methods are costly, time-consuming, and produce hazardous by-products. A Danish team developed a novel, multivariate statistical methodology based on the covariance of dioxin-furan congener Toxic Equivalences (TEQs) and fatty acid methyl esters (FAMEs) and applied it to North Atlantic Ocean fishmeal samples. The goal of the current study was to attempt to extend this Danish methodology to 77 whole and composite fish samples from three trophic groups: predator (whole largemouth bass), benthic (whole flathead and channel catfish) and forage fish (composite bluegill, pumpkinseed and green sunfish) from two dioxin contaminated rivers (Pocatalico R. and Kanawha R.) in West Virginia, USA. Multivariate statistical analyses, including, Principal Components Analysis (PCA), Hierarchical Clustering, and Partial Least Squares Regression (PLS), were used to assess the relationship between the FAMEs and TEQs in these dioxin contaminated freshwater fish from the Kanawha and Pocatalico Rivers. These three multivariate statistical methods all confirm that the pattern of Fatty Acid Methyl Esters (FAMEs) in these freshwater fish covaries with and is predictive of the WHO TE

  1. Regression Phalanxes

    OpenAIRE

    Zhang, Hongyang; Welch, William J.; Zamar, Ruben H.

    2017-01-01

    Tomal et al. (2015) introduced the notion of "phalanxes" in the context of rare-class detection in two-class classification problems. A phalanx is a subset of features that work well for classification tasks. In this paper, we propose a different class of phalanxes for application in regression settings. We define a "Regression Phalanx" - a subset of features that work well together for prediction. We propose a novel algorithm which automatically chooses Regression Phalanxes from high-dimensi...

  2. Introduction to multivariate discrimination

    Science.gov (United States)

    Kégl, Balázs

    2013-07-01

    Multivariate discrimination or classification is one of the best-studied problem in machine learning, with a plethora of well-tested and well-performing algorithms. There are also several good general textbooks [1-9] on the subject written to an average engineering, computer science, or statistics graduate student; most of them are also accessible for an average physics student with some background on computer science and statistics. Hence, instead of writing a generic introduction, we concentrate here on relating the subject to a practitioner experimental physicist. After a short introduction on the basic setup (Section 1) we delve into the practical issues of complexity regularization, model selection, and hyperparameter optimization (Section 2), since it is this step that makes high-complexity non-parametric fitting so different from low-dimensional parametric fitting. To emphasize that this issue is not restricted to classification, we illustrate the concept on a low-dimensional but non-parametric regression example (Section 2.1). Section 3 describes the common algorithmic-statistical formal framework that unifies the main families of multivariate classification algorithms. We explain here the large-margin principle that partly explains why these algorithms work. Section 4 is devoted to the description of the three main (families of) classification algorithms, neural networks, the support vector machine, and AdaBoost. We do not go into the algorithmic details; the goal is to give an overview on the form of the functions these methods learn and on the objective functions they optimize. Besides their technical description, we also make an attempt to put these algorithm into a socio-historical context. We then briefly describe some rather heterogeneous applications to illustrate the pattern recognition pipeline and to show how widespread the use of these methods is (Section 5). We conclude the chapter with three essentially open research problems that are either

  3. Introduction to multivariate discrimination

    International Nuclear Information System (INIS)

    Kegl, B.

    2013-01-01

    Multivariate discrimination or classification is one of the best-studied problem in machine learning, with a plethora of well-tested and well-performing algorithms. There are also several good general textbooks [1-9] on the subject written to an average engineering, computer science, or statistics graduate student; most of them are also accessible for an average physics student with some background on computer science and statistics. Hence, instead of writing a generic introduction, we concentrate here on relating the subject to a practitioner experimental physicist. After a short introduction on the basic setup (Section 1) we delve into the practical issues of complexity regularization, model selection, and hyper-parameter optimization (Section 2), since it is this step that makes high-complexity non-parametric fitting so different from low-dimensional parametric fitting. To emphasize that this issue is not restricted to classification, we illustrate the concept on a low-dimensional but non-parametric regression example (Section 2.1). Section 3 describes the common algorithmic-statistical formal framework that unifies the main families of multivariate classification algorithms. We explain here the large-margin principle that partly explains why these algorithms work. Section 4 is devoted to the description of the three main (families of) classification algorithms, neural networks, the support vector machine, and AdaBoost. We do not go into the algorithmic details; the goal is to give an overview on the form of the functions these methods learn and on the objective functions they optimize. Besides their technical description, we also make an attempt to put these algorithm into a socio-historical context. We then briefly describe some rather heterogeneous applications to illustrate the pattern recognition pipeline and to show how widespread the use of these methods is (Section 5). We conclude the chapter with three essentially open research problems that are either

  4. Implementation of a multi-variable regression analysis in the assessment of the generation rate and composition of hospital solid waste for the design of a sustainable management system in developing countries.

    Science.gov (United States)

    Al-Khatib, Issam A; Abu Fkhidah, Ismail; Khatib, Jumana I; Kontogianni, Stamatia

    2016-03-01

    Forecasting of hospital solid waste generation is a critical challenge for future planning. The composition and generation rate of hospital solid waste in hospital units was the field where the proposed methodology of the present article was applied in order to validate the results and secure the outcomes of the management plan in national hospitals. A set of three multiple-variable regression models has been derived for estimating the daily total hospital waste, general hospital waste, and total hazardous waste as a function of number of inpatients, number of total patients, and number of beds. The application of several key indicators and validation procedures indicates the high significance and reliability of the developed models in predicting the hospital solid waste of any hospital. Methodology data were drawn from existent scientific literature. Also, useful raw data were retrieved from international organisations and the investigated hospitals' personnel. The primal generation outcomes are compared with other local hospitals and also with hospitals from other countries. The main outcome, which is the developed model results, are presented and analysed thoroughly. The goal is this model to act as leverage in the discussions among governmental authorities on the implementation of a national plan for safe hospital waste management in Palestine. © The Author(s) 2016.

  5. Hierarchical Multinomial Processing Tree Models: A Latent-Trait Approach

    Science.gov (United States)

    Klauer, Karl Christoph

    2010-01-01

    Multinomial processing tree models are widely used in many areas of psychology. A hierarchical extension of the model class is proposed, using a multivariate normal distribution of person-level parameters with the mean and covariance matrix to be estimated from the data. The hierarchical model allows one to take variability between persons into…

  6. Polynomial regression analysis and significance test of the regression function

    International Nuclear Information System (INIS)

    Gao Zhengming; Zhao Juan; He Shengping

    2012-01-01

    In order to analyze the decay heating power of a certain radioactive isotope per kilogram with polynomial regression method, the paper firstly demonstrated the broad usage of polynomial function and deduced its parameters with ordinary least squares estimate. Then significance test method of polynomial regression function is derived considering the similarity between the polynomial regression model and the multivariable linear regression model. Finally, polynomial regression analysis and significance test of the polynomial function are done to the decay heating power of the iso tope per kilogram in accord with the authors' real work. (authors)

  7. Prediction of road accidents: A Bayesian hierarchical approach

    DEFF Research Database (Denmark)

    Deublein, Markus; Schubert, Matthias; Adey, Bryan T.

    2013-01-01

    the expected number of accidents in which an injury has occurred and the expected number of light, severe and fatally injured road users. Additionally, the methodology is used for geo-referenced identification of road sections with increased occurrence probabilities of injury accident events on a road link......In this paper a novel methodology for the prediction of the occurrence of road accidents is presented. The methodology utilizes a combination of three statistical methods: (1) gamma-updating of the occurrence rates of injury accidents and injured road users, (2) hierarchical multivariate Poisson......-lognormal regression analysis taking into account correlations amongst multiple dependent model response variables and effects of discrete accident count data e.g. over-dispersion, and (3) Bayesian inference algorithms, which are applied by means of data mining techniques supported by Bayesian Probabilistic Networks...

  8. Applied multivariate statistics with R

    CERN Document Server

    Zelterman, Daniel

    2015-01-01

    This book brings the power of multivariate statistics to graduate-level practitioners, making these analytical methods accessible without lengthy mathematical derivations. Using the open source, shareware program R, Professor Zelterman demonstrates the process and outcomes for a wide array of multivariate statistical applications. Chapters cover graphical displays, linear algebra, univariate, bivariate and multivariate normal distributions, factor methods, linear regression, discrimination and classification, clustering, time series models, and additional methods. Zelterman uses practical examples from diverse disciplines to welcome readers from a variety of academic specialties. Those with backgrounds in statistics will learn new methods while they review more familiar topics. Chapters include exercises, real data sets, and R implementations. The data are interesting, real-world topics, particularly from health and biology-related contexts. As an example of the approach, the text examines a sample from the B...

  9. Semiparametric regression during 2003–2007

    KAUST Repository

    Ruppert, David; Wand, M.P.; Carroll, Raymond J.

    2009-01-01

    Semiparametric regression is a fusion between parametric regression and nonparametric regression that integrates low-rank penalized splines, mixed model and hierarchical Bayesian methodology – thus allowing more streamlined handling of longitudinal and spatial correlation. We review progress in the field over the five-year period between 2003 and 2007. We find semiparametric regression to be a vibrant field with substantial involvement and activity, continual enhancement and widespread application.

  10. Autistic Regression

    Science.gov (United States)

    Matson, Johnny L.; Kozlowski, Alison M.

    2010-01-01

    Autistic regression is one of the many mysteries in the developmental course of autism and pervasive developmental disorders not otherwise specified (PDD-NOS). Various definitions of this phenomenon have been used, further clouding the study of the topic. Despite this problem, some efforts at establishing prevalence have been made. The purpose of…

  11. Vector regression introduced

    Directory of Open Access Journals (Sweden)

    Mok Tik

    2014-06-01

    Full Text Available This study formulates regression of vector data that will enable statistical analysis of various geodetic phenomena such as, polar motion, ocean currents, typhoon/hurricane tracking, crustal deformations, and precursory earthquake signals. The observed vector variable of an event (dependent vector variable is expressed as a function of a number of hypothesized phenomena realized also as vector variables (independent vector variables and/or scalar variables that are likely to impact the dependent vector variable. The proposed representation has the unique property of solving the coefficients of independent vector variables (explanatory variables also as vectors, hence it supersedes multivariate multiple regression models, in which the unknown coefficients are scalar quantities. For the solution, complex numbers are used to rep- resent vector information, and the method of least squares is deployed to estimate the vector model parameters after transforming the complex vector regression model into a real vector regression model through isomorphism. Various operational statistics for testing the predictive significance of the estimated vector parameter coefficients are also derived. A simple numerical example demonstrates the use of the proposed vector regression analysis in modeling typhoon paths.

  12. Determinants of falls in community-dwelling elderly: hierarchical analysis.

    Science.gov (United States)

    Brito, Thais Alves; Coqueiro, Raildo da Silva; Fernandes, Marcos Henrique; de Jesus, Cleber Souza

    2014-01-01

    To analyze the fall-related factors in community-dwelling elderly. Epidemiologic cross-sectional population-based household study with hierarchical interrelationships among the potential risk factors. The sample was made up of noninstitutionalized individuals over age 60, who were resident of a city in Brazil's Northeast Region. The dependent variable was fall occurrence in the last 12 months; independent variables were sociodemographic, behavioral, health, and functional status factors. Multivariate hierarchical Poisson regression analysis was used based on a proposed theoretic model. Three hundred and sixteen (89.0%) elderly participated of the survey, average age 74.2 years; the majority was female, with limited literacy and had low-medium family income. The fall prevalence was of 25.8%; occurrence was related to depression symptoms (PR = 1.55) and balance limitation (PR = 1.56). The high fall prevalence among elderly necessitates the identification of fall-related factors for action planning prevention programs with this group. © 2014 Wiley Periodicals, Inc.

  13. Adaptive metric kernel regression

    DEFF Research Database (Denmark)

    Goutte, Cyril; Larsen, Jan

    2000-01-01

    Kernel smoothing is a widely used non-parametric pattern recognition technique. By nature, it suffers from the curse of dimensionality and is usually difficult to apply to high input dimensions. In this contribution, we propose an algorithm that adapts the input metric used in multivariate...... regression by minimising a cross-validation estimate of the generalisation error. This allows to automatically adjust the importance of different dimensions. The improvement in terms of modelling performance is illustrated on a variable selection task where the adaptive metric kernel clearly outperforms...

  14. Adaptive Metric Kernel Regression

    DEFF Research Database (Denmark)

    Goutte, Cyril; Larsen, Jan

    1998-01-01

    Kernel smoothing is a widely used nonparametric pattern recognition technique. By nature, it suffers from the curse of dimensionality and is usually difficult to apply to high input dimensions. In this paper, we propose an algorithm that adapts the input metric used in multivariate regression...... by minimising a cross-validation estimate of the generalisation error. This allows one to automatically adjust the importance of different dimensions. The improvement in terms of modelling performance is illustrated on a variable selection task where the adaptive metric kernel clearly outperforms the standard...

  15. Should metacognition be measured by logistic regression?

    Science.gov (United States)

    Rausch, Manuel; Zehetleitner, Michael

    2017-03-01

    Are logistic regression slopes suitable to quantify metacognitive sensitivity, i.e. the efficiency with which subjective reports differentiate between correct and incorrect task responses? We analytically show that logistic regression slopes are independent from rating criteria in one specific model of metacognition, which assumes (i) that rating decisions are based on sensory evidence generated independently of the sensory evidence used for primary task responses and (ii) that the distributions of evidence are logistic. Given a hierarchical model of metacognition, logistic regression slopes depend on rating criteria. According to all considered models, regression slopes depend on the primary task criterion. A reanalysis of previous data revealed that massive numbers of trials are required to distinguish between hierarchical and independent models with tolerable accuracy. It is argued that researchers who wish to use logistic regression as measure of metacognitive sensitivity need to control the primary task criterion and rating criteria. Copyright © 2017 Elsevier Inc. All rights reserved.

  16. Continuous multivariate exponential extension

    International Nuclear Information System (INIS)

    Block, H.W.

    1975-01-01

    The Freund-Weinman multivariate exponential extension is generalized to the case of nonidentically distributed marginal distributions. A fatal shock model is given for the resulting distribution. Results in the bivariate case and the concept of constant multivariate hazard rate lead to a continuous distribution related to the multivariate exponential distribution (MVE) of Marshall and Olkin. This distribution is shown to be a special case of the extended Freund-Weinman distribution. A generalization of the bivariate model of Proschan and Sullo leads to a distribution which contains both the extended Freund-Weinman distribution and the MVE

  17. Methods of Multivariate Analysis

    CERN Document Server

    Rencher, Alvin C

    2012-01-01

    Praise for the Second Edition "This book is a systematic, well-written, well-organized text on multivariate analysis packed with intuition and insight . . . There is much practical wisdom in this book that is hard to find elsewhere."-IIE Transactions Filled with new and timely content, Methods of Multivariate Analysis, Third Edition provides examples and exercises based on more than sixty real data sets from a wide variety of scientific fields. It takes a "methods" approach to the subject, placing an emphasis on how students and practitioners can employ multivariate analysis in real-life sit

  18. Advanced statistics: linear regression, part II: multiple linear regression.

    Science.gov (United States)

    Marill, Keith A

    2004-01-01

    The applications of simple linear regression in medical research are limited, because in most situations, there are multiple relevant predictor variables. Univariate statistical techniques such as simple linear regression use a single predictor variable, and they often may be mathematically correct but clinically misleading. Multiple linear regression is a mathematical technique used to model the relationship between multiple independent predictor variables and a single dependent outcome variable. It is used in medical research to model observational data, as well as in diagnostic and therapeutic studies in which the outcome is dependent on more than one factor. Although the technique generally is limited to data that can be expressed with a linear function, it benefits from a well-developed mathematical framework that yields unique solutions and exact confidence intervals for regression coefficients. Building on Part I of this series, this article acquaints the reader with some of the important concepts in multiple regression analysis. These include multicollinearity, interaction effects, and an expansion of the discussion of inference testing, leverage, and variable transformations to multivariate models. Examples from the first article in this series are expanded on using a primarily graphic, rather than mathematical, approach. The importance of the relationships among the predictor variables and the dependence of the multivariate model coefficients on the choice of these variables are stressed. Finally, concepts in regression model building are discussed.

  19. Multivariate GARCH models

    DEFF Research Database (Denmark)

    Silvennoinen, Annastiina; Teräsvirta, Timo

    This article contains a review of multivariate GARCH models. Most common GARCH models are presented and their properties considered. This also includes nonparametric and semiparametric models. Existing specification and misspecification tests are discussed. Finally, there is an empirical example...

  20. Multivariate Time Series Search

    Data.gov (United States)

    National Aeronautics and Space Administration — Multivariate Time-Series (MTS) are ubiquitous, and are generated in areas as disparate as sensor recordings in aerospace systems, music and video streams, medical...

  1. Hierarchical control of a nuclear reactor using uncertain dynamics techniques

    International Nuclear Information System (INIS)

    Rovere, L.A.; Otaduy, P.J.; Brittain, C.R.; Perez, R.B.

    1988-01-01

    Recent advances in the nonlinear optimal control area are opening new possibilities towards its implementation in process control. Algorithms for multivariate control, hierarchical decomposition, parameter tracking, model uncertainties actuator saturation effects and physical limits to state variables can be implemented on the basis of a consistent mathematical formulation. In this paper, good agreement is shown between a centralized and a hierarchical implementation of a controller for a hypothetical nuclear power plant subject to multiple demands. The performance of the hierarchical distributed system in the presence of localized subsystem failures is analyzed. 4 refs., 13 figs

  2. Prediction of road accidents: A Bayesian hierarchical approach.

    Science.gov (United States)

    Deublein, Markus; Schubert, Matthias; Adey, Bryan T; Köhler, Jochen; Faber, Michael H

    2013-03-01

    In this paper a novel methodology for the prediction of the occurrence of road accidents is presented. The methodology utilizes a combination of three statistical methods: (1) gamma-updating of the occurrence rates of injury accidents and injured road users, (2) hierarchical multivariate Poisson-lognormal regression analysis taking into account correlations amongst multiple dependent model response variables and effects of discrete accident count data e.g. over-dispersion, and (3) Bayesian inference algorithms, which are applied by means of data mining techniques supported by Bayesian Probabilistic Networks in order to represent non-linearity between risk indicating and model response variables, as well as different types of uncertainties which might be present in the development of the specific models. Prior Bayesian Probabilistic Networks are first established by means of multivariate regression analysis of the observed frequencies of the model response variables, e.g. the occurrence of an accident, and observed values of the risk indicating variables, e.g. degree of road curvature. Subsequently, parameter learning is done using updating algorithms, to determine the posterior predictive probability distributions of the model response variables, conditional on the values of the risk indicating variables. The methodology is illustrated through a case study using data of the Austrian rural motorway network. In the case study, on randomly selected road segments the methodology is used to produce a model to predict the expected number of accidents in which an injury has occurred and the expected number of light, severe and fatally injured road users. Additionally, the methodology is used for geo-referenced identification of road sections with increased occurrence probabilities of injury accident events on a road link between two Austrian cities. It is shown that the proposed methodology can be used to develop models to estimate the occurrence of road accidents for any

  3. Scale and shape mixtures of multivariate skew-normal distributions

    KAUST Repository

    Arellano-Valle, Reinaldo B.

    2018-02-26

    We introduce a broad and flexible class of multivariate distributions obtained by both scale and shape mixtures of multivariate skew-normal distributions. We present the probabilistic properties of this family of distributions in detail and lay down the theoretical foundations for subsequent inference with this model. In particular, we study linear transformations, marginal distributions, selection representations, stochastic representations and hierarchical representations. We also describe an EM-type algorithm for maximum likelihood estimation of the parameters of the model and demonstrate its implementation on a wind dataset. Our family of multivariate distributions unifies and extends many existing models of the literature that can be seen as submodels of our proposal.

  4. imDEV: a graphical user interface to R multivariate analysis tools in Microsoft Excel.

    Science.gov (United States)

    Grapov, Dmitry; Newman, John W

    2012-09-01

    Interactive modules for Data Exploration and Visualization (imDEV) is a Microsoft Excel spreadsheet embedded application providing an integrated environment for the analysis of omics data through a user-friendly interface. Individual modules enables interactive and dynamic analyses of large data by interfacing R's multivariate statistics and highly customizable visualizations with the spreadsheet environment, aiding robust inferences and generating information-rich data visualizations. This tool provides access to multiple comparisons with false discovery correction, hierarchical clustering, principal and independent component analyses, partial least squares regression and discriminant analysis, through an intuitive interface for creating high-quality two- and a three-dimensional visualizations including scatter plot matrices, distribution plots, dendrograms, heat maps, biplots, trellis biplots and correlation networks. Freely available for download at http://sourceforge.net/projects/imdev/. Implemented in R and VBA and supported by Microsoft Excel (2003, 2007 and 2010).

  5. Prospective surveillance of multivariate spatial disease data

    Science.gov (United States)

    Corberán-Vallet, A

    2012-01-01

    Surveillance systems are often focused on more than one disease within a predefined area. On those occasions when outbreaks of disease are likely to be correlated, the use of multivariate surveillance techniques integrating information from multiple diseases allows us to improve the sensitivity and timeliness of outbreak detection. In this article, we present an extension of the surveillance conditional predictive ordinate to monitor multivariate spatial disease data. The proposed surveillance technique, which is defined for each small area and time period as the conditional predictive distribution of those counts of disease higher than expected given the data observed up to the previous time period, alerts us to both small areas of increased disease incidence and the diseases causing the alarm within each area. We investigate its performance within the framework of Bayesian hierarchical Poisson models using a simulation study. An application to diseases of the respiratory system in South Carolina is finally presented. PMID:22534429

  6. Catalysis with hierarchical zeolites

    DEFF Research Database (Denmark)

    Holm, Martin Spangsberg; Taarning, Esben; Egeblad, Kresten

    2011-01-01

    Hierarchical (or mesoporous) zeolites have attracted significant attention during the first decade of the 21st century, and so far this interest continues to increase. There have already been several reviews giving detailed accounts of the developments emphasizing different aspects of this research...... topic. Until now, the main reason for developing hierarchical zeolites has been to achieve heterogeneous catalysts with improved performance but this particular facet has not yet been reviewed in detail. Thus, the present paper summaries and categorizes the catalytic studies utilizing hierarchical...... zeolites that have been reported hitherto. Prototypical examples from some of the different categories of catalytic reactions that have been studied using hierarchical zeolite catalysts are highlighted. This clearly illustrates the different ways that improved performance can be achieved with this family...

  7. Hierarchical Network Design

    DEFF Research Database (Denmark)

    Thomadsen, Tommy

    2005-01-01

    Communication networks are immensely important today, since both companies and individuals use numerous services that rely on them. This thesis considers the design of hierarchical (communication) networks. Hierarchical networks consist of layers of networks and are well-suited for coping...... with changing and increasing demands. Two-layer networks consist of one backbone network, which interconnects cluster networks. The clusters consist of nodes and links, which connect the nodes. One node in each cluster is a hub node, and the backbone interconnects the hub nodes of each cluster and thus...... the clusters. The design of hierarchical networks involves clustering of nodes, hub selection, and network design, i.e. selection of links and routing of ows. Hierarchical networks have been in use for decades, but integrated design of these networks has only been considered for very special types of networks...

  8. Micromechanics of hierarchical materials

    DEFF Research Database (Denmark)

    Mishnaevsky, Leon, Jr.

    2012-01-01

    A short overview of micromechanical models of hierarchical materials (hybrid composites, biomaterials, fractal materials, etc.) is given. Several examples of the modeling of strength and damage in hierarchical materials are summarized, among them, 3D FE model of hybrid composites...... with nanoengineered matrix, fiber bundle model of UD composites with hierarchically clustered fibers and 3D multilevel model of wood considered as a gradient, cellular material with layered composite cell walls. The main areas of research in micromechanics of hierarchical materials are identified, among them......, the investigations of the effects of load redistribution between reinforcing elements at different scale levels, of the possibilities to control different material properties and to ensure synergy of strengthening effects at different scale levels and using the nanoreinforcement effects. The main future directions...

  9. Programming with Hierarchical Maps

    DEFF Research Database (Denmark)

    Ørbæk, Peter

    This report desribes the hierarchical maps used as a central data structure in the Corundum framework. We describe its most prominent features, ague for its usefulness and briefly describe some of the software prototypes implemented using the technology....

  10. Introduction into Hierarchical Matrices

    KAUST Repository

    Litvinenko, Alexander

    2013-12-05

    Hierarchical matrices allow us to reduce computational storage and cost from cubic to almost linear. This technique can be applied for solving PDEs, integral equations, matrix equations and approximation of large covariance and precision matrices.

  11. Introduction into Hierarchical Matrices

    KAUST Repository

    Litvinenko, Alexander

    2013-01-01

    Hierarchical matrices allow us to reduce computational storage and cost from cubic to almost linear. This technique can be applied for solving PDEs, integral equations, matrix equations and approximation of large covariance and precision matrices.

  12. A MULTIVARIATE ANALYSIS OF CROATIAN COUNTIES ENTREPRENEURSHIP

    Directory of Open Access Journals (Sweden)

    Elza Jurun

    2012-12-01

    Full Text Available In the focus of this paper is a multivariate analysis of Croatian Counties entrepreneurship. Complete data base available by official statistic institutions at national and regional level is used. Modern econometric methodology starting from a comparative analysis via multiple regression to multivariate cluster analysis is carried out as well as the analysis of successful or inefficacious entrepreneurship measured by indicators of efficiency, profitability and productivity. Time horizons of the comparative analysis are in 2004 and 2010. Accelerators of socio-economic development - number of entrepreneur investors, investment in fixed assets and current assets ratio in multiple regression model are analytically filtered between twenty-six independent variables as variables of the dominant influence on GDP per capita in 2010 as dependent variable. Results of multivariate cluster analysis of twentyone Croatian Counties are interpreted also in the sense of three Croatian NUTS 2 regions according to European nomenclature of regional territorial division of Croatia.

  13. Multivariate Birkhoff interpolation

    CERN Document Server

    Lorentz, Rudolph A

    1992-01-01

    The subject of this book is Lagrange, Hermite and Birkhoff (lacunary Hermite) interpolation by multivariate algebraic polynomials. It unifies and extends a new algorithmic approach to this subject which was introduced and developed by G.G. Lorentz and the author. One particularly interesting feature of this algorithmic approach is that it obviates the necessity of finding a formula for the Vandermonde determinant of a multivariate interpolation in order to determine its regularity (which formulas are practically unknown anyways) by determining the regularity through simple geometric manipulations in the Euclidean space. Although interpolation is a classical problem, it is surprising how little is known about its basic properties in the multivariate case. The book therefore starts by exploring its fundamental properties and its limitations. The main part of the book is devoted to a complete and detailed elaboration of the new technique. A chapter with an extensive selection of finite elements follows as well a...

  14. Applied multivariate statistical analysis

    CERN Document Server

    Härdle, Wolfgang Karl

    2015-01-01

    Focusing on high-dimensional applications, this 4th edition presents the tools and concepts used in multivariate data analysis in a style that is also accessible for non-mathematicians and practitioners.  It surveys the basic principles and emphasizes both exploratory and inferential statistics; a new chapter on Variable Selection (Lasso, SCAD and Elastic Net) has also been added.  All chapters include practical exercises that highlight applications in different multivariate data analysis fields: in quantitative financial studies, where the joint dynamics of assets are observed; in medicine, where recorded observations of subjects in different locations form the basis for reliable diagnoses and medication; and in quantitative marketing, where consumers’ preferences are collected in order to construct models of consumer behavior.  All of these examples involve high to ultra-high dimensions and represent a number of major fields in big data analysis. The fourth edition of this book on Applied Multivariate ...

  15. Directional quantile regression in R

    Czech Academy of Sciences Publication Activity Database

    Boček, Pavel; Šiman, Miroslav

    2017-01-01

    Roč. 53, č. 3 (2017), s. 480-492 ISSN 0023-5954 R&D Projects: GA ČR GA14-07234S Institutional support: RVO:67985556 Keywords : multivariate quantile * regression quantile * halfspace depth * depth contour Subject RIV: BD - Theory of Information OBOR OECD: Applied mathematics Impact factor: 0.379, year: 2016 http://library.utia.cas.cz/separaty/2017/SI/bocek-0476587.pdf

  16. A MULTIVARIATE WEIBULL DISTRIBUTION

    Directory of Open Access Journals (Sweden)

    Cheng Lee

    2010-07-01

    Full Text Available A multivariate survival function of Weibull Distribution is developed by expanding the theorem by Lu and Bhattacharyya. From the survival function, the probability density function, the cumulative probability function, the determinant of the Jacobian Matrix, and the general moment are derived.

  17. Multivariate realised kernels

    DEFF Research Database (Denmark)

    Barndorff-Nielsen, Ole; Hansen, Peter Reinhard; Lunde, Asger

    We propose a multivariate realised kernel to estimate the ex-post covariation of log-prices. We show this new consistent estimator is guaranteed to be positive semi-definite and is robust to measurement noise of certain types and can also handle non-synchronous trading. It is the first estimator...

  18. Multivariate data analysis

    DEFF Research Database (Denmark)

    Hansen, Michael Adsetts Edberg

    Interest in statistical methodology is increasing so rapidly in the astronomical community that accessible introductory material in this area is long overdue. This book fills the gap by providing a presentation of the most useful techniques in multivariate statistics. A wide-ranging annotated set...

  19. Multivariate pattern dependence.

    Directory of Open Access Journals (Sweden)

    Stefano Anzellotti

    2017-11-01

    Full Text Available When we perform a cognitive task, multiple brain regions are engaged. Understanding how these regions interact is a fundamental step to uncover the neural bases of behavior. Most research on the interactions between brain regions has focused on the univariate responses in the regions. However, fine grained patterns of response encode important information, as shown by multivariate pattern analysis. In the present article, we introduce and apply multivariate pattern dependence (MVPD: a technique to study the statistical dependence between brain regions in humans in terms of the multivariate relations between their patterns of responses. MVPD characterizes the responses in each brain region as trajectories in region-specific multidimensional spaces, and models the multivariate relationship between these trajectories. We applied MVPD to the posterior superior temporal sulcus (pSTS and to the fusiform face area (FFA, using a searchlight approach to reveal interactions between these seed regions and the rest of the brain. Across two different experiments, MVPD identified significant statistical dependence not detected by standard functional connectivity. Additionally, MVPD outperformed univariate connectivity in its ability to explain independent variance in the responses of individual voxels. In the end, MVPD uncovered different connectivity profiles associated with different representational subspaces of FFA: the first principal component of FFA shows differential connectivity with occipital and parietal regions implicated in the processing of low-level properties of faces, while the second and third components show differential connectivity with anterior temporal regions implicated in the processing of invariant representations of face identity.

  20. Parallel hierarchical radiosity rendering

    Energy Technology Data Exchange (ETDEWEB)

    Carter, Michael [Iowa State Univ., Ames, IA (United States)

    1993-07-01

    In this dissertation, the step-by-step development of a scalable parallel hierarchical radiosity renderer is documented. First, a new look is taken at the traditional radiosity equation, and a new form is presented in which the matrix of linear system coefficients is transformed into a symmetric matrix, thereby simplifying the problem and enabling a new solution technique to be applied. Next, the state-of-the-art hierarchical radiosity methods are examined for their suitability to parallel implementation, and scalability. Significant enhancements are also discovered which both improve their theoretical foundations and improve the images they generate. The resultant hierarchical radiosity algorithm is then examined for sources of parallelism, and for an architectural mapping. Several architectural mappings are discussed. A few key algorithmic changes are suggested during the process of making the algorithm parallel. Next, the performance, efficiency, and scalability of the algorithm are analyzed. The dissertation closes with a discussion of several ideas which have the potential to further enhance the hierarchical radiosity method, or provide an entirely new forum for the application of hierarchical methods.

  1. Neutrosophic Hierarchical Clustering Algoritms

    Directory of Open Access Journals (Sweden)

    Rıdvan Şahin

    2014-03-01

    Full Text Available Interval neutrosophic set (INS is a generalization of interval valued intuitionistic fuzzy set (IVIFS, whose the membership and non-membership values of elements consist of fuzzy range, while single valued neutrosophic set (SVNS is regarded as extension of intuitionistic fuzzy set (IFS. In this paper, we extend the hierarchical clustering techniques proposed for IFSs and IVIFSs to SVNSs and INSs respectively. Based on the traditional hierarchical clustering procedure, the single valued neutrosophic aggregation operator, and the basic distance measures between SVNSs, we define a single valued neutrosophic hierarchical clustering algorithm for clustering SVNSs. Then we extend the algorithm to classify an interval neutrosophic data. Finally, we present some numerical examples in order to show the effectiveness and availability of the developed clustering algorithms.

  2. Differentiating regressed melanoma from regressed lichenoid keratosis.

    Science.gov (United States)

    Chan, Aegean H; Shulman, Kenneth J; Lee, Bonnie A

    2017-04-01

    Distinguishing regressed lichen planus-like keratosis (LPLK) from regressed melanoma can be difficult on histopathologic examination, potentially resulting in mismanagement of patients. We aimed to identify histopathologic features by which regressed melanoma can be differentiated from regressed LPLK. Twenty actively inflamed LPLK, 12 LPLK with regression and 15 melanomas with regression were compared and evaluated by hematoxylin and eosin staining as well as Melan-A, microphthalmia transcription factor (MiTF) and cytokeratin (AE1/AE3) immunostaining. (1) A total of 40% of regressed melanomas showed complete or near complete loss of melanocytes within the epidermis with Melan-A and MiTF immunostaining, while 8% of regressed LPLK exhibited this finding. (2) Necrotic keratinocytes were seen in the epidermis in 33% regressed melanomas as opposed to all of the regressed LPLK. (3) A dense infiltrate of melanophages in the papillary dermis was seen in 40% of regressed melanomas, a feature not seen in regressed LPLK. In summary, our findings suggest that a complete or near complete loss of melanocytes within the epidermis strongly favors a regressed melanoma over a regressed LPLK. In addition, necrotic epidermal keratinocytes and the presence of a dense band-like distribution of dermal melanophages can be helpful in differentiating these lesions. © 2016 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  3. Multivariate wavelet frames

    CERN Document Server

    Skopina, Maria; Protasov, Vladimir

    2016-01-01

    This book presents a systematic study of multivariate wavelet frames with matrix dilation, in particular, orthogonal and bi-orthogonal bases, which are a special case of frames. Further, it provides algorithmic methods for the construction of dual and tight wavelet frames with a desirable approximation order, namely compactly supported wavelet frames, which are commonly required by engineers. It particularly focuses on methods of constructing them. Wavelet bases and frames are actively used in numerous applications such as audio and graphic signal processing, compression and transmission of information. They are especially useful in image recovery from incomplete observed data due to the redundancy of frame systems. The construction of multivariate wavelet frames, especially bases, with desirable properties remains a challenging problem as although a general scheme of construction is well known, its practical implementation in the multidimensional setting is difficult. Another important feature of wavelet is ...

  4. Multivariate calculus and geometry

    CERN Document Server

    Dineen, Seán

    2014-01-01

    Multivariate calculus can be understood best by combining geometric insight, intuitive arguments, detailed explanations and mathematical reasoning. This textbook has successfully followed this programme. It additionally provides a solid description of the basic concepts, via familiar examples, which are then tested in technically demanding situations. In this new edition the introductory chapter and two of the chapters on the geometry of surfaces have been revised. Some exercises have been replaced and others provided with expanded solutions. Familiarity with partial derivatives and a course in linear algebra are essential prerequisites for readers of this book. Multivariate Calculus and Geometry is aimed primarily at higher level undergraduates in the mathematical sciences. The inclusion of many practical examples involving problems of several variables will appeal to mathematics, science and engineering students.

  5. Intelligent multivariate process supervision

    International Nuclear Information System (INIS)

    Visuri, Pertti.

    1986-01-01

    This thesis addresses the difficulties encountered in managing large amounts of data in supervisory control of complex systems. Some previous alarm and disturbance analysis concepts are reviewed and a method for improving the supervision of complex systems is presented. The method, called multivariate supervision, is based on adding low level intelligence to the process control system. By using several measured variables linked together by means of deductive logic, the system can take into account the overall state of the supervised system. Thus, it can present to the operators fewer messages with higher information content than the conventional control systems which are based on independent processing of each variable. In addition, the multivariate method contains a special information presentation concept for improving the man-machine interface. (author)

  6. Multivariate rational data fitting

    Science.gov (United States)

    Cuyt, Annie; Verdonk, Brigitte

    1992-12-01

    Sections 1 and 2 discuss the advantages of an object-oriented implementation combined with higher floating-point arithmetic, of the algorithms available for multivariate data fitting using rational functions. Section 1 will in particular explain what we mean by "higher arithmetic". Section 2 will concentrate on the concepts of "object orientation". In sections 3 and 4 we shall describe the generality of the data structure that can be dealt with: due to some new results virtually every data set is acceptable right now, with possible coalescence of coordinates or points. In order to solve the multivariate rational interpolation problem the data sets are fed to different algorithms depending on the structure of the interpolation points in then-variate space.

  7. Transient multivariable sensor evaluation

    Energy Technology Data Exchange (ETDEWEB)

    Vilim, Richard B.; Heifetz, Alexander

    2017-02-21

    A method and system for performing transient multivariable sensor evaluation. The method and system includes a computer system for identifying a model form, providing training measurement data, generating a basis vector, monitoring system data from sensor, loading the system data in a non-transient memory, performing an estimation to provide desired data and comparing the system data to the desired data and outputting an alarm for a defective sensor.

  8. What are hierarchical models and how do we analyze them?

    Science.gov (United States)

    Royle, Andy

    2016-01-01

    In this chapter we provide a basic definition of hierarchical models and introduce the two canonical hierarchical models in this book: site occupancy and N-mixture models. The former is a hierarchical extension of logistic regression and the latter is a hierarchical extension of Poisson regression. We introduce basic concepts of probability modeling and statistical inference including likelihood and Bayesian perspectives. We go through the mechanics of maximizing the likelihood and characterizing the posterior distribution by Markov chain Monte Carlo (MCMC) methods. We give a general perspective on topics such as model selection and assessment of model fit, although we demonstrate these topics in practice in later chapters (especially Chapters 5, 6, 7, and 10 Chapter 5 Chapter 6 Chapter 7 Chapter 10)

  9. Determination of sulfamethoxazole and trimethoprim mixtures by multivariate electronic spectroscopy

    OpenAIRE

    Cordeiro, Gilcélia A.; Peralta-Zamora, Patricio; Nagata, Noemi; Pontarollo, Roberto

    2008-01-01

    In this work a multivariate spectroscopic methodology is proposed for quantitative determination of sulfamethoxazole and trimethoprim in pharmaceutical associations. The multivariate model was developed by partial least-squares regression, using twenty synthetic mixtures and the spectral region between 190 and 350 nm. In the validation stage, which involved the analysis of five synthetic mixtures, prediction errors lower that 3% were observed. The predictive capacity of the multivariate model...

  10. Regression: A Bibliography.

    Science.gov (United States)

    Pedrini, D. T.; Pedrini, Bonnie C.

    Regression, another mechanism studied by Sigmund Freud, has had much research, e.g., hypnotic regression, frustration regression, schizophrenic regression, and infra-human-animal regression (often directly related to fixation). Many investigators worked with hypnotic age regression, which has a long history, going back to Russian reflexologists.…

  11. Hierarchical wave functions revisited

    International Nuclear Information System (INIS)

    Li Dingping.

    1997-11-01

    We study the hierarchical wave functions on a sphere and on a torus. We simplify some wave functions on a sphere or a torus using the analytic properties of wave functions. The open question, the construction of the wave function for quasi electron excitation on a torus, is also solved in this paper. (author)

  12. Hierarchical Porous Structures

    Energy Technology Data Exchange (ETDEWEB)

    Grote, Christopher John [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)

    2016-06-07

    Materials Design is often at the forefront of technological innovation. While there has always been a push to generate increasingly low density materials, such as aero or hydrogels, more recently the idea of bicontinuous structures has gone more into play. This review will cover some of the methods and applications for generating both porous, and hierarchically porous structures.

  13. The Hierarchical Perspective

    Directory of Open Access Journals (Sweden)

    Daniel Sofron

    2015-05-01

    Full Text Available This paper is focused on the hierarchical perspective, one of the methods for representing space that was used before the discovery of the Renaissance linear perspective. The hierarchical perspective has a more or less pronounced scientific character and its study offers us a clear image of the way the representatives of the cultures that developed it used to perceive the sensitive reality. This type of perspective is an original method of representing three-dimensional space on a flat surface, which characterises the art of Ancient Egypt and much of the art of the Middle Ages, being identified in the Eastern European Byzantine art, as well as in the Western European Pre-Romanesque and Romanesque art. At the same time, the hierarchical perspective is also present in naive painting and infantile drawing. Reminiscences of this method can be recognised also in the works of some precursors of the Italian Renaissance. The hierarchical perspective can be viewed as a subjective ranking criterion, according to which the elements are visually represented by taking into account their relevance within the image while perception is ignored. This paper aims to show how the main objective of the artists of those times was not to faithfully represent the objective reality, but rather to emphasize the essence of the world and its perennial aspects. This may represent a possible explanation for the refusal of perspective in the Egyptian, Romanesque and Byzantine painting, characterised by a marked two-dimensionality.

  14. Proposing a Hierarchical Utility Package with Reference to Mobile Advertising

    OpenAIRE

    Shalini N. Tripathi; Masood H. Siddiqui

    2011-01-01

    Mobile advertising is a powerful tool for direct and interactive marketing. However effective marketing requires examining consumers’ psyche. This study proposes a hierarchical utility package (in the consumers’ perception) with reference to mobile advertising, thus enhancing its acceptance. Confirmatory factor analysis revealed four consolidated utility dimensions (with reference to mobile advertising). Binary logistic regression was used to create a hierarchical utility package with res...

  15. Canonical variate regression.

    Science.gov (United States)

    Luo, Chongliang; Liu, Jin; Dey, Dipak K; Chen, Kun

    2016-07-01

    In many fields, multi-view datasets, measuring multiple distinct but interrelated sets of characteristics on the same set of subjects, together with data on certain outcomes or phenotypes, are routinely collected. The objective in such a problem is often two-fold: both to explore the association structures of multiple sets of measurements and to develop a parsimonious model for predicting the future outcomes. We study a unified canonical variate regression framework to tackle the two problems simultaneously. The proposed criterion integrates multiple canonical correlation analysis with predictive modeling, balancing between the association strength of the canonical variates and their joint predictive power on the outcomes. Moreover, the proposed criterion seeks multiple sets of canonical variates simultaneously to enable the examination of their joint effects on the outcomes, and is able to handle multivariate and non-Gaussian outcomes. An efficient algorithm based on variable splitting and Lagrangian multipliers is proposed. Simulation studies show the superior performance of the proposed approach. We demonstrate the effectiveness of the proposed approach in an [Formula: see text] intercross mice study and an alcohol dependence study. © The Author 2016. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  16. Multivariate realised kernels

    DEFF Research Database (Denmark)

    Barndorff-Nielsen, Ole Eiler; Hansen, Peter Reinhard; Lunde, Asger

    2011-01-01

    We propose a multivariate realised kernel to estimate the ex-post covariation of log-prices. We show this new consistent estimator is guaranteed to be positive semi-definite and is robust to measurement error of certain types and can also handle non-synchronous trading. It is the first estimator...... which has these three properties which are all essential for empirical work in this area. We derive the large sample asymptotics of this estimator and assess its accuracy using a Monte Carlo study. We implement the estimator on some US equity data, comparing our results to previous work which has used...

  17. Precision Index in the Multivariate Context

    Czech Academy of Sciences Publication Activity Database

    Šiman, Miroslav

    2014-01-01

    Roč. 43, č. 2 (2014), s. 377-387 ISSN 0361-0926 R&D Projects: GA MŠk(CZ) 1M06047 Institutional support: RVO:67985556 Keywords : data depth * multivariate quantile * process capability index * precision index * regression quantile Subject RIV: BA - General Mathematics Impact factor: 0.274, year: 2014 http://library.utia.cas.cz/separaty/2014/SI/siman-0425059.pdf

  18. Hierarchical spatial point process analysis for a plant community with high biodiversity

    DEFF Research Database (Denmark)

    Illian, Janine B.; Møller, Jesper; Waagepetersen, Rasmus

    2009-01-01

    A complex multivariate spatial point pattern of a plant community with high biodiversity is modelled using a hierarchical multivariate point process model. In the model, interactions between plants with different post-fire regeneration strategies are of key interest. We consider initially a maxim...

  19. Hierarchically Structured Electrospun Fibers

    Science.gov (United States)

    2013-01-07

    in the natural lotus and silver ragwort leaves. Figure 4. Examples of electrospun bio-mimics of natural hierarchical structures. (A) Lotus leaf...B) pillared poly(methyl methacrylate) (PMMA) electrospun fiber mimic; (C) silver ragwort leaf; (D) electrospun fiber mimic made from nylon 6 and...domains containing the protein in the surrounding EVA fibers [115]. A wide variety of core-shell fibers have been generated, including PCL/ gelatin

  20. On Bayesian shared component disease mapping and ecological regression with errors in covariates.

    Science.gov (United States)

    MacNab, Ying C

    2010-05-20

    Recent literature on Bayesian disease mapping presents shared component models (SCMs) for joint spatial modeling of two or more diseases with common risk factors. In this study, Bayesian hierarchical formulations of shared component disease mapping and ecological models are explored and developed in the context of ecological regression, taking into consideration errors in covariates. A review of multivariate disease mapping models (MultiVMs) such as the multivariate conditional autoregressive models that are also part of the more recent Bayesian disease mapping literature is presented. Some insights into the connections and distinctions between the SCM and MultiVM procedures are communicated. Important issues surrounding (appropriate) formulation of shared- and disease-specific components, consideration/choice of spatial or non-spatial random effects priors, and identification of model parameters in SCMs are explored and discussed in the context of spatial and ecological analysis of small area multivariate disease or health outcome rates and associated ecological risk factors. The methods are illustrated through an in-depth analysis of four-variate road traffic accident injury (RTAI) data: gender-specific fatal and non-fatal RTAI rates in 84 local health areas in British Columbia (Canada). Fully Bayesian inference via Markov chain Monte Carlo simulations is presented. Copyright 2010 John Wiley & Sons, Ltd.

  1. Hierarchical video summarization

    Science.gov (United States)

    Ratakonda, Krishna; Sezan, M. Ibrahim; Crinon, Regis J.

    1998-12-01

    We address the problem of key-frame summarization of vide in the absence of any a priori information about its content. This is a common problem that is encountered in home videos. We propose a hierarchical key-frame summarization algorithm where a coarse-to-fine key-frame summary is generated. A hierarchical key-frame summary facilitates multi-level browsing where the user can quickly discover the content of the video by accessing its coarsest but most compact summary and then view a desired segment of the video with increasingly more detail. At the finest level, the summary is generated on the basis of color features of video frames, using an extension of a recently proposed key-frame extraction algorithm. The finest level key-frames are recursively clustered using a novel pairwise K-means clustering approach with temporal consecutiveness constraint. We also address summarization of MPEG-2 compressed video without fully decoding the bitstream. We also propose efficient mechanisms that facilitate decoding the video when the hierarchical summary is utilized in browsing and playback of video segments starting at selected key-frames.

  2. Hierarchically Structured Electrospun Fibers

    Directory of Open Access Journals (Sweden)

    Nicole E. Zander

    2013-01-01

    Full Text Available Traditional electrospun nanofibers have a myriad of applications ranging from scaffolds for tissue engineering to components of biosensors and energy harvesting devices. The generally smooth one-dimensional structure of the fibers has stood as a limitation to several interesting novel applications. Control of fiber diameter, porosity and collector geometry will be briefly discussed, as will more traditional methods for controlling fiber morphology and fiber mat architecture. The remainder of the review will focus on new techniques to prepare hierarchically structured fibers. Fibers with hierarchical primary structures—including helical, buckled, and beads-on-a-string fibers, as well as fibers with secondary structures, such as nanopores, nanopillars, nanorods, and internally structured fibers and their applications—will be discussed. These new materials with helical/buckled morphology are expected to possess unique optical and mechanical properties with possible applications for negative refractive index materials, highly stretchable/high-tensile-strength materials, and components in microelectromechanical devices. Core-shell type fibers enable a much wider variety of materials to be electrospun and are expected to be widely applied in the sensing, drug delivery/controlled release fields, and in the encapsulation of live cells for biological applications. Materials with a hierarchical secondary structure are expected to provide new superhydrophobic and self-cleaning materials.

  3. Multivariable calculus with applications

    CERN Document Server

    Lax, Peter D

    2017-01-01

    This text in multivariable calculus fosters comprehension through meaningful explanations. Written with students in mathematics, the physical sciences, and engineering in mind, it extends concepts from single variable calculus such as derivative, integral, and important theorems to partial derivatives, multiple integrals, Stokes’ and divergence theorems. Students with a background in single variable calculus are guided through a variety of problem solving techniques and practice problems. Examples from the physical sciences are utilized to highlight the essential relationship between calculus and modern science. The symbiotic relationship between science and mathematics is shown by deriving and discussing several conservation laws, and vector calculus is utilized to describe a number of physical theories via partial differential equations. Students will learn that mathematics is the language that enables scientific ideas to be precisely formulated and that science is a source for the development of mathemat...

  4. Multivariate Statistical Process Control

    DEFF Research Database (Denmark)

    Kulahci, Murat

    2013-01-01

    As sensor and computer technology continues to improve, it becomes a normal occurrence that we confront with high dimensional data sets. As in many areas of industrial statistics, this brings forth various challenges in statistical process control (SPC) and monitoring for which the aim...... is to identify “out-of-control” state of a process using control charts in order to reduce the excessive variation caused by so-called assignable causes. In practice, the most common method of monitoring multivariate data is through a statistic akin to the Hotelling’s T2. For high dimensional data with excessive...... amount of cross correlation, practitioners are often recommended to use latent structures methods such as Principal Component Analysis to summarize the data in only a few linear combinations of the original variables that capture most of the variation in the data. Applications of these control charts...

  5. Practical multivariate analysis

    CERN Document Server

    Afifi, Abdelmonem; Clark, Virginia A

    2011-01-01

    ""First of all, it is very easy to read. … The authors manage to introduce and (at least partially) explain even quite complex concepts, e.g. eigenvalues, in an easy and pedagogical way that I suppose is attractive to readers without deeper statistical knowledge. The text is also sprinkled with references for those who want to probe deeper into a certain topic. Secondly, I personally find the book's emphasis on practical data handling very appealing. … Thirdly, the book gives very nice coverage of regression analysis. … this is a nicely written book that gives a good overview of a large number

  6. Multivariate statistical methods a primer

    CERN Document Server

    Manly, Bryan FJ

    2004-01-01

    THE MATERIAL OF MULTIVARIATE ANALYSISExamples of Multivariate DataPreview of Multivariate MethodsThe Multivariate Normal DistributionComputer ProgramsGraphical MethodsChapter SummaryReferencesMATRIX ALGEBRAThe Need for Matrix AlgebraMatrices and VectorsOperations on MatricesMatrix InversionQuadratic FormsEigenvalues and EigenvectorsVectors of Means and Covariance MatricesFurther Reading Chapter SummaryReferencesDISPLAYING MULTIVARIATE DATAThe Problem of Displaying Many Variables in Two DimensionsPlotting index VariablesThe Draftsman's PlotThe Representation of Individual Data P:ointsProfiles o

  7. Multivariate methods and forecasting with IBM SPSS statistics

    CERN Document Server

    Aljandali, Abdulkader

    2017-01-01

    This is the second of a two-part guide to quantitative analysis using the IBM SPSS Statistics software package; this volume focuses on multivariate statistical methods and advanced forecasting techniques. More often than not, regression models involve more than one independent variable. For example, forecasting methods are commonly applied to aggregates such as inflation rates, unemployment, exchange rates, etc., that have complex relationships with determining variables. This book introduces multivariate regression models and provides examples to help understand theory underpinning the model. The book presents the fundamentals of multivariate regression and then moves on to examine several related techniques that have application in business-orientated fields such as logistic and multinomial regression. Forecasting tools such as the Box-Jenkins approach to time series modeling are introduced, as well as exponential smoothing and naïve techniques. This part also covers hot topics such as Factor Analysis, Dis...

  8. Context updates are hierarchical

    Directory of Open Access Journals (Sweden)

    Anton Karl Ingason

    2016-10-01

    Full Text Available This squib studies the order in which elements are added to the shared context of interlocutors in a conversation. It focuses on context updates within one hierarchical structure and argues that structurally higher elements are entered into the context before lower elements, even if the structurally higher elements are pronounced after the lower elements. The crucial data are drawn from a comparison of relative clauses in two head-initial languages, English and Icelandic, and two head-final languages, Korean and Japanese. The findings have consequences for any theory of a dynamic semantics.

  9. Function approximation with polynomial regression slines

    International Nuclear Information System (INIS)

    Urbanski, P.

    1996-01-01

    Principles of the polynomial regression splines as well as algorithms and programs for their computation are presented. The programs prepared using software package MATLAB are generally intended for approximation of the X-ray spectra and can be applied in the multivariate calibration of radiometric gauges. (author)

  10. Detecting Hierarchical Structure in Networks

    DEFF Research Database (Denmark)

    Herlau, Tue; Mørup, Morten; Schmidt, Mikkel Nørgaard

    2012-01-01

    Many real-world networks exhibit hierarchical organization. Previous models of hierarchies within relational data has focused on binary trees; however, for many networks it is unknown whether there is hierarchical structure, and if there is, a binary tree might not account well for it. We propose...... a generative Bayesian model that is able to infer whether hierarchies are present or not from a hypothesis space encompassing all types of hierarchical tree structures. For efficient inference we propose a collapsed Gibbs sampling procedure that jointly infers a partition and its hierarchical structure....... On synthetic and real data we demonstrate that our model can detect hierarchical structure leading to better link-prediction than competing models. Our model can be used to detect if a network exhibits hierarchical structure, thereby leading to a better comprehension and statistical account the network....

  11. Hierarchical quark mass matrices

    International Nuclear Information System (INIS)

    Rasin, A.

    1998-02-01

    I define a set of conditions that the most general hierarchical Yukawa mass matrices have to satisfy so that the leading rotations in the diagonalization matrix are a pair of (2,3) and (1,2) rotations. In addition to Fritzsch structures, examples of such hierarchical structures include also matrices with (1,3) elements of the same order or even much larger than the (1,2) elements. Such matrices can be obtained in the framework of a flavor theory. To leading order, the values of the angle in the (2,3) plane (s 23 ) and the angle in the (1,2) plane (s 12 ) do not depend on the order in which they are taken when diagonalizing. We find that any of the Cabbibo-Kobayashi-Maskawa matrix parametrizations that consist of at least one (1,2) and one (2,3) rotation may be suitable. In the particular case when the s 13 diagonalization angles are sufficiently small compared to the product s 12 s 23 , two special CKM parametrizations emerge: the R 12 R 23 R 12 parametrization follows with s 23 taken before the s 12 rotation, and vice versa for the R 23 R 12 R 23 parametrization. (author)

  12. Hierarchical partial order ranking

    International Nuclear Information System (INIS)

    Carlsen, Lars

    2008-01-01

    Assessing the potential impact on environmental and human health from the production and use of chemicals or from polluted sites involves a multi-criteria evaluation scheme. A priori several parameters are to address, e.g., production tonnage, specific release scenarios, geographical and site-specific factors in addition to various substance dependent parameters. Further socio-economic factors may be taken into consideration. The number of parameters to be included may well appear to be prohibitive for developing a sensible model. The study introduces hierarchical partial order ranking (HPOR) that remedies this problem. By HPOR the original parameters are initially grouped based on their mutual connection and a set of meta-descriptors is derived representing the ranking corresponding to the single groups of descriptors, respectively. A second partial order ranking is carried out based on the meta-descriptors, the final ranking being disclosed though average ranks. An illustrative example on the prioritisation of polluted sites is given. - Hierarchical partial order ranking of polluted sites has been developed for prioritization based on a large number of parameters

  13. Nested and Hierarchical Archimax copulas

    KAUST Repository

    Hofert, Marius; Huser, Raphaë l; Prasad, Avinash

    2017-01-01

    The class of Archimax copulas is generalized to nested and hierarchical Archimax copulas in several ways. First, nested extreme-value copulas or nested stable tail dependence functions are introduced to construct nested Archimax copulas based on a single frailty variable. Second, a hierarchical construction of d-norm generators is presented to construct hierarchical stable tail dependence functions and thus hierarchical extreme-value copulas. Moreover, one can, by itself or additionally, introduce nested frailties to extend Archimax copulas to nested Archimax copulas in a similar way as nested Archimedean copulas extend Archimedean copulas. Further results include a general formula for the density of Archimax copulas.

  14. Nested and Hierarchical Archimax copulas

    KAUST Repository

    Hofert, Marius

    2017-07-03

    The class of Archimax copulas is generalized to nested and hierarchical Archimax copulas in several ways. First, nested extreme-value copulas or nested stable tail dependence functions are introduced to construct nested Archimax copulas based on a single frailty variable. Second, a hierarchical construction of d-norm generators is presented to construct hierarchical stable tail dependence functions and thus hierarchical extreme-value copulas. Moreover, one can, by itself or additionally, introduce nested frailties to extend Archimax copulas to nested Archimax copulas in a similar way as nested Archimedean copulas extend Archimedean copulas. Further results include a general formula for the density of Archimax copulas.

  15. Scale of association: hierarchical linear models and the measurement of ecological systems

    Science.gov (United States)

    Sean M. McMahon; Jeffrey M. Diez

    2007-01-01

    A fundamental challenge to understanding patterns in ecological systems lies in employing methods that can analyse, test and draw inference from measured associations between variables across scales. Hierarchical linear models (HLM) use advanced estimation algorithms to measure regression relationships and variance-covariance parameters in hierarchically structured...

  16. Multivariate statistics exercises and solutions

    CERN Document Server

    Härdle, Wolfgang Karl

    2015-01-01

    The authors present tools and concepts of multivariate data analysis by means of exercises and their solutions. The first part is devoted to graphical techniques. The second part deals with multivariate random variables and presents the derivation of estimators and tests for various practical situations. The last part introduces a wide variety of exercises in applied multivariate data analysis. The book demonstrates the application of simple calculus and basic multivariate methods in real life situations. It contains altogether more than 250 solved exercises which can assist a university teacher in setting up a modern multivariate analysis course. All computer-based exercises are available in the R language. All R codes and data sets may be downloaded via the quantlet download center  www.quantlet.org or via the Springer webpage. For interactive display of low-dimensional projections of a multivariate data set, we recommend GGobi.

  17. Applied Bayesian hierarchical methods

    National Research Council Canada - National Science Library

    Congdon, P

    2010-01-01

    .... It also incorporates BayesX code, which is particularly useful in nonlinear regression. To demonstrate MCMC sampling from first principles, the author includes worked examples using the R package...

  18. Determining Predictor Importance in Hierarchical Linear Models Using Dominance Analysis

    Science.gov (United States)

    Luo, Wen; Azen, Razia

    2013-01-01

    Dominance analysis (DA) is a method used to evaluate the relative importance of predictors that was originally proposed for linear regression models. This article proposes an extension of DA that allows researchers to determine the relative importance of predictors in hierarchical linear models (HLM). Commonly used measures of model adequacy in…

  19. Regression analysis by example

    CERN Document Server

    Chatterjee, Samprit

    2012-01-01

    Praise for the Fourth Edition: ""This book is . . . an excellent source of examples for regression analysis. It has been and still is readily readable and understandable."" -Journal of the American Statistical Association Regression analysis is a conceptually simple method for investigating relationships among variables. Carrying out a successful application of regression analysis, however, requires a balance of theoretical results, empirical rules, and subjective judgment. Regression Analysis by Example, Fifth Edition has been expanded

  20. Transmutations across hierarchical levels

    International Nuclear Information System (INIS)

    O'Neill, R.V.

    1977-01-01

    The development of large-scale ecological models depends implicitly on a concept known as hierarchy theory which views biological systems in a series of hierarchical levels (i.e., organism, population, trophic level, ecosystem). The theory states that an explanation of a biological phenomenon is provided when it is shown to be the consequence of the activities of the system's components, which are themselves systems in the next lower level of the hierarchy. Thus, the behavior of a population is explained by the behavior of the organisms in the population. The initial step in any modeling project is, therefore, to identify the system components and the interactions between them. A series of examples of transmutations in aquatic and terrestrial ecosystems are presented to show how and why changes occur. The types of changes are summarized and possible implications of transmutation for hierarchy theory, for the modeler, and for the ecological theoretician are discussed

  1. Trees and Hierarchical Structures

    CERN Document Server

    Haeseler, Arndt

    1990-01-01

    The "raison d'etre" of hierarchical dustering theory stems from one basic phe­ nomenon: This is the notorious non-transitivity of similarity relations. In spite of the fact that very often two objects may be quite similar to a third without being that similar to each other, one still wants to dassify objects according to their similarity. This should be achieved by grouping them into a hierarchy of non-overlapping dusters such that any two objects in ~ne duster appear to be more related to each other than they are to objects outside this duster. In everyday life, as well as in essentially every field of scientific investigation, there is an urge to reduce complexity by recognizing and establishing reasonable das­ sification schemes. Unfortunately, this is counterbalanced by the experience of seemingly unavoidable deadlocks caused by the existence of sequences of objects, each comparatively similar to the next, but the last rather different from the first.

  2. Optimisation by hierarchical search

    Science.gov (United States)

    Zintchenko, Ilia; Hastings, Matthew; Troyer, Matthias

    2015-03-01

    Finding optimal values for a set of variables relative to a cost function gives rise to some of the hardest problems in physics, computer science and applied mathematics. Although often very simple in their formulation, these problems have a complex cost function landscape which prevents currently known algorithms from efficiently finding the global optimum. Countless techniques have been proposed to partially circumvent this problem, but an efficient method is yet to be found. We present a heuristic, general purpose approach to potentially improve the performance of conventional algorithms or special purpose hardware devices by optimising groups of variables in a hierarchical way. We apply this approach to problems in combinatorial optimisation, machine learning and other fields.

  3. Processing data collected from radiometric experiments by multivariate technique

    International Nuclear Information System (INIS)

    Urbanski, P.; Kowalska, E.; Machaj, B.; Jakowiuk, A.

    2005-01-01

    Multivariate techniques applied for processing data collected from radiometric experiments can provide more efficient extraction of the information contained in the spectra. Several techniques are considered: (i) multivariate calibration using Partial Least Square Regression and Artificial Neural Network, (ii) standardization of the spectra, (iii) smoothing of collected spectra were autocorrelation function and bootstrap were used for the assessment of the processed data, (iv) image processing using Principal Component Analysis. Application of these techniques is illustrated on examples of some industrial applications. (author)

  4. MODELING SNAKE MICROHABITAT FROM RADIOTELEMETRY STUDIES USING POLYTOMOUS LOGISTIC REGRESSION

    Science.gov (United States)

    Multivariate analysis of snake microhabitat has historically used techniques that were derived under assumptions of normality and common covariance structure (e.g., discriminant function analysis, MANOVA). In this study, polytomous logistic regression (PLR which does not require ...

  5. Distributed Monitoring of the R2 Statistic for Linear Regression

    Data.gov (United States)

    National Aeronautics and Space Administration — The problem of monitoring a multivariate linear regression model is relevant in studying the evolving relationship between a set of input variables (features) and...

  6. How hierarchical is language use?

    Science.gov (United States)

    Frank, Stefan L.; Bod, Rens; Christiansen, Morten H.

    2012-01-01

    It is generally assumed that hierarchical phrase structure plays a central role in human language. However, considerations of simplicity and evolutionary continuity suggest that hierarchical structure should not be invoked too hastily. Indeed, recent neurophysiological, behavioural and computational studies show that sequential sentence structure has considerable explanatory power and that hierarchical processing is often not involved. In this paper, we review evidence from the recent literature supporting the hypothesis that sequential structure may be fundamental to the comprehension, production and acquisition of human language. Moreover, we provide a preliminary sketch outlining a non-hierarchical model of language use and discuss its implications and testable predictions. If linguistic phenomena can be explained by sequential rather than hierarchical structure, this will have considerable impact in a wide range of fields, such as linguistics, ethology, cognitive neuroscience, psychology and computer science. PMID:22977157

  7. Quantile Regression Methods

    DEFF Research Database (Denmark)

    Fitzenberger, Bernd; Wilke, Ralf Andreas

    2015-01-01

    if the mean regression model does not. We provide a short informal introduction into the principle of quantile regression which includes an illustrative application from empirical labor market research. This is followed by briefly sketching the underlying statistical model for linear quantile regression based......Quantile regression is emerging as a popular statistical approach, which complements the estimation of conditional mean models. While the latter only focuses on one aspect of the conditional distribution of the dependent variable, the mean, quantile regression provides more detailed insights...... by modeling conditional quantiles. Quantile regression can therefore detect whether the partial effect of a regressor on the conditional quantiles is the same for all quantiles or differs across quantiles. Quantile regression can provide evidence for a statistical relationship between two variables even...

  8. Multivariate analysis: models and method

    International Nuclear Information System (INIS)

    Sanz Perucha, J.

    1990-01-01

    Data treatment techniques are increasingly used since computer methods result of wider access. Multivariate analysis consists of a group of statistic methods that are applied to study objects or samples characterized by multiple values. A final goal is decision making. The paper describes the models and methods of multivariate analysis

  9. Model Checking Multivariate State Rewards

    DEFF Research Database (Denmark)

    Nielsen, Bo Friis; Nielson, Flemming; Nielson, Hanne Riis

    2010-01-01

    We consider continuous stochastic logics with state rewards that are interpreted over continuous time Markov chains. We show how results from multivariate phase type distributions can be used to obtain higher-order moments for multivariate state rewards (including covariance). We also generalise...

  10. Multivariate analysis methods in physics

    International Nuclear Information System (INIS)

    Wolter, M.

    2007-01-01

    A review of multivariate methods based on statistical training is given. Several multivariate methods useful in high-energy physics analysis are discussed. Selected examples from current research in particle physics are discussed, both from the on-line trigger selection and from the off-line analysis. Also statistical training methods are presented and some new application are suggested [ru

  11. Multivariate covariance generalized linear models

    DEFF Research Database (Denmark)

    Bonat, W. H.; Jørgensen, Bent

    2016-01-01

    are fitted by using an efficient Newton scoring algorithm based on quasi-likelihood and Pearson estimating functions, using only second-moment assumptions. This provides a unified approach to a wide variety of types of response variables and covariance structures, including multivariate extensions......We propose a general framework for non-normal multivariate data analysis called multivariate covariance generalized linear models, designed to handle multivariate response variables, along with a wide range of temporal and spatial correlation structures defined in terms of a covariance link...... function combined with a matrix linear predictor involving known matrices. The method is motivated by three data examples that are not easily handled by existing methods. The first example concerns multivariate count data, the second involves response variables of mixed types, combined with repeated...

  12. Leadership styles across hierarchical levels in nursing departments.

    Science.gov (United States)

    Stordeur, S; Vandenberghe, C; D'hoore, W

    2000-01-01

    Some researchers have reported on the cascading effect of transformational leadership across hierarchical levels. One study examined this effect in nursing, but it was limited to a single hospital. To examine the cascading effect of leadership styles across hierarchical levels in a sample of nursing departments and to investigate the effect of hierarchical level on the relationships between leadership styles and various work outcomes. Based on a sample of eight hospitals, the cascading effect was tested using correlation analysis. The main sources of variation among leadership scores were determined with analyses of variance (ANOVA), and the interaction effect of hierarchical level and leadership styles on criterion variables was tested with moderated regression analysis. No support was found for a cascading effect of leadership across hierarchical levels. Rather, the variation of leadership scores was explained primarily by the organizational context. Transformational leadership had a stronger impact on criterion variables than transactional leadership. Interaction effects between leadership styles and hierarchical level were observed only for perceived unit effectiveness. The hospital's structure and culture are major determinants of leadership styles.

  13. Growing hierarchical probabilistic self-organizing graphs.

    Science.gov (United States)

    López-Rubio, Ezequiel; Palomo, Esteban José

    2011-07-01

    Since the introduction of the growing hierarchical self-organizing map, much work has been done on self-organizing neural models with a dynamic structure. These models allow adjusting the layers of the model to the features of the input dataset. Here we propose a new self-organizing model which is based on a probabilistic mixture of multivariate Gaussian components. The learning rule is derived from the stochastic approximation framework, and a probabilistic criterion is used to control the growth of the model. Moreover, the model is able to adapt to the topology of each layer, so that a hierarchy of dynamic graphs is built. This overcomes the limitations of the self-organizing maps with a fixed topology, and gives rise to a faithful visualization method for high-dimensional data.

  14. Understanding logistic regression analysis

    OpenAIRE

    Sperandei, Sandro

    2014-01-01

    Logistic regression is used to obtain odds ratio in the presence of more than one explanatory variable. The procedure is quite similar to multiple linear regression, with the exception that the response variable is binomial. The result is the impact of each variable on the odds ratio of the observed event of interest. The main advantage is to avoid confounding effects by analyzing the association of all variables together. In this article, we explain the logistic regression procedure using ex...

  15. Introduction to regression graphics

    CERN Document Server

    Cook, R Dennis

    2009-01-01

    Covers the use of dynamic and interactive computer graphics in linear regression analysis, focusing on analytical graphics. Features new techniques like plot rotation. The authors have composed their own regression code, using Xlisp-Stat language called R-code, which is a nearly complete system for linear regression analysis and can be utilized as the main computer program in a linear regression course. The accompanying disks, for both Macintosh and Windows computers, contain the R-code and Xlisp-Stat. An Instructor's Manual presenting detailed solutions to all the problems in the book is ava

  16. Alternative Methods of Regression

    CERN Document Server

    Birkes, David

    2011-01-01

    Of related interest. Nonlinear Regression Analysis and its Applications Douglas M. Bates and Donald G. Watts ".an extraordinary presentation of concepts and methods concerning the use and analysis of nonlinear regression models.highly recommend[ed].for anyone needing to use and/or understand issues concerning the analysis of nonlinear regression models." --Technometrics This book provides a balance between theory and practice supported by extensive displays of instructive geometrical constructs. Numerous in-depth case studies illustrate the use of nonlinear regression analysis--with all data s

  17. A primer of multivariate statistics

    CERN Document Server

    Harris, Richard J

    2014-01-01

    Drawing upon more than 30 years of experience in working with statistics, Dr. Richard J. Harris has updated A Primer of Multivariate Statistics to provide a model of balance between how-to and why. This classic text covers multivariate techniques with a taste of latent variable approaches. Throughout the book there is a focus on the importance of describing and testing one's interpretations of the emergent variables that are produced by multivariate analysis. This edition retains its conversational writing style while focusing on classical techniques. The book gives the reader a feel for why

  18. Hierarchical Discriminant Analysis

    Directory of Open Access Journals (Sweden)

    Di Lu

    2018-01-01

    Full Text Available The Internet of Things (IoT generates lots of high-dimensional sensor intelligent data. The processing of high-dimensional data (e.g., data visualization and data classification is very difficult, so it requires excellent subspace learning algorithms to learn a latent subspace to preserve the intrinsic structure of the high-dimensional data, and abandon the least useful information in the subsequent processing. In this context, many subspace learning algorithms have been presented. However, in the process of transforming the high-dimensional data into the low-dimensional space, the huge difference between the sum of inter-class distance and the sum of intra-class distance for distinct data may cause a bias problem. That means that the impact of intra-class distance is overwhelmed. To address this problem, we propose a novel algorithm called Hierarchical Discriminant Analysis (HDA. It minimizes the sum of intra-class distance first, and then maximizes the sum of inter-class distance. This proposed method balances the bias from the inter-class and that from the intra-class to achieve better performance. Extensive experiments are conducted on several benchmark face datasets. The results reveal that HDA obtains better performance than other dimensionality reduction algorithms.

  19. Hierarchical Linked Views

    Energy Technology Data Exchange (ETDEWEB)

    Erbacher, Robert; Frincke, Deb

    2007-07-02

    Coordinated views have proven critical to the development of effective visualization environments. This results from the fact that a single view or representation of the data cannot show all of the intricacies of a given data set. Additionally, users will often need to correlate more data parameters than can effectively be integrated into a single visual display. Typically, development of multiple-linked views results in an adhoc configuration of views and associated interactions. The hierarchical model we are proposing is geared towards more effective organization of such environments and the views they encompass. At the same time, this model can effectively integrate much of the prior work on interactive and visual frameworks. Additionally, we expand the concept of views to incorporate perceptual views. This is related to the fact that visual displays can have information encoded at various levels of focus. Thus, a global view of the display provides overall trends of the data while focusing in on individual elements provides detailed specifics. By integrating interaction and perception into a single model, we show how one impacts the other. Typically, interaction and perception are considered separately, however, when interaction is being considered at a fundamental level and allowed to direct/modify the visualization directly we must consider them simultaneously and how they impact one another.

  20. Adaptive estimation of multivariate functions using conditionally Gaussian tensor-product spline priors

    NARCIS (Netherlands)

    Jonge, de R.; Zanten, van J.H.

    2012-01-01

    We investigate posterior contraction rates for priors on multivariate functions that are constructed using tensor-product B-spline expansions. We prove that using a hierarchical prior with an appropriate prior distribution on the partition size and Gaussian prior weights on the B-spline

  1. A multivariate decision tree analysis of biophysical factors in tropical forest fire occurrence

    Science.gov (United States)

    Rey S. Ofren; Edward Harvey

    2000-01-01

    A multivariate decision tree model was used to quantify the relative importance of complex hierarchical relationships between biophysical variables and the occurrence of tropical forest fires. The study site is the Huai Kha Kbaeng wildlife sanctuary, a World Heritage Site in northwestern Thailand where annual fires are common and particularly destructive. Thematic...

  2. Direct hierarchical assembly of nanoparticles

    Science.gov (United States)

    Xu, Ting; Zhao, Yue; Thorkelsson, Kari

    2014-07-22

    The present invention provides hierarchical assemblies of a block copolymer, a bifunctional linking compound and a nanoparticle. The block copolymers form one micro-domain and the nanoparticles another micro-domain.

  3. Hierarchical materials: Background and perspectives

    DEFF Research Database (Denmark)

    2016-01-01

    Hierarchical design draws inspiration from analysis of biological materials and has opened new possibilities for enhancing performance and enabling new functionalities and extraordinary properties. With the development of nanotechnology, the necessary technological requirements for the manufactur...

  4. Functional annotation of hierarchical modularity.

    Directory of Open Access Journals (Sweden)

    Kanchana Padmanabhan

    Full Text Available In biological networks of molecular interactions in a cell, network motifs that are biologically relevant are also functionally coherent, or form functional modules. These functionally coherent modules combine in a hierarchical manner into larger, less cohesive subsystems, thus revealing one of the essential design principles of system-level cellular organization and function-hierarchical modularity. Arguably, hierarchical modularity has not been explicitly taken into consideration by most, if not all, functional annotation systems. As a result, the existing methods would often fail to assign a statistically significant functional coherence score to biologically relevant molecular machines. We developed a methodology for hierarchical functional annotation. Given the hierarchical taxonomy of functional concepts (e.g., Gene Ontology and the association of individual genes or proteins with these concepts (e.g., GO terms, our method will assign a Hierarchical Modularity Score (HMS to each node in the hierarchy of functional modules; the HMS score and its p-value measure functional coherence of each module in the hierarchy. While existing methods annotate each module with a set of "enriched" functional terms in a bag of genes, our complementary method provides the hierarchical functional annotation of the modules and their hierarchically organized components. A hierarchical organization of functional modules often comes as a bi-product of cluster analysis of gene expression data or protein interaction data. Otherwise, our method will automatically build such a hierarchy by directly incorporating the functional taxonomy information into the hierarchy search process and by allowing multi-functional genes to be part of more than one component in the hierarchy. In addition, its underlying HMS scoring metric ensures that functional specificity of the terms across different levels of the hierarchical taxonomy is properly treated. We have evaluated our

  5. On directional multiple-output quantile regression

    Czech Academy of Sciences Publication Activity Database

    Paindaveine, D.; Šiman, Miroslav

    2011-01-01

    Roč. 102, č. 2 (2011), s. 193-212 ISSN 0047-259X R&D Projects: GA MŠk(CZ) 1M06047 Grant - others:Commision EC(BE) Fonds National de la Recherche Scientifique Institutional research plan: CEZ:AV0Z10750506 Keywords : multivariate quantile * quantile regression * multiple-output regression * halfspace depth * portfolio optimization * value-at risk Subject RIV: BA - General Mathematics Impact factor: 0.879, year: 2011 http://library.utia.cas.cz/separaty/2011/SI/siman-0364128.pdf

  6. Clinical, laboratory, and demographic determinants of hospitalization due to dengue in 7613 patients: A retrospective study based on hierarchical models.

    Science.gov (United States)

    da Silva, Natal Santos; Undurraga, Eduardo A; da Silva Ferreira, Elis Regina; Estofolete, Cássia Fernanda; Nogueira, Maurício Lacerda

    2018-01-01

    In Brazil, the incidence of hospitalization due to dengue, as an indicator of severity, has drastically increased since 1998. The objective of our study was to identify risk factors associated with subsequent hospitalization related to dengue. We analyzed 7613 dengue confirmed via serology (ELISA), non-structural protein 1, or polymerase chain reaction amplification. We used a hierarchical framework to generate a multivariate logistic regression based on a variety of risk variables. This was followed by multiple statistical analyses to assess hierarchical model accuracy, variance, goodness of fit, and whether or not this model reliably represented the population. The final model, which included age, sex, ethnicity, previous dengue infection, hemorrhagic manifestations, plasma leakage, and organ failure, showed that all measured parameters, with the exception of previous dengue, were statistically significant. The presence of organ failure was associated with the highest risk of subsequent dengue hospitalization (OR=5·75; CI=3·53-9·37). Therefore, plasma leakage and organ failure were the main indicators of hospitalization due to dengue, although other variables of minor importance should also be considered to refer dengue patients to hospital treatment, which may lead to a reduction in avoidable deaths as well as costs related to dengue. Copyright © 2017 Elsevier B.V. All rights reserved.

  7. Hierarchical architecture of active knits

    International Nuclear Information System (INIS)

    Abel, Julianna; Luntz, Jonathan; Brei, Diann

    2013-01-01

    Nature eloquently utilizes hierarchical structures to form the world around us. Applying the hierarchical architecture paradigm to smart materials can provide a basis for a new genre of actuators which produce complex actuation motions. One promising example of cellular architecture—active knits—provides complex three-dimensional distributed actuation motions with expanded operational performance through a hierarchically organized structure. The hierarchical structure arranges a single fiber of active material, such as shape memory alloys (SMAs), into a cellular network of interlacing adjacent loops according to a knitting grid. This paper defines a four-level hierarchical classification of knit structures: the basic knit loop, knit patterns, grid patterns, and restructured grids. Each level of the hierarchy provides increased architectural complexity, resulting in expanded kinematic actuation motions of active knits. The range of kinematic actuation motions are displayed through experimental examples of different SMA active knits. The results from this paper illustrate and classify the ways in which each level of the hierarchical knit architecture leverages the performance of the base smart material to generate unique actuation motions, providing necessary insight to best exploit this new actuation paradigm. (paper)

  8. Boosted beta regression.

    Directory of Open Access Journals (Sweden)

    Matthias Schmid

    Full Text Available Regression analysis with a bounded outcome is a common problem in applied statistics. Typical examples include regression models for percentage outcomes and the analysis of ratings that are measured on a bounded scale. In this paper, we consider beta regression, which is a generalization of logit models to situations where the response is continuous on the interval (0,1. Consequently, beta regression is a convenient tool for analyzing percentage responses. The classical approach to fit a beta regression model is to use maximum likelihood estimation with subsequent AIC-based variable selection. As an alternative to this established - yet unstable - approach, we propose a new estimation technique called boosted beta regression. With boosted beta regression estimation and variable selection can be carried out simultaneously in a highly efficient way. Additionally, both the mean and the variance of a percentage response can be modeled using flexible nonlinear covariate effects. As a consequence, the new method accounts for common problems such as overdispersion and non-binomial variance structures.

  9. Directional quantile regression in Octave (and MATLAB)

    Czech Academy of Sciences Publication Activity Database

    Boček, Pavel; Šiman, Miroslav

    2016-01-01

    Roč. 52, č. 1 (2016), s. 28-51 ISSN 0023-5954 R&D Projects: GA ČR GA14-07234S Institutional support: RVO:67985556 Keywords : quantile regression * multivariate quantile * depth contour * Matlab Subject RIV: IN - Informatics, Computer Science Impact factor: 0.379, year: 2016 http://library.utia.cas.cz/separaty/2016/SI/bocek-0458380.pdf

  10. Multivariate Statistical Process Control Charts: An Overview

    OpenAIRE

    Bersimis, Sotiris; Psarakis, Stelios; Panaretos, John

    2006-01-01

    In this paper we discuss the basic procedures for the implementation of multivariate statistical process control via control charting. Furthermore, we review multivariate extensions for all kinds of univariate control charts, such as multivariate Shewhart-type control charts, multivariate CUSUM control charts and multivariate EWMA control charts. In addition, we review unique procedures for the construction of multivariate control charts, based on multivariate statistical techniques such as p...

  11. Understanding logistic regression analysis.

    Science.gov (United States)

    Sperandei, Sandro

    2014-01-01

    Logistic regression is used to obtain odds ratio in the presence of more than one explanatory variable. The procedure is quite similar to multiple linear regression, with the exception that the response variable is binomial. The result is the impact of each variable on the odds ratio of the observed event of interest. The main advantage is to avoid confounding effects by analyzing the association of all variables together. In this article, we explain the logistic regression procedure using examples to make it as simple as possible. After definition of the technique, the basic interpretation of the results is highlighted and then some special issues are discussed.

  12. Applied linear regression

    CERN Document Server

    Weisberg, Sanford

    2013-01-01

    Praise for the Third Edition ""...this is an excellent book which could easily be used as a course text...""-International Statistical Institute The Fourth Edition of Applied Linear Regression provides a thorough update of the basic theory and methodology of linear regression modeling. Demonstrating the practical applications of linear regression analysis techniques, the Fourth Edition uses interesting, real-world exercises and examples. Stressing central concepts such as model building, understanding parameters, assessing fit and reliability, and drawing conclusions, the new edition illus

  13. Applied logistic regression

    CERN Document Server

    Hosmer, David W; Sturdivant, Rodney X

    2013-01-01

     A new edition of the definitive guide to logistic regression modeling for health science and other applications This thoroughly expanded Third Edition provides an easily accessible introduction to the logistic regression (LR) model and highlights the power of this model by examining the relationship between a dichotomous outcome and a set of covariables. Applied Logistic Regression, Third Edition emphasizes applications in the health sciences and handpicks topics that best suit the use of modern statistical software. The book provides readers with state-of-

  14. application of multilinear regression analysis in modeling of soil

    African Journals Online (AJOL)

    Windows User

    Accordingly [1, 3] in their work, they applied linear regression ... (MLRA) is a statistical technique that uses several explanatory ... order to check this, they adopted bivariate correlation analysis .... groups, namely A-1 through A-7, based on their relative expected ..... Multivariate Regression in Gorgan Province North of Iran” ...

  15. Models and Inference for Multivariate Spatial Extremes

    KAUST Repository

    Vettori, Sabrina

    2017-12-07

    The development of flexible and interpretable statistical methods is necessary in order to provide appropriate risk assessment measures for extreme events and natural disasters. In this thesis, we address this challenge by contributing to the developing research field of Extreme-Value Theory. We initially study the performance of existing parametric and non-parametric estimators of extremal dependence for multivariate maxima. As the dimensionality increases, non-parametric estimators are more flexible than parametric methods but present some loss in efficiency that we quantify under various scenarios. We introduce a statistical tool which imposes the required shape constraints on non-parametric estimators in high dimensions, significantly improving their performance. Furthermore, by embedding the tree-based max-stable nested logistic distribution in the Bayesian framework, we develop a statistical algorithm that identifies the most likely tree structures representing the data\\'s extremal dependence using the reversible jump Monte Carlo Markov Chain method. A mixture of these trees is then used for uncertainty assessment in prediction through Bayesian model averaging. The computational complexity of full likelihood inference is significantly decreased by deriving a recursive formula for the nested logistic model likelihood. The algorithm performance is verified through simulation experiments which also compare different likelihood procedures. Finally, we extend the nested logistic representation to the spatial framework in order to jointly model multivariate variables collected across a spatial region. This situation emerges often in environmental applications but is not often considered in the current literature. Simulation experiments show that the new class of multivariate max-stable processes is able to detect both the cross and inner spatial dependence of a number of extreme variables at a relatively low computational cost, thanks to its Bayesian hierarchical

  16. Multivariate Generalized Multiscale Entropy Analysis

    Directory of Open Access Journals (Sweden)

    Anne Humeau-Heurtier

    2016-11-01

    Full Text Available Multiscale entropy (MSE was introduced in the 2000s to quantify systems’ complexity. MSE relies on (i a coarse-graining procedure to derive a set of time series representing the system dynamics on different time scales; (ii the computation of the sample entropy for each coarse-grained time series. A refined composite MSE (rcMSE—based on the same steps as MSE—also exists. Compared to MSE, rcMSE increases the accuracy of entropy estimation and reduces the probability of inducing undefined entropy for short time series. The multivariate versions of MSE (MMSE and rcMSE (MrcMSE have also been introduced. In the coarse-graining step used in MSE, rcMSE, MMSE, and MrcMSE, the mean value is used to derive representations of the original data at different resolutions. A generalization of MSE was recently published, using the computation of different moments in the coarse-graining procedure. However, so far, this generalization only exists for univariate signals. We therefore herein propose an extension of this generalized MSE to multivariate data. The multivariate generalized algorithms of MMSE and MrcMSE presented herein (MGMSE and MGrcMSE, respectively are first analyzed through the processing of synthetic signals. We reveal that MGrcMSE shows better performance than MGMSE for short multivariate data. We then study the performance of MGrcMSE on two sets of short multivariate electroencephalograms (EEG available in the public domain. We report that MGrcMSE may show better performance than MrcMSE in distinguishing different types of multivariate EEG data. MGrcMSE could therefore supplement MMSE or MrcMSE in the processing of multivariate datasets.

  17. imDEV: a graphical user interface to R multivariate analysis tools in Microsoft Excel

    Science.gov (United States)

    Grapov, Dmitry; Newman, John W.

    2012-01-01

    Summary: Interactive modules for Data Exploration and Visualization (imDEV) is a Microsoft Excel spreadsheet embedded application providing an integrated environment for the analysis of omics data through a user-friendly interface. Individual modules enables interactive and dynamic analyses of large data by interfacing R's multivariate statistics and highly customizable visualizations with the spreadsheet environment, aiding robust inferences and generating information-rich data visualizations. This tool provides access to multiple comparisons with false discovery correction, hierarchical clustering, principal and independent component analyses, partial least squares regression and discriminant analysis, through an intuitive interface for creating high-quality two- and a three-dimensional visualizations including scatter plot matrices, distribution plots, dendrograms, heat maps, biplots, trellis biplots and correlation networks. Availability and implementation: Freely available for download at http://sourceforge.net/projects/imdev/. Implemented in R and VBA and supported by Microsoft Excel (2003, 2007 and 2010). Contact: John.Newman@ars.usda.gov Supplementary Information: Installation instructions, tutorials and users manual are available at http://sourceforge.net/projects/imdev/. PMID:22815358

  18. On the Optimality of Multivariate S-Estimators

    NARCIS (Netherlands)

    Croux, C.; Dehon, C.; Yadine, A.

    2010-01-01

    In this paper we maximize the efficiency of a multivariate S-estimator under a constraint on the breakdown point. In the linear regression model, it is known that the highest possible efficiency of a maximum breakdown S-estimator is bounded above by 33% for Gaussian errors. We prove the surprising

  19. Understanding poisson regression.

    Science.gov (United States)

    Hayat, Matthew J; Higgins, Melinda

    2014-04-01

    Nurse investigators often collect study data in the form of counts. Traditional methods of data analysis have historically approached analysis of count data either as if the count data were continuous and normally distributed or with dichotomization of the counts into the categories of occurred or did not occur. These outdated methods for analyzing count data have been replaced with more appropriate statistical methods that make use of the Poisson probability distribution, which is useful for analyzing count data. The purpose of this article is to provide an overview of the Poisson distribution and its use in Poisson regression. Assumption violations for the standard Poisson regression model are addressed with alternative approaches, including addition of an overdispersion parameter or negative binomial regression. An illustrative example is presented with an application from the ENSPIRE study, and regression modeling of comorbidity data is included for illustrative purposes. Copyright 2014, SLACK Incorporated.

  20. Advances in Applications of Hierarchical Bayesian Methods with Hydrological Models

    Science.gov (United States)

    Alexander, R. B.; Schwarz, G. E.; Boyer, E. W.

    2017-12-01

    Mechanistic and empirical watershed models are increasingly used to inform water resource decisions. Growing access to historical stream measurements and data from in-situ sensor technologies has increased the need for improved techniques for coupling models with hydrological measurements. Techniques that account for the intrinsic uncertainties of both models and measurements are especially needed. Hierarchical Bayesian methods provide an efficient modeling tool for quantifying model and prediction uncertainties, including those associated with measurements. Hierarchical methods can also be used to explore spatial and temporal variations in model parameters and uncertainties that are informed by hydrological measurements. We used hierarchical Bayesian methods to develop a hybrid (statistical-mechanistic) SPARROW (SPAtially Referenced Regression On Watershed attributes) model of long-term mean annual streamflow across diverse environmental and climatic drainages in 18 U.S. hydrological regions. Our application illustrates the use of a new generation of Bayesian methods that offer more advanced computational efficiencies than the prior generation. Evaluations of the effects of hierarchical (regional) variations in model coefficients and uncertainties on model accuracy indicates improved prediction accuracies (median of 10-50%) but primarily in humid eastern regions, where model uncertainties are one-third of those in arid western regions. Generally moderate regional variability is observed for most hierarchical coefficients. Accounting for measurement and structural uncertainties, using hierarchical state-space techniques, revealed the effects of spatially-heterogeneous, latent hydrological processes in the "localized" drainages between calibration sites; this improved model precision, with only minor changes in regional coefficients. Our study can inform advances in the use of hierarchical methods with hydrological models to improve their integration with stream

  1. EXPLORATORY DATA ANALYSIS AND MULTIVARIATE STRATEGIES FOR REVEALING MULTIVARIATE STRUCTURES IN CLIMATE DATA

    Directory of Open Access Journals (Sweden)

    2016-12-01

    Full Text Available This paper is on data analysis strategy in a complex, multidimensional, and dynamic domain. The focus is on the use of data mining techniques to explore the importance of multivariate structures; using climate variables which influences climate change. Techniques involved in data mining exercise vary according to the data structures. The multivariate analysis strategy considered here involved choosing an appropriate tool to analyze a process. Factor analysis is introduced into data mining technique in order to reveal the influencing impacts of factors involved as well as solving for multicolinearity effect among the variables. The temporal nature and multidimensionality of the target variables is revealed in the model using multidimensional regression estimates. The strategy of integrating the method of several statistical techniques, using climate variables in Nigeria was employed. R2 of 0.518 was obtained from the ordinary least square regression analysis carried out and the test was not significant at 5% level of significance. However, factor analysis regression strategy gave a good fit with R2 of 0.811 and the test was significant at 5% level of significance. Based on this study, model building should go beyond the usual confirmatory data analysis (CDA, rather it should be complemented with exploratory data analysis (EDA in order to achieve a desired result.

  2. Bayesian Modeling of Air Pollution Extremes Using Nested Multivariate Max-Stable Processes

    KAUST Repository

    Vettori, Sabrina; Huser, Raphaë l; Genton, Marc G.

    2018-01-01

    Capturing the potentially strong dependence among the peak concentrations of multiple air pollutants across a spatial region is crucial for assessing the related public health risks. In order to investigate the multivariate spatial dependence properties of air pollution extremes, we introduce a new class of multivariate max-stable processes. Our proposed model admits a hierarchical tree-based formulation, in which the data are conditionally independent given some latent nested $\\alpha$-stable random factors. The hierarchical structure facilitates Bayesian inference and offers a convenient and interpretable characterization. We fit this nested multivariate max-stable model to the maxima of air pollution concentrations and temperatures recorded at a number of sites in the Los Angeles area, showing that the proposed model succeeds in capturing their complex tail dependence structure.

  3. Bayesian Modeling of Air Pollution Extremes Using Nested Multivariate Max-Stable Processes

    KAUST Repository

    Vettori, Sabrina

    2018-03-18

    Capturing the potentially strong dependence among the peak concentrations of multiple air pollutants across a spatial region is crucial for assessing the related public health risks. In order to investigate the multivariate spatial dependence properties of air pollution extremes, we introduce a new class of multivariate max-stable processes. Our proposed model admits a hierarchical tree-based formulation, in which the data are conditionally independent given some latent nested $\\\\alpha$-stable random factors. The hierarchical structure facilitates Bayesian inference and offers a convenient and interpretable characterization. We fit this nested multivariate max-stable model to the maxima of air pollution concentrations and temperatures recorded at a number of sites in the Los Angeles area, showing that the proposed model succeeds in capturing their complex tail dependence structure.

  4. Multivariate stochastic simulation with subjective multivariate normal distributions

    Science.gov (United States)

    P. J. Ince; J. Buongiorno

    1991-01-01

    In many applications of Monte Carlo simulation in forestry or forest products, it may be known that some variables are correlated. However, for simplicity, in most simulations it has been assumed that random variables are independently distributed. This report describes an alternative Monte Carlo simulation technique for subjectively assesed multivariate normal...

  5. Multivariate Matrix-Exponential Distributions

    DEFF Research Database (Denmark)

    Bladt, Mogens; Nielsen, Bo Friis

    2010-01-01

    be written as linear combinations of the elements in the exponential of a matrix. For this reason we shall refer to multivariate distributions with rational Laplace transform as multivariate matrix-exponential distributions (MVME). The marginal distributions of an MVME are univariate matrix......-exponential distributions. We prove a characterization that states that a distribution is an MVME distribution if and only if all non-negative, non-null linear combinations of the coordinates have a univariate matrix-exponential distribution. This theorem is analog to a well-known characterization theorem...

  6. Local bilinear multiple-output quantile/depth regression

    Czech Academy of Sciences Publication Activity Database

    Hallin, M.; Lu, Z.; Paindaveine, D.; Šiman, Miroslav

    2015-01-01

    Roč. 21, č. 3 (2015), s. 1435-1466 ISSN 1350-7265 R&D Projects: GA MŠk(CZ) 1M06047 Institutional support: RVO:67985556 Keywords : conditional depth * growth chart * halfspace depth * local bilinear regression * multivariate quantile * quantile regression * regression depth Subject RIV: BA - General Mathematics Impact factor: 1.372, year: 2015 http://library.utia.cas.cz/separaty/2015/SI/siman-0446857.pdf

  7. Mixture of Regression Models with Single-Index

    OpenAIRE

    Xiang, Sijia; Yao, Weixin

    2016-01-01

    In this article, we propose a class of semiparametric mixture regression models with single-index. We argue that many recently proposed semiparametric/nonparametric mixture regression models can be considered special cases of the proposed model. However, unlike existing semiparametric mixture regression models, the new pro- posed model can easily incorporate multivariate predictors into the nonparametric components. Backfitting estimates and the corresponding algorithms have been proposed for...

  8. Multicollinearity and Regression Analysis

    Science.gov (United States)

    Daoud, Jamal I.

    2017-12-01

    In regression analysis it is obvious to have a correlation between the response and predictor(s), but having correlation among predictors is something undesired. The number of predictors included in the regression model depends on many factors among which, historical data, experience, etc. At the end selection of most important predictors is something objective due to the researcher. Multicollinearity is a phenomena when two or more predictors are correlated, if this happens, the standard error of the coefficients will increase [8]. Increased standard errors means that the coefficients for some or all independent variables may be found to be significantly different from In other words, by overinflating the standard errors, multicollinearity makes some variables statistically insignificant when they should be significant. In this paper we focus on the multicollinearity, reasons and consequences on the reliability of the regression model.

  9. The Multivariate Gaussian Probability Distribution

    DEFF Research Database (Denmark)

    Ahrendt, Peter

    2005-01-01

    This technical report intends to gather information about the multivariate gaussian distribution, that was previously not (at least to my knowledge) to be found in one place and written as a reference manual. Additionally, some useful tips and tricks are collected that may be useful in practical ...

  10. A "Model" Multivariable Calculus Course.

    Science.gov (United States)

    Beckmann, Charlene E.; Schlicker, Steven J.

    1999-01-01

    Describes a rich, investigative approach to multivariable calculus. Introduces a project in which students construct physical models of surfaces that represent real-life applications of their choice. The models, along with student-selected datasets, serve as vehicles to study most of the concepts of the course from both continuous and discrete…

  11. Minimax Regression Quantiles

    DEFF Research Database (Denmark)

    Bache, Stefan Holst

    A new and alternative quantile regression estimator is developed and it is shown that the estimator is root n-consistent and asymptotically normal. The estimator is based on a minimax ‘deviance function’ and has asymptotically equivalent properties to the usual quantile regression estimator. It is......, however, a different and therefore new estimator. It allows for both linear- and nonlinear model specifications. A simple algorithm for computing the estimates is proposed. It seems to work quite well in practice but whether it has theoretical justification is still an open question....

  12. riskRegression

    DEFF Research Database (Denmark)

    Ozenne, Brice; Sørensen, Anne Lyngholm; Scheike, Thomas

    2017-01-01

    In the presence of competing risks a prediction of the time-dynamic absolute risk of an event can be based on cause-specific Cox regression models for the event and the competing risks (Benichou and Gail, 1990). We present computationally fast and memory optimized C++ functions with an R interface...... for predicting the covariate specific absolute risks, their confidence intervals, and their confidence bands based on right censored time to event data. We provide explicit formulas for our implementation of the estimator of the (stratified) baseline hazard function in the presence of tied event times. As a by...... functionals. The software presented here is implemented in the riskRegression package....

  13. Deliberate change without hierarchical influence?

    DEFF Research Database (Denmark)

    Nørskov, Sladjana; Kesting, Peter; Ulhøi, John Parm

    2017-01-01

    reveals that deliberate change is indeed achievable in a non-hierarchical collaborative OSS community context. However, it presupposes the presence and active involvement of informal change agents. The paper identifies and specifies four key drivers for change agents’ influence. Originality....../value The findings contribute to organisational analysis by providing a deeper understanding of the importance of leadership in making deliberate change possible in non-hierarchical settings. It points to the importance of “change-by-conviction”, essentially based on voluntary behaviour. This can open the door...

  14. Multiple linear regression analysis

    Science.gov (United States)

    Edwards, T. R.

    1980-01-01

    Program rapidly selects best-suited set of coefficients. User supplies only vectors of independent and dependent data and specifies confidence level required. Program uses stepwise statistical procedure for relating minimal set of variables to set of observations; final regression contains only most statistically significant coefficients. Program is written in FORTRAN IV for batch execution and has been implemented on NOVA 1200.

  15. Bayesian logistic regression analysis

    NARCIS (Netherlands)

    Van Erp, H.R.N.; Van Gelder, P.H.A.J.M.

    2012-01-01

    In this paper we present a Bayesian logistic regression analysis. It is found that if one wishes to derive the posterior distribution of the probability of some event, then, together with the traditional Bayes Theorem and the integrating out of nuissance parameters, the Jacobian transformation is an

  16. Linear Regression Analysis

    CERN Document Server

    Seber, George A F

    2012-01-01

    Concise, mathematically clear, and comprehensive treatment of the subject.* Expanded coverage of diagnostics and methods of model fitting.* Requires no specialized knowledge beyond a good grasp of matrix algebra and some acquaintance with straight-line regression and simple analysis of variance models.* More than 200 problems throughout the book plus outline solutions for the exercises.* This revision has been extensively class-tested.

  17. Nonlinear Regression with R

    CERN Document Server

    Ritz, Christian; Parmigiani, Giovanni

    2009-01-01

    R is a rapidly evolving lingua franca of graphical display and statistical analysis of experiments from the applied sciences. This book provides a coherent treatment of nonlinear regression with R by means of examples from a diversity of applied sciences such as biology, chemistry, engineering, medicine and toxicology.

  18. Bayesian ARTMAP for regression.

    Science.gov (United States)

    Sasu, L M; Andonie, R

    2013-10-01

    Bayesian ARTMAP (BA) is a recently introduced neural architecture which uses a combination of Fuzzy ARTMAP competitive learning and Bayesian learning. Training is generally performed online, in a single-epoch. During training, BA creates input data clusters as Gaussian categories, and also infers the conditional probabilities between input patterns and categories, and between categories and classes. During prediction, BA uses Bayesian posterior probability estimation. So far, BA was used only for classification. The goal of this paper is to analyze the efficiency of BA for regression problems. Our contributions are: (i) we generalize the BA algorithm using the clustering functionality of both ART modules, and name it BA for Regression (BAR); (ii) we prove that BAR is a universal approximator with the best approximation property. In other words, BAR approximates arbitrarily well any continuous function (universal approximation) and, for every given continuous function, there is one in the set of BAR approximators situated at minimum distance (best approximation); (iii) we experimentally compare the online trained BAR with several neural models, on the following standard regression benchmarks: CPU Computer Hardware, Boston Housing, Wisconsin Breast Cancer, and Communities and Crime. Our results show that BAR is an appropriate tool for regression tasks, both for theoretical and practical reasons. Copyright © 2013 Elsevier Ltd. All rights reserved.

  19. Bounded Gaussian process regression

    DEFF Research Database (Denmark)

    Jensen, Bjørn Sand; Nielsen, Jens Brehm; Larsen, Jan

    2013-01-01

    We extend the Gaussian process (GP) framework for bounded regression by introducing two bounded likelihood functions that model the noise on the dependent variable explicitly. This is fundamentally different from the implicit noise assumption in the previously suggested warped GP framework. We...... with the proposed explicit noise-model extension....

  20. and Multinomial Logistic Regression

    African Journals Online (AJOL)

    This work presented the results of an experimental comparison of two models: Multinomial Logistic Regression (MLR) and Artificial Neural Network (ANN) for classifying students based on their academic performance. The predictive accuracy for each model was measured by their average Classification Correct Rate (CCR).

  1. Mechanisms of neuroblastoma regression

    Science.gov (United States)

    Brodeur, Garrett M.; Bagatell, Rochelle

    2014-01-01

    Recent genomic and biological studies of neuroblastoma have shed light on the dramatic heterogeneity in the clinical behaviour of this disease, which spans from spontaneous regression or differentiation in some patients, to relentless disease progression in others, despite intensive multimodality therapy. This evidence also suggests several possible mechanisms to explain the phenomena of spontaneous regression in neuroblastomas, including neurotrophin deprivation, humoral or cellular immunity, loss of telomerase activity and alterations in epigenetic regulation. A better understanding of the mechanisms of spontaneous regression might help to identify optimal therapeutic approaches for patients with these tumours. Currently, the most druggable mechanism is the delayed activation of developmentally programmed cell death regulated by the tropomyosin receptor kinase A pathway. Indeed, targeted therapy aimed at inhibiting neurotrophin receptors might be used in lieu of conventional chemotherapy or radiation in infants with biologically favourable tumours that require treatment. Alternative approaches consist of breaking immune tolerance to tumour antigens or activating neurotrophin receptor pathways to induce neuronal differentiation. These approaches are likely to be most effective against biologically favourable tumours, but they might also provide insights into treatment of biologically unfavourable tumours. We describe the different mechanisms of spontaneous neuroblastoma regression and the consequent therapeutic approaches. PMID:25331179

  2. Bayesian nonlinear regression for large small problems

    KAUST Repository

    Chakraborty, Sounak; Ghosh, Malay; Mallick, Bani K.

    2012-01-01

    Statistical modeling and inference problems with sample sizes substantially smaller than the number of available covariates are challenging. This is known as large p small n problem. Furthermore, the problem is more complicated when we have multiple correlated responses. We develop multivariate nonlinear regression models in this setup for accurate prediction. In this paper, we introduce a full Bayesian support vector regression model with Vapnik's ε-insensitive loss function, based on reproducing kernel Hilbert spaces (RKHS) under the multivariate correlated response setup. This provides a full probabilistic description of support vector machine (SVM) rather than an algorithm for fitting purposes. We have also introduced a multivariate version of the relevance vector machine (RVM). Instead of the original treatment of the RVM relying on the use of type II maximum likelihood estimates of the hyper-parameters, we put a prior on the hyper-parameters and use Markov chain Monte Carlo technique for computation. We have also proposed an empirical Bayes method for our RVM and SVM. Our methods are illustrated with a prediction problem in the near-infrared (NIR) spectroscopy. A simulation study is also undertaken to check the prediction accuracy of our models. © 2012 Elsevier Inc.

  3. Bayesian nonlinear regression for large small problems

    KAUST Repository

    Chakraborty, Sounak

    2012-07-01

    Statistical modeling and inference problems with sample sizes substantially smaller than the number of available covariates are challenging. This is known as large p small n problem. Furthermore, the problem is more complicated when we have multiple correlated responses. We develop multivariate nonlinear regression models in this setup for accurate prediction. In this paper, we introduce a full Bayesian support vector regression model with Vapnik\\'s ε-insensitive loss function, based on reproducing kernel Hilbert spaces (RKHS) under the multivariate correlated response setup. This provides a full probabilistic description of support vector machine (SVM) rather than an algorithm for fitting purposes. We have also introduced a multivariate version of the relevance vector machine (RVM). Instead of the original treatment of the RVM relying on the use of type II maximum likelihood estimates of the hyper-parameters, we put a prior on the hyper-parameters and use Markov chain Monte Carlo technique for computation. We have also proposed an empirical Bayes method for our RVM and SVM. Our methods are illustrated with a prediction problem in the near-infrared (NIR) spectroscopy. A simulation study is also undertaken to check the prediction accuracy of our models. © 2012 Elsevier Inc.

  4. Robust methods for multivariate data analysis A1

    DEFF Research Database (Denmark)

    Frosch, Stina; Von Frese, J.; Bro, Rasmus

    2005-01-01

    Outliers may hamper proper classical multivariate analysis, and lead to incorrect conclusions. To remedy the problem of outliers, robust methods are developed in statistics and chemometrics. Robust methods reduce or remove the effect of outlying data points and allow the ?good? data to primarily...... determine the result. This article reviews the most commonly used robust multivariate regression and exploratory methods that have appeared since 1996 in the field of chemometrics. Special emphasis is put on the robust versions of chemometric standard tools like PCA and PLS and the corresponding robust...

  5. Descriptor Learning via Supervised Manifold Regularization for Multioutput Regression.

    Science.gov (United States)

    Zhen, Xiantong; Yu, Mengyang; Islam, Ali; Bhaduri, Mousumi; Chan, Ian; Li, Shuo

    2017-09-01

    Multioutput regression has recently shown great ability to solve challenging problems in both computer vision and medical image analysis. However, due to the huge image variability and ambiguity, it is fundamentally challenging to handle the highly complex input-target relationship of multioutput regression, especially with indiscriminate high-dimensional representations. In this paper, we propose a novel supervised descriptor learning (SDL) algorithm for multioutput regression, which can establish discriminative and compact feature representations to improve the multivariate estimation performance. The SDL is formulated as generalized low-rank approximations of matrices with a supervised manifold regularization. The SDL is able to simultaneously extract discriminative features closely related to multivariate targets and remove irrelevant and redundant information by transforming raw features into a new low-dimensional space aligned to targets. The achieved discriminative while compact descriptor largely reduces the variability and ambiguity for multioutput regression, which enables more accurate and efficient multivariate estimation. We conduct extensive evaluation of the proposed SDL on both synthetic data and real-world multioutput regression tasks for both computer vision and medical image analysis. Experimental results have shown that the proposed SDL can achieve high multivariate estimation accuracy on all tasks and largely outperforms the algorithms in the state of the arts. Our method establishes a novel SDL framework for multioutput regression, which can be widely used to boost the performance in different applications.

  6. Modular networks with hierarchical organization

    Indian Academy of Sciences (India)

    Several networks occurring in real life have modular structures that are arranged in a hierarchical fashion. In this paper, we have proposed a model for such networks, using a stochastic generation method. Using this model we show that, the scaling relation between the clustering and degree of the nodes is not a necessary ...

  7. Hierarchical Microaggressions in Higher Education

    Science.gov (United States)

    Young, Kathryn; Anderson, Myron; Stewart, Saran

    2015-01-01

    Although there has been substantial research examining the effects of microaggressions in the public sphere, there has been little research that examines microaggressions in the workplace. This study explores the types of microaggressions that affect employees at universities. We coin the term "hierarchical microaggression" to represent…

  8. Online Monitoring of Copper Damascene Electroplating Bath by Voltammetry: Selection of Variables for Multiblock and Hierarchical Chemometric Analysis of Voltammetric Data

    Directory of Open Access Journals (Sweden)

    Aleksander Jaworski

    2017-01-01

    Full Text Available The Real Time Analyzer (RTA utilizing DC- and AC-voltammetric techniques is an in situ, online monitoring system that provides a complete chemical analysis of different electrochemical deposition solutions. The RTA employs multivariate calibration when predicting concentration parameters from a multivariate data set. Although the hierarchical and multiblock Principal Component Regression- (PCR- and Partial Least Squares- (PLS- based methods can handle data sets even when the number of variables significantly exceeds the number of samples, it can be advantageous to reduce the number of variables to obtain improvement of the model predictions and better interpretation. This presentation focuses on the introduction of a multistep, rigorous method of data-selection-based Least Squares Regression, Simple Modeling of Class Analogy modeling power, and, as a novel application in electroanalysis, Uninformative Variable Elimination by PLS and by PCR, Variable Importance in the Projection coupled with PLS, Interval PLS, Interval PCR, and Moving Window PLS. Selection criteria of the optimum decomposition technique for the specific data are also demonstrated. The chief goal of this paper is to introduce to the community of electroanalytical chemists numerous variable selection methods which are well established in spectroscopy and can be successfully applied to voltammetric data analysis.

  9. Ridge Regression Signal Processing

    Science.gov (United States)

    Kuhl, Mark R.

    1990-01-01

    The introduction of the Global Positioning System (GPS) into the National Airspace System (NAS) necessitates the development of Receiver Autonomous Integrity Monitoring (RAIM) techniques. In order to guarantee a certain level of integrity, a thorough understanding of modern estimation techniques applied to navigational problems is required. The extended Kalman filter (EKF) is derived and analyzed under poor geometry conditions. It was found that the performance of the EKF is difficult to predict, since the EKF is designed for a Gaussian environment. A novel approach is implemented which incorporates ridge regression to explain the behavior of an EKF in the presence of dynamics under poor geometry conditions. The basic principles of ridge regression theory are presented, followed by the derivation of a linearized recursive ridge estimator. Computer simulations are performed to confirm the underlying theory and to provide a comparative analysis of the EKF and the recursive ridge estimator.

  10. Subset selection in regression

    CERN Document Server

    Miller, Alan

    2002-01-01

    Originally published in 1990, the first edition of Subset Selection in Regression filled a significant gap in the literature, and its critical and popular success has continued for more than a decade. Thoroughly revised to reflect progress in theory, methods, and computing power, the second edition promises to continue that tradition. The author has thoroughly updated each chapter, incorporated new material on recent developments, and included more examples and references. New in the Second Edition:A separate chapter on Bayesian methodsComplete revision of the chapter on estimationA major example from the field of near infrared spectroscopyMore emphasis on cross-validationGreater focus on bootstrappingStochastic algorithms for finding good subsets from large numbers of predictors when an exhaustive search is not feasible Software available on the Internet for implementing many of the algorithms presentedMore examplesSubset Selection in Regression, Second Edition remains dedicated to the techniques for fitting...

  11. On generalized elliptical quantiles in the nonlinear quantile regression setup

    Czech Academy of Sciences Publication Activity Database

    Hlubinka, D.; Šiman, Miroslav

    2015-01-01

    Roč. 24, č. 2 (2015), s. 249-264 ISSN 1133-0686 R&D Projects: GA ČR GA14-07234S Institutional support: RVO:67985556 Keywords : multivariate quantile * elliptical quantile * quantile regression * multivariate statistical inference * portfolio optimization Subject RIV: BA - General Mathematics Impact factor: 1.207, year: 2015 http://library.utia.cas.cz/separaty/2014/SI/siman-0434510.pdf

  12. Better Autologistic Regression

    Directory of Open Access Journals (Sweden)

    Mark A. Wolters

    2017-11-01

    Full Text Available Autologistic regression is an important probability model for dichotomous random variables observed along with covariate information. It has been used in various fields for analyzing binary data possessing spatial or network structure. The model can be viewed as an extension of the autologistic model (also known as the Ising model, quadratic exponential binary distribution, or Boltzmann machine to include covariates. It can also be viewed as an extension of logistic regression to handle responses that are not independent. Not all authors use exactly the same form of the autologistic regression model. Variations of the model differ in two respects. First, the variable coding—the two numbers used to represent the two possible states of the variables—might differ. Common coding choices are (zero, one and (minus one, plus one. Second, the model might appear in either of two algebraic forms: a standard form, or a recently proposed centered form. Little attention has been paid to the effect of these differences, and the literature shows ambiguity about their importance. It is shown here that changes to either coding or centering in fact produce distinct, non-nested probability models. Theoretical results, numerical studies, and analysis of an ecological data set all show that the differences among the models can be large and practically significant. Understanding the nature of the differences and making appropriate modeling choices can lead to significantly improved autologistic regression analyses. The results strongly suggest that the standard model with plus/minus coding, which we call the symmetric autologistic model, is the most natural choice among the autologistic variants.

  13. Regression in organizational leadership.

    Science.gov (United States)

    Kernberg, O F

    1979-02-01

    The choice of good leaders is a major task for all organizations. Inforamtion regarding the prospective administrator's personality should complement questions regarding his previous experience, his general conceptual skills, his technical knowledge, and the specific skills in the area for which he is being selected. The growing psychoanalytic knowledge about the crucial importance of internal, in contrast to external, object relations, and about the mutual relationships of regression in individuals and in groups, constitutes an important practical tool for the selection of leaders.

  14. Classification and regression trees

    CERN Document Server

    Breiman, Leo; Olshen, Richard A; Stone, Charles J

    1984-01-01

    The methodology used to construct tree structured rules is the focus of this monograph. Unlike many other statistical procedures, which moved from pencil and paper to calculators, this text's use of trees was unthinkable before computers. Both the practical and theoretical sides have been developed in the authors' study of tree methods. Classification and Regression Trees reflects these two sides, covering the use of trees as a data analysis method, and in a more mathematical framework, proving some of their fundamental properties.

  15. Logistic regression models

    CERN Document Server

    Hilbe, Joseph M

    2009-01-01

    This book really does cover everything you ever wanted to know about logistic regression … with updates available on the author's website. Hilbe, a former national athletics champion, philosopher, and expert in astronomy, is a master at explaining statistical concepts and methods. Readers familiar with his other expository work will know what to expect-great clarity.The book provides considerable detail about all facets of logistic regression. No step of an argument is omitted so that the book will meet the needs of the reader who likes to see everything spelt out, while a person familiar with some of the topics has the option to skip "obvious" sections. The material has been thoroughly road-tested through classroom and web-based teaching. … The focus is on helping the reader to learn and understand logistic regression. The audience is not just students meeting the topic for the first time, but also experienced users. I believe the book really does meet the author's goal … .-Annette J. Dobson, Biometric...

  16. Use of multivariate extensions of generalized linear models in the analysis of data from clinical trials

    OpenAIRE

    ALONSO ABAD, Ariel; Rodriguez, O.; TIBALDI, Fabian; CORTINAS ABRAHANTES, Jose

    2002-01-01

    In medical studies the categorical endpoints are quite often. Even though nowadays some models for handling this multicategorical variables have been developed their use is not common. This work shows an application of the Multivariate Generalized Linear Models to the analysis of Clinical Trials data. After a theoretical introduction models for ordinal and nominal responses are applied and the main results are discussed. multivariate analysis; multivariate logistic regression; multicategor...

  17. Hierarchically structured, nitrogen-doped carbon membranes

    KAUST Repository

    Wang, Hong; Wu, Tao

    2017-01-01

    The present invention is a structure, method of making and method of use for a novel macroscopic hierarchically structured, nitrogen-doped, nano-porous carbon membrane (HNDCMs) with asymmetric and hierarchical pore architecture that can be produced

  18. Linear models of coregionalization for multivariate lattice data: Order-dependent and order-free cMCARs.

    Science.gov (United States)

    MacNab, Ying C

    2016-08-01

    This paper concerns with multivariate conditional autoregressive models defined by linear combination of independent or correlated underlying spatial processes. Known as linear models of coregionalization, the method offers a systematic and unified approach for formulating multivariate extensions to a broad range of univariate conditional autoregressive models. The resulting multivariate spatial models represent classes of coregionalized multivariate conditional autoregressive models that enable flexible modelling of multivariate spatial interactions, yielding coregionalization models with symmetric or asymmetric cross-covariances of different spatial variation and smoothness. In the context of multivariate disease mapping, for example, they facilitate borrowing strength both over space and cross variables, allowing for more flexible multivariate spatial smoothing. Specifically, we present a broadened coregionalization framework to include order-dependent, order-free, and order-robust multivariate models; a new class of order-free coregionalized multivariate conditional autoregressives is introduced. We tackle computational challenges and present solutions that are integral for Bayesian analysis of these models. We also discuss two ways of computing deviance information criterion for comparison among competing hierarchical models with or without unidentifiable prior parameters. The models and related methodology are developed in the broad context of modelling multivariate data on spatial lattice and illustrated in the context of multivariate disease mapping. The coregionalization framework and related methods also present a general approach for building spatially structured cross-covariance functions for multivariate geostatistics. © The Author(s) 2016.

  19. Steganalysis using logistic regression

    Science.gov (United States)

    Lubenko, Ivans; Ker, Andrew D.

    2011-02-01

    We advocate Logistic Regression (LR) as an alternative to the Support Vector Machine (SVM) classifiers commonly used in steganalysis. LR offers more information than traditional SVM methods - it estimates class probabilities as well as providing a simple classification - and can be adapted more easily and efficiently for multiclass problems. Like SVM, LR can be kernelised for nonlinear classification, and it shows comparable classification accuracy to SVM methods. This work is a case study, comparing accuracy and speed of SVM and LR classifiers in detection of LSB Matching and other related spatial-domain image steganography, through the state-of-art 686-dimensional SPAM feature set, in three image sets.

  20. SEPARATION PHENOMENA LOGISTIC REGRESSION

    Directory of Open Access Journals (Sweden)

    Ikaro Daniel de Carvalho Barreto

    2014-03-01

    Full Text Available This paper proposes an application of concepts about the maximum likelihood estimation of the binomial logistic regression model to the separation phenomena. It generates bias in the estimation and provides different interpretations of the estimates on the different statistical tests (Wald, Likelihood Ratio and Score and provides different estimates on the different iterative methods (Newton-Raphson and Fisher Score. It also presents an example that demonstrates the direct implications for the validation of the model and validation of variables, the implications for estimates of odds ratios and confidence intervals, generated from the Wald statistics. Furthermore, we present, briefly, the Firth correction to circumvent the phenomena of separation.

  1. riskRegression

    DEFF Research Database (Denmark)

    Ozenne, Brice; Sørensen, Anne Lyngholm; Scheike, Thomas

    2017-01-01

    In the presence of competing risks a prediction of the time-dynamic absolute risk of an event can be based on cause-specific Cox regression models for the event and the competing risks (Benichou and Gail, 1990). We present computationally fast and memory optimized C++ functions with an R interface......-product we obtain fast access to the baseline hazards (compared to survival::basehaz()) and predictions of survival probabilities, their confidence intervals and confidence bands. Confidence intervals and confidence bands are based on point-wise asymptotic expansions of the corresponding statistical...

  2. Sparse reduced-rank regression with covariance estimation

    KAUST Repository

    Chen, Lisha

    2014-12-08

    Improving the predicting performance of the multiple response regression compared with separate linear regressions is a challenging question. On the one hand, it is desirable to seek model parsimony when facing a large number of parameters. On the other hand, for certain applications it is necessary to take into account the general covariance structure for the errors of the regression model. We assume a reduced-rank regression model and work with the likelihood function with general error covariance to achieve both objectives. In addition we propose to select relevant variables for reduced-rank regression by using a sparsity-inducing penalty, and to estimate the error covariance matrix simultaneously by using a similar penalty on the precision matrix. We develop a numerical algorithm to solve the penalized regression problem. In a simulation study and real data analysis, the new method is compared with two recent methods for multivariate regression and exhibits competitive performance in prediction and variable selection.

  3. Sparse reduced-rank regression with covariance estimation

    KAUST Repository

    Chen, Lisha; Huang, Jianhua Z.

    2014-01-01

    Improving the predicting performance of the multiple response regression compared with separate linear regressions is a challenging question. On the one hand, it is desirable to seek model parsimony when facing a large number of parameters. On the other hand, for certain applications it is necessary to take into account the general covariance structure for the errors of the regression model. We assume a reduced-rank regression model and work with the likelihood function with general error covariance to achieve both objectives. In addition we propose to select relevant variables for reduced-rank regression by using a sparsity-inducing penalty, and to estimate the error covariance matrix simultaneously by using a similar penalty on the precision matrix. We develop a numerical algorithm to solve the penalized regression problem. In a simulation study and real data analysis, the new method is compared with two recent methods for multivariate regression and exhibits competitive performance in prediction and variable selection.

  4. Likelihood estimators for multivariate extremes

    KAUST Repository

    Huser, Raphaë l; Davison, Anthony C.; Genton, Marc G.

    2015-01-01

    The main approach to inference for multivariate extremes consists in approximating the joint upper tail of the observations by a parametric family arising in the limit for extreme events. The latter may be expressed in terms of componentwise maxima, high threshold exceedances or point processes, yielding different but related asymptotic characterizations and estimators. The present paper clarifies the connections between the main likelihood estimators, and assesses their practical performance. We investigate their ability to estimate the extremal dependence structure and to predict future extremes, using exact calculations and simulation, in the case of the logistic model.

  5. Sparse Linear Identifiable Multivariate Modeling

    DEFF Research Database (Denmark)

    Henao, Ricardo; Winther, Ole

    2011-01-01

    and bench-marked on artificial and real biological data sets. SLIM is closest in spirit to LiNGAM (Shimizu et al., 2006), but differs substantially in inference, Bayesian network structure learning and model comparison. Experimentally, SLIM performs equally well or better than LiNGAM with comparable......In this paper we consider sparse and identifiable linear latent variable (factor) and linear Bayesian network models for parsimonious analysis of multivariate data. We propose a computationally efficient method for joint parameter and model inference, and model comparison. It consists of a fully...

  6. Improved multivariate polynomial factoring algorithm

    International Nuclear Information System (INIS)

    Wang, P.S.

    1978-01-01

    A new algorithm for factoring multivariate polynomials over the integers based on an algorithm by Wang and Rothschild is described. The new algorithm has improved strategies for dealing with the known problems of the original algorithm, namely, the leading coefficient problem, the bad-zero problem and the occurrence of extraneous factors. It has an algorithm for correctly predetermining leading coefficients of the factors. A new and efficient p-adic algorithm named EEZ is described. Bascially it is a linearly convergent variable-by-variable parallel construction. The improved algorithm is generally faster and requires less store then the original algorithm. Machine examples with comparative timing are included

  7. Likelihood estimators for multivariate extremes

    KAUST Repository

    Huser, Raphaël

    2015-11-17

    The main approach to inference for multivariate extremes consists in approximating the joint upper tail of the observations by a parametric family arising in the limit for extreme events. The latter may be expressed in terms of componentwise maxima, high threshold exceedances or point processes, yielding different but related asymptotic characterizations and estimators. The present paper clarifies the connections between the main likelihood estimators, and assesses their practical performance. We investigate their ability to estimate the extremal dependence structure and to predict future extremes, using exact calculations and simulation, in the case of the logistic model.

  8. Simulation of multivariate diffusion bridges

    DEFF Research Database (Denmark)

    Bladt, Mogens; Finch, Samuel; Sørensen, Michael

    We propose simple methods for multivariate diffusion bridge simulation, which plays a fundamental role in simulation-based likelihood and Bayesian inference for stochastic differential equations. By a novel application of classical coupling methods, the new approach generalizes a previously...... proposed simulation method for one-dimensional bridges to the mulit-variate setting. First a method of simulating approzimate, but often very accurate, diffusion bridges is proposed. These approximate bridges are used as proposal for easily implementable MCMC algorithms that produce exact diffusion bridges...

  9. Essentials of multivariate data analysis

    CERN Document Server

    Spencer, Neil H

    2013-01-01

    ""… this text provides an overview at an introductory level of several methods in multivariate data analysis. It contains in-depth examples from one data set woven throughout the text, and a free [Excel] Add-In to perform the analyses in Excel, with step-by-step instructions provided for each technique. … could be used as a text (possibly supplemental) for courses in other fields where researchers wish to apply these methods without delving too deeply into the underlying statistics.""-The American Statistician, February 2015

  10. Multivariate process monitoring of EAFs

    Energy Technology Data Exchange (ETDEWEB)

    Sandberg, E.; Lennox, B.; Marjanovic, O.; Smith, K.

    2005-06-01

    Improved knowledge of the effect of scrap grades on the electric steelmaking process and optimised scrap loading practices increase the potential for process automation. As part of an ongoing programme, process data from four Scandinavian EAFs have been analysed, using the multivariate process monitoring approach, to develop predictive models for end point conditions such as chemical composition, yield and energy consumption. The models developed generally predict final Cr, Ni and Mo and tramp element contents well, but electrical energy consumption, yield and content of oxidisable and impurity elements (C, Si, Mn, P, S) are at present more difficult to predict. Potential scrap management applications of the prediction models are also presented. (author)

  11. Aspects of multivariate statistical theory

    CERN Document Server

    Muirhead, Robb J

    2009-01-01

    The Wiley-Interscience Paperback Series consists of selected books that have been made more accessible to consumers in an effort to increase global appeal and general circulation. With these new unabridged softcover volumes, Wiley hopes to extend the lives of these works by making them available to future generations of statisticians, mathematicians, and scientists. "". . . the wealth of material on statistics concerning the multivariate normal distribution is quite exceptional. As such it is a very useful source of information for the general statistician and a must for anyone wanting to pen

  12. Automatic Hierarchical Color Image Classification

    Directory of Open Access Journals (Sweden)

    Jing Huang

    2003-02-01

    Full Text Available Organizing images into semantic categories can be extremely useful for content-based image retrieval and image annotation. Grouping images into semantic classes is a difficult problem, however. Image classification attempts to solve this hard problem by using low-level image features. In this paper, we propose a method for hierarchical classification of images via supervised learning. This scheme relies on using a good low-level feature and subsequently performing feature-space reconfiguration using singular value decomposition to reduce noise and dimensionality. We use the training data to obtain a hierarchical classification tree that can be used to categorize new images. Our experimental results suggest that this scheme not only performs better than standard nearest-neighbor techniques, but also has both storage and computational advantages.

  13. Hierarchical matrices algorithms and analysis

    CERN Document Server

    Hackbusch, Wolfgang

    2015-01-01

    This self-contained monograph presents matrix algorithms and their analysis. The new technique enables not only the solution of linear systems but also the approximation of matrix functions, e.g., the matrix exponential. Other applications include the solution of matrix equations, e.g., the Lyapunov or Riccati equation. The required mathematical background can be found in the appendix. The numerical treatment of fully populated large-scale matrices is usually rather costly. However, the technique of hierarchical matrices makes it possible to store matrices and to perform matrix operations approximately with almost linear cost and a controllable degree of approximation error. For important classes of matrices, the computational cost increases only logarithmically with the approximation error. The operations provided include the matrix inversion and LU decomposition. Since large-scale linear algebra problems are standard in scientific computing, the subject of hierarchical matrices is of interest to scientists ...

  14. Hierarchical Semantic Model of Geovideo

    Directory of Open Access Journals (Sweden)

    XIE Xiao

    2015-05-01

    Full Text Available The public security incidents were getting increasingly challenging with regard to their new features, including multi-scale mobility, multistage dynamic evolution, as well as spatiotemporal concurrency and uncertainty in the complex urban environment. However, the existing video models, which were used/designed for independent archive or local analysis of surveillance video, have seriously inhibited emergency response to the urgent requirements.Aiming at the explicit representation of change mechanism in video, the paper proposed a novel hierarchical geovideo semantic model using UML. This model was characterized by the hierarchical representation of both data structure and semantics based on the change-oriented three domains (feature domain, process domain and event domain instead of overall semantic description of video streaming; combining both geographical semantics and video content semantics, in support of global semantic association between multiple geovideo data. The public security incidents by video surveillance are inspected as an example to illustrate the validity of this model.

  15. Hybrid and hierarchical composite materials

    CERN Document Server

    Kim, Chang-Soo; Sano, Tomoko

    2015-01-01

    This book addresses a broad spectrum of areas in both hybrid materials and hierarchical composites, including recent development of processing technologies, structural designs, modern computer simulation techniques, and the relationships between the processing-structure-property-performance. Each topic is introduced at length with numerous  and detailed examples and over 150 illustrations.   In addition, the authors present a method of categorizing these materials, so that representative examples of all material classes are discussed.

  16. Hierarchical analysis of urban space

    OpenAIRE

    Kataeva, Y.

    2014-01-01

    Multi-level structure of urban space, multitude of subjects of its transformation, which follow asymmetric interests, multilevel system of institutions which regulate interaction in the "population business government -public organizations" system, determine the use of hierarchic approach to the analysis of urban space. The article observes theoretical justification of using this approach to study correlations and peculiarities of interaction in urban space as in an intricately organized syst...

  17. Statistical Significance for Hierarchical Clustering

    Science.gov (United States)

    Kimes, Patrick K.; Liu, Yufeng; Hayes, D. Neil; Marron, J. S.

    2017-01-01

    Summary Cluster analysis has proved to be an invaluable tool for the exploratory and unsupervised analysis of high dimensional datasets. Among methods for clustering, hierarchical approaches have enjoyed substantial popularity in genomics and other fields for their ability to simultaneously uncover multiple layers of clustering structure. A critical and challenging question in cluster analysis is whether the identified clusters represent important underlying structure or are artifacts of natural sampling variation. Few approaches have been proposed for addressing this problem in the context of hierarchical clustering, for which the problem is further complicated by the natural tree structure of the partition, and the multiplicity of tests required to parse the layers of nested clusters. In this paper, we propose a Monte Carlo based approach for testing statistical significance in hierarchical clustering which addresses these issues. The approach is implemented as a sequential testing procedure guaranteeing control of the family-wise error rate. Theoretical justification is provided for our approach, and its power to detect true clustering structure is illustrated through several simulation studies and applications to two cancer gene expression datasets. PMID:28099990

  18. Accurate and versatile multivariable arbitrary piecewise model regression of nonlinear fluidic muscle behavior

    NARCIS (Netherlands)

    Veale, A.J.; Xie, Sheng Quan; Anderson, Iain Alexander

    2017-01-01

    Wearable exoskeletons and soft robots require actuators with muscle-like compliance. These actuators can benefit from the robust and effective interaction that biological muscles' compliance enables them to have in the uncertainty of the real world. Fluidic muscles are compliant but difficult to

  19. Multivariate Regression Model of Impedance of Normal and Chemically Irritated Skin Shows Predictive Ability

    National Research Council Canada - National Science Library

    Aberg, P

    2001-01-01

    ... before and after application of chemicals on volar forearms of volunteers, Tegobetaine and sodium lauryl sulphate were used to induce the irritations, The spectra were filtered using orthogonal signal correction (OSC...

  20. Factors Associated with Increased Pain in Primary Dysmenorrhea: Analysis Using a Multivariate Ordered Logistic Regression Model.

    Science.gov (United States)

    Tomás-Rodríguez, María I; Palazón-Bru, Antonio; Martínez-St John, Damian R J; Navarro-Cremades, Felipe; Toledo-Marhuenda, José V; Gil-Guillén, Vicente F

    2017-04-01

    In the literature about primary dysmenorrhea (PD), either a pain gradient has been studied just in women with PD or pain was assessed as a binary variable (presence or absence). Accordingly, we decided to carry out a study in young women to determine possible factors associated with intense pain. A cross-sectional observational study. A Spanish University in 2016. A total of 306 women, aged 18-30 years. A questionnaire was filled in by the participants to assess associated factors with dysmenorrhoea. Our outcome measure was the Andersch and Milsom scale (grade from 0 to 3). grade 0 (menstruation is not painful and daily activity is unaffected), grade 1 (menstruation is painful but seldom inhibits normal activity, analgesics are seldom required, and mild pain), grade 2 (daily activity affected, analgesics required and give relief so that absence from work or school is unusual, and moderate pain), and grade 3 (activity clearly inhibited, poor effect of analgesics, vegetative symptoms and severe pain). Factors significantly associated with more extreme pain: a higher menstrual flow (odds ratio [OR], 2.11; P < .001), a worse quality of life (OR, 0.97; P < .001) and use of medication for PD (OR, 8.22; P < .001). We determined factors associated with extreme pain in PD in a novel way. Further studies are required to corroborate our results. Copyright © 2016 North American Society for Pediatric and Adolescent Gynecology. Published by Elsevier Inc. All rights reserved.

  1. Differential diagnosis of degenerative dementias using basic neuropsychological tests: multivariable logistic regression analysis of 301 patients.

    Science.gov (United States)

    Jiménez-Huete, Adolfo; Riva, Elena; Toledano, Rafael; Campo, Pablo; Esteban, Jesús; Barrio, Antonio Del; Franch, Oriol

    2014-12-01

    The validity of neuropsychological tests for the differential diagnosis of degenerative dementias may depend on the clinical context. We constructed a series of logistic models taking into account this factor. We retrospectively analyzed the demographic and neuropsychological data of 301 patients with probable Alzheimer's disease (AD), frontotemporal degeneration (FTLD), or dementia with Lewy bodies (DLB). Nine models were constructed taking into account the diagnostic question (eg, AD vs DLB) and subpopulation (incident vs prevalent). The AD versus DLB model for all patients, including memory recovery and phonological fluency, was highly accurate (area under the curve = 0.919, sensitivity = 90%, and specificity = 80%). The results were comparable in incident and prevalent cases. The FTLD versus AD and DLB versus FTLD models were both inaccurate. The models constructed from basic neuropsychological variables allowed an accurate differential diagnosis of AD versus DLB but not of FTLD versus AD or DLB. © The Author(s) 2014.

  2. Deconvolution of Defocused Image with Multivariate Local Polynomial Regression and Iterative Wiener Filtering in DWT Domain

    Directory of Open Access Journals (Sweden)

    Liyun Su

    2010-01-01

    obtaining the point spread function (PSF parameter, iterative wiener filter is adopted to complete the restoration. We experimentally illustrate its performance on simulated data and real blurred image. Results show that the proposed PSF parameter estimation technique and the image restoration method are effective.

  3. Multivariate regression methods for estimating velocity of ictal discharges from human microelectrode recordings

    Science.gov (United States)

    Liou, Jyun-you; Smith, Elliot H.; Bateman, Lisa M.; McKhann, Guy M., II; Goodman, Robert R.; Greger, Bradley; Davis, Tyler S.; Kellis, Spencer S.; House, Paul A.; Schevon, Catherine A.

    2017-08-01

    Objective. Epileptiform discharges, an electrophysiological hallmark of seizures, can propagate across cortical tissue in a manner similar to traveling waves. Recent work has focused attention on the origination and propagation patterns of these discharges, yielding important clues to their source location and mechanism of travel. However, systematic studies of methods for measuring propagation are lacking. Approach. We analyzed epileptiform discharges in microelectrode array recordings of human seizures. The array records multiunit activity and local field potentials at 400 micron spatial resolution, from a small cortical site free of obstructions. We evaluated several computationally efficient statistical methods for calculating traveling wave velocity, benchmarking them to analyses of associated neuronal burst firing. Main results. Over 90% of discharges met statistical criteria for propagation across the sampled cortical territory. Detection rate, direction and speed estimates derived from a multiunit estimator were compared to four field potential-based estimators: negative peak, maximum descent, high gamma power, and cross-correlation. Interestingly, the methods that were computationally simplest and most efficient (negative peak and maximal descent) offer non-inferior results in predicting neuronal traveling wave velocities compared to the other two, more complex methods. Moreover, the negative peak and maximal descent methods proved to be more robust against reduced spatial sampling challenges. Using least absolute deviation in place of least squares error minimized the impact of outliers, and reduced the discrepancies between local field potential-based and multiunit estimators. Significance. Our findings suggest that ictal epileptiform discharges typically take the form of exceptionally strong, rapidly traveling waves, with propagation detectable across millimeter distances. The sequential activation of neurons in space can be inferred from clinically-observable EEG data, with a variety of straightforward computation methods available. This opens possibilities for systematic assessments of ictal discharge propagation in clinical and research settings.

  4. Aid and growth regressions

    DEFF Research Database (Denmark)

    Hansen, Henrik; Tarp, Finn

    2001-01-01

    This paper examines the relationship between foreign aid and growth in real GDP per capita as it emerges from simple augmentations of popular cross country growth specifications. It is shown that aid in all likelihood increases the growth rate, and this result is not conditional on ‘good’ policy....... investment. We conclude by stressing the need for more theoretical work before this kind of cross-country regressions are used for policy purposes.......This paper examines the relationship between foreign aid and growth in real GDP per capita as it emerges from simple augmentations of popular cross country growth specifications. It is shown that aid in all likelihood increases the growth rate, and this result is not conditional on ‘good’ policy...

  5. Elliptical multiple-output quantile regression and convex optimization

    Czech Academy of Sciences Publication Activity Database

    Hallin, M.; Šiman, Miroslav

    2016-01-01

    Roč. 109, č. 1 (2016), s. 232-237 ISSN 0167-7152 R&D Projects: GA ČR GA14-07234S Institutional support: RVO:67985556 Keywords : quantile regression * elliptical quantile * multivariate quantile * multiple-output regression Subject RIV: BA - General Mathematics Impact factor: 0.540, year: 2016 http://library.utia.cas.cz/separaty/2016/SI/siman-0458243.pdf

  6. Multivariate calibration applied to the quantitative analysis of infrared spectra

    Energy Technology Data Exchange (ETDEWEB)

    Haaland, D.M.

    1991-01-01

    Multivariate calibration methods are very useful for improving the precision, accuracy, and reliability of quantitative spectral analyses. Spectroscopists can more effectively use these sophisticated statistical tools if they have a qualitative understanding of the techniques involved. A qualitative picture of the factor analysis multivariate calibration methods of partial least squares (PLS) and principal component regression (PCR) is presented using infrared calibrations based upon spectra of phosphosilicate glass thin films on silicon wafers. Comparisons of the relative prediction abilities of four different multivariate calibration methods are given based on Monte Carlo simulations of spectral calibration and prediction data. The success of multivariate spectral calibrations is demonstrated for several quantitative infrared studies. The infrared absorption and emission spectra of thin-film dielectrics used in the manufacture of microelectronic devices demonstrate rapid, nondestructive at-line and in-situ analyses using PLS calibrations. Finally, the application of multivariate spectral calibrations to reagentless analysis of blood is presented. We have found that the determination of glucose in whole blood taken from diabetics can be precisely monitored from the PLS calibration of either mind- or near-infrared spectra of the blood. Progress toward the non-invasive determination of glucose levels in diabetics is an ultimate goal of this research. 13 refs., 4 figs.

  7. Finding Similarities in Ancient Ceramics by EDXRF and Multivariate Methods

    International Nuclear Information System (INIS)

    Civici, N.; Stamati, F.

    1999-01-01

    We have studied 39 samples of fragments from ceramic roof tiles with different stamps(Diamalas and Heraion), dated between 330 to 170 BC and found at the archaeological site of Dimales, some 30 km from the Adriatic coast. The data from these samples were compared with those obtained from 7 samples of similar objects and period with the stamp H eraion , found at the archaeological site of APOLLONIA. The samples were analyzed by energy-dispersive X -ray fluorescence(EDXRF), using of the x-ray lines of the elements to the intensity of the Compton peak. The results have been treated with diverse multivariate methods. The application of hierarchical cluster analysis and factor analysis permitted the identification of two main clusters. The first cluster is composed from the ''Heraion'' samples discovered in Apollonia, while the second comprises all the samples discovered in Dimale independent of their stamp. (authors)

  8. Multivariate methods for particle identification

    CERN Document Server

    Visan, Cosmin

    2013-01-01

    The purpose of this project was to evaluate several MultiVariate methods in order to determine which one, if any, offers better results in Particle Identification (PID) than a simple n$\\sigma$ cut on the response of the ALICE PID detectors. The particles considered in the analysis were Pions, Kaons and Protons and the detectors used were TPC and TOF. When used with the same input n$\\sigma$ variables, the results show similar perfoance between the Rectangular Cuts Optimization method and the simple n$\\sigma$ cuts. The method MLP and BDT show poor results for certain ranges of momentum. The KNN method is the best performing, showing similar results for Pions and Protons as the Cuts method, and better results for Kaons. The extension of the methods to include additional input variables leads to poor results, related to instabilities still to be investigated.

  9. Acoustic multivariate condition monitoring - AMCM

    Energy Technology Data Exchange (ETDEWEB)

    Rosenhave, P E [Vestfold College, Maritime Dept., Toensberg (Norway)

    1998-12-31

    In Norway, Vestfold College, Maritime Department presents new opportunities for non-invasive, on- or off-line acoustic monitoring of rotating machinery such as off-shore pumps and diesel engines. New developments within acoustic sensor technology coupled with chemometric data analysis of complex signals now allow condition monitoring of hitherto unavailable flexibility and diagnostic specificity. Chemometrics paired with existing knowledge yields a new and powerful tool for condition monitoring. By the use of multivariate techniques and acoustics it is possible to quantify wear and tear as well as predict the performance of working components in complex machinery. This presentation describes the AMCM method and one result of a feasibility study conducted onboard the LPG/C `Norgas Mariner` owned by Norwegian Gas Carriers as (NGC), Oslo. (orig.) 6 refs.

  10. Acoustic multivariate condition monitoring - AMCM

    Energy Technology Data Exchange (ETDEWEB)

    Rosenhave, P.E. [Vestfold College, Maritime Dept., Toensberg (Norway)

    1997-12-31

    In Norway, Vestfold College, Maritime Department presents new opportunities for non-invasive, on- or off-line acoustic monitoring of rotating machinery such as off-shore pumps and diesel engines. New developments within acoustic sensor technology coupled with chemometric data analysis of complex signals now allow condition monitoring of hitherto unavailable flexibility and diagnostic specificity. Chemometrics paired with existing knowledge yields a new and powerful tool for condition monitoring. By the use of multivariate techniques and acoustics it is possible to quantify wear and tear as well as predict the performance of working components in complex machinery. This presentation describes the AMCM method and one result of a feasibility study conducted onboard the LPG/C `Norgas Mariner` owned by Norwegian Gas Carriers as (NGC), Oslo. (orig.) 6 refs.

  11. Multivariate supOU processes

    DEFF Research Database (Denmark)

    Barndorff-Nielsen, Ole Eiler; Stelzer, Robert

    Univariate superpositions of Ornstein-Uhlenbeck (OU) type processes, called supOU processes, provide a class of continuous time processes capable of exhibiting long memory behaviour. This paper introduces multivariate supOU processes and gives conditions for their existence and finiteness...... of moments. Moreover, the second order moment structure is explicitly calculated, and examples exhibit the possibility of long range dependence. Our supOU processes are defined via homogeneous and factorisable Lévy bases. We show that the behaviour of supOU processes is particularly nice when the mean...... reversion parameter is restricted to normal matrices and especially to strictly negative definite ones.For finite variation Lévy bases we are able to give conditions for supOU processes to have locally bounded càdlàg paths of finite variation and to show an analogue of the stochastic differential equation...

  12. Multivariate supOU processes

    DEFF Research Database (Denmark)

    Barndorff-Nielsen, Ole Eiler; Stelzer, Robert

    2011-01-01

    Univariate superpositions of Ornstein–Uhlenbeck-type processes (OU), called supOU processes, provide a class of continuous time processes capable of exhibiting long memory behavior. This paper introduces multivariate supOU processes and gives conditions for their existence and finiteness of moments....... Moreover, the second-order moment structure is explicitly calculated, and examples exhibit the possibility of long-range dependence. Our supOU processes are defined via homogeneous and factorizable Lévy bases. We show that the behavior of supOU processes is particularly nice when the mean reversion...... parameter is restricted to normal matrices and especially to strictly negative definite ones. For finite variation Lévy bases we are able to give conditions for supOU processes to have locally bounded càdlàg paths of finite variation and to show an analogue of the stochastic differential equation of OU...

  13. An Exact Confidence Region in Multivariate Calibration

    OpenAIRE

    Mathew, Thomas; Kasala, Subramanyam

    1994-01-01

    In the multivariate calibration problem using a multivariate linear model, an exact confidence region is constructed. It is shown that the region is always nonempty and is invariant under nonsingular transformations.

  14. Modified Regression Correlation Coefficient for Poisson Regression Model

    Science.gov (United States)

    Kaengthong, Nattacha; Domthong, Uthumporn

    2017-09-01

    This study gives attention to indicators in predictive power of the Generalized Linear Model (GLM) which are widely used; however, often having some restrictions. We are interested in regression correlation coefficient for a Poisson regression model. This is a measure of predictive power, and defined by the relationship between the dependent variable (Y) and the expected value of the dependent variable given the independent variables [E(Y|X)] for the Poisson regression model. The dependent variable is distributed as Poisson. The purpose of this research was modifying regression correlation coefficient for Poisson regression model. We also compare the proposed modified regression correlation coefficient with the traditional regression correlation coefficient in the case of two or more independent variables, and having multicollinearity in independent variables. The result shows that the proposed regression correlation coefficient is better than the traditional regression correlation coefficient based on Bias and the Root Mean Square Error (RMSE).

  15. Unsupervised classification of multivariate geostatistical data: Two algorithms

    Science.gov (United States)

    Romary, Thomas; Ors, Fabien; Rivoirard, Jacques; Deraisme, Jacques

    2015-12-01

    With the increasing development of remote sensing platforms and the evolution of sampling facilities in mining and oil industry, spatial datasets are becoming increasingly large, inform a growing number of variables and cover wider and wider areas. Therefore, it is often necessary to split the domain of study to account for radically different behaviors of the natural phenomenon over the domain and to simplify the subsequent modeling step. The definition of these areas can be seen as a problem of unsupervised classification, or clustering, where we try to divide the domain into homogeneous domains with respect to the values taken by the variables in hand. The application of classical clustering methods, designed for independent observations, does not ensure the spatial coherence of the resulting classes. Image segmentation methods, based on e.g. Markov random fields, are not adapted to irregularly sampled data. Other existing approaches, based on mixtures of Gaussian random functions estimated via the expectation-maximization algorithm, are limited to reasonable sample sizes and a small number of variables. In this work, we propose two algorithms based on adaptations of classical algorithms to multivariate geostatistical data. Both algorithms are model free and can handle large volumes of multivariate, irregularly spaced data. The first one proceeds by agglomerative hierarchical clustering. The spatial coherence is ensured by a proximity condition imposed for two clusters to merge. This proximity condition relies on a graph organizing the data in the coordinates space. The hierarchical algorithm can then be seen as a graph-partitioning algorithm. Following this interpretation, a spatial version of the spectral clustering algorithm is also proposed. The performances of both algorithms are assessed on toy examples and a mining dataset.

  16. Assessing risk factors for periodontitis using regression

    Science.gov (United States)

    Lobo Pereira, J. A.; Ferreira, Maria Cristina; Oliveira, Teresa

    2013-10-01

    Multivariate statistical analysis is indispensable to assess the associations and interactions between different factors and the risk of periodontitis. Among others, regression analysis is a statistical technique widely used in healthcare to investigate and model the relationship between variables. In our work we study the impact of socio-demographic, medical and behavioral factors on periodontal health. Using regression, linear and logistic models, we can assess the relevance, as risk factors for periodontitis disease, of the following independent variables (IVs): Age, Gender, Diabetic Status, Education, Smoking status and Plaque Index. The multiple linear regression analysis model was built to evaluate the influence of IVs on mean Attachment Loss (AL). Thus, the regression coefficients along with respective p-values will be obtained as well as the respective p-values from the significance tests. The classification of a case (individual) adopted in the logistic model was the extent of the destruction of periodontal tissues defined by an Attachment Loss greater than or equal to 4 mm in 25% (AL≥4mm/≥25%) of sites surveyed. The association measures include the Odds Ratios together with the correspondent 95% confidence intervals.

  17. A kernel version of multivariate alteration detection

    DEFF Research Database (Denmark)

    Nielsen, Allan Aasbjerg; Vestergaard, Jacob Schack

    2013-01-01

    Based on the established methods kernel canonical correlation analysis and multivariate alteration detection we introduce a kernel version of multivariate alteration detection. A case study with SPOT HRV data shows that the kMAD variates focus on extreme change observations.......Based on the established methods kernel canonical correlation analysis and multivariate alteration detection we introduce a kernel version of multivariate alteration detection. A case study with SPOT HRV data shows that the kMAD variates focus on extreme change observations....

  18. Applied Statistics: From Bivariate through Multivariate Techniques [with CD-ROM

    Science.gov (United States)

    Warner, Rebecca M.

    2007-01-01

    This book provides a clear introduction to widely used topics in bivariate and multivariate statistics, including multiple regression, discriminant analysis, MANOVA, factor analysis, and binary logistic regression. The approach is applied and does not require formal mathematics; equations are accompanied by verbal explanations. Students are asked…

  19. Multivariate strategies in functional magnetic resonance imaging

    DEFF Research Database (Denmark)

    Hansen, Lars Kai

    2007-01-01

    We discuss aspects of multivariate fMRI modeling, including the statistical evaluation of multivariate models and means for dimensional reduction. In a case study we analyze linear and non-linear dimensional reduction tools in the context of a `mind reading' predictive multivariate fMRI model....

  20. Theory of net analyte signal vectors in inverse regression

    DEFF Research Database (Denmark)

    Bro, R.; Andersen, Charlotte Møller

    2003-01-01

    The. net analyte signal and the net analyte signal vector are useful measures in building and optimizing multivariate calibration models. In this paper a theory for their use in inverse regression is developed. The theory of net analyte signal was originally derived from classical least squares...

  1. A binary logistic regression model with complex sampling design of ...

    African Journals Online (AJOL)

    2017-09-03

    Sep 3, 2017 ... Bi-variable and multi-variable binary logistic regression model with complex sampling design was fitted. .... Data was entered into STATA-12 and analyzed using. SPSS-21. .... lack of access/too far or costs too much. 35. 1.2.

  2. Hierarchal scalar and vector tetrahedra

    International Nuclear Information System (INIS)

    Webb, J.P.; Forghani, B.

    1993-01-01

    A new set of scalar and vector tetrahedral finite elements are presented. The elements are hierarchal, allowing mixing of polynomial orders; scalar orders up to 3 and vector orders up to 2 are defined. The vector elements impose tangential continuity on the field but not normal continuity, making them suitable for representing the vector electric or magnetic field. Further, the scalar and vector elements are such that they can easily be used in the same mesh, a requirement of many quasi-static formulations. Results are presented for two 50 Hz problems: the Bath Cube, and TEAM Problem 7

  3. Loops in hierarchical channel networks

    Science.gov (United States)

    Katifori, Eleni; Magnasco, Marcelo

    2012-02-01

    Nature provides us with many examples of planar distribution and structural networks having dense sets of closed loops. An archetype of this form of network organization is the vasculature of dicotyledonous leaves, which showcases a hierarchically-nested architecture. Although a number of methods have been proposed to measure aspects of the structure of such networks, a robust metric to quantify their hierarchical organization is still lacking. We present an algorithmic framework that allows mapping loopy networks to binary trees, preserving in the connectivity of the trees the architecture of the original graph. We apply this framework to investigate computer generated and natural graphs extracted from digitized images of dicotyledonous leaves and animal vasculature. We calculate various metrics on the corresponding trees and discuss the relationship of these quantities to the architectural organization of the original graphs. This algorithmic framework decouples the geometric information from the metric topology (connectivity and edge weight) and it ultimately allows us to perform a quantitative statistical comparison between predictions of theoretical models and naturally occurring loopy graphs.

  4. Hierarchically nested river landform sequences

    Science.gov (United States)

    Pasternack, G. B.; Weber, M. D.; Brown, R. A.; Baig, D.

    2017-12-01

    River corridors exhibit landforms nested within landforms repeatedly down spatial scales. In this study we developed, tested, and implemented a new way to create river classifications by mapping domains of fluvial processes with respect to the hierarchical organization of topographic complexity that drives fluvial dynamism. We tested this approach on flow convergence routing, a morphodynamic mechanism with different states depending on the structure of nondimensional topographic variability. Five nondimensional landform types with unique functionality (nozzle, wide bar, normal channel, constricted pool, and oversized) represent this process at any flow. When this typology is nested at base flow, bankfull, and floodprone scales it creates a system with up to 125 functional types. This shows how a single mechanism produces complex dynamism via nesting. Given the classification, we answered nine specific scientific questions to investigate the abundance, sequencing, and hierarchical nesting of these new landform types using a 35-km gravel/cobble river segment of the Yuba River in California. The nested structure of flow convergence routing landforms found in this study revealed that bankfull landforms are nested within specific floodprone valley landform types, and these types control bankfull morphodynamics during moderate to large floods. As a result, this study calls into question the prevailing theory that the bankfull channel of a gravel/cobble river is controlled by in-channel, bankfull, and/or small flood flows. Such flows are too small to initiate widespread sediment transport in a gravel/cobble river with topographic complexity.

  5. Stability of glassy hierarchical networks

    Science.gov (United States)

    Zamani, M.; Camargo-Forero, L.; Vicsek, T.

    2018-02-01

    The structure of interactions in most animal and human societies can be best represented by complex hierarchical networks. In order to maintain close-to-optimal function both stability and adaptability are necessary. Here we investigate the stability of hierarchical networks that emerge from the simulations of an organization type with an efficiency function reminiscent of the Hamiltonian of spin glasses. Using this quantitative approach we find a number of expected (from everyday observations) and highly non-trivial results for the obtained locally optimal networks, including, for example: (i) stability increases with growing efficiency and level of hierarchy; (ii) the same perturbation results in a larger change for more efficient states; (iii) networks with a lower level of hierarchy become more efficient after perturbation; (iv) due to the huge number of possible optimal states only a small fraction of them exhibit resilience and, finally, (v) ‘attacks’ targeting the nodes selectively (regarding their position in the hierarchy) can result in paradoxical outcomes.

  6. Hierarchical modeling of active materials

    International Nuclear Information System (INIS)

    Taya, Minoru

    2003-01-01

    Intelligent (or smart) materials are increasingly becoming key materials for use in actuators and sensors. If an intelligent material is used as a sensor, it can be embedded in a variety of structure functioning as a health monitoring system to make their life longer with high reliability. If an intelligent material is used as an active material in an actuator, it plays a key role of making dynamic movement of the actuator under a set of stimuli. This talk intends to cover two different active materials in actuators, (1) piezoelectric laminate with FGM microstructure, (2) ferromagnetic shape memory alloy (FSMA). The advantage of using the FGM piezo laminate is to enhance its fatigue life while maintaining large bending displacement, while that of use in FSMA is its fast actuation while providing a large force and stroke capability. Use of hierarchical modeling of the above active materials is a key design step in optimizing its microstructure for enhancement of their performance. I will discuss briefly hierarchical modeling of the above two active materials. For FGM piezo laminate, we will use both micromechanical model and laminate theory, while for FSMA, the modeling interfacing nano-structure, microstructure and macro-behavior is discussed. (author)

  7. Hierarchical organisation of causal graphs

    International Nuclear Information System (INIS)

    Dziopa, P.

    1993-01-01

    This paper deals with the design of a supervision system using a hierarchy of models formed by graphs, in which the variables are the nodes and the causal relations between the variables of the arcs. To obtain a representation of the variables evolutions which contains only the relevant features of their real evolutions, the causal relations are completed with qualitative transfer functions (QTFs) which produce roughly the behaviour of the classical transfer functions. Major improvements have been made in the building of the hierarchical organization. First, the basic variables of the uppermost level and the causal relations between them are chosen. The next graph is built by adding intermediary variables to the upper graph. When the undermost graph has been built, the transfer functions parameters corresponding to its causal relations are identified. The second task consists in the upwelling of the information from the undermost graph to the uppermost one. A fusion procedure of the causal relations has been designed to compute the QFTs relevant for each level. This procedure aims to reduce the number of parameters needed to represent an evolution at a high level of abstraction. These techniques have been applied to the hierarchical modelling of nuclear process. (authors). 8 refs., 12 figs

  8. Recursive Algorithm For Linear Regression

    Science.gov (United States)

    Varanasi, S. V.

    1988-01-01

    Order of model determined easily. Linear-regression algorithhm includes recursive equations for coefficients of model of increased order. Algorithm eliminates duplicative calculations, facilitates search for minimum order of linear-regression model fitting set of data satisfactory.

  9. The pathways for intelligible speech: multivariate and univariate perspectives.

    Science.gov (United States)

    Evans, S; Kyong, J S; Rosen, S; Golestani, N; Warren, J E; McGettigan, C; Mourão-Miranda, J; Wise, R J S; Scott, S K

    2014-09-01

    An anterior pathway, concerned with extracting meaning from sound, has been identified in nonhuman primates. An analogous pathway has been suggested in humans, but controversy exists concerning the degree of lateralization and the precise location where responses to intelligible speech emerge. We have demonstrated that the left anterior superior temporal sulcus (STS) responds preferentially to intelligible speech (Scott SK, Blank CC, Rosen S, Wise RJS. 2000. Identification of a pathway for intelligible speech in the left temporal lobe. Brain. 123:2400-2406.). A functional magnetic resonance imaging study in Cerebral Cortex used equivalent stimuli and univariate and multivariate analyses to argue for the greater importance of bilateral posterior when compared with the left anterior STS in responding to intelligible speech (Okada K, Rong F, Venezia J, Matchin W, Hsieh IH, Saberi K, Serences JT,Hickok G. 2010. Hierarchical organization of human auditory cortex: evidence from acoustic invariance in the response to intelligible speech. 20: 2486-2495.). Here, we also replicate our original study, demonstrating that the left anterior STS exhibits the strongest univariate response and, in decoding using the bilateral temporal cortex, contains the most informative voxels showing an increased response to intelligible speech. In contrast, in classifications using local "searchlights" and a whole brain analysis, we find greater classification accuracy in posterior rather than anterior temporal regions. Thus, we show that the precise nature of the multivariate analysis used will emphasize different response profiles associated with complex sound to speech processing. © The Author 2013. Published by Oxford University Press.

  10. Multivariate pluvial flood damage models

    International Nuclear Information System (INIS)

    Van Ootegem, Luc; Verhofstadt, Elsy; Van Herck, Kristine; Creten, Tom

    2015-01-01

    Depth–damage-functions, relating the monetary flood damage to the depth of the inundation, are commonly used in the case of fluvial floods (floods caused by a river overflowing). We construct four multivariate damage models for pluvial floods (caused by extreme rainfall) by differentiating on the one hand between ground floor floods and basement floods and on the other hand between damage to residential buildings and damage to housing contents. We do not only take into account the effect of flood-depth on damage, but also incorporate the effects of non-hazard indicators (building characteristics, behavioural indicators and socio-economic variables). By using a Tobit-estimation technique on identified victims of pluvial floods in Flanders (Belgium), we take into account the effect of cases of reported zero damage. Our results show that the flood depth is an important predictor of damage, but with a diverging impact between ground floor floods and basement floods. Also non-hazard indicators are important. For example being aware of the risk just before the water enters the building reduces content damage considerably, underlining the importance of warning systems and policy in this case of pluvial floods. - Highlights: • Prediction of damage of pluvial floods using also non-hazard information • We include ‘no damage cases’ using a Tobit model. • The damage of flood depth is stronger for ground floor than for basement floods. • Non-hazard indicators are especially important for content damage. • Potential gain of policies that increase awareness of flood risks

  11. Multivariate pluvial flood damage models

    Energy Technology Data Exchange (ETDEWEB)

    Van Ootegem, Luc [HIVA — University of Louvain (Belgium); SHERPPA — Ghent University (Belgium); Verhofstadt, Elsy [SHERPPA — Ghent University (Belgium); Van Herck, Kristine; Creten, Tom [HIVA — University of Louvain (Belgium)

    2015-09-15

    Depth–damage-functions, relating the monetary flood damage to the depth of the inundation, are commonly used in the case of fluvial floods (floods caused by a river overflowing). We construct four multivariate damage models for pluvial floods (caused by extreme rainfall) by differentiating on the one hand between ground floor floods and basement floods and on the other hand between damage to residential buildings and damage to housing contents. We do not only take into account the effect of flood-depth on damage, but also incorporate the effects of non-hazard indicators (building characteristics, behavioural indicators and socio-economic variables). By using a Tobit-estimation technique on identified victims of pluvial floods in Flanders (Belgium), we take into account the effect of cases of reported zero damage. Our results show that the flood depth is an important predictor of damage, but with a diverging impact between ground floor floods and basement floods. Also non-hazard indicators are important. For example being aware of the risk just before the water enters the building reduces content damage considerably, underlining the importance of warning systems and policy in this case of pluvial floods. - Highlights: • Prediction of damage of pluvial floods using also non-hazard information • We include ‘no damage cases’ using a Tobit model. • The damage of flood depth is stronger for ground floor than for basement floods. • Non-hazard indicators are especially important for content damage. • Potential gain of policies that increase awareness of flood risks.

  12. Students' Outcome Expectation on Spiritual and Religious Competency: A Hierarchical Regression Analysis

    Science.gov (United States)

    Lu, Junfei; Woo, Hongryun

    2017-01-01

    In this study, 74 master's-level counseling students from various programs completed a questionnaire inquiring about their perceived program environment in relation to the topics of spirituality and religion (S/R), program emphasis on nine specific S/R competencies, as well as their outcome expectations toward being S/R competent through training.…

  13. Application of Hierarchical Linear Models/Linear Mixed-Effects Models in School Effectiveness Research

    Science.gov (United States)

    Ker, H. W.

    2014-01-01

    Multilevel data are very common in educational research. Hierarchical linear models/linear mixed-effects models (HLMs/LMEs) are often utilized to analyze multilevel data nowadays. This paper discusses the problems of utilizing ordinary regressions for modeling multilevel educational data, compare the data analytic results from three regression…

  14. LiDAR based prediction of forest biomass using hierarchical models with spatially varying coefficients

    Science.gov (United States)

    Chad Babcock; Andrew O. Finley; John B. Bradford; Randy Kolka; Richard Birdsey; Michael G. Ryan

    2015-01-01

    Many studies and production inventory systems have shown the utility of coupling covariates derived from Light Detection and Ranging (LiDAR) data with forest variables measured on georeferenced inventory plots through regression models. The objective of this study was to propose and assess the use of a Bayesian hierarchical modeling framework that accommodates both...

  15. On logistic regression analysis of dichotomized responses.

    Science.gov (United States)

    Lu, Kaifeng

    2017-01-01

    We study the properties of treatment effect estimate in terms of odds ratio at the study end point from logistic regression model adjusting for the baseline value when the underlying continuous repeated measurements follow a multivariate normal distribution. Compared with the analysis that does not adjust for the baseline value, the adjusted analysis produces a larger treatment effect as well as a larger standard error. However, the increase in standard error is more than offset by the increase in treatment effect so that the adjusted analysis is more powerful than the unadjusted analysis for detecting the treatment effect. On the other hand, the true adjusted odds ratio implied by the normal distribution of the underlying continuous variable is a function of the baseline value and hence is unlikely to be able to be adequately represented by a single value of adjusted odds ratio from the logistic regression model. In contrast, the risk difference function derived from the logistic regression model provides a reasonable approximation to the true risk difference function implied by the normal distribution of the underlying continuous variable over the range of the baseline distribution. We show that different metrics of treatment effect have similar statistical power when evaluated at the baseline mean. Copyright © 2016 John Wiley & Sons, Ltd. Copyright © 2016 John Wiley & Sons, Ltd.

  16. Combining Alphas via Bounded Regression

    Directory of Open Access Journals (Sweden)

    Zura Kakushadze

    2015-11-01

    Full Text Available We give an explicit algorithm and source code for combining alpha streams via bounded regression. In practical applications, typically, there is insufficient history to compute a sample covariance matrix (SCM for a large number of alphas. To compute alpha allocation weights, one then resorts to (weighted regression over SCM principal components. Regression often produces alpha weights with insufficient diversification and/or skewed distribution against, e.g., turnover. This can be rectified by imposing bounds on alpha weights within the regression procedure. Bounded regression can also be applied to stock and other asset portfolio construction. We discuss illustrative examples.

  17. Regression in autistic spectrum disorders.

    Science.gov (United States)

    Stefanatos, Gerry A

    2008-12-01

    A significant proportion of children diagnosed with Autistic Spectrum Disorder experience a developmental regression characterized by a loss of previously-acquired skills. This may involve a loss of speech or social responsitivity, but often entails both. This paper critically reviews the phenomena of regression in autistic spectrum disorders, highlighting the characteristics of regression, age of onset, temporal course, and long-term outcome. Important considerations for diagnosis are discussed and multiple etiological factors currently hypothesized to underlie the phenomenon are reviewed. It is argued that regressive autistic spectrum disorders can be conceptualized on a spectrum with other regressive disorders that may share common pathophysiological features. The implications of this viewpoint are discussed.

  18. Symbolic Regression Via Genetic Programming as a Discovery Engine: Insights on Outliers and Prototypes

    Science.gov (United States)

    Kotanchek, Mark E.; Vladislavleva, Ekaterina Y.; Smits, Guido F.

    In this chapter we illustrate a framework based on symbolic regression to generate and sharpen the questions about the nature of the underlying system and provide additional context and understanding based on multi-variate numeric data.

  19. Linear regression in astronomy. I

    Science.gov (United States)

    Isobe, Takashi; Feigelson, Eric D.; Akritas, Michael G.; Babu, Gutti Jogesh

    1990-01-01

    Five methods for obtaining linear regression fits to bivariate data with unknown or insignificant measurement errors are discussed: ordinary least-squares (OLS) regression of Y on X, OLS regression of X on Y, the bisector of the two OLS lines, orthogonal regression, and 'reduced major-axis' regression. These methods have been used by various researchers in observational astronomy, most importantly in cosmic distance scale applications. Formulas for calculating the slope and intercept coefficients and their uncertainties are given for all the methods, including a new general form of the OLS variance estimates. The accuracy of the formulas was confirmed using numerical simulations. The applicability of the procedures is discussed with respect to their mathematical properties, the nature of the astronomical data under consideration, and the scientific purpose of the regression. It is found that, for problems needing symmetrical treatment of the variables, the OLS bisector performs significantly better than orthogonal or reduced major-axis regression.

  20. Multivariate refined composite multiscale entropy analysis

    International Nuclear Information System (INIS)

    Humeau-Heurtier, Anne

    2016-01-01

    Multiscale entropy (MSE) has become a prevailing method to quantify signals complexity. MSE relies on sample entropy. However, MSE may yield imprecise complexity estimation at large scales, because sample entropy does not give precise estimation of entropy when short signals are processed. A refined composite multiscale entropy (RCMSE) has therefore recently been proposed. Nevertheless, RCMSE is for univariate signals only. The simultaneous analysis of multi-channel (multivariate) data often over-performs studies based on univariate signals. We therefore introduce an extension of RCMSE to multivariate data. Applications of multivariate RCMSE to simulated processes reveal its better performances over the standard multivariate MSE. - Highlights: • Multiscale entropy quantifies data complexity but may be inaccurate at large scale. • A refined composite multiscale entropy (RCMSE) has therefore recently been proposed. • Nevertheless, RCMSE is adapted to univariate time series only. • We herein introduce an extension of RCMSE to multivariate data. • It shows better performances than the standard multivariate multiscale entropy.

  1. Multicollinearity in hierarchical linear models.

    Science.gov (United States)

    Yu, Han; Jiang, Shanhe; Land, Kenneth C

    2015-09-01

    This study investigates an ill-posed problem (multicollinearity) in Hierarchical Linear Models from both the data and the model perspectives. We propose an intuitive, effective approach to diagnosing the presence of multicollinearity and its remedies in this class of models. A simulation study demonstrates the impacts of multicollinearity on coefficient estimates, associated standard errors, and variance components at various levels of multicollinearity for finite sample sizes typical in social science studies. We further investigate the role multicollinearity plays at each level for estimation of coefficient parameters in terms of shrinkage. Based on these analyses, we recommend a top-down method for assessing multicollinearity in HLMs that first examines the contextual predictors (Level-2 in a two-level model) and then the individual predictors (Level-1) and uses the results for data collection, research problem redefinition, model re-specification, variable selection and estimation of a final model. Copyright © 2015 Elsevier Inc. All rights reserved.

  2. Distributed hierarchical radiation monitoring system

    International Nuclear Information System (INIS)

    Barak, D.

    1985-01-01

    A solution to the problem of monitoring the radiation levels in and around a nuclear facility is presented in this paper. This is a private case of a large scale general purpose data acqisition system with high reliability, availability and short maintenance time. The physical layout of the detectors in the plant, and the strict control demands dictated a distributed and hierarchical system. The system is comprised of three levels, each level contains modules. Level one contains the Control modules which collects data from groups of detectors and executes emergency local control tasks. In level two are the Group controllers which concentrate data from the Control modules, and enable local display and communication. The system computer is in level three, enabling the plant operator to receive information from the detectors and execute control tasks. The described system was built and is operating successfully for about two years. (author)

  3. Hierarchical Control for Smart Grids

    DEFF Research Database (Denmark)

    Trangbæk, K; Bendtsen, Jan Dimon; Stoustrup, Jakob

    2011-01-01

    of autonomous consumers. The control system is tasked with balancing electric power production and consumption within the smart grid, and makes active use of the flexibility of a large number of power producing and/or power consuming units. The objective is to accommodate the load variation on the grid, arising......This paper deals with hierarchical model predictive control (MPC) of smart grid systems. The design consists of a high level MPC controller, a second level of so-called aggregators, which reduces the computational and communication-related load on the high-level control, and a lower level...... on one hand from varying consumption, and on the other hand by natural variations in power production e.g. from wind turbines. The high-level MPC problem is solved using quadratic optimisation, while the aggregator level can either involve quadratic optimisation or simple sorting-based min-max solutions...

  4. Silver Films with Hierarchical Chirality.

    Science.gov (United States)

    Ma, Liguo; Cao, Yuanyuan; Duan, Yingying; Han, Lu; Che, Shunai

    2017-07-17

    Physical fabrication of chiral metallic films usually results in singular or large-sized chirality, restricting the optical asymmetric responses to long electromagnetic wavelengths. The chiral molecule-induced formation of silver films prepared chemically on a copper substrate through a redox reaction is presented. Three levels of chirality were identified: primary twisted nanoflakes with atomic crystal lattices, secondary helical stacking of these nanoflakes to form nanoplates, and tertiary micrometer-sized circinates consisting of chiral arranged nanoplates. The chiral Ag films exhibited multiple plasmonic absorption- and scattering-based optical activities at UV/Vis wavelengths based on their hierarchical chirality. The Ag films showed chiral selectivity for amino acids in catalytic electrochemical reactions, which originated from their primary atomic crystal lattices. © 2017 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.

  5. Hierarchical coarse-graining transform.

    Science.gov (United States)

    Pancaldi, Vera; King, Peter R; Christensen, Kim

    2009-03-01

    We present a hierarchical transform that can be applied to Laplace-like differential equations such as Darcy's equation for single-phase flow in a porous medium. A finite-difference discretization scheme is used to set the equation in the form of an eigenvalue problem. Within the formalism suggested, the pressure field is decomposed into an average value and fluctuations of different kinds and at different scales. The application of the transform to the equation allows us to calculate the unknown pressure with a varying level of detail. A procedure is suggested to localize important features in the pressure field based only on the fine-scale permeability, and hence we develop a form of adaptive coarse graining. The formalism and method are described and demonstrated using two synthetic toy problems.

  6. Advanced statistics: linear regression, part I: simple linear regression.

    Science.gov (United States)

    Marill, Keith A

    2004-01-01

    Simple linear regression is a mathematical technique used to model the relationship between a single independent predictor variable and a single dependent outcome variable. In this, the first of a two-part series exploring concepts in linear regression analysis, the four fundamental assumptions and the mechanics of simple linear regression are reviewed. The most common technique used to derive the regression line, the method of least squares, is described. The reader will be acquainted with other important concepts in simple linear regression, including: variable transformations, dummy variables, relationship to inference testing, and leverage. Simplified clinical examples with small datasets and graphic models are used to illustrate the points. This will provide a foundation for the second article in this series: a discussion of multiple linear regression, in which there are multiple predictor variables.

  7. Association between parental guilt and oral health problems in preschool children: a hierarchical approach.

    Science.gov (United States)

    Gomes, Monalisa Cesarino; Clementino, Marayza Alves; Pinto-Sarmento, Tassia Cristina de Almeida; Martins, Carolina Castro; Granville-Garcia, Ana Flávia; Paiva, Saul Martins

    2014-08-16

    Dental caries and traumatic dental injury (TDI) can play an important role in the emergence of parental guilt, since parents feel responsible for their child's health. The aim of the present study was to evaluate the influence of oral health problems among preschool children on parental guilt. A preschool-based, cross-sectional study was carried out with 832 preschool children between three and five years of age in the city of Campina Grande, Brazil. Parents/caregivers answered the Brazilian version of the Early Childhood Oral Health Impact Scale (B-ECOHIS). The item "parental guilt" was the dependent variable. Questionnaires addressing socio-demographic variables (child's sex, child's age, parent's/caregiver's age, mother's schooling, type of preschool and household income), history of toothache and health perceptions (general and oral) were also administered. Clinical exams for dental caries and TDI were performed by three dentists who had undergone a training and calibration exercise (Kappa: 0.85-0.90). Poisson hierarchical regression was used to determine the significance of associations between parental guilt and oral health problems (α = 5%). The multivariate model was carried out on three levels using a hierarchical approach from distal to proximal determinants: 1) socio-demographic aspects; 2) health perceptions; and 3) oral health problems. The frequency of parental guilt was 22.8%. The following variables were significantly associated with parental guilt: parental perception of child's oral health as poor (PR = 2.010; 95% CI: 1.502-2.688), history of toothache (PR = 2.344; 95% CI: 1.755-3.130), cavitated lesions (PR = 2.002; 95% CI: 1.388-2.887), avulsion/luxation (PR = 2.029; 95% CI: 1.141-3.610) and tooth discoloration (PR = 1.540; 95% CI: 1.169-2.028). Based on the present findings, parental guilt increases with the occurrence of oral health problems that require treatment, such as dental caries and TDI of greater severity. Parental perceptions of

  8. Adaptive hierarchical multi-agent organizations

    NARCIS (Netherlands)

    Ghijsen, M.; Jansweijer, W.N.H.; Wielinga, B.J.; Babuška, R.; Groen, F.C.A.

    2010-01-01

    In this chapter, we discuss the design of adaptive hierarchical organizations for multi-agent systems (MAS). Hierarchical organizations have a number of advantages such as their ability to handle complex problems and their scalability to large organizations. By introducing adaptivity in the

  9. The Case for a Hierarchical Cosmology

    Science.gov (United States)

    Vaucouleurs, G. de

    1970-01-01

    The development of modern theoretical cosmology is presented and some questionable assumptions of orthodox cosmology are pointed out. Suggests that recent observations indicate that hierarchical clustering is a basic factor in cosmology. The implications of hierarchical models of the universe are considered. Bibliography. (LC)

  10. Discovering hierarchical structure in normal relational data

    DEFF Research Database (Denmark)

    Schmidt, Mikkel Nørgaard; Herlau, Tue; Mørup, Morten

    2014-01-01

    -parametric generative model for hierarchical clustering of similarity based on multifurcating Gibbs fragmentation trees. This allows us to infer and display the posterior distribution of hierarchical structures that comply with the data. We demonstrate the utility of our method on synthetic data and data of functional...

  11. Biased trapping issue on weighted hierarchical networks

    Indian Academy of Sciences (India)

    archical networks which are based on the classic scale-free hierarchical networks. ... Weighted hierarchical networks; weight-dependent walks; mean first passage ..... The weighted networks can mimic some real-world natural and social systems to ... the Priority Academic Program Development of Jiangsu Higher Education ...

  12. Analysis of multi-species point patterns using multivariate log Gaussian Cox processes

    DEFF Research Database (Denmark)

    Waagepetersen, Rasmus; Guan, Yongtao; Jalilian, Abdollah

    Multivariate log Gaussian Cox processes are flexible models for multivariate point patterns. However, they have so far only been applied in bivariate cases. In this paper we move beyond the bivariate case in order to model multi-species point patterns of tree locations. In particular we address t...... of the data. The selected number of common latent fields provides an index of complexity of the multivariate covariance structure. Hierarchical clustering is used to identify groups of species with similar patterns of dependence on the common latent fields.......Multivariate log Gaussian Cox processes are flexible models for multivariate point patterns. However, they have so far only been applied in bivariate cases. In this paper we move beyond the bivariate case in order to model multi-species point patterns of tree locations. In particular we address...... the problems of identifying parsimonious models and of extracting biologically relevant information from the fitted models. The latent multivariate Gaussian field is decomposed into components given in terms of random fields common to all species and components which are species specific. This allows...

  13. Hierarchically Nanostructured Materials for Sustainable Environmental Applications

    Directory of Open Access Journals (Sweden)

    Zheng eRen

    2013-11-01

    Full Text Available This article presents a comprehensive overview of the hierarchical nanostructured materials with either geometry or composition complexity in environmental applications. The hierarchical nanostructures offer advantages of high surface area, synergistic interactions and multiple functionalities towards water remediation, environmental gas sensing and monitoring as well as catalytic gas treatment. Recent advances in synthetic strategies for various hierarchical morphologies such as hollow spheres and urchin-shaped architectures have been reviewed. In addition to the chemical synthesis, the physical mechanisms associated with the materials design and device fabrication have been discussed for each specific application. The development and application of hierarchical complex perovskite oxide nanostructures have also been introduced in photocatalytic water remediation, gas sensing and catalytic converter. Hierarchical nanostructures will open up many possibilities for materials design and device fabrication in environmental chemistry and technology.

  14. Hierarchical Rhetorical Sentence Categorization for Scientific Papers

    Science.gov (United States)

    Rachman, G. H.; Khodra, M. L.; Widyantoro, D. H.

    2018-03-01

    Important information in scientific papers can be composed of rhetorical sentences that is structured from certain categories. To get this information, text categorization should be conducted. Actually, some works in this task have been completed by employing word frequency, semantic similarity words, hierarchical classification, and the others. Therefore, this paper aims to present the rhetorical sentence categorization from scientific paper by employing TF-IDF and Word2Vec to capture word frequency and semantic similarity words and employing hierarchical classification. Every experiment is tested in two classifiers, namely Naïve Bayes and SVM Linear. This paper shows that hierarchical classifier is better than flat classifier employing either TF-IDF or Word2Vec, although it increases only almost 2% from 27.82% when using flat classifier until 29.61% when using hierarchical classifier. It shows also different learning model for child-category can be built by hierarchical classifier.

  15. Processing of hierarchical syntactic structure in music.

    Science.gov (United States)

    Koelsch, Stefan; Rohrmeier, Martin; Torrecuso, Renzo; Jentschke, Sebastian

    2013-09-17

    Hierarchical structure with nested nonlocal dependencies is a key feature of human language and can be identified theoretically in most pieces of tonal music. However, previous studies have argued against the perception of such structures in music. Here, we show processing of nonlocal dependencies in music. We presented chorales by J. S. Bach and modified versions in which the hierarchical structure was rendered irregular whereas the local structure was kept intact. Brain electric responses differed between regular and irregular hierarchical structures, in both musicians and nonmusicians. This finding indicates that, when listening to music, humans apply cognitive processes that are capable of dealing with long-distance dependencies resulting from hierarchically organized syntactic structures. Our results reveal that a brain mechanism fundamental for syntactic processing is engaged during the perception of music, indicating that processing of hierarchical structure with nested nonlocal dependencies is not just a key component of human language, but a multidomain capacity of human cognition.

  16. Nonparametric regression using the concept of minimum energy

    International Nuclear Information System (INIS)

    Williams, Mike

    2011-01-01

    It has recently been shown that an unbinned distance-based statistic, the energy, can be used to construct an extremely powerful nonparametric multivariate two sample goodness-of-fit test. An extension to this method that makes it possible to perform nonparametric regression using multiple multivariate data sets is presented in this paper. The technique, which is based on the concept of minimizing the energy of the system, permits determination of parameters of interest without the need for parametric expressions of the parent distributions of the data sets. The application and performance of this new method is discussed in the context of some simple example analyses.

  17. Linear regression in astronomy. II

    Science.gov (United States)

    Feigelson, Eric D.; Babu, Gutti J.

    1992-01-01

    A wide variety of least-squares linear regression procedures used in observational astronomy, particularly investigations of the cosmic distance scale, are presented and discussed. The classes of linear models considered are (1) unweighted regression lines, with bootstrap and jackknife resampling; (2) regression solutions when measurement error, in one or both variables, dominates the scatter; (3) methods to apply a calibration line to new data; (4) truncated regression models, which apply to flux-limited data sets; and (5) censored regression models, which apply when nondetections are present. For the calibration problem we develop two new procedures: a formula for the intercept offset between two parallel data sets, which propagates slope errors from one regression to the other; and a generalization of the Working-Hotelling confidence bands to nonstandard least-squares lines. They can provide improved error analysis for Faber-Jackson, Tully-Fisher, and similar cosmic distance scale relations.

  18. Time-adaptive quantile regression

    DEFF Research Database (Denmark)

    Møller, Jan Kloppenborg; Nielsen, Henrik Aalborg; Madsen, Henrik

    2008-01-01

    and an updating procedure are combined into a new algorithm for time-adaptive quantile regression, which generates new solutions on the basis of the old solution, leading to savings in computation time. The suggested algorithm is tested against a static quantile regression model on a data set with wind power......An algorithm for time-adaptive quantile regression is presented. The algorithm is based on the simplex algorithm, and the linear optimization formulation of the quantile regression problem is given. The observations have been split to allow a direct use of the simplex algorithm. The simplex method...... production, where the models combine splines and quantile regression. The comparison indicates superior performance for the time-adaptive quantile regression in all the performance parameters considered....

  19. TMVA(Toolkit for Multivariate Analysis) new architectures design and implementation.

    CERN Document Server

    Zapata Mesa, Omar Andres

    2016-01-01

    Toolkit for Multivariate Analysis(TMVA) is a package in ROOT for machine learning algorithms for classification and regression of the events in the detectors. In TMVA, we are developing new high level algorithms to perform multivariate analysis as cross validation, hyper parameter optimization, variable importance etc... Almost all the algorithms are expensive and designed to process a huge amount of data. It is very important to implement the new technologies on parallel computing to reduce the processing times.

  20. Handbook of univariate and multivariate data analysis with IBM SPSS

    CERN Document Server

    Ho, Robert

    2013-01-01

    Using the same accessible, hands-on approach as its best-selling predecessor, the Handbook of Univariate and Multivariate Data Analysis with IBM SPSS, Second Edition explains how to apply statistical tests to experimental findings, identify the assumptions underlying the tests, and interpret the findings. This second edition now covers more topics and has been updated with the SPSS statistical package for Windows.New to the Second EditionThree new chapters on multiple discriminant analysis, logistic regression, and canonical correlationNew section on how to deal with missing dataCoverage of te

  1. The Realized Hierarchical Archimedean Copula in Risk Modelling

    Directory of Open Access Journals (Sweden)

    Ostap Okhrin

    2017-06-01

    Full Text Available This paper introduces the concept of the realized hierarchical Archimedean copula (rHAC. The proposed approach inherits the ability of the copula to capture the dependencies among financial time series, and combines it with additional information contained in high-frequency data. The considered model does not suffer from the curse of dimensionality, and is able to accurately predict high-dimensional distributions. This flexibility is obtained by using a hierarchical structure in the copula. The time variability of the model is provided by daily forecasts of the realized correlation matrix, which is used to estimate the structure and the parameters of the rHAC. Extensive simulation studies show the validity of the estimator based on this realized correlation matrix, and its performance, in comparison to the benchmark models. The application of the estimator to one-day-ahead Value at Risk (VaR prediction using high-frequency data exhibits good forecasting properties for a multivariate portfolio.

  2. Quantile regression theory and applications

    CERN Document Server

    Davino, Cristina; Vistocco, Domenico

    2013-01-01

    A guide to the implementation and interpretation of Quantile Regression models This book explores the theory and numerous applications of quantile regression, offering empirical data analysis as well as the software tools to implement the methods. The main focus of this book is to provide the reader with a comprehensivedescription of the main issues concerning quantile regression; these include basic modeling, geometrical interpretation, estimation and inference for quantile regression, as well as issues on validity of the model, diagnostic tools. Each methodological aspect is explored and

  3. Hierarchical Bayesian modelling of mobility metrics for hazard model input calibration

    Science.gov (United States)

    Calder, Eliza; Ogburn, Sarah; Spiller, Elaine; Rutarindwa, Regis; Berger, Jim

    2015-04-01

    In this work we present a method to constrain flow mobility input parameters for pyroclastic flow models using hierarchical Bayes modeling of standard mobility metrics such as H/L and flow volume etc. The advantage of hierarchical modeling is that it can leverage the information in global dataset for a particular mobility metric in order to reduce the uncertainty in modeling of an individual volcano, especially important where individual volcanoes have only sparse datasets. We use compiled pyroclastic flow runout data from Colima, Merapi, Soufriere Hills, Unzen and Semeru volcanoes, presented in an open-source database FlowDat (https://vhub.org/groups/massflowdatabase). While the exact relationship between flow volume and friction varies somewhat between volcanoes, dome collapse flows originating from the same volcano exhibit similar mobility relationships. Instead of fitting separate regression models for each volcano dataset, we use a variation of the hierarchical linear model (Kass and Steffey, 1989). The model presents a hierarchical structure with two levels; all dome collapse flows and dome collapse flows at specific volcanoes. The hierarchical model allows us to assume that the flows at specific volcanoes share a common distribution of regression slopes, then solves for that distribution. We present comparisons of the 95% confidence intervals on the individual regression lines for the data set from each volcano as well as those obtained from the hierarchical model. The results clearly demonstrate the advantage of considering global datasets using this technique. The technique developed is demonstrated here for mobility metrics, but can be applied to many other global datasets of volcanic parameters. In particular, such methods can provide a means to better contain parameters for volcanoes for which we only have sparse data, a ubiquitous problem in volcanology.

  4. Multivariate Marshall and Olkin Exponential Minification Process ...

    African Journals Online (AJOL)

    A stationary bivariate minification process with bivariate Marshall-Olkin exponential distribution that was earlier studied by Miroslav et al [15]is in this paper extended to multivariate minification process with multivariate Marshall and Olkin exponential distribution as its stationary marginal distribution. The innovation and the ...

  5. Multivariate multiscale entropy of financial markets

    Science.gov (United States)

    Lu, Yunfan; Wang, Jun

    2017-11-01

    In current process of quantifying the dynamical properties of the complex phenomena in financial market system, the multivariate financial time series are widely concerned. In this work, considering the shortcomings and limitations of univariate multiscale entropy in analyzing the multivariate time series, the multivariate multiscale sample entropy (MMSE), which can evaluate the complexity in multiple data channels over different timescales, is applied to quantify the complexity of financial markets. Its effectiveness and advantages have been detected with numerical simulations with two well-known synthetic noise signals. For the first time, the complexity of four generated trivariate return series for each stock trading hour in China stock markets is quantified thanks to the interdisciplinary application of this method. We find that the complexity of trivariate return series in each hour show a significant decreasing trend with the stock trading time progressing. Further, the shuffled multivariate return series and the absolute multivariate return series are also analyzed. As another new attempt, quantifying the complexity of global stock markets (Asia, Europe and America) is carried out by analyzing the multivariate returns from them. Finally we utilize the multivariate multiscale entropy to assess the relative complexity of normalized multivariate return volatility series with different degrees.

  6. Prediction of periodically correlated processes by wavelet transform and multivariate methods with applications to climatological data

    Science.gov (United States)

    Ghanbarzadeh, Mitra; Aminghafari, Mina

    2015-05-01

    This article studies the prediction of periodically correlated process using wavelet transform and multivariate methods with applications to climatological data. Periodically correlated processes can be reformulated as multivariate stationary processes. Considering this fact, two new prediction methods are proposed. In the first method, we use stepwise regression between the principal components of the multivariate stationary process and past wavelet coefficients of the process to get a prediction. In the second method, we propose its multivariate version without principal component analysis a priori. Also, we study a generalization of the prediction methods dealing with a deterministic trend using exponential smoothing. Finally, we illustrate the performance of the proposed methods on simulated and real climatological data (ozone amounts, flows of a river, solar radiation, and sea levels) compared with the multivariate autoregressive model. The proposed methods give good results as we expected.

  7. Collaborative regression-based anatomical landmark detection

    International Nuclear Information System (INIS)

    Gao, Yaozong; Shen, Dinggang

    2015-01-01

    Anatomical landmark detection plays an important role in medical image analysis, e.g. for registration, segmentation and quantitative analysis. Among the various existing methods for landmark detection, regression-based methods have recently attracted much attention due to their robustness and efficiency. In these methods, landmarks are localised through voting from all image voxels, which is completely different from the classification-based methods that use voxel-wise classification to detect landmarks. Despite their robustness, the accuracy of regression-based landmark detection methods is often limited due to (1) the inclusion of uninformative image voxels in the voting procedure, and (2) the lack of effective ways to incorporate inter-landmark spatial dependency into the detection step. In this paper, we propose a collaborative landmark detection framework to address these limitations. The concept of collaboration is reflected in two aspects. (1) Multi-resolution collaboration. A multi-resolution strategy is proposed to hierarchically localise landmarks by gradually excluding uninformative votes from faraway voxels. Moreover, for informative voxels near the landmark, a spherical sampling strategy is also designed at the training stage to improve their prediction accuracy. (2) Inter-landmark collaboration. A confidence-based landmark detection strategy is proposed to improve the detection accuracy of ‘difficult-to-detect’ landmarks by using spatial guidance from ‘easy-to-detect’ landmarks. To evaluate our method, we conducted experiments extensively on three datasets for detecting prostate landmarks and head and neck landmarks in computed tomography images, and also dental landmarks in cone beam computed tomography images. The results show the effectiveness of our collaborative landmark detection framework in improving landmark detection accuracy, compared to other state-of-the-art methods. (paper)

  8. Computational Tools for Probing Interactions in Multiple Linear Regression, Multilevel Modeling, and Latent Curve Analysis

    Science.gov (United States)

    Preacher, Kristopher J.; Curran, Patrick J.; Bauer, Daniel J.

    2006-01-01

    Simple slopes, regions of significance, and confidence bands are commonly used to evaluate interactions in multiple linear regression (MLR) models, and the use of these techniques has recently been extended to multilevel or hierarchical linear modeling (HLM) and latent curve analysis (LCA). However, conducting these tests and plotting the…

  9. Multivariate statistical analysis of a multi-step industrial processes

    DEFF Research Database (Denmark)

    Reinikainen, S.P.; Høskuldsson, Agnar

    2007-01-01

    Monitoring and quality control of industrial processes often produce information on how the data have been obtained. In batch processes, for instance, the process is carried out in stages; some process or control parameters are set at each stage. However, the obtained data might not be utilized...... efficiently, even if this information may reveal significant knowledge about process dynamics or ongoing phenomena. When studying the process data, it may be important to analyse the data in the light of the physical or time-wise development of each process step. In this paper, a unified approach to analyse...... multivariate multi-step processes, where results from each step are used to evaluate future results, is presented. The methods presented are based on Priority PLS Regression. The basic idea is to compute the weights in the regression analysis for given steps, but adjust all data by the resulting score vectors...

  10. Hierarchical ordering with partial pairwise hierarchical relationships on the macaque brain data sets.

    Directory of Open Access Journals (Sweden)

    Woosang Lim

    Full Text Available Hierarchical organizations of information processing in the brain networks have been known to exist and widely studied. To find proper hierarchical structures in the macaque brain, the traditional methods need the entire pairwise hierarchical relationships between cortical areas. In this paper, we present a new method that discovers hierarchical structures of macaque brain networks by using partial information of pairwise hierarchical relationships. Our method uses a graph-based manifold learning to exploit inherent relationship, and computes pseudo distances of hierarchical levels for every pair of cortical areas. Then, we compute hierarchy levels of all cortical areas by minimizing the sum of squared hierarchical distance errors with the hierarchical information of few cortical areas. We evaluate our method on the macaque brain data sets whose true hierarchical levels are known as the FV91 model. The experimental results show that hierarchy levels computed by our method are similar to the FV91 model, and its errors are much smaller than the errors of hierarchical clustering approaches.

  11. Panel Smooth Transition Regression Models

    DEFF Research Database (Denmark)

    González, Andrés; Terasvirta, Timo; Dijk, Dick van

    We introduce the panel smooth transition regression model. This new model is intended for characterizing heterogeneous panels, allowing the regression coefficients to vary both across individuals and over time. Specifically, heterogeneity is allowed for by assuming that these coefficients are bou...

  12. Testing discontinuities in nonparametric regression

    KAUST Repository

    Dai, Wenlin

    2017-01-19

    In nonparametric regression, it is often needed to detect whether there are jump discontinuities in the mean function. In this paper, we revisit the difference-based method in [13 H.-G. Müller and U. Stadtmüller, Discontinuous versus smooth regression, Ann. Stat. 27 (1999), pp. 299–337. doi: 10.1214/aos/1018031100

  13. Testing discontinuities in nonparametric regression

    KAUST Repository

    Dai, Wenlin; Zhou, Yuejin; Tong, Tiejun

    2017-01-01

    In nonparametric regression, it is often needed to detect whether there are jump discontinuities in the mean function. In this paper, we revisit the difference-based method in [13 H.-G. Müller and U. Stadtmüller, Discontinuous versus smooth regression, Ann. Stat. 27 (1999), pp. 299–337. doi: 10.1214/aos/1018031100

  14. Logistic Regression: Concept and Application

    Science.gov (United States)

    Cokluk, Omay

    2010-01-01

    The main focus of logistic regression analysis is classification of individuals in different groups. The aim of the present study is to explain basic concepts and processes of binary logistic regression analysis intended to determine the combination of independent variables which best explain the membership in certain groups called dichotomous…

  15. Classification using Hierarchical Naive Bayes models

    DEFF Research Database (Denmark)

    Langseth, Helge; Dyhre Nielsen, Thomas

    2006-01-01

    Classification problems have a long history in the machine learning literature. One of the simplest, and yet most consistently well-performing set of classifiers is the Naïve Bayes models. However, an inherent problem with these classifiers is the assumption that all attributes used to describe......, termed Hierarchical Naïve Bayes models. Hierarchical Naïve Bayes models extend the modeling flexibility of Naïve Bayes models by introducing latent variables to relax some of the independence statements in these models. We propose a simple algorithm for learning Hierarchical Naïve Bayes models...

  16. Hierarchical analysis of acceptable use policies

    Directory of Open Access Journals (Sweden)

    P. A. Laughton

    2008-01-01

    Full Text Available Acceptable use policies (AUPs are vital tools for organizations to protect themselves and their employees from misuse of computer facilities provided. A well structured, thorough AUP is essential for any organization. It is impossible for an effective AUP to deal with every clause and remain readable. For this reason, some sections of an AUP carry more weight than others, denoting importance. The methodology used to develop the hierarchical analysis is a literature review, where various sources were consulted. This hierarchical approach to AUP analysis attempts to highlight important sections and clauses dealt with in an AUP. The emphasis of the hierarchal analysis is to prioritize the objectives of an AUP.

  17. Hierarchical modeling and analysis for spatial data

    CERN Document Server

    Banerjee, Sudipto; Gelfand, Alan E

    2003-01-01

    Among the many uses of hierarchical modeling, their application to the statistical analysis of spatial and spatio-temporal data from areas such as epidemiology And environmental science has proven particularly fruitful. Yet to date, the few books that address the subject have been either too narrowly focused on specific aspects of spatial analysis, or written at a level often inaccessible to those lacking a strong background in mathematical statistics.Hierarchical Modeling and Analysis for Spatial Data is the first accessible, self-contained treatment of hierarchical methods, modeling, and dat

  18. Hierarchically structured, nitrogen-doped carbon membranes

    KAUST Repository

    Wang, Hong

    2017-08-03

    The present invention is a structure, method of making and method of use for a novel macroscopic hierarchically structured, nitrogen-doped, nano-porous carbon membrane (HNDCMs) with asymmetric and hierarchical pore architecture that can be produced on a large-scale approach. The unique HNDCM holds great promise as components in separation and advanced carbon devices because they could offer unconventional fluidic transport phenomena on the nanoscale. Overall, the invention set forth herein covers a hierarchically structured, nitrogen-doped carbon membranes and methods of making and using such a membranes.

  19. Fungible weights in logistic regression.

    Science.gov (United States)

    Jones, Jeff A; Waller, Niels G

    2016-06-01

    In this article we develop methods for assessing parameter sensitivity in logistic regression models. To set the stage for this work, we first review Waller's (2008) equations for computing fungible weights in linear regression. Next, we describe 2 methods for computing fungible weights in logistic regression. To demonstrate the utility of these methods, we compute fungible logistic regression weights using data from the Centers for Disease Control and Prevention's (2010) Youth Risk Behavior Surveillance Survey, and we illustrate how these alternate weights can be used to evaluate parameter sensitivity. To make our work accessible to the research community, we provide R code (R Core Team, 2015) that will generate both kinds of fungible logistic regression weights. (PsycINFO Database Record (c) 2016 APA, all rights reserved).

  20. Ordinary least square regression, orthogonal regression, geometric mean regression and their applications in aerosol science

    International Nuclear Information System (INIS)

    Leng Ling; Zhang Tianyi; Kleinman, Lawrence; Zhu Wei

    2007-01-01

    Regression analysis, especially the ordinary least squares method which assumes that errors are confined to the dependent variable, has seen a fair share of its applications in aerosol science. The ordinary least squares approach, however, could be problematic due to the fact that atmospheric data often does not lend itself to calling one variable independent and the other dependent. Errors often exist for both measurements. In this work, we examine two regression approaches available to accommodate this situation. They are orthogonal regression and geometric mean regression. Comparisons are made theoretically as well as numerically through an aerosol study examining whether the ratio of organic aerosol to CO would change with age

  1. Tumor regression patterns in retinoblastoma

    International Nuclear Information System (INIS)

    Zafar, S.N.; Siddique, S.N.; Zaheer, N.

    2016-01-01

    To observe the types of tumor regression after treatment, and identify the common pattern of regression in our patients. Study Design: Descriptive study. Place and Duration of Study: Department of Pediatric Ophthalmology and Strabismus, Al-Shifa Trust Eye Hospital, Rawalpindi, Pakistan, from October 2011 to October 2014. Methodology: Children with unilateral and bilateral retinoblastoma were included in the study. Patients were referred to Pakistan Institute of Medical Sciences, Islamabad, for chemotherapy. After every cycle of chemotherapy, dilated funds examination under anesthesia was performed to record response of the treatment. Regression patterns were recorded on RetCam II. Results: Seventy-four tumors were included in the study. Out of 74 tumors, 3 were ICRB group A tumors, 43 were ICRB group B tumors, 14 tumors belonged to ICRB group C, and remaining 14 were ICRB group D tumors. Type IV regression was seen in 39.1% (n=29) tumors, type II in 29.7% (n=22), type III in 25.6% (n=19), and type I in 5.4% (n=4). All group A tumors (100%) showed type IV regression. Seventeen (39.5%) group B tumors showed type IV regression. In group C, 5 tumors (35.7%) showed type II regression and 5 tumors (35.7%) showed type IV regression. In group D, 6 tumors (42.9%) regressed to type II non-calcified remnants. Conclusion: The response and success of the focal and systemic treatment, as judged by the appearance of different patterns of tumor regression, varies with the ICRB grouping of the tumor. (author)

  2. Regression to Causality : Regression-style presentation influences causal attribution

    DEFF Research Database (Denmark)

    Bordacconi, Mats Joe; Larsen, Martin Vinæs

    2014-01-01

    of equivalent results presented as either regression models or as a test of two sample means. Our experiment shows that the subjects who were presented with results as estimates from a regression model were more inclined to interpret these results causally. Our experiment implies that scholars using regression...... models – one of the primary vehicles for analyzing statistical results in political science – encourage causal interpretation. Specifically, we demonstrate that presenting observational results in a regression model, rather than as a simple comparison of means, makes causal interpretation of the results...... more likely. Our experiment drew on a sample of 235 university students from three different social science degree programs (political science, sociology and economics), all of whom had received substantial training in statistics. The subjects were asked to compare and evaluate the validity...

  3. Multivariate meta-analysis: Potential and promise

    Science.gov (United States)

    Jackson, Dan; Riley, Richard; White, Ian R

    2011-01-01

    The multivariate random effects model is a generalization of the standard univariate model. Multivariate meta-analysis is becoming more commonly used and the techniques and related computer software, although continually under development, are now in place. In order to raise awareness of the multivariate methods, and discuss their advantages and disadvantages, we organized a one day ‘Multivariate meta-analysis’ event at the Royal Statistical Society. In addition to disseminating the most recent developments, we also received an abundance of comments, concerns, insights, critiques and encouragement. This article provides a balanced account of the day's discourse. By giving others the opportunity to respond to our assessment, we hope to ensure that the various view points and opinions are aired before multivariate meta-analysis simply becomes another widely used de facto method without any proper consideration of it by the medical statistics community. We describe the areas of application that multivariate meta-analysis has found, the methods available, the difficulties typically encountered and the arguments for and against the multivariate methods, using four representative but contrasting examples. We conclude that the multivariate methods can be useful, and in particular can provide estimates with better statistical properties, but also that these benefits come at the price of making more assumptions which do not result in better inference in every case. Although there is evidence that multivariate meta-analysis has considerable potential, it must be even more carefully applied than its univariate counterpart in practice. Copyright © 2011 John Wiley & Sons, Ltd. PMID:21268052

  4. Regression analysis with categorized regression calibrated exposure: some interesting findings

    Directory of Open Access Journals (Sweden)

    Hjartåker Anette

    2006-07-01

    Full Text Available Abstract Background Regression calibration as a method for handling measurement error is becoming increasingly well-known and used in epidemiologic research. However, the standard version of the method is not appropriate for exposure analyzed on a categorical (e.g. quintile scale, an approach commonly used in epidemiologic studies. A tempting solution could then be to use the predicted continuous exposure obtained through the regression calibration method and treat it as an approximation to the true exposure, that is, include the categorized calibrated exposure in the main regression analysis. Methods We use semi-analytical calculations and simulations to evaluate the performance of the proposed approach compared to the naive approach of not correcting for measurement error, in situations where analyses are performed on quintile scale and when incorporating the original scale into the categorical variables, respectively. We also present analyses of real data, containing measures of folate intake and depression, from the Norwegian Women and Cancer study (NOWAC. Results In cases where extra information is available through replicated measurements and not validation data, regression calibration does not maintain important qualities of the true exposure distribution, thus estimates of variance and percentiles can be severely biased. We show that the outlined approach maintains much, in some cases all, of the misclassification found in the observed exposure. For that reason, regression analysis with the corrected variable included on a categorical scale is still biased. In some cases the corrected estimates are analytically equal to those obtained by the naive approach. Regression calibration is however vastly superior to the naive method when applying the medians of each category in the analysis. Conclusion Regression calibration in its most well-known form is not appropriate for measurement error correction when the exposure is analyzed on a

  5. Extracting bb Higgs Decay Signals using Multivariate Techniques

    Energy Technology Data Exchange (ETDEWEB)

    Smith, W Clarke; /George Washington U. /SLAC

    2012-08-28

    For low-mass Higgs boson production at ATLAS at {radical}s = 7 TeV, the hard subprocess gg {yields} h{sup 0} {yields} b{bar b} dominates but is in turn drowned out by background. We seek to exploit the intrinsic few-MeV mass width of the Higgs boson to observe it above the background in b{bar b}-dijet mass plots. The mass resolution of existing mass-reconstruction algorithms is insufficient for this purpose due to jet combinatorics, that is, the algorithms cannot identify every jet that results from b{bar b} Higgs decay. We combine these algorithms using the neural net (NN) and boosted regression tree (BDT) multivariate methods in attempt to improve the mass resolution. Events involving gg {yields} h{sup 0} {yields} b{bar b} are generated using Monte Carlo methods with Pythia and then the Toolkit for Multivariate Analysis (TMVA) is used to train and test NNs and BDTs. For a 120 GeV Standard Model Higgs boson, the m{sub h{sup 0}}-reconstruction width is reduced from 8.6 to 6.5 GeV. Most importantly, however, the methods used here allow for more advanced m{sub h{sup 0}}-reconstructions to be created in the future using multivariate methods.

  6. Multivariate statistical methods a first course

    CERN Document Server

    Marcoulides, George A

    2014-01-01

    Multivariate statistics refer to an assortment of statistical methods that have been developed to handle situations in which multiple variables or measures are involved. Any analysis of more than two variables or measures can loosely be considered a multivariate statistical analysis. An introductory text for students learning multivariate statistical methods for the first time, this book keeps mathematical details to a minimum while conveying the basic principles. One of the principal strategies used throughout the book--in addition to the presentation of actual data analyses--is poin

  7. Multivariable control in nuclear power stations

    International Nuclear Information System (INIS)

    Parent, M.; McMorran, P.D.

    1982-11-01

    Multivariable methods have the potential to improve the control of large systems such as nuclear power stations. Linear-quadratic optimal control is a multivariable method based on the minimization of a cost function. A related technique leads to the Kalman filter for estimation of plant state from noisy measurements. A design program for optimal control and Kalman filtering has been developed as part of a computer-aided design package for multivariable control systems. The method is demonstrated on a model of a nuclear steam generator, and simulated results are presented

  8. Logic regression and its extensions.

    Science.gov (United States)

    Schwender, Holger; Ruczinski, Ingo

    2010-01-01

    Logic regression is an adaptive classification and regression procedure, initially developed to reveal interacting single nucleotide polymorphisms (SNPs) in genetic association studies. In general, this approach can be used in any setting with binary predictors, when the interaction of these covariates is of primary interest. Logic regression searches for Boolean (logic) combinations of binary variables that best explain the variability in the outcome variable, and thus, reveals variables and interactions that are associated with the response and/or have predictive capabilities. The logic expressions are embedded in a generalized linear regression framework, and thus, logic regression can handle a variety of outcome types, such as binary responses in case-control studies, numeric responses, and time-to-event data. In this chapter, we provide an introduction to the logic regression methodology, list some applications in public health and medicine, and summarize some of the direct extensions and modifications of logic regression that have been proposed in the literature. Copyright © 2010 Elsevier Inc. All rights reserved.

  9. Zeolitic materials with hierarchical porous structures.

    Science.gov (United States)

    Lopez-Orozco, Sofia; Inayat, Amer; Schwab, Andreas; Selvam, Thangaraj; Schwieger, Wilhelm

    2011-06-17

    During the past several years, different kinds of hierarchical structured zeolitic materials have been synthesized due to their highly attractive properties, such as superior mass/heat transfer characteristics, lower restriction of the diffusion of reactants in the mesopores, and low pressure drop. Our contribution provides general information regarding types and preparation methods of hierarchical zeolitic materials and their relative advantages and disadvantages. Thereafter, recent advances in the preparation and characterization of hierarchical zeolitic structures within the crystallites by post-synthetic treatment methods, such as dealumination or desilication; and structured devices by in situ and ex situ zeolite coatings on open-cellular ceramic foams as (non-reactive as well as reactive) supports are highlighted. Specific advantages of using hierarchical zeolitic catalysts/structures in selected catalytic reactions, such as benzene to phenol (BTOP) and methanol to olefins (MTO) are presented. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  10. HIERARCHICAL ORGANIZATION OF INFORMATION, IN RELATIONAL DATABASES

    Directory of Open Access Journals (Sweden)

    Demian Horia

    2008-05-01

    Full Text Available In this paper I will present different types of representation, of hierarchical information inside a relational database. I also will compare them to find the best organization for specific scenarios.

  11. Hierarchical DSE for multi-ASIP platforms

    DEFF Research Database (Denmark)

    Micconi, Laura; Corvino, Rosilde; Gangadharan, Deepak

    2013-01-01

    This work proposes a hierarchical Design Space Exploration (DSE) for the design of multi-processor platforms targeted to specific applications with strict timing and area constraints. In particular, it considers platforms integrating multiple Application Specific Instruction Set Processors (ASIPs...

  12. Packaging glass with hierarchically nanostructured surface

    KAUST Repository

    He, Jr-Hau

    2017-08-03

    An optical device includes an active region and packaging glass located on top of the active region. A top surface of the packaging glass includes hierarchical nanostructures comprised of honeycombed nanowalls (HNWs) and nanorod (NR) structures extending from the HNWs.

  13. Packaging glass with hierarchically nanostructured surface

    KAUST Repository

    He, Jr-Hau; Fu, Hui-Chun

    2017-01-01

    An optical device includes an active region and packaging glass located on top of the active region. A top surface of the packaging glass includes hierarchical nanostructures comprised of honeycombed nanowalls (HNWs) and nanorod (NR) structures

  14. Dirichlet Component Regression and its Applications to Psychiatric Data

    OpenAIRE

    Gueorguieva, Ralitza; Rosenheck, Robert; Zelterman, Daniel

    2008-01-01

    We describe a Dirichlet multivariable regression method useful for modeling data representing components as a percentage of a total. This model is motivated by the unmet need in psychiatry and other areas to simultaneously assess the effects of covariates on the relative contributions of different components of a measure. The model is illustrated using the Positive and Negative Syndrome Scale (PANSS) for assessment of schizophrenia symptoms which, like many other metrics in psychiatry, is com...

  15. Predicting allergic contact dermatitis: a hierarchical structure activity relationship (SAR) approach to chemical classification using topological and quantum chemical descriptors

    Science.gov (United States)

    Basak, Subhash C.; Mills, Denise; Hawkins, Douglas M.

    2008-06-01

    A hierarchical classification study was carried out based on a set of 70 chemicals—35 which produce allergic contact dermatitis (ACD) and 35 which do not. This approach was implemented using a regular ridge regression computer code, followed by conversion of regression output to binary data values. The hierarchical descriptor classes used in the modeling include topostructural (TS), topochemical (TC), and quantum chemical (QC), all of which are based solely on chemical structure. The concordance, sensitivity, and specificity are reported. The model based on the TC descriptors was found to be the best, while the TS model was extremely poor.

  16. Hierarchical organization versus self-organization

    OpenAIRE

    Busseniers, Evo

    2014-01-01

    In this paper we try to define the difference between hierarchical organization and self-organization. Organization is defined as a structure with a function. So we can define the difference between hierarchical organization and self-organization both on the structure as on the function. In the next two chapters these two definitions are given. For the structure we will use some existing definitions in graph theory, for the function we will use existing theory on (self-)organization. In the t...

  17. Hierarchical decision making for flood risk reduction

    DEFF Research Database (Denmark)

    Custer, Rocco; Nishijima, Kazuyoshi

    2013-01-01

    . In current practice, structures are often optimized individually without considering benefits of having a hierarchy of protection structures. It is here argued, that the joint consideration of hierarchically integrated protection structures is beneficial. A hierarchical decision model is utilized to analyze...... and compare the benefit of large upstream protection structures and local downstream protection structures in regard to epistemic uncertainty parameters. Results suggest that epistemic uncertainty influences the outcome of the decision model and that, depending on the magnitude of epistemic uncertainty...

  18. Advanced multivariate data evaluation for Fourier transform infrared spectroscopy

    International Nuclear Information System (INIS)

    Diewok, J.

    2002-12-01

    The objective of the presented dissertation was the evaluation, application and further development of advanced multivariate data evaluation methods for qualitative and quantitative Fourier transform infrared (FT-IR) measurements, especially of aqueous samples. The focus was set on 'evolving systems'; i.e. chemical systems that change gradually with a master variable, such as pH, reaction time, elution time, etc. and that are increasingly encountered in analytical chemistry. FT-IR measurements on such systems yield 2-way and 3-way data sets, i.e. data matrices and cubes. The chemometric methods used were soft-modeling techniques, like multivariate curve resolution - alternating least squares (MCR-ALS) or principal component analysis (PCA), hard modeling of equilibrium systems and two-dimensional correlation spectroscopy (2D-CoS). The research results are presented in six publications and comprise: A new combination of FT-IR flow titrations and second-order calibration by MCR-ALS for the quantitative analysis of mixture samples of organic acids and sugars. A novel combination of MCR-ALS with a hard-modeled equilibrium constraint for second-order quantitation in pH-modulated samples where analytes and interferences show very similar acid-base behavior. A detailed study in which MCR-ALS and 2D-CoS are directly compared for the first time. From the analysis of simulated and experimental acid-base equilibrium systems, the performance and interpretability of the two methods is evaluated. Investigation of the binding process of vancomycin, an important antibiotic, to a cell wall analogue tripeptide by time-resolved FT-IR spectroscopy and detailed chemometric evaluation. Determination of red wine constituents by liquid chromatography with FT-IR detection and MCR-ALS for resolution of overlapped peaks. Classification of red wine cultivars from FT-IR spectroscopy of phenolic wine extracts with hierarchical clustering and soft independent modeling of class analogy (SIMCA

  19. Hierarchical Nanoceramics for Industrial Process Sensors

    Energy Technology Data Exchange (ETDEWEB)

    Ruud, James, A.; Brosnan, Kristen, H.; Striker, Todd; Ramaswamy, Vidya; Aceto, Steven, C.; Gao, Yan; Willson, Patrick, D.; Manoharan, Mohan; Armstrong, Eric, N., Wachsman, Eric, D.; Kao, Chi-Chang

    2011-07-15

    This project developed a robust, tunable, hierarchical nanoceramics materials platform for industrial process sensors in harsh-environments. Control of material structure at multiple length scales from nano to macro increased the sensing response of the materials to combustion gases. These materials operated at relatively high temperatures, enabling detection close to the source of combustion. It is anticipated that these materials can form the basis for a new class of sensors enabling widespread use of efficient combustion processes with closed loop feedback control in the energy-intensive industries. The first phase of the project focused on materials selection and process development, leading to hierarchical nanoceramics that were evaluated for sensing performance. The second phase focused on optimizing the materials processes and microstructures, followed by validation of performance of a prototype sensor in a laboratory combustion environment. The objectives of this project were achieved by: (1) synthesizing and optimizing hierarchical nanostructures; (2) synthesizing and optimizing sensing nanomaterials; (3) integrating sensing functionality into hierarchical nanostructures; (4) demonstrating material performance in a sensing element; and (5) validating material performance in a simulated service environment. The project developed hierarchical nanoceramic electrodes for mixed potential zirconia gas sensors with increased surface area and demonstrated tailored electrocatalytic activity operable at high temperatures enabling detection of products of combustion such as NOx close to the source of combustion. Methods were developed for synthesis of hierarchical nanostructures with high, stable surface area, integrated catalytic functionality within the structures for gas sensing, and demonstrated materials performance in harsh lab and combustion gas environments.

  20. Evaluating Hierarchical Structure in Music Annotations.

    Science.gov (United States)

    McFee, Brian; Nieto, Oriol; Farbood, Morwaread M; Bello, Juan Pablo

    2017-01-01

    Music exhibits structure at multiple scales, ranging from motifs to large-scale functional components. When inferring the structure of a piece, different listeners may attend to different temporal scales, which can result in disagreements when they describe the same piece. In the field of music informatics research (MIR), it is common to use corpora annotated with structural boundaries at different levels. By quantifying disagreements between multiple annotators, previous research has yielded several insights relevant to the study of music cognition. First, annotators tend to agree when structural boundaries are ambiguous. Second, this ambiguity seems to depend on musical features, time scale, and genre. Furthermore, it is possible to tune current annotation evaluation metrics to better align with these perceptual differences. However, previous work has not directly analyzed the effects of hierarchical structure because the existing methods for comparing structural annotations are designed for "flat" descriptions, and do not readily generalize to hierarchical annotations. In this paper, we extend and generalize previous work on the evaluation of hierarchical descriptions of musical structure. We derive an evaluation metric which can compare hierarchical annotations holistically across multiple levels. sing this metric, we investigate inter-annotator agreement on the multilevel annotations of two different music corpora, investigate the influence of acoustic properties on hierarchical annotations, and evaluate existing hierarchical segmentation algorithms against the distribution of inter-annotator agreement.

  1. Evaluating Hierarchical Structure in Music Annotations

    Directory of Open Access Journals (Sweden)

    Brian McFee

    2017-08-01

    Full Text Available Music exhibits structure at multiple scales, ranging from motifs to large-scale functional components. When inferring the structure of a piece, different listeners may attend to different temporal scales, which can result in disagreements when they describe the same piece. In the field of music informatics research (MIR, it is common to use corpora annotated with structural boundaries at different levels. By quantifying disagreements between multiple annotators, previous research has yielded several insights relevant to the study of music cognition. First, annotators tend to agree when structural boundaries are ambiguous. Second, this ambiguity seems to depend on musical features, time scale, and genre. Furthermore, it is possible to tune current annotation evaluation metrics to better align with these perceptual differences. However, previous work has not directly analyzed the effects of hierarchical structure because the existing methods for comparing structural annotations are designed for “flat” descriptions, and do not readily generalize to hierarchical annotations. In this paper, we extend and generalize previous work on the evaluation of hierarchical descriptions of musical structure. We derive an evaluation metric which can compare hierarchical annotations holistically across multiple levels. sing this metric, we investigate inter-annotator agreement on the multilevel annotations of two different music corpora, investigate the influence of acoustic properties on hierarchical annotations, and evaluate existing hierarchical segmentation algorithms against the distribution of inter-annotator agreement.

  2. Hierarchical screening for multiple mental disorders.

    Science.gov (United States)

    Batterham, Philip J; Calear, Alison L; Sunderland, Matthew; Carragher, Natacha; Christensen, Helen; Mackinnon, Andrew J

    2013-10-01

    There is a need for brief, accurate screening when assessing multiple mental disorders. Two-stage hierarchical screening, consisting of brief pre-screening followed by a battery of disorder-specific scales for those who meet diagnostic criteria, may increase the efficiency of screening without sacrificing precision. This study tested whether more efficient screening could be gained using two-stage hierarchical screening than by administering multiple separate tests. Two Australian adult samples (N=1990) with high rates of psychopathology were recruited using Facebook advertising to examine four methods of hierarchical screening for four mental disorders: major depressive disorder, generalised anxiety disorder, panic disorder and social phobia. Using K6 scores to determine whether full screening was required did not increase screening efficiency. However, pre-screening based on two decision tree approaches or item gating led to considerable reductions in the mean number of items presented per disorder screened, with estimated item reductions of up to 54%. The sensitivity of these hierarchical methods approached 100% relative to the full screening battery. Further testing of the hierarchical screening approach based on clinical criteria and in other samples is warranted. The results demonstrate that a two-phase hierarchical approach to screening multiple mental disorders leads to considerable increases efficiency gains without reducing accuracy. Screening programs should take advantage of prescreeners based on gating items or decision trees to reduce the burden on respondents. © 2013 Elsevier B.V. All rights reserved.

  3. Multivariate return periods of sea storms for coastal erosion risk assessment

    Directory of Open Access Journals (Sweden)

    S. Corbella

    2012-08-01

    Full Text Available The erosion of a beach depends on various storm characteristics. Ideally, the risk associated with a storm would be described by a single multivariate return period that is also representative of the erosion risk, i.e. a 100 yr multivariate storm return period would cause a 100 yr erosion return period. Unfortunately, a specific probability level may be associated with numerous combinations of storm characteristics. These combinations, despite having the same multivariate probability, may cause very different erosion outcomes. This paper explores this ambiguity problem in the context of copula based multivariate return periods and using a case study at Durban on the east coast of South Africa. Simulations were used to correlate multivariate return periods of historical events to return periods of estimated storm induced erosion volumes. In addition, the relationship of the most-likely design event (Salvadori et al., 2011 to coastal erosion was investigated. It was found that the multivariate return periods for wave height and duration had the highest correlation to erosion return periods. The most-likely design event was found to be an inadequate design method in its current form. We explore the inclusion of conditions based on the physical realizability of wave events and the use of multivariate linear regression to relate storm parameters to erosion computed from a process based model. Establishing a link between storm statistics and erosion consequences can resolve the ambiguity between multivariate storm return periods and associated erosion return periods.

  4. Directional outlyingness for multivariate functional data

    KAUST Repository

    Dai, Wenlin; Genton, Marc G.

    2018-01-01

    The direction of outlyingness is crucial to describing the centrality of multivariate functional data. Motivated by this idea, classical depth is generalized to directional outlyingness for functional data. Theoretical properties of functional

  5. The value of multivariate model sophistication

    DEFF Research Database (Denmark)

    Rombouts, Jeroen; Stentoft, Lars; Violante, Francesco

    2014-01-01

    We assess the predictive accuracies of a large number of multivariate volatility models in terms of pricing options on the Dow Jones Industrial Average. We measure the value of model sophistication in terms of dollar losses by considering a set of 444 multivariate models that differ in their spec....... In addition to investigating the value of model sophistication in terms of dollar losses directly, we also use the model confidence set approach to statistically infer the set of models that delivers the best pricing performances.......We assess the predictive accuracies of a large number of multivariate volatility models in terms of pricing options on the Dow Jones Industrial Average. We measure the value of model sophistication in terms of dollar losses by considering a set of 444 multivariate models that differ...

  6. Multivariate survival analysis and competing risks

    CERN Document Server

    Crowder, Martin J

    2012-01-01

    Multivariate Survival Analysis and Competing Risks introduces univariate survival analysis and extends it to the multivariate case. It covers competing risks and counting processes and provides many real-world examples, exercises, and R code. The text discusses survival data, survival distributions, frailty models, parametric methods, multivariate data and distributions, copulas, continuous failure, parametric likelihood inference, and non- and semi-parametric methods. There are many books covering survival analysis, but very few that cover the multivariate case in any depth. Written for a graduate-level audience in statistics/biostatistics, this book includes practical exercises and R code for the examples. The author is renowned for his clear writing style, and this book continues that trend. It is an excellent reference for graduate students and researchers looking for grounding in this burgeoning field of research.

  7. Simplicial band depth for multivariate functional data

    KAUST Repository

    Ló pez-Pintado, Sara; Sun, Ying; Lin, Juan K.; Genton, Marc G.

    2014-01-01

    sample of curves. Based on these depths, a sample of multivariate curves can be ordered from the center outward and order statistics can be defined. Properties of the proposed depths, such as invariance and consistency, can be established. A simulation

  8. Ellipsoidal prediction regions for multivariate uncertainty characterization

    DEFF Research Database (Denmark)

    Golestaneh, Faranak; Pinson, Pierre; Azizipanah-Abarghooee, Rasoul

    2018-01-01

    , for classes of decision-making problems based on robust, interval chance-constrained optimization, necessary inputs take the form of multivariate prediction regions rather than scenarios. The current literature is at very primitive stage of characterizing multivariate prediction regions to be employed...... in these classes of optimization problems. To address this issue, we introduce a new class of multivariate forecasts which form as multivariate ellipsoids for non-Gaussian variables. We propose a data-driven systematic framework to readily generate and evaluate ellipsoidal prediction regions, with predefined...... probability guarantees and minimum conservativeness. A skill score is proposed for quantitative assessment of the quality of prediction ellipsoids. A set of experiments is used to illustrate the discrimination ability of the proposed scoring rule for potential misspecification of ellipsoidal prediction regions...

  9. An Introduction to Applied Multivariate Analysis

    CERN Document Server

    Raykov, Tenko

    2008-01-01

    Focuses on the core multivariate statistics topics which are of fundamental relevance for its understanding. This book emphasis on the topics that are critical to those in the behavioral, social, and educational sciences.

  10. Abstract Expression Grammar Symbolic Regression

    Science.gov (United States)

    Korns, Michael F.

    This chapter examines the use of Abstract Expression Grammars to perform the entire Symbolic Regression process without the use of Genetic Programming per se. The techniques explored produce a symbolic regression engine which has absolutely no bloat, which allows total user control of the search space and output formulas, which is faster, and more accurate than the engines produced in our previous papers using Genetic Programming. The genome is an all vector structure with four chromosomes plus additional epigenetic and constraint vectors, allowing total user control of the search space and the final output formulas. A combination of specialized compiler techniques, genetic algorithms, particle swarm, aged layered populations, plus discrete and continuous differential evolution are used to produce an improved symbolic regression sytem. Nine base test cases, from the literature, are used to test the improvement in speed and accuracy. The improved results indicate that these techniques move us a big step closer toward future industrial strength symbolic regression systems.

  11. Quantile Regression With Measurement Error

    KAUST Repository

    Wei, Ying; Carroll, Raymond J.

    2009-01-01

    . The finite sample performance of the proposed method is investigated in a simulation study, and compared to the standard regression calibration approach. Finally, we apply our methodology to part of the National Collaborative Perinatal Project growth data, a

  12. Multivariable Feedback Control of Nuclear Reactors

    Directory of Open Access Journals (Sweden)

    Rune Moen

    1982-07-01

    Full Text Available Multivariable feedback control has been adapted for optimal control of the spatial power distribution in nuclear reactor cores. Two design techniques, based on the theory of automatic control, were developed: the State Variable Feedback (SVF is an application of the linear optimal control theory, and the Multivariable Frequency Response (MFR is based on a generalization of the traditional frequency response approach to control system design.

  13. Application of multivariate splines to discrete mathematics

    OpenAIRE

    Xu, Zhiqiang

    2005-01-01

    Using methods developed in multivariate splines, we present an explicit formula for discrete truncated powers, which are defined as the number of non-negative integer solutions of linear Diophantine equations. We further use the formula to study some classical problems in discrete mathematics as follows. First, we extend the partition function of integers in number theory. Second, we exploit the relation between the relative volume of convex polytopes and multivariate truncated powers and giv...

  14. From Rasch scores to regression

    DEFF Research Database (Denmark)

    Christensen, Karl Bang

    2006-01-01

    Rasch models provide a framework for measurement and modelling latent variables. Having measured a latent variable in a population a comparison of groups will often be of interest. For this purpose the use of observed raw scores will often be inadequate because these lack interval scale propertie....... This paper compares two approaches to group comparison: linear regression models using estimated person locations as outcome variables and latent regression models based on the distribution of the score....

  15. Testing Heteroscedasticity in Robust Regression

    Czech Academy of Sciences Publication Activity Database

    Kalina, Jan

    2011-01-01

    Roč. 1, č. 4 (2011), s. 25-28 ISSN 2045-3345 Grant - others:GA ČR(CZ) GA402/09/0557 Institutional research plan: CEZ:AV0Z10300504 Keywords : robust regression * heteroscedasticity * regression quantiles * diagnostics Subject RIV: BB - Applied Statistics , Operational Research http://www.researchjournals.co.uk/documents/Vol4/06%20Kalina.pdf

  16. Regression methods for medical research

    CERN Document Server

    Tai, Bee Choo

    2013-01-01

    Regression Methods for Medical Research provides medical researchers with the skills they need to critically read and interpret research using more advanced statistical methods. The statistical requirements of interpreting and publishing in medical journals, together with rapid changes in science and technology, increasingly demands an understanding of more complex and sophisticated analytic procedures.The text explains the application of statistical models to a wide variety of practical medical investigative studies and clinical trials. Regression methods are used to appropriately answer the

  17. Forecasting with Dynamic Regression Models

    CERN Document Server

    Pankratz, Alan

    2012-01-01

    One of the most widely used tools in statistical forecasting, single equation regression models is examined here. A companion to the author's earlier work, Forecasting with Univariate Box-Jenkins Models: Concepts and Cases, the present text pulls together recent time series ideas and gives special attention to possible intertemporal patterns, distributed lag responses of output to input series and the auto correlation patterns of regression disturbance. It also includes six case studies.

  18. A review of multivariate analyses in imaging genetics

    Directory of Open Access Journals (Sweden)

    Jingyu eLiu

    2014-03-01

    Full Text Available Recent advances in neuroimaging technology and molecular genetics provide the unique opportunity to investigate genetic influence on the variation of brain attributes. Since the year 2000, when the initial publication on brain imaging and genetics was released, imaging genetics has been a rapidly growing research approach with increasing publications every year. Several reviews have been offered to the research community focusing on various study designs. In addition to study design, analytic tools and their proper implementation are also critical to the success of a study. In this review, we survey recent publications using data from neuroimaging and genetics, focusing on methods capturing multivariate effects accommodating the large number of variables from both imaging data and genetic data. We group the analyses of genetic or genomic data into either a prior driven or data driven approach, including gene-set enrichment analysis, multifactor dimensionality reduction, principal component analysis, independent component analysis (ICA, and clustering. For the analyses of imaging data, ICA and extensions of ICA are the most widely used multivariate methods. Given detailed reviews of multivariate analyses of imaging data available elsewhere, we provide a brief summary here that includes a recently proposed method known as independent vector analysis. Finally, we review methods focused on bridging the imaging and genetic data by establishing multivariate and multiple genotype-phenotype associations, including sparse partial least squares, sparse canonical correlation analysis, sparse reduced rank regression and parallel ICA. These methods are designed to extract latent variables from both genetic and imaging data, which become new genotypes and phenotypes, and the links between the new genotype-phenotype pairs are maximized using different cost functions. The relationship between these methods along with their assumptions, advantages, and

  19. Air Quality Pattern Assessment in Malaysia Using Multivariate Techniques

    International Nuclear Information System (INIS)

    Hamza Ahmad Isiyaka; Azman Azid

    2015-01-01

    This study aims to investigate the spatial characteristics in the pattern of air quality monitoring sites, identify the most discriminating parameters contributing to air pollution, and predict the level of air pollution index (API) in Malaysia using multivariate techniques. Five parameters observed for five years (2000-2004) were used. Hierarchical agglomerative cluster analysis classified the five air quality monitoring sites into two independent groups based on the characteristics of activities in the monitoring stations. Discriminate analysis for standard, backward stepwise and forward stepwise mode gave a correct assignation of more than 87 % in the confusion matrix. This result indicates that only three parameters (PM_1_0, SO_2 and NO_2) with a p<0.0001 discriminate best in polluting the air. The major possible sources of air pollution were identified using principal component analysis that account for more than 58 % and 60 % in the total variance. Based on the findings, anthropogenic activities (vehicular emission, industrial activities, construction sites, bush burning) have a strong influence in the source of air pollution. Furthermore, artificial neural network (ANN) was used to predict the level of air pollution index at R"2 = 0.8493 and RMSE = 5.9184. This indicates that ANN can predict more than 84 % of the API. (author)

  20. TOURISM SEGMENTATION BASED ON TOURISTS PREFERENCES: A MULTIVARIATE APPROACH

    Directory of Open Access Journals (Sweden)

    Sérgio Dominique Ferreira

    2010-11-01

    Full Text Available Over the last decades, tourism became one of the most important sectors of the international economy. Specifically in Portugal and Brazil, its contribution to Gross Domestic Product (GDP and job creation is quite relevant. In this sense, to follow a strong marketing approach on the management of tourism resources of a country comes to be paramount. Such an approach should be based on innovations which help unveil the preferences of tourists with accuracy, turning it into a competitive advantage. In this context, the main objective of the present study is to illustrate the importance and benefits associated with the use of multivariate methodologies for market segmentation. Another objective of this work is to illustrate on the importance of a post hoc segmentation. In this work, the authors applied a Cluster Analysis, with a hierarchical method followed by an  optimization method. The main results of this study allow the identification of five clusters that are distinguished by assigning special importance to certain tourism attributes at the moment of choosing a specific destination. Thus, the authors present the advantages of post hoc segmentation based on tourists’ preferences, in opposition to an a priori segmentation based on socio-demographic characteristics.

  1. Improving sub-pixel imperviousness change prediction by ensembling heterogeneous non-linear regression models

    Directory of Open Access Journals (Sweden)

    Drzewiecki Wojciech

    2016-12-01

    Full Text Available In this work nine non-linear regression models were compared for sub-pixel impervious surface area mapping from Landsat images. The comparison was done in three study areas both for accuracy of imperviousness coverage evaluation in individual points in time and accuracy of imperviousness change assessment. The performance of individual machine learning algorithms (Cubist, Random Forest, stochastic gradient boosting of regression trees, k-nearest neighbors regression, random k-nearest neighbors regression, Multivariate Adaptive Regression Splines, averaged neural networks, and support vector machines with polynomial and radial kernels was also compared with the performance of heterogeneous model ensembles constructed from the best models trained using particular techniques.

  2. Bayesian network ensemble as a multivariate strategy to predict radiation pneumonitis risk

    International Nuclear Information System (INIS)

    Lee, Sangkyu; Ybarra, Norma; Jeyaseelan, Krishinima; Seuntjens, Jan; El Naqa, Issam; Faria, Sergio; Kopek, Neil; Brisebois, Pascale; Bradley, Jeffrey D.; Robinson, Clifford

    2015-01-01

    Purpose: Prediction of radiation pneumonitis (RP) has been shown to be challenging due to the involvement of a variety of factors including dose–volume metrics and radiosensitivity biomarkers. Some of these factors are highly correlated and might affect prediction results when combined. Bayesian network (BN) provides a probabilistic framework to represent variable dependencies in a directed acyclic graph. The aim of this study is to integrate the BN framework and a systems’ biology approach to detect possible interactions among RP risk factors and exploit these relationships to enhance both the understanding and prediction of RP. Methods: The authors studied 54 nonsmall-cell lung cancer patients who received curative 3D-conformal radiotherapy. Nineteen RP events were observed (common toxicity criteria for adverse events grade 2 or higher). Serum concentration of the following four candidate biomarkers were measured at baseline and midtreatment: alpha-2-macroglobulin, angiotensin converting enzyme (ACE), transforming growth factor, interleukin-6. Dose-volumetric and clinical parameters were also included as covariates. Feature selection was performed using a Markov blanket approach based on the Koller–Sahami filter. The Markov chain Monte Carlo technique estimated the posterior distribution of BN graphs built from the observed data of the selected variables and causality constraints. RP probability was estimated using a limited number of high posterior graphs (ensemble) and was averaged for the final RP estimate using Bayes’ rule. A resampling method based on bootstrapping was applied to model training and validation in order to control under- and overfit pitfalls. Results: RP prediction power of the BN ensemble approach reached its optimum at a size of 200. The optimized performance of the BN model recorded an area under the receiver operating characteristic curve (AUC) of 0.83, which was significantly higher than multivariate logistic regression (0

  3. Bayesian network ensemble as a multivariate strategy to predict radiation pneumonitis risk

    Energy Technology Data Exchange (ETDEWEB)

    Lee, Sangkyu, E-mail: sangkyu.lee@mail.mcgill.ca; Ybarra, Norma; Jeyaseelan, Krishinima; Seuntjens, Jan; El Naqa, Issam [Medical Physics Unit, McGill University, Montreal, Quebec H3G1A4 (Canada); Faria, Sergio; Kopek, Neil; Brisebois, Pascale [Department of Radiation Oncology, Montreal General Hospital, Montreal, H3G1A4 (Canada); Bradley, Jeffrey D.; Robinson, Clifford [Radiation Oncology, Washington University School of Medicine in St. Louis, St. Louis, Missouri 63110 (United States)

    2015-05-15

    Purpose: Prediction of radiation pneumonitis (RP) has been shown to be challenging due to the involvement of a variety of factors including dose–volume metrics and radiosensitivity biomarkers. Some of these factors are highly correlated and might affect prediction results when combined. Bayesian network (BN) provides a probabilistic framework to represent variable dependencies in a directed acyclic graph. The aim of this study is to integrate the BN framework and a systems’ biology approach to detect possible interactions among RP risk factors and exploit these relationships to enhance both the understanding and prediction of RP. Methods: The authors studied 54 nonsmall-cell lung cancer patients who received curative 3D-conformal radiotherapy. Nineteen RP events were observed (common toxicity criteria for adverse events grade 2 or higher). Serum concentration of the following four candidate biomarkers were measured at baseline and midtreatment: alpha-2-macroglobulin, angiotensin converting enzyme (ACE), transforming growth factor, interleukin-6. Dose-volumetric and clinical parameters were also included as covariates. Feature selection was performed using a Markov blanket approach based on the Koller–Sahami filter. The Markov chain Monte Carlo technique estimated the posterior distribution of BN graphs built from the observed data of the selected variables and causality constraints. RP probability was estimated using a limited number of high posterior graphs (ensemble) and was averaged for the final RP estimate using Bayes’ rule. A resampling method based on bootstrapping was applied to model training and validation in order to control under- and overfit pitfalls. Results: RP prediction power of the BN ensemble approach reached its optimum at a size of 200. The optimized performance of the BN model recorded an area under the receiver operating characteristic curve (AUC) of 0.83, which was significantly higher than multivariate logistic regression (0

  4. Logistic regression for dichotomized counts.

    Science.gov (United States)

    Preisser, John S; Das, Kalyan; Benecha, Habtamu; Stamm, John W

    2016-12-01

    Sometimes there is interest in a dichotomized outcome indicating whether a count variable is positive or zero. Under this scenario, the application of ordinary logistic regression may result in efficiency loss, which is quantifiable under an assumed model for the counts. In such situations, a shared-parameter hurdle model is investigated for more efficient estimation of regression parameters relating to overall effects of covariates on the dichotomous outcome, while handling count data with many zeroes. One model part provides a logistic regression containing marginal log odds ratio effects of primary interest, while an ancillary model part describes the mean count of a Poisson or negative binomial process in terms of nuisance regression parameters. Asymptotic efficiency of the logistic model parameter estimators of the two-part models is evaluated with respect to ordinary logistic regression. Simulations are used to assess the properties of the models with respect to power and Type I error, the latter investigated under both misspecified and correctly specified models. The methods are applied to data from a randomized clinical trial of three toothpaste formulations to prevent incident dental caries in a large population of Scottish schoolchildren. © The Author(s) 2014.

  5. Admissible Estimators in the General Multivariate Linear Model with Respect to Inequality Restricted Parameter Set

    Directory of Open Access Journals (Sweden)

    Shangli Zhang

    2009-01-01

    Full Text Available By using the methods of linear algebra and matrix inequality theory, we obtain the characterization of admissible estimators in the general multivariate linear model with respect to inequality restricted parameter set. In the classes of homogeneous and general linear estimators, the necessary and suffcient conditions that the estimators of regression coeffcient function are admissible are established.

  6. On the relation between S-Estimators and M-Estimators of multivariate location and covariance

    NARCIS (Netherlands)

    Lopuhaa, H.P.

    1987-01-01

    We discuss the relation between S-estimators and M-estimators of multivariate location and covariance. As in the case of the estimation of a multiple regression parameter, S-estimators are shown to satisfy first-order conditions of M-estimators. We show that the influence function IF (x;S F) of

  7. Identification of Civil Engineering Structures using Multivariate ARMAV and RARMAV Models

    DEFF Research Database (Denmark)

    Kirkegaard, Poul Henning; Andersen, P.; Brincker, Rune

    This paper presents how to make system identification of civil engineering structures using multivariate auto-regressive moving-average vector (ARMAV) models. Further, the ARMAV technique is extended to a recursive technique (RARMAV). The ARMAV model is used to identify measured stationary data....... The results show the usefulness of the approaches for identification of civil engineering structures excited by natural excitation...

  8. Producing The New Regressive Left

    DEFF Research Database (Denmark)

    Crone, Christine

    members, this thesis investigates a growing political trend and ideological discourse in the Arab world that I have called The New Regressive Left. On the premise that a media outlet can function as a forum for ideology production, the thesis argues that an analysis of this material can help to trace...... the contexture of The New Regressive Left. If the first part of the thesis lays out the theoretical approach and draws the contextual framework, through an exploration of the surrounding Arab media-and ideoscapes, the second part is an analytical investigation of the discourse that permeates the programmes aired...... becomes clear from the analytical chapters is the emergence of the new cross-ideological alliance of The New Regressive Left. This emerging coalition between Shia Muslims, religious minorities, parts of the Arab Left, secular cultural producers, and the remnants of the political,strategic resistance...

  9. Multivariate Max-Stable Spatial Processes

    KAUST Repository

    Genton, Marc G.

    2014-01-06

    Analysis of spatial extremes is currently based on univariate processes. Max-stable processes allow the spatial dependence of extremes to be modelled and explicitly quantified, they are therefore widely adopted in applications. For a better understanding of extreme events of real processes, such as environmental phenomena, it may be useful to study several spatial variables simultaneously. To this end, we extend some theoretical results and applications of max-stable processes to the multivariate setting to analyze extreme events of several variables observed across space. In particular, we study the maxima of independent replicates of multivariate processes, both in the Gaussian and Student-t cases. Then, we define a Poisson process construction in the multivariate setting and introduce multivariate versions of the Smith Gaussian extremevalue, the Schlather extremal-Gaussian and extremal-t, and the BrownResnick models. Inferential aspects of those models based on composite likelihoods are developed. We present results of various Monte Carlo simulations and of an application to a dataset of summer daily temperature maxima and minima in Oklahoma, U.S.A., highlighting the utility of working with multivariate models in contrast to the univariate case. Based on joint work with Simone Padoan and Huiyan Sang.

  10. Multivariate Max-Stable Spatial Processes

    KAUST Repository

    Genton, Marc G.

    2014-01-01

    Analysis of spatial extremes is currently based on univariate processes. Max-stable processes allow the spatial dependence of extremes to be modelled and explicitly quantified, they are therefore widely adopted in applications. For a better understanding of extreme events of real processes, such as environmental phenomena, it may be useful to study several spatial variables simultaneously. To this end, we extend some theoretical results and applications of max-stable processes to the multivariate setting to analyze extreme events of several variables observed across space. In particular, we study the maxima of independent replicates of multivariate processes, both in the Gaussian and Student-t cases. Then, we define a Poisson process construction in the multivariate setting and introduce multivariate versions of the Smith Gaussian extremevalue, the Schlather extremal-Gaussian and extremal-t, and the BrownResnick models. Inferential aspects of those models based on composite likelihoods are developed. We present results of various Monte Carlo simulations and of an application to a dataset of summer daily temperature maxima and minima in Oklahoma, U.S.A., highlighting the utility of working with multivariate models in contrast to the univariate case. Based on joint work with Simone Padoan and Huiyan Sang.

  11. Analysis hierarchical model for discrete event systems

    Science.gov (United States)

    Ciortea, E. M.

    2015-11-01

    The This paper presents the hierarchical model based on discrete event network for robotic systems. Based on the hierarchical approach, Petri network is analysed as a network of the highest conceptual level and the lowest level of local control. For modelling and control of complex robotic systems using extended Petri nets. Such a system is structured, controlled and analysed in this paper by using Visual Object Net ++ package that is relatively simple and easy to use, and the results are shown as representations easy to interpret. The hierarchical structure of the robotic system is implemented on computers analysed using specialized programs. Implementation of hierarchical model discrete event systems, as a real-time operating system on a computer network connected via a serial bus is possible, where each computer is dedicated to local and Petri model of a subsystem global robotic system. Since Petri models are simplified to apply general computers, analysis, modelling, complex manufacturing systems control can be achieved using Petri nets. Discrete event systems is a pragmatic tool for modelling industrial systems. For system modelling using Petri nets because we have our system where discrete event. To highlight the auxiliary time Petri model using transport stream divided into hierarchical levels and sections are analysed successively. Proposed robotic system simulation using timed Petri, offers the opportunity to view the robotic time. Application of goods or robotic and transmission times obtained by measuring spot is obtained graphics showing the average time for transport activity, using the parameters sets of finished products. individually.

  12. Self-assembled biomimetic superhydrophobic hierarchical arrays.

    Science.gov (United States)

    Yang, Hongta; Dou, Xuan; Fang, Yin; Jiang, Peng

    2013-09-01

    Here, we report a simple and inexpensive bottom-up technology for fabricating superhydrophobic coatings with hierarchical micro-/nano-structures, which are inspired by the binary periodic structure found on the superhydrophobic compound eyes of some insects (e.g., mosquitoes and moths). Binary colloidal arrays consisting of exemplary large (4 and 30 μm) and small (300 nm) silica spheres are first assembled by a scalable Langmuir-Blodgett (LB) technology in a layer-by-layer manner. After surface modification with fluorosilanes, the self-assembled hierarchical particle arrays become superhydrophobic with an apparent water contact angle (CA) larger than 150°. The throughput of the resulting superhydrophobic coatings with hierarchical structures can be significantly improved by templating the binary periodic structures of the LB-assembled colloidal arrays into UV-curable fluoropolymers by a soft lithography approach. Superhydrophobic perfluoroether acrylate hierarchical arrays with large CAs and small CA hysteresis can be faithfully replicated onto various substrates. Both experiments and theoretical calculations based on the Cassie's dewetting model demonstrate the importance of the hierarchical structure in achieving the final superhydrophobic surface states. Copyright © 2013 Elsevier Inc. All rights reserved.

  13. Correlation and simple linear regression.

    Science.gov (United States)

    Zou, Kelly H; Tuncali, Kemal; Silverman, Stuart G

    2003-06-01

    In this tutorial article, the concepts of correlation and regression are reviewed and demonstrated. The authors review and compare two correlation coefficients, the Pearson correlation coefficient and the Spearman rho, for measuring linear and nonlinear relationships between two continuous variables. In the case of measuring the linear relationship between a predictor and an outcome variable, simple linear regression analysis is conducted. These statistical concepts are illustrated by using a data set from published literature to assess a computed tomography-guided interventional technique. These statistical methods are important for exploring the relationships between variables and can be applied to many radiologic studies.

  14. Regression filter for signal resolution

    International Nuclear Information System (INIS)

    Matthes, W.

    1975-01-01

    The problem considered is that of resolving a measured pulse height spectrum of a material mixture, e.g. gamma ray spectrum, Raman spectrum, into a weighed sum of the spectra of the individual constituents. The model on which the analytical formulation is based is described. The problem reduces to that of a multiple linear regression. A stepwise linear regression procedure was constructed. The efficiency of this method was then tested by transforming the procedure in a computer programme which was used to unfold test spectra obtained by mixing some spectra, from a library of arbitrary chosen spectra, and adding a noise component. (U.K.)

  15. Nonparametric Mixture of Regression Models.

    Science.gov (United States)

    Huang, Mian; Li, Runze; Wang, Shaoli

    2013-07-01

    Motivated by an analysis of US house price index data, we propose nonparametric finite mixture of regression models. We study the identifiability issue of the proposed models, and develop an estimation procedure by employing kernel regression. We further systematically study the sampling properties of the proposed estimators, and establish their asymptotic normality. A modified EM algorithm is proposed to carry out the estimation procedure. We show that our algorithm preserves the ascent property of the EM algorithm in an asymptotic sense. Monte Carlo simulations are conducted to examine the finite sample performance of the proposed estimation procedure. An empirical analysis of the US house price index data is illustrated for the proposed methodology.

  16. An architecture for implementation of multivariable controllers

    DEFF Research Database (Denmark)

    Niemann, Hans Henrik; Stoustrup, Jakob

    1999-01-01

    Browse > Conferences> American Control Conference, Prev | Back to Results | Next » An architecture for implementation of multivariable controllers 786292 searchabstract Niemann, H. ; Stoustrup, J. ; Dept. of Autom., Tech. Univ., Lyngby This paper appears in: American Control Conference, 1999....... Proceedings of the 1999 Issue Date : 1999 Volume : 6 On page(s): 4029 - 4033 vol.6 Location: San Diego, CA Meeting Date : 02 Jun 1999-04 Jun 1999 Print ISBN: 0-7803-4990-3 References Cited: 7 INSPEC Accession Number: 6403075 Digital Object Identifier : 10.1109/ACC.1999.786292 Date of Current Version : 06...... august 2002 Abstract An architecture for implementation of multivariable controllers is presented in this paper. The architecture is based on the Youla-Jabr-Bongiorno-Kucera parameterization of all stabilizing controllers. By using this architecture for implementation of multivariable controllers...

  17. Simplicial band depth for multivariate functional data

    KAUST Repository

    López-Pintado, Sara

    2014-03-05

    We propose notions of simplicial band depth for multivariate functional data that extend the univariate functional band depth. The proposed simplicial band depths provide simple and natural criteria to measure the centrality of a trajectory within a sample of curves. Based on these depths, a sample of multivariate curves can be ordered from the center outward and order statistics can be defined. Properties of the proposed depths, such as invariance and consistency, can be established. A simulation study shows the robustness of this new definition of depth and the advantages of using a multivariate depth versus the marginal depths for detecting outliers. Real data examples from growth curves and signature data are used to illustrate the performance and usefulness of the proposed depths. © 2014 Springer-Verlag Berlin Heidelberg.

  18. Multivariate generalized linear mixed models using R

    CERN Document Server

    Berridge, Damon Mark

    2011-01-01

    Multivariate Generalized Linear Mixed Models Using R presents robust and methodologically sound models for analyzing large and complex data sets, enabling readers to answer increasingly complex research questions. The book applies the principles of modeling to longitudinal data from panel and related studies via the Sabre software package in R. A Unified Framework for a Broad Class of Models The authors first discuss members of the family of generalized linear models, gradually adding complexity to the modeling framework by incorporating random effects. After reviewing the generalized linear model notation, they illustrate a range of random effects models, including three-level, multivariate, endpoint, event history, and state dependence models. They estimate the multivariate generalized linear mixed models (MGLMMs) using either standard or adaptive Gaussian quadrature. The authors also compare two-level fixed and random effects linear models. The appendices contain additional information on quadrature, model...

  19. A MATLAB companion for multivariable calculus

    CERN Document Server

    Cooper, Jeffery

    2001-01-01

    Offering a concise collection of MatLab programs and exercises to accompany a third semester course in multivariable calculus, A MatLab Companion for Multivariable Calculus introduces simple numerical procedures such as numerical differentiation, numerical integration and Newton''s method in several variables, thereby allowing students to tackle realistic problems. The many examples show students how to use MatLab effectively and easily in many contexts. Numerous exercises in mathematics and applications areas are presented, graded from routine to more demanding projects requiring some programming. Matlab M-files are provided on the Harcourt/Academic Press web site at http://www.harcourt-ap.com/matlab.html.* Computer-oriented material that complements the essential topics in multivariable calculus* Main ideas presented with examples of computations and graphics displays using MATLAB * Numerous examples of short code in the text, which can be modified for use with the exercises* MATLAB files are used to implem...

  20. Cactus: An Introduction to Regression

    Science.gov (United States)

    Hyde, Hartley

    2008-01-01

    When the author first used "VisiCalc," the author thought it a very useful tool when he had the formulas. But how could he design a spreadsheet if there was no known formula for the quantities he was trying to predict? A few months later, the author relates he learned to use multiple linear regression software and suddenly it all clicked into…